voice recognition to text - maemo.org

kelkinny2004	2007-02-24 , 05:23
Posts: 31 \| Thanked: 5 times \| Joined on Feb 2007 @ san jose, ca	#1

Hi, I don't have an N800 yet, but I have been watching since the 770 and am very fascinated by its potential.

I wondered if there is an application for converting voice from the mic to text on the screen.

My elderly father is really hard of hearing. He does have hearing aids, but I never know if he really hears me. He can still read though <G>.

Is there an application for this device to translate my voice into text, through the microphone, for him to see? I figure I would face the N800 toward him.

thanks for any help

mwiktowy	2007-02-24 , 05:37
Posts: 373 \| Thanked: 56 times \| Joined on Dec 2005 @ Ottawa, ON	#2

While there is a text-to-speech engine (flite) available, what you want is the other way around ... a speech recognition app. I have not heard of one for the N800. Unfortunately, I would suspect that there won't be since speech recognition is pretty CPU intensive and the N800 isn't a huge number cruncher. You never know though. Someone might find an efficient enough algorithm that matches the capability of the N800 so I would never say never.

mwiktowy	2007-02-24 , 05:47
Posts: 373 \| Thanked: 56 times \| Joined on Dec 2005 @ Ottawa, ON	#3

Originally Posted by mwiktowy

Someone might find an efficient enough algorithm that matches the capability of the N800 so I would never say never.

Doing some digging around, it looks like there is something that would be appropriate if someone could do some integration work for the N800 (or maybe even the 770).

http://www.speech.cs.cmu.edu/pocketsphinx/

msaunby	2007-02-24 , 10:23
Posts: 78 \| Thanked: 9 times \| Joined on Dec 2005 @ Devon, UK	#4

It's important not to forget that the 770 and N800 are *Internet* tablets. In the first instance if I were developing something like this I'd go for creating a service I could connect to - think media server style. Much easier to do the development that way. It might even create a better product - you could speak into one 770 and your father could read on another.

It might be worth suggesting such a product/service to these folks - http://www.spinvox.com/

Karel Jansens	2007-02-24 , 12:01
Posts: 3,220 \| Thanked: 326 times \| Joined on Oct 2005 @ "Almost there!" (Monte Christo, Count of)	#5

Originally Posted by mwiktowy

While there is a text-to-speech engine (flite) available, what you want is the other way around ... a speech recognition app. I have not heard of one for the N800. Unfortunately, I would suspect that there won't be since speech recognition is pretty CPU intensive and the N800 isn't a huge number cruncher. You never know though. Someone might find an efficient enough algorithm that matches the capability of the N800 so I would never say never.

IBM did it on my Pentium 75 with 64 MB of RAM, ten years ago, with a software-only solution. Granted, that was OS/2, so the rest of the world will probably have to wait another decade.

So, no: voice recognition is not that CPU-intensive (these days). It is, however, quite algorithm-intensive, which seems to be what is lacking in the postmodern world.

konfoo	2007-02-24 , 14:17
Posts: 116 \| Thanked: 12 times \| Joined on Dec 2005 @ OC, CA	#6

Well, I have Sphinx compiled... now it's just a matter of figuring this blasted thing out and pointing the 64 MB speech base lib to the MMC... more news soon.

konfoo	2007-02-24 , 16:54
Posts: 116 \| Thanked: 12 times \| Joined on Dec 2005 @ OC, CA	#7

OK, this sucker takes more time than I am willing to spend. If anyone wants to help out, post to this thread. We need a Sphinx expert to configure the speech templates and a pocketsphinx_continuous script for /dev/dsp.

mwiktowy	2007-02-24 , 18:32
Posts: 373 \| Thanked: 56 times \| Joined on Dec 2005 @ Ottawa, ON	#8

The section of that website that caught my attention was:

You can also download telephone-bandwidth models separately. To use these with raw audio data you need the following extra command-line options:

-nfft 256
-nfilt 31
-lowerf 200
-upperf 3500
-samprate 8000

Since the 770 and N800 seem to capture audio at 8000 Hz sampling rate (based on the maemorecorder abilities), this voice model might be the way to go. Plus it is only 8 MB or so rather than the 25 MB that you speak of.

They are available here:
http://www.speech.cs.cmu.edu/pockets...linterp.tar.gz
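
For anyone picking this up, here is a rough sketch of how the options quoted above could be assembled into a command line and sanity-checked against the 8000 Hz capture rate. The option names and values are the ones from the PocketSphinx page; the binary name is the pocketsphinx_continuous mentioned earlier in the thread. The `-hmm` flag for the acoustic model directory, the model path, and the helper names `build_command` and `check_front_end` are assumptions for illustration only, not something confirmed in this thread.

```python
# Telephone-bandwidth front-end options, as quoted from the PocketSphinx page.
TELEPHONE_OPTIONS = {
    "-nfft": 256,
    "-nfilt": 31,
    "-lowerf": 200,
    "-upperf": 3500,
    "-samprate": 8000,
}


def build_command(model_dir, binary="pocketsphinx_continuous"):
    """Return an argv list for the recognizer (hypothetical helper).

    Assumption: the acoustic model directory is passed with -hmm.
    """
    argv = [binary, "-hmm", model_dir]
    for flag, value in TELEPHONE_OPTIONS.items():
        argv += [flag, str(value)]
    return argv


def check_front_end(options):
    """Check that the filter band fits below the Nyquist frequency."""
    samprate = options["-samprate"]
    nfft = options["-nfft"]
    nyquist = samprate / 2  # highest frequency the capture can represent
    if not 0 <= options["-lowerf"] < options["-upperf"] <= nyquist:
        raise ValueError("filter band does not fit the sampling rate")
    return {
        "nyquist_hz": nyquist,
        "fft_window_ms": 1000 * nfft / samprate,  # time span of one FFT
        "bin_width_hz": samprate / nfft,          # frequency resolution
    }


if __name__ == "__main__":
    # The model path below is a placeholder, not a real location.
    print(" ".join(build_command("/path/to/telephone-model")))
    print(check_front_end(TELEPHONE_OPTIONS))
```

The check only confirms that the 200 to 3500 Hz band sits below the 4000 Hz Nyquist limit of an 8000 Hz recording, which is why this model looks like a plausible match for what the tablets capture. Whether recognition runs fast enough on the device is a separate question.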

Karel Jansens	2007-02-24 , 18:53
Posts: 3,220 \| Thanked: 326 times \| Joined on Oct 2005 @ "Almost there!" (Monte Christo, Count of)	#9

This is what caught my eye on the Carnegie-Mellon site (http://cmusphinx.sourceforge.net/html/cmusphinx.php):
"Note however that Sphinx is not a final product. Those with a certain level of expertise can achieve great results with the versions of Sphinx available here, but a naive user will certainly need further help. In other words, the software available here is not meant for users with no experience in speech, but for expert users."

Aren't we in over our heads here?

(BTW, the phrase "users with no experience in speech" is kinda funny)