  #1  
Old 2011-04-27, 15:11
mc_teo mc_teo is offline
 
Join Date: Aug 2010
Posts: 7
Thanks!: 2
Thanked 47 Times in 3 Posts
Default [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

Hello there, just a quick post on how I got PocketSphinx to work on my N900, along with a basic Python application to test your setup. I take no credit for anything in this thread, except the time spent putting it all together.


I downloaded all the .debs from http://repository.maemo.org/extras-d.../pocketsphinx/ into a new directory, then removed any i386-specific .debs.

As root, I ran "dpkg -i *". The packages tried to install but were stopped by unmet dependencies (for me it was just python2.5-dbg).

Once the missing dependency was installed, this ran successfully and installed pocketsphinx.
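
For anyone who would rather script the "remove the i386 .debs" step, here is a minimal sketch. It assumes the files follow the usual name_version_arch.deb naming; the function names and the file names in the usage note are made up for illustration and are not taken from the repository.

```python
def select_debs(filenames, skip_arch="i386"):
    """Return the .deb files that are not built for skip_arch."""
    selected = []
    for name in filenames:
        if not name.endswith(".deb"):
            continue
        # Assumption: files are named name_version_arch.deb,
        # so the architecture is the last "_"-separated field.
        arch = name[:-len(".deb")].rsplit("_", 1)[-1]
        if arch != skip_arch:
            selected.append(name)
    return sorted(selected)


def dpkg_command(filenames):
    """Build the argument list for installing the selected packages."""
    return ["dpkg", "-i"] + select_debs(filenames)
```

Calling dpkg_command(os.listdir(".")) in the download directory gives an argument list you could hand to subprocess as root; the sketch itself only builds the list and installs nothing.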

To try it out and make sure everything has installed correctly, run "pocketsphinx_continuous" and wait for everything to load. When prompted with "Ready...", say something clearly in the phone's direction (I used "Hello"). After another load of text you should see a line like "000000001: hello (-12345676)".
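
If you want to pick that result line out of the rest of the output from a script, something like the following might do. It is only a sketch: the pattern is based purely on the sample line quoted above (utterance id, text, score in brackets), and the real output may look different in other versions.

```python
import re

# Assumed layout, taken from the sample line: "<id>: <text> (<score>)"
HYP_LINE = re.compile(r"^(\d+):\s*(.*?)\s*\((-?\d+)\)\s*$")


def parse_hypothesis(line):
    """Return (utterance_id, text, score), or None for any other line."""
    match = HYP_LINE.match(line.strip())
    if match is None:
        return None
    utt_id, text, score = match.groups()
    return utt_id, text, int(score)
```

Lines such as "Ready..." do not match the pattern and come back as None, so the function can be run over every line of output.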


To get the gstreamer hooks working, I had to install the package "gstreamer-tools".

After this I ran the script here from the CMUSphinx example, tweaked to work with PulseAudio: http://pastebin.com/zCYzX65Z

Press the "Speak" button, then say a few words, and the textbox will update to show what you said.

N.B. It uses the en_US acoustic model by default, so I had a good few mistakes at first, which I attribute to my Irish accent.



This is another little sample that uses the JSGF grammar specification and tries to interpret speech from a .wav file saved locally. (The recording also needs to be 8 kHz mono.)

==File grammar.jsgf==
PHP Code:
#JSGF V1.0;
grammar goforward;
public <move> = go <direction> <distance> [meter | meters];
<direction> = forward | backward;
<distance> = (one | two | three | four | five | six | seven | eight | nine | ten | twenty)+;
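
Once the decoder hands back a phrase that fits this grammar, it still has to be turned into something a program can act on. Here is a rough sketch of one way to do that. The function name is made up, and adding up repeated number words ("twenty five" becomes 25) is my own assumption about what the repeated distance words are meant to express; the grammar itself does not say.

```python
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "twenty": 20,
}
DIRECTIONS = ("forward", "backward")


def parse_move(hypothesis):
    """Turn e.g. "go forward ten meters" into ("forward", 10).

    Returns None if the text does not fit the grammar.
    """
    words = hypothesis.lower().split()
    if len(words) < 3 or words[0] != "go" or words[1] not in DIRECTIONS:
        return None
    rest = words[2:]
    # The trailing unit is optional in the grammar.
    if rest[-1] in ("meter", "meters"):
        rest = rest[:-1]
    if not rest or any(word not in NUMBER_WORDS for word in rest):
        return None
    # Assumption: repeated number words are summed.
    return words[1], sum(NUMBER_WORDS[word] for word in rest)
```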
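==File speechtest.py== (with myrecording.wav as the recording to interpret)
PHP Code:
#!/usr/bin/python
import pocketsphinx as ps

# Load the JSGF grammar; the sample rate must match the 8 kHz recording.
decoder = ps.Decoder(jsgf='/path/to/your/jsgf/grammar.jsgf', samprate='8000')

# Decode the whole recording in one pass.
fh = open("myrecording.wav", "rb")
nsamp = decoder.decode_raw(fh)
fh.close()

# get_hyp() returns the recognised text, the utterance id and the score.
hyp, uttid, score = decoder.get_hyp()
print("Got result %s %d" % (hyp, score))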
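
Since the recording has to be 8 kHz mono, it can save some head scratching to check the file before decoding it. A small sketch using only the standard library wave module (the function name is just for illustration):

```python
import wave


def check_recording(path, rate=8000, channels=1):
    """Return a list of format problems; an empty list means the file is fine."""
    wav = wave.open(path, "rb")
    try:
        problems = []
        if wav.getframerate() != rate:
            problems.append("sample rate is %d Hz, expected %d Hz"
                            % (wav.getframerate(), rate))
        if wav.getnchannels() != channels:
            problems.append("%d channel(s), expected %d"
                            % (wav.getnchannels(), channels))
        return problems
    finally:
        wav.close()
```

For example, check_recording("myrecording.wav") returns an empty list when the file matches the expected format.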

Last edited by chemist; 2011-04-27 at 15:17. Reason: topic
  #2  
Old 2011-04-27, 15:26
Boemien's Avatar
Boemien Boemien is offline
 
Join Date: Mar 2010
Location: Abidjan
Posts: 770
Thanks!: 1,651
Thanked 558 Times in 254 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

Yeah it seems interesting but noobs, like me of course, need some screenshots. Thanks in advance!!!
  #3  
Old 2011-04-27, 15:44
joerg_rw's Avatar
joerg_rw joerg_rw is offline
 
Join Date: Mar 2010
Location: SOL 3
Posts: 2,222
Thanks!: 3,399
Thanked 12,651 Times in 1,970 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

many thanks for this kickoff. I think this can be the start of a nice project to bring a missing feature to N900.

/j
__________________
Maemo Community Council member [2012-10, 2013-05, 2013-11, 2014-06 terms]
Hildon Foundation Council inaugural member.
MCe.V. foundation member

EX Hildon Foundation approved
Maemo Administration Coordinator (stepped down due to bullying 2014-04-05)
aka "techstaff" - the guys who keep your infra running - Devotion to Duty http://xkcd.com/705/

IRC(freenode): DocScrutinizer*
First USB hostmode fanatic, father of H-E-N
  #4  
Old 2011-04-27, 15:48
skykooler skykooler is offline
 
Join Date: Oct 2010
Posts: 482
Thanks!: 467
Thanked 550 Times in 221 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

So...would it be possible to use this for voice dialing via a dbus call?
  #5  
Old 2011-04-27, 16:28
niloy niloy is offline
 
Join Date: Feb 2011
Location: India
Posts: 105
Thanks!: 33
Thanked 99 Times in 26 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

great, now if only someone could integrate it with the text editor of the phone
  #6  
Old 2011-04-27, 16:50
leojab leojab is offline
 
Join Date: Apr 2010
Posts: 102
Thanks!: 190
Thanked 23 Times in 18 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

This is just great news, and thanks mc_teo...
Now that joerg_rw is interested in this project, it will be even greater news soon :-)

Last edited by leojab; 2011-04-27 at 17:15.
  #7  
Old 2011-04-27, 18:20
cfh11's Avatar
cfh11 cfh11 is offline
 
Join Date: May 2010
Location: Boston, MA
Posts: 1,062
Thanks!: 1,188
Thanked 961 Times in 392 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

Awesome! Now if this becomes feature complete and incorporated into the CSSU that would be a dream come true...
__________________
Want to browse streamlined versions of websites automatically when in 2g? Vote for this brainstorm.

Sick of your cell signal not reconnecting after coming out of a bad signal area? Vote for this bug.
  #8  
Old 2011-04-28, 17:36
joerg_rw's Avatar
joerg_rw joerg_rw is offline
 
Join Date: Mar 2010
Location: SOL 3
Posts: 2,222
Thanks!: 3,399
Thanked 12,651 Times in 1,970 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

voice-call via dbus: should be rather simple, as long as you start the speech input engine on headset pushbutton and use a small set of pretrained contact name vocabulary.
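
A small, fixed contact vocabulary could be expressed as a JSGF grammar like the one in the first post. The sketch below only generates the grammar text; the grammar name "dial", the leading word "call" and the function name are all made up for illustration. It assumes every word of every name is present in the pronunciation dictionary the decoder uses, and it leaves out the dbus side entirely.

```python
import re


def build_dial_grammar(contact_names):
    """Build JSGF text that accepts "call <one of the contact names>"."""
    cleaned = set()
    for name in contact_names:
        # Keep letters and spaces only, so a name cannot break the JSGF syntax.
        words = re.sub(r"[^a-z ]", " ", name.lower()).split()
        if words:
            cleaned.add(" ".join(words))
    if not cleaned:
        raise ValueError("no usable contact names")
    alternatives = " | ".join(sorted(cleaned))
    return ("#JSGF V1.0;\n"
            "grammar dial;\n"
            "public <dial> = call ( %s );\n" % alternatives)
```

The resulting text could be written to a .jsgf file and handed to the decoder the same way as grammar.jsgf in the first post.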

integration with text editor: an ambitious project, as the vocabulary is virtually unlimited

@leojab: I'm planning to come up with a system architecture RFC eventually, so this could actually integrate into hildon/maemo seamlessly. NB you want both a) to use speech input with unpatched, possibly even closed-source apps, and b) to work across several concurrent apps without multiple instances of pocketsphinx fighting each other
@cfh11: regarding my comment one line above, I think we might integrate this in a way that lets us deploy it via extras, with no need for CSSU. Well, maybe hildon-desktop needs some hooks for cooperating with speech-controlled task switching etc

/j
  #9  
Old 2011-05-01, 00:15
mc_teo mc_teo is offline
 
Join Date: Aug 2010
Posts: 7
Thanks!: 2
Thanked 47 Times in 3 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

So, I haven't been working too hard on this, due to school and all, but I have put together this demo of what can be done.

I have attached player.zip. Within this archive you will find three files: "player.py", which is the main script; "dict.lm", which contains the language model; and "dict.dic", which contains the dictionary.

So, making sure pocketsphinx is installed as outlined in my first post, run this script.

If the default media player is not open, the script will attempt to open it via a dbus command (and complain about a file not being found), so opening it beforehand is perhaps the best solution.

Then start the script and you will be presented with a simple form. Press Enable, then say play/stop/pause/resume/next/previous to run a command.
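
For the curious, the general idea of mapping a recognised word to an action can be sketched as below. This is not the code from player.py; the handlers are placeholders you would supply yourself (in the real script they would presumably talk to the media player over dbus), and the function name is made up.

```python
COMMANDS = ("play", "stop", "pause", "resume", "next", "previous")


def make_dispatcher(handlers):
    """handlers maps a command word to a function taking no arguments."""
    def dispatch(hypothesis):
        # Run the handler for the first known command word in the text.
        for word in hypothesis.lower().split():
            if word in handlers:
                handlers[word]()
                return word
        return None
    return dispatch
```

Anything the recogniser returns that contains none of the command words is simply ignored, which helps when it mishears background noise.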

English only supported at the moment.

happy speaking

~mc_teo
Attached Files
File Type: zip player.zip (2.9 KB, 163 views)
  #10  
Old 2011-06-16, 18:40
Flandry's Avatar
Flandry Flandry is offline
 
Join Date: Oct 2009
Location: Boston
Posts: 1,559
Thanks!: 951
Thanked 1,786 Times in 648 Posts
Default Re: [devel] PocketSphinx for Fremantle (Speech Recognition Engine)

Good to see this getting some attention after it was passed over for the GSoC last year *.

A possibly less cumbersome alternative way for the curious to install is using fapman (choose "All packages (ADVANCED)" under Category Filters and then search for sphinx). You don't need any of the debug packages or the two Chinese model packages; install all the others. I did notice that the packages aren't optified, which means that with the available acoustic and language models you could eat up over 13MB of root space. Consider yourself warned. I don't have access to my Linux box to re-upload the packages with optification.

Worth a giggle if nothing else. With the provided large dictionary and language model the result of talking to your N900 is rather comical.

Edit: Removed command -- the default works fine.
__________________

Unofficial PR1.3/Meego 1.1 FAQ

***
Classic example of arbitrary Nokia decision making. Couldn't just fallback to the no brainer of tagging with lat/lon if network isn't accessible, could you Nokia?
MAME: an arcade in your pocket
Accelemymote: make your accelerometer more joy-ful

Last edited by Flandry; 2011-06-16 at 19:09.