BluesLee
Posts: 411 | Thanked: 1,105 times | Joined on Jan 2010 @ Europe
#1
Some months ago a friend was really impressed by Google Now, a kind of answering machine on Android, more or less equivalent to Apple's Siri or whatever it is called. Looking at some demonstration videos on YouTube, I saw everything I needed to get the shell script below finished. By the way, it works pretty well for me. Feel free to use it, modify and optimize it, or print and eat it to become one with the code.

About Mee42
Essentially the shell script works as follows: Record audio from mic & convert to flac format -> Google Voice API for speech recognition -> do some text manipulations here and there -> get an answer from wiki.answers.com -> read the answer using espeak.
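
To make the "text manipulations" step concrete, here is a minimal sketch of the two string operations involved: pulling the recognized phrase out of the recognizer's reply and turning it into an answers.com request URL. The JSON in sample_response is a made-up sample of the shape the script's sed patterns expect, not a captured reply, and the helper names extract_utterance and build_request_url are hypothetical.

```shell
#!/bin/sh
# Made-up sample reply (assumption): only the utterance":"...","confidence
# part matters for the sed patterns used in the script.
sample_response='{"status":0,"id":"x","hypotheses":[{"utterance":"who is michael jordan","confidence":0.9}]}'

# Keep only the text between utterance":" and ","confidence
# (same two sed patterns as in the script, applied to one argument).
extract_utterance()
{
    printf '%s\n' "$1" | sed -e 's/.*utterance":"//' -e 's/","confidence.*//'
}

# Replace every space with an underscore and append the result to the
# answers.com base URL, as the script does for English questions.
build_request_url()
{
    printf 'http://wiki.answers.com/Q/%s\n' \
        "$(printf '%s\n' "$1" | sed -e 's/ /_/g')"
}

question=$(extract_utterance "${sample_response}")
echo "${question}"
build_request_url "${question}"
```

With the sample above, the first line printed is the bare question and the second is the URL with underscores instead of spaces.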

Security aspects
Sending a text phrase to answers.com is one thing, sending your individual voice samples to Google's cloud is another. Just be aware of what you are doing. It's your data.

About GUI discussions, N900/N9 versions etc
My aim was to share the script as a proof of concept so that the community can profit from it as soon as possible. Taixzo, who develops Saera, a Siri clone with a GUI which is available for the N900, integrates the essential ideas into his app. The best thing to do now is to support him by voting for Saera in 2012's Coding competition. With an N9/N950 in his hands, the chances of seeing a Harmattan port of Saera are even better.

Changelog
  • Changelog version 0.3.1 (25.09.2012), mee42_v0.3.1.sh:
    • Fixed potential issue when N9's GSM/UMTS network is used: Now the answer file ${file_prefix}.html is handled correctly when line breaks are missing.
    • Minor code changes like 'about' information.
  • Changelog version 0.3 (24.09.2012), mee42_v0.3.sh:
    • Added voice feedback before/after recording (thanks to bingomion for requesting this)
    • Added voice command to exit the loop: 'exit', 'quit' or 'stop' exits the loop now (again thanks to bingomion for requesting this)
  • Changelog version 0.2 (23.09.2012), mee42_v0.2.sh:
    • Renamed 42^infinity to Mee42
    • Runs now on the N9
    • Replaced arecord + flac with gst-launch.
    • Got rid of the vi(m) command.
    • Several minor changes for code readability.
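
As an illustration of the 0.3.1 item about missing line breaks, the sketch below runs the script's grep/sed/cut chain on an HTML page that arrives as one single line. The HTML is a made-up sample and extract_answer is a hypothetical helper name; this only shows why the chain does not depend on the meta tag sitting on its own line, not what exactly changed in 0.3.1.

```shell
#!/bin/sh
# Made-up page (assumption): all meta tags on one line, no line breaks.
sample_html='<html><head><meta property="og:title" content="Who is Michael Jordan"/><meta property="og:description" content="He is a famous basketball player."/><meta property="og:type" content="article"/></head></html>'

# Read HTML on stdin and print the content of the og:description tag.
# sed drops everything up to and including 'og:description', so the
# remaining text starts with: " content="<answer>"...
# Splitting on double quotes then leaves the answer in field 3.
extract_answer()
{
    grep 'og:description' | head -1 \
        | sed -e 's/.*og:description//' | cut -d'"' -f3
}

printf '%s\n' "${sample_html}" | extract_answer
```

Because sed removes the whole prefix of the line before cut splits it, tags that precede or follow og:description on the same line do not end up in the answer.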

Output of a single (!) terminal session
Code:
~/bin $ ./mee42_v0.3.1.sh
-----------------------------------------------------------------------------
Mee42 is a shell script which records your voice and tries to answer your
question using Google Voice API for speech recognition, answers.com for db of
answers and espeak for speech synthesis. CopyStraightForward blueslee@tmo.

Recording sound input.................
Speech2txt via Google Voice...........who is Michael Jordan
Getting answer from answers.com.......He is a famous basketball player from the NBA (NationalBasketball Association). He is known for his glory days with the Chicago Bulls.
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........the capital of Tonga
Getting answer from answers.com.......Nuku'alofa is the capital of Tonga. .
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........what is the biggest prime number
Getting answer from answers.com.......There is no biggest prime number. The largest prime known at present is 243112609 - 1, a number with nearly 13 million decimal digits.
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........what is the best smartphone
Getting answer from answers.com.......Iphone_and_Blackberry:D
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........#### you
Getting answer from answers.com.......Something went wrong.
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........what is 42
Getting answer from answers.com.......42 is the answer to life, the universe, and everything. (From the book Hitch hikers guide to the galaxy.) EDIT: aww you beat me to it...
Txt2speech via espeak.................

Recording sound input.................
Speech2txt via Google Voice...........exit
Shell Script Version V0.3.1
Code:

#!/bin/sh
#
# Mee42 - Siri/Google Now for the poor man ;-)  
#
# CopyStraightForward blueslee@tmo
#
# What you need: gst-launch or alternatively arecord + flac, wget, espeak

# Choose your language here, only English is supported so far
# (the HTML structure of answers from *.answers.com is different)
lang="en"
if [ "${lang}" = 'en' ]; then 
   lang_flag='wiki'
else 
   lang_flag=${lang}
fi

# Maximum speech recording time in sec
speech_rec_time=4                       #used for arecord
speech_num_buffers="$(( 50 * ${speech_rec_time} ))"  	#used for gstreamer 

# Default sample rate for flac. Use 48000 if you are using arecord + flac 
# as flac's --sample-rate=16000 does not work for me  
sample_rate=16000

# Just a shortcut for Google's Voice API Url
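# Double quotes are needed so that ${lang} is expanded; the URL is built in
# two steps so that it contains no line break or spaces
api_url="http://www.google.com/speech-api/v1/recognize"
api_url="${api_url}?lang=${lang}-${lang}&client=chromium"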

# Temporary files are named as below with the corresponding suffix
file_prefix='file_temp'
rm -f ${file_prefix}* 


about()
{
cat << EOF
-----------------------------------------------------------------------------
Mee42 is a shell script which records your voice and tries to answer your 
question using Google Voice API for speech recognition, answers.com for db of 
answers and espeak for speech synthesis. CopyStraightForward blueslee@tmo.

EOF
}


# Record voice via gst-launch, alternatively you can use arecord and flac
record_voice()
{
    gst-launch-0.10 pulsesrc num-buffers=${speech_num_buffers} ! audioconvert\
     ! audioresample ! audiorate ! audio/x-raw-int,rate=${sample_rate} ! flacenc\
     ! filesink location=${file_prefix}.flac 1>/dev/null 2>&1

    #arecord -f dat -d ${speech_rec_time} ${file_prefix}.wav 1>/dev/null 2>&1
    #flac --compression-level-5 ${file_prefix}.wav 1>/dev/null 2>&1
    #Remark: sox on N900 doesn't work as below, i.e. it has no flac support
    #sox ${file_prefix}.wav ${file_prefix}.flac rate 16k gain -n -5 silence 1 5 2%
}


# Speech2Txt uses Google Voice API
speech2txt()
{
    wget -q -U "Mozilla/5.0" --post-file "${file_prefix}.flac" \
     --header="Content-Type: audio/x-flac; rate=${sample_rate}" \
     -O - ${api_url} > "${file_prefix}.ret"
    # Keep only the text between utterance":" and ","confidence
    sed -e 's/.*utterance":"//' -e 's/","confidence.*//' \
        "${file_prefix}.ret" > "${file_prefix}.txt"
}


# Get answer from answers.com
get_answer()
{
    # We are replacing ' ' by '_' as a typical request is handled in the http 
    # header, for instance http://wiki.answers.com/Q/who_is_michael_jordan
    sed -e 's/ /_/g' ${file_prefix}.txt > ${file_prefix}2.txt

    request_url="http://${lang_flag}.answers.com/Q/"`cat ${file_prefix}2.txt`
    wget ${request_url} --output-document=${file_prefix}.html 1>/dev/null 2>&1
   
    # The first occurrence of 'og:description' contains our answer
    grep 'og:description' ${file_prefix}.html | head -1\
        | sed -e 's/.*og:description//' | cut -d"\"" -f3,3 > ${file_prefix}.txt
}


# Sanity check if something went wrong, i.e. we have no or wrong answer
error_handling()
{
    errorcheck=`cat ${file_prefix}.txt | cut -c1-4`
    if [ "${errorcheck}" = "http" -o "${errorcheck}" = "" ]; then
       echo 'Something went wrong.' > ${file_prefix}.txt
    fi
}


# Txt2speech via espeak
txt2speech()
{
    espeak -v${lang} -f ${file_prefix}.txt 1>/dev/null 2>&1
}


# Cleanup of temporary files
cleanup()
{
    rm -f ${file_prefix}* 
}


clear
about

while true; 
do 	
    espeak -v${lang} "Ask your question or say exit." 1>/dev/null 2>&1
    printf 'Recording sound input.................'
    record_voice
    espeak -v${lang} "Thank you." 1>/dev/null 2>&1
    sleep 1
    echo

    printf 'Speech2txt via Google Voice...........'
    speech2txt; cat ${file_prefix}.txt

    # Leave the while loop if the user says 'exit' || 'quit' || 'stop'
    exit_code=`cat ${file_prefix}.txt`
    if [ "${exit_code}" = "exit" -o "${exit_code}" = "quit"\
                                 -o "${exit_code}" = "stop" ]; then
        cleanup
        exit 0
    fi

    
    printf 'Getting answer from answers.com.......'   
    get_answer
    error_handling 
    cat ${file_prefix}.txt

    printf 'Txt2speech via espeak.................' 
    txt2speech
    echo; echo
   
    cleanup
done
Installation
  • N9
    You will need to install the Debian packages mentioned in the script; I hope the list below is complete:
    • wget should be available from Nokia's default repository, install it via 'apt-get install wget'
    • Download and install gstreamer0.10-tools from Harmattan's repository using 'dpkg -i' on your device.
    • Save the shell script to your N9 as mee42.sh, do a 'chmod u+x mee42.sh' and run it via './mee42.sh'.
  • N900
    • Should be easier, as all Debian packages should be available in extras. Note that you also need to install gstreamer0.10-flac.
  • Desktop
    • Try to switch to alsasrc for recording with gst-launch if pulsesrc is not available.
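
Before the first run it may help to check that the tools named in the script header are actually installed. The following is a minimal sketch; missing_tools is a hypothetical helper name, and the list of commands is taken from the script as posted (gst-launch-0.10, wget, espeak).

```shell
#!/bin/sh
# Print every command from the argument list that is not found in PATH,
# one name per line. Prints nothing if all of them are available.
missing_tools()
{
    for tool in "$@"; do
        command -v "${tool}" >/dev/null 2>&1 || printf '%s\n' "${tool}"
    done
}

missing=$(missing_tools gst-launch-0.10 wget espeak)
if [ -n "${missing}" ]; then
    # Unquoted on purpose: puts the names on a single line
    echo "Missing tools:" ${missing}
else
    echo "All tools found."
fi
```

If you use arecord + flac instead of gst-launch, pass those names to missing_tools instead.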


As always have fun.

Last edited by BluesLee; 2012-09-29 at 19:01. Reason: Security Aspects
 

The Following 38 Users Say Thank You to BluesLee For This Useful Post:
Posts: 293 | Thanked: 163 times | Joined on Jan 2012 @ beijing-islamabad
#2
cool yaaaaaaaaaaaaaar !
Could this somehow become a part of Saera?
 

The Following 2 Users Say Thank You to imo For This Useful Post:
nicholes
Posts: 1,103 | Thanked: 368 times | Joined on Oct 2010 @ india, indore
#3
This looks hard for a noob like me

So what should I do? Should I wait for some detailed, step-by-step instructions?

Or is someone going to make a deb for it?
 

The Following 2 Users Say Thank You to nicholes For This Useful Post:
ibrakalifa
Posts: 1,583 | Thanked: 1,203 times | Joined on Dec 2011 @ Everywhere
#4
so we need an icon for n9, and deb file too,

awesome
 

The Following 2 Users Say Thank You to ibrakalifa For This Useful Post:
Posts: 958 | Thanked: 3,426 times | Joined on Apr 2012
#5
What exactly does this line do:
Code:
vim -c ':/description' -c ':.-1' -c ":1,. d" -c ":.+1" -c ":.,$ d" -c ":wq!" ${file_prefix}.html
I'm trying to implement this in Python.
 

The Following 3 Users Say Thank You to taixzo For This Useful Post:
Posts: 543 | Thanked: 151 times | Joined on Feb 2010 @ Germany
#6
Great work
 

The Following User Says Thank You to Crogge For This Useful Post:
Posts: 235 | Thanked: 86 times | Joined on Dec 2010
#7
Originally Posted by taixzo View Post
What exactly does this line do:
Code:
vim -c ':/description' -c ':.-1' -c ":1,. d" -c ":.+1" -c ":.,$ d" -c ":wq!" ${file_prefix}.html
I'm trying to implement this in Python.
-c ':/description' => find 'description'
-c ':.-1' => goto previous line
-c ":1,. d" => delete all lines from beginning to current line (1 line before 'description')
-c ":.+1" => goto next line
-c ":.,$ d" => delete all lines from the current line to the end
-c ":wq!" => save

basically it extracts the expected answer from the result HTML using 'description' as keyword
 

The Following 5 Users Say Thank You to figaro For This Useful Post:
BluesLee
Posts: 411 | Thanked: 1,105 times | Joined on Jan 2010 @ Europe
#8
Originally Posted by taixzo View Post
What exactly does this line do:
Code:
vim -c ':/description' -c ':.-1' -c ":1,. d" -c ":.+1" -c ":.,$ d" -c ":wq!" ${file_prefix}.html
I'm trying to implement this in Python.
Essentially, it searches for the first occurrence of a line containing "description" in ${file_prefix}.html, deletes all other lines and saves the file. Replacing it by
Code:
grep "description" ${file_prefix}.html | head -1
makes the usage of vi(m) redundant, and it should also be much faster. For details see the answer of figaro.
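
A quick way to see what the grep replacement returns is to run it on a few made-up lines. In this sketch the sample text and the helper name first_description_line are hypothetical. Note one difference from the vim call: vim rewrites the file in place, while grep writes the matching line to standard output, so the result has to be redirected or piped onwards.

```shell
#!/bin/sh
# Made-up input (assumption): two lines contain "description".
sample=$(printf '%s\n' \
    '<title>x</title>' \
    '<meta name="description" content="first">' \
    '<p>filler</p>' \
    '<meta name="description" content="second">')

# Read text on stdin and print only the first line containing "description".
first_description_line()
{
    grep "description" | head -1
}

printf '%s\n' "${sample}" | first_description_line
```

Only the line with content="first" is printed; the later match is dropped by head -1.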

Last edited by BluesLee; 2012-09-23 at 06:14.
 

The Following 4 Users Say Thank You to BluesLee For This Useful Post:
bingomion
Posts: 528 | Thanked: 345 times | Joined on Aug 2010 @ MLB.AU
#9
Clever little script!!
 

The Following User Says Thank You to bingomion For This Useful Post:
www.rzr.online.fr
Posts: 1,348 | Thanked: 1,863 times | Joined on Jan 2009 @ fr/35/rennes
#10
Have you tested it on Harmattan? Let me suggest renaming it to something less enigmatic, e.g. google-speech-ui.
Last edited by www.rzr.online.fr; 2012-09-23 at 14:57.
 

The Following User Says Thank You to www.rzr.online.fr For This Useful Post: