Posts: 293 | Thanked: 163 times | Joined on Jan 2012 @ beijing-islamabad
#71
When is Saera going to hit the repos, any idea?
Edit: taixzo, please answer this; I would like to know, as everyone else does.
I would love to test it on a regular basis, as there are new updates coming every day!

Last edited by imo; 2012-06-09 at 20:18.
 

The Following 4 Users Say Thank You to imo For This Useful Post:
Posts: 804 | Thanked: 1,598 times | Joined on Feb 2010 @ Gdynia, Poland
#72
Originally Posted by scoobydoo View Post
the only thing I haven't done is apt-get upgrade, as I was told this can cause issues
You are a very clever man. Never, ever run "apt-get upgrade" with extras-devel enabled: it can make your Maemo system unbootable, and you will then need to either reflash or restore a backup using backupmenu!

edit: ok, brick is not a good word here, you are right

Last edited by misiak; 2012-06-09 at 21:01. Reason: removed "brick"
 

The Following 2 Users Say Thank You to misiak For This Useful Post:
Posts: 1,523 | Thanked: 1,997 times | Joined on Jul 2011 @ not your mom's FOSS basement
#73
...up until now, the possibility for a "brick"* was almost solely given if marmisterz messed with extras-devel shortly before.
* the term "brick" isn't appropriate here (for the N900 anyway), and wrongfully used for ages now.
 

The Following 3 Users Say Thank You to don_falcone For This Useful Post:
Posts: 752 | Thanked: 2,808 times | Joined on Jan 2011 @ Czech Republic
#74
Originally Posted by taixzo View Post
That's more or less how I imagined writing a text. However, three things in that that I am still trying to figure out:
  1. Pocketsphinx uses a pre-trained grammatical model. This model apparently assigns a very low probability to multiple numbers being used in sequence, so it never seems to recognize a phone number. Even when I said the most distinctively pronounced number (seven) ten times, it only recognized four sevens. This is something I need to work on with the voice model, but I have been putting it off until I have enough time to recompile the model (maybe Sunday).
  2. Also, Pocketsphinx is not very good with names. This could possibly be alleviated by running a phoneme search on all contacts once the input is determined not to be a number.
  3. If the user is dictating a text, there needs to be some way to edit what they said. Ideally, this would also train the voice model. This is definitely possible, but I need to learn more about Pocketsphinx first.
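
As a rough illustration of the fallback in point 2, here is a minimal sketch. It swaps the phoneme search for plain string similarity from Python's standard difflib module, so it is not a phoneme search and not Saera's actual code. The function name match_contact, the contact list and the 0.6 cutoff are all assumptions made for the example.

```python
import difflib


def match_contact(heard, contacts, cutoff=0.6):
    """Return the contact name closest to the recognized text, or None.

    Hypothetical helper: string similarity stands in for a phoneme search.
    """
    # Treat input made only of digits (ignoring spaces, dashes and a
    # leading plus sign) as a phone number rather than a name.
    stripped = heard.replace(" ", "").replace("-", "").lstrip("+")
    if stripped.isdigit():
        return None

    # Compare case-insensitively but return the name as stored.
    lowered = {name.lower(): name for name in contacts}
    hits = difflib.get_close_matches(heard.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None
```

A misheard "jon smith" would then still resolve to a stored "John Smith", while a digit string is left for the number handling.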


I'm working on it.
I see. I plan to look into creating custom language models after my upcoming exams. I have found some tutorials on Voxforge and if it is not beyond my capabilities, I could then make a 'translation' of your app to my language (AFAIK there ain't any publicly available language models for Czech yet). Do you think it would be useful to include some kind of GUI to choose your own model as part of settings to your app?
 

The Following 2 Users Say Thank You to nodevel For This Useful Post:
Posts: 1,548 | Thanked: 7,510 times | Joined on Apr 2010 @ Czech Republic
#75
Just got an idea - if you want to add some location & navigation queries, I can quite easily add some CLI options to modRana for that.

Example queries that modRana could be used to handle:

Routing
Code:
find a route to X.
find a route from X.
find a route from X to Y. # not sure how to separate the addresses reliably
X & Y could be anything Google Directions can handle (address, coordinates, etc.).

Local search
Code:
Find me the nearest X
X could be anything that can be found using Google Local search, as that is the currently used backend. It handles ATMs, restaurants, pubs, stations, shops, car parks and many other amenities just fine.

Address search
Code:
Show me on the map where is X.
ModRana can also geocode a street-level address and show the results on the map (if there is just one result, it zooms right in on it; if there are more, it shows a list of results).

Wikipedia search
ModRana can also search Wikipedia articles with geographic coordinates, show the results on the map and provide a short description plus a clickable link to the full article. Not sure what a natural-sounding query for this would look like.

Showing current position on the map
Code:
Show me where I am.
Show me where I am on the map.
This would just start modRana and center to the current position.

The CLI arguments could look like this:
Code:
--route-to
--route-from
--local-search
--address-search
--wikipedia-search
--show-current-position
So, what do you think?
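
To make the proposal concrete, here is a minimal sketch of how a recognized sentence could be turned into the arguments listed above. The flags are only the ones proposed in this post, not an existing modRana interface, and it is an assumption that each flag takes the place or search term as its value and that --route-from and --route-to can be combined. The names RULES and query_to_args are hypothetical. The "from X to Y" case simply splits at the first " to ", so it will go wrong for addresses that themselves contain " to ".

```python
import re

# (pattern, builder) pairs for the example queries; order matters, because
# "from X to Y" has to be tried before the plain "from X" rule.
RULES = [
    (r"find a route from (.+?) to (.+)",
     lambda m: ["--route-from", m.group(1), "--route-to", m.group(2)]),
    (r"find a route from (.+)", lambda m: ["--route-from", m.group(1)]),
    (r"find a route to (.+)", lambda m: ["--route-to", m.group(1)]),
    (r"find me the nearest (.+)", lambda m: ["--local-search", m.group(1)]),
    (r"show me on the map where is (.+)", lambda m: ["--address-search", m.group(1)]),
    (r"show me where i am(?: on the map)?", lambda m: ["--show-current-position"]),
]


def query_to_args(query):
    """Map a recognized sentence to a list of proposed CLI arguments, or None."""
    # Drop surrounding whitespace and trailing sentence punctuation.
    text = query.strip().rstrip(".?!").strip()
    for pattern, build in RULES:
        match = re.fullmatch(pattern, text, flags=re.IGNORECASE)
        if match:
            return build(match)
    return None
```

The returned list could be appended to whatever command starts modRana; an unrecognized sentence gives None so the caller can fall through to other handlers.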
__________________
modRana: a flexible GPS navigation system
Mieru: a flexible manga and comic book reader
Universal Components - a solution for native looking yet component set independent QML applications (QtQuick Controls 2 & Silica supported as backends)
 

The Following 13 Users Say Thank You to MartinK For This Useful Post:
Posts: 1,417 | Thanked: 2,619 times | Joined on Jan 2011 @ Touring
#76
Since voice control is most critical and useful in hands-free applications, we need help integrating the answer button on both wired and Bluetooth headsets into Saera.
There seem to be existing plugins that do this for the wired headset and the media player, but the more sophisticated user on a motorcycle or bicycle will probably be using Bluetooth to send commands to the N900 and to voice dial.
Anyone with deep knowledge of wired and wireless headsets could greatly help the most promising new app for the N900.
 

The Following User Says Thank You to biketool For This Useful Post:
Posts: 752 | Thanked: 2,808 times | Joined on Jan 2011 @ Czech Republic
#77
This project is rather unique, but it might be good to look at some other open source speech projects (frontends to speech recognition software) too.

Some time ago, I heard about Simon. While its functionality is a bit different and I'm not even sure some of the tasks could be accomplished on the N900, it's worthwhile to look at the wiki and presentation:

http://www.youtube.com/watch?v=x_9ImaiOISs
EDIT: They had made a MeeGo version too, including support for voice dial and navigation! A blogpost with video here.

Regarding the text editing, I think it could be made the way Siri handles it (at least from what I've seen from this video). That means showing a simple text area as a confirmation.

Also, as you asked for user suggestions, it would be powerful to take advantage of online services such as GoogleCL (which reportedly works on Maemo), Google Search CLI or Google Translate CLI for that matter. Of course, I do not expect you to include this in your app directly, just to add support to trigger it and possibly list results.
modRana integration looks like a great idea, and because it uses the Google Maps API, it would make future translation an easy task, since Google recognizes different grammar (at least declension in the Czech language).

Last thing, some inspiration could come from the Apple TV ads. Since Siri asks a server for results, we will probably never be able to achieve all this functionality locally, but imagine how cool it'd be to show it off to friends together with the Apple ad.

Last edited by nodevel; 2012-06-09 at 23:01.
 

The Following 3 Users Say Thank You to nodevel For This Useful Post:
Posts: 804 | Thanked: 1,598 times | Joined on Feb 2010 @ Gdynia, Poland
#78
Originally Posted by don_falcone View Post
...up until now, the possibility for a "brick"* was almost solely given if marmisterz messed with extras-devel shortly before.
* the term "brick" isn't appropriate here (for the N900 anyway), and wrongfully used for ages now.
you are right.


Coming back to the topic, as for AI: have you looked at projects like FreeHAL, Howie or any other open source one from http://en.wikipedia.org/wiki/List_of_chatterbots or http://en.wikipedia.org/wiki/List_of...gence_projects (the latter also lists other types of AI, not only chatterbots)? I don't have much experience with open source AI. After some quick research, Howie seems to work well offline; unfortunately, I couldn't run FreeHAL without a network connection on Windows...
 

The Following 5 Users Say Thank You to misiak For This Useful Post:
Posts: 5,795 | Thanked: 3,151 times | Joined on Feb 2007 @ Agoura Hills Calif
#79
Originally Posted by misiak View Post
You are a very clever man, never ever ever run "apt-get upgrade" with extras-devel enabled, it can make your Maemo system unbootable so you will need to either reflash or restore backup using backupmenu!

edit: ok, brick is not a good word here, you are right
I have run apt-get upgrade literally thousands of times on the N900 with devel enabled. I have run into problems about once.
__________________
All I want is 40 acres, a mule, and Xterm.
 
Posts: 958 | Thanked: 3,426 times | Joined on Apr 2012
#80
Originally Posted by imo View Post
When is Saera going to hit the repos, any idea?
Edit: taixzo, please answer this; I would like to know, as everyone else does.
I would love to test it on a regular basis, as there are new updates coming every day!
Ok, since everyone keeps asking for this, I'll make a package tonight. But please understand it isn't working well enough yet that I would consider it repo-ready.

Originally Posted by nodevel
I see. I plan to look into creating custom language models after my upcoming exams. I have found some tutorials on Voxforge and if it is not beyond my capabilities, I could then make a 'translation' of your app to my language (AFAIK there ain't any publicly available language models for Czech yet). Do you think it would be useful to include some kind of GUI to choose your own model as part of settings to your app?
That would be great! I would love to have Saera run in other languages as well, although I would probably need help translating logic bits (e.g. setting times).

Originally Posted by MartinK
Just got an idea - if you want to add some location & navigation queries, I can quite easily add some CLI options to modRana for that.

Example queries that modRana could be used to handle:

Routing

Code:
find a route to X.
find a route from X.
find a route from X to Y. # not sure how to separate the addresses reliably
X & Y could be anything Google Directions can handle (address, coordinates, etc.).

Local search

Code:
Find me the nearest X
X could be anything that can be found using Google Local search, as that is the currently used backend. It handles ATMs, restaurants, pubs, stations, shops, car parks and many other amenities just fine.

Address search

Code:
Show me on the map where is X.
ModRana can also geocode a street-level address and show the results on the map (if there is just one result, it zooms right in on it; if there are more, it shows a list of results).

Wikipedia search
ModRana can also search Wikipedia articles with geographic coordinates, show the results on the map and provide a short description plus a clickable link to the full article. Not sure what a natural-sounding query for this would look like.

Showing current position on the map

Code:
Show me where I am.
Show me where I am on the map.
This would just start modRana and center to the current position.

The CLI arguments could look like this:

Code:
--route-to
--route-from
--local-search
--address-search
--wikipedia-search
--show-current-position
So, what do you think?
Wow, that would be great! I was planning to ask you about that. Also: is there any chance ModRana could be used to return an image (so that if one just searches nearby I could show it in-app rather than launching another)?

Originally Posted by nodevel
This project is rather unique, but it might be good to look at some other open source speech projects (frontends to speech recognition software) too.

Some time ago, I heard about Simon. While its functionality is a bit different and I'm not even sure some of the tasks could be accomplished on the N900, it's worth while to look at the wiki and presentation:

http://www.youtube.com/watch?v=x_9ImaiOISs
EDIT: They had made a MeeGo version too, including support for voice dial and navigation! A blogpost with video here.

Regarding the text editing, I think it could be made the way Siri handles it (at least from what I've seen from this video). That means showing a simple text area as a confirmation.

Also, as you asked for user suggestions, it would be powerful to take advantage of online services such as GoogleCL (which reportedly works on Maemo), Google Search CLI or Google Translate CLI for that matter. Of course, I do not expect you to include this in your app directly, just to add support to trigger it and possibly list results.
modRana integration looks like a great idea, and because it uses the Google Maps API, it would make future translation an easy task, since Google recognizes different grammar (at least declension in the Czech language).

Last thing, some inspiration could come from the Apple TV ads. Since Siri asks a server for results, we will probably never be able to achieve all this functionality locally, but imagine how cool it'd be to show it off to friends together with the Apple ad.
I was actually partly inspired by Simon for this project. I've used it, and it does have very good accuracy with a small set of commands. The problem is that a natural-language based program needs a pretty large wordlist. The WSJ wordlist could probably be pruned some, though.
Siri shows a text area for confirmation; this works, but not really for hands-free usage. I was thinking of something along the lines of Microsoft's Voice Recognition system, where you can say "Correct x" and it lets you correct the word by choosing from a numbered list.
Google and Wikipedia search are on the todo list. I will watch the new Apple ads (I only saw the first one).
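
A minimal sketch of that "Correct x" flow, for illustration only. It assumes the recognizer can supply a list of alternative hypotheses for the word; how Pocketsphinx would provide those is not covered here. The function names correction_menu and apply_correction are hypothetical.

```python
def correction_menu(alternatives):
    """Build the numbered list to show (or read out) after "Correct <word>"."""
    return ["%d. %s" % (i, alt) for i, alt in enumerate(alternatives, start=1)]


def apply_correction(words, target, alternatives, choice):
    """Replace the first occurrence of target with the chosen alternative.

    choice is the 1-based number the user picked from the menu.
    """
    if not 1 <= choice <= len(alternatives):
        raise ValueError("choice out of range")
    fixed = list(words)
    # list.index raises ValueError if the target word was never recognized.
    fixed[fixed.index(target)] = alternatives[choice - 1]
    return fixed
```

Because the user only has to say a number, the choice itself can be made by voice, which keeps the correction usable hands-free.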

Originally Posted by misiak
Coming back to the topic, as for AI: have you looked at projects like FreeHAL, Howie or any other open source one from http://en.wikipedia.org/wiki/List_of_chatterbots or http://en.wikipedia.org/wiki/List_of...gence_projects (the latter also lists other types of AI, not only chatterbots)? I don't have much experience with open source AI. After some quick research, Howie seems to work well offline; unfortunately, I couldn't run FreeHAL without a network connection on Windows...
I got Howie running, and tested it, but it doesn't seem very well suited to this (Howie keeps asking questions and IMO Saera should mostly provide answers, although I suppose I could rewrite the AIML; also, it takes Howie like ten seconds to start on a desktop, so I think that would be too slow for a phone.) I will look at the other ones.
 

The Following 11 Users Say Thank You to taixzo For This Useful Post:

Tags
saera, speech-to-text


 