I think there is no difference between Voicy and Saera when it comes to making them do anything useful. Saera does all the processing in one file, while Voicy separates it into different files, one for each app to control. This separation only makes it easier for a user to add or remove controls. The actual piece of Python code needed to control something is the same.
From my perspective there aren't many situations where voice control for a phone is really useful (I am not talking about fun things like Chuck Norris news etc.), but at least there are some.
My list goes as follows:
1: navit, because it's better to concentrate on driving and to keep the hands off the phone
2: accept/reject a phone call while driving
3: dial a number while driving
4: control the mp3-player (for example in bed)
5: take a picture
Any suggestions for other useful things?
Well, something I'm still trying to get Saera to do is read me my email - very useful when I want to catch up on stuff while in the car. I've got that working for the N900, but I can't find documentation on how the N9 stores emails.
i think the goal is similar, but a separation is needed.
voicy, as it comes across, is good for app-specific functionality. it is by definition easier to implement functionality for a given app because you know what that app can do. app devs could include a config/description file with their app to add support.
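To make the description-file idea concrete, here is a minimal sketch in Python. The INI layout, the section names and the `mediaplayer` commands are all made up for illustration; this is not an existing Voicy format.

```python
import configparser

# Hypothetical description file an app could ship. The section and key
# names are invented for this sketch, not an existing Voicy format.
EXAMPLE_DESCRIPTION = """
[app]
name = mediaplayer

[commands]
play = mediaplayer --play
pause = mediaplayer --pause
next track = mediaplayer --next
"""


def load_commands(text):
    """Return (app name, {spoken phrase: command line}) from a description."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    app_name = parser["app"]["name"]
    # Each key in [commands] is the spoken phrase, each value the command line.
    commands = dict(parser["commands"])
    return app_name, commands


if __name__ == "__main__":
    name, commands = load_commands(EXAMPLE_DESCRIPTION)
    print(name, commands)
```

With something like this, adding or removing an app's voice controls would only mean adding or deleting one file.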
saera tries to be siri. in this case its role should be to interpret what you want and decide what to launch. this comes down to lexical analysis (i think that's the right term), trying to work out what the user wants to do. take a weather query: it needs to detect what, where and when, and pass this on to an app or to voicy to execute the request.
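a rough sketch of the what/where/when idea for the weather example, in Python. the word lists and the "words after 'in' are the place" rule are assumptions for illustration only; a real interpreter would need far more than keyword matching.

```python
import re

# Toy vocabulary, assumed for illustration only.
TOPICS = ("weather", "rain", "temperature")
TIMES = ("today", "tonight", "tomorrow")


def parse_query(text):
    """Pull the what/where/when slots out of a recognised sentence."""
    words = re.findall(r"[a-z']+", text.lower())
    what = next((w for w in words if w in TOPICS), None)
    # Default to "today" when no time word is spoken (an assumption).
    when = next((w for w in words if w in TIMES), "today")
    # Treat the words after "in" as the place, stopping at a time word.
    where = None
    if "in" in words:
        place = []
        for w in words[words.index("in") + 1:]:
            if w in TIMES:
                break
            place.append(w)
        where = " ".join(place) or None
    return {"what": what, "where": where, "when": when}


if __name__ == "__main__":
    print(parse_query("What is the weather in New York tomorrow?"))
```

the resulting dictionary is what would be handed on to an app or to voicy.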
While I get what you mean, I think Saera is aiming to be *more* than a Siri nowadays. While Siri pretends to be AI, it in fact uses many "shortcuts" to look clever, without real AI routines.
Saera, OTOH, is - at least for me - mainly an AI experiment, and that is what keeps me interested in the project. Frankly, I have no need for a "voice command monkey" for launching "apps".
/Estel
Absolutely true Estel, Voicy is nothing more than a small dumb "voice command monkey".
I, like all of us, would honestly LOVE to see Saera having true AI and being more than Siri, which - as you said - only pretends to be AI. To be honest, from my perspective this goal is far out of reach. The best AI system built so far is IBM's "Watson". It won the quiz show Jeopardy!. When you compare the hardware of the N900 with Watson's hardware, with a total of 2880 POWER7 processor cores and 16 terabytes of RAM, capable of processing 500 gigabytes per second, you know what I mean. Its hardware is far bigger even than that of IBM's supercomputer "Deep Blue", the famous chess computer and the first one to beat a reigning world chess champion.
And even Watson is not true AI. It is 'just' a very big system with a huge database, driven by software that runs clever statistical algorithms in a massively parallel fashion.
Well, something I'm still trying to get Saera to do is read me my email - very useful when I want to catch up on stuff while in the car.
I've got that working for the N900, but I can't find documentation on how the N9 stores emails...
And this is the epic Harmattan email thread, where he still occasionally helps end-users: http://talk.maemo.org/showthread.php?t=78480&page=92
He's the only Harmattan dev that helped end-users for a reasonable period of time.
In fact, he went way beyond reasonable & into admirable territory...
This sounds perfect - I am looking for an application that can record and listen to voice commands and map these commands to a customizable command-line execution. That is all that is needed, actually - anyone can take it from there to whatever they want.
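For what it's worth, the mapping part is small. A minimal sketch in Python, assuming the speech recogniser already hands over the recognised phrase as text; the phrases and commands in `COMMANDS` are placeholders a user would edit:

```python
import shlex
import subprocess

# Example mapping; phrases and commands are placeholders a user would edit.
COMMANDS = {
    "say hello": "echo hello",
    "show date": "date",
}


def command_for(phrase, mapping=COMMANDS):
    """Return the argument list for a recognised phrase, or None."""
    line = mapping.get(phrase.strip().lower())
    # shlex.split handles quoting the same way a shell would.
    return shlex.split(line) if line else None


def run_phrase(phrase, mapping=COMMANDS):
    """Execute the mapped command and return its exit code, or None."""
    argv = command_for(phrase, mapping)
    if argv is None:
        return None
    return subprocess.run(argv).returncode


if __name__ == "__main__":
    run_phrase("say hello")
```

Passing an argument list to `subprocess.run` instead of a shell string keeps a misrecognised phrase from being interpreted by the shell.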
I absolutely agree, and by no means did I intend to discourage your work on Voicy. Furthermore, I see great benefits in joining forces, to allow feasible "switching" to (and merging with, when necessary) a "voice monkey", to make Saera more efficient in practice.
The thing I had in mind - and it seems to be the consensus here - is to not make it *only* a "voice monkey", abandoning the AI aspects altogether. Like you, I have no hopes for it to become "true-true" AI in the cybernetic sense. All uses of "AI" here are to be understood as "semi AI".
Still, I think that expanding the AI (as "AI-ish" as we can get with the hardware in question and the software in reach) behind Saera is what makes it a very, very interesting project. Adding features of Voicy can only make it better (if we're not going to abandon the backend "AI" altogether for the sake of usefulness only).
The program only uses a limited set of very distinct commands at a time, but it changes from one set to another, depending on the current foreground task.
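The switching described here can be sketched in a few lines of Python. The task names and command words below are invented for illustration and are not taken from the actual program:

```python
# Hypothetical task names and command words, chosen only for illustration.
COMMAND_SETS = {
    "phone": {"accept", "reject"},
    "mediaplayer": {"play", "pause", "next", "previous"},
    "camera": {"shoot"},
}


def active_commands(foreground_task):
    """Return the command set that is valid for the current foreground task."""
    return COMMAND_SETS.get(foreground_task, set())


def match_command(word, foreground_task):
    """Return the word if it is valid in the current mode, else None."""
    return word if word in active_commands(foreground_task) else None


if __name__ == "__main__":
    print(match_command("play", "mediaplayer"))
    print(match_command("play", "camera"))
```

Keeping each set small and distinct is the point: the recogniser only ever has to choose between a handful of words.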
That's great, especially if you let pocketsphinx know the current mode's reduced vocabulary & grammar.
Pocketsphinx is sadly utterly useless in its current state for general dictation, especially for non-English languages. But for deciding between a much smaller vocabulary (e.g. numbers, a dozen commands) it's actually quite OK.
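One way to hand pocketsphinx a reduced vocabulary is a small JSGF grammar per mode. The Python sketch below only builds the grammar text; the grammar and rule names are assumptions, and how the grammar gets loaded depends on the pocketsphinx version and bindings in use, so that part is left out.

```python
def jsgf_grammar(name, phrases):
    """Build a JSGF grammar whose only rule is a choice between phrases."""
    # Sort so the output is stable regardless of the input order.
    alternatives = " | ".join(sorted(phrases))
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <command> = {alternatives};\n"
    )


if __name__ == "__main__":
    # Hypothetical command words for a media player mode.
    print(jsgf_grammar("player", ["play", "pause", "next", "previous"]))
```

Regenerating this text whenever the mode changes keeps the recogniser restricted to the dozen or so commands that make sense at that moment.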