maemo.org - Talk

Advanced text entry on Sailfish (Swype or similar) (https://talk.maemo.org/showthread.php?t=92764)

spidernik84 2016-01-25 21:19

Re: Advanced text entry on Sailfish (Swype or similar)
 
Quote:

Originally Posted by itdoesntmatt (Post 1496351)
Hello to you :) and thank you very much for your efforts!

Hi! No problem... it's getting more complicated than I thought :D
I get what you mean now. It's an option but this is up to Eber :)

I tried the various approaches suggested here. I'll try further with this:

Code:

tr '[:upper:]' '[:lower:]' < sanitized.list > sanitized.lower
tr -s '[:space:]' '\n' < sanitized.lower | sort -u > sanitized.uniq

So, essentially: lowercase everything, put one word per line, sort, remove duplicates.
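
For anyone who prefers to see those steps in one place, here is a rough Python sketch of the same idea. It is only an illustration, not part of the OKboard tools: `unique_lowercase_words` is a made-up name, and Python sorts by code point, so the order may differ from what `sort` gives under your locale.

```python
def unique_lowercase_words(text):
    """Lowercase the text, split it on whitespace, drop duplicates, sort."""
    # str.split() with no argument splits on any run of whitespace,
    # which plays the role of the tr step that puts one word per line.
    words = text.lower().split()
    # set() removes duplicates; sorted() orders by Unicode code point.
    return sorted(set(words))


if __name__ == "__main__":
    sample = "Ciao ciao\nRoma  casa Casa"
    for word in unique_lowercase_words(sample):
        print(word)
```

For a list with millions of entries this keeps everything in memory, so the shell pipeline is probably still the more practical choice.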

It is a bit better:
Code:

wc sanitized.uniq
 13615329  13615329 236920596 sanitized.uniq

I'll then fetch a list of Italian proper nouns (names of people and cities) and add them to the file, so as to preserve some basic capitalisation.

I'll try to generate the file again tomorrow. Any ideas or help with the dictionary are more than welcome :)

ljo 2016-01-25 22:53

Re: Advanced text entry on Sailfish (Swype or similar)
 
Quote:

Originally Posted by spidernik84 (Post 1496352)
1) it's getting more complicated than I thought :D

2) It is a bit better:
Code:

wc sanitized.uniq
 13615329  13615329 236920596 sanitized.uniq


1) it usually is.
2) Well, this is what I did yesterday:
Code:

aspell -l it dump master | aspell -l it expand | aspell -l it clean | sort | sort -uf | wc -l
8565009

which folds case but keeps as much of the capitalisation as possible.
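
As a rough illustration of what a case-insensitive unique step does, here is a small Python sketch. It is an approximation, not a reimplementation of `sort -uf`: the helper name `dedupe_fold_case` is made up, and it keeps the first spelling it sees, whereas in the real pipeline the surviving spelling depends on the order produced by the first `sort` and therefore on the locale.

```python
def dedupe_fold_case(words):
    """Keep one spelling per case-insensitive key (the first one seen)."""
    seen = {}
    for word in words:
        key = word.casefold()
        if key not in seen:
            # Remember the original spelling, capitalisation included.
            seen[key] = word
    # Sort without regard to case, like a folded sort would.
    return sorted(seen.values(), key=str.casefold)


if __name__ == "__main__":
    print(dedupe_fold_case(["Roma", "roma", "casa", "Casa", "oceano"]))
    # ['casa', 'oceano', 'Roma']
```

Compared with lowercasing everything first, "Roma" survives with its capital letter as long as the capitalised form comes first.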

eber42 2016-01-26 16:33

Re: Advanced text entry on Sailfish (Swype or similar)
 
Check out the latest commit: you can now provide your own dictionary file instead of using the aspell one. You just need to create a words-it.txt file in $CORPUS_DIR with one word per line.

Also, if your input corpus uses the word "dall'oceano" and your dictionary contains "dall" and "oceano" and not "dall'oceano", they will be handled as two different words.
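
To make that last point concrete, here is a minimal Python sketch. It is not the actual OKboard code: `split_token` is a hypothetical helper, and splitting only on the apostrophe is an assumption made for the example.

```python
def split_token(token, dictionary):
    """Return the dictionary words a corpus token is counted as.

    Hypothetical helper: a token found in the dictionary is kept whole,
    otherwise it is split at apostrophes.
    """
    if token in dictionary:
        return [token]
    return [part for part in token.split("'") if part]


if __name__ == "__main__":
    print(split_token("dall'oceano", {"dall", "oceano"}))
    # ['dall', 'oceano']
    print(split_token("dall'oceano", {"dall'oceano"}))
    # ["dall'oceano"]
```

So whether "dall'oceano" is learned as one word or two depends only on what the dictionary file contains.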

itdoesntmatt 2016-01-26 17:07

Re: Advanced text entry on Sailfish (Swype or similar)
 
Eber, sorry for my ignorance, but what do you mean by input corpus? Also, is it possible to show both "dall" and "dall'" when swiping d-a-l-l? Thanks.

PS: any news on a possible solution to keep the prediction bar from being hidden by the WhatsApp input field in Dalvik full-screen mode?

eber42 2016-01-26 18:00

Re: Advanced text entry on Sailfish (Swype or similar)
 
Quote:

Originally Posted by itdoesntmatt (Post 1496445)
Eber, sorry for my ignorance, but what do you mean by input corpus? Also, is it possible to show both "dall" and "dall'" when swiping d-a-l-l? Thanks.

You will have to input word separators manually for now. Handling this automatically needs a larger change (it is on the roadmap, but with no promise or time estimate).

Quote:

Originally Posted by itdoesntmatt (Post 1496445)
PS: any news on a possible solution to keep the prediction bar from being hidden by the WhatsApp input field in Dalvik full-screen mode?

If you mean the transparency issue, it is a compatibility issue with SFOS 2. As I'm not in a hurry to upgrade, I will try to make a fix that works with both versions :)

spidernik84 2016-01-26 19:42

Re: Advanced text entry on Sailfish (Swype or similar)
 
Quote:

Originally Posted by ljo (Post 1496366)
1) it usually is.
2) Well, this is what I did yesterday:
Code:

aspell -l it dump master | aspell -l it expand | aspell -l it clean | sort | sort -uf | wc -l
8565009

which folds case but keeps as much of the capitalisation as possible.

Thank you! I'll use your variant.
Back to crunching the numbers. Let's see if this time it goes through :)

spidernik84 2016-01-27 16:55

Re: Advanced text entry on Sailfish (Swype or similar)
 
Hello. The last attempt failed with an overflow, despite limiting the dictionary:

Code:

13566000
13567000
13568000
13569000
13570000
13571000
13572000
Traceback (most recent call last):
  File "/home/nicvol/okboard/db/../tools/loadkb.py", line 28, in <module>
    t.endLoad()
  File "/home/nicvol/okboard/tools/gribouille.py", line 120, in endLoad
    self._rec_load(self.tree)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 147, in _rec_load
    child_index = self._rec_load(child, pre + letter)
  File "/home/nicvol/okboard/tools/gribouille.py", line 133, in _rec_load
    self._write_node(index, letter = None, last_child = (nchilds == 0), payload = True, dest_index = self.cur_index)
  File "/home/nicvol/okboard/tools/gribouille.py", line 61, in _write_node
    if dest_index >= (1 << 24) - 10: raise Exception("overflow")
Exception: overflow
make: *** [it.tre] Error 1
+ rsync -av '*.tre' '*.db' '*.ng' '*.rpt.bz2' /home/nicvol/okboard/db/
sending incremental file list
rsync: link_stat "/media/storage/nicvol/corpus/work/*.tre" failed: No such file or directory (2)
rsync: link_stat "/media/storage/nicvol/corpus/work/*.db" failed: No such file or directory (2)
rsync: link_stat "/media/storage/nicvol/corpus/work/*.ng" failed: No such file or directory (2)
rsync: link_stat "/media/storage/nicvol/corpus/work/*.rpt.bz2" failed: No such file or directory (2)

sent 12 bytes  received 12 bytes  48.00 bytes/sec
total size is 0  speedup is 0.00
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1070) [sender=3.0.9]

Did anyone encounter this with the other languages?
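
For reference, the check shown in the traceback caps node indices just below 2^24, which suggests that node references are stored in 24 bits (an assumption based only on that one line of gribouille.py). The sketch below is not OKboard code: `count_trie_nodes` is a hypothetical helper that counts the nodes of a plain letter trie, to give a rough feel for how a word list relates to that limit.

```python
# Limit taken from the check shown in the traceback above.
INDEX_LIMIT = (1 << 24) - 10


def count_trie_nodes(words):
    """Count the nodes (excluding the root) of a plain letter trie.

    Hypothetical helper for illustration only; the real tree in
    gribouille.py may store nodes differently, for example with
    extra payload nodes.
    """
    root = {}
    count = 0
    for word in words:
        node = root
        for letter in word:
            if letter not in node:
                # A new letter under this prefix needs a new node.
                node[letter] = {}
                count += 1
            node = node[letter]
    return count


if __name__ == "__main__":
    sample = ["dall", "dalla", "oceano"]
    print(INDEX_LIMIT)               # 16777206
    print(count_trie_nodes(sample))  # 11
```

Words that share a prefix share nodes, but each distinct word still needs at least one node of its own, so a list of more than 13 million words does not leave much room under a limit of about 16.7 million.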

mautz 2016-01-27 17:25

Re: Advanced text entry on Sailfish (Swype or similar)
 
What is the size of your dictionary file and how many words does it contain?

spidernik84 2016-01-27 17:33

Re: Advanced text entry on Sailfish (Swype or similar)
 
There we go!

Code:

#cat words-it.txt | wc -l
13572262

#ls -lrth
-rw-rw-r-- 1 nico nico  41M Jan 23 21:35 corpus-it.txt
-rw-rw-r-- 1 nico nico 226M Jan 26 21:12 words-it.txt


mautz 2016-01-27 17:43

Re: Advanced text entry on Sailfish (Swype or similar)
 
Your corpus file is way too small. How many sentences does it include? On the other hand, your dictionary is way too big. Even if it does compile, OKboard will crash with such a huge dictionary.

My corpus has a file size of about 200 MB and contains around 2,000,000 sentences.
My dictionary has a size of nearly 1 MB and contains around 100,000 words. I tried a dictionary with 17 million words (the size was around 30 MB, I think) and OKboard crashed every time I started it.
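
If it helps to compare files, here is a small Python sketch that reports the same figures as `wc -l` and `ls -l` in one go. The helper name `word_list_stats` is made up for the example, and it assumes a file with one word per line.

```python
import os


def word_list_stats(path):
    """Return (word_count, size_in_bytes, average_bytes_per_word)."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        # Count non-empty lines; each one is assumed to hold one word.
        count = sum(1 for line in handle if line.strip())
    average = size / count if count else 0.0
    return count, size, average


if __name__ == "__main__":
    # Assumes words-it.txt is in the current directory.
    count, size, average = word_list_stats("words-it.txt")
    print(f"{count} words, {size} bytes, {average:.1f} bytes per word")
```

With the figures posted above, 226 MB for 13,572,262 lines works out to roughly 17 bytes per line, against roughly 10 bytes per word for a 1 MB list of 100,000 words.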


All times are GMT.