Posts: 58 | Thanked: 65 times | Joined on Oct 2009 @ Finland
#358
First, huge thanks to @eber42 for making OKboard!

I'm working on a Finnish dictionary, but it's really hard to find quality corpora. I'm currently experimenting with Wikipedia-based and news-based ones, but it seems I need bigger and better corpora... Does anyone know any good sources?

I did manage to get the thing to build, but only by cruelly skipping the very last step, which fails the build and suggests using a bigger corpus. I wanted a proof of concept; I won't be skipping the test in the release version. There are problems, though. I cut the original word list in half, but I'm still getting a rather huge fi.tre file (anywhere from 12 MB to 30 MB), while predict-fi.db is 26 kB and predict-fi.ng is 813 kB. In comparison, the English en.tre is below two megabytes... As a result, the delay between the gesture and the word appearing is very noticeable, to put it mildly. What would be a good size to aim for?
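
For anyone who wants to experiment with the word list size, here is a minimal sketch of one way to trim a list by frequency. It is only an illustration and not necessarily how OKboard's build scripts work: `top_words`, `keep_fraction` and `min_count` are made-up names, and the input is assumed to be plain text already extracted from the corpus.

```python
import re
from collections import Counter


def top_words(text, keep_fraction=0.5, min_count=1):
    """Return the most frequent distinct words of `text`.

    Hypothetical helper, not part of OKboard: keeps `keep_fraction`
    of the distinct words that occur at least `min_count` times.
    """
    # [^\W\d_]+ matches runs of Unicode letters, so ä and ö are kept.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ranked = [(word, count) for word, count in ranked if count >= min_count]
    keep = max(1, int(len(ranked) * keep_fraction))
    return [word for word, _ in ranked[:keep]]


if __name__ == "__main__":
    sample = "talo talo talo auto auto kissa koira"
    print(top_words(sample))
```

Raising `min_count` should drop rare words (often typos or names) before the cut, which may shrink the list more usefully than halving it blindly.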

Thanks all!
 

The Following 2 Users Say Thank You to mattiviljanen For This Useful Post: