    [Announcement]Open source text prediction input plugin

    Page 1 of 4 | 1   2     3   | Next | Last
    martonmiklos | # 1 | 2018-03-22, 21:52 | Report

    Hello all,

    We have started a new sailfishos_keyboard development team at https://github.com/sailfish-keyboard. The team is happy to announce a new version of the presage-based alternative text prediction solution.

    Currently we offer the following keyboards for downloading at openrepos:
    https://openrepos.net/content/sailfi...ext-prediction
    https://openrepos.net/content/sailfi...nput-predictor
    https://openrepos.net/content/sailfi...ext-prediction
    https://openrepos.net/content/sailfi...nput-predictor

    If you are using a community supported language pack or a ported device you should definitely try it out.

    If you cannot find your language in the supported list, please read through this documentation on how to add support for your language:
    * https://github.com/sailfish-keyboard...ased-predictor
    * https://github.com/sailfish-keyboard...utils/keyboard

    If you are having problems with the packaging or distribution at openrepos, let us know (preferably in the form of a GitHub issue); we are more than happy to help.

    In this release we have mainly focused on performance improvements. We are satisfied with the results, thanks to moving the internal database format from SQLite to MarisaDb and to the asynchronous prediction handling.

    Other than the performance improvements we have implemented the following:

    * Added a Hunspell-based prediction engine: if you mistype a word, the predictor will offer the corrected version in the predictions. (The plugin does not do any auto-correction, it just predicts the corrected word.)
    * Added the ability to forget learned words: presage learns what you type to be able to make more accurate predictions. Now you can delete mistyped words that have been learned from your text input by long-tapping on the predictions. (The forget feature does not work on words coming from the preinstalled database, but that is on our TODO list.)
    * We have renamed the plugin to make the naming more consistent, got rid of the external library dependencies by statically linking to them, and removed some unnecessary configuration packages.

    **Important message to the users of the former revision**
    Because of the structural changes please follow the instructions below before installing the new plugin:

    * Deselect presage keyboards in Settings/Text input/Keyboards
    * Uninstall all presage packages from the command line (pkcon or zypper). Use zypper se presage to see which packages are installed
    * Refresh repositories (pkcon refresh or zypper ref)
    * Reboot
    * Enable sailfish_keyboard repository at OpenRepos
    * Install keyboard(s) of interest
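
    The command-line part of the steps above can be sketched as a small shell helper. This is only a sketch under assumptions: it expects the usual zypper se table layout (status in the first column, starting with "i" for installed packages, package name in the second column), and the function names installed_names and plan_cleanup are made up for illustration. It prints the commands for review instead of running them.

```shell
#!/bin/bash
# Read `zypper se` table output on stdin and print the names of installed packages.
# Assumption: rows look like "i | name | summary | package".
installed_names() {
    awk -F'|' '$1 ~ /^i/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }'
}

# Print (do not run) the removal and refresh commands for the packages found on stdin.
plan_cleanup() {
    local pkgs
    pkgs=$(installed_names | tr '\n' ' ')
    pkgs=${pkgs% }                      # drop the trailing space left by tr
    if [ -n "$pkgs" ]; then
        echo "zypper rm $pkgs"
    fi
    echo "zypper ref"
}
```

    On the device this could be fed from the search itself, for example zypper se presage | plan_cleanup as root. Deselecting the keyboards in Settings and the reboot still have to be done by hand.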

    Please report back if you have any issues or problems!

    Wishing you happy Sailing,
    @ljo, @martonmiklos, and @rinigus


     
    lal | # 2 | 2018-03-23, 06:02 | Report

    Working great. This now completes the SFOS port I am using.

    A suggestion/request/expected behaviour
    If I place the cursor on a word that is already typed and decide to replace it with one of the words suggested by presage, the new word gets inserted where the cursor is rather than replacing the existing word. I recall that was the same behaviour in Jolla C with xt9 as well. Could that be fixed?

    Expected/desired behaviour is, when the cursor is placed on an already typed word and one of the suggested words is selected, the existing word gets replaced completely with the suggested word.


     
    ljo | # 3 | 2018-03-23, 06:14 | Report

    Originally Posted by lal View Post
    Working great. This now completes the SFOS port I am using.
    Great to hear!

    Originally Posted by lal View Post
    A suggestion/request/expected behaviour
    If I place the cursor on a word that is already typed and decide to replace it with one of the words suggested by presage, the new word gets inserted where the cursor is rather than replacing the existing word. I recall that was the same behaviour in Jolla C with xt9 as well. Could that be fixed?

    Expected/desired behaviour is, when the cursor is placed on an already typed word and one of the suggested words is selected, the existing word gets replaced completely with the suggested word.
    Thanks, yes, this is on my list of fixes to do.


     
    rinigus | # 4 | 2018-03-31, 13:34 | Report

    I have packaged a Russian keyboard using the ngrams distributed at http://ruscorpora.ru/corpora-freq.html . In addition, the hunspell dictionary was converted to UTF-8, as needed. Not sure whether these are the best ngrams available, but give it a try. If someone gets a better frequency distribution, please feel free to suggest improvements.

    I think Russian would benefit heavily from proper Unicode support in presage. Right now, Presage doesn't know how to normalize Russian letters (lowercase or otherwise). So we would love to get an enthusiastic ICU specialist who has time to work on Unicode support in Presage. I'll be happy to help with the database parts and normalization, but have rather limited time to do it all. In addition, my primary languages work quite well already.


     
    suicidal_orange | # 5 | 2018-06-25, 10:55 | Report

    First, good work on this - I'm using it in preference to the paid prediction because I can.

    I've made a keyboard for Colemak which doesn't make much sense given it's designed for keyboards where edge keys (A and ; on QWERTY) are easy to hit which isn't the case on a screen, but it works so may as well put it out there.

    As there is no international version of Colemak and it's set to use English predictions not sure if it should be called en-colemak-presage or colemak-presage - any preference?

    Next step will be to research/find a thumb-optimised layout and en_GB-ise the dictionary - will the script handle a non-four-character language, or should it just be called en_GB?


     
    rinigus | # 6 | 2018-06-25, 17:53 | Report

    Originally Posted by suicidal_orange View Post
    First, good work on this - I'm using it in preference to the paid prediction because I can.

    I've made a keyboard for Colemak which doesn't make much sense given it's designed for keyboards where edge keys (A and ; on QWERTY) are easy to hit which isn't the case on a screen, but it works so may as well put it out there.

    As there is no international version of Colemak and it's set to use English predictions not sure if it should be called en-colemak-presage or colemak-presage - any preference?

    Next step will be to research/find a thumb-optimised layout and en_GB-ise the dictionary - will the script handle a non-four-character language, or should it just be called en_GB?
    Great to hear that you are working on it. I am a bit surprised by the rather low adoption of this work and the absence of contributions for other languages/keyboards. But let's see how it progresses in the future.

    There are 3 packages that are needed for full support.

    Package 1: keyboard

    This is named keyboard-presage-<YOUR_OWN_CODE, usually in the form en_US>-0.1.0-1.noarch.rpm (version info may vary)

    For example, download and examine the contents of one of the keyboard RPMs. As you will see, it has a languageCode in the .conf file of the keyboard definition. This code has to match the presage and hunspell dictionaries. If you want to go for UK, use en_GB.

    The script will probably not be too happy about you trying to use other dict names. So I would suggest hacking the script and changing the keyboard RPM content by hand. Please see the README at https://github.com/sailfish-keyboard...utils/keyboard
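
    A quick way to check that a keyboard's languageCode matches the dictionaries could look like the sketch below. It assumes the .conf file contains a plain languageCode=<code> line; the function names conf_language_code and check_language_code are hypothetical and not part of the project's scripts.

```shell
#!/bin/bash
# Print the languageCode value from a keyboard .conf file.
# Assumption: the file has a line of the form "languageCode=<code>".
conf_language_code() {
    sed -n 's/^[[:space:]]*languageCode[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Succeed only if the .conf languageCode equals the code used by the
# presage and hunspell dictionary packages (e.g. en_GB).
check_language_code() {
    local conf=$1 expected=$2 actual
    actual=$(conf_language_code "$conf")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $conf uses $expected"
    else
        echo "mismatch: $conf has '$actual', expected '$expected'" >&2
        return 1
    fi
}
```

    For example, check_language_code colemak.conf en_GB would fail if the .conf still says en_US (colemak.conf is just an example file name).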


    Package 2: hunspell dict

    You will have to provide a hunspell dictionary. This can be downloaded from somewhere and will have to be converted to UTF-8. See the readme at https://github.com/sailfish-keyboard/presage, last section
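
    The UTF-8 conversion could be done roughly as follows. This is a sketch and not necessarily the procedure from the readme: it assumes you know the dictionary's original encoding (it is normally declared on the SET line of the .aff file), and the function names dic_to_utf8 and aff_to_utf8 are made up for illustration.

```shell
#!/bin/bash
# Re-encode a Hunspell .dic file to UTF-8.
# $1 = source encoding (e.g. ISO-8859-1), $2 = input file, $3 = output file
dic_to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Re-encode a Hunspell .aff file and rewrite its SET line so that the
# declared encoding matches the new content.
aff_to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" | sed 's/^SET .*/SET UTF-8/' > "$3"
}
```

    Usage would be along the lines of aff_to_utf8 ISO-8859-1 en_GB.aff out/en_GB.aff followed by dic_to_utf8 ISO-8859-1 en_GB.dic out/en_GB.dic, with the file names adjusted to the dictionary at hand.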


    Package 3: presage n-gram

    That will require a text corpus to teach presage. Note that we are looking for something large: the more text the merrier, with context similar to mobile use. You may wish to filter profanity, but that is a rather non-trivial problem.

    For help on generating the n-gram database, see https://github.com/sailfish-keyboard...ased-predictor .

    I don't remember whether there was a freely available en_GB corpus, though.
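
    To give an idea of what such a database is built from, here is a toy bigram counter in shell. It is only an illustration of n-gram counting, not the project's tooling, and the function name count_bigrams is made up. It is also ASCII-only, since tr works on bytes, so it would not be suitable for languages like Russian.

```shell
#!/bin/bash
# Read text on stdin and print "count word1 word2" for every pair of
# adjacent words, most frequent first (ties sorted alphabetically).
count_bigrams() {
    tr '[:upper:]' '[:lower:]' |        # normalize case
    tr -cs '[:alpha:]' '\n' |           # one word per line
    awk 'NF { if (prev != "") c[prev " " $0]++; prev = $0 }
         END { for (k in c) print c[k], k }' |
    sort -k1,1nr -k2
}
```

    Running echo "the cat and the cat" | count_bigrams lists "the cat" with a count of 2, followed by "and the" and "cat and" with a count of 1 each. A real corpus has to be far larger for the counts to be useful for prediction.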

    Good luck!


     
    rob_kouw | # 7 | 2018-06-26, 15:23 | Report

    Do we get an option to read all the First Names and Details (Business names) from the People app, and include them in the predicted words?


     
    rinigus | # 8 | 2018-06-26, 20:21 | Report

    Originally Posted by rob_kouw View Post
    Do we get an option to read all the First Names and Details (Business names) from the People app, and include them in the predicted words?
    From presage's point of view, it's possible to add more predictors. That's the way the current learning is implemented. However, I don't know whether anyone has been working on this or other features. So, if you wish to see it and know how to program, I can only encourage you to work on it. We will all be happy to help and reply to your queries.


     
    suicidal_orange | # 9 | 2018-06-27, 11:50 | Report

    As rinigus suggested the script for packaging keyboards wasn't very happy with what I'm trying to do, so I had to modify it. In doing so I found the .spec needed changing too, but that made it inflexible...

    My solution (which I'm sure is far from best practice) is to write the .spec file from the script with modified name and descriptions based on an optional 6th argument, resulting in a file called "keyboard-presage-colemak-en_US-1.0.0-1.noarch.rpm"

    Not sure anyone's interested in alternate layouts but if you are here it is. Criticism welcome, I want to learn

    Code:
    #!/bin/bash
    
    set -e
    
    PROGPATH=$(dirname "$0")
    
    if [ "$#" -lt 5 ]; then
        echo "Usage: $0 Language langcode version keyboard.qml keyboard.conf [layout name]"
        echo
        echo "Language: Specify language in English starting with the capital letter, ex 'Estonian'"
        echo "langcode: Specify language code, ex 'en_US'. Use the same notation as Hunspell dictionaries."
        echo "version: Version of the language package, ex '1.0.0'"
        echo "keyboard.qml: Keyboard QML file"
        echo "keyboard.conf: Keyboard Configuration file referencing the QML file"
        echo "layout name: Optional, for alternate layouts (BÉPO, Colemak, Dvorak...)"
        echo
        echo "When finished, the keyboard support will be packaged into RPM in the current directory"
        echo
        echo "The script requires rpmbuild to be installed. Note that rpmbuild can be installed on distributions that don't use RPM for packaging"
        echo
        exit 1  # a wrong number of arguments is an error
    fi
    
    L=$1
    CODE=$2
    VERSION=$3
    KQML=$4
    KCONF=$5
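    # Optional 6th argument: layout name, inserted into the package name as "-<layout>"
    LAYOUT=""
    if [ -n "${6:-}" ]; then
        LAYOUT="-$6"
    fi
    
    NAME=keyboard-presage$LAYOUT-$CODE
    # For the Summary/description text: drop the leading "-" and prefix a space;
    # stays empty when no layout name was given
    LAYOUT="${LAYOUT:+ ${LAYOUT:1}}"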
    
    TMPDIR=$(mktemp -d)
    
    mkdir -p $TMPDIR/$NAME-$VERSION/keyboard
    mkdir -p $TMPDIR/$NAME-$VERSION/rpm
    cp "$KQML" $TMPDIR/$NAME-$VERSION/keyboard
    cp "$KCONF" $TMPDIR/$NAME-$VERSION/keyboard
    
    echo "# Template for generation of keyboard RPMs
    # for Presage on Sailfish. This template is used
    # by package-keyboard.sh script
    
    # Prevent brp-python-bytecompile from running.
    %define __os_install_post %{___build_post}
    
    # \"Harbour RPM packages should not provide anything.\"
    %define __provides_exclude_from ^%{_datadir}/.*$
    
    Name: "$NAME"
    Version: __version__
    Release: 1
    Summary: Keyboard layout for"$LAYOUT" __Language__ with Presage support
    License: MIT
    URL: https://github.com/martonmiklos/sailfishos-presage-predictor
    Source: %{name}-%{version}.tar.xz
    BuildArch: noarch
    Requires: presage-lang-__langcode__
    Requires: hunspell-lang-__langcode__
    Requires: maliit-plugin-presage
    
    %description
    Keyboard layout for"$LAYOUT" __Language__ language with Presage text predictions
    
    %prep
    %setup -q
    
    %install
    mkdir -p %{buildroot}/usr/share/maliit/plugins/com/jolla/layouts
    cp -r keyboard/* %{buildroot}/usr/share/maliit/plugins/com/jolla/layouts
    
    %files
    %defattr(-,root,root,-)
    %{_datadir}/maliit/plugins/com/jolla/layouts" > $TMPDIR/$NAME-$VERSION/rpm/$NAME.spec
    
    sed -i "s/__langcode__/$CODE/"  $TMPDIR/$NAME-$VERSION/rpm/$NAME.spec
    sed -i "s/__Language__/$L/"  $TMPDIR/$NAME-$VERSION/rpm/$NAME.spec
    sed -i "s/__version__/$VERSION/"  $TMPDIR/$NAME-$VERSION/rpm/$NAME.spec
    
    tar -C $TMPDIR -cJf $TMPDIR/$NAME-$VERSION.tar.xz $NAME-$VERSION
    
    mkdir -p $HOME/rpmbuild/SOURCES
    mkdir -p $HOME/rpmbuild/SPECS
    
    cp $TMPDIR/$NAME-$VERSION.tar.xz $HOME/rpmbuild/SOURCES
    cp $TMPDIR/$NAME-$VERSION/rpm/$NAME.spec $HOME/rpmbuild/SPECS
    
    rm -rf $TMPDIR
    
    rm -rf $HOME/rpmbuild/BUILD/$NAME-$VERSION
    rpmbuild -ba --nodeps $HOME/rpmbuild/SPECS/$NAME.spec
    
    # Copy the built RPM into the current directory
    cp $HOME/rpmbuild/RPMS/noarch/$NAME-$VERSION-*.rpm .
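
    The naming rule the script applies can be tried out on its own with the sketch below. The function package_name is hypothetical and only mirrors how the script composes the file name, assuming Release stays at 1 as in the .spec above.

```shell
#!/bin/bash
# Compose the RPM file name the packaging script is expected to produce.
# $1 = language code, $2 = version, $3 = optional layout name
package_name() {
    local code=$1 version=$2 layout=${3:-}
    # "-<layout>" is inserted only when a layout name was given
    echo "keyboard-presage${layout:+-$layout}-$code-$version-1.noarch.rpm"
}

# Example invocation of the script itself (file names are placeholders):
#   ./package-keyboard.sh English en_US 1.0.0 colemak.qml colemak.conf colemak
```

    With a layout name, package_name en_US 1.0.0 colemak gives keyboard-presage-colemak-en_US-1.0.0-1.noarch.rpm; without one, the layout part is simply left out.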


     
    FlyingAntero | # 10 | 2018-11-28, 14:01 | Report

    I installed https://openrepos.net/content/sailfi...nput-predictor on my X Compact (using the official patched image from Xperia X) and it is working like a charm. However, Swedish is my second language. Can anyone help make a layout for Finnish?

    I have found a database of Finnish words in UTF-8 format on GitHub:
    • https://github.com/hugovk/everyfinnishword

    Also, the layout is the same for Finnish and Swedish. Is it possible to just change the database?


     