Thread: [Announcement] Open source text prediction input plugin
rinigus
2018-12-12 , 10:03
Posts: 1,414 | Thanked: 7,547 times | Joined on Aug 2016 @ Estonia
#29
Profanity is an issue, and it would be great to get rid of it. I had the same problem when composing the database for English; a large fraction of the time was spent on that. I would suggest filtering the database and removing all n-grams that include any word classified as "bad". For that, we need a list of such words (possibly matched as substrings). That list would have to be provided by native speakers, though. Maybe such a list has already been compiled somewhere...
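A minimal sketch of the filtering idea above, assuming the n-grams are held in memory as a mapping from word tuples to counts. The actual database layout is not described in this post, so the names `filter_ngrams` and `is_bad` and the data shape are hypothetical and only illustrate the approach:

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed representation: an n-gram is a tuple of words, mapped to its count.
Ngram = Tuple[str, ...]


def is_bad(word: str, bad_words: Set[str], substrings: bool) -> bool:
    """Return True if `word` matches the blocklist (case-insensitive)."""
    w = word.lower()
    if substrings:
        # Substring mode: any blocklisted string inside the word counts.
        return any(b in w for b in bad_words)
    # Whole-word mode: the word itself must be on the blocklist.
    return w in bad_words


def filter_ngrams(
    counts: Dict[Ngram, int],
    bad_words: Iterable[str],
    substrings: bool = False,
) -> Dict[Ngram, int]:
    """Drop every n-gram that contains at least one blocklisted word."""
    # Normalize the list and skip empty entries, which would otherwise
    # match every word in substring mode.
    bad = {b.strip().lower() for b in bad_words if b.strip()}
    return {
        ngram: count
        for ngram, count in counts.items()
        if not any(is_bad(word, bad, substrings) for word in ngram)
    }


if __name__ == "__main__":
    # Toy data with a mild placeholder standing in for a real blocklist entry.
    counts = {
        ("this", "is"): 10,
        ("is", "darn"): 3,
        ("darned", "thing"): 2,
        ("darn",): 5,
        ("thing",): 7,
    }
    print(filter_ngrams(counts, ["darn"]))
    print(filter_ngrams(counts, ["darn"], substrings=True))
```

Substring matching also catches inflected forms, but it can remove harmless words that merely contain a listed string, so whole-word matching is probably the safer default until native speakers have reviewed the list.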
The Following 2 Users Say Thank You to rinigus For This Useful Post: FlyingAntero, juiceme