Category Archives: i18n
Updated LangDetect Library (4x faster)
I’ve updated LangDetect (Language Detection Library for Java): http://code.google.com/p/language-detection/ Download the package “langdetect-01-24-2011.zip” from the Download List. This update makes detection 4x faster, thanks to improvement code posted by Elmer Garduno. Many thanks! Table: time taken to run detection 100 times on the test data (ms). … Continue reading
Posted in i18n, Java, Language Detection, NLP
4 Comments
Language Detection Plugin for Apache Nutch
I developed a Language Detection plugin for Apache Nutch using our LangDetect library. Download (bundled in the LangDetect library). Setup manual. Compatible with the standard language identification plugin of Nutch. Over 99% accuracy. Supports 49 languages: Afrikaans, Albanian, Arabic, Bengali, … Continue reading
Posted in i18n, NLP, Nutch, Plugin, Search Engine, text analysis
Language Detection Library for Java
I developed a Language Detection library for Java that can detect 49 languages in a given text (English, Japanese, Chinese, …). http://code.google.com/p/language-detection/ This library achieves over 99% accuracy on a news corpus (see the presentation below). I’ll try to substitute Apache … Continue reading
Posted in i18n, Java, NLP, text analysis
1 Comment