Updated LangDetect Library (4x faster)

I’ve updated LangDetect (Language Detection Library for Java).

This update makes detection about 4x faster, based on improvement code posted by Elmer Garduno. Many thanks!!


Table: time for 100 detections on each test text (ms). Left: the previous version; right: the updated version.

lang prev updated
ar 937 812
ar 203 79
en 610 31
en 1656 93
fa 156 31
fa 219 62
fa 328 172
fa 344 203
fa 343 156
fa 266 94
fa 406 188
fa 375 156
fa 421 250
fa 547 390
fa 390 234
fa 390 203
fa 187 47
gu 78 31
gu 47 15
it 563 78
it 547 78
it 578 78
it 563 78
ja 110 15
ja 109 15
ja 125 31
zh-cn 78 16
hi 391 16
mr 719 47
ne 640 47
hi 1000 78
mr 296 15
ne 906 62
mr 203 16
mr 187 16
mr 266 47
mr 156 16
nl 1454 94
ru 406 93
tl 391 141
tr 1016 47
zh-tw 203 31
ur 235 141
ur 250 125
ur 265 172
sum 19560 4840
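
The post does not show how these timings were taken, so the following is only a minimal sketch of one way to time 100 detections per text and to work out the speedup from the sum row. The class name `DetectionBenchmark`, its methods, and the placeholder detector are all hypothetical; in real use the `Function<String, String>` would presumably wrap a langdetect `Detector`.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical timing harness; not the code used to produce the table above.
public class DetectionBenchmark {

    /** Calls the detector `repeats` times on each text; returns elapsed ms per text. */
    public static long[] timeDetections(Function<String, String> detector,
                                        List<String> texts, int repeats) {
        long[] elapsedMs = new long[texts.size()];
        for (int i = 0; i < texts.size(); i++) {
            String text = texts.get(i);
            long start = System.nanoTime();
            for (int r = 0; r < repeats; r++) {
                detector.apply(text);
            }
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return elapsedMs;
    }

    /** Ratio of the previous total time to the updated total time. */
    public static double speedup(long prevMs, long updatedMs) {
        return (double) prevMs / updatedMs;
    }

    public static void main(String[] args) {
        // Placeholder detector (assumption): replace with a call into the real library.
        Function<String, String> detector = text -> text.isEmpty() ? "unknown" : "en";

        List<String> texts = List.of("This is a test.", "Another short sample text.");
        long[] times = timeDetections(detector, texts, 100);
        for (int i = 0; i < times.length; i++) {
            System.out.println("text " + i + ": " + times[i] + " ms per 100 detections");
        }

        // Totals taken from the "sum" row of the table: 19560 ms vs 4840 ms.
        System.out.printf(Locale.ROOT, "speedup: %.2fx%n", speedup(19560, 4840));
    }
}
```

With the totals from the sum row, 19560 / 4840 is roughly 4.04, which is where the "4x" in the title comes from. Individual rows vary widely around that figure.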
This entry was posted in i18n, Java, Language Detection, NLP.

4 Responses to Updated LangDetect Library (4x faster)

  1. Hi!

    Thanks for sharing this wonderful library; it really does its job fast and seemingly accurately!

    I’ve used the built-in ability to train some new languages and noticed that the profiles I got were noticeably larger (~70 kB on average) than the ones supplied with the library (~30 kB on average).

    What could be the reason for this? Will it affect performance or quality?

    I tried for example this file: and got 77kB compared to 27kB for the one supplied with the library.

  2. shuyo says:

    Hi,
    That is because the bundled profiles were generated by an older version of langdetect.
    I expect the newer ones to make no difference or to give higher precision, but maybe not …

  3. fisya says:

    Hi Shuyo,

    May I know how to create our own language profiles?

  4. shuyo says:

    Please see “Generate language profile” on the Tools page of the langdetect project.
    http://code.google.com/p/language-detection/wiki/Tools

    http://code.google.com/p/language-detection/issues/detail?id=16
    Is this comment yours too?
