Category Archives: twitter
Estimation of ldig (twitter Language Detection) for LIGA dataset
LIGA [Tromp+ 11] is a twitter language detection method for 6 languages (German, English, Spanish, French, Italian and Dutch). It uses a graph of 3-grams to capture long-distance features and reports 95-98% accuracy. The authors publish their dataset here, which has 9066 tweets, … Continue reading
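To make the "graph of 3-grams" idea a little more concrete, here is a minimal sketch, assuming that nodes are character 3-grams and edges are transitions between consecutive 3-grams. The function names `trigrams` and `build_graph` are hypothetical, and this is only an illustration of the data structure, not the LIGA authors' implementation or its scoring.

```python
from collections import Counter

def trigrams(text):
    """Return the overlapping character 3-grams of text, in order."""
    return [text[i:i + 3] for i in range(len(text) - 2)]

def build_graph(text):
    """Count 3-gram nodes and transitions between consecutive 3-grams.

    Assumption: an edge links each 3-gram to the one that follows it,
    which is how order information beyond a single 3-gram is kept.
    """
    grams = trigrams(text)
    nodes = Counter(grams)
    edges = Counter(zip(grams, grams[1:]))
    return nodes, edges

nodes, edges = build_graph("hello")
print(sorted(nodes))   # the 3-grams seen in the text
print(sorted(edges))   # pairs of consecutive 3-grams
```

A detector built on this would keep one such graph per language and compare a tweet's nodes and edges against each of them.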
Posted in Language Detection, NLP, twitter
twitter replaces the string '\u2028' with '\u2070'
I posted a tweet about Unicode's line feed codes that included the string '\u2028'. It was then replaced with '\u2070'! Since not only '\u2028' (LINE SEPARATOR) but also '\u2029' (PARAGRAPH SEPARATOR) is treated this way, twitter seems to intend to do something (awful?) for line feed … Continue reading
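For reference, the code points mentioned above can be inspected with Python's standard `unicodedata` module. This is only a small sketch of what the characters are and how Python treats them as line boundaries; it says nothing about what twitter does internally.

```python
import unicodedata

# Look up the official Unicode names of the three code points.
names = {ch: unicodedata.name(ch) for ch in ("\u2028", "\u2029", "\u2070")}
for ch, name in names.items():
    print(f"U+{ord(ch):04X} {name}")

# str.splitlines() treats both separators as line boundaries.
text = "first\u2028second\u2029third"
parts = text.splitlines()
print(parts)
```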
Posted in twitter