Computational Linguistics in the Netherlands Journal 4 (2014) Submitted 06/2014; Published 12/2014

Gender Recognition on Dutch Tweets

Hans van Halteren
Nander Speerstra
Radboud University Nijmegen, CLS, Linguistics

Abstract

In this paper, we investigate gender recognition on Dutch Twitter material, using a corpus consisting of the full Tweet production (as far as present in the TwiNL data set) of 600 users (known to be human individuals) over 2011 and [...]. We experimented with several authorship profiling techniques and various recognition features, using Tweet text only, in order to determine how well they could distinguish between male and female authors of Tweets.
Later, in 2004, the group collected a Blog Authorship Corpus (BAC; (Schler et al.
With lexical N-grams, they reached an accuracy of 67.7%, which the combination with the sociolinguistic features increased to 72.33%. [...] (2011) attempted to recognize gender in tweets from a whole set of languages, using word and character N-grams as features for machine learning with Support Vector Machines (SVM), Naive Bayes and Balanced Winnow2.
Their highest score when using just text features was 75.5%, testing on all the tweets by each author (with a training set of 3.3 million tweets and a test set of about 418,000 tweets). [...] (2012) used SVMlight to classify gender on Nigerian Twitter accounts, with tweets in English, with a minimum of 50 tweets.
They used lexical features and presented a very good breakdown of various word types.
When using all user tweets, they reached an accuracy of 88.0%.
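To make the notion of word and character N-gram features mentioned above more concrete, the following is a minimal sketch of how such features might be extracted from a tweet. It is not the implementation used in any of the cited studies; the function names, the lowercasing, the whitespace tokenization, and the chosen N-gram orders are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return word N-grams of order n (assumption: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return character N-grams of order n over the lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def ngram_features(text, word_orders=(1, 2), char_orders=(3,)):
    """Count word and character N-grams in one text.

    Keys are prefixed with 'w<n>:' or 'c<n>:' so that word and character
    N-grams of different orders cannot collide. The default orders are
    illustrative assumptions, not values taken from the cited work.
    """
    feats = Counter()
    for n in word_orders:
        feats.update(f"w{n}:{g}" for g in word_ngrams(text, n))
    for n in char_orders:
        feats.update(f"c{n}:{g}" for g in char_ngrams(text, n))
    return feats
```

Count vectors of this kind, computed per tweet or aggregated per author, could then serve as input to a classifier such as an SVM or Naive Bayes.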