
Classification of Twitter Accounts into Automated Agents and Human Users


If you have a question about this talk, please contact Liang Wang.

Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. The massive human user base and business-supportive operating model of social networks such as Twitter facilitate the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge the accuracy of the procedure we calculate agreement among human annotators as well as with a bot detection tool. We then apply a Random Forests classifier that achieves an accuracy close to or surpassing human agreement. Finally, we perform tests to measure the efficacy of our results.
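As a rough illustration of two of the steps mentioned in the abstract, the sketch below assigns accounts to one of four popularity bands and computes agreement between two annotators. It is not the speakers' implementation. The follower-count thresholds, the function names, and the choice of Cohen's kappa as the agreement measure are all assumptions made for this example; the abstract does not state how the bands are defined or which agreement statistic is used.

```python
from collections import Counter

# Hypothetical band boundaries by follower count (not taken from the talk).
# Band 0: < 1k, band 1: < 100k, band 2: < 1M, band 3: >= 1M.
BAND_UPPER_BOUNDS = [1_000, 100_000, 1_000_000]


def popularity_band(followers: int) -> int:
    """Return the index (0 to 3) of the popularity band for a follower count."""
    for band, upper in enumerate(BAND_UPPER_BOUNDS):
        if followers < upper:
            return band
    return len(BAND_UPPER_BOUNDS)


def cohen_kappa(labels_a, labels_b) -> float:
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of accounts on which both annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Example: two annotators labelling the same four accounts.
annotator_1 = ["automated", "automated", "human", "human"]
annotator_2 = ["automated", "human", "human", "human"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Computing agreement separately within each band would give a per-band human baseline against which a classifier's accuracy could be compared, which is one plausible reading of "close to or surpassing human agreement".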

This talk is part of the Computer Laboratory Systems Research Group Seminar series.


