Assessing the Coverage of Data Collection Campaigns on Twitter: A Case Study

Vassilis Plachouras, Yannis Stavrakas, and Athanasios Andreou

Institute for the Management of Information Systems (IMIS), ATHENA Research Center, Artemidos 6 & Epidavrou, Maroussi 15125, Athens, Greece

vplachouras@imis.athena-innovation.gr, yannis@imis.athena-innovation.gr, athan.andreou@gmail.com

Abstract. Online social networks provide a unique opportunity to access and analyze the reactions of people as real-world events unfold. The quality of any analysis task, however, depends on the appropriateness and quality of the collected data. Hence, given the spontaneous nature of user-generated content, as well as the high speed and large volume of data, it is important to carefully define a data-collection campaign about a topic or an event, in order to maximize its coverage (recall). Motivated by the development of a social-network data management platform, in this work we evaluate the coverage of data collection campaigns on Twitter. Using an adaptive language model, we estimate the coverage of a campaign with respect to the total number of relevant tweets. Our findings support the development of adaptive methods to account for unexpected real-world developments, and hence, to increase the recall of the data collection processes.

Keywords: Social networks, data management, event tracking

LNCS 8186, p. 598 ff.