Matko Boanjak, Eduardo Oliveira, José Martins, Eduarda Mendes Rodrigues, Luís Sarmento
2012 Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion  
In this paper we describe TwitterEcho, an open source Twitter crawler for supporting this kind of research, which is characterized by a modular distributed architecture.  ...  Our crawler enables researchers to continuously collect data from particular user communities, while respecting Twitter's imposed limits.  ...  ACKNOWLEDGMENTS The authors would like to thank Jorge Texeira, Gustavo Laboreiro, and Arian Pasquali for valuable discussions and technical support.  ... 
doi:10.1145/2187980.2188266 dblp:conf/www/BoanjakOMRS12 fatcat:nahcnzkgybdjrat2vc63kcoh4a

Loklak - A Distributed Crawler and Data Harvester for Overcoming Rate Limits [article]

Sudheesh Singanamalla, Michael Peter Christen
2017 arXiv   pre-print
In this paper we describe Loklak, an open source distributed peer to peer crawler and scraper for supporting such research on platforms like Twitter, Weibo and other social networks.  ...  Our crawler enables researchers to continuously collect data while overcoming the barriers of authentication and rate limits imposed to provide a repository of open data as a service.  ...  TwitterEcho [1] proposed a distributed focused crawler to support open research with twitter data and talks about using web crawlers to collect the required information from such social networks.  ... 
arXiv:1704.03624v1 fatcat:6akxa2rjcjg73onr6742td47ae

Open challenges for data stream mining research

Georg Krempl, Myra Spiliopoulou, Jerzy Stefanowski, Indre Žliobaite, Dariusz Brzeziński, Eyke Hüllermeier, Mark Last, Vincent Lemaire, Tino Noack, Ammar Shaker, Sonja Sievi
2014 SIGKDD Explorations  
Our goal is to identify gaps between current research and meaningful applications, highlight open problems, and define new application-relevant research directions for data stream mining.  ...  This article presents a discussion on eight open challenges for data stream mining.  ...  Acknowledgments We would like to thank the participants of the RealStream2013 workshop at ECMLPKDD2013 in Prague, and in particular Bernhard Pfahringer and George Forman, for suggestions and discussions  ... 
doi:10.1145/2674026.2674028 fatcat:y3bozzeohveibgxb5wmiwfcogm

Data collection in a social network with weighted seed selection and data analysis based on rule-based methods

Changhyun Byun
For these reasons, Twitter enables researchers and data analyzers to access a variety of data in Twitter by providing Application Programming Interface (API).  ...  This allows us, as well as other researchers and data seekers, to build their own Twitter dataset.  ...  First of all, it would be more efficient to retrieve data for a focused community of interest, starting with multiple seeds. However, the program does not support this function.  ... 
doi:10.13016/m2bm67 fatcat:adnkftheyjcy5gumbstfratlgy

A study of aspect-based sentiment analysis in social media

Youngsub Han
In this research, we analyzed Twitter to discover characteristics of social media. This study is intended to address these topics to build a better understanding of Twitter usages.  ...  Therefore, by using the active audience concept, and relying on marketing literature, we chose a grounded theory approach and presented research questions for in-depth understanding of Twitter usage in  ...  Apache Hadoop is an open-source software for data processing and management with distributed manners [103].  ... 
doi:10.13016/m24m91c1b fatcat:tbw4f7yqengydadg6i3xqhtm2y