RSS-Crawler Enhancement for Blogosphere-Mapping

Justus Bross, Patrick Hennig, Philipp Berger, Christoph Meinel
2010 International Journal of Advanced Computer Science and Applications  
The massive adoption of social media has provided new ways for individuals to express their opinions online. The blogosphere, an inherent part of this trend, contains a vast array of information about a variety of topics. It is a huge think tank that creates an enormous and ever-changing archive of open-source intelligence. Mining and modeling this vast pool of data to extract, exploit and describe meaningful knowledge in order to leverage structures and dynamics of emerging networks within the blogosphere is the higher-level aim of the research presented here. Our proprietary development of a tailor-made feed-crawler framework meets exactly this need. While the main concept, as well as the basic techniques and implementation details of the crawler, have already been dealt with in earlier publications, this paper focuses on several recent optimization efforts made on the crawler framework that proved to be crucial for the performance of the overall framework.
doi:10.14569/ijacsa.2010.010209 fatcat:freiiorcwza5bpuaxjnr3uiuuu