Research on the Methods and Key Techniques of Web Archive Oriented Social Media Information Collection

Xinping Huang
2021 Journal of Web Engineering  
Social media information collection and preservation is an active topic in the field of Web Archive. This paper compares the different social media information collection methods, analyzes in depth the key techniques of the three important parts of the information collection process (collection, evaluation and preservation), and proposes solutions to the problems in those techniques. Through analysis, the collection method suitable for the social media information is
... To address the restrictions that social websites impose on the call frequency of their APIs, the paper proposes using a multiplexing mechanism. It also uses the naive Bayesian algorithm to solve the spam filtering problem, and MongoDB-based distributed storage to store the massive volume of collected data.
doi:10.13052/jwe1540-9589.20812 fatcat:jyyztztuz5ehtnv22nidjq55lu
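
The abstract names the naive Bayesian algorithm as the paper's approach to spam filtering but gives no details. As an illustration only, the following is a minimal sketch of a generic multinomial naive Bayes text classifier with Laplace smoothing. It is not the paper's implementation: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the smoothing constant, and the four training posts are all assumptions made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real collector
    # would need language-aware segmentation for social media posts.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial naive Bayes filter with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        # log P(label) + sum of log P(token | label).
        # Requires at least one training post per class.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in tokenize(text):
            if tok in self.vocab:  # tokens never seen in training are skipped
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda lbl: self.log_score(text, lbl))


# Toy training data, invented for illustration.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free prize now", "spam")
spam_filter.train("free followers click here", "spam")
spam_filter.train("archive of the conference talk", "ham")
spam_filter.train("new paper on web archiving", "ham")

print(spam_filter.classify("free prize click"))
print(spam_filter.classify("paper on web archive"))
```

Scores are summed in log space so that long posts do not underflow to zero, and the smoothing term keeps a word seen in only one class from forcing the other class's probability to zero.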