Can Social Media Data Be Utilized to Enhance Early Warning: Retrospective Analysis of the U.S. Covid-19 Pandemic [article]

Lingyao Li, Lei Gao, Jiayan Zhou, Zihui Ma, David Choy, Molly Hall
2021 medRxiv pre-print
The U.S. needs early warning systems to help it contain the spread of infectious diseases. Conventional early warning systems use lab-test results or dynamic records to signal early warning signs. Newer early warning systems can supplement these data with indicators of public awareness, such as news articles and search queries. This study explores the potential of social media data to enhance early warning of the COVID-19 outbreak. To demonstrate feasibility, it conducts a retrospective analysis of more than 14 million related Twitter postings from January 20 to March 10, 2020. With the aid of natural language processing tools and machine learning classifiers, the study classifies each of these tweets as either a signal or a non-signal. Here, a 'signal' tweet implies that the user recognized the COVID-19 outbreak risk in the U.S. The study then proposes a parameter, the 'signal ratio', to flag warning signs of the COVID-19 pandemic over successive periods. Results show that social media data and the signal ratio can detect the hazard ahead of the COVID-19 outbreak. This claim is validated with a lead time of 16 days through comparison with reference methods based on Google Trends or media news.
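The abstract does not define the signal ratio, so the following is only a minimal sketch under stated assumptions: the ratio is taken to be the share of tweets in a period (here, one day) that a classifier labeled as a signal, and a warning is flagged on the first day the ratio reaches a threshold. The function names, the daily period, the threshold value, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from datetime import date


def signal_ratio_by_day(labeled_tweets):
    """Return {day: signal_count / total_count} for (day, is_signal) pairs.

    Assumption: 'signal ratio' means the fraction of tweets in a period
    that were classified as signals. The paper may define it differently.
    """
    totals = defaultdict(int)
    signals = defaultdict(int)
    for day, is_signal in labeled_tweets:
        totals[day] += 1
        if is_signal:
            signals[day] += 1
    return {day: signals[day] / totals[day] for day in sorted(totals)}


def first_warning_day(ratios, threshold):
    """Return the earliest day whose ratio meets the (hypothetical) threshold."""
    for day in sorted(ratios):
        if ratios[day] >= threshold:
            return day
    return None


# Toy, hand-made labels for illustration only (not data from the study).
sample = [
    (date(2020, 1, 20), False),
    (date(2020, 1, 20), False),
    (date(2020, 1, 20), True),
    (date(2020, 1, 20), False),
    (date(2020, 1, 21), True),
    (date(2020, 1, 21), False),
    (date(2020, 1, 21), True),
    (date(2020, 1, 21), False),
]

ratios = signal_ratio_by_day(sample)
warning = first_warning_day(ratios, threshold=0.5)
```

In this sketch the classifier output is assumed to already exist as a boolean per tweet; comparing the flagged day against a reference date (for example, one derived from search-trend or news-based methods) would give a lead time in days.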
doi:10.1101/2021.04.11.21255285 fatcat:jrmwn566tnesbiatpeofuf5moq