Assessing the Predictive Power of Online Social Media to Analyze COVID-19 Outbreaks in the 50 U.S. States

Jiachen Sun, Peter A. Gloor
2021 Future Internet  
As the coronavirus disease 2019 (COVID-19) continues to rage worldwide, the United States has become the most affected country, with more than 34.1 million total confirmed cases as of 1 June 2021. In this work, we investigate correlations between online social media activity and Internet searches about the COVID-19 pandemic across the 50 U.S. states. By collecting state-level daily trends from both Twitter and Google Trends, we observe a high but state-dependent lag correlation with the number of daily confirmed cases. We further find that the accuracy, measured by the correlation coefficient, is positively correlated with a state's demographics, air traffic volume, and GDP development. Most importantly, we show that a state's early infection rate is negatively correlated with the lag to the previous peak in Internet searches and tweeting about COVID-19, indicating that earlier collective awareness on Twitter/Google correlates with a lower infection rate. Lastly, we demonstrate that correlations between online social media and search trends are sensitive to time, mainly due to shifts in public attention.
doi:10.3390/fi13070184 fatcat:n337mjibwvew5jikqeowbtbzj4
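
The lag correlation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes a Pearson coefficient, the function names `pearson` and `best_lag` are hypothetical, and the two series are synthetic values made up for illustration rather than real Twitter, Google Trends, or case data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def best_lag(trend, cases, max_lag):
    """Return (lag, r): the shift in days, from 0 to max_lag, at which
    trend[t] correlates most strongly with cases[t + lag]."""
    if max_lag > len(trend) - 2:
        raise ValueError("max_lag leaves fewer than two overlapping points")
    best = (0, pearson(trend, cases))
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` trend values and the first `lag` case values
        # so that trend[t] is paired with cases[t + lag].
        r = pearson(trend[:-lag], cases[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic example (assumption): case counts echo the trend 3 days later.
trend = [0, 1, 3, 7, 4, 2, 1, 0, 0, 0, 0, 0]
cases = [0, 0, 0, 0, 1, 3, 7, 4, 2, 1, 0, 0]
lag, r = best_lag(trend, cases, max_lag=5)
print(lag, round(r, 3))
```

Running this per state would yield one lag and one coefficient per state, which is the kind of state-level quantity the abstract goes on to compare with demographics and early infection rates.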