Clustering of Countries for COVID-19 Cases based on Disease Prevalence, Health Systems and Environmental Indicators [article]

Syeda Amna Rizvi, Muhammad Umair, Muhammad Aamir Cheema
2021 medRxiv pre-print
ABSTRACT: The coronavirus has a high basic reproduction number (R0) and has caused the global COVID-19 pandemic. Governments are implementing lockdowns that are causing economic fallout in many countries. Policy makers can make better decisions if they are provided with the indicators associated with the spread of the disease. This study aims to cluster countries using social, economic, health, and environmental metrics that affect disease spread, so that policies to control the wide spread of the disease can be implemented. Countries with similar factors can then take proactive steps to fight the pandemic. Data are acquired for 79 countries, and 18 feature variables (factors associated with COVID-19 spread) are selected. Pearson product-moment correlation analysis is performed between each feature variable and, separately, cumulative deaths and cumulative confirmed cases, to gain insight into how these factors relate to the spread of COVID-19. The unsupervised k-means algorithm is used, with a feature set that includes economic and environmental indicators and disease prevalence along with COVID-19 variables. The model groups the countries into 4 clusters on the basis of all 18 feature variables. We also present an analysis of the correlation between the selected feature variables and COVID-19 confirmed cases and deaths. Prevalence of underlying diseases shows a strong correlation with COVID-19, whereas environmental health indicators are weakly correlated with COVID-19.
doi:10.1101/2021.02.15.21251762 fatcat:gwftrvg4pbhivk5dlcurv4gxca
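
The two analysis steps named in the abstract, Pearson correlation of each feature with a COVID-19 outcome and k-means clustering of countries into 4 groups, can be illustrated with a minimal sketch. This is not the authors' code. The data below are random placeholders with the same shape as the study (79 countries, 18 features), the z-score standardization step is an assumption not stated in the abstract, and the function names `pearson_with_target` and `kmeans` are hypothetical.

```python
import numpy as np


def pearson_with_target(X, y):
    """Pearson r between each column of X and the outcome vector y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc * yc[:, None]).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return num / den


def assign(X, centers):
    """Index of the nearest center (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct rows of X.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


# Placeholder data only: 79 "countries" x 18 "features", plus one outcome.
rng = np.random.default_rng(42)
features = rng.normal(size=(79, 18))
deaths = rng.normal(size=79)

# Step 1: correlation of every feature with the outcome.
r = pearson_with_target(features, deaths)

# Step 2 (assumed preprocessing): z-score features so no single scale dominates.
Z = (features - features.mean(axis=0)) / features.std(axis=0)

# Step 3: group the rows into 4 clusters, as in the abstract.
labels, centers = kmeans(Z, k=4)
```

With real indicator data in place of the random arrays, `r` would hold one correlation coefficient per feature and `labels` one cluster index per country. Because k-means depends on its initialization, different seeds can produce different groupings.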