Data Science Analysis and Profile Representation applied to Secondary Prevention of Acute Coronary Syndrome

Antonio Garcia-Garcia, Ignacio Prieto-Egido, Alicia Guerrero-Curieses, Juan Ramon Feijoo-Martinez, Sergio Munoz-Romero, Sergio Manzano-Fernandez, Pedro Jose Flores-Blanco, Jose Luis Rojo-Alvarez, Andres Martinez-Fernandez
2021 IEEE Access  
The analysis of large amounts of data from electronic medical records (EMRs) and daily clinical practice data sources has received increasing attention in recent years. However, few systematic approaches have been proposed to support the extraction of the wealth and diversity of information from these data sources. Specifically, Acute Coronary Syndrome (ACS) data are available in many hospitals and health units because ACS shows elevated morbidity and mortality. This work proposes a method ... led Data Science Analysis and Representation (DSAR) to scrutinize and exploit, in a univariate way, the scientific information content in limited ACS samples. DSAR uses Bootstrap Resampling to provide robust, cross-sectional, and non-parametric statistical tests on categorical and metric variables. It also constructs an informative graphical representation of the database variables, which helps to interpret the results and to identify the relevant variables. Our objectives were to validate DSAR by comparing it to conventional statistical methods when looking for the most relevant variables in the secondary prevention of ACS, and to determine the degree of correlation between them and the Exitus event (associated with patient death). To this end, we applied DSAR to an anonymized sample of 270 variables from 2377 patients diagnosed with ACS. The results showed that DSAR identified 44% of the variables as significant, whereas conventional methods yielded only weak correlations. The scientific literature was then reviewed for a set of these variables, validating the agreement with clinical experience and previous ACS research. The conclusion is that DSAR is a valuable and useful method for clinicians in the identification of potentially predictive variables and, overall, a good starting point for future multivariate secondary analyses in the clinical field of ACS, or in fields with similar information characteristics.
doi:10.1109/access.2021.3083523 fatcat:qguwrpm2vzeothfesgpgogty6a
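
The abstract describes univariate, non-parametric tests based on Bootstrap Resampling for both categorical and metric variables. The following is a minimal sketch of one common way to build such a test, not the authors' implementation: it resamples two patient groups (for example, Exitus versus non-Exitus) with replacement and checks whether a percentile confidence interval for the difference of a statistic excludes zero. The function name `bootstrap_diff_ci`, the defaults `n_boot=2000` and `alpha=0.05`, and the percentile decision rule are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_diff_ci(x, y, stat=np.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for stat(x) - stat(y) (hypothetical helper).

    x, y : 1-D samples of one variable in two groups.
           A binary categorical variable can be passed as 0/1 codes,
           in which case the mean is the group proportion.
    Returns (lower, upper, significant), where `significant` is True
    when the interval does not contain zero.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each group independently, with replacement.
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        diffs[b] = stat(xb) - stat(yb)

    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    significant = not (lower <= 0.0 <= upper)
    return float(lower), float(upper), significant
```

Passing `stat=np.median` gives the same test on medians. Applying the function to each variable in turn yields the kind of univariate screening the abstract refers to; the number of resamples and the significance level would need to be chosen for the data at hand.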