Scalable Combinatorial Tools for Health Disparities Research

Michael Langston, Robert Levine, Barbara Kilbourne, Gary Rogers, Anne Kershenbaum, Suzanne Baktash, Steven Coughlin, Arnold Saxton, Vincent Agboto, Darryl Hood, Maureen Lichtveld, Tonny Oyana (+2 others)
2014 International Journal of Environmental Research and Public Health  
Despite staggering investments made in unraveling the human genome, current estimates suggest that as much as 90% of the variance in cancer and chronic diseases can be attributed to factors outside an individual's genetic endowment, particularly to environmental exposures experienced across his or her life course. New analytical approaches are clearly required as investigators turn to complicated systems theory and ecological, place-based and life-history perspectives in order to understand
more clearly the relationships between social determinants, environmental exposures and health disparities. While traditional data analysis techniques remain foundational to health disparities research, they are easily overwhelmed by the ever-increasing size and heterogeneity of the data needed to illuminate latent gene × environment interactions. This has prompted the adaptation and application of scalable combinatorial methods, many from genome science research, to the study of population health. Most of these powerful tools are algorithmically sophisticated, highly automated and mathematically abstract. Their utility motivates the main theme of this paper, which is to describe real applications of innovative transdisciplinary models and analyses in an effort to help move the research community closer to identifying the causal mechanisms and associated environmental contexts underlying health disparities. The public health exposome is used as a contemporary focus for addressing the complex nature of this subject.
on the exposome paradigm [2], and is aimed at describing the effects of multiple and cumulative environmental exposures from conception to death on population health outcomes using a life stage approach. The RCHDEE brings together a transdisciplinary team of investigators trained in traditional epidemiologic and statistical methods and those proficient in advanced computational, multi-level, and spatial models and analytics [3]. A complete understanding of the mechanisms through which multiple and cumulative environmental exposures across the life span can affect individual and population health is not yet attainable. Nevertheless, the public health exposome model can help generate hypotheses and interpret ways in which population health outcomes are the combined product of the presence or absence of individual and ecological risk and protective influences. 
These may include social determinants, life events, epigenetics, toxic exposures, social networks, access to healthcare, and numerous other subtle and under-appreciated factors. An amalgam of diverse scientific techniques is described in this paper: computer science, mathematics and statistics join forces with data interpretation and domain knowledge to elucidate both known and previously unrecognized variable relationships, and to generate testable hypotheses on an unprecedented scale. Pioneering graph theoretical methods are applied to modern health disparities research. Practical use is made of lessons learned over the last two decades in the analysis of high throughput biological data. While standard techniques can scrutinize at most a handful of parameters for obvious dependencies, combinatorial methods are able to extract latent signal from a sea of only modest correlations spread across an entire spectrum of available variables. A prototypical toolchain and illustrative examples are also presented. This work can be placed in the context of health science research transformations or paradigm shifts [4]. Urged by the National Cancer Institute and the National Institute of Environmental Health Sciences, the scientific community has developed thematic recommendations [5] for health disparities investigations that include transdisciplinary knowledge integration, data sharing, and an expanded use of quantitative methods to include multilevel analyses, spatial analysis, and the utilization of so-called "big data." In a companion paper [6] we discuss many of these issues in depth.
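As a rough illustration of what a graph theoretical treatment of correlated variables can look like, the sketch below links variables whose pairwise correlation clears a cutoff and then enumerates maximal cliques, that is, groups in which every variable is linked to every other. This is a generic technique chosen for illustration; the excerpt above does not specify the authors' toolchain, so nothing here should be read as their method. The variable names, the toy values, and the 0.7 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_graph(columns, threshold):
    """Adjacency sets: link two variables when |r| >= threshold."""
    adj = {name: set() for name in columns}
    for u, v in combinations(columns, 2):
        if abs(pearson(columns[u], columns[v])) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques (Bron-Kerbosch with pivoting)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    # Largest cliques first, ties broken alphabetically.
    return sorted(found, key=lambda c: (-len(c), c))


# Hypothetical toy data: one value per county for each variable.
columns = {
    "poverty":   [1, 2, 3, 4, 5, 6],
    "pollution": [2, 1, 4, 3, 6, 5],
    "mortality": [1, 3, 2, 4, 6, 5],
    "rainfall":  [3, 1, 4, 1, 5, 2],
}

cliques = maximal_cliques(correlation_graph(columns, threshold=0.7))
print(cliques)
```

With this toy data the three mutually correlated variables form the only multi-variable clique, while the fourth variable stays isolated. Clique enumeration is exponential in the worst case, so on real data with thousands of variables the choice of threshold and algorithm matters far more than this small example suggests.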
doi:10.3390/ijerph111010419 pmid:25310540 pmcid:PMC4210988 fatcat:smmhbbw44rh45pxn73i46mmeie