Assisting Discovery in Public Health
Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics - HILDA'17
Several public health (PH) researchers have recently argued that big data can play a profound role in scientific discovery. Leveraging the vast amount of population-level data collected by public agencies and other organizations could lead to important discoveries that were not previously suspected. However, they also warn about the pitfalls of data-driven discovery: the sheer volume of data can easily lead to information overload for researchers. Additionally, data-driven studies that run many tests in search of important discoveries risk producing findings that seem important but are in fact due to chance. We show that data-driven studies can be effective and yet avoid these pitfalls by keeping researchers in the loop of the discovery process. To this end, we propose PHD, an interactive visual discovery system that allows public health researchers to gain interesting insights from large datasets. PHD generalizes the current workflow of PH researchers by facilitating the major analytics tasks involved in PH discovery, such as calculating important associations based on the standard notions of odds ratios and confidence intervals, controlling for the effect of other variables, and discovering interesting compounding effects. More importantly, it leverages user interaction and the semantics of the domain to ensure that this workflow scales to large datasets while avoiding information overload and spurious discoveries.
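As a point of reference for the association measures mentioned above, the following minimal sketch computes an odds ratio and an approximate 95% confidence interval from a 2x2 exposure/outcome table. It uses the textbook Wald interval on the log odds ratio; the function name, argument layout, and example counts are illustrative assumptions and are not taken from PHD itself.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical layout (an assumption for this sketch):
        a = exposed with outcome,    b = exposed without outcome
        c = unexposed with outcome,  d = unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        # The Wald interval is undefined when any cell is zero.
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts, for illustration only.
or_, lo, hi = odds_ratio_with_ci(20, 80, 10, 90)
# An interval that contains 1 means the association is not
# distinguishable from "no association" at this confidence level.
interval_contains_one = lo <= 1.0 <= hi
```

An interval that excludes 1 is the usual signal of a notable association. When many variable pairs are screened this way, some intervals will exclude 1 by chance alone, which is the risk of random discoveries described above.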