Integrating data to acquire new knowledge: Three modes of integration in plant science
Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences
This paper discusses what it means and what it takes to integrate data in order to acquire new knowledge about biological entities and processes. Maureen O'Malley and Orkun Soyer have pointed to the scientific work involved in data integration as important and distinct from the work required by other forms of integration, such as methodological and explanatory integration, which have been more successful in captivating the attention of philosophers of science. Here I explore what data
... n involves in more detail and with a focus on the role of data-sharing tools, like online databases, in facilitating this process; and I point to the philosophical implications of focusing on data as a unit of analysis. I then analyse three cases of data integration in the field of plant science, each of which highlights a different mode of integration: (1) inter-level integration, which involves data documenting different features of the same species, aims to acquire an interdisciplinary understanding of organisms as complex wholes and is exemplified by research on Arabidopsis thaliana; (2) cross-species integration, which involves data acquired on different species, aims to understand plant biology in all its different manifestations and is exemplified by research on Miscanthus giganteus; and (3) translational integration, which involves data acquired from sources within as well as outside academia, aims at the provision of interventions to improve human health (e.g. by sustaining the environment in which humans thrive) and is exemplified by research on Phytophthora ramorum. Recognising the differences between these efforts sheds light on the dynamics and diverse outcomes of data dissemination and integrative research, and on the relations between the social and institutional roles of science, the development of data-sharing infrastructures and the production of scientific knowledge.
Highlights:
• Data integration, particularly through online databases and other digital infrastructure, plays a central role in contemporary biological research.
• Plant science constitutes a particularly interesting area in which to analyse data integration, as it strongly features collaborative efforts to integrate results acquired at multiple levels of organization (molecular, cellular, ecological) and across species.
• I discuss three research traditions in plant science, which exemplify three different modes of integration: inter-level, cross-species and translational.
• This analysis illuminates the challenges of making data usable to the scientific community, the scaffolding needed to transform data available online into new knowledge and the different forms of scientific knowledge that may result.
• I also stress the importance of considering the whole spectrum of scientific activities, including so-called 'applied' research, in order to understand current scientific epistemology.