ECS in the Era of Data Science

Daniel T. Schwartz, Matthew D. Murbach, David A. C. Beck
2019 The Electrochemical Society Interface  
Since founding the first Data Science Hack Day at the 232nd ECS Meeting, we've heard two main questions: What is data science, and why is it relevant to ECS? Thought leaders in data science have called it "the child of statistics and computer science," where the application of "modern statistical and computational tools to modern scientific questions requires significant human judgment and deep disciplinary knowledge." 1 Alternatively, a leading group of data science early adopters from
... l engineering defined it more pragmatically as the application of modern data management practices, statistical and machine learning, and advanced visualization to ask and answer new questions. 2 When reading the papers in this issue, keep these definitions of data science in mind as you assess the relevance of the work to ECS's mission: "to advance theory and practice at the forefront of electrochemical and solid state science and technology, and allied subjects."
The four papers in this issue provide a tangible look into data science-enabled scholarship from leading laboratories around the world. How we selected the contributions, and the meta-themes that emerged, may be of interest to readers looking for the abridged version of this issue.
We first decided to narrow the focus to energy-related examples within ECS's scope, rather than picking a crosscut of papers from sensors, corrosion, electronics and photonics, etc. Topical focus, we felt, helped to reveal scholarly synergies among the different contributions. Nonetheless, we were confident that the authors would highlight data science-enabled scholarship that was keenly relevant to readers from every ECS division. The authors have delivered.
Next, we sought contributions that exposed readers to some of the broad ways that "modern statistical and computational tools" have been applied to problems across the extraordinary spatial and temporal scales of electrochemical and solid state research. Again, the authors delivered, by presenting a combination of open source software and open datasets that combine physics-based, statistical, and machine learning methods to tackle questions ranging from molecular-level processes at the electrified interface all the way up to forecasting the reliability and operation of globally distributed clean energy systems.
The attentive reader of these papers will see several meta-themes emerge.
Research meta-theme: Adopting data science practices can accelerate one of the highest aspirations of electrochemical and solid state scholars: model-data convergence. Barriers between sophisticated modelers and sophisticated experimentalists come down when open analysis software, open data, and more automated analysis methods are combined in dynamic user communities. Data availability remains problematic, however.
Educational meta-theme: Disciplinary scholars who seek to use "modern statistical and computational tools" to address their questions need additional training. The lack of formalized training at most universities means this effort becomes an extra responsibility of the host or collaborating research laboratories.
Community meta-theme: "Modern statistical and computational tools" can underpin a much more open and collaborative research community. In an idealized version of this community, all of the foundational data and analysis that go into a completed publication become immediately available for the next researcher to make the next advancement. However, without metrics, peer evaluation, and funding to legitimize software and datasets as research products, it will be difficult to build anything like this ideal community.
doi:10.1149/2.f03191if fatcat:n3toajlrjfaxngdqjoz7yt4f64