Open Science data without curation. Is it useful? An American Astronomical Society Publishing perspective

Greg Schwarz
2021 Zenodo  
The American Astronomical Society (AAS) publishes ~5000 scholarly astronomical articles each year in our 6 journals. For the last 20 years we have accepted and solicited data from authors to be integrated into the published article. Data includes machine-readable tables, data behind figures, interactive figures, and external links to data repositories. All of these data types are reviewed by two Data Editors who create or edit meta-data, enforce formatting standards, and obtain final author approval before publication. This process is time-consuming but ultimately results in robust and useful data products that are ready for immediate reader use and can be ingested into other databases, e.g. CDS/VizieR. This curation process is essentially a peer review of the data. A publishing data policy that follows the tenets of the Open Science movement would mandate that all data used in the article be available at publication for reproducibility. This can mean archiving in public repositories or even self-archiving for a specific period of time, leaving the level of curation up to the author. Our data review work provides unique insights into the effort authors put into making data available and useful. In short, the quality is often lacking, which means significant challenges for the end user. Errors are common in both the meta-data (generally inadequate documentation) and the data (missing data, duplication, significant-digit issues, etc.). Poor data products result from a lack of author training in curation and from laziness. Given these author limitations, any Open Science policy without Data Editors does not fully support its underlying ethos of data reproducibility. Data needs to be treated with the same considerations as the science itself and reviewed accordingly.
doi:10.5281/zenodo.4884917 fatcat:7l5snobkobeatbc4rjhokmchja