A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020. The file type is application/pdf.
TensorFlow Data Validation: Data Analysis and Validation in Continuous ML Pipelines
2020
Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
Machine Learning (ML) research has primarily focused on improving the accuracy and efficiency of the training algorithms while paying much less attention to the equally important problem of understanding, validating, and monitoring the data fed to ML. Irrespective of the ML algorithms used, data errors can adversely affect the quality of the generated model. This indicates that we need to adopt a data-centric approach to ML that treats data as a first-class citizen, on par with algorithms and
doi:10.1145/3318464.3384707
dblp:conf/sigmod/CavenessCPP0Z20
fatcat:agjc4n4f5jgw3kfmatmmvcueeu