A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021.
We demonstrate how the Auctus dataset search engine addresses some of these challenges. We describe the system architecture and how users can explore datasets through a rich set of queries. ... However, finding relevant data is difficult. While search engines have addressed this problem for Web documents, there are many new challenges involved in supporting the discovery of structured data. ... Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF and DARPA. ... arXiv:2102.05716v2 fatcat:juyw3tujmjcdffbni3z5yls4pm
We present ARDA, an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented dataset such that training a predictive model on this augmented dataset results in ... Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes ... Real-world datasets: given a base table, you search open-source datasets for joinable tables using join discovery systems such as Aurum or NYU Auctus. ... arXiv:2003.09758v1 fatcat:4glagrcvkzc67ege3u7hsccduq
AIDR 2019 (Artificial Intelligence for Data Discovery and Reuse) is a new conference that brings together researchers across a broad range of disciplines, computer scientists, tool developers, data providers ... There is great value embedded in reusing scientific data for secondary discoveries. ... Fernando Chirigati described Auctus, a dataset search engine that targets the problem of incomplete or insufficient data, finds datasets that can be joined or unioned from the web, and uses these datasets ... doi:10.1184/r1/10093568 fatcat:luf2coraszhilps2qwuir753rm