FlashProfile: A Framework for Synthesizing Data Profiles

Saswat Padhi, Prateek Jain, Daniel Perelman, Oleksandr Polozov, Sumit Gulwani, Todd Millstein
2017 arXiv pre-print
We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters.  ...  Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile.  ...  We also thank the anonymous reviewers for their constructive feedback on earlier versions of this paper. This  ... 
arXiv:1709.05725v1 fatcat:ke3hashnozekpmzcfr4dqesr4e

FlashProfile: Interactive Synthesis of Syntactic Profiles

Saswat Padhi, Prateek Jain, Daniel Perelman, Oleksandr Polozov, Sumit Gulwani, Todd Millstein
Our implementation, FlashProfile, shows a median profiling time of 0.7 s over 142 tasks on 74 real datasets.  ...  We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats.  ...  PROSE provides a convenient framework with highly efficient algorithms and data-structures for building such program synthesizers.  ...