IDEBench: A Benchmark for Interactive Data Exploration [article]

Philipp Eichmann, Carsten Binnig, Tim Kraska, Emanuel Zgraggen
2018 arXiv   pre-print
In this paper, we argue that such benchmarks are not suitable for evaluating database workloads originating from interactive data exploration (IDE) systems where most queries are ad-hoc, not based on predefined  ...  As a main contribution, we present a novel benchmark called IDEBench that can be used to evaluate the performance of database systems for IDE workloads.  ...  CONCLUSION In this paper, we presented a new benchmark IDEBench for evaluating systems for interactive data exploration (IDE).  ... 
arXiv:1804.02593v1 fatcat:4bzoyltuazbu7ndtcuwtg2xx6e

An Adaptive Benchmark for Modeling User Exploration of Large Datasets [article]

Joanna Purich, Hira Mahmood, Diana Chou, Chidi Udeze, Leilani Battle
2022 arXiv   pre-print
In this paper, we present a tool that extends IDEBench to ingest visualization interfaces and a dataset, and estimate the expected database load that would be generated by real users.  ...  Interactive analysis systems provide efficient and accessible means by which users of varying technical experience can comfortably manipulate and analyze data using interactive widgets.  ...  of precision interface design, visualization benchmarks, and the effects of latency in interactive data exploration systems.  ... 
arXiv:2203.15748v1 fatcat:f4eszke22neariqzk4ad7rgbmy

Database Benchmarking for Supporting Real-Time Interactive Querying of Large Data

Leilani Battle, Philipp Eichmann, Marco Angelini, Tiziana Catarci, Giuseppe Santucci, Yukun Zheng, Carsten Binnig, Jean-Daniel Fekete, Dominik Moritz
2020 Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data  
In this paper, we present an initial benchmark that focuses on "crossfilter"-style applications, which are a popular interaction type for data exploration and a particularly demanding scenario for testing  ...  While there exist proposals for evaluating database systems on interactive data exploration workloads, none rely on real user traces for database benchmarking.  ...  We thank Schloss Dagstuhl-Leibniz Center for Informatics for their support in organizing "Dagstuhl Seminar 17461 - Connecting Visualization and Data Management Research".  ... 
doi:10.1145/3318464.3389732 dblp:conf/sigmod/BattleEACSZBFM20 fatcat:njdvag7hvrby7kcgsdch67omgu

Evaluating Visual Data Analysis Systems

Leilani Battle, Marco Angelini, Carsten Binnig, Tiziana Catarci, Philipp Eichmann, Jean-Daniel Fekete, Giuseppe Santucci, Michael Sedlmair, Wesley Willett
2018 Proceedings of the Workshop on Human-In-the-Loop Data Analytics - HILDA'18  
Visual data analysis is a key tool for helping people to make sense of and interact with massive data sets.  ...  ., database benchmarks, individual user studies) fail to capture the key points that make systems for visual data analysis (or visual data systems) challenging to design.  ...  and interactive data exploration scenarios.  ... 
doi:10.1145/3209900.3209901 dblp:conf/sigmod/BattleABCEFSSW18 fatcat:k57wsmv2pbh53b7ronweskm7ga

Towards Interactive Data Exploration [chapter]

Carsten Binnig, Fuat Basık, Benedetto Buratti, Ugur Cetintemel, Yeounoh Chung, Andrew Crotty, Cyrus Cousins, Dylan Ebert, Philipp Eichmann, Alex Galakatos, Benjamin Hättasch, Amir Ilkhechi (+10 others)
2019 Lecture Notes in Business Information Processing  
Finally, we discuss other important considerations for interactive data exploration systems including benchmarking, natural language interfaces, as well as interactive machine learning.  ...  Furthermore, we present the results of building IDEA, a new type of system for interactive data exploration that is specifically designed to integrate seamlessly with existing data management landscapes  ...  An initial version of the benchmark and results of running the benchmark on several data analytics backends for interactive data exploration is available.  ... 
doi:10.1007/978-3-030-24124-7_11 fatcat:ixxjchbhe5awrlcjvr5kocqgmq

Northstar: An Interactive Data Science System

Tim Kraska
2018 Proceedings of the VLDB Endowment  
On the one hand, visual interfaces for data science have to be intuitive, easy, and interactive to reach users without a strong background in computer science or statistics.  ...  In this paper, we present Northstar, the Interactive Data Science System, which we have developed over the last 4 years to explore designs that make advanced analytics and model building more accessible  ...  Results We recently evaluated IDEA against other systems to create a benchmark for interactive data exploration, called IDEBench [33].  ... 
doi:10.14778/3229863.3240493 fatcat:7v3sxuhth5gnpa2d5lc6rmu4b4

User Group Analytics Survey and Research Opportunities

Behrooz Omidvar-Tehrani, Sihem Amer-Yahia
2019 IEEE Transactions on Knowledge and Data Engineering  
In this survey, we discuss different approaches for each component of user group analytics, i.e., discovery, exploration, and visualization.  ...  It is also appealing to users in their role as information consumers who use the social Web for routine tasks such as finding a book club or choosing a physical activity.  ...  IDEBench is proposed in [182], [183], [184] as an EV benchmark.  ... 
doi:10.1109/tkde.2019.2913651 fatcat:6csthrt4zngqzkaur2lfytrupq

Approximate Query Processing using Deep Generative Models [article]

Saravanan Thirumuruganathan, Shohedul Hasan, Nick Koudas, Gautam Das
2019 arXiv   pre-print
In this work, we explore the usage of deep learning (DL) for answering aggregate queries specifically for interactive applications such as data exploration and visualization.  ...  The database community has pioneered many novel techniques for Approximate Query Processing (AQP) that could give approximate results in a fraction of time needed for computing exact results.  ...  Our approach is complementary to traditional AQP exploring a new research direction of utilizing deep generative models for data exploration.  ... 
arXiv:1903.10000v3 fatcat:g3zjzhwpxvcohgzwueywszwtgi

Mosaic: A Sample-Based Database System for Open World Query Processing [article]

Laurel Orr, Samuel Ainsworth, Walter Cai, Kevin Jamieson, Magda Balazinska, Dan Suciu
2020 arXiv   pre-print
Data scientists have relied on samples to analyze populations of interest for decades. Recently, with the increase in the number of public data repositories, sample data has become easier to access.  ...  In this paper, we show how our envisioned system solves this problem by having a unique sample-based data model with extensions to the SQL language.  ...  This work is supported by NSF AITF 1535565 and the Intel Science and Technology Center for Big Data.  ... 
arXiv:1912.07777v3 fatcat:mhwgbbocynfo7jyxyq7ydsbju4

Sample Debiasing in the Themis Open World Database System (Extended Version) [article]

Laurel Orr, Magda Balazinska, Dan Suciu
2020 arXiv   pre-print
while maintaining interactive query response times.  ...  We leverage apriori population aggregate information to develop and combine two different approaches for automatic debiasing: sample reweighting and Bayesian network probabilistic modeling.  ...  This work is supported by NSF AITF 1535565, NSF IIS 1907997, and a gift from Intel.  ... 
arXiv:2002.09799v2 fatcat:2pyct6ektfdkjlbsmdngeuzasi