499 Hits in 5.5 sec

Towards elastic transactional cloud storage with range query support

Hoang Tam Vo, Chun Chen, Beng Chin Ooi
2010 Proceedings of the VLDB Endowment  
We also enhance the system with an effective load balancing scheme using a self-tuning replication technique that is specially designed for large-scale data.  ...  In ec-Store, data objects are distributed and replicated in a cluster of commodity computer nodes located in the cloud.  ...  C.2 Effect of Self-tuning Range Histogram We now study the effect of self-tuning range histogram in handling access patterns with flash crowd queries.  ... 
doi:10.14778/1920841.1920907 fatcat:igcu5btk35eino33rsa64dppy4

Estimating Query Result Sizes for Proxy Caching in Scientific Database Federations

Tanu Malik, Randal Burns, Nitesh Chawla, Alex Szalay
2006 ACM/IEEE SC 2006 Conference (SC'06)  
CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but from observing workload.  ...  nature of database federations.  ...  CXHist [16] builds workload-aware histograms for selectivity estimation on a broad class of XML string-based queries.  ... 
doi:10.1109/sc.2006.27 fatcat:yzq67pfs7za3ndj736tqdljapy

Data management and query---Estimating query result sizes for proxy caching in scientific database federations

Tanu Malik, Randal Burns, Nitesh V. Chawla, Alex Szalay
2006 Proceedings of the 2006 ACM/IEEE conference on Supercomputing - SC '06  
CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but from observing workload.  ...  nature of database federations.  ...  The authors thank Amitabh Chaudhary for giving us the idea of the multidimensional example. We thank Xiodan Wang for his help with the SDSS workload.  ... 
doi:10.1145/1188455.1188562 dblp:conf/sc/MalikBCS06 fatcat:6avilhqbkzh6flkugosxtic57u

Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads [article]

Jialin Ding and Vikram Nathan and Mohammad Alizadeh and Tim Kraska
2020 arXiv   pre-print
However, the performance of that work suffers in the presence of correlated data and skewed query workloads, both of which are common in real applications.  ...  Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse.  ...  This research is supported by Google, Intel, and Microsoft as part of the MIT Data Systems and AI Lab (DSAIL) at MIT, NSF IIS 1900933, DARPA Award 16-43-D3M-FP040, and the MIT Air Force Artificial Intelligence  ... 
arXiv:2006.13282v1 fatcat:hkhmejy4w5dxtogv6gvrw2vabq

Research on Auto-Scaling of Web Applications in Cloud: Survey, Trends and Future Directions

Parminder Singh, Pooja Gupta, Kiran Jyoti, Anand Nayyar
2019 Scalable Computing : Practice and Experience  
Cloud computing emerging environment attracts many applications providers to deploy web applications on cloud data centers.  ...  The primary area of attraction is elasticity, which allows to auto-scale the resources on-demand. However, web applications usually have dynamic workload and hard to predict.  ...  It is further classified into three categories: Self-Tuning PID controller (SPID), Self-tuning regulator (STR), and Gain scheduling (GS). The adaptive controller is also used in the literature.  ... 
doi:10.12694/scpe.v20i2.1537 fatcat:5zdylggvtjdslichn6mpoleese

Auto-scaling Web Applications in Clouds: A Taxonomy and Survey [article]

Chenhao Qu, Rodrigo N. Calheiros, Rajkumar Buyya
2017 arXiv   pre-print
Web application providers have been migrating their applications to cloud data centers, attracted by the emerging cloud computing paradigm. One of the appealing features of the cloud is elasticity.  ...  acquire or release computing resources on-demand, which enables web application providers to automatically scale the resources provisioned to their applications without human intervention under a dynamic workload  ...  Yaser Mansouri, Xunyun Liu, Minxian Xu, and Bowen Zhou for their valuable comments and suggestions in improving the quality of the paper.  ... 
arXiv:1609.09224v6 fatcat:dkk2ftpvpbcnvhcmc6lz2omwa4

Characterization of a Big Data Storage Workload in the Cloud

Sacheendra Talluri, Alicja Łuszczak, Cristina L. Abad, Alexandru Iosup
2019 Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering - ICPE '19  
Understanding the workloads of such systems facilitates tuning and could foster new designs.  ...  of big data specific formats.  ...  Projects Vidi Magna-Data and COMMIT/ co-support this work. The work of C. Abad is partially funded by a Google Faculty Research Award.  ... 
doi:10.1145/3297663.3310302 dblp:conf/wosp/TalluriLAI19 fatcat:qvh6hzr75zcszmsyog7u5cav3m

Advances in Large-Scale RDF Data Management [chapter]

Peter Boncz, Orri Erling, Minh-Duc Pham
2014 Lecture Notes in Computer Science  
One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed.  ...  Data (LOD).  ...  As ontologies usually contain hierarchies, we create a histogram of type property values per CS that is aware of hierarchies.  ... 
doi:10.1007/978-3-319-09846-3_2 fatcat:e7ndc2gnyjherk6o45jk4kj5fe

A Tile-Based Framework with a Spatial-Aware Feature for Easy Access and Efficient Analysis of Marine Remote Sensing Data

Weiwen Ye, Feng Zhang, Xianqiang He, Yan Bai, Renyi Liu, Zhenhong Du
2020 Remote Sensing  
The raw data are displayed and roamed on a virtual globe through the Internet as tiles, enhancing their spatial awareness, that can be intelligently used for visualization result tuning, data storage preloading  ...  The SatANA framework is supported by a hybrid database storage ideal for the cloud storage of massive MRS data.  ...  Figure 7. Comparison of rendering results for ocean Secchi Disk depth between (A) spatial-aware histogram equalization and (B) global histogram equalization.  ... 
doi:10.3390/rs12121932 fatcat:wxsi6iym3nc2dhrprnhozajw2a

Providing Scalable Database Services on the Cloud [chapter]

Chun Chen, Gang Chen, Dawei Jiang, Beng Chin Ooi, Hoang Tam Vo, Sai Wu, Quanqing Xu
2010 Lecture Notes in Computer Science  
In this paper, we present an overview of our current on-going work in developing epiC -an elastic and efficient power-aware data-intensive Cloud system.  ...  The storage system and the processing engine are loosely coupled, and have been designed to handle two types of workload simultaneously, namely data-intensive analytical jobs and online transactions (commonly  ...  for his valuable comments and the numerous discussions during the course of the implementation of epiC.  ... 
doi:10.1007/978-3-642-17616-6_1 fatcat:vxojjhaguzbn3bdiphoqnor5pi

Intelligent Similarity Joins for Big Data Integration

Mian Wang, Tiezheng Nie, Derong Shen, Yue Kou, Ge Yu
2013 2013 10th Web Information System and Application Conference  
We study how to apply diPs for complex query expressions and how the usefulness of diPs varies with the data statistics used to construct diPs and the data distributions.  ...  Using a new (slightly larger) statistic, of the queries in the TPC-H, TPC-DS and JoinOrder benchmarks can skip at least of the query input.  ...  A large area of related work improves data skipping using workload aware adaptations to data partitioning or indexing; they co-locate data that is accessed together or build correlated  ... 
doi:10.1109/wisa.2013.79 dblp:conf/IEEEwisa/WangNSKY13 fatcat:zpgecqejknhudaygsrergdqk7e

A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks [article]

Sanaa Hamid Mohamed, Taisir E.H. El-Gorashi, Jaafar M.H. Elmirghani
2019 arXiv   pre-print
Moreover, we provide a brief review of data centers topologies, routing protocols, and traffic characteristics, and emphasize the implications of big data on such cloud data centers and their supporting  ...  In this survey, we present a summary of the characteristics of various big data programming models and applications and provide a review of cloud computing infrastructures, and related technologies such  ...  All data are provided in full in the results section of this paper.  ... 
arXiv:1910.00731v1 fatcat:kvi3br4iwzg3bi7fifpgyly7m4

A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques [article]

Grigori Fursin, Anton Lokhmotov, Dmitry Savenko, Eben Upton
2018 arXiv   pre-print
We hope such approach will help teach students how to build upon each others' work to enable efficient and self-optimizing software/hardware/model stack for emerging workloads.  ...  As the first practical step, we have implemented customizable compiler autotuning, crowdsourced optimization of diverse workloads across Raspberry Pi 3 devices, reduced the execution time and code size  ...  Section 5 presents a snapshot of the latest optimization results from collaborative tuning of GCC flags for numerous shared workloads across Raspberry Pi3 devices.  ... 
arXiv:1801.08024v1 fatcat:k6ltuu6ihrgundwk2gfoe6kuhm

FlexIO: I/O Middleware for Location-Flexible Scientific Data Analytics

Fang Zheng, Hongbo Zou, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Jai Dayal, Tuan-Anh Nguyen, Jianting Cao, Hasan Abbasi, Scott Klasky, Norbert Podhorszki, Hongfeng Yu
2013 2013 IEEE 27th International Symposium on Parallel and Distributed Processing  
Experimental results demonstrate that FlexIO can support a variety of simulation and analytics workloads at large scale through flexible placement options, efficient data movement, and dynamic deployment  ...  of data manipulation functionalities.  ...  This work was funded by Scientific Data Management Center, U.S. Department of Energy, and Center for Exascale Simulation of Combustion in Turbulence (ExaCT), U.S. Department of Energy.  ... 
doi:10.1109/ipdps.2013.46 dblp:conf/ipps/ZhengZESWDNCAKPY13 fatcat:uogj5f6yvfhbtcmvccrqjybhoe

A Framework for supporting DBMS-like indexes in the cloud

Gang Chen, Hoang Tam Vo, Sai Wu, Beng Chin Ooi, M. Tamer Özsu
2011 Proceedings of the VLDB Endowment  
Each cluster node maintains a subset of the index data.  ...  Further, the distribution of indexes is not straightforward, and there is therefore always the question of scalability, in terms of data volume, network size, and number of indexes.  ...  This work is part of our cloud-based data management system, named epiC (elastic power-aware data-intensive Cloud), which is designed to support both analytical and OLTP workloads.  ... 
doi:10.14778/3402707.3402711 fatcat:gpry7ncnpzacbnzsbgh3dncq5y
Showing results 1-15 out of 499 results