
AutoToken: Predicting Peak Parallelism for Big Data Analytics at Microsoft

Rathijit Sen, Alekh Jindal, Hiren Patel, Shi Qiao
2020 Proceedings of the VLDB Endowment  
AutoToken is computationally light, for both training and scoring, is easily deployable at scale, and is integrated with the Peregrine workload optimization infrastructure at Microsoft.  ...  Right-sizing resource allocation for big-data queries, particularly in serverless environments, is critical for improving infrastructure operational efficiency, capacity availability, query performance  ...  PUTTING IT ALL TOGETHER We have implemented AutoToken, as part of the broader workload optimization platform Peregrine [21], as per requirement R7, and integrated it with the SCOPE query engine.  ... 
dblp:journals/pvldb/SenJP020 fatcat:dem47et5crdj3niymxk6ohdhmm

Deploying a Steered Query Optimizer in Production at Microsoft

Wangda Zhang, Matteo Interlandi, Paul Mineiro, Shi Qiao, Nasim Ghazanfari, Karlen Lie, Marc Friedman, Rafah Hosn, Hiren Patel, Alekh Jindal
2022 Proceedings of the 2022 International Conference on Management of Data  
Modern analytical workloads are highly heterogeneous and massively complex, making generic query optimizers untenable for many customers and scenarios.  ...  In this paper, we continue a recent line of work in steering a query optimizer towards better plans for a given workload, and make major strides in pushing previous research ideas to production deployment  ...  ACKNOWLEDGMENTS We would like to thank Carlo Curino, John Langford, Raghu Ramakrishnan, and Siddhartha Sen for their insightful feedback, as well as GT Ni for the work during early stages of the development  ... 
doi:10.1145/3514221.3526052 fatcat:bvpga6in7jblzo7bms3ta4daje

Pattern Morphing for Efficient Graph Mining [article]

Kasra Jamshidi, Keval Vora
2020 arXiv   pre-print
In this paper, we step beyond the traditional philosophy of optimizing the execution plans for a given set of patterns, and exploit the sub-structural similarities across different query patterns.  ...  We evaluate the effectiveness of pattern morphing by incorporating it in Peregrine, a recent state-of-the-art graph mining system, and show that pattern morphing significantly improves the performance  ...  We evaluate the effectiveness of such a cost-model based pattern morphing optimizer by incorporating it with our pattern morphing engine for Peregrine. Table 2.  ... 
arXiv:2012.04553v1 fatcat:dahevnmsdnfvnktw5n7awe4y5a

Autonomic Fail-over for a Software-Defined Container Computer Network

Chien-Yung Lee, Yu-Wei Lee, Cheng-Chun Tu, Pai-Wei Wang, Yu-Cheng Wang, Chih-Yu Lin, Tzi-cker Chiueh
2013 IEEE International Conference on Autonomic Computing  
Compared with vanilla enterprise networks, Peregrine features a fast fail-over capability not only for network switch/link failures, but also for failures of its own control servers.  ...  The ITRI container computer is a modular computer designed to be a building block for constructing cloud-scale data centers.  ...  Therefore, not every ARP query needs to be sent to the directory server; in fact, most ARP queries are expected to be answered by the caches maintained by Peregrine agents.  ... 
dblp:conf/icac/LeeLTWWLC13 fatcat:lic45tbgnnahhou2at6degjs7i

Predictive Price-Performance Optimization for Serverless Query Processing [article]

Rathijit Sen, Abhishek Roy, Alekh Jindal
2021 arXiv   pre-print
We discuss and evaluate in depth how our system, AutoExecutor, can use this framework to automatically select near-optimal executor and core counts for Spark SQL queries running on Azure Synapse.  ...  for data analytics in serverless query processing settings.  ...  The pickled file size on disk when trained over all 103 TPC-DS queries (for a given scale factor) was 0.8 MB for AE_AL and 0.9 MB for AE_PL.  ... 
arXiv:2112.08572v1 fatcat:nihsziq5knccrb72s4hw32p4le

Making parallel programs reliable with stable multithreading

Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, Gang Hu
2014 Communications of the ACM  
For MySQL-tx, TERN has a lower reuse rate largely because the workload is too random to reuse schedules. Nonetheless, TERN manages to process 44.2% of the queries with a small number of schedules.  ...  Table 2 shows the results, obtained from TERN and replicable in PEREGRINE. The four workloads are either real workloads we collect or synthetic workloads used by the developers themselves (§A).  ...  APPENDIX A Workloads for Evaluating Stability We used the following four workloads for evaluating TERN's stability: • Apache-CS: a four-day trace from the Columbia CS website with 122,000 HTTP requests  ... 
doi:10.1145/2500875 fatcat:b7suva2w3rgbfhgpmlznpq6hci

Distributed and interactive cube exploration

Niranjan Kamat, Prasanth Jayachandran, Karthik Tunga, Arnab Nandi
2014 2014 IEEE 30th International Conference on Data Engineering  
We discuss design considerations, implementation details and optimizations of our system.  ...  A novel framework is provided that combines three concepts: faceted exploration of data cubes, speculative execution of queries and query execution over subsets of data.  ...  to modern distributed query execution engines.  ... 
doi:10.1109/icde.2014.6816674 dblp:conf/icde/KamatJTN14 fatcat:cgniwdhebvdkff2ubv6qytboh4

Stable Multithreading: A New Paradigm for Reliable and Secure Threads

Heming Cui
To realize StableMT, we have built three StableMT systems, TERN, PEREGRINE, and PARROT, with each addressing a distinct research challenge.  ...  Unfortunately, despite decades of research and engineering effort, these programs remain notoriously difficult to get right, and they are plagued with harmful concurrency bugs that can cause wrong outputs  ...  To detect symbolic races, our race detector queries the underlying symbolic execution engine for pointer equality.  ... 
doi:10.7916/d83n225b fatcat:5izxdjqhmzcdtmbgpa6xkqpi3y

Phoebe: A Learning-based Checkpoint Optimizer [article]

Yiwen Zhu, Matteo Interlandi, Abhishek Roy, Krishnadhan Das, Hiren Patel, Malay Bag, Hitesh Sharma, Alekh Jindal
2021 pre-print
Easy-to-use programming interfaces paired with cloud-scale processing engines have enabled big data system users to author arbitrarily complex analytical jobs over massive volumes of data.  ...  For each stage of a job, Phoebe makes accurate predictions for: (1) the execution time, (2) the output size, and (3) the start/end time taking into account the inter-stage dependencies.  ...  The telemetry data from the query engine is collected into a workload repository and later used by Phoebe to re-train the models.  ... 
doi:10.14778/3476249.3476298 arXiv:2110.02313v1 fatcat:x4x3kuq2w5di7czzl6kbdqruw4

Flexible and efficient computation in large data centres

Ionel Corneliu Gog, Apollo-University Of Cambridge Repository, Robert N. M. Watson
The diverse execution engines cause different workflow types to coexist within a data centre, opening up both opportunities for sharing and potential pitfalls for co-location interference.  ...  I propose an architecture for decoupling data processing, together with Musketeer, my proof-of-concept implementation of this architecture.  ...  In addition, there are many front-end frameworks for representing stream and interactive data workflows (e.g., PowerDrill [HBB+12], Peregrine [MG12]).  ... 
doi:10.17863/cam.18802 fatcat:h7m36dugajbybgw46rzefvhgyy

A Model for Online Interactive Remote Education for Medical Physics Using the Internet

Milton K Woo, Kwan-Hoong Ng
2003 Journal of Medical Internet Research  
We have installed a new MLC modeling package into PEREGRINE.  ...  The workload for primary barrier calculations for conventional multileaf collimator (MLC) IMRT treatments is determined using patient tumor doses.  ...  Different workloads and a variable mix of head and body techniques will be used to illustrate their effect on CT shielding requirements.  ... 
doi:10.2196/jmir.5.1.e3 pmid:12746208 pmcid:PMC1550549 fatcat:tsyf45yusza2jiglkq62rgo3qm

Proceedings of the BioCreative V.5 Challenge Evaluation Workshop

Martin Krallinger, Alfonso Valencia
2022 Zenodo  
for funding.  ...  We would like to thank Matthias Herzog for technical support and Milena Kraus for her support of mapping the semantic types. Acknowledgments  ...  Named entity recognition software The core of the NER system is a highly optimized dictionary-based tagger engine, implemented in C++.  ... 
doi:10.5281/zenodo.6519885 fatcat:gzzr6ogkwvfe3eglv6anrzt5s4

Operating system support for warehouse-scale computing

Malte Schwarzkopf, Apollo-University Of Cambridge Repository, Steven Hand, Ian Leslie, Robert N. M. Watson
abstractions makes sharing and resource management inefficient, infrastructure software lacks end-to-end access control mechanisms, and work placement ignores the effects of hardware heterogeneity and workload  ...  I present a novel distributed operating system for data centres, focusing on two OS components: the abstractions for resource naming, management and protection, and the scheduling of work to compute resources  ...  "Policy-optimal" means that the solution is optimal for the given scheduling policy; it does not preclude the existence of other, more optimal scheduling policies.  ... 
doi:10.17863/cam.26443 fatcat:lvxhwdcmlnbm7d7hg5xdrxqrsa

Journal of Digital Media & Interaction, Vol.3, No.8

Nelson Zagalo, Lídia Oliveira
of consumption are theorized about in three subdivisions: 1) Excess in terms of the fruition in the quantity of the represented content (binge-watching), 2) Excess in terms of the fruition during the peregrination  ...  The article points to the relevant presence of Martín-Barbero's thinking as a necessary alternative (and an opposition to technicism) for interpreting digital consumption.  ...  Acknowledgments This work was carried out with the support of the Coordination for the Improvement of Higher Education Personnel - Brazil (CAPES).  ... 
doi:10.34624/jdmi.v3i8.23560 fatcat:omebaaxtxvbo3he56vhemek4rq

Lips and Ships, Peers and Tears: [chapter]

Warwick Gould
2013 The Living Stream  
., 1925), and elsewhere on cover designs for Yeats's books, most notably that for Per Amica Silentia Lunae (1917), courtesy of the late Riette Sturge Moore.  ...  List of Illustrations Cover Image: Thomas Sturge Moore's cover for The Tower (1928), Private Collection, London.  ...  The balance of archival queries, cross-checking, and arrangements for copies after January 2005 are credited to Jared Curtis and Ann Saddlemyer, with the help of the series assistant editor, Declan Kiely  ... 
doi:10.2307/j.ctt5vjtw2.11 fatcat:hdljsi2dn5bu7co2wz46biy26y