
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu [article]

Hao Liu, Qian Gao, Jiang Li, Xiaochao Liao, Hao Xiong, Guangxing Chen, Wenlin Wang, Guobao Yang, Zhiwei Zha, Daxiang Dong, Dejing Dou, Haoyi Xiong
2021 arXiv   pre-print
In modern internet industries, deep learning based recommender systems have become an indispensable building block for a wide spectrum of applications, such as search engine, news feed, and short video  ...  Besides, JIZHI introduces heterogeneous and hierarchical storage to further accelerate the online inference process by reducing unnecessary computations and potential data access latency induced by ultra-sparse  ...  Moreover, a tailor-designed Heterogeneous and Hierarchical Storage (HHS) module is introduced for the huge and ultra-sparse DNN model management.  ... 
arXiv:2106.01674v1 fatcat:zogjjhrfezc3pkboekw56ebf5y

Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training [article]

Mark Zhao, Niket Agarwal, Aarti Basant, Bugra Gedik, Satadru Pan, Mustafa Ozdal, Rakesh Komuravelli, Jerry Pan, Tianshu Bao, Haowei Lu, Sundaram Narayanan, Jack Langman (+5 others)
2022 arXiv   pre-print
As innovations in DSAs continue to increase training efficiency and throughput, the data storage and ingestion (DSI) pipeline, the systems and hardware responsible for storing and preprocessing training  ...  To this end, this paper presents Meta's end-to-end DSI pipeline, composed of a central data warehouse built on distributed storage and a Data PreProcessing Service (DPP) that scales to eliminate data stalls  ...  The DPP Master itself is replicated to avoid being a single point of failure. Finally, the DPP Master implements auto-scaling via a controller.  ... 
arXiv:2108.09373v3 fatcat:wwnk7w5t7rbldheztu6v6kunna


Alex Poms, Will Crichton, Pat Hanrahan, Kayvon Fatahalian
2018 ACM Transactions on Graphics  
In response, we have created Scanner, a system for productive and efficient video analysis at scale.  ...  The challenge is that scaling applications to operate on these datasets requires efficient systems for pixel data access and parallel processing across large numbers of machines.  ...  The cost of supporting compressed video storage in a system that must also support sparse frame-level data access is two-fold.  ... 
doi:10.1145/3197517.3201394 fatcat:dawsgbcvefab3mdmtne6kjtjb4

The anatomy of big data computing

Raghavendra Kune, Pramod Kumar Konugurthi, Arun Agarwal, Raghavendra Rao Chillarige, Rajkumar Buyya
2015 Software, Practice & Experience  
Big data computing demands a huge storage and computing for data curation and processing that could be delivered from on-premise or clouds infrastructures.  ...  , a new paradigm that combines large-scale compute, new data-intensive techniques, and mathematical models to build data analytics.  ...  ACKNOWLEDGEMENTS We thank Rodrigo Calheiros, Nikolay Grozev, Amir Vahid, and Harshit Gupta for their comments and suggestions on improving the quality of the paper.  ... 
doi:10.1002/spe.2374 fatcat:pe57kzvunrf7blpbm7cpwliysu

The Anatomy of Big Data Computing [article]

Raghavendra Kune, Pramodkumar Konugurthi, Arun Agarwal, Raghavendra Rao Chillarige, Rajkumar Buyya
2015 arXiv   pre-print
Big Data computing demands a huge storage and computing for data curation and processing that could be delivered from on-premise or clouds infrastructures.  ...  ; a new paradigm which combines large scale compute, new data intensive techniques and mathematical models to build data analytics.  ...  Acknowledgements We thank Rodrigo Calheiros, Nikolay Grozev, Amir Vahid, and Harshit Gupta for their comments and suggestions on improving the quality of paper.  ... 
arXiv:1509.01331v1 fatcat:vqea74fgk5h5jpapwaaplvhuda

D2.1 Definition of Use Cases, Service Requirements and KPIs

Marco Quagliotti, Albert Rafel, Oscar Gonzales De Dios, Víctor López, Rafael López Da Silva, José Alberto Hernández, Manuel Urueña, David Larrabeiti, Ignacio Martín, Neelakandan Manihatty Bojan, Ftoeini Ntavou, Emilio Hugues Salas (+21 others)
2018 Zenodo  
This document provides detailed analysis of the service KPIs and subsequent requirements have provided key objectives, definition, and apportionment of the service KPIs for the Metro-Haul network vision  ...  Each Use Case has been decomposed into their components and requirements, which are then mapped onto the end-to-end network infrastructure.  ...  and computational and memory storage resources for virtual machines on servers located in the node premises.  ... 
doi:10.5281/zenodo.1194063 fatcat:p7pu7rcpnndjlb6urdbzh6w4mi

Video Big Data Analytics in the Cloud: A Reference Architecture, Survey, Opportunities, and Open Research Issues

Aftab Alam, Irfan Ullah, Young-Koo Lee
2020 IEEE Access  
It also aims to bridge the gap among large-scale video analytics challenges, big data solutions, and cloud computing.  ...  The current technology and market trends demand an efficient framework for video big data analytics.  ...  A sparse auto-encoder was used as the primary model to generate the final summary, where the input and output were multiple videos and keyframes set, respectively.  ... 
doi:10.1109/access.2020.3017135 fatcat:qc62bhzlrfcwblnvurb5okfjxe


Zhou Shao, Muhammad Aamir Cheema, David Taniar, Hua Lu
2016 Proceedings of the VLDB Endowment  
Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, Matei Zaharia 1586 TABLE OF CONTENTS OF Industrial and Applications TABLE OF CONTENTS OF Research Papers Scalable Replay-Based Replication For  ...  Lars George, Bruno Cadonna, Techniques for Large-Scale Persistent-Main-Memory Systems  ... 
doi:10.14778/3025111.3025115 fatcat:2vi4x3dmlvfohndw5iw2xoibnq

2021 Index IEEE Transactions on Parallel and Distributed Systems Vol. 32

2022 IEEE Transactions on Parallel and Distributed Systems  
., +, TPDS Jan. 2021 214-228 Large-Scale Analysis of Docker Images and Performance Implications for Container Storage Systems.  ...  ., +, TPDS Jan. 2021 174-183 BOSSA: A Decentralized System for Proofs of Data Retrievability and Replication.  ... 
doi:10.1109/tpds.2021.3107121 fatcat:e7bh2xssazdrjcpgn64mqh4hb4

Design and Analysis of an Efficient Friend-to-Friend Content Dissemination System

Kanchana Thilakarathna, Aline Carneiro Viana, Aruna Seneviratne, Henrik Petander
2017 IEEE Transactions on Mobile Computing  
This paper presents a novel hybrid content storage and distribution system addressing the trust and privacy concerns of users, lowering the cost of content distribution and storage, and shows how they  ...  The system exploits the fact that users will trust their friends, and by replicating content on friends' devices who are likely to consume that content it will be possible to disseminate it to other friends  ...  In this paper, we propose a new hybrid content storage and distribution system for user generated content (UGC).  ... 
doi:10.1109/tmc.2016.2570747 fatcat:dkrdxl5bcbhubhazrir5j5x4ci

Creating a Relational Distributed Object Store [article]

Robert Primmer, Scott Nyman, Wayzen Lin
2013 arXiv   pre-print
In and of itself, data storage has apparent business utility. But when we can convert data to information, the utility of stored data increases dramatically.  ...  It is the layering of relation atop the data mass that is the engine for such conversion.  ...  The database is typically modeled as a NoSQL "shared nothing" data store for horizontal scaling-replicating and partitioning data over many servers [4].  ... 
arXiv:1306.5586v1 fatcat:zsqypwb5svaa5gsvlnyj2nqmla

The ISTI Rapid Response on Exploring Cloud Computing 2018 [article]

Carleton Coffrin, James Arnold, Stephan Eidenbenz, Derek Aberle, John Ambrosiano, Zachary Baker, Sara Brambilla, Michael Brown, K. Nolan Carter, Pinghan Chu, Patrick Conry, Keeley Costigan, Ariane Eberhardt (+31 others)
2019 arXiv   pre-print
By and large, the projects were successful and collectively they suggest that cloud computing can be a valuable computational resource for scientific computation at national laboratories.  ...  This report describes eighteen projects that explored how commercial cloud computing services can be utilized for scientific computation at national laboratories.  ...  Business Innovation (ADBI) for their technical support in this effort as well as the Chief Information Officer, Mike Fisk, and Matthew Heavner for their financial support.  ... 
arXiv:1901.01331v1 fatcat:cdkmje2agzfsdpyulbp4cxz22q

EMG Pattern Recognition in the Era of Big Data and Deep Learning

Angkoon Phinyomark, Erik Scheme
2018 Big Data and Cognitive Computing  
Finally, directions for future research in EMG pattern recognition are outlined and discussed.  ...  This paper begins with a brief introduction to the main factors that expand EMG data resources into the era of big data, followed by the recent progress of existing shared EMG data sets.  ...  Wireless EMG System for Ninapro 2, 3, 6 and 7, a Cometa Wave Plus wireless EMG system for Ninapro 4, and Thalmic Myo armbands for Ninapro 5.  ... 
doi:10.3390/bdcc2030021 fatcat:h24h4mj6xvgdtgeg5xrmqre6nm

Big Data Storage Tools Using NoSQL Databases and Their Applications in Various Domains: A Systematic Review

Amen Faridoon, Muhammad Imran
2021 Computing and informatics  
.: HBase and Hypertable for Large Scale Distributed Storage Systems. Department of Computer Science, Purdue University, 2006. [30] Küçükkeçeci, C.  ...  achieves over large-scale systems [2].  ... 
doi:10.31577/cai_2021_3_489 fatcat:mh6jl6vtznf5zmayq6l3yxaaga

Inter-datacenter bulk transfers with netstitcher

Nikolaos Laoutaris, Michael Sirivianos, Xiaoyuan Yang, Pablo Rodriguez
2011 Proceedings of the ACM SIGCOMM 2011 conference on SIGCOMM - SIGCOMM '11  
To this end, we have designed, implemented, and validated NetStitcher, a system that employs a network of storage nodes to stitch together unutilized bandwidth, whenever and wherever it exists.  ...  It gathers information about leftover resources, uses a store-and-forward algorithm to schedule data transfers, and adapts to resource fluctuations.  ...  It uses the Sparse Periodic Auto-Regression (SPAR) estimator [16] to derive a prediction for the next 24 hours.  ... 
doi:10.1145/2018436.2018446 dblp:conf/sigcomm/LaoutarisSYR11 fatcat:cc752m3z6rcb5knwrgie7ycl4u