
DedupT: Deduplication for tape systems

Abdullah Gharaibeh, Cornel Constantinescu, Maohua Lu, Ramani Routray, Anurag Sharma, Prasenjit Sarkar, David Pease, Matei Ripeanu
2014 30th Symposium on Mass Storage Systems and Technologies (MSST)  
efficient placement on tapes; and (iii) presents the design and evaluation of novel cross-tape and on-tape chunk placement algorithms that alleviate tape mount time overhead and reduce on-tape data fragmentation  ...  However, deduplication has not been used for tape-based pools: tape characteristics, such as high mount and seek times combined with data fragmentation resulting from deduplication create a toxic combination  ...  a major contender for archival load.  ... 
doi:10.1109/msst.2014.6855555 dblp:conf/mss/GharaibehCLRSSPR14 fatcat:lbvxcmecgbfpfllu6r3ssi5bbq
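The DedupT snippet above describes chunk-level deduplication and the fragmentation it causes on tape. As a rough illustration of the general dedup idea only (fixed-size chunking with a fingerprint index; the function names and chunk size are my assumptions, not the paper's algorithms):

```python
import hashlib

def dedup_chunks(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep only unique ones.

    Returns (recipe, store): the recipe lists chunk fingerprints in
    original order; the store maps fingerprint -> chunk bytes.
    """
    store = {}
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        recipe.append(fp)
        store.setdefault(fp, chunk)  # keep first copy of each chunk
    return recipe, store

def restore(recipe, store):
    """Rebuild the original byte stream from recipe + store."""
    return b"".join(store[fp] for fp in recipe)
```

On tape, the catch highlighted by the paper is that the chunks referenced by one file's recipe may be scattered across the medium, so restoring a file incurs seeks and mounts that this in-memory sketch does not model.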

Fair-share scheduling algorithm for a tertiary storage system

Pavel Jakl, Jérôme Lauret, Michal Šumbera
2010 Journal of Physics: Conference Series  
Starting from an explanation of the key criterion of such a policy, we will present evaluations and comparisons of three different MSS file restoration algorithms which meet fair-share requirements, and discuss  ...  If a robotic system is used as the primary storage solution, the intrinsically long access times (latencies) can dramatically affect the overall performance.  ...  Essentially, FCFS is a simplistic scheduling algorithm, mainly used for base-lining and comparison purposes.  ... 
doi:10.1088/1742-6596/219/5/052005 fatcat:ma3x3lwdmfblja7tsyioqytzvq
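The snippet names FCFS as the baseline restoration policy. A minimal sketch of why FCFS is a weak baseline on a tape library (my illustration, not the paper's algorithms): strictly honoring arrival order can remount the same tape repeatedly, whereas batching requests per tape pays one mount per tape at the cost of arrival-order fairness.

```python
def mounts_fcfs(requests):
    """Count tape mounts when requests are served strictly in arrival
    order: a new mount is needed whenever the required tape changes."""
    mounts = 0
    loaded = None
    for _name, tape in requests:
        if tape != loaded:
            mounts += 1
            loaded = tape
    return mounts

def mounts_grouped(requests):
    """Count mounts when requests are batched per tape: one mount per
    distinct tape, trading fairness for fewer mounts."""
    return len({tape for _name, tape in requests})
```

For four requests alternating between two tapes, FCFS pays four mounts while grouping pays two; a fair-share scheduler sits between these extremes.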

Cheap data analytics using cold storage devices

Renata Borovica-Gajić, Raja Appuswamy, Anastasia Ailamaki
2016 Proceedings of the VLDB Endowment  
Driven by the price/performance aspect of CSD, this paper makes a case for using CSD as a replacement for both the capacity and archival tiers of enterprise databases.  ...  In such a setting, data waterfalls from an SSD-based high-performance tier when it is "hot" (frequently accessed) to a disk-based capacity tier and finally to a tape-based archival tier when "cold" (rarely  ...  Donald Kossmann, the anonymous reviewers, and the DIAS laboratory members for their constructive feedback that substantially improved the presentation of the paper.  ... 
doi:10.14778/2994509.2994521 fatcat:uerdxyl4wjcv3jqklys6nm266u
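The "waterfall" tiering the snippet describes (SSD for hot, disk for warm, tape for cold) can be sketched as a recency-based placement rule. The thresholds and tier names below are illustrative assumptions, not values from the paper:

```python
def assign_tier(days_since_access: int,
                hot_days: int = 7, warm_days: int = 90) -> str:
    """Map a file to a storage tier by recency of access.

    Thresholds are hypothetical; real systems also weigh access
    frequency, object size, and migration cost.
    """
    if days_since_access <= hot_days:
        return "ssd"    # high-performance tier
    if days_since_access <= warm_days:
        return "disk"   # capacity tier
    return "tape"       # archival tier
```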

A Backup-as-a-Service (BaaS) Software Solution

Heitor Faria, Priscila Solís, Jarcir Bordim, Rodrigo Hagstrom
2019 Proceedings of the 9th International Conference on Cloud Computing and Services Science  
Still, most backup systems have been designed and optimized for outdated environments and use cases.  ...  Backup is a replica of any data that can be used to restore its original form.  ...  [43], load balancing is an essential part of Cloud Computing and elastic scalability.  ... 
doi:10.5220/0007250902250232 dblp:conf/closer/FariaSBH19 fatcat:jribus3jsjb3rlzscw555akkse

Survey of Storage Systems for High-Performance Computing

2018 Supercomputing Frontiers and Innovations  
In current supercomputers, storage is typically provided by parallel distributed file systems for hot data and tape archives for cold data.  ...  A thorough understanding of today's storage infrastructures, including their strengths and weaknesses, is crucially important for designing and implementing scalable storage systems suitable for demands  ...  This material reflects only the authors' view and the EU commission is not responsible for any use that may be made of the information it contains.  ... 
doi:10.14529/jsfi180103 fatcat:hi3qctpl7rfvjgl53pxmqwqviy

Business Model With Alternative Scenarios (D4.1)

Jakob Luettgau, Julian Kunkel, Jens Jensen, Bryan Lawrence
2017 Zenodo  
Storage models are evaluated in several scenarios, each introducing some architectural changes to currently deployed high-performance systems, and the cost and performance implications are discussed.  ...  and performance metrics.  ...  Load Balancers: either special switches or compute-only nodes responsible for assigning targets for read and write access.  ... 
doi:10.5281/zenodo.1228749 fatcat:fdwtwtfcwjhhbkiusam6njs26a

Scalable full-text search for petascale file systems

Andrew W. Leung, Ethan L. Miller
2008 3rd Petascale Data Storage Workshop  
Using a novel index partitioning mechanism that utilizes file system namespace locality, we are able to improve search scalability and performance and we discuss how such a design can potentially improve  ...  As file system capacities reach the petascale, it is becoming increasingly difficult for users to organize, find, and manage their data.  ...  Finally, we thank the anonymous reviewers for their insightful feedback.  ... 
doi:10.1109/pdsw.2008.4811884 fatcat:nhgxd5h46nbhlfz55sxrvmmady

The National Scalable Cluster Project: Three Lessons about High Performance Data Mining and Data Intensive Computing [chapter]

Robert Grossman, Robert Hollebeek
2002 Massive Computing  
The National Scalable Cluster Project (NSCP) collaboration of research groups has pioneered the application of cluster computing and high performance wide area networks to a variety of problems in data  ...  NSCP also developed several software packages for data intensive computing using the Meta-Cluster.  ...  Using load balancing on parallel nodes, data striping across parallel disks and parallel execution on many nodes driven by parallel aware application programming interfaces (MPI for example), an application  ... 
doi:10.1007/978-1-4615-0005-6_23 fatcat:j4enzen4inhdvpyffhd5kwi6v4

A Set of Transfer-Related Services

Justin Littman
2009 D-Lib Magazine  
We also thank our colleagues at HP and the British Library for their support and comments. Of course, any errors are solely the responsibility of the authors.  ...  Acknowledgments We gratefully acknowledge the help, advice and ideas of the LOCKSS project members, especially David Rosenthal and Vicky Reich.  ...  The key drivers for archival storage systems are data longevity, low cost, and scalability over time, technologies and vendors.  ... 
doi:10.1045/january2009-littman fatcat:63qis423rrgede3t7wbnbsbioa

Storage management solutions and performance tests at the INFN Tier-1

M Bencivenni, A Carbone, A Chierici, A D'Apice, D D Girolamo, L dell'Agnello, M Donatelli, G Donvito, A Fella, A Forti, F Furano, D Galli (+18 others)
2008 Journal of Physics: Conference Series  
Performance, reliability and scalability in data access are key issues in the context of HEP data processing and analysis applications.  ...  We also describe the deployment of a StoRM SRM instance at CNAF, configured to manage a GPFS file system, presenting and discussing its performance.  ...  is higher than a configurable threshold and the data have already been migrated to tape.  ... 
doi:10.1088/1742-6596/119/5/052003 fatcat:uwtnnx627rdbdlthyw5vuoyzw4

Petabyte-scale data migration at CERNBox

Hugo Gonzalez Labrador, Jose Ramon Mendez Reboredo
2019 Zenodo  
I decided to take the opportunity to use this activity as the main source for this thesis.  ...  The FDO section operates and supports the storage and file system services for physics. I joined the FDO section as a Technical Student in 2014 and currently I am a Staff member of the section.  ...  The group is divided into three sections: IT-ST-TAB (Tapes, Archives and Backups Section), IT-ST-FDO (File systems and Disk Operations Section) and IT-ST-AD (Design and Transitions Section).  ... 
doi:10.5281/zenodo.3402900 fatcat:i2ywssmct5fpjej4e2uuik5s2e

D12.4: Performance Optimized Lustre

Ernest Artiaga, Alberto Miranda
2012 Zenodo  
In this line we have observed that in both Lustre and GPFS there are some scalability issues that reduce the performance of metadata operations when many files are used by the applications or when the number  ...  After our mechanism has been added to the [...]  ...  By now, our mechanism supports the following combinations of policies for the caching zone and the archival zone: RAID0+SEQUENTIAL: this variant uses a sequential strategy for the archival zone and a  ... 
doi:10.5281/zenodo.6572353 fatcat:nkqucavqavapzabifinzn6ucxq

Rucio: Scientific Data Management

Martin Barisits, Thomas Beermann, Frank Berghaus, Brian Bockelman, Joaquin Bogado, David Cameron, Dimitrios Christidis, Diego Ciangottini, Gancho Dimitrov, Markus Elsing, Vincent Garonne, Alessandro di Girolamo (+18 others)
2019 Computing and Software for Big Science  
Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is now continuously extended to support the LHC experiments and other diverse scientific communities  ...  Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale.  ...  We also thank former colleagues Miguel Branco, Pedro Salgado, and Florbela Viegas for their contributions to the Rucio predecessor system DQ2.  ... 
doi:10.1007/s41781-019-0026-3 fatcat:3erfeeamhvg5ndqiiiibaunphm

Rucio - Scientific Data Management [article]

Martin Barisits, Thomas Beermann, Frank Berghaus, Brian Bockelman, Joaquin Bogado, David Cameron, Dimitrios Christidis, Diego Ciangottini, Gancho Dimitrov, Markus Elsing, Vincent Garonne, Alessandro di Girolamo, Luc Goossens, Wen Guan (+15 others)
2019 arXiv pre-print
Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is continuously extended to support the LHC experiments and other diverse scientific communities  ...  Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their volumes of data.  ...  We also thank former colleagues Miguel Branco, Pedro Salgado, and Florbela Viegas for their contributions to the Rucio predecessor system DQ2.  ... 
arXiv:1902.09857v1 fatcat:6rrkmqtlqrfq5koydykwv7vlnq

Technologies for Large Data Management in Scientific Computing

ALBERTO PACE
2014 International Journal of Modern Physics C  
This paper focuses on the strategies in use: it reviews the various components that are necessary for an effective solution that ensures the storage, the long term preservation, and the worldwide distribution  ...  The paper also mentions several examples of data management solutions used in High Energy Physics for the CERN Large Hadron Collider (LHC) experiments in Geneva, Switzerland which generate more than 30,000  ...  It can be used for cold storage and archiving where data is transferred using the "freight train" approach and can be used as a valid alternative to tape storage with the advantage that random access mode  ... 
doi:10.1142/s0129183114300012 fatcat:6ry5jhuiizgwzocypegk32aywm