
99 Deduplication Problems

Philip Shilane, Ravi Chitloor, Uday Kiran Jonnala
2016 USENIX Workshop on Hot Topics in Storage and File Systems  
Based on feedback from customers as well as internal architecture discussions, we present new deduplication problems that will hopefully spur the next generation of research.  ...  While future research will continue to optimize these areas, we believe that there are numerous novel, deduplication-specific problems that have been largely ignored in the academic community.  ...  Creating a list of new deduplication problems is an ongoing task, as each advancement triggers another set of problems, and features added to non-deduplicated storage systems are requested on deduplicated  ... 
dblp:conf/hotstorage/ShilaneCJ16 fatcat:ygh2yk45xndf5bzp37kedmyera

Dynamic Deduplication Algorithm for Cross-User Duplicate Data in Hybrid Cloud Storage

Feng Gang, DaHuan Wei, Mukesh Soni
2022 Security and Communication Networks  
Consequently, many cloud storage providers will implement deduplication to compress data, reduce transfer bandwidth, and reduce cloud storage space.  ...  Merkle hash trees are constructed using additional encryption algorithms to generate encryption keys during file- and block-level deduplication, ensuring that generated ciphertexts are unpredictable.  ...  Table 3: Redundant data throughput  ... 
doi:10.1155/2022/8354903 fatcat:vsnckhuf25e73c52kvjyzovg7i

PeerDedupe: Insights into the Peer-Assisted Sampling Deduplication

Y. Xing, Z. Li, Y. Dai
2010 2010 IEEE Tenth International Conference on Peer-to-Peer Computing (P2P)  
Driven by the problems behind the mainstream server-side deduplication schemes, recently there has been a tendency of introducing peer-assisted methods into the deduplication systems.  ...  In this paper, we conduct in-depth and quantitative investigation on the peer-assisted deduplication.  ...  As Fig. 2 shows, 4 MVHs could cover 99% inter-peer duplication. We target to achieve 99% estimation accuracy with 4 MVHs, i.e., eliminating 98% (≈ 99% × 99%) inter-peer duplication.  ... 
doi:10.1109/p2p.2010.5570004 dblp:conf/p2p/XingLD10 fatcat:sjpyan4rn5bn7mkm5du76p3kti

Deduplication Potential of HPC Applications' Checkpoints

Jürgen Kaiser, Ramy Gad, Tim Süß, Federico Padua, Lars Nagel, André Brinkmann
2016 2016 IEEE International Conference on Cluster Computing (CLUSTER)  
A viable solution to remove the resulting pressure from the I/O backends is to deduplicate the checkpoints.  ...  In this paper, we perform a broad study about the deduplication behavior of HPC application checkpointing and its impact on system design.  ... 
doi:10.1109/cluster.2016.32 dblp:conf/cluster/KaiserGSPNB16 fatcat:zsl4d2lmjjbd3h6hxh5lxk7kpa

Exploring Shared State in Key-Value Store for Window-Based Multi-pattern Streaming Analytics

Ovidiu-Cristian Marcu, Radu Tudoran, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu, Maria S. Perez-Hernandez
2017 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)  
We design a deduplication method specifically for window-based operators that rely on key-value stores to hold a shared state.  ...  We experiment with a synthetically generated workload while considering several deduplication scenarios and based on the results, we identify several potential areas of improvement.  ...  We summarize our contributions as follows: • We formulate the problem of deduplication in the context of stream processing (Section 2).  ... 
doi:10.1109/ccgrid.2017.126 dblp:conf/ccgrid/MarcuTNCAP17 fatcat:5oxtilgnibhj5dhp3zb47jhe2a

An Effective System for Storing Data and Resources using Cloud Computing

Specifically, going for conducting the two facts stability and deduplication in cloud, we advocate cozy systems, unequivocally SecCloud and SecCloud+.  ...  Regardless, because the redistributed appropriated placing away isn't simply strong, it raises security stacks at the most successful procedure to see statistics deduplication in cloud at the same time  ...  The second one problem is comfortable deduplication. The smart errand of cloud affiliations is joined by using making volumes of informational rundown away at remote cloud servers.  ... 
doi:10.35940/ijitee.f1089.0486s419 fatcat:pwfmrd22bve7dpthznd6qvq4hu

Clustering-based acceleration for virtual machine image deduplication in the cloud environment

Jiwei Xu, Wenbo Zhang, Zhenyu Zhang, Tao Wang, Tao Huang
2016 Journal of Systems and Software  
As a result, VM image deduplication is a common daily activity in datacenters. Our previous work Crab is such a product and it is on duty regularly in our datacenter.  ...  Experimental results show that it significantly reduces the performance interference to hosting virtual machine with an acceptable increase in disk space usage, compared with existing deduplication methods  ...  to address the virtual machine image deduplication problem.  ... 
doi:10.1016/j.jss.2016.02.021 fatcat:iltur2jsabdi7onbu547msygmi

Secure and constant cost public cloud storage auditing with deduplication

Jiawei Yuan, Shucheng Yu
2013 2013 IEEE Conference on Communications and Network Security (CNS)  
We prove the security of our proposed scheme based on the Computational Diffie-Hellman problem, the Static Diffie-Hellman problem and the t-Strong Diffie-Hellman problem.  ...  Our design allows deduplication of both files and their corresponding authentication tags. Data integrity auditing and storage deduplication are achieved simultaneously.  ...  , the Static Diffie-Hellman problem or the t-SDH problem.  ... 
doi:10.1109/cns.2013.6682702 dblp:conf/cns/YuanY13 fatcat:42ofmrjqxbbebnboorfqrp54hi

Modeling the Fault Tolerance Consequences of Deduplication

Eric W.D. Rozier, William H. Sanders, Pin Zhou, Nagapramod Mandagere, Sandeep M. Uttamchandani, Mark L. Yakushev
2011 2011 IEEE 30th International Symposium on Reliable Distributed Systems  
We present a framework composed of data analysis methods and a model of data deduplication that is useful in studying the reliability impact of data deduplication.  ...  Modern storage systems are employing data deduplication with increasing frequency.  ...  Deduplication itself poses two potential reliability problems.  ... 
doi:10.1109/srds.2011.18 dblp:conf/srds/RozierSZMUY11 fatcat:4z2nkaaxxredfgl35qjgt6cwxe

A Comprehensive Study of the Past, Present, and Future of Data Deduplication

Wen Xia, Hong Jiang, Dan Feng, Fred Douglis, Philip Shilane, Yu Hua, Min Fu, Yucheng Zhang, Yukun Zhou
2016 Proceedings of the IEEE  
Finally, we outline the open problems and future research directions facing deduplication-based storage systems.  ...  data deduplication process.  ...  Zadok for valuable discussions about deduplicated storage literature.  ... 
doi:10.1109/jproc.2016.2571298 fatcat:krfdbgm5pjemnmaswml7k4uv4e

Content Sharing Graphs for Deduplication-Enabled Storage Systems

Maohua Lu, Cornel Constantinescu, Prasenjit Sarkar
2012 Algorithms  
storage systems, whereas in general the partitioning problem is NP-complete.  ...  First, a quasi-linear algorithm was developed to partition deduplication domains with a minimal amount of deduplication loss (i.e., data replicated across partitioned domains) in commercial deduplication-enabled  ...  Routray of IBM Research for his insights in IBM's commercial deduplicated storage systems, and Colin S.  ... 
doi:10.3390/a5020236 fatcat:rzaf4d7hnzfmbnrlqtw2zxoqbe

An Enhanced Unsupervised Fuzzy Expectation Maximization Clustering for Deduplication of Records in Big data

2019 International Journal of Recent Technology and Engineering  
This work converts whole content of data to numeric values for applying deduplication using radix method.  ...  These kinds of missing values and formatting problems are determined and data are converted to a standard format.  ...  To overcome this problem, the proposed model used radix formula which converts the string, numeric or date format values to numeric format.  ... 
doi:10.35940/ijrte.c1269.1083s219 fatcat:pkrs4igpongidk6wmrvhn3wbha

Research Method of Data Deduplication Backup System

Together, they are capable of casting off 99% of the disk accesses for deduplication of real global workloads.  ...  The merging operation suffers from a normal common overall performance problem.  ...  FORESTALL On this paintings we've got defined facts fragmentation in structures with in-line deduplication and quantified impact of fragmentation because of inter-version deduplication on re-save pace.  ... 
doi:10.35940/ijitee.k1015.09811s219 fatcat:spnzliuyrbalzhp42fdzsumfyy

Provable ownership of files in deduplication cloud storage

Chao Yang, Jian Ren, Jianfeng Ma
2013 Security and Communication Networks  
, is proposed to identify the client's deduplication and save the bandwidth of uploading copies of existing files to the server.  ...  In this paper, to solve this problem, we propose a cryptographically secure and efficient scheme for a client to prove to the server his ownership on the basis of actual possession of the entire original  ...  However, client-side deduplication introduces some new security problems. Harnik et al.  ... 
doi:10.1002/sec.784 fatcat:sxtqcyo6wvhvldwuyxirkgcqmm

Similarity and Locality Based Indexing for High Performance Data Deduplication

Wen Xia, Hong Jiang, Dan Feng, Yu Hua
2015 IEEE Transactions on Computers  
One of the main challenges for centralized data deduplication is the scalability of fingerprint-index search.  ...  Data deduplication has gained increasing attention and popularity as a space-efficient approach in backup storage systems.  ...  SiLo removes more than 99 percent of duplicate data when N > 99.  ... 
doi:10.1109/tc.2014.2308181 fatcat:szqge3jt5zhsnnnn7yhntj64j4