Verification and validation of MapReduce program model for parallel K-means algorithm on Hadoop cluster

Amresh Kumar, M. Kiran, B. R. Prathap
2013 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT)  
This experiment is a research study of the above MapReduce applications that also verifies and validates the MapReduce programming model for the parallel K-means algorithm on a four-node Hadoop cluster  ...  Thus, the data volumes processed by many applications will routinely cross the petabyte threshold, which in turn increases the computational requirements.  ...  In other words, MapReduce is a framework that runs on a computational cluster to extract knowledge from large datasets.  ... 
doi:10.1109/icccnt.2013.6726852 fatcat:hzl3hp7vzne3hnfbvnfy7ytprq

Practical Verification of MapReduce Computation Integrity via Partial Re-execution [article]

Eunjung Yoon, Peng Liu
2020 arXiv   pre-print
In this paper, we present a solution called V-MR (Verifiable MapReduce), which is a framework that verifies the integrity of MapReduce computation outsourced in the untrusted cloud via partial re-execution  ...  Despite a growing interest and recent progress in verifiable computation, the existing techniques are still not practical enough for big data processing due to high verification overhead.  ...  We investigated the computation result integrity of MapReduce applications as the case study.  ... 
arXiv:2002.09560v1 fatcat:25t25xsnabhpfkggrt5oo7pf5i

SecureMR: A Service Integrity Assurance Framework for MapReduce

Wei Wei, Juan Du, Ting Yu, Xiaohui Gu
2009 2009 Annual Computer Security Applications Conference  
In this paper, we present SecureMR, a practical service integrity assurance framework for MapReduce.  ...  To deploy MapReduce as a data processing service over open systems such as service oriented architecture, cloud computing, and volunteer computing, we must provide necessary security mechanisms to protect  ... 
doi:10.1109/acsac.2009.17 dblp:conf/acsac/WeiDYG09 fatcat:lw5b744m7fe7xh7uvmo5ahjzre

Towards verified cloud computing environments

Frederic Loulergue, Frederic Gava, Nikolai Kosmatov, Matthieu Lemerre
2012 2012 International Conference on High Performance Computing & Simulation (HPCS)  
We argue that most of the layers could be practically formally verified, even if the work to verify all levels is huge.  ...  In this paper we study a usual software stack of a cloud environment from the perspective of formal verification. This software stack ranges from applications to the hypervisor.  ...  In the case of MapReduce programs, a semantics analysis of Google's MapReduce has been conducted in [40].  ... 
doi:10.1109/hpcsim.2012.6266896 dblp:conf/ieeehpcs/LoulergueGKL12 fatcat:mijfmtaszvgdro3c763jr2tf3e

Research on Computing Efficiency of MapReduce in Big Data Environment

Tilei Gao, Ming Yang, Rong Jiang, Yu Li, Yao Yao, G. Lee
2019 ITM Web of Conferences  
The emergence of big data has had a great impact on the traditional computing mode, and the distributed computing framework represented by MapReduce has become an important solution to this problem.  ...  Based on big data, this paper studies in depth the principle and framework of MapReduce programming.  ...  Subsequently, we will transfer the operation to actual application scenarios to study how MapReduce can better exploit its computing advantages in practical applications.  ... 
doi:10.1051/itmconf/20192603002 fatcat:z5rufyf2mnaeharfhbnv5t2iem

Towards Trusted Services: Result Verification Schemes for MapReduce

Chu Huang, Sencun Zhu, Dinghao Wu
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
Recent development in Internet-scale data applications and services, combined with the proliferation of cloud computing, has created a new computing model for data intensive computing best characterized  ...  In this work, we focus on a unique security concern on the MapReduce architecture.  ...  RELATED WORK Security of distributed systems has been studied by many researchers, but only a few of them really focus on MapReduce. Xiao et al.  ... 
doi:10.1109/ccgrid.2012.77 dblp:conf/ccgrid/HuangZW12 fatcat:qnomdsvrqjgvxjq7edr5woyzsa

Assisting developers of Big Data Analytics Applications when deploying on Hadoop clouds

Weiyi Shang, Zhen Ming Jiang, Hadi Hemmati, Bram Adams, Ahmed E. Hassan, Patrick Martin
2013 2013 35th International Conference on Software Engineering (ICSE)  
Through a case study on three representative Hadoop-based BDA Apps, we show that our approach can rapidly direct the attention of BDA App developers to the major differences between the two deployments  ...  Knowledge of such differences is essential in verifying BDA Apps when analyzing big data in the cloud.  ...  Cloud computing platform: Hadoop This sub-section introduces Hadoop, a widely used cloud computing platform that we choose for our case studies.  ... 
doi:10.1109/icse.2013.6606586 dblp:conf/icse/ShangJHAHM13 fatcat:onitzimqdnbztiyrqvb3asvqry

MapReduce rationality verification based on object Petri net

2019 Journal of Systems Engineering and Electronics  
To solve this problem, a method for verifying the rationality of a MapReduce procedure before executing it on a computer cluster is proposed.  ...  The results from extensive case studies demonstrate that the proposed method is feasible and effective.  ...  Fig. 8 shows that the execution time of the OPN models of the two cases decreases with the increase of λ 1 .  ... 
doi:10.21629/jsee.2019.05.05 fatcat:nlnpjoh7e5ctlnfvnyhpqgjqai

On Scheduling Algorithms for MapReduce Jobs in Heterogeneous Clouds with Budget Constraints [chapter]

Yang Wang, Wei Shi
2013 Lecture Notes in Computer Science  
Our empirical studies verify the proposed optimal algorithm and show the efficiency of the greedy algorithm to minimize the scheduling length.  ...  In this paper, we consider task-level scheduling algorithms with respect to budget constraints for a bag of MapReduce jobs on a set of provisioned heterogeneous (virtual) machines in cloud platforms.  ...  Empirical Studies To verify and evaluate the proposed algorithms and study their performance behaviours in reality, we developed a Budget Distribution Solver (BDS) in Java that efficiently implements the  ... 
doi:10.1007/978-3-319-03850-6_18 fatcat:hxeo7hrsujgcherehjpfynmli4

IntegrityMR: Exploring Result Integrity Assurance Solutions for Big Data Computing Applications

Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du
2016 International Journal of Networked and Distributed Computing (IJNDC)  
In this paper, we propose IntegrityMR, a multi-public clouds architecture-based solution, which performs the MapReduce-based result integrity check techniques at two alternative layers: the task layer  ...  Our experimental results show that solutions in both layers offer a high result integrity but non-negligible performance overheads.  ...  We build a prototype system that supports most big data applications. At the application layer, we make a case study on Pig Latin 8, a popular MapReduce based big data management application.  ... 
doi:10.2991/ijndc.2016.4.2.5 fatcat:hg7qph5vavfexmjmwqijdh4pqm

Scheduling MapReduce Jobs under Multi-Round Precedences [article]

Dimitris Fotakis, Ioannis Milis, Orestis Papadigenopoulos, Vasilis Vassalos, Georgios Zois
2016 arXiv   pre-print
Since the number of rounds per job in typical MapReduce algorithms is a small constant, our scheduling algorithms achieve a small approximation ratio in practice.  ...  We consider non-preemptive scheduling of MapReduce jobs with multiple tasks in the practical scenario where each job requires several map-reduce rounds.  ...  First, in terms of modeling the MapReduce scheduling process: (i) We consider the practical scenario of multiround multi-task MapReduce jobs and capture their task dependencies, and (ii) we study both  ... 
arXiv:1602.05263v1 fatcat:hdkyrqfwtjat3bdo4bxzgqth3u

IntegrityMR: Integrity assurance framework for big data analytics and management applications

Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du
2013 2013 IEEE International Conference on Big Data  
Big data analytics and knowledge management are becoming a hot topic with the emerging techniques of cloud computing and big data computing models such as MapReduce.  ...  We design and implement the system at both layers based on Apache Hadoop MapReduce and Pig Latin, and perform a series of experiments with popular big data analytics and management applications such as  ...  At the application layer, we make a case study on Pig Latin [8], a popular MapReduce based big data management application.  ... 
doi:10.1109/bigdata.2013.6691780 dblp:conf/bigdataconf/WangWSDD13 fatcat:7e5ntsf7n5brvf3ahxod47rksm

Scalability and Validation of Big Data Bioinformatics Software

Andrian Yang, Michael Troup, Joshua W.K. Ho
2017 Computational and Structural Biotechnology Journal  
We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment.  ...  Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs.  ...  Acknowledgements This work was supported in part by funds from the New South Wales Ministry of Health, a National Health and Medical Research Council/National Heart Foundation Career Development Fellowship  ... 
doi:10.1016/j.csbj.2017.07.002 pmid:28794828 pmcid:PMC5537105 fatcat:nnkrlwg35fd3hkpbg2jtosdicq

A Case for Understanding End-to-End Performance of Topic Detection and Tracking Based Big Data Applications in the Cloud [chapter]

Meisong Wang, Rajiv Ranjan, Prem Prakash Jayaraman, Peter Strazdins, Pete Burnap, Omer Rana, Dimitrios Georgakopoulos
2016 Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering  
In this vision paper, we propose a layered performance model for topic detection and tracking based big data analytic applications that takes into account big data characteristics, the data and event flow  ...  On the other hand, cloud computing has recently emerged as the platform that can provide an effective and economical infrastructure for collection and analysis of big data produced by applications such  ...  Many MapReduce models assume that the shuffle part starts when all the Map tasks have been done. However, it is not always the case.  ... 
doi:10.1007/978-3-319-47063-4_33 fatcat:als5yfffvvgunlg6ippnwz4jfq

Algorithms for Managing, Querying and Processing Big Data in Cloud Environments

Alfredo Cuzzocrea
2016 Algorithms  
They study the problem of effectively and efficiently computing traversals of large-scale RDF graphs over MapReduce and propose a solution that is based on the Breadth First Search (BFS) strategy for visiting  ...  In the solution proposed by authors, MMAS is combined with Spark MapReduce to execute the path building and the pheromone operation in a distributed computer Cluster.  ...  Acknowledgments: The Special Issue editor would like to express his gratitude to all contributors and reviewers whose efforts allowed making this special issue a success, as well as to the editorial staff  ... 
doi:10.3390/a9010013 fatcat:gw6w3qv53fe5dbykhkciuzawsq