647 Hits in 3.7 sec

Massive Semantic Web data compression with MapReduce

Jacopo Urbani, Jason Maassen, Henri Bal
2010 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing - HPDC '10  
In this paper we propose a MapReduce algorithm that efficiently compresses and decompresses a large amount of Semantic Web data.  ...  The Semantic Web consists of many billions of statements made of terms that are either URIs or literals.  ...  MAPREDUCE DATA COMPRESSION: MapReduce can be used either alone or in combination with an external DBMS.  ... 
doi:10.1145/1851476.1851591 dblp:conf/hpdc/UrbaniMB10 fatcat:q4oryyljlbg3fevkpt4ee6kzzu
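The core technique behind such MapReduce-based compression is dictionary encoding: every distinct term (URI or literal) is replaced by a compact integer ID, and a dictionary maps IDs back to terms. The sketch below is a minimal single-machine illustration of that idea only; the function names and sample URIs are illustrative and it is not the authors' distributed implementation.

```python
# Minimal sketch of dictionary encoding for RDF statements:
# long term strings become small integer IDs plus a dictionary.
# Single-machine illustration, not the paper's MapReduce version.

def encode(triples):
    """Replace terms with integer IDs; return encoded triples and dictionary."""
    term_to_id = {}
    encoded = []
    for s, p, o in triples:
        ids = []
        for term in (s, p, o):
            if term not in term_to_id:
                term_to_id[term] = len(term_to_id)  # next fresh ID
            ids.append(term_to_id[term])
        encoded.append(tuple(ids))
    id_to_term = {i: t for t, i in term_to_id.items()}
    return encoded, id_to_term

def decode(encoded, id_to_term):
    """Reverse the encoding using the dictionary."""
    return [tuple(id_to_term[i] for i in ids) for ids in encoded]

triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/alice"),
]
enc, dic = encode(triples)
assert decode(enc, dic) == triples  # lossless round trip
```

Because RDF datasets reuse the same URIs billions of times, storing each term once in the dictionary and integers everywhere else is what yields the large compression ratios; distributing the ID assignment consistently across nodes is the hard part that MapReduce solves.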

A note on exploration of IoT generated big data using semantics

Rajiv Ranjan, Dhavalkumar Thakker, Armin Haller, Rajkumar Buyya
2017 Future Generation Computer Systems  
optimized classification algorithm for massive web page classification using semantic networks, such as Wikipedia and WordNet.  ...  The Semantic Web and its derivatives in the form of Linked data and Web of data can play a crucial role in addressing various big data exploration challenges.  ... 
doi:10.1016/j.future.2017.06.032 fatcat:nc3gui7mwrcp3djzzvbugzjorm

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Alfredo Cuzzocrea, Mirel Cosulschi, Roberto de Virgilio
2016 Algorithms  
When RDF graphs are defined on top of big (Web) data, they lead to the so-called large-scale RDF graphs, which reasonably populate the next-generation Semantic Web.  ...  Nowadays, a leading instance of big data is represented by Web data that lead to the definition of so-called big Web data.  ...  Contribution [66] studies how to apply compression paradigms to obtain scalable RDF processing with MapReduce.  ... 
doi:10.3390/a9010007 fatcat:wxorzsnnovbjvpnoc4prrucnty

Crowdsourcing MapReduce

Philipp Langhans, Christoph Wieser, François Bry
2013 Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion  
JSMapReduce is an implementation of MapReduce which exploits the computing power available in the computers of the users of a web platform by giving tasks to the JavaScript engines of their web browsers.  ...  Data collected with GWAPs is processed with MapReduce for building a semantic search index.  ...  This fact makes JSMapReduce suitable for MapReduce jobs that exploit the CPU rather than relying on massive data exchange.  ... 
doi:10.1145/2487788.2487915 dblp:conf/www/LanghansWB13 fatcat:pwmqofdrwjgzvddg77c4sr7elq

A Parallel Computing Method of Social Tag Cooccurrence Relation

Xiangqian Wang, Huizong Li, Yerong He
2013 Applied Mathematics & Information Sciences  
The research may help expand the application of social tagging systems with abundant, multiplex, complex annotated data.  ...  In recent years, cloud computing was proposed to deal with huge data, providing a new parallel computing framework for massive data processing.  ...  relationship from the massive annotating data. There is a simpler and more efficient method for dealing with this huge annotating data: a parallel computing method.  ... 
doi:10.12785/amis/070648 fatcat:67ve34oijjhwlpqwm4klmqwcly
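Tag co-occurrence counting parallelizes naturally in the word-count pattern this line of work builds on: a map phase emits a count of 1 for every pair of tags attached to the same resource, and a reduce phase sums the counts per pair. The local Python sketch below illustrates only that pattern; the function names and sample data are made up for the example and it is not the paper's Hadoop implementation.

```python
# Local sketch of the map/reduce pattern for tag co-occurrence:
# map emits ((tag_a, tag_b), 1) per co-occurring pair on a resource,
# reduce sums the counts per pair. Illustrative, not the paper's code.
from collections import Counter
from itertools import combinations

def map_phase(tag_sets):
    """Emit ((tag_a, tag_b), 1) for each pair of tags on one resource."""
    for tags in tag_sets:
        # Sorting gives a canonical key so (a, b) and (b, a) merge.
        for a, b in combinations(sorted(set(tags)), 2):
            yield (a, b), 1

def reduce_phase(pairs):
    """Sum the emitted counts for each tag pair."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts

# Each inner list is the tag set one user attached to one resource.
annotations = [
    ["mapreduce", "hadoop", "bigdata"],
    ["hadoop", "bigdata"],
    ["mapreduce", "bigdata"],
]
cooccurrence = reduce_phase(map_phase(annotations))
```

On a cluster, the shuffle between the two phases groups all counts for the same tag pair onto one reducer, which is what makes the computation scale to massive annotation data.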

Sempala: Interactive SPARQL Query Processing on Hadoop [chapter]

Alexander Schätzle, Martin Przyjaciel-Zablocki, Antony Neu, Georg Lausen
2014 Lecture Notes in Computer Science  
Driven by initiatives like Schema.org, the amount of semantically annotated data is expected to grow steadily towards massive scale, requiring cluster-based solutions to query it.  ...  For Hadoop-based applications, a common data pool (HDFS) provides many synergy benefits, making it very attractive to use these infrastructures for semantic data processing as well.  ...  Though Hadoop was not developed with the Semantic Web in mind, we advocate its adaptation for Semantic Web purposes for two main reasons: (1) The expected growth of semantic data requires solutions  ... 
doi:10.1007/978-3-319-11964-9_11 fatcat:nbt5xlop2jahlapxvfikcqkn5y

Big Data Summarization : Framework, Challenges and Possible Solutions

Shilpa G. Kolte, Jagdish W. Bakal
2016 Advanced Computational Intelligence: An International Journal (ACII)  
Future work in this direction can be to provide novel clustering algorithms and semantic data indexing and data compression techniques for big data summarization over the MapReduce framework.  ...  DATA PROCESSING: In the Hadoop framework, the following MapReduce and HBase components are used for data processing: MapReduce is a parallel processing framework which is massively scalable  ... 
doi:10.5121/acii.2016.3401 fatcat:jshfr2clf5cgxjwz7zbmnfz3n4

Scalable RDF Data Compression using X10 [article]

Long Cheng, Avinash Malik, Spyros Kotoulas, Tomas E Ward, Georgios Theodoropoulos
2014 arXiv   pre-print
The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings.  ...  A typical method for alleviating the impact of this problem is through the use of compression methods that produce more compact representations of the data.  ...  INTRODUCTION The Semantic Web is becoming mainstream.  ... 
arXiv:1403.2404v1 fatcat:isjn3vqu4rfybgv3pqpc36o7r4

Fast Compression of Large Semantic Web Data Using X10

Long Cheng, Avinash Malik, Spyros Kotoulas, Tomas E Ward, Georgios Theodoropoulos
2016 IEEE Transactions on Parallel and Distributed Systems  
(i.e. compress) huge RDF datasets, so as to meet the big data challenges from large data warehouses and the Semantic Web  ...  Based on that, we have introduced a new dictionary encoding algorithm for the fast compression of big Semantic Web data.  ... 
doi:10.1109/tpds.2015.2496579 fatcat:vk6rafssszfhlgj75ggjagfkfi

Cheetah

Songting Chen
2010 Proceedings of the VLDB Endowment  
In fact, each node with commodity hardware in our cluster is able to process raw data at 1 GB/s. Lastly, we show how to seamlessly integrate Cheetah into any ad hoc MapReduce jobs.  ...  This allows MapReduce developers to fully leverage the power of both MapReduce and data warehouse technologies.  ...  This way, MapReduce developers can take full advantage of the power of both MapReduce (massive parallelism and scalability) and data warehouse (easy and efficient data access) technologies.  ... 
doi:10.14778/1920841.1921020 fatcat:wjwjuyw6jjadrgp45dk72veiuy

The Internet of Things: A Survey from the Data-Centric Perspective [chapter]

Charu C. Aggarwal, Naveen Ashish, Amit Sheth
2012 Managing and Mining Sensor Data  
In addition, such data is often sensitive, and brings a number of privacy challenges associated with it.  ...  This chapter will discuss a data analytics perspective about mining and managing data associated with this phenomenon, which is now known as the Internet of Things.  ...  The work in [105] addresses the issue of Semantic Web compression with the use of the MapReduce framework.  ... 
doi:10.1007/978-1-4614-6309-2_12 fatcat:rgrbqjfllzb2tkgnixdrzfvdfy

Improving I/O Efficiency in Hadoop-Based Massive Data Analysis Programs

Kyong-Ha Lee, Woo Lam Kang, Young-Kyoon Suh
2018 Scientific Programming  
In this article, we address the problem of the I/O inefficiency in Hadoop-based massive data analysis by introducing our efficient modification of Hadoop.  ...  Apache Hadoop has been a popular parallel processing tool in the era of big data.  ...  Apache Hadoop [1], an open-sourced implementation of Google's MapReduce [2], is a prominent data processing tool that processes a massive volume of data with a shared-nothing architecture in a parallel  ... 
doi:10.1155/2018/2682085 fatcat:cynmnwikwnffpcn342p4omsp3m

The Application and Practice of Parallel Cloud Computing in ISP

Zhilan Huang, Guoliang Yang, Shengyong Ding
2011 2011 Sixth Open Cirrus Summit  
In order to deal with massive data processing, China Telecom has deployed a platform named distributed service engine, which is based on parallel cloud computing technology.  ...  On this platform, we also developed some tele-data processing systems and internet applications to show the effectiveness of this platform.  ...  THE PROBLEM OF MASS DATA MANAGEMENT: With the development of Web 2.0, the internet is developing at an unprecedented speed.  ... 
doi:10.1109/ocs.2011.7 fatcat:uwwlvrsyhrckdpefkkd3ga2gpy

Scalable Semantics – The Silver Lining of Cloud Computing

Andrew Newman, Yuan-Fang Li, Jane Hunter
2008 2008 IEEE Fourth International Conference on eScience  
Semantic inferencing and querying across large-scale RDF triple stores is notoriously slow.  ...  We then present some implementation details for our MapReduce-based RDF molecule store.  ...  The combination of MapReduce and Semantic Web technologies appears to offer a perfect solution to the problem of large-scale heterogeneous data integration, querying and reasoning.  ... 
doi:10.1109/escience.2008.23 dblp:conf/eScience/NewmanLH08 fatcat:xmeh3hi5mfdl3l2ekz7qdh54u4

Manimal

Michael J. Cafarella, Christopher Ré
2010 Proceedings of the 13th International Workshop on the Web and Databases - WebDB '10  
Manimal can address many different optimization opportunities, including projections, structure-aware data compression, and others.  ...  This paper proposes Manimal, which uses static code analysis to detect MapReduce program semantics and thereby enable wholly-automatic optimization of MapReduce programs.  ...  First, because there is no semantic information associated with a bytestream-oriented file, any compression in conventional MapReduce must be applied to the total set of bytes.  ... 
doi:10.1145/1859127.1859141 dblp:conf/webdb/CafarellaR10 fatcat:ueowovmgpzhwrnihmxguywz3mq