
Rethinking Data-Intensive Science Using Scalable Analytics Systems

Frank Austin Nothaft, Michael Linderman, Michael J. Franklin, Anthony D. Joseph, David A. Patterson, Matt Massie, Timothy Danford, Zhao Zhang, Uri Laserson, Carl Yeksigian, Jey Kottalam, Arun Ahuja (+1 others)
2015 Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data - SIGMOD '15  
We can attack the problem caused by exponential data growth by applying horizontally scalable techniques from current analytics systems to accelerate scientific processing pipelines.  ...  From building this system, we were able to distill a set of techniques for implementing scientific analyses efficiently using commodity "big data" systems.  ...  Author FAN is supported by a National Science Foundation Graduate Research Fellowship.  ... 
doi:10.1145/2723372.2742787 dblp:conf/sigmod/NothaftMDZLYKAH15 fatcat:nokfli3y4fe6zi6avrluhncvau

Data-intensive applications, challenges, techniques and technologies: A survey on Big Data

C.L. Philip Chen, Chun-Yang Zhang
2014 Information Sciences  
However, there is so much potential and highly useful value hidden in the huge volume of data.  ...  It is already true that Big Data has drawn huge attention from researchers in information sciences, policy and decision makers in governments and enterprises.  ...  Big Data has a deep relationship with e-Science [66], which is computationally intensive science that is usually implemented in distributed computing systems.  ... 
doi:10.1016/j.ins.2014.01.015 fatcat:mdnyfqqnh5citdt4moyfcggpfi

Computing infrastructure for big data processing

Ling Liu
2013 Frontiers of Computer Science  
With the push of big data, we are entering a new era of parallel computing driven by novel and ground breaking research innovation on elastic parallelism and scalability.  ...  With computing systems transforming from single-processor devices to the ubiquitous and networked devices and the datacenter-scale computing in the cloud, the parallelism has become ubiquitous at many  ...  The second contest is the richness of analytics. As more digital data and information technology penetrate into science and engineering fields, we are confronted with richer analytics to perform.  ... 
doi:10.1007/s11704-013-3900-x fatcat:dbhdg4b6r5a5jlbzcescjrvusy

Big data challenges in simulation-based science

Manish Parashar
2014 Proceedings of the sixth international workshop on Data intensive distributed computing - DIDC '14  
Traditional data analysis pipeline. We need to rethink the data management pipeline!  ...  sufficient for staging data. Tradeoffs in the analytics pipeline: impact of data placement and movement. Canonical system architecture composed of DRAM, NVRAM, SSD and disk storage levels and networked  ... 
doi:10.1145/2608020.2612731 dblp:conf/hpdc/Parashar14 fatcat:f4a4iueobjgfdhsq7mfytlhs3i

Eliminating Dark Bandwidth: A Data-Centric View of Scalable, Efficient Performance, Post-Moore [chapter]

Jonathan C. Beard, Joshua Randall
2017 Lecture Notes in Computer Science  
Most of computing research has focused on the computing technologies themselves versus how full systems make use of them (e.g., memory fabric, interconnect, software, and compute elements combined).  ...  This paper examines the problem of dark bandwidth and offers a holistic approach to reduce overall data movement within future compute systems.  ...  The time is ripe for a rethink of virtual memory and a rethink for the relationship of the operating system, memory system, and runtime.  ... 
doi:10.1007/978-3-319-67630-2_9 fatcat:7tyhuw3k6jbgvlrau6yfizkram

Big Data Analytics: A Perspective View

Suman Pandey
2017 International Journal of Advanced Research in Computer Science and Software Engineering  
Big data analytics challenges the situation of the present infrastructure of data storage management and also statistical data estimation.  ...  The process of diving into large amounts of data to discover patterns and disguised correlations is named as big data analytics.  ...  These unprecedented changes require us to rethink how to design, build and operate data processing components.  ... 
doi:10.23956/ijarcsse/sv7i5/0237 fatcat:mq75vo3n4rbihnc43mtpktzrru

Many-task computing for grids and supercomputers

Ioan Raicu, Ian T. Foster, Yong Zhao
2008 2008 Workshop on Many-Task Computing on Grids and Supercomputers  
Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive.  ...  Many task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using standard message passing interface commonly found in high performance  ...  We also thank the Argonne Leadership Computing Facility for allowing us to test MTC in practice on the IBM Blue Gene/P.  ... 
doi:10.1109/mtags.2008.4777912 dblp:conf/sc/RaicuFZ08 fatcat:u3fyhpolcrav7ggrmrgron4qja

Rethinking High Performance Computing System Architecture for Scientific Big Data Applications

Yong Chen, Chao Chen, Yanlong Yin, Xian-He Sun, Rajeev Thakur, William Gropp
2016 2016 IEEE Trustcom/BigDataSE/ISPA  
netCDF (PnetCDF), and Adapt-  ...  data-intensive sciences.  ...  climate sciences, astrophysics, computational chemistry, computational  ...  and study the impact of a new decoupled high performance computing system architecture for data-intensive sciences.  ... 
doi:10.1109/trustcom.2016.0248 dblp:conf/trustcom/ChenCYSTG16 fatcat:ci6xudtd4jaerm6zum2ezvrh5a

The Challenges of Global-scale Data Management

Faisal Nawab, Divyakant Agrawal, Amr El Abbadi
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
of data management systems.  ...  Global-scale data management (GSDM) empowers systems by providing higher levels of fault-tolerance, read availability, and efficiency in utilizing cloud resources.  ...  His current interests are in the area of scalable data management and data analysis in Cloud Computing environments, security and privacy of data in the cloud, and scalable analytics over big data.  ... 
doi:10.1145/2882903.2912571 dblp:conf/sigmod/NawabAA16 fatcat:4ja2phoh7zajxnckcupa25o65y

An Incremental Approach for Real-Time Big Data Visual Analytics

Ignacio Garcia, Ruben Casado, Abdelhamid Bouchachia
2016 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW)  
This should allow the data analyst to take decisions in real-time.  ...  To address such a requirement, this paper introduces a new approach for real-time visualization of extremely large data-at-rest as well as data-in-motion by showing intermediate results as soon as they  ...  To deal with these issues, we propose to build an innovative real-time visual analytics system, capable of addressing the scalability issues.  ... 
doi:10.1109/w-ficloud.2016.46 dblp:conf/ficloud/GarciaCB16 fatcat:yxcga4gnpbaj5nz5wouciy2be4

Systems and Algorithms for Large-scale Graph Analytics (Dagstuhl Seminar 14462)

Eiko Yoneki, Amitabha Roy, Derek Murray, Marc Herbstritt
2015 Dagstuhl Reports  
This report documents the program and the outcomes of Dagstuhl Seminar 14462 "Systems and Algorithms for Large-scale Graph Analytics".  ...  The seminar was a successful gathering of computer scientists from the domains of systems, algorithms, architecture and databases all of whom are interested in graph processing.  ...  about their scalability with data set size.  ... 
doi:10.4230/dagrep.4.11.59 dblp:journals/dagstuhl-reports/Yoneki0M14 fatcat:dilicps65jgipetfid3udj2q4i

Improving the Scalability of Cloud-Based Resilient Database Servers [chapter]

Luís Soares, José Pereira
2011 Lecture Notes in Computer Science  
Then we compare this proposal to other database server architectures using an analytical model focused on peak throughput and conclude that it provides the best performance/cost trade-off while at the  ...  Then we use a simple analytical model to seek scalability boundaries of different architectures and how shared resources in a cloud infrastructure can better be allocated.  ...  This means that efficiency when applying updates, for instance by using a dedicated low level interface, can offset the scalability obstacle presented by write intensive loads.  ... 
doi:10.1007/978-3-642-21387-8_11 fatcat:mivklz6tzra3tcq6jjegn2ngm4

Cybercosm: New Foundations for a Converged Science Data Ecosystem [article]

Mark Asch, François Bodin, Micah Beck, Terry Moore, Michela Taufer, Martin Swany, Jean-Pierre Vilotte
2021 arXiv pre-print
Enabling ground breaking science that makes full use of this new, data saturated research environment will require distributed systems that support dramatically improved resource sharing, workflow portability  ...  The Cybercosm vision presented in this white paper describes a radically different approach to the architecture of distributed systems for data-intensive science and its application workflows.  ...  As noted above, Cybercosm's transvisor layer does not simply provide technological tools; rather, it represents a rethinking of the architecture of data-intensive science application workflows by introducing  ... 
arXiv:2105.10680v3 fatcat:m2vytmdkdvdvzc2deal5odte4u

Towards a Data-Centric Architecture in the Automotive Industry

Daniel Alvarez-Coello, Daniel Wilms, Adnan Bekan, Jorge Marx Gómez
2021 Procedia Computer Science  
Several enterprises from different domains are currently focusing on improving their data architectures by re-defining the underlying data models to enable core support for analytics and artificial intelligence  ...  We believe that the cornerstones of a modern data architecture for automotive presented in this paper can be applied to facilitate the transformation.  ... 
doi:10.1016/j.procs.2021.01.215 fatcat:q45i2hyrfzajplxttipwrahsbi

Making a case for distributed file systems at Exascale

Ioan Raicu, Ian T. Foster, Pete Beckman
2011 Proceedings of the third international workshop on Large-scale system and application performance - LSAP '11  
We propose that future high-end computing systems be designed with non-volatile memory on every compute node, allowing every compute node to actively participate in the metadata and data management and  ...  Storage has the potential to be the Achilles heel of exascale systems.  ...  Dept. of Energy under Contract DE-AC02-06CH11357, as well as the National Science Foundation grant NSF-0937060 CIF-72 and NSF-1054974.  ... 
doi:10.1145/1996029.1996034 fatcat:bon3bizokzckhl5ajrzsqt7tqq