Efficient runtime thread management for the nano-threads programming model [chapter]

Dimitrios S. Nikolopoulos, Eleftherios D. Polychronopoulos, Theodore S. Papatheodorou
1998 Lecture Notes in Computer Science  
We evaluate the exploitation of processor affinity for the management of nano-thread contexts, and the use of hierarchical queues to implement user-level scheduling strategies for applications with inherent  ...  The use of hierarchical queues gives significant performance improvements between 17% and 40%, compared to scheduling strategies that use local queues. Throughout this paper the term processors refers  ...  (CEPBA) for providing us access to their Origin2000 system and the referees for their helpful comments.  ... 
doi:10.1007/3-540-64359-1_688 fatcat:54eahhbswfb2pe37piqtibzpoi

Cpp-Taskflow v2: A General-purpose Parallel and Heterogeneous Task Programming System at Scale [article]

Tsung-Wei Huang, Dian-Lun Lin, Yibo Lin, Chun-Xun Lin
2020 arXiv   pre-print
The Cpp-Taskflow project addresses the long-standing question: How can we make it easier for developers to write parallel and heterogeneous programs with high performance and simultaneous high productivity  ...  Our programming model empowers users with both static and dynamic task graph constructions to incorporate a broad range of computational patterns including hybrid CPU-GPU computing, dynamic control flow  ...  The graph has 6 weak dependencies and 1 strong dependency.  ... 
arXiv:2004.10908v2 fatcat:snwlszx6bnhnflbpmddx5ileyi

Accelerating Bowtie2 with a lock-less concurrency approach and memory affinity

Claudia Misale
2014 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing  
Only the reference genome is left shared. As a further optimisation, the Master and each Worker were pinned on cores and the reference genome was allocated interleaved among memory nodes.  ...  A concrete example is Bowtie2, one of the fastest (concurrent, Pthread-based) state-of-the-art non-GPU-based alignment tools.  ...  ACKNOWLEDGMENT This work has been supported by the European Union Framework 7 grant IST-2011-288570 ParaPhrase: Parallel Patterns for Adaptive Heterogeneous Multicore Systems,  ... 
doi:10.1109/pdp.2014.50 dblp:conf/pdp/Misale14 fatcat:6k225zt4hza7fn44fr6phsqvjq

Extending database task schedulers for multi-threaded application code

Florian Wolf, Iraklis Psaroudakis, Norman May, Anastasia Ailamaki, Kai-Uwe Sattler
2015 Proceedings of the 27th International Conference on Scientific and Statistical Database Management - SSDBM '15  
We present a general approach to address this issue by integrating shared memory programming solutions into the task schedulers of databases.  ...  Multi-threaded application code, however, introduces a resource competition between the threads of applications and the threads of the database task scheduler.  ...  Jobs are passed to the JobExecutor together with their execution priority, and pushed to the priority queues.  ... 
doi:10.1145/2791347.2791379 dblp:conf/ssdbm/WolfPMAS15 fatcat:skw2oyo4jjbmxpnfqjdhpi32hq

Space and time efficient execution of parallel irregular computations

Cong Fu, Tao Yang
1997 SIGPLAN notices  
The irregular parallelism is modeled by task dependence graphs with mixed granularities. The trade-off in achieving both time and space efficiency is investigated.  ...  Solving problems of large sizes is an important goal for parallel machines with multiple CPU and memory resources.  ...  Other anti/output dependence edges can be eliminated by program transformation. A transformed dependence graph contains true dependencies only.  ... 
doi:10.1145/263767.263773 fatcat:fe7tm53q6re6tn4vg54u4f7llm

High-Performance Statistical Computing in the Computing Environments of the 2020s [article]

Seyoon Ko, Hua Zhou, Jin J. Zhou, Joong-Ho Won
2021 arXiv   pre-print
Deep learning software libraries make programming statistical algorithms easy and enable users to write code once and run it anywhere – from a laptop to a workstation with multiple graphics processing  ...  As a case in point, we analyze the onset of type-2 diabetes from the UK Biobank with 200,000 subjects and about 500,000 single nucleotide polymorphisms using the HPC ℓ_1-regularized Cox regression.  ...  Jobs in a queue are managed and prioritized by a job scheduler. • Master: an instance that manages the job scheduler. • Worker: an instance that executes the jobs. • Job scheduler: an application program  ... 
arXiv:2001.01916v3 fatcat:p745jgoj3bgixpwn54npychoci

CloudBuild: Microsoft's Distributed and Caching Build Service

Hamed Esfahani, Jonas Fietz, Qi Ke, Alexei Kolomiets, Erica Lan, Erik Mavrinac, Wolfram Schulte, Newton Sanches, Srikanth Kandula
2016 Proceedings of the 38th International Conference on Software Engineering Companion - ICSE '16  
It is essential that this continuous integration scales, guarantees short feedback cycles, and functions reliably with minimal human intervention.  ...  Thousands of Microsoft engineers build and test hundreds of software products several times a day.  ...  Differences in scheduling are described later ( § . ). Extracting Dependency Graph. The dependency graph of a build determines the order of execution.  ... 
doi:10.1145/2889160.2889222 dblp:conf/icse/EsfahaniFKKLMSS16 fatcat:2wmnt6sqhvhujoegd2kkurxxtm

Abstracts of Current Computer Literature

1970 IEEE transactions on computers  
APL Operations 7996 Microprogrammed Instructions in a Parallel Processor 7996 Feedback Queueing System for Time-Shared Computers with Batch Arrivals, Bulk Service, and Queue-Dependent Service Time  ...  ., November 1969; CFSTI, AD 699 928, $3.00. 8016 8017 8021 A Feedback Queueing System With Batch Arrivals, Bulk Service, and Queue-Dependent Service Time, L. E. N.  ... 
doi:10.1109/t-c.1970.222819 fatcat:oauihu4f65hwjir4uopjkaxrk4

Big Graph Analytics Platforms

Da Yan, Yingyi Bu, Yuanyuan Tian, Amol Deshpande
2017 Foundations and Trends in Databases  
Since many graph algorithms are iterative, Pregel keeps the graph data in the main memory and adopts an iterative, message-passing computation model (inspired by the well-known Bulk Synchronous Parallel  ...  Pregel+ is written in C/C++, and thus recycles memory in time and keeps the memory footprint small.  ...  Then, parallel asynchronous I/O requests are submitted to the SSD for those required pages that are not in the buffer pool.  ... 
doi:10.1561/1900000056 fatcat:ucqrtzo4q5g2lpj6dmp7jv3e5m

HiCOPS: High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry based Omics Data [article]

Muhammad Haseeb, Fahad Saeed
2021 arXiv   pre-print
Existing serial, and high-performance computing (HPC) search engines, otherwise highly successful, are known to exhibit poor-scalability with increasing size of theoretical search-space needed for increased  ...  Consequently, the bottleneck for computational techniques is the communication costs of moving the data between hierarchy of memory, or processing units, and not the arithmetic operations.  ...  Acknowledgments This work used the NSF Extreme Science and Engineering Discovery Environment (XSEDE) Supercomputers through allocations: TG-CCR150017 and TG-ASC200004.  ... 
arXiv:2102.02286v2 fatcat:zllqiadjqjftxhvexjwt7oxsle

A Framework for Opportunistic Cluster Computing Using JavaSpaces [chapter]

Jyoti Batheja, Manish Parashar
2001 Lecture Notes in Computer Science  
In this thesis we present the design, implementation and evaluation of a framework that uses JavaSpaces to support this type of opportunistic (adaptive) parallel/distributed computing in a non-intrusive  ...  The framework targets applications exhibiting coarse grained parallelism and has three key features: 1) portability across heterogeneous platforms, 2) reduced overheads for configuration of participating  ...  However there is an inter-iteration dependency in the DO-WHILE loop for Page Rank calculation. Hence we do not expect to see a speed up with parallel processing.  ... 
doi:10.1007/3-540-48228-8_74 fatcat:72oua4fshnealddyxfhcqfwtxm

Comparison of VM deployment methods for HPC education

Nicholas Aaron Robison, Thomas J. Hacker
2012 Proceedings of the 1st Annual conference on Research in information technology - RIIT '12  
Coupled with the growth in virtualization is the need for reliable, high performance storage subsystems optimized for the specific performance needs of the installation.  ...  This paper describes our experiences with using virtualization for virtual high performance computing clusters for education, and compares the performance of the popular OpenNebula virtualization manager  ...  Job Scheduler: While manually submitting MPI jobs is an option for more experienced users, it doesn't offer the robustness or automation of a more full-featured job scheduler; thus Torque 3.0.3 [1] was compiled  ... 
doi:10.1145/2380790.2380801 dblp:conf/riit/RobisonH12 fatcat:jlgxjle5rjh7ngd7gxhgja7h54

Computer-Integrated Assembly for Cost Effective Developments [chapter]

Rinaldo Michelini, Gabriella Acaccia, Massimo Callegari, Rezia Molfino, Roberto Razzoli
2000 Computer-Aided Design, Engineering, and Manufacturing  
(also in 3D), without any extra work; · the user is helped with analysis through reports and graphs; · built-in statistical features help the user with pre- and post-simulation analyses; · the package  ...  of the assembly sequence, with ordering index based on job difficulty; · updating of the schedules with change of the priorities setting; · and the like.  ... 
doi:10.1201/9781420049947.ch2 fatcat:g3r4oc4qqzfbbkullvpimgsvme

Multiprocessor Real-Time Locking Protocols: A Systematic Review [article]

Björn B. Brandenburg
2019 arXiv   pre-print
critical sections, and implementation and system-integration aspects.  ...  We systematically survey the literature on analytically sound multiprocessor real-time locking protocols from 1988 until 2018, covering the following topics: progress mechanisms that prevent the lock-holder  ...  locking protocols and their integration with programming languages.  ... 
arXiv:1909.09600v1 fatcat:tmqcpiuxfbbd5jrcecgvoeanpm

A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks [article]

Sanaa Hamid Mohamed, Taisir E.H. El-Gorashi, Jaafar M.H. Elmirghani
2019 arXiv   pre-print
The MapReduce programming model and its widely used open-source platform, Hadoop, are enabling the development of a large number of cloud-based services and big data applications.  ...  This survey article reviews the challenges associated with deploying and optimizing big data applications and machine learning algorithms in cloud data centers and networks.  ...  This work was supported by the Engineering and Physical Sciences Research Council, INTERNET (EP/H040536/1), STAR (EP/K016873/1) and TOWS (EP/S016570/1) projects.  ... 
arXiv:1910.00731v1 fatcat:kvi3br4iwzg3bi7fifpgyly7m4