GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components [chapter]

Anton Wijs, Joost-Pieter Katoen, Dragan Bošnački
2014 Lecture Notes in Computer Science  
This paper presents parallel algorithms for component decomposition of graph structures on General Purpose Graphics Processing Units (GPUs).  ...  We explain the main rationales behind our GPU-algorithms, and show a significant speed-up over the sequential counterparts in several case studies.  ...  MEC Decomposition on the GPU Our GPU implementation for MEC decomposition is based on the basic algorithm presented in Section 2. For step 1, we use our GPU SCC decomposition.  ... 
doi:10.1007/978-3-319-08867-9_20 fatcat:phxsozhcq5ea7gtaczbfp4wqcq

Efficient GPU algorithms for parallel decomposition of graphs into strongly connected and maximal end components

Anton Wijs, Joost-Pieter Katoen, Dragan Bošnački
2016 Formal methods in system design  
This article presents parallel algorithms for component decomposition of graph structures on general purpose graphics processing units (GPUs).  ...  We explain the main rationales behind our GPU-algorithms, and show a significant speed-up over the sequential (as well as existing parallel) counterparts in several case studies.  ...  Conclusions We presented GPU algorithms for finding SCCs and MECs in sparse graphs that are based on FBT and a bounding version of it.  ... 
doi:10.1007/s10703-016-0246-7 fatcat:5v5nbayonnabrmen2x6lrdipti

GPU-based Commonsense Paradigms Reasoning for Real-Time Query Answering and Multimodal Analysis [article]

Nguyen Ha Tran, Erik Cambria
2018 arXiv   pre-print
In addition, in order to extract important textual features from multimodal sources we generate domain-specific graphs based on commonsense knowledge and apply GPU-based graph traversal for fast feature  ...  In particular, we focus on the problem of multimodal sentiment analysis, which consists in the simultaneous analysis of different modalities, e.g., speech and video, for emotion and polarity detection  ...  They choose CPU-based SCC approaches such as Forward-Backward [83], Coloring Head-off [84] and Recursive OBF [85] and modify them in order to support the GPU execution.  ... 
arXiv:1807.08804v1 fatcat:xu5m2oh55ndahct7rtrbecjz3m

Parallel Logic Programming: A Sequel [article]

Agostino Dovier, Andrea Formisano, Gopal Gupta, Manuel V. Hermenegildo, Enrico Pontelli, Ricardo Rocha
2022 arXiv   pre-print
Multi-core and highly-connected architectures have become ubiquitous, and this has brought renewed interest in language-based approaches to the exploitation of parallelism.  ...  This has been paralleled by significant advances within logic programming, such as tabling, more powerful static analysis and verification, the rapid growth of Answer Set Programming, and in general, more  ...  GPU-Based Parallelism GPUs are designed to execute a very large number of concurrent threads on multiple data.  ... 
arXiv:2111.11218v2 fatcat:hek4fidju5fblprut2squ6o3rm

Graph Reachability on Parallel Many-Core Architectures

Stefano Quer, Andrea Calabrese
2020 Computation  
General-purpose computing has been successfully used on Graphics Processing Units (GPUs) to parallelize algorithms that present a high degree of regularity.  ...  To prove the validity of our approach, we compare (in terms of time and memory requirements) our GPU-based approach with the original sequential CPU-based tool.  ...  Acknowledgments: The author wish to thank Antonio Caiazza for implementing the first version of the tool and performing the initial experimental evaluation.  ... 
doi:10.3390/computation8040103 fatcat:dnqybvtlsvh5pc2pe5v7fjvmna

GPU power modeling and architectural enhancements for GPU energy efficiency [article]

Jan Lucas, Technische Universität Berlin, Ben Juurlink
Initially designed for 3D graphics, they evolved into general purpose accelerators, able to outperform CPUs on many tasks. The architecture of GPUs is optimized for massively parallel applications.  ...  GPU and DRAM by up to 6%.  ...  Many of the used GPU benchmarks also performed correctness checks on the output and did not detect errors.  ... 
doi:10.14279/depositonce-7874 fatcat:wbmij23r2ngtfaskosnrsxt5gu

Real-time registration and simulation in medical imaging [article]

Ramtin Shams, The Australian National University
To my beautiful wife for the meaning she brings into my life and to my loving parents for the passion they bestowed on me to embark on the path to learning. Declaration The contents of this thesis  ...  are the results of original research and have not been submitted for a higher degree to any other university or institution.  ...  The surgeon may take a number of intraoperative scans to correct the plan based on patient's current state and also to detect complications such as bleeding.  ... 
doi:10.25911/5d4eb23317f42 fatcat:bmcadet2xvfc3bpekye4a3nnbe

Dynamic task scheduling and binding for many-core systems through stream rewriting

Lars Middendorf
In order to estimate the performance and scalability of stream rewriting, a large number of experiments have been evaluated on many-core systems and the task management has been implemented in software  ...  and hardware.  ...  Based on the relaxed pattern matching, the stream can be partitioned into arbitrary segments for parallel rewriting on multiple cores.  ... 
doi:10.18453/rosdok_id00001530 fatcat:ikxsq7kcuvb7phvraau7c743zm

Heiko Falk OASIcs-OpenAccess Series in Informatics

Heiko Falk, Daniel Cremers, Barbara Hammer, Marc Langheinrich, Dorothea Wagner
2014 14th International Workshop on Worst-Case Execution Time Analysis WCET 2014   unpublished
This work was partially supported by COST Action IC1202: Timing Analysis On Code-Level (TACLe), and by the Swedish Research Council project Contesse (2010-4276). Acknowledgement.  ...  The work of Damien Jacquemart is financially supported by DGA (Direction Générale de l'Armement) and Onera. Acknowledgment.  ...  Some of the works focus on how to assign colors (i.e. set partitions) to the tasks.  ... 

Many-Core Architectures: Hardware-Software Optimization and Modeling Techniques

Christian Pinto, Luca Benini
This thesis is focused on virtualization techniques with the goal to mitigate, and overtake when possible, some of the challenges introduced by the many-core design paradigm.  ...  As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded and hardware designers are in fact facing the problem  ...  The Fermi GPU Architecture and CUDA The Fermi-based GPU used in this work is a Nvidia GeForce GTX 480, a two-level shared memory parallel machine comprising 480 SPs organized in 16 SMs. The device works  ... 
doi:10.6092/unibo/amsdottorato/6824 fatcat:3mkrxied45h3bd55s626axmvj4

Scalable System-on-Chip Design

Paolo Mantovani
Multi-core and many-core architectures sought more energy-efficient computation by replacing a power-hungry processor with multiple simpler cores exploiting parallelism.  ...  On the other hand, increasing the number of homogeneous cores incurs more and more diminishing returns.  ...  Such parallelism led GPUs to evolve as massively parallel architectures, integrating on a single chip a growing number of small cores.  ... 
doi:10.7916/d8ws95mk fatcat:egzvpahukvc43ccz7ge66afyze

Dagstuhl Reports, Volume 11, Issue 7, August 2021, Complete Issue [article]

The overall system is often determined by an interplay of many model aspects (topology, temporal ordering, type of dynamics) and we need to detect which of these interactions aspects are qualitatively  ...  In my talk, I outline two ways of extending higher-order model research, motivated by my previous work on the interplay of dynamics and multi-body topology [1, 2].  ...  The SCC mechanism, which assigns the SCC partition to each hedonic coalition formation problem with friend-oriented preferences, is group strategy-proof.  ... 
doi:10.4230/dagrep.11.7 fatcat:4b73kynisffo7payzzghe5rfiq