Analyzing the instructions vulnerability of dense convolutional network on GPUS

Khalid Adam, Izzeldin I. Mohd, Younis Ibrahim
2021 International Journal of Power Electronics and Drive Systems (IJPEDS)  
Then, we analyze vulnerable instructions of the DenseNet201 on the GPU.  ...  Due to the high computations of the DNN models, DNNs are often executed on the graphics processing units (GPUs).  ...  ACKNOWLEDGEMENTS This research was supported by Universiti Malaysia Pahang, through the UMP internal grant (RDU1903149).  ... 
doi:10.11591/ijece.v11i5.pp4481-4488 fatcat:sjvfjemysrf2nmfodixdwwvz4i

On the Efficiency of Supernodal Factorization in Interior-Point Method using CPU-GPU Collaboration

Usman Ali Shah, Suhail Yousaf, Iftikhar Ahmad, Muhammad Ovais Ahmad
2020 IEEE Access  
Factorization method used in the state-of-the-art solver performs only selected operations related to large supernodes on GPU.  ...  These shortcomings encouraged us to adapt another factorization method, which processes sets of related supernodes on GPU, and introduce it to the PDIPM implementation of a popular open-source solver.  ...  a CPU-based LAPACK implementation available on the system.  ... 
doi:10.1109/access.2020.3006353 fatcat:aj3k6kuvorhaboeni4dypfeeqq

Application-Based Fault Tolerance Techniques for Fully Protecting Sparse Matrix Solvers

Grzegorz Pawelczak, Simon McIntosh-Smith, James Price, Matt Martineau
2017 2017 IEEE International Conference on Cluster Computing (CLUSTER)  
We also extend thanks to the Intel Parallel Computing Centre at the University of Bristol, for providing access to the Zoo testbed, and to GW4 for providing access to their Tier 2 Isambard supercomputer  ...  ACKNOWLEDGMENTS The authors would like to thank EPSRC for funding this research.  ...  The hardware ECC on this GPU incurs a measured overhead of 8.1% for TeaLeaf, due to the fact that TeaLeaf is a memory bandwidth bound application and this ECC method requires some of the bandwidth for  ... 
doi:10.1109/cluster.2017.49 dblp:conf/cluster/PawelczakMPM17 fatcat:sl67izpvmffipczzwpcppe5vl4

HPC Systems in the Next Decade – What to Expect, When, Where

Dirk Pleiter, C. Doglioni, D. Kim, G.A. Stewart, L. Silvestris, P. Jackson, W. Kamleh
2020 EPJ Web of Conferences  
HPC systems have seen impressive growth in terms of performance over a period of many years. Soon the next milestone is expected to be reached with the deployment of exascale systems in 2021.  ...  The analysis of upcoming architectural options and emerging technologies allow for setting expectations for application developers, which will have to cope with heterogeneous architectures, increasingly  ...  The design of future exascale systems depends strongly on long-term trends of key technologies.  ... 
doi:10.1051/epjconf/202024511004 fatcat:qk4yzblkevb2pkwf3qxderxd2a

Preparing sparse solvers for exascale computing

Hartwig Anzt, Erik Boman, Rob Falgout, Pieter Ghysels, Michael Heroux, Xiaoye Li, Lois Curfman McInnes, Richard Tran Mills, Sivasankaran Rajamanickam, Karl Rupp, Barry Smith, Ichitaro Yamazaki (+1 others)
2020 Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences  
We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential.  ...  This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms.  ...  Another important design trend is the availability of memory systems that support increasing concurrency to assist the programmer in reading and writing data.  ... 
doi:10.1098/rsta.2019.0053 pmid:31955673 fatcat:bqw6xqixbrabddmxglmtcbw2wa

Trends in energy-efficient computing: A perspective from the Green500

Balaji Subramaniam, Winston Saunders, Tom Scogland, Wu-chun Feng
2013 2013 International Green Computing Conference Proceedings  
Specifically, we first provide an analysis of energy efficiency trends in HPC systems from the Green500. We then model and forecast the energy efficiency of future HPC systems.  ...  For instance, DARPA's target of a 20-MW exaflop system will require a 56.8-fold performance improvement with only a 2.4-fold increase in power consumption, which seems unachievable in light of the above  ...  We also thank Jason Lockhart, Mark Gardner, Heshan Lin, the Green500 community, and the participants at the Green500 birds-of-a-feather sessions for their continued support and feedback.  ... 
doi:10.1109/igcc.2013.6604520 dblp:conf/green/SubramaniamSSF13 fatcat:pvmes4v2zrgpbcc27oilfq2i2e

Stochastic Gradient Descent on Highly-Parallel Architectures [article]

Yujing Ma, Florin Rusu, Martin Torres
2018 arXiv   pre-print
The choice between synchronous GPU and asynchronous CPU depends on the task and the characteristics of the data.  ...  on GPU.  ...  Department of Energy Early Career Award (DOE Career).  ... 
arXiv:1802.08800v1 fatcat:hn6jbdcwfngw3b4l2zn7wug6ta

Mixed-Precision Tomographic Reconstructor Computations on Hardware Accelerators

Nicolas Doucet, Hatem Ltaief, Damien Gratadour, David Keyes
2019 2019 IEEE/ACM 9th Workshop on Irregular Applications: Architectures and Algorithms (IA3)  
The computation of tomographic reconstructors (ToR) is at the core of a simulation framework to design the next generation of adaptive optics (AO) systems to be installed on future Extremely Large Telescopes  ...  Experimental results demonstrate the accuracy robustness and the high performance of the mixed-precision ToR on synthetic datasets, which paves the way for future instrument deployments on the E-ELT.  ...  In this paper, we dive deeply into the design of a mixed-precision algorithm for solving a dense linear system of equations using task-based programming model, without the need of iterative refinement.  ... 
doi:10.1109/ia349570.2019.00011 dblp:conf/sc/DoucetLGK19 fatcat:mh3kpd4divh3tpzzhh5hhphrgm

PHINEAS: An Embedded Heterogeneous Parallel Platform [chapter]

Nikhil Khatri, Nithin Bodanapu, T. S. B. Sudarshan
2019 Lecture Notes in Computer Science  
the on-board GPU to perform general purpose compute tasks.  ...  The results show that the cluster meets the stringent latency requirements of embedded systems.  ...  The authors would like to thank Dr. Kiran D C of Presidency University-Bangalore. His original work towards an embeddable cluster provided the basis for this work.  ... 
doi:10.1007/978-3-030-18645-6_4 fatcat:hpid3pse2veaxbrr2rez7dxk4q

Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU

Ichitaro Yamazaki, Stanimire Tomov, Jack Dongarra
2016 ACM Transactions on Mathematical Software  
Our focus is on the dense triangular solve, which performs half of the total floating-point operations of SVQR.  ...  In this article, we study the stability and performance of various SVQR implementations on multicore CPUs with a GPU.  ...  TRIANGULAR SOLVE ON A GPU To solve the triangular system with multiple right-hand sides on a GPU, we first developed GPU kernels, each of which is specialized for a specific size of the upper-triangular  ... 
doi:10.1145/2898347 fatcat:ewuk7ku3mvc33h47wqba6dew3m

Evaluating the Error Resilience of Parallel Programs

Bo Fang, Karthik Pattabiraman, Matei Ripeanu, Sudhanva Gurumurthi
2014 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks  
Evaluating the error resilience of HPC applications is an essential step for building efficient fault-tolerant mechanisms for these applications.  ...  We find that the error resilience of OpenMP applications depends on the program structure and thread model; hence, these need to be taken into account while characterizing error resilience.  ...  We thank the anonymous reviewers of FTXS2014 for their feedback to improve the paper.  ... 
doi:10.1109/dsn.2014.73 dblp:conf/dsn/FangPRG14 fatcat:emiejqiiknbgfd5vfemxxlh4ve

Big Computing: Where are we heading?

Sabuzima Nayak, Ripon Patgiri, Thoudam Singh
2018 EAI Endorsed Transactions on Scalable Information Systems  
The paper also highlights the future direction of Big computing systems for Bioinformatics, social media, hardware and software requirements, data intensive computation and then towards GPU era.  ...  This paper presents the overview of the current trends of Big data against the computing scenario from different aspects.  ...  The storage system has to flush the memory for external storage. And, rolling back of memory for checkpoint on occasion of failure should be rare [38] .  ... 
doi:10.4108/eai.13-7-2018.163972 fatcat:h7dtabdx6fdolbfez3h57s463i

Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures

Sandra Catalán, José R. Herrero, Enrique S. Quintana-Ortí, Rafael Rodríguez-Sánchez
2018 Parallel Computing  
implementation of the LU factorization with partial pivoting built on top of gemm.  ...  Furthermore, we tailor the study for a number of representative 32-bit and 64-bit multicore processors from ARM that were specifically designed for energy efficiency.  ...  Igual, from Universidad Complutense de Madrid, for his help with the configuration and experimentation with the Juno (r0) system.  ... 
doi:10.1016/j.parco.2017.05.004 fatcat:snhctnw47bffbbp5xzvkf2i3ei

FTI: High Performance Fault Tolerance Interface for Hybrid Systems

Leonardo Bautista-Gomez, Seiji Tsuboi, Dimitri Komatitsch, Franck Cappello, Naoya Maruyama, Satoshi Matsuoka
2011 Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11  
To demonstrate the performance of FTI, we present a case study of the Mw9.0 Tohoku Japan earthquake simulation with SPECFEM3D on TSUBAME2.0.  ...  Large scientific applications deployed on current petascale systems expend a significant amount of their execution time dumping checkpoint files to remote storage.  ...  Hybrid systems have become an important trend in the HPC community [32] during the last few years.  ... 
doi:10.1145/2063384.2063427 dblp:conf/sc/Bautista-GomezTKCMM11 fatcat:g6dkxun7gvdr5lwezlj6xfg47a

High Performance Computing using Containers in Cloud

Manish Kumar Abhishek
2020 International Journal of Advanced Trends in Computer Science and Engineering  
Where containers usage based on operating system virtualization widespread now a days is handling the performance impact via facilitating one of the lightweight layers of virtualization in terms of computing  ...  Virtualization is playing as a core component in cloud computing but performance overhead impact in its one of layer is forcing the researchers to think its usage in research areas with cloud for high  ...  ACKNOWLEDGEMENT A special vote of thanks to the Koneru Lakshmaiah Education Foundation for helping and facilitating me required infrastructure and my guide as well as staff members who have helped me to  ... 
doi:10.30534/ijatcse/2020/220942020 fatcat:ofgebygj65fvreacenhrzfplmm