Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems [article]

Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen
2022 arXiv pre-print
To address these challenges, we will introduce a series of effective design methods in this book chapter to enable efficient algorithms, compilers, and various optimizations for embedded systems.  ...  However, these emerging AI applications also come with increasing computation and memory demands, which are challenging to handle especially for the embedded systems where limited computation/memory resources  ...  These accelerators attempt to take advantage of customized or specialized hardware and software designs, such as adopting acceleration libraries on CPUs [26], exploring kernel optimization on GPUs [27  ... 
arXiv:2206.03326v1 fatcat:th66tbqxibez7hmctl2ytdiroa

Editorial Performance Modelling, Benchmarking and Simulation of High-Performance Computing Systems

S. A. Jarvis
2011 Computer journal  
Papers were sought which reported the ability to measure and make tradeoffs in hardware/software co-design to improve sustained application performance.  ...  ; performance concerns in software/hardware co-design; tuning and auto-tuning of HPC applications and algorithms; benchmark suites; performance visualization; and real-world case studies  ...  I am grateful to Jutta Mackwell (at the Computer Journal Editorial Office) and Erol Gelenbe (Editor-in-Chief) for assisting with the production of this issue of the Computer Journal.  ... 
doi:10.1093/comjnl/bxr113 fatcat:z2tb7cuqnvci5ffs6mouc3fddy

DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing

Venkatraman Govindaraju, Chen-Han Ho, Tony Nowatzki, Jatin Chhugani, Nadathur Satish, Karthikeyan Sankaralingam, Changkyu Kim
2012 IEEE Micro  
Specialization is a promising direction for improving processor energy efficiency. With functionality specialization, hardware is designed for application-specific units of computation.  ...  With parallelism specialization, hardware is designed to exploit abundant data-level parallelism.  ...  Insufficient UNR,STR, VEC-Hybrid VEC-Inter and VEC-Intra on different arrays. Poor warp occupancy for GPU. FFT Fast Fourier Transform Regular Memory Access. Heavy use of Sin/Cos.  ... 
doi:10.1109/mm.2012.51 fatcat:vhuwzkylqzh7bhwyecd7k2bree

Hardware-Software Co-Design: Not Just a Cliché

Adrian Sampson, James Bornholt, Luis Ceze, Marc Herbstritt
2015 Summit on Advances in Programming Languages  
We reflect on the challenges and successes of approximation research and, with these lessons in mind, distill opportunities for future hardware-software co-design efforts.  ...  It is time to embrace hardware-software co-design in earnest, to cooperate between programming languages and architecture to upend legacy constraints on computing.  ...  Future explorations of hardware-software co-design would benefit from architectural support for separating control flow from data flow.  ... 
doi:10.4230/lipics.snapl.2015.262 dblp:conf/snapl/SampsonBC15 fatcat:w53z5tuoujcx5eqfyn4c5s5eau

An Empirical-cum-Statistical Approach to Power-Performance Characterization of Concurrent GPU Kernels [article]

Nilanjan Goswami, Amer Qouneh, Chao Li, Tao Li
2020 arXiv pre-print
Growing deployment of power and energy efficient throughput accelerators (GPU) in data centers demands enhancement of power-performance co-optimization capabilities of GPUs.  ...  Realization of exascale computing using accelerators requires further improvements in power efficiency.  ...  RELATED WORK We distinguish our work from three major aspects: [53] has explored asymmetric multi processor based software/hardware co-design for big and small cores and the system software, and in  ... 
arXiv:2011.02368v2 fatcat:xgce6gvcjjcilfwem452yd3hsi

2020-2021 Index IEEE Transactions on Computers Vol. 70

2021 IEEE transactions on computers  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., +, TC June 2021 922-935 Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design.  ...  ., +, TC April 2021 524-538 Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design.  ... 
doi:10.1109/tc.2021.3134810 fatcat:p5otlsapynbwvjmqogj47kv5qa

Comparing performance and energy efficiency of FPGAs and GPUs for high productivity computing

Brahim Betkaoui, David B. Thomas, Wayne Luk
2010 2010 International Conference on Field-Programmable Technology  
To compare the GPU and FPGA approaches, we select a set of established benchmarks with different memory access characteristics, and compare their performance and energy efficiency on an FPGA-based Hybrid-Core  ...  This paper provides the first comparison of performance and energy efficiency of high productivity computing systems based on FPGA (Field-Programmable Gate Array) and GPU (Graphics Processing Unit) technologies  ...  However, programming these co-processing architectures, in particular FPGAs, requires software developers to learn a whole new set of skills and hardware design concepts, and accelerated application development  ... 
doi:10.1109/fpt.2010.5681761 dblp:conf/fpt/BetkaouiTL10 fatcat:nanujaq7gbhljc7x67muq3ygjy

Analytical Cost Metrics : Days of Future Past [article]

Nirmal Prajapati, Sanjay Rajopadhye, Hristo Djidjev
2018 arXiv pre-print
(ii) Complete System Design - Simultaneously optimize all the cost models for the programs (computational problems) to obtain the most time/area/power/energy efficient solution.  ...  With Moore's law driving the evolution of hardware platforms towards exascale, the dominant performance metric (time efficiency) has now expanded to also incorporate power/energy efficiency.  ...  The main focus is on the methodology; specifically, to develop a software-hardware codesign framework and to illustrate how models built using it can be used for efficient exploration of the design space  ... 
arXiv:1802.01957v1 fatcat:r6lajnt75zb4xahkznt5gb4wx4

Software-defined Radios: Architecture, State-of-the-art, and Challenges [article]

Rami Akeela, Behnam Dezfouli
2018 arXiv pre-print
Software-defined Radio (SDR) is a programmable transceiver with the capability of operating various wireless communication protocols without the need to change or update the hardware.  ...  Progress in the SDR field has led to the escalation of protocol development and a wide spectrum of applications, with more emphasis on programmability, flexibility, portability, and energy efficiency,  ...  Hybrid Design (a.k.a., co-design) The fourth approach towards realizing SDRs is the hybrid approach, where both hardware and software-based techniques are combined into one platform.  ... 
arXiv:1804.06564v1 fatcat:ogkut4aibnfarbrvjkihdfiqnu

FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels

I. Mavroidis, I. Mavroidis, I. Papaefstathiou, L. Lavagno, M. Lazarescu, E. de la Torre, F. Schafer
2012 2012 15th Euromicro Conference on Digital System Design  
FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow.  ...  On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphical Processing Unit (GPU) programming.  ...  Some tools, such as Synopsys Platform Architect, or CoFluent Studio, provide some level of hardware/software interfacing and co-simulation.  ... 
doi:10.1109/dsd.2012.58 dblp:conf/dsd/MavroidisMPLLTS12 fatcat:lemadgy5jff4hnhk27hdj5lgtq

Accelerating Board Games Through Hardware/Software Codesign

Javier Olivito, Javier Resano, Jose Luis Briz
2017 IEEE Transactions on Computational Intelligence and AI in Games  
The results demonstrate that the use of hardware/software co-design to develop board games allows sustaining or even improving the user experience across platforms while keeping power and energy low.  ...  The designs analyzed include hardware accelerators for board processing which improve performance and energy efficiency by an order of magnitude leading to much stronger and battery-aware applications.  ...  HARDWARE/SOFTWARE CO-DESIGN Hardware/software co-design allows designers to partition an application into hardware and software blocks that interact among them.  ... 
doi:10.1109/tciaig.2016.2604923 fatcat:y4p3onodnbcn5l6yulazojjmsi

Energy Efficiency for Ultrascale Systems: Challenges and Trends from Nesus Project

2015 Supercomputing Frontiers and Innovations  
Energy consumption is one of the main limiting factors for designing and deploying ultrascale systems.  ...  simulation of ultrascale systems, energy-aware scheduling and resource management, and energy-efficient application design.  ...  For example this approach should study software-hardware co-design or dependencies between IT systems components and infrastructure (including thermal management, cooling, and appropriate metrics for assessment  ... 
doi:10.14529/jsfi150206 fatcat:uaq7p3nalvb7jb6xnholszwpla

Reconfigurable computing for future vision-capable devices

Miguel Bordallo Lopez, Alejandro Nieto, Olli Silven, Jani Boutellier, David Lopez Vilarino
2015 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)  
However, this type of applications still pose significant challenges in terms of latency, throughput and energy-efficiency.  ...  architectures: A low-power EnCore processor with a Configurable Flow Accelerator co-processor, a hybrid reconfigurable SIMD/MIMD platform and Transport-Triggered Architecture-based processors.  ...  Thesis "Designing for energy efficient vision-based interactivity on mobile devices", by Miguel Bordallo López make them unusable for extended periods of time.  ... 
doi:10.1109/samos.2015.7363657 dblp:conf/samos/LopezNSBV15 fatcat:ljzx4e4iv5a6jidj2abdwcefo4

Accelerating Deep Neural Networks implementation: A survey

Meriam Dhouibi, Ahmed Karim Ben Salem, Afef Saidi, Slim Ben Saoud
2021 IET Computers & Digital Techniques  
However, it is necessary to guarantee the best performance when designing hardware accelerators for DL applications to run at full speed, despite the constraints of low power, high accuracy and throughput  ...  Given that the number of operations and parameters increases with the complexity of the model architecture, the performance will strongly depend on the hardware target resources and basically the memory  ...  Zhang et al. designed and implemented Caffeine [99], a HW/SW co-designed library which decreased underutilised memory bandwidth.  ... 
doi:10.1049/cdt2.12016 fatcat:3kl4j5ztl5eahmgv7vetu2egay

GPU-Accelerated Database Systems: Survey and Open Challenges [chapter]

Sebastian Breß, Max Heimel, Norbert Siegmund, Ladjel Bellatreche, Gunter Saake
2014 Lecture Notes in Computer Science  
Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago.  ...  In the past years, there were many approaches to make use of GPUs at different levels of a database system. In this paper, we explore the design space of GPU-accelerated database management systems.  ...  We propose a reference architecture for GDBMSs. This architecture provides insights on how to integrate GPU acceleration in main-memory DBMSs. 5.  ... 
doi:10.1007/978-3-662-45761-0_1 fatcat:rpwqxejbkjh6dhzp27ppiawzsu