Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems
[article]
2022
arXiv
pre-print
To address these challenges, we will introduce a series of effective design methods in this book chapter to enable efficient algorithms, compilers, and various optimizations for embedded systems. ...
However, these emerging AI applications also come with increasing computation and memory demands, which are challenging to handle especially for the embedded systems where limited computation/memory resources ...
These accelerators attempt to take advantage of customized or specialized hardware and software designs, such as adopting acceleration libraries on CPUs [26], exploring kernel optimization on GPUs [27 ...
arXiv:2206.03326v1
fatcat:th66tbqxibez7hmctl2ytdiroa
Editorial Performance Modelling, Benchmarking and Simulation of High-Performance Computing Systems
2011
Computer journal
Papers were sought which reported the ability to measure and make tradeoffs in hardware/software co-design to improve sustained application performance. ...
; performance concerns in software/hardware co-design; tuning and auto-tuning of HPC applications and algorithms; benchmark suites; performance visualization; and real-world case studies ...
I am grateful to Jutta Mackwell (at the Computer Journal Editorial Office) and Erol Gelenbe (Editor-in-Chief) for assisting with the production of this issue of the Computer Journal. ...
doi:10.1093/comjnl/bxr113
fatcat:z2tb7cuqnvci5ffs6mouc3fddy
DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing
2012
IEEE Micro
Specialization is a promising direction for improving processor energy efficiency. With functionality specialization, hardware is designed for application-specific units of computation. ...
With parallelism specialization, hardware is designed to exploit abundant data-level parallelism. ...
[Table fragment] Insufficient; UNR, STR, VEC-Hybrid; VEC-Inter and VEC-Intra on different arrays. Poor warp occupancy for GPU. FFT (Fast Fourier Transform): Regular memory access. Heavy use of Sin/Cos. ...
doi:10.1109/mm.2012.51
fatcat:vhuwzkylqzh7bhwyecd7k2bree
Hardware-Software Co-Design: Not Just a Cliché
2015
Summit on Advances in Programming Languages
We reflect on the challenges and successes of approximation research and, with these lessons in mind, distill opportunities for future hardware-software co-design efforts. ...
It is time to embrace hardware-software co-design in earnest, to cooperate between programming languages and architecture to upend legacy constraints on computing. ...
Future explorations of hardware-software co-design would benefit from architectural support for separating control flow from data flow. ...
doi:10.4230/lipics.snapl.2015.262
dblp:conf/snapl/SampsonBC15
fatcat:w53z5tuoujcx5eqfyn4c5s5eau
An Empirical-cum-Statistical Approach to Power-Performance Characterization of Concurrent GPU Kernels
[article]
2020
arXiv
pre-print
Growing deployment of power and energy efficient throughput accelerators (GPU) in data centers demands enhancement of power-performance co-optimization capabilities of GPUs. ...
Realization of exascale computing using accelerators requires further improvements in power efficiency. ...
Related Work: We distinguish our work from three major aspects: [53] has explored asymmetric multi processor based software/hardware co-design for big and small cores and the system software, and in ...
arXiv:2011.02368v2
fatcat:xgce6gvcjjcilfwem452yd3hsi
2020-2021 Index IEEE Transactions on Computers Vol. 70
2021
IEEE transactions on computers
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, TC June 2021 922-935 Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design. ...
., +, TC April 2021 524-538 Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design. ...
doi:10.1109/tc.2021.3134810
fatcat:p5otlsapynbwvjmqogj47kv5qa
Comparing performance and energy efficiency of FPGAs and GPUs for high productivity computing
2010
2010 International Conference on Field-Programmable Technology
To compare the GPU and FPGA approaches, we select a set of established benchmarks with different memory access characteristics, and compare their performance and energy efficiency on an FPGA-based Hybrid-Core ...
This paper provides the first comparison of performance and energy efficiency of high productivity computing systems based on FPGA (Field-Programmable Gate Array) and GPU (Graphics Processing Unit) technologies ...
However, programming these co-processing architectures, in particular FPGAs, requires software developers to learn a whole new set of skills and hardware design concepts, and accelerated application development ...
doi:10.1109/fpt.2010.5681761
dblp:conf/fpt/BetkaouiTL10
fatcat:nanujaq7gbhljc7x67muq3ygjy
Analytical Cost Metrics : Days of Future Past
[article]
2018
arXiv
pre-print
(ii) Complete System Design - Simultaneously optimize all the cost models for the programs (computational problems) to obtain the most time/area/power/energy efficient solution. ...
With Moore's law driving the evolution of hardware platforms towards exascale, the dominant performance metric (time efficiency) has now expanded to also incorporate power/energy efficiency. ...
The main focus is on the methodology; specifically, to develop a software-hardware codesign framework and to illustrate how models built using it can be used for efficient exploration of the design space ...
arXiv:1802.01957v1
fatcat:r6lajnt75zb4xahkznt5gb4wx4
Software-defined Radios: Architecture, State-of-the-art, and Challenges
[article]
2018
arXiv
pre-print
Software-defined Radio (SDR) is a programmable transceiver with the capability of operating various wireless communication protocols without the need to change or update the hardware. ...
Progress in the SDR field has led to the escalation of protocol development and a wide spectrum of applications, with more emphasis on programmability, flexibility, portability, and energy efficiency, ...
Hybrid Design (a.k.a. co-design): The fourth approach towards realizing SDRs is the hybrid approach, where both hardware and software-based techniques are combined into one platform. ...
arXiv:1804.06564v1
fatcat:ogkut4aibnfarbrvjkihdfiqnu
FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels
2012
2012 15th Euromicro Conference on Digital System Design
FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. ...
On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphical Processing Unit (GPU) programming. ...
Some tools, such as Synopsys Platform Architect, or CoFluent Studio, provide some level of hardware/software interfacing and co-simulation. ...
doi:10.1109/dsd.2012.58
dblp:conf/dsd/MavroidisMPLLTS12
fatcat:lemadgy5jff4hnhk27hdj5lgtq
Accelerating Board Games Through Hardware/Software Codesign
2017
IEEE Transactions on Computational Intelligence and AI in Games
The results demonstrate that the use of hardware/software co-design to develop board games allows sustaining or even improving the user experience across platforms while keeping power and energy low. ...
The designs analyzed include hardware accelerators for board processing which improve performance and energy efficiency by an order of magnitude leading to much stronger and battery-aware applications. ...
Hardware/Software Co-Design: Hardware/software co-design allows designers to partition an application into hardware and software blocks that interact with each other. ...
doi:10.1109/tciaig.2016.2604923
fatcat:y4p3onodnbcn5l6yulazojjmsi
Energy Efficiency for Ultrascale Systems: Challenges and Trends from Nesus Project
2015
Supercomputing Frontiers and Innovations
Energy consumption is one of the main limiting factors for designing and deploying ultrascale systems. ...
simulation of ultrascale systems, energy-aware scheduling and resource management, and energy-efficient application design. ...
For example this approach should study software-hardware co-design or dependencies between IT systems components and infrastructure (including thermal management, cooling, and appropriate metrics for assessment ...
doi:10.14529/jsfi150206
fatcat:uaq7p3nalvb7jb6xnholszwpla
Reconfigurable computing for future vision-capable devices
2015
2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)
However, this type of application still poses significant challenges in terms of latency, throughput and energy efficiency. ...
architectures: A low-power EnCore processor with a Configurable Flow Accelerator co-processor, a hybrid reconfigurable SIMD/MIMD platform and Transport-Triggered Architecture-based processors. ...
Thesis "Designing for energy efficient vision-based interactivity on mobile devices", by Miguel Bordallo López make them unusable for extended periods of time. ...
doi:10.1109/samos.2015.7363657
dblp:conf/samos/LopezNSBV15
fatcat:ljzx4e4iv5a6jidj2abdwcefo4
Accelerating Deep Neural Networks implementation: A survey
2021
IET Computers & Digital Techniques
However, it is necessary to guarantee the best performance when designing hardware accelerators for DL applications to run at full speed, despite the constraints of low power, high accuracy and throughput ...
Given that the number of operations and parameters increases with the complexity of the model architecture, the performance will strongly depend on the hardware target resources and basically the memory ...
Zhang et al. designed and implemented Caffeine [99], a HW/SW co-designed library which decreased underutilised memory bandwidth. ...
doi:10.1049/cdt2.12016
fatcat:3kl4j5ztl5eahmgv7vetu2egay
GPU-Accelerated Database Systems: Survey and Open Challenges
[chapter]
2014
Lecture Notes in Computer Science
Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago. ...
In the past years, there were many approaches to make use of GPUs at different levels of a database system. In this paper, we explore the design space of GPU-accelerated database management systems. ...
We propose a reference architecture for GDBMSs. This architecture provides insights on how to integrate GPU acceleration in main-memory DBMSs. 5. ...
doi:10.1007/978-3-662-45761-0_1
fatcat:rpwqxejbkjh6dhzp27ppiawzsu