Energy-Efficient Reconfigurable Computing Using a Circuit-Architecture-Software Co-Design Approach

Somnath Paul, Subho Chatterjee, Saibal Mukhopadhyay, Swarup Bhunia
2011 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
, leading to large improvement in energy-delay product (EDP).  ...  In this paper, we show that an integrated circuit-architecture-software co-design approach can be extremely effective to simultaneously improve the power and performance of a reconfigurable hardware framework  ...  SRAM based MBC framework and mapped a set of benchmark circuits (ISCAS and MCNC) using the proposed flow.  ... 
doi:10.1109/jetcas.2011.2165232 fatcat:tlnx3mng3remfghm4fgulfd27u

Circuits and architectures for field programmable gate array with configurable supply voltage

Y. Lin, Fei Li, Lei He
2005 IEEE Transactions on Very Large Scale Integration (vlsi) Systems  
Compared to the baseline architecture similar to the leading commercial architecture, our best architecture reduces the minimal energy-delay product by 54.39% with 17% more area and 3% more configuration  ...  In this paper, we first design novel Vdd-programmable and Vdd-gateable interconnect switches with minimal number of configuration SRAM cells.  ...  Our recent work [32] studied the device and FPGA architecture co-optimization for higher power reduction compared to this paper.  ... 
doi:10.1109/tvlsi.2005.857180 fatcat:zpjhtfstb5d7fjnoakanby7dwq

Computing with nanoscale memory: Model and architecture

Somnath Paul, Swarup Bhunia
2009 2009 IEEE/ACM International Symposium on Nanoscale Architectures  
On the other hand, dense and periodic structures of most emerging nanodevices as well as their bi-stable nature make them amenable to large high-density memory array design.  ...  In this paper, first we study nanoscale FPGA, which extends conventional spatial CMOS FPGA architecture using nanoscale memory and interconnect.  ...  improve the energy-delay product.  ... 
doi:10.1109/nanoarch.2009.5226362 dblp:conf/nanoarch/PaulB09 fatcat:kagx3f3j35fspfiamat7rqc6l4

Power modeling and architecture evaluation for FPGA with novel circuits for Vdd programmability

Yan Lin, Fei Li, Lei He
2005 Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays - FPGA '05  
Compared to the baseline architecture similar to the leading commercial architecture, the best architecture reduces the minimal energy-delay product by 44.14% with 48% area overhead and 3% SRAM cell increase  ...  Applying our power model to placed and routed benchmark circuits, we evaluate Vdd-programmable FPGA architecture using the new switches.  ...  Acknowledgement The authors like to thank Mr. Lerong Cheng and Ms. Ho-Yan Phoebe Wong at UCLA for generating circuit models for a variety of basic FPGA circuits and for helpful discussions.  ... 
doi:10.1145/1046192.1046218 dblp:conf/fpga/LinLH05 fatcat:hdfjicsvprhwto5bgwkyupsnse

Exploiting Challenges of Sub-20 nm CMOS for Affordable Technology Scaling [article]

Kaushik Vaidyanathan
2015 arXiv   pre-print
Just continuing to co-optimize leaf cell circuit and layout designs with process technology does not enable us to exploit the challenges of a sub-20 nm CMOS.  ...  To this end, we propose to broaden the scope of design technology co-optimization (DTCO) to be more holistic by including micro-architecture design and CAD, along with circuits, layout and process technology  ...  [45] have demonstrated the significant energy and area benefits of co-optimizing SRAM circuits with application design, by making application-specific customizations to SRAM blocks.  ... 
arXiv:1509.00885v1 fatcat:5mcetdrz2rbhbbtwwf5av36p2a

Architectural and Circuit Design Techniques for Power Management of Ultra-Low-Power MCU Systems

Michael Lueders, Bjoern Eversmann, Johannes Gerber, Korbinian Huber, Ruediger Kuhn, Michael Zwerg, Doris Schmitt-Landsiedel, Ralf Brederlow
2014 IEEE Transactions on Very Large Scale Integration (vlsi) Systems  
Field Programmable Gate Arrays (FPGAs) are widely used for implementation of digital system design due to their flexibility, low time-to-market, growing density and speed.  ...  Clock Gating reduces the power consumption by the factor 50% and also by using latest novel devices like Tunnel FET power can be reduced much lower than present.  ...  While using clock gating, on FPGAs in particular, the user should take care of the placement of gating logic to minimize delay in the clock network.  ... 
doi:10.1109/tvlsi.2013.2290083 fatcat:j2wsmzogunazpbwdx3wicplemu

Using multifunctional standardized stack as universal spintronic technology for IoT

M. Tahoori, S. M. Nair, R. Bishnoi, S. Senni, J. Mohdad, F. Mailly, L. Torres, P. Benoit, A. Gamatie, P. Nouet, F. Ouattara, G. Sassatelli (+6 others)
2018 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
For monolithic heterogeneous integration, fast yet low-power processing and storage, and high integration density, the objective of the ED GREAT project is to co-integrate multiple digital and analog functions  ...  together within CMOS by adapting the Magnetic Tunneling Junctions (MTJs) into a single baseline technology enabling logic, memory, and analog functions, particularly for Internet of Things (IoT) platforms  ...  Fig. 12 shows for each kernel its execution time, energy and Energy-Delay-Product (EDP), for the LITTLE-L2-STT-MRAM, big-L2-STT-MRAM and Full-L2-STT-MRAM scenarios compared to the Full-SRAM reference scenario  ... 
doi:10.23919/date.2018.8342143 dblp:conf/date/TahooriNBSMMTBG18 fatcat:gd5dr66q7jbglkuts7ohbdiam4

NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories [article]

Lillian Pentecost, Alexander Hankin, Marco Donato, Mark Hempstead, Gu-Yeon Wei, David Brooks
2022 arXiv   pre-print
a freely available ( set of tools for application experts, system designers, and device experts to better understand, compare, and quantify the next generation of  ...  Repeated off-chip memory accesses to DRAM drive up operating power for data-intensive applications, and SRAM technology scaling and leakage power limits the efficiency of embedded memories.  ...  This result reflects the effect of different array optimization targets (read energy-delay product, write characteristics, area) on the internal bank configuration and periphery overhead.  ... 
arXiv:2109.01188v2 fatcat:h7jx7rulzbgtjgb55zzp3cdi2a

Device-Circuit-Architecture Co-Exploration for Computing-in-Memory Neural Accelerators [article]

Weiwen Jiang, Qiuwen Lou, Zheyu Yan, Lei Yang, Jingtong Hu, Xiaobo Sharon Hu, Yiyu Shi
2020 arXiv   pre-print
To address these challenges, we propose a cross-layer exploration framework, namely NACIM, which jointly explores device, circuit and architecture design space and takes device variation into consideration  ...  Co-exploration of neural architectures and hardware design is promising to simultaneously optimize network accuracy and hardware efficiency.  ...  We employ the same circuit optimization procedure, and obtain the hardware efficiency metrics, including area and energy delay product (EDP), speed (TOPs), and energy efficiency (TOPs/W).  ... 
arXiv:1911.00139v2 fatcat:we43szndbzduzizoq75c64rhhq

Analog Architecture Complexity Theory Empowering Ultra-Low Power Configurable Analog and Mixed Mode SoC Systems

Jennifer Hasler
2019 Journal of Low Power Electronics and Applications  
This discussion develops a theoretical analog architecture framework similar to the well developed digital architecture theory.  ...  Designing analog systems, whether small or large scale, must optimize their architectures for energy consumption.  ...  Acknowledgments: The author appreciates the discussions with several people about this paper, particularly with Jeffery Young, Aishwarya Natarajan, as well as insightful comments by individuals after seminars  ... 
doi:10.3390/jlpea9010004 fatcat:cgycffc65bhmdaky2apedcjxma

A cross-layer approach to cognitive computing

Gobinda Saha, Cheng Wang, Anand Raghunathan, Kaushik Roy
2022 Proceedings of the 59th ACM/IEEE Design Automation Conference  
We argue that such cross-layer innovations in cognitive computing are well-poised to enable a new wave of autonomous intelligence across the computing spectrum, from resource-constrained IoT devices to  ...  In this article, we present a cross-layer approach to the exploration of new paradigms in cognitive computing.  ...  This framework can identify an optimal DNN architecture for CiM hardware and demonstrate sizeable improvement in energy-delay-area product while maintaining near-software level inference accuracy.  ... 
doi:10.1145/3489517.3530642 fatcat:iflcowivyvchriny7qcqukua7q

Pragmatic Integration of an SRAM Row Cache in Heterogeneous 3-D DRAM Architecture Using TSV

Dong Hyuk Woo, Nak Hee Seong, Hsien-Hsin S. Lee
2013 IEEE Transactions on Very Large Scale Integration (vlsi) Systems  
In particular, we found that, if we want to design an SRAM row cache in a DRAM chip, simple stacking alone cannot address the majority of traditional SRAM row cache design issues.  ...  In this paper, to address these issues, we propose a novel floorplan and several architectural techniques that fully exploit the benefits of 3-D stacking technology.  ...  Architecture-Level Simulation 1) Simulation Framework: In addition to such circuit-level modeling, we also performed architecture-level study using SESC [27].  ... 
doi:10.1109/tvlsi.2011.2176761 fatcat:tgdsehnjife6rawz36td74vquu

Prospects for Analog Circuits in Deep Networks [article]

Shih-Chii Liu, John Paul Strachan, Arindam Basu
2021 arXiv   pre-print
It then presents an outlook for the use of analog circuits in low-power deep network accelerators suitable for edge or tiny machine learning applications.  ...  Power in these designs is usually dominated by the memory access power of off-chip DRAM needed for storing the network weights and activations.  ...  The authors would like to thank Mr. Sumon K. Bose and Dr. Joydeep Basu for help with figures; and T. Delbruck for comments.  ... 
arXiv:2106.12444v1 fatcat:2peafaa3z5bytmblqm7pon7rfi

2018 Index IEEE Transactions on Very Large Scale Integration (VLSI) Systems Vol. 26

2018 IEEE Transactions on Very Large Scale Integration (vlsi) Systems  
., see 2723-2736 , VLSI Design of an ML-Based Power-Efficient Motion Estimation Controller for Intelligent Mobile Systems; TVLSI Feb. 2018 262-271 Hsieh, Y., see Tsai, Y., TVLSI May 2018 945-957  ...  ., +, TVLSI Oct. 2018 2099-2107 + Check author entry for co-authors Cost reduction Systematic Design of an Approximate Adder: The Optimized Lower Part Constant-OR Adder.  ...  SRAM Circuits for True Random Number Generation Using Intrinsic Bit Instability.  ... 
doi:10.1109/tvlsi.2019.2892312 fatcat:rxiz5duc6jhdzjo4ybcxdajtbq

Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators

Brandon Reagen, Paul Whatmough, Robert Adolf, Saketh Rama, Hyunkwang Lee, Sae Kyu Lee, José Miguel Hernández-Lobato, Gu-Yeon Wei, David Brooks
2016 SIGARCH Computer Architecture News  
This paper presents Minerva, a highly automated co-design approach across the algorithm, architecture, and circuit levels to optimize DNN hardware accelerators.  ...  The continued success of Deep Neural Networks (DNNs) in classification tasks has sparked a trend of accelerating their execution with specialized hardware.  ...  A single parity bit enables fault detection simply through inspecting the read SRAM data. In contrast, Razor and canary circuits provide fault detection by monitoring delays in the circuits.  ... 
doi:10.1145/3007787.3001165 fatcat:pko4yhxfordu7epqncumxlyoey