1,242 Hits in 7.4 sec

Threads and input/output in the synthesis kernal

H. Massalin, C. Pu
1989 Proceedings of the twelfth ACM symposium on Operating systems principles - SOSP '89  
The Synthesis operating system kernel combines several techniques to provide high performance, including kernel code synthesis, fine-grain scheduling, and optimistic synchronization.  ...  Kernel code synthesis reduces the execution path for frequently used kernel calls. Optimistic synchronization increases concurrency within the kernel.  ...  Special Purpose Grant, and by the National Science Foundation under the grant CDA-88-20754. We gladly acknowledge the hardware parts contributed by AT&T, Hitachi, IBM, and Motorola.  ... 
doi:10.1145/74850.74869 dblp:conf/sosp/MassalinP89 fatcat:knf4smdtvfdqlhok62j3np2o2a

Threads and input/output in the synthesis kernal

H. Massalin, C. Pu
1989 ACM SIGOPS Operating Systems Review  
The Synthesis operating system kernel combines several techniques to provide high performance, including kernel code synthesis, fine-grain scheduling, and optimistic synchronization.  ...  Kernel code synthesis reduces the execution path for frequently used kernel calls. Optimistic synchronization increases concurrency within the kernel.  ...  Special Purpose Grant, and by the National Science Foundation under the grant CDA-88-20754. We gladly acknowledge the hardware parts contributed by AT&T, Hitachi, IBM, and Motorola.  ... 
doi:10.1145/74851.74869 fatcat:pyo57fm7sjfbvkn7izprvhwizq

RTL C-based methodology for designing and verifying a multi-threaded processor

L. Semeria, A. Seawright, R. Mehra, D. Ng, A. Ekanayake, B. Pangrle
2002 Proceedings 2002 Design Automation Conference (IEEE Cat. No.02CH37324)  
A RTL C-based design and verification methodology is presented which enabled the successful high speed validation of a 7 million gate simultaneous multi-threaded (SMT) network processor.  ...  The methodology is centered on statically scheduled C-based coding style, C to HDL translation, and a novel RTL-C to RTL-Verilog equivalence checking flow.  ...  The definition of this design methodology involved the work of several members of the design and verification teams.  ... 
doi:10.1109/dac.2002.1012606 fatcat:2otw277idfbg7lcekm6vlecmhm

RTL c-based methodology for designing and verifying a multi-threaded processor

Luc Sèmèria, Renu Mehra, Barry Pangrle, Arjuna Ekanayake, Andrew Seawright, Daniel Ng
2002 Proceedings - Design Automation Conference  
A RTL C-based design and verification methodology is presented which enabled the successful high speed validation of a 7 million gate simultaneous multi-threaded (SMT) network processor.  ...  The methodology is centered on statically scheduled C-based coding style, C to HDL translation, and a novel RTL-C to RTL-Verilog equivalence checking flow.  ...  The definition of this design methodology involved the work of several members of the design and verification teams.  ... 
doi:10.1145/513918.513951 dblp:conf/dac/SemeriaMPESN02 fatcat:3w7rm3b2yrg7ljvc2uc5rkyeha

RTL c-based methodology for designing and verifying a multi-threaded processor

Luc Sèmèria, Renu Mehra, Barry Pangrle, Arjuna Ekanayake, Andrew Seawright, Daniel Ng
2002 Proceedings - Design Automation Conference  
A RTL C-based design and verification methodology is presented which enabled the successful high speed validation of a 7 million gate simultaneous multi-threaded (SMT) network processor.  ...  The methodology is centered on statically scheduled C-based coding style, C to HDL translation, and a novel RTL-C to RTL-Verilog equivalence checking flow.  ...  The definition of this design methodology involved the work of several members of the design and verification teams.  ... 
doi:10.1145/513950.513951 fatcat:s3h3m2nudjfitphlzfkubh4mvq

Performance evaluation and analysis of thread pinning strategies on multi-core platforms: Case study of SPEC OMP applications on intel architectures

Abdelhafid Mazouz, Sid-Ahmed-Ali Touati, Denis Barthou
2011 2011 International Conference on High Performance Computing & Simulation  
On smaller Core2 and Nehalem machines, we show that the benefit of thread pinning is not satisfactory in terms of speedups versus OS-based scheduling, but the performance stability is much better.  ...  This means that the current Linux OS scheduling strategy is not necessarily the best choice in terms of performance on ccNUMA machines, even if it is a good choice in terms of cores usage ratio and work  ...  ACKNOWLEDGEMENTS We would like to thank the following colleagues for their hint on the algorithm of Edmonds: Sandrine VIAL, Bertrand LECUN, Thierry MAUTOR and Franck QUESSETTE.  ... 
doi:10.1109/hpcsim.2011.5999834 dblp:conf/ieeehpcs/MazouzTB11 fatcat:vdgwcupbhfcjhebmoyhyj2efam

Fast critical sections via thread scheduling for FPGA-based multithreaded processors

Martin Labrecque, J. Gregory Steffan
2009 2009 International Conference on Field Programmable Logic and Applications  
We address this challenge by proposing a method of scheduling threads in hardware that allows the multithreaded pipeline to be more fully utilized without significant costs in area or frequency.  ...  As FPGA-based systems including soft processors become increasingly common, we are motivated to better understand the architectural trade-offs and improve the efficiency of these systems.  ...  Hence the scalability of our system will ultimately be limited by the amount of computation per packet/task and the amount of parallelism across tasks, rather than the packet input/output capabilities  ... 
doi:10.1109/fpl.2009.5272561 dblp:conf/fpl/LabrecqueS09 fatcat:ja4etxlmgvcyrhheke6cdx2fum

A Lock-Free Multiprocessor OS Kernel

Henry Massalin, Calton Pu
1992 ACM SIGOPS Operating Systems Review  
We have implemented a complete multiprocessor OS kernel including threads, virtual memory, and I/O including a window system and a file system using only lock-free synchronization methods based on Compare-and-Swap  ...  Measured numbers show the low overhead of our implementation, competitive with user-level thread management systems.  ...  Sections 4, 5, and 6 describe the use of lock-free synchronization in the implementation of threads, virtual memory and input/output.  ... 
doi:10.1145/142111.993246 fatcat:bgw3bmlfgna77jizdemn2rrgre

A Lock-Free Multiprocessor OS Kernel

Henry Massalin, Calton Pu
1992 ACM SIGOPS Operating Systems Review  
We have implemented a complete multiprocessor OS kernel including threads, virtual memory, and I/O including a window system and a file system using only lock-free synchronization methods based on Compare-and-Swap  ...  Measured numbers show the low overhead of our implementation, competitive with user-level thread management systems.  ...  Sections 4, 5, and 6 describe the use of lock-free synchronization in the implementation of threads, virtual memory and input/output.  ... 
doi:10.1145/142111.964561 fatcat:mh72dc5ehjdhfnh3ldeuumh5iq

Multilevel Granularity Parallelism Synthesis on FPGAs

Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong
2011 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines  
Recent progress in High-Level Synthesis (HLS) techniques has helped raise the abstraction level of FPGA programming.  ...  However implementation and performance evaluation of the HLS-generated RTL, involves lengthy logic synthesis and physical design flows.  ...  ACKNOWLEDGMENT This work is partially supported by the Gigascale Systems Research Center (GSRC) and the Advanced Digital Sciences Center (ADSC) under a grant from the Agency for Science, Technology and  ... 
doi:10.1109/fccm.2011.29 dblp:conf/fccm/PapakonstantinouLSGCHC11 fatcat:krlcxo36ebeojfy7neqtu642kq

AMC: Advanced Multi-accelerator Controller

Tassadaq Hussain, Amna Haider, Shakaib A. Gursal, Eduard Ayguadé
2015 Parallel Computing  
The rapid advancement, use of diverse architectural features and introduction of High Level Synthesis (HLS) tools in FPGA technology have enhanced the capacity of data-level parallelism on a chip.  ...  In this article, we propose the integration of an intelligent memory system and efficient scheduler in the HLS-based multi-accelerator environment called Advanced Multi-accelerator Controller (AMC).  ...  The architecture is based on five units: Input/Output Link The Input/Output (I/O) link provides an interface between AMC and HLS multi-accelerator unit.  ... 
doi:10.1016/j.parco.2014.10.003 fatcat:z7xne5erxjbihk54ns6kjwjpve

GPU based implementation of multichannel adaptive room equalization

Jorge Lorente, Miguel Ferrer, Maria de Diego, Alberto Gonzalez
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Although the GPUs seem to be suitable platforms for multichannel scenarios, an efficient use of parallel computation in the adaptive filtering context is not straightforward due to the feedback loops.  ...  Results show the usefulness of GPUs to develop versatile, scalable and low cost multichannel AE systems.  ...  The MOTU audio uses the ASIO (Audio Stream Input/Output) driver to communicate with the CPU.  ... 
doi:10.1109/icassp.2014.6855065 dblp:conf/icassp/LorenteFDG14 fatcat:yaulvzsy4zgatoz2gymhksqzl4

Accelerating Forward Algorithm for Stochastic Automata on Graphics Processing Units

Muhammad Umer Sarwar, Muhammad Kashif Hanif, Ramzan Talib, Muhammad Haris Aziz
2020 IEEE Access  
A stochastic automaton is a non-deterministic automata with input and output behavior which works serially and synchronously. Stochastic automata is being used in different application areas.  ...  In this study, a parallel version of inference algorithm for stochastic automata is designed. The parallel version is mapped to graphics processing unit using the dynamic parallelism.  ...  FORWARD ALGORITHM Stochastic automata are abstract machines with input/output behavior.  ... 
doi:10.1109/access.2020.2973741 fatcat:am4i7gwjfnfmfokvf4e6snnm3a

A Toolchain for Dynamic Function Off-load on CPU-FPGA Platforms

Takaaki Miyajima, David Thomas, Hideharu Amano
2015 Journal of Information Processing  
input-output data.  ...  The Pipeline Generator also builds a pipeline control program by using Intel Threading Building Block (Intel TBB) to run both hardware modules and software functions in parallel.  ...  Acknowledgments The present study is supported in part by the JST/CREST program entitled "Research and Development on Unified Environment of Accelerated Computing and Interconnection for Post-Petascale  ... 
doi:10.2197/ipsjjip.23.153 fatcat:2pb5qh2ocjaiver3f2bpiykyba

Enabling Parallelized-QEMU for Hardware/Software Co-Simulation Virtual Platforms

Edel Díaz, Raúl Mateos, Emilio J. Bueno, Rubén Nieto
2021 Electronics  
This growth is appreciated in Multi-Processor System-On-Chips (MPSoC), composed of more cores in heterogeneous and homogeneous architectures in recent years.  ...  The results show that the novel synchronization mechanism does not add any appreciable computational load and enables parallelized-QEMU in hardware/software co-simulation virtual platforms.  ...  Acknowledgments: The authors would like to thank Slavka Madarova for her huge support in English review. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/electronics10060759 fatcat:dnxlzta6pvft7inpxjotplf2g4