Distributed Recommendation Inference on FPGA Clusters

Yu Zhu, Zhenhao He, Wenqi Jiang, Kai Zeng, Jingren Zhou, Gustavo Alonso
2021 2021 31st International Conference on Field-Programmable Logic and Applications (FPL)  
To implement recommendation inference efficiently in the context of a real deployment, we design and implement an FPGA cluster optimizing the performance of both stages.  ...  To match the required DNN computation throughput, we partition the workload across multiple FPGAs interconnected via a 100 Gbps TCP/IP network.  ...  We would like to thank Xilinx for their generous donation of the XACC FPGA cluster at ETH Zurich on which the experiments were conducted.  ... 
doi:10.1109/fpl53798.2021.00057 fatcat:wxektrvidreldcy2tbrsan2j6a

An FPGA Realization of a Random Forest with k-Means Clustering Using a High-Level Synthesis Design

Akira JINGUJI, Shimpei SATO, Hiroki NAKAHARA
2018 IEICE transactions on information and systems  
In this paper, to further reduce the amount of hardware, we use k-means clustering to share comparators of the branch nodes on the decision tree.  ...  clustering, FPGA  ...  On the other hand, the disadvantages are as follows: 1. Overly deep decision trees tend to overfit 2.  ...
doi:10.1587/transinf.2017rcp0006 fatcat:dqj45utqsff7vfs7jcy2finhbq

A Flexible HLS Hoeffding Tree Implementation for Runtime Learning on FPGA [article]

Luís Miguel Sousa, Nuno Paulino, João Canas Ferreira, João Bispo
2021 arXiv   pre-print
For a problem size of D3, K5, and N40000, a single decision tree operating at 103MHz is capable of 8.3x faster inference than the 1.2GHz ARM Cortex-A53 core.  ...  Decision trees are often preferred when implementing Machine Learning in embedded systems for their simplicity and scalability.  ...  As future work, we envision the use of tree ensembles, and the partitioning of training and inference task between software and hardware based on problem size.  ...
arXiv:2112.01875v1 fatcat:lpf3i7dzpbf3bj4wd7rix2q3hm

Lowering the latency of data processing pipelines through FPGA based hardware acceleration

Muhsen Owaida, Gustavo Alonso, Laura Fogliarini, Anthony Hock-Koon, Pierre-Etienne Melet
2019 Proceedings of the VLDB Endowment  
Our solution uses a novel decision tree ensemble implementation on an FPGA to: 1) increase the number of entries that can be scored per unit of time, and 2) provide a compact implementation that can be  ...  With a real use case as a baseline and motivation, we focus on accelerating the scoring function implemented as a decision tree ensemble, a common approach to scoring and classification in search systems  ...  a cluster of FPGAs to process very large ensembles [40] .  ... 
doi:10.14778/3357377.3357383 fatcat:xfngr7mstjawjibykyhpzbizce

Booster: An Accelerator for Gradient Boosting Decision Trees [article]

Mingxuan He, T. N. Vijaykumar, Mithuna Thottethodi
2020 arXiv   pre-print
We propose Booster, a novel accelerator for gradient boosting trees based on the unique characteristics of gradient boosting models.  ...  Based on ASIC synthesis of FPGA-validated RTL using 45 nm technology, we estimate a Booster chip to occupy 60 mm^2 of area and dissipate 23 W when operating at 1-GHz clock speed.  ...  Batch inference on Booster Implementing batch inference on Booster is an extension of the one-tree traversal in training with the key difference being multiple decision trees in the former versus only  ... 
arXiv:2011.02022v2 fatcat:2mcbj4ipuvgnrfkn4xf3bfrah4

Experience with PCIe streaming on FPGA for high throughput ML inferencing [article]

Piyush Manavar, Manoj Nambiar, Nupur Sumeet, Rekha Singhal, Sharod Choudhary, Amey Pandit
2021 arXiv   pre-print
We have presented our results for inferences on a gradient boosted trees model, for online retail recommendations.  ...  Further, we analyze the run time statistics on GPU and FPGA and identify opportunities to enhance performance on both the platforms.  ...  In [2] the authors discuss ways to partition large decision tree ensembles across multiple FPGAs (cluster).  ... 
arXiv:2110.11719v1 fatcat:kwp7iyljznfwlh7guwd5omatsy

Do OS abstractions make sense on FPGAs?

Dario Korolija, Timothy Roscoe, Gustavo Alonso
2020 USENIX Symposium on Operating Systems Design and Implementation  
work which replicates subsets of the traditional OS execution environment (virtual memory, processes, etc.) on the FPGA.  ...  FPGAs can deliver tremendous improvements in performance and energy efficiency for a range of workloads, but development and deployment of FPGA-based applications remains cumbersome, leading to recent  ...  Acknowledgements This work has been made possible through a generous equipment donation from Xilinx and through access to the Xilinx Adaptive Compute Cluster (XACC) Program.  ...
dblp:conf/osdi/KorolijaRA20 fatcat:qpwo6kurj5ddbot6dn7lwdbjly

Random Forest-LNS Architecture and Vision [chapter]

Hassab Elgawi
2010 New Advances in Machine Learning  
., decision trees) to build an ensemble is an advanced machine learning technique with substantial improvement over single-based classifiers.  ...  Then a decision tree can be converted into an equivalent 'Tree Unit' by extracting one logic function per class from the tree structure.  ...
doi:10.5772/9386 fatcat:ngjippqg3bbz7fzwporvdax43y

BioCNN: A Hardware Inference Engine for EEG-Based Emotion Detection

Hector A. Gonzalez, Shahzad Muzaffar, Jerald Yoo, Ibrahim M. Elfadel
2020 IEEE Access  
His current research interests include neuromorphic computing, hardware design for machine learning algorithms, biomedical applications and digital signal processing for radar systems.  ...  He is the author of several publications covering the areas of digital system verification, digital signal processing for frequency-modulated continuous-wave radar, emotion detection and industry applications  ...  A decision tree is a hierarchical graph model whose branches are generated according to a decision parameter. A Random Forest is an ensemble of decision trees, constructed during training phase.  ...
doi:10.1109/access.2020.3012900 fatcat:y4yer3ztsvge5bew4bzfnebaie

A Survey of Machine Learning for Computer Architecture and Systems [article]

Nan Wu, Yuan Xie
2021 arXiv   pre-print
For ML-based modelling, we discuss existing studies based on their target level of system, ranging from the circuit level to the architecture/system level.  ...  While HLSPredict targets the same FPGA platform in training and testing, XPPE [149] considers different FPGA platforms, and uses ANNs to predict the speedup of an application on a target FPGA over an  ...  Examples of supervised learning: (a) regression, (b) SVM, (c) decision tree, (d) MLP, and (e) ensemble learning. Fig. 3. Examples of unsupervised learning: (a) clustering, and (b) PCA.  ...
arXiv:2102.07952v1 fatcat:vzj776a6abesljetqobakoc3dq

Field-Programmable Gate Arrays and Quantum Monte Carlo: Power Efficient Co-processing for Scalable High-Performance Computing [article]

Salvatore Cardamone, Jonathan R. Kimmitt, Hugh G. A. Burton, Alex J. W. Thom
2018 arXiv   pre-print
We have focussed our efforts on Variational Monte Carlo, and report on the benefits of co-processing with an FPGA relative to a purely multicore system.  ...  However, not all applications exhibit an adequate level of data and/or task parallelism to exploit such platforms.  ...  radix point partitions integer from fractional parts, amongst other related decisions.  ... 
arXiv:1808.02402v1 fatcat:s6f46ymjznd6dhh5jemx5gnziq

An Efficient Framework for Floor-plan Prediction of Dynamic Runtime Reconfigurable Systems

Ahmed Al-Wattar, Shawki Areibi, Gary Grewal
2015 International Journal of Reconfigurable and Embedded Systems (IJRES)  
In this paper, we present a novel adaptive and dynamic methodology, based on a Machine Learning approach, for predicting and estimating the necessary resources for an application based on past  ...  An important feature of the proposed methodology is that the system is able to learn and generalize and, therefore, is expected to improve its accuracy over time.  ...  The FPGA fabric is first partitioned uniformly then partitioned with 50% increase in size, while varying the number of PRRs.  ...
doi:10.11591/ijres.v4.i2.pp99-121 fatcat:tkpnyjn76vfxbma5vbth2br7cq

Deep Neural Network Approximation for Custom Hardware: Where We've Been, Where We're Going [article]

Erwei Wang, James J. Davis, Ruizhe Zhao, Ho-Cheung Ng, Xinyu Niu, Wayne Luk, Peter Y. K. Cheung, George A. Constantinides
2019 arXiv   pre-print
We also include proposals for future research based on a thorough analysis of current trends.  ...  Application-tailored accelerators, when co-designed with approximation-based network training methods, transform large, dense and computationally expensive networks into small, sparse and hardware-efficient  ...  computation of CNN inference on FPGAs [71] .  ... 
arXiv:1901.06955v3 fatcat:rkgo2oisdrgv3dtnbtlldlkpba

Predictive Analytics On Big Data - An Overview

Gayathri Nagarajan, Dhinesh Babu L.D
2019 Informatica (Ljubljana, Tiskana izd.)  
While research works carried out continuously to handle big data is at one end, processing it to develop the business insights is a hot topic to work on the other end.  ...  The overview throws light on the core predictive models, challenges of these models on big data, research gaps in several domain sectors and using different techniques.  ...  There are many decision tree algorithms but few variants are shown in table 2. Strength and weakness: The major advantage of decision tree over other classifiers is its interpretability.  ... 
doi:10.31449/inf.v43i4.2577 fatcat:hqi45o6t7jb63dr3aaesink6l4

A Survey on Efficient Processing of Similarity Queries over Neural Embeddings [article]

Yifan Wang
2022 arXiv   pre-print
By comparing the solutions, we show how neural embeddings benefit those applications.  ...  Finally, we investigate the specific solutions with and without using embeddings in selected application domains of similarity queries, including entity resolution and information retrieval.  ...  Partition / tree based Tree based (or space-partitioning based) indexes are one of the most commonly used indexes for similarity search, e.g., KD-tree.  ... 
arXiv:2204.07922v1 fatcat:u5osyghs6vgppnj5gpnrzhae5y