Multi-platform Auto-vectorization
International Symposium on Code Generation and Optimization (CGO'06)
code to vector code by the compiler. ...
Intrinsics: vector float vb = vec_load(0, ptr_b); vector float vc = vec_load(0, ptr_c); vector float va = vec_add(vb, vc); vec_store(va, 0, ptr_a); Autovectorization: Automatically transform serial ...
IBM Labs in Haifa, Multi-Platform Auto-Vectorization - Talk Layout Larsen, Amarasinghe; Shin, Chame, Hall) - Altivec Vectorizing compilers available for multiple SIMD targets source-to-source compilers Vienna ...
doi:10.1109/cgo.2006.25
dblp:conf/cgo/NuzmanH06
fatcat:gwcjkpxr6ffcjkyre3ntqr2ax4
Multi-sensor kernel design for time-frequency analysis of sparsely sampled nonstationary signals
2015
2015 IEEE Radar Conference (RadarCon)
In this paper, we examine the sparsity-based time-frequency signal representation (TFSR) of randomly thinned nonstationary signals in a multi-sensor platform to yield improved performance with reduced number ...
We develop a robust multi-sensor AOK design based on data fusion across all sensors so as to enhance the signal auto-terms while effectively mitigating artifacts, cross-terms, and noise. ...
While CS-based TF approaches were considered for a single-sensor scenario, we extend such treatment into a multi-sensor platform. ...
doi:10.1109/radar.2015.7131122
fatcat:gowplr5m6rbd5dxfbeouwcv77q
Multi-tier Service Differentiation: Coordinated Resource Provisioning and Admission Control
2012
2012 IEEE 18th International Conference on Parallel and Distributed Systems
We propose a coordinated self-adaptive resource management and admission control for multi-tier Internet service differentiation and performance improvement in a shared virtualized platform. ...
We implement the integrated approach in a virtualized blade server system hosting multi-tier RUBiS applications. ...
the shared platform to the $\mathbb{R}^{M \times N}$ vector so that Eq. (3) and Eq. (4) are satisfied at the same time. ...
doi:10.1109/icpads.2012.20
dblp:conf/icpads/MuppalaZC12
fatcat:bqppzzqxxrbrfkpkqsb3l3dk4e
DOA estimation of sparsely sampled nonstationary signals
2015
2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)
The paper deals with sparsely sampled nonstationary signals in a multi-sensor array platform. ...
The reconstructed auto- and cross-sensor TFSRs enable the formation of the spatial time-frequency distribution (STFD) matrix, which is used, in turn, to propose the sparse time-frequency MUSIC (STF-MUSIC ...
The averaged AF over all sensors is given by $A_\Sigma(r, \psi) = \frac{1}{N} \sum_{q=1}^{N} A_q(r, \psi)$. (11) Then, an improved kernel in the multi-sensor platform is obtained by replacing $A(r, \psi)$ in (10) by $A_\Sigma(r, \psi)$ in (11 ...
doi:10.1109/chinasip.2015.7230412
dblp:conf/chinasip/GuoZWA15
fatcat:cjlup4djizhe7fejczqlshwb7y
On the performance and energy-efficiency of multi-core SIMD CPUs and CUDA-enabled GPUs
2013
2013 IEEE International Symposium on Workload Characterization (IISWC)
This paper explores the performance and energy efficiency of CUDA-enabled GPUs and multi-core SIMD CPUs using a set of kernels and full applications. ...
Our implementations efficiently exploit both SIMD and thread-level parallelism on multi-core CPUs and the computational capabilities of CUDA-enabled GPUs. ...
C: compiler auto-vectorized, only accounts for single-threaded optimization effort. **: Multi-core effort only. ...
doi:10.1109/iiswc.2013.6704683
dblp:conf/iiswc/DuarteSV13
fatcat:qs36ks5kezdrldp5zk3nstai2i
Architectural Support for Reducing Parallel Processing Overhead in an Embedded Multiprocessor
2010
2010 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing
The host-multi-SIMD chip multiprocessor (CMP) architecture has been proved to be an efficient architecture for high performance signal processing which explores both task level parallelism by multi-core ...
Implementing an algorithm in a parallel platform usually produces control and communication overhead which is not parallelizable. ...
The ePUMA platform uses the host-multi-SIMD with architectural optimizations to minimize parallel processing overheads. ...
doi:10.1109/euc.2010.17
dblp:conf/euc/WangSL10
fatcat:uboiys3jzncufgqeana4pn7lxy
iDev: Enhancing Social Coding Security by Cross-platform User Identification Between GitHub and Stack Overflow
2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Then, we propose a novel AHIN representation learning model AHIN2Vec to efficiently learn node (i.e., user) representations in AHIN for cross-platform user identification. ...
To solve this problem, an important insight brought by this work is to leverage social coding properties in addition to user attributes for cross-platform user identification. ...
Multi-view network built from AHIN. ...
doi:10.24963/ijcai.2019/315
dblp:conf/ijcai/FanZHCYSZX19
fatcat:ry6it7fhs5hzzhszfaybmdk77a
Towards A Multi-agent System for Online Hate Speech Detection
[article]
2021
arXiv
pre-print
This paper envisions a multi-agent system for detecting the presence of hate speech in online social media platforms such as Twitter and Facebook. ...
We conclude with a discussion of how our system may be of use to provide recommendations to users who are managing online social networks, showcasing the immense potential of intelligent multi-agent systems ...
text none 45.18 33.4 38.41
BiL multi text+caption none 45.38 33.67 38.67
VBiL multi image+text+caption Concat 55.27 35.54 43.04
VBiL multi image+text+caption Auto-Fusion 59.65 43.87 ...
arXiv:2105.01129v1
fatcat:lakkm66thrfy3kidfc734cfjbe
Vectorization of Riemann solvers for the single- and multi-layer Shallow Water Equations
2018
2018 International Conference on High Performance Computing & Simulation (HPCS)
We discuss vectorization of normal and transverse Riemann solvers for the single- and multi-layer shallow water equations. ...
Our approach is simple and portable, as it is based on auto-vectorization by the compiler, aided by OpenMP 4.0 directives. ...
Although auto-vectorization was possible for their f-Wave solver, intrinsic functions were necessary to achieve vectorization of the augmented Riemann solver, because the compiler was not able to auto-vectorize ...
doi:10.1109/hpcs.2018.00073
dblp:conf/ieeehpcs/FerreiraMB18
fatcat:apwymwd64fefvbcdziuolsok7i
Use of SIMD Vector Operations to Accelerate Application Code Performance on Low-Powered ARM and Intel Platforms
2013
2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum
On the ARM platforms the hand-tuned NEON benchmarks were between 1.05× and 13.88× faster than the auto-vectorized code, while for the Intel platforms the hand-tuned SSE benchmarks were between 1.34× and ...
The performance obtained using compiler auto-vectorization is compared with that achieved using hand-tuning across a range of five different benchmarks and ten different hardware platforms. ...
These figures show that in general the benefit of using hand coded SIMD intrinsics over auto-vectorization appears to be slightly greater on the Intel platforms compared with the ARM platforms. ...
doi:10.1109/ipdpsw.2013.207
dblp:conf/ipps/MitraJRMZ13
fatcat:43t6svygpzefdobbxwy4gsjad4
Special issue on Intelligence Computation Evolutionary Computation: ICEV2018
2019
Evolutionary Intelligence
Imbalanced data classification algorithm with support vector machine kernel extensions proposes an imbalanced data classification algorithm of support vector machines (KE-SVM). ...
maximum margin classification SVM model, and then obtaining a new kernel extension function. based on Chi square test and weight coefficient calculation, through training the samples again by the new vector ...
Firstly, the method inputs historical data, which contains power load, weather information, and holiday information, and uses auto-encoding to compress the historical data; then, the multi-layer GRU is ...
doi:10.1007/s12065-019-00271-0
fatcat:s454clzxjnejfbzindrypt7nsa
Fusion OLAP: Fusing the Pros of MOLAP and ROLAP Together for In-memory OLAP
2018
IEEE Transactions on Knowledge and Data Engineering
The Fusion OLAP model can be integrated into the state-of-the-art in-memory databases with additional surrogate key indexes and vector indexes. ...
This is achieved by mapping the relation tables into virtual multidimensional model and binding the multidimensional operations into a set of vector indexes to enable multidimensional computing on relation ...
The vector access latency can be improved by two roadmaps, by cache locality or by simultaneous multi-threading. ...
doi:10.1109/tkde.2018.2867522
fatcat:vfrtcmiqsvfodeahtx6oake2uu
Evaluating Auto-Vectorizing Compilers through Objective Withdrawal of Useful Information
2019
ACM Transactions on Architecture and Code Optimization (TACO)
With our new method in place, we exhaustively evaluated five industry-grade compilers: GNU, Intel, Clang, PGI and IBM; on four representative vector platforms: AVX-2, AVX-512 (Skylake), AVX-512 (KNL) and ...
a method to objectively supply and withdraw information that would otherwise aid the compiler in the auto-vectorization process. ...
(f) Global Data Flow and Symbolics categories show good auto-vectorization results across all platforms and compilers. ...
doi:10.1145/3356842
fatcat:iztjlyb7lffvrehu4mcx3dgroy
Pushing the Limits of Online Auto-tuning: Machine Code Optimization in Short-Running Kernels
[article]
2017
arXiv
pre-print
We propose an online auto-tuning approach for computing kernels. ...
This allows auto-tuning to pay off in very short-running applications. ...
and the best statically auto-tuned kernels, in the real platforms (all run-time overheads included). ...
arXiv:1707.04566v1
fatcat:sdtgqm6iv5ekzmxnuxtuvveisq
Exploring source-to-source compiler transformation of OpenMP SIMD constructs for Intel AVX and Arm SVE vector architectures
2022
Proceedings of the Thirteenth International Workshop on Programming Models and Applications for Multicores and Manycores
Finally, we conduct performance evaluations on Intel AVX and Arm SVE to demonstrate how this method of vectorization can bridge the gap between auto- and manual-vectorization. ...
We present the design of a unified IR that is easily translated to AVX and SVE vector architectures. ...
GPUs and multi-core CPUs (via threading). ...
doi:10.1145/3528425.3529100
fatcat:b6zh5b3gfvcndancatw4lunu6q