Subword parallelism with MAX-2

R.B. Lee
1996 IEEE Micro  
Table 5. Operation parallelism in PA-8000 with MAX-2.  ...  MAX speeds this up with parallel shift left and add, and parallel shift right and add.  ... 
doi:10.1109/40.526925 fatcat:xd575wxq5zh73hz2osunqjo4ta

Multimedia instructions in ia-64

R.B. Lee, A.M. Fiskiran, A. Bubshait
2001 IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.  
They are both a subset and a superset of the multimedia instructions from the predecessor architectures: MMX, SSE and SSE-2 from the IA-32 architecture, and MAX and MAX-2 from the PA-RISC architecture.  ...  These multimedia instructions implement subword parallelism, also called packed parallelism or microSIMD parallelism.  ...  These instructions implement the ISA concept of subword parallelism [2, 3], also called packed parallelism [4] or microSIMD parallelism [5].  ... 
doi:10.1109/icme.2001.1237694 dblp:conf/icmcs/LeeFB01 fatcat:hu6cvwfttffxfn7igmzlbi6vji

PLX: An Instruction Set Architecture and Testbed for Multimedia Information Processing

Ruby B. Lee, A. Murat Fiskiran
2005 Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology  
Key characteristics of PLX are a fully subword-parallel architecture with novel features like wordsize scalability from 32-bit to 128-bit words, a new definition of predication, and an innovative set of  ...  subword permutation instructions.  ...  Operation 1 Instructions Explanation Clip all $a_i$ at an arbitrary maximum value $a_{max}$, where $0 \le a_{max} < 2^{15} - 1$. padd.s.2 a, a, b Initially all $b_i$ contain $2^{15} - 1 - a_{max}$.  ... 
doi:10.1007/s11265-005-4940-8 fatcat:vzndq4zbfvdt7cey2yncygiuoi
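
The clipping idiom quoted in the PLX snippet above can be emulated in plain Python. This is a minimal sketch, not the PLX instruction semantics: it assumes 16-bit signed subwords and signed saturation for `padd.s`, and it assumes a follow-up saturating subtract of the same `b` to undo the offset, a step the snippet does not show. The names `sat16`, `padd_s`, `psub_s` and `clip_max` are hypothetical helpers.

```python
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1


def sat16(x):
    """Saturate an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def padd_s(a, b):
    """Emulate a signed-saturating packed add over lists of 16-bit subwords."""
    return [sat16(x + y) for x, y in zip(a, b)]


def psub_s(a, b):
    """Emulate a signed-saturating packed subtract (assumed counterpart)."""
    return [sat16(x - y) for x, y in zip(a, b)]


def clip_max(a, a_max):
    """Clip every subword of `a` at `a_max` without any compare or branch."""
    # Every b_i holds 2^15 - 1 - a_max, as in the quoted explanation.
    b = [INT16_MAX - a_max] * len(a)
    # Subwords above a_max saturate at 2^15 - 1; the rest are just offset.
    shifted = padd_s(a, b)
    # Assumed second step: remove the offset, leaving saturated lanes at a_max.
    return psub_s(shifted, b)


print(clip_max([100, 5000, 32767, -20000], 1000))
```

Subwords at or below `a_max` never saturate, so the add and subtract cancel; larger subwords are pinned at 2^15 - 1 by the add and land exactly on `a_max` after the subtract.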

Subword extensions for video processing on mobile systems

M.D. Jennings, T.M. Coate
1998 IEEE Concurrency  
We'll refer to this particular form of micro-SIMD execution as subword execution.  ...  execution on data organized in parallel.  ...  With both MAX-1 and MAX-2, HP takes a minimalist approach to multimedia extensions.  ... 
doi:10.1109/4434.708250 fatcat:5uaj7z4luvdl7lffllfpvokgci

Multimedia Instructions in Microprocessors for Native Signal Processing [chapter]

A Murat Fiskiran, Ruby Lee
2001 Signal Processing and Communications  
Native signal processing is DSP performed in the microprocessor itself, with the addition of general-purpose multimedia instructions.  ...  Throughout this chapter, the subwords in a register will be indexed from 1 to n, where n will be the number of subwords in that register.  ...  MAX-2, although designed simultaneously with MAX-1, was introduced later with the 64-bit PA-RISC 2.0 architecture.  ... 
doi:10.1201/9780203908068.ch3 fatcat:2jijojacfvaablxttyp3eko5be

Efficiency of microSIMD architectures and index-mapped data for media processors

Ruby B. Lee, Sethuraman Panchanathan, Subramania I. Sudharsanan, V. Michael Bove, Jr.
1998 Media Processors 1999  
We define alternative mappings of data onto subwords, and show that the index mapping is an ideal mapping for achieving maximal subword parallelism with minimal revamping of the original serial loop code  ...  We also show how to convert rapidly between data mappings by using the Mix permutation instructions, first defined in the MAX-2 multimedia extensions for PA-RISC processors.  ...  In general, the subword parallelism in figure 3 can be combined with the functional unit parallelism of a uniprocessor in figure 2, and with the processor parallelism in figure 1, for even higher  ... 
doi:10.1117/12.334770 fatcat:xn2hirfmo5ax3h3xgw5an3iktq

External inverse pattern matching [chapter]

Leszek Gasieniec, Piotr Indyk, Piotr Krysta
1997 Lecture Notes in Computer Science  
The entire problem is to find a pattern $P_{MAX} \in \Sigma^m$ which is not a subword of T and which maximizes the sum of Hamming distances between $P_{MAX}$ and all subwords of T of length m.  ...  Moreover we discuss a fast parallel implementation of our algorithm on the CREW PRAM model.  ...  subwords of $P_{MAX}$ (using LCA queries and table  ... 
doi:10.1007/3-540-63220-4_53 fatcat:hsoif5hterehdczipwca4io5qq

SoftSIMD - Exploiting Subword Parallelism Using Source Code Transformations

Stefan Kraemer, Rainer Leupers, Gerd Ascheid, Heinrich Meyr
2007 2007 Design, Automation & Test in Europe Conference & Exhibition  
Vendors often use proprietary platforms which are incompatible with others. Therefore, porting software is a very complex and time-consuming task.  ...  $A, B \in \{-M_{max}, \ldots, M_{max}\}$; $A_{bias} = A + M_{max}$; $B_{bias} = B + M_{max}$; $C = A_{bias} - B_{bias} = A - B$; $C \in \{-2M_{max}, \ldots, 2M_{max}\}$ Figure 2.  ...  Thus, with a 32-bit multiplication it is only possible to implement a SIMD 2 multiplication for two 8-bit values in parallel.  ... 
doi:10.1109/date.2007.364485 fatcat:nsvil2ufczhfvibeu6itpfeb4q
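
The bias relation quoted in the SoftSIMD snippet above can be checked numerically. The following is a small sketch of that identity only, not the paper's source transformation: it works on scalar values rather than packed words, and `biased_difference` and `check_identity` are hypothetical names.

```python
def biased_difference(a, b, m_max):
    """Subtract two signed values through their non-negative biased forms."""
    if not (-m_max <= a <= m_max and -m_max <= b <= m_max):
        raise ValueError("operands must lie in [-m_max, m_max]")
    a_bias = a + m_max  # lies in [0, 2 * m_max]
    b_bias = b + m_max  # lies in [0, 2 * m_max]
    # The two biases cancel, so the result equals a - b.
    return a_bias - b_bias


def check_identity(m_max):
    """Exhaustively verify C = A - B and C in [-2*m_max, 2*m_max]."""
    for a in range(-m_max, m_max + 1):
        for b in range(-m_max, m_max + 1):
            c = biased_difference(a, b, m_max)
            if c != a - b or not (-2 * m_max <= c <= 2 * m_max):
                return False
    return True


print(check_identity(7))
```

The range check makes the cost visible: the difference needs roughly one more bit than the operands, which is the headroom a packed layout would have to reserve for each subword.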

Challenges to combining general-purpose and multimedia processors

T.M. Conte, P.K. Dubey, M.D. Jennings, R.B. Lee, A. Peleg, S. Rathnam, M. Schlansker, P. Song, A. Wolfe
1997 Computer  
In addition to microarchitectural features, the ISA extensions are continuing to evolve (for example, HP has released two generations of extensions, MAX-1 and MAX-2).  ...  These include MAX-2 extensions to Hewlett-Packard PA-RISC [1], MMX for Intel's x86 [2, 3], UltraSparc's VIS [4], and MDMX extensions to MIPS V [5]. Processors targeted to embedded multimedia applications-the  ... 
doi:10.1109/2.642799 fatcat:j2gnaqh77fagbgxqjy5hbolqiq

Character Transformations for Non-Autoregressive GEC Tagging [article]

Milan Straka, Jakub Náplava, Jana Straková
2021 arXiv   pre-print
We propose a character-based non-autoregressive GEC approach, with automatically generated character transformations.  ...  correction for languages other than English, with character transformations applied at subwords, inferred automatically from parallel GEC corpus.  ...  () = g.strip() 0.5 • LevenshSimilarity(s[i], g) w[i, j] ← max(w[i, j], c+w[i, j +1]) end end end return alignment with weight w[0, 0], as in LCS with an attempt at non-autoregressive grammatical error  ... 
arXiv:2111.09280v1 fatcat:2zk6abkwufecheqh5b7wat2xai

Cross-language spoken document retrieval using HMM-based retrieval model with multi-scale fusion

Wai-Kit Lo, Helen Meng, P. C. Ching
2003 ACM Transactions on Asian Language Information Processing  
In this work the extended HMM-based retrieval model has been applied to an English-Mandarin CL-SDR task, which is to search the Mandarin spoken document collection with English queries at word and subword  ...  In addition, this HMM-based CLIR retrieval model is also extended for retrieval at subword scales.  ...  $_{D_i}\, p(D_i \mid Q) = \arg\max_{D_i} \frac{p(Q \cdot D_i)}{p(Q)} = \arg\max_{D_i} \frac{p(Q \mid D_i)\, p(D_i)}{p(Q)}$ (2). (2) becomes $\arg\max_{D_i} p(D_i \mid Q) = \arg\max_{D_i} p(Q \mid D_i)\, p(D_i)$ (3). If we assume that the a priori probability p(  ... 
doi:10.1145/964161.964162 fatcat:meyid3zxrjglvpf3zm4nn3u2wu

Overview of research efforts on media ISA extensions and their usage in video coding

V. Lappalainen, T.D. Hamalainen, P. Liuha
2002 IEEE transactions on circuits and systems for video technology (Print)  
Optimized applications include, in addition to some proprietary methods, all of the major video coding standards such as H.261, H.263, MPEG-4, MPEG-1, and MPEG-2.  ...  This paper summarizes the results of over 25 research groups or individual researchers that have presented video coding implementations on general-purpose processors with the new single instruction multiple  ...  This subset of subword parallelism is also referred to as packed parallelism. In MMX, MAX, VIS, and MDMX, for example, the subwords are integer subwords, while in SSE, 3DNow!  ... 
doi:10.1109/tcsvt.2002.800865 fatcat:tf62dj6pozda5e2iqck6iipu2u

Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation [article]

Annette Rios and Mathias Müller and Rico Sennrich
2020 arXiv   pre-print
We show that this bias towards English can be effectively reduced with even a small amount of parallel data in some of the non-English pairs.  ...  We find that language-specific subword segmentation results in less subword copying at training time, and leads to better zero-shot performance compared to jointly trained segmentation.  ...  To test this hypothesis, we train two models with language-specific subword segmentation: a) a model with language-specific subword seg-2 See also (Ha et al., 2017; Arivazhagan et al., 2019; Zhang et al  ... 
arXiv:2011.01703v1 fatcat:qoy243alazfp3nq7z2vtrikcfm

Performance Improvement of Multimedia Kernels by Alleviating Overhead Instructions on SIMD Devices [chapter]

Asadollah Shahbahrami, Ben Juurlink
2009 Lecture Notes in Computer Science  
Extended subwords use four extra bits for every byte in a media register and provide additional parallelism.  ...  SIMD extension is one of the most common and effective techniques to exploit data-level parallelism in today's processor designs.  ...  GPP with Multimedia Extension ISA Name AltiVec/VMX MAX-1/2 MDMX MMX/ VIS MMX/ SSE SSE2 SPU ISA 3DNow SIMD Company Motorola/IBM HP MIPS AMD Sun Intel Intel Intel IBM/Sony/Toshiba Instruction  ... 
doi:10.1007/978-3-642-03644-6_31 fatcat:tvpvpowuzzcqncqefdhl4tkesu

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings [article]

Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze
2021 arXiv   pre-print
We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data  ...  Statistical word aligners perform well, as do methods that extract alignments jointly with translations in NMT.  ...  ($S$, $n_{max}$, $\alpha \in [0, 1]$) 2: $A, M = \mathrm{zeros\_like}(S)$ 3: for $n \in [1, \ldots, n_{max}]$ do 4: $\forall i, j$: 5: $M_{ij} = \begin{cases} 1 & \text{if } \max\left(\sum_{l=0}^{l_e} A_{lj}, \sum_{l=0}^{l_f} A_{il}\right) = 0 \\ 0 & \text{if } \min\left(\sum_{l=0}^{l_e} A_{lj}, \sum_{l=0}^{l_f} A_{il}\right) > 0 \\ \alpha & \text{otherwise} \end{cases}$  ... 
arXiv:2004.08728v4 fatcat:pxp5aq5m2vemtepw4py33yd5ry