A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2005; you can also visit the original URL.
The file type is application/pdf.
Eliminating redundancies in sum-of-product array computations
2001
Proceedings of the 15th international conference on Supercomputing - ICS '01
Number of Processors ... Sweep 1 Sweep 2 ...
3.1 Normalized array statement sequences ...
doi:10.1145/377792.377807
dblp:conf/ics/DeitzCS01
fatcat:hphzyfjawnhjlmejhrrcug2yii
Using of Redundant Signed-Digit Numeral System for Accelerating and Improving the Accuracy of Computer Floating-Point Calculations
2020
International Journal of Advanced Computer Science and Applications
The effect of accelerating computations is obtained for the problems of calculating the sum of an array of numbers and determining the dot product of vectors. ...
The article proposes a method for software implementation of floating-point computations on a graphics processing unit (GPU) with an increased accuracy, which eliminates sharp increase in rounding errors ...
If the array contains k numbers, then this summation method requires k-1 synchronizations in the process of summing this array. ...
doi:10.14569/ijacsa.2020.0110942
fatcat:45qklytg6bbzbhizkrabykwbwy
Subregion Analysis and Bounds Check Elimination for High Level Arrays
[chapter]
2011
Lecture Notes in Computer Science
For example, high-level arrays in the X10 language support rank-independent specification of multidimensional loop and array computations using regions and points. ...
| R ::= restriction of array onto region R A.sum(), A.max() ::= sum/max of elements in array A1 A2 ::= result of applying point-wise op on A1 and A2, when A1.region = A2.region (can include +, -, * ...
For simplicity, an additional loop is introduced to compute the weighted sum using the elements in the stencil, but this loop could be replaced by a high level array sum() operation as well. ...
doi:10.1007/978-3-642-19861-8_14
fatcat:vvondsmiknes3a6ha2opwe27vu
Fast multiplication without carry-propagate addition
1990
IEEE transactions on computers
Conventional schemes for fast multiplication accumulate the partial products in redundant form (carry-save or signed-digit) and convert the result to conventional representation in the last step. ...
Since a product of n bits is to be computed, those digits of the array that do not influence the result can be eliminated. ...
doi:10.1109/12.61047
fatcat:njn6aqomh5cvldx7qnnhn4xvtm
GLORE: generalized loop redundancy elimination upon LER-notation
2017
Proceedings of the ACM on Programming Languages
This paper presents GLORE, a novel approach to enabling the detection and removal of large-scope redundant computations in nested loops. ...
GLORE works on LER-notation, a new representation of computations in both regular and irregular loops. ...
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DOE or NSF. ...
doi:10.1145/3133898
dblp:journals/pacmpl/DingS17
fatcat:jeam5uobn5cincs2lbqcp2jfby
Algorithm-Based Fault Tolerance for Matrix Operations
1984
IEEE transactions on computers
The rapid progress in VLSI technology has reduced the cost of hardware, allowing multiple copies of low-cost processors to provide a large amount of computational capability for a small cost. ...
In addition to achieving high performance, high reliability is also important to ensure that the results of long computations are valid. ...
of the computed sum of the row or column data elements and the checksum to the erroneous element in the information part, (ii) or by replacing the checksum by the computed sum of the information elements ...
doi:10.1109/tc.1984.1676475
fatcat:esqcnwz4nff7xbbxbaisezj2jm
Compiling stencils in high performance Fortran
1997
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '97
from the translation of Fortran90 array constructs. ...
For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. ...
Acknowledgments This work has been supported in part by the IBM Corporation, the Center for Research on Parallel Computation (an NSF Science and Technology Center), and DARPA Contract DABT63-92-C-0038. ...
doi:10.1145/509593.509605
dblp:conf/sc/RothMKB97
fatcat:f5vd27hdvvhu7cdbvay6y57tym
Fast Multiplication Based on Different Compressors
IJIREEICE - Electrical, Electronics, Instrumentation and Control
2015
IJIREEICE
In many of digital systems like graphic processors, digital signal processors fast parallel multiplication using adder trees are present. To speed up the computation like addition is very important. ...
This approach is defined in parameterizable HDL code, which makes it compatible with any FPGA family. ...
Figure 2 show the CSA compute flow and Table 1 will show the CSA working. The computation can be in two steps, first we compute S and C using a CSA, and then we use CPA to compute the total sum. ...
doi:10.17148/ijireeice.2015.3230
fatcat:o5h5bqrq4bckdnruyn5dxlkcbe
FPGA-Based Data Storage System on NAND Flash Memory in RAID 6 Architecture for In-Line Pipeline Inspection Gauges
2018
IEEE transactions on computers
At the hardware level, we interleaved 8 NAND flash chips in a Redundant Array of Independent Disks (RAID) type-6 architecture. ...
Our controller computes the ECC and redundancy bytes while it transfers the information to the cache register of the selected die in the memory chips. ...
All authors would like to thank Joseph Moeller for his help in improving the English manuscript. ...
doi:10.1109/tc.2018.2794986
fatcat:p5uqu2wpifeizgefxzqrhey2qi
Nanofabric PLA Architecture with Double Variable Redundancy
2007
2007 IEEE Region 5 Technical Conference
It has been shown that fundamental electronic structures such as diodes and FETs can be constructed using ... crosspoint can be programmed ON or OFF by applying a voltage differential of 3.6V. ...
OUR APPROACH: DOUBLE VARIABLE REDUNDANCY determines the working of the array as AND array or OR array, as seen in
We have successfully simulated the configuration of this DVR in MATLAB to verify the ...
... product-sum terms using DVR-based PLA. We allocate two vertical nanowires per Product (or Sum) term in ... Pcp = 0.05 to 0.2 (probability that a single crosspoint is non-programmable) ... Expression (3) gives ...
doi:10.1109/tpsd.2007.4380347
fatcat:sn2f5lhzhrbzzdngtmeaub4cvq
Space-time trade-off optimization for a class of electronic structure calculations
2002
Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation - PLDI '02
Its utility is demonstrated by applying it to a computation representative of a component in the CCSD(T) formulation in the NWChem quantum chemistry suite from Pacific Northwest National Laboratory. ...
In this paper, we present an algorithm that starts with an operation-minimal form of the computation and systematically explores the possible space-time trade-offs to identify the form with lowest cost ...
of the product of several input arrays. ...
doi:10.1145/512529.512551
dblp:conf/pldi/CociorvaBLSRNBH02
fatcat:hdy6zbuuhrggjf7kwlalazbcmi
Space-time trade-off optimization for a class of electronic structure calculations
2002
SIGPLAN notices
Its utility is demonstrated by applying it to a computation representative of a component in the CCSD(T) formulation in the NWChem quantum chemistry suite from Pacific Northwest National Laboratory. ...
In this paper, we present an algorithm that starts with an operation-minimal form of the computation and systematically explores the possible space-time trade-offs to identify the form with lowest cost ...
of the product of several input arrays. ...
doi:10.1145/543552.512551
fatcat:dotf2cim75e3vftocdzhc3hvki
Space-time trade-off optimization for a class of electronic structure calculations
2002
Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation - PLDI '02
Its utility is demonstrated by applying it to a computation representative of a component in the CCSD(T) formulation in the NWChem quantum chemistry suite from Pacific Northwest National Laboratory. ...
In this paper, we present an algorithm that starts with an operation-minimal form of the computation and systematically explores the possible space-time trade-offs to identify the form with lowest cost ...
of the product of several input arrays. ...
doi:10.1145/512549.512551
fatcat:y3zuziytzjallmtgysjt2472dq
Arithmetic operators based on the binary stored-carry-or-borrow representation
2010
2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers
In the latter design, the conventional initial AND matrix is transformed and expressed with a redundant radix-2 representation. ...
Several BSCB arithmetic elements, including full-adder, ripple-carry adder, and carry-lookahead adder are presented, followed by detailed design of an array multiplier. ...
Redundant number representations allow fast addition by eliminating the carry propagation chains [Aviz61]. ...
doi:10.1109/acssc.2010.5757584
fatcat:vovggyn5bbgq7kpgsbl3jbqlai
Optimizing array bound checks using flow analysis
1993
ACM Letters on Programming Languages and Systems
The optimizations reduce the program execution time through elimination of checks and propagation of checks out of loops. ...
Bound checks are introduced in programs for the run-time detection of array bound violations. Compile-time optimizations are employed to reduce the execution-time overhead due to bound checks. ...
The range information is used to eliminate redundant bound checks on array subscripts. ...
doi:10.1145/176454.176507
fatcat:r4wbngdpfvhb5jzq5u4rjwzsm4
Showing results 1 — 15 out of 29,569 results