1,332 Hits in 5.4 sec

DRAM-Based Statistics Counter Array Architecture With Performance Guarantee

Hao Wang, Haiquan Zhao, Bill Lin, Jun Xu
2012 IEEE/ACM Transactions on Networking  
In particular, we propose a DRAM-based counter architecture that can effectively maintain wirespeed updates to large counter arrays.  ...  The proposed architecture makes use of a simple randomization scheme, a small cache, and small request queues to statistically guarantee a near-perfect load-balancing of counter updates to the DRAM banks  ...  We then propose DRAM-based counter architectures that allow for wirespeed updates to large counter arrays.  ... 
doi:10.1109/tnet.2011.2171360 fatcat:gsra6al4pvc43fbgecc2tm4dii

DRAM is plenty fast for wirespeed statistics counting

Bill Lin, Jun (Jim) Xu
2008 Performance Evaluation Review  
Per-flow network measurement at Internet backbone links requires the efficient maintenance of large arrays of statistics counters at very high speeds (e.g. 40 Gb/s).  ...  In this paper, we present a contrarian view that modern commodity DRAM architectures, driven by aggressive performance roadmaps for consumer applications (e.g. video games), have advanced architecture  ...  In particular, several designs of large counter arrays based on hybrid SRAM/DRAM counter architectures have been proposed.  ... 
doi:10.1145/1453175.1453183 fatcat:5ffkw34qrrbqrigixwwdm2yyey

CHOP: Integrating DRAM Caches for CMP Server Platforms

Xiaowei Jiang, Niti Madan, Li Zhao, Mike Upton, Ravi Iyer, Srihari Makineni, Don Newell, Yan Solihin, Rajeev Balasubramonian
2011 IEEE Micro  
Filter cache (CHOP-FC) Figure 2a shows our basic filter-based DRAM-caching architecture (CHOP-FC), which incorporates a filter cache on die with the DRAM cache tag array (the data array is off die via  ...  The adaptive filter cache schemes show their performance robustness and guarantee performance improvement by quickly adapting to the bandwidth utilization situation.  ... 
doi:10.1109/mm.2010.100 fatcat:3rdpbptksjgyzodishdbvwdmge

Robust Pipelined Memory System with Worst Case Performance Guarantee for Network Processing

Hao Wang, Haiquan Zhao, Bill Lin, Jun Xu
2012 IEEE transactions on computers  
The design is based on the interleaving of DRAM banks together with the use of a reservation table that serves in part as a data cache.  ...  In this paper, we analyze a robust pipelined memory architecture that can emulate an ideal SRAM by guaranteeing with very high probability that the output sequence produced by the pipelined memory architecture  ...  Several designs of large counter arrays based on hybrid SRAM/DRAM counter architectures have been proposed.  ... 
doi:10.1109/tc.2011.171 fatcat:j7qdt7k2n5blbizqfw37pfbtci

Discount Counting for Fast Flow Statistics on Flow Size and Flow Volume

Chengchen Hu, Bin Liu, Hongbo Zhao, Kai Chen, Yan Chen, Yu Cheng, Hao Wu
2014 IEEE/ACM Transactions on Networking  
The results demonstrate that DISCO is more accurate than related work given the same counter sizes. DISCO is also implemented on the network processor Intel IXP2850 for a performance test.  ...  For each incoming packet, DISCO increases the corresponding counter assigned to the flow with an increment that is less than the packet length.  ...  Based on modern fast DRAM, [21] and [23] proposed a randomized DRAM architecture that can harness the performance of fast DRAM offerings by interleaving counter updates to multiple memory banks.  ... 
doi:10.1109/tnet.2013.2270439 fatcat:e5x2424lxndujl2cgmz6fdd55m

BRICK: A Novel Exact Active Statistics Counter Architecture

Nan Hua, Jun Jim Xu, Bill Lin, Haiquan Chuck Zhao
2011 IEEE/ACM Transactions on Networking  
Experiments with Internet traces show that our solution can indeed maintain large arrays of exact active statistics counters with moderate amounts of SRAM.  ...  In this paper, we present an exact active statistics counter architecture called BRICK (Bucketized Rank Indexed Counters) that can efficiently store per-flow variable-width statistics counters entirely  ...  In particular, several SRAM-efficient designs of large counter arrays based on hybrid SRAM/DRAM counter architectures have been proposed.  ... 
doi:10.1109/tnet.2011.2111461 fatcat:cqtokvhx5vfcxnli42j52qrvya

BRICK

Nan Hua, Bill Lin, Jun (Jim) Xu, Haiquan (Chuck) Zhao
2008 Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems - ANCS '08  
Experiments with Internet traces show that our solution can indeed maintain large arrays of exact active statistics counters with moderate amounts of SRAM.  ...  In this paper, we present an exact active statistics counter architecture called BRICK (Bucketized Rank Indexed Counters) that can efficiently store per-flow variable-width statistics counters entirely  ...  In particular, several SRAM-efficient designs of large counter arrays based on hybrid SRAM/DRAM counter architectures have been proposed.  ... 
doi:10.1145/1477942.1477956 dblp:conf/ancs/HuaLXZ08 fatcat:as3duathsvayrpz254qdk7i4za

SRAM-DRAM hybrid memory with applications to efficient register files in fine-grained multi-threading

Wing-kei S. Yu, Ruirui Huang, Sarah Q. Xu, Sung-En Wang, Edwin Kan, G. Edward Suh
2011 Proceeding of the 38th annual international symposium on Computer architecture - ISCA '11  
This configuration results in significant area and energy savings compared to the SRAM array with the same capacity due to compact DRAM cells.  ...  Circuit and architecture simulations of GPU benchmark suites show significant savings in register file area (38%) and energy (68%) over the traditional SRAM implementation, with minimal (1.4%) performance  ...  For example, a 16-KB 2-context memory array (8-KB SRAM with 2 8-KB DRAM) is replaced with two 8-KB 2-context arrays (4-KB SRAM with 2 4-KB DRAM).  ... 
doi:10.1145/2000064.2000094 dblp:conf/isca/YuHXWKS11 fatcat:qdj774nwnjb53nu25gamdfptta

SRAM-DRAM hybrid memory with applications to efficient register files in fine-grained multi-threading

Wing-kei S. Yu, Ruirui Huang, Sarah Q. Xu, Sung-En Wang, Edwin Kan, G. Edward Suh
2011 SIGARCH Computer Architecture News  
This configuration results in significant area and energy savings compared to the SRAM array with the same capacity due to compact DRAM cells.  ...  Circuit and architecture simulations of GPU benchmark suites show significant savings in register file area (38%) and energy (68%) over the traditional SRAM implementation, with minimal (1.4%) performance  ...  For example, a 16-KB 2-context memory array (8-KB SRAM with 2 8-KB DRAM) is replaced with two 8-KB 2-context arrays (4-KB SRAM with 2 4-KB DRAM).  ... 
doi:10.1145/2024723.2000094 fatcat:i2trkix455hyfa5l7bibjgfxz4

CHOP: Adaptive filter-based DRAM caching for CMP server platforms

Xiaowei Jiang, Niti Madan, Li Zhao, Mike Upton, Ravishankar Iyer, Srihari Makineni, Donald Newell, Yan Solihin, Rajeev Balasubramonian
2010 HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture  
We conduct detailed simulations with server workloads to show that our filter-based DRAM caching techniques achieve the following: (a) on average over 30% performance improvement over previous solutions  ...  As manycore architectures enable a large number of cores on the die, a key challenge that emerges is the availability of memory bandwidth with conventional DRAM solutions.  ...  Filter Cache (CHOP-FC) Figure 4(a) shows our first filter-based DRAM caching architecture, where a Filter Cache (CHOP-FC) is incorporated on die along with the DRAM cache tag arrays (labeled as DT).  ... 
doi:10.1109/hpca.2010.5416642 dblp:conf/hpca/JiangMZUIMNSB10 fatcat:o7fotbnqdbeinf5kbou6hpkvse

A survey of sketches in traffic measurement: Design, Optimization, Application and Implementation [article]

Shangsen Li, Lailong Luo, Deke Guo, Qianzhen Zhang, Pengtao Fu
2021 arXiv   pre-print
At their cores, sketches usually maintain one or multiple counter array(s), and rely on hash functions to select the counter(s) for each flow.  ...  Then the space-efficient sketches from the distributed measurement nodes are aggregated to provide statistics of the undergoing flows.  ...  The architecture consists of a simple randomization scheme, a small cache, and small request queues to statistically guarantee a near-perfect load-balancing of counters to the DRAM banks. 3) Implementation  ... 
arXiv:2012.07214v2 fatcat:lme2ghsshje3tag2m5q3xgvcna

Self-Tuning the Parameter of Adaptive Non-linear Sampling Method for Flow Statistics

Chengchen Hu, Bin Liu
2009 2009 International Conference on Computational Science and Engineering  
A parameter self-tuning algorithm is proposed in this paper, which enlarges the parameter to an equilibrium tuning point and renormalizes the counter when the counter overflows.  ...  Flow statistics is a basic task of passive measurement and has been widely used to characterize the state of the network.  ...  The basic idea of BRICK is intuitive and is based on statistical multiplexing, which bundles groups of a fixed number (say 64) of counters that are randomly selected from the array, into buckets.  ... 
doi:10.1109/cse.2009.19 dblp:conf/cse/HuL09 fatcat:qi5lsdach5fa5b55e4qzif25vi

CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture

Tao Zhang, Matt Poremba, Cong Xu, Guangyu Sun, Yuan Xie
2014 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)  
The proposed CREAM architecture distinguishes itself with the following key contributions: (1) Under a given DRAM power budget, sub-rank-level refresh (SRLR) is developed to reduce refresh power and the  ...  addition, novel sub-array level refresh scheduling schemes, such as sub-array round-robin and dynamic scheduling, are designed to further improve the performance.  ...  As the maximum number of sub-arrays in a single DRAM chip is 128, the row counter for one bank is a 7-bit register.  ... 
doi:10.1109/hpca.2014.6835947 dblp:conf/hpca/ZhangPXSX14 fatcat:lpwxedokhne7pa6bhvxfocuyqu

Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy

Gabriel H. Loh
2009 Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture - Micro-42  
While significant performance benefits can be gained with such an approach, there remain additional opportunities beyond the simple integration of commodity DRAM chips.  ...  In this work, we leverage the hardware organization typical of DRAM architectures to propose new cache management policies that would otherwise not be practical for standard SRAM-based caches.  ...  With a DRAM array, accessing any row requires loading the row into the row buffer and eventually writing the row back.  ... 
doi:10.1145/1669112.1669139 dblp:conf/micro/Loh09 fatcat:6lz7fhx4nngkbl42e3jbkkcbxy

Programmable DDRx Controllers

Mahdi Nazm Bojnordi, Engin Ipek
2013 IEEE Micro  
control policies practical. 10 Pardis divides the tasks associated with high-performance DRAM control among a request processor, a transaction processor, and dedicated command logic.  ...  The off-chip memory subsystem is a significant performance, power, and quality-of-service (QoS) bottleneck in modern computers, necessitating a high-performance memory controller that can overcome DRAM  ...  Special-purpose registers are implemented using a 64-entry array of programmable counters.  ... 
doi:10.1109/mm.2013.29 fatcat:wf5qo3fmqbdzbdcvkbkptntnm4