Managing Response Time Tails by Sharding
2019
ACM Transactions on Modeling and Performance Evaluation of Computing Systems
Matrix analytic methods are developed to compute the probability distribution of response times (i.e. data access times) in distributed storage systems protected by erasure coding, which is implemented by sharding a data object into N fragments, only K < N of which are required to reconstruct the object. ...
Such a lumpy tail is highly undesirable in parallel systems, where tails can become magnified, and we wish to determine whether sharding can protect read-access times from these tail characteristics. ...
doi:10.1145/3300143
fatcat:cy6xa4tcrbhdzjvubbylo7u5pu
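To make the K-out-of-N mechanism concrete, here is a minimal Monte Carlo sketch in Python of how reading any K of N fragments in parallel trims the response-time tail; the exponential service times and the 6-of-10 configuration are illustrative assumptions, not the paper's matrix-analytic model.

import random

# Estimate read latency when an object is split into N fragments and any
# K of them suffice to reconstruct it: the read completes when the K-th
# fastest fragment arrives.
def kofn_read_time(n, k, mean=1.0):
    times = sorted(random.expovariate(1.0 / mean) for _ in range(n))
    return times[k - 1]

def p99(samples):
    s = sorted(samples)
    return s[int(0.99 * (len(s) - 1))]

random.seed(1)
single = [kofn_read_time(1, 1) for _ in range(100_000)]    # one full-object read
sharded = [kofn_read_time(10, 6) for _ in range(100_000)]  # 6-of-10 sharded read
print("p99, single read  :", round(p99(single), 3))
print("p99, 6-of-10 read :", round(p99(sharded), 3))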
Efficient Energy Management in Distributed Web Search
2018
Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM '18
Our results show that PESOS can reduce the CPU energy consumption of a distributed WSE by up to 18% with respect to PEGASUS, while providing query response times which are in line with user expectations ...
A state-of-the-art research approach is the PESOS (Predictive Energy Saving Online Scheduling) algorithm, which can reduce the energy consumption of a WSE's single server by up to 50%. ...
ACKNOWLEDGMENTS This paper is partially supported by the BIGDATAGRAPES (grant agreement N° 780751) project that received funding from the European Union's Horizon 2020 research and innovation programme ...
doi:10.1145/3269206.3269263
dblp:conf/cikm/CatenaFT18
fatcat:65pztdk7vnc2hg2hgifjkqhumy
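As a hedged sketch of the scheduling idea described above (select the lowest CPU speed that still meets a query's latency bound), the following Python fragment is illustrative only; the DVFS frequency table and the linear cycles/frequency model are assumptions, not the PESOS algorithm itself.

# Assumed available DVFS states, in GHz, slowest (cheapest) first.
FREQS_GHZ = [1.2, 1.8, 2.4, 3.0]

def pick_frequency(predicted_cycles, deadline_s):
    # Choose the lowest frequency whose predicted processing time still
    # meets the query's deadline; fall back to full speed otherwise.
    for f in FREQS_GHZ:
        if predicted_cycles / (f * 1e9) <= deadline_s:
            return f
    return FREQS_GHZ[-1]

print(pick_frequency(predicted_cycles=2.0e9, deadline_s=1.0))  # -> 2.4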
Efficient Energy Management in Distributed Web Search
2018
Zenodo
Our results show that PESOS can reduce the CPU energy consumption of a distributed WSE by up to 18% with respect to PEGASUS, while providing query response times which are in line with user expectations ...
A state-of-the-art research approach is the PESOS (Predictive Energy Saving Online Scheduling) algorithm, which can reduce the energy consumption of a WSE's single server by up to 50%. ...
ACKNOWLEDGMENTS This paper is partially supported by the BIGDATAGRAPES (grant agreement N° 780751) project that received funding from the European Union's Horizon 2020 research and innovation programme ...
doi:10.5281/zenodo.2710863
fatcat:4guimicsn5ht7dnu2xgmbsfb4i
Leveraging sharding in the design of scalable replication protocols
2013
Proceedings of the 4th annual Symposium on Cloud Computing - SOCC '13
... the shards interact in order to improve availability. ...
Most if not all datacenter services use sharding and replication for scalability and reliability. Shards are more-or-less independent of one another and individually replicated. ...
This work was funded, in part, by grants from DARPA, NSF, ARPAe, Amazon.com and Microsoft Corporation. ...
doi:10.1145/2523616.2523623
dblp:conf/cloud/Abu-LibdehRV13
fatcat:ovbqrgown5gwlkz2nzwlqjfysu
The Case for RackOut
2016
Proceedings of the Seventh ACM Symposium on Cloud Computing - SoCC '16
To avoid violating tail latency service-level objectives, systems tend to keep server utilization low and organize the data in micro-shards, which provides units of migration and replication for the purpose ...
Despite the natural parallelism across lookups, the load imbalance, introduced by heavy skew in the popularity distribution of keys, limits performance. ...
This work has been partially funded by the Nano-Tera YINS project, the CHIST-ERA DIVIDEND project, and the Scale-Out NUMA project of the Microsoft-EPFL Joint Research Center. ...
doi:10.1145/2987550.2987577
dblp:conf/cloud/NovakovicDBFG16
fatcat:j6px37guazcx7o6q5hwm3byuna
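The load-imbalance problem named in this snippet is easy to reproduce: with Zipf-skewed key popularity, hash placement alone leaves some shards far hotter than others. The sketch below is a toy simulation; the key count, shard count, and the 1/(i+1) popularity law are assumptions.

import random
from collections import Counter

random.seed(0)
NUM_KEYS, NUM_SHARDS = 10_000, 16
# Zipf-like popularity: key i is requested with weight 1 / (i + 1).
weights = [1.0 / (i + 1) for i in range(NUM_KEYS)]
lookups = random.choices(range(NUM_KEYS), weights=weights, k=200_000)
load = Counter(k % NUM_SHARDS for k in lookups)  # hash placement of keys
mean = sum(load.values()) / NUM_SHARDS
print("hottest shard vs. mean load:", round(max(load.values()) / mean, 2))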
Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores
[article]
2018
arXiv
pre-print
Tail latencies are crucial in distributed applications with high fan-out ratios, because overall response time is determined by the slowest response. ...
Size-aware sharding improves tail latencies by avoiding head-of-line blocking, in which a request for a small item gets queued behind a request for a large item. ...
From the application's standpoint, the overall response time is then determined by the slowest of the responses to these requests, hence the crucial importance of tail latency for KV stores [14]. ...
arXiv:1802.00696v1
fatcat:knl4xrabxfb35jfjiu37tvipcu
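A minimal sketch of the dispatch rule implied here, assuming a fixed size threshold and a static worker split (both illustrative, not the paper's design): small-item requests never share a queue with large-item requests, so they cannot be blocked behind them.

import zlib

SIZE_THRESHOLD = 4096       # bytes; assumed small/large cutoff
SMALL_WORKERS = [0, 1, 2]   # assumed workers reserved for small items
LARGE_WORKERS = [3]         # assumed worker handling large items

def pick_worker(key, item_size):
    # Route by size first, then hash within the chosen pool, so a request
    # for a small item never queues behind a request for a large one.
    pool = SMALL_WORKERS if item_size <= SIZE_THRESHOLD else LARGE_WORKERS
    return pool[zlib.crc32(key.encode()) % len(pool)]

print(pick_worker("user:42", 120))     # small item -> worker 0, 1, or 2
print(pick_worker("blob:7", 1 << 20))  # large item -> worker 3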
LogPlayer: Fault-tolerant Exactly-once Delivery using gRPC Asynchronous Streaming
[article]
2019
arXiv
pre-print
We model check the correctness of LogPlayer using TLA+. ...
In this paper, we present the design of our LogPlayer that is a component responsible for fault-tolerant delivery of transactional mutations recorded on a WAL to the backend storage shards. ...
However, it requires more work by the transaction manager, as it needs to write to multiple topics and run the 2PC which results in higher transaction response time. ...
arXiv:1911.11286v1
fatcat:spujttcg7ff23kvjq24yafb6vm
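The snippet does not spell out LogPlayer's protocol, but a standard way to get exactly-once effects when replaying a WAL to storage shards is to tag each mutation with a log sequence number (LSN) and make application idempotent; the sketch below illustrates that generic technique, assuming in-order replay from the WAL.

class Shard:
    def __init__(self):
        self.applied_upto = 0  # highest LSN applied so far
        self.state = {}

    def apply(self, lsn, key, value):
        # A redelivered (retried) mutation carries an LSN we have already
        # applied, so it is dropped: at-least-once delivery plus an
        # idempotent apply yields exactly-once effects. Assumes mutations
        # arrive in WAL order.
        if lsn <= self.applied_upto:
            return
        self.state[key] = value
        self.applied_upto = lsn

shard = Shard()
for lsn, key, value in [(1, "a", 1), (2, "b", 2), (2, "b", 2)]:  # LSN 2 retried
    shard.apply(lsn, key, value)
print(shard.state, shard.applied_upto)  # {'a': 1, 'b': 2} 2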
BigDataGrapes D4.4 - Resource Optimization Methods and Algorithms
2020
Zenodo
They often require responsiveness in the sub-second time scale at high request rates. ...
... (BDG) platform to optimize computing resource management. ...
Figures 2 and 3 report tail response time (in milliseconds) and energy consumption (in Joules). ...
doi:10.5281/zenodo.4546120
fatcat:yrnyjqbnsvfrxnzya3zvioswee
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
[article]
2020
arXiv
pre-print
Overall, we observe only a marginal latency overhead when data-center-scale recommendation models are served in a distributed inference manner: P99 latency is increased by only 1% in the best case ...
Our primary takeaways are:
• Increasing the number of shards can manage latency overheads incurred by the RPC operators by effectively increasing model parallelism.
• However, increased sharding also incurs ...
RM3's size is dominated by a single large table, compared to the heavier tails exhibited by RM1 and RM2.
TABLE II: Sharding Summary for RM1. Each column is a different sharding configuration. ...
arXiv:2011.02084v2
fatcat:6aei4zk5vngzlcsfel2avfjney
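To illustrate the sharding takeaway above, here is a toy Python plan for fanning a lookup out across shards of one large embedding table; the modulo placement and shard count are assumptions, not the paper's system.

NUM_SHARDS = 4

def shard_of(row_id):
    return row_id % NUM_SHARDS  # assumed placement of embedding rows

def plan_lookup(row_ids):
    # Group requested rows by owning shard: one RPC per group, issued in
    # parallel, so more shards means more model parallelism (but also
    # more RPC fan-out per lookup).
    plan = {}
    for r in row_ids:
        plan.setdefault(shard_of(r), []).append(r)
    return plan

print(plan_lookup([3, 8, 11, 12]))  # {3: [3, 11], 0: [8, 12]}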
A High-Performance Persistent Memory Key-Value Store with Near-Memory Compute
[article]
2021
arXiv
pre-print
Managed blocks collectively represent versions over time. ...
It is a sharded architecture enabling a lock-free design with the restriction of only allowing 1:N shard-to-pool mapping (i.e., any pool can only be serviced by a single shard). ...
arXiv:2104.06225v1
fatcat:zeeqr5tu55crda7ohy4a3icycm
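A small sketch of the 1:N shard-to-pool restriction described here, with an assumed CRC-based placement: because every pool is owned by exactly one shard, all operations on a pool are serialized on that shard and need no cross-shard locks.

import zlib

NUM_SHARDS = 4

def owning_shard(pool_name):
    # Deterministic hash so every client maps a pool to the same shard;
    # a shard may own many pools, but a pool never spans shards.
    return zlib.crc32(pool_name.encode()) % NUM_SHARDS

print(owning_shard("session-cache"))
print(owning_shard("user-profiles"))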
RSS++: Load and State-Aware Receive Side Scaling
2019
Proceedings of the 15th International Conference on Emerging Networking Experiments And Technologies - CoNEXT '19
RSS++ incurs up to 14x lower 95th-percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. ...
RSS++ keeps the flow state in groups that can be migrated at once, leading to a 20% higher efficiency than a state-of-the-art shared flow table. ...
This work was also funded by the Swedish Foundation for Strategic Research (SSF). ...
doi:10.1145/3359989.3365412
dblp:conf/conext/BarbetteKMK19
fatcat:mfqyg32ywvbmjpg4lppgmk76iu
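As a rough illustration of migrating flow state "by groups": flows hash into a fixed set of buckets, each bucket owns its flows' state, and rebalancing reassigns whole buckets between cores. The bucket count, two-core setup, and CRC hashing are assumptions, not RSS++'s implementation.

import zlib

NUM_BUCKETS = 8
bucket_to_core = {b: b % 2 for b in range(NUM_BUCKETS)}  # 2 cores, round-robin
bucket_state = {b: {} for b in range(NUM_BUCKETS)}       # per-group flow state

def core_for(flow):
    return bucket_to_core[zlib.crc32(flow.encode()) % NUM_BUCKETS]

def migrate(bucket, new_core):
    # The whole group (bucket_state[bucket]) moves with the assignment,
    # so no individual per-flow state transfer is needed.
    bucket_to_core[bucket] = new_core

migrate(3, 0)
print(core_for("10.0.0.1:1234->10.0.0.2:80/tcp"))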
MITOSIS: Practically Scaling Permissioned Blockchains
[article]
2021
arXiv
pre-print
Our results show that MITOSIS can be ported with little modifications and manageable overhead to existing permissioned blockchains, such as Hyperledger Fabric. ...
However, most existing sharding proposals exploit features of the permissionless model and are therefore restricted to cryptocurrency applications. ...
This work was supported in part by the European Commission H2020 TeraFlow Project under Grant Agreement No. 101015857. ...
arXiv:2109.10302v1
fatcat:rojjhpyusve7zfcnhs7gcjg2vy
Anytime Ranking on Document-Ordered Indexes
[article]
2021
arXiv
pre-print
Anytime query processing can be used to effectively reduce high-percentile tail latency which is essential for operational scenarios in which a service level agreement (SLA) imposes response time requirements ...
Our experiments show that processing document-ordered topical segments selected by a simple score estimator outperforms existing anytime algorithms, and allows query runtimes to be accurately limited in ...
This work was supported by the Australian Research Council Discovery Project DP200103136. ...
arXiv:2104.08976v2
fatcat:bnsq6cn56zhglo65labrh6i75q
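A minimal sketch of the anytime loop described above, assuming segments are lists of (doc_id, score) pairs and a simple max-score estimator (both illustrative): segments are visited in estimated-score order and evaluation stops when the latency budget runs out, returning the best top-k found so far.

import heapq
import time

def anytime_rank(segments, score_estimate, budget_s, k=10):
    deadline = time.monotonic() + budget_s
    topk = []  # min-heap of (score, doc_id)
    for seg in sorted(segments, key=score_estimate, reverse=True):
        if time.monotonic() >= deadline:
            break  # budget exhausted: answer with what we have so far
        for doc, score in seg:
            heapq.heappush(topk, (score, doc))
            if len(topk) > k:
                heapq.heappop(topk)
    return sorted(topk, reverse=True)

segs = [[(1, 0.9), (2, 0.4)], [(3, 0.8)], [(4, 0.1)]]
print(anytime_rank(segs, lambda s: max(x[1] for x in s), budget_s=0.05, k=2))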
Kodiak
2016
Proceedings of the VLDB Endowment
For interactive response times, they have to deliver latencies of tens of milliseconds. At Turn's scale of operations, no existing system was able to deliver this performance in a cost-effective manner. ...
At query time, the system auto-selects the most suitable view to serve each query. Kodiak has been used in production for over a year. ...
These queries are served by the Cheetah system running on top of Hadoop, as they do not have views to cover them; hence the very high response time.
Kodiak Vs. ...
doi:10.14778/3007263.3007266
fatcat:cqshmswshje7bl6xylxyyrwfx4
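The view auto-selection step can be pictured with a small sketch: given materialized views, each pre-aggregated over a set of dimensions, answer a query from the cheapest view whose dimensions cover it. The catalog and cost numbers below are invented for illustration; they are not Kodiak's.

VIEWS = {  # view name -> (dimensions it groups by, relative scan cost)
    "daily_by_campaign": ({"day", "campaign"}, 1),
    "daily_by_campaign_geo": ({"day", "campaign", "geo"}, 5),
    "raw_events": ({"day", "campaign", "geo", "user"}, 100),
}

def select_view(query_dims):
    # A view can serve the query iff it retains every queried dimension.
    covering = [(cost, name) for name, (dims, cost) in VIEWS.items()
                if query_dims <= dims]
    return min(covering)[1] if covering else None

print(select_view({"day", "campaign"}))  # -> daily_by_campaign
print(select_view({"day", "geo"}))       # -> daily_by_campaign_geo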
Dissecting UbuntuOne
2015
Proceedings of the 2015 ACM Conference on Internet Measurement Conference - IMC '15
Personal Cloud services, such as Dropbox or Box, have been widely adopted by users. ...
Second, by means of tracing the U1 servers, we provide an extensive analysis of its back-end activity for one month. ...
... 2-R) as well as by the European Union through projects FP7 CloudSpaces (FP7-317555) and H2020 IOStack (H2020-644182). ...
doi:10.1145/2815675.2815677
dblp:conf/imc/TinedoTSHLLAV15
fatcat:dnxgy6zaybeldp62yeciytpxxq
Showing results 1 — 15 out of 1,275 results