
Large Scale Distributed Distance Metric Learning [article]

Pengtao Xie, Eric Xing
2014 arXiv   pre-print
In this paper, we present a distributed algorithm for DML, and a large-scale implementation on a parameter server architecture.  ...  In large scale machine learning and data mining problems with high feature dimensionality, the Euclidean distance between data points can be uninformative, and Distance Metric Learning (DML) is often desired  ...  This demonstrates that our framework scales very well with the number of CPU cores (machines).  ... 
arXiv:1412.5949v1 fatcat:eiwlon6iqrglxbkbh46s3uez7a
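The snippet above concerns Distance Metric Learning (DML). As a minimal single-machine illustration (not the paper's distributed algorithm or parameter-server implementation), the common Mahalanobis family of learned metrics replaces the Euclidean distance with a distance computed under a learned linear map L, where M = L^T L:

```python
import numpy as np

def mahalanobis_distance(x, y, L):
    """Distance between x and y under a learned linear transform L (M = L^T L)."""
    d = L @ (x - y)
    return float(np.sqrt(d @ d))

# With L = identity, this reduces to the plain Euclidean distance.
x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
L = np.eye(2)
print(mahalanobis_distance(x, y, L))  # 5.0
```

DML methods fit L (or M) so that points labeled similar end up close and dissimilar points end up far apart, which is what makes the resulting distance informative in high dimensions.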

Locating distributed information

C.E. Wills
1989 IEEE INFOCOM '89, Proceedings of the Eighth Annual Joint Conference of the IEEE Computer and Communications Societies  
machines.  ...  In this paper we first examine general techniques for managing distributed information and how various evaluation criteria, when applied to the techniques, are affected by the parameters and scale of a  ...  The scale parameters c, s, and m are the number of client, server, and master machines, respectively.  ... 
doi:10.1109/infcom.1989.101469 dblp:conf/infocom/Wills89 fatcat:fvzvrk3wozg5poubthyckj3ja4

Hydra: Hybrid Server Power Model [article]

Nigel Bernard, Hoa Nguyen, Aman Chandan, Savyasachi Jagdeeshan, Namdev Prabhugaonkar, Rutuja Shah, Hyeran Jeon
2022 arXiv   pre-print
Compared with state-of-the-art solutions, Hydra outperforms across all compute-intensity levels on heterogeneous servers.  ...  Some complicated machine learning models themselves incur performance and power overheads, and hence it is not desirable to use them frequently.  ...  But the prediction accuracy of RAE was over 10% worse than Hydra's, because RAE does not consider server heterogeneity, while Hydra is trained with power scale factors and additional system parameters.  ... 
arXiv:2207.10217v1 fatcat:tfmy5wgnozcwtitz7lcnf5qyju

Performance Modeling of a Consolidated Java Application Server

Hitoshi Oi, Kazuaki Takahashi
2011 2011 IEEE International Conference on High Performance Computing and Communications  
The model breaks down the CPU utilization of the workload into servers and transaction types, and uses these service time parameters in the network of queues to predict the performance.  ...  When the CPU utilization of 4-core execution is predicted from the data of 1- to 3-core executions, the prediction errors range from -3.6% to 43.4%, with the largest error occurring in the database domain  ...  The scaling factor (SF) is the maximum at which the highest throughput with a valid response time is achieved.  ... 
doi:10.1109/hpcc.2011.118 dblp:conf/hpcc/OiT11 fatcat:hn2m2ad3x5dnfpn6jadkvhofgu
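The entry above predicts performance from per-transaction service-time parameters in a network of queues. A minimal sketch of the utilization law that such models build on (U = X · S; an assumed building block with made-up numbers, not the paper's full queueing model):

```python
def cpu_utilization(throughput_per_sec, service_times):
    """Utilization law: U = X * S, where S is the total CPU service
    demand (in seconds) a single transaction places on the resource."""
    return throughput_per_sec * sum(service_times)

# e.g. 50 tx/s, each transaction costing 4 ms + 6 ms of CPU across two tiers
print(cpu_utilization(50, [0.004, 0.006]))  # 0.5 -> 50% CPU utilization
```

Breaking S down per server and per transaction type, as the abstract describes, lets the model recombine the pieces to predict utilization under new core counts or workload mixes.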

DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction

Haijie Pan, Lirong Zheng
2021 Sensors  
We implemented DisSAGD in distributed clusters in order to train a machine learning model by sharing parameters among nodes using an asynchronous communication protocol.  ...  Machine learning models often converge slowly and are unstable due to the significant variance of the stochastic gradient estimates used in SGD.  ...  [30] proposed a sufficient factor broadcast (SFB) computational model for the efficient distributed learning of large-scale matrix-parameterized models.  ... 
doi:10.3390/s21155124 fatcat:h3kppjqc5rb75ayxmybgkt7g2a
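DisSAGD builds on variance reduction for stochastic gradients. A single-machine sketch of the SVRG-style update such schemes draw on, applied to a hypothetical least-squares objective (this is the generic technique, not the authors' distributed protocol):

```python
import numpy as np

def svrg_epoch(w, X, y, lr=0.1, inner_steps=50):
    """One SVRG epoch for least-squares.

    Each stochastic step is anchored by the full gradient at a snapshot,
    so the update g_i(w) - g_i(w_snap) + full_grad has reduced variance.
    """
    rng = np.random.default_rng(0)
    w_snap = w.copy()
    full_grad = X.T @ (X @ w_snap - y) / len(y)   # full gradient at snapshot
    for _ in range(inner_steps):
        i = rng.integers(len(y))
        g_i = X[i] * (X[i] @ w - y[i])            # stochastic gradient at w
        g_snap_i = X[i] * (X[i] @ w_snap - y[i])  # same sample, at snapshot
        w = w - lr * (g_i - g_snap_i + full_grad)
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = np.zeros(3)
for _ in range(20):
    w = svrg_epoch(w, X, y)
print(np.round(w, 3))  # converges toward w_true
```

Because the anchored update has shrinking variance as w approaches the optimum, a constant learning rate can be used, unlike plain SGD, which needs a decaying step size.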

Optimizing Network Performance in Distributed Machine Learning

Luo Mai, Chuntao Hong, Paolo Costa
2015 USENIX Workshop on Hot Topics in Cloud Computing  
To cope with the ever-growing availability of training data, there have been several proposals to scale machine learning computation beyond a single server and distribute it across a cluster.  ...  A key feature of MLNET is its compatibility with existing hardware and software infrastructure, so it can be deployed immediately.  ...  We evaluate its effectiveness in Section 4 by means of large-scale simulations with 800 servers and 50 to 400 parameter servers.  ... 
dblp:conf/hotcloud/MaiHC15 fatcat:7kvzzfh3r5c3xicfoholbymnym

DS-FACTO: Doubly Separable Factorization Machines [article]

Parameswaran Raman, S.V.N. Vishwanathan
2020 arXiv   pre-print
Our solution is fully de-centralized and does not require the use of any parameter servers.  ...  Despite using a low-rank representation for the pairwise features, the memory overheads of using factorization machines on large-scale real-world datasets can be prohibitively high.  ...  [17] discusses how factorization machines can be used on relational data and scaled to large datasets.  ... 
arXiv:2004.13940v1 fatcat:utu5p6zfynajrgsumu3li2nhvi
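The entry above uses a low-rank representation for pairwise features. A sketch of the standard factorization machine prediction it refers to, using the well-known O(d·k) identity for the pairwise term (this is the generic FM model, not DS-FACTO's decentralized solver):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Factorization machine: y = w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j,
    with pairwise weights factored as rows of V (shape d x k).

    The pairwise sum is computed in O(d*k) via
      0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ].
    """
    linear = w0 + w @ x
    s = V.T @ x                 # (k,): per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)  # (k,): per-factor sums of squares
    pairwise = 0.5 * float(np.sum(s ** 2 - s2))
    return float(linear + pairwise)

# Tiny check: d=2, k=1, V rows v1=[1], v2=[2] -> pairwise = <v1,v2>*x1*x2 = 2
x = np.array([1.0, 1.0])
V = np.array([[1.0], [2.0]])
print(fm_predict(x, 0.0, np.zeros(2), V))  # 2.0
```

The factored form is what keeps the parameter count at O(d·k) instead of O(d²), though, as the abstract notes, even that can be prohibitive at Web scale without distribution.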

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters [article]

Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, Pengtao Xie, Eric P. Xing
2017 arXiv   pre-print
Deep learning models can take weeks to train on a single GPU-equipped machine, necessitating scaling out DL training to a GPU-cluster.  ...  However, current distributed DL implementations can scale poorly due to substantial parameter synchronization over the network, because the high throughput of GPUs allows more data batches to be processed  ...  We thank the CMU Parallel Data Laboratory for their machine resources and Henggang Cui for insightful discussion. This research is supported by NSF Big Data IIS1447676 and NSF XPS Parallel CCF1629559.  ... 
arXiv:1706.03292v1 fatcat:bjlsi42zdze27dbkxkivv44ggq

Computing Web-scale Topic Models using an Asynchronous Parameter Server

Rolf Jagerman, Carsten Eickhoff, Maarten de Rijke
2017 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17  
We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server.  ...  However, classical methods for inferring topic models do not scale up to the massive size of today's publicly available Web-scale data sets.  ...  Finally, there are two promising directions for future work: (1) Large-scale information retrieval tasks often require machine learning methods such as factorization machines and deep learning, which are  ... 
doi:10.1145/3077136.3084135 dblp:conf/sigir/JagermanER17 fatcat:acilgidzindmljvhkqtgkv6fl4

Project Adam: Building an Efficient and Scalable Deep Learning Training System

Trishul M. Chilimbi, Yutaka Suzue, Johnson Apacible, Karthik Kalyanaraman
2014 USENIX Symposium on Operating Systems Design and Implementation  
We describe the design and implementation of a distributed system called Adam, comprising commodity server machines, that trains such models with world-class performance, scaling and task accuracy  ...  We also show that task accuracy improves with larger models.  ...  ACKNOWLEDGMENTS We would like to thank Patrice Simard for sharing his gradient descent toolkit code, which we started with as a single-machine reference implementation.  ... 
dblp:conf/osdi/ChilimbiSAK14 fatcat:u5jrmcprlzbf7eptvuup52cd5a

Evaluating the Performance and Power Consumption of Systems with Virtual Machines

Ricardo Lent
2011 2011 IEEE Third International Conference on Cloud Computing Technology and Science  
Virtualization allows multiple applications to run on different execution platforms while sharing the same host machine.  ...

Parameter   Comment                        System A   System B
I           Idle power consumption         60.30      61.60
α_C         Scaling factor: core           25.70      12.82
α_N         Scaling factor: network port   0.66       1.79
α_D         Scaling factor: drive          7.21       ...

Virtual machines (and their corresponding virtualized application) are modeled with infinite server nodes (delay nodes).  ... 
doi:10.1109/cloudcom.2011.120 dblp:conf/cloudcom/Lent11 fatcat:d3zg7wxrzzajtkwdol6m2fzn5q
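The parameters above suggest an additive linear power model. A sketch under that assumption (the paper's exact functional form may differ), using the System A values from the table:

```python
# Assumed linear host power model built from the tabulated parameters:
#   P = I + alpha_C * active_cores + alpha_N * network_ports + alpha_D * drives
def host_power(active_cores, network_ports, drives,
               idle=60.30, alpha_c=25.70, alpha_n=0.66, alpha_d=7.21):
    """Estimated host power (W) for System A as a linear function of
    active cores, busy network ports, and active drives."""
    return idle + alpha_c * active_cores + alpha_n * network_ports + alpha_d * drives

print(host_power(2, 1, 1))  # 60.30 + 51.40 + 0.66 + 7.21 = 119.57 W
```

The contrast between the two systems' coefficients (e.g. 25.70 W vs. 12.82 W per core) is what makes such per-host calibration necessary when consolidating virtual machines.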

Scaling Distributed Machine Learning with the Parameter Server

Mu Li
2014 Proceedings of the 2014 International Conference on Big Data Science and Computing - BigDataScience '14  
We propose a parameter server framework for distributed machine learning problems.  ...  To demonstrate the scalability of the proposed framework, we show experimental results on petabytes of real data with billions of examples and parameters on problems ranging from Sparse Logistic Regression  ...  We run the parameter server with 90 virtual server nodes on 15 machines of a research cluster [40] (each has  ...  Acknowledgments: This work was supported in part by gifts and/or machine time from Google,  ... 
doi:10.1145/2640087.2644155 fatcat:l2lwr5t6undvroygn6ohatx3de
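A toy sketch of the pull/push cycle at the heart of a parameter server (simulated sequentially on one machine; the node counts, data, and learning rate here are made up for illustration, not taken from the paper's system):

```python
import numpy as np

class ParameterServer:
    """Toy parameter server: workers pull current weights, push gradients."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def pull(self):
        return self.w.copy()

    def push(self, grad):
        self.w -= self.lr * grad   # apply a worker's gradient update

# Two simulated workers, each holding a data shard, do one round each:
ps = ParameterServer(dim=2)
shards = [(np.array([[1.0, 0.0]]), np.array([2.0])),
          (np.array([[0.0, 1.0]]), np.array([4.0]))]
for X, y in shards:                # sequential stand-in for parallel workers
    w = ps.pull()                  # pull latest weights
    grad = X.T @ (X @ w - y)       # least-squares gradient on the shard
    ps.push(grad)                  # push gradient back to the server
print(ps.w)  # [0.2 0.4]
```

In a real deployment the weights are sharded across many server nodes and pushes arrive asynchronously, which is exactly the consistency/throughput trade-off frameworks like this one manage.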

Impact of Elasticity on Cloud Systems

Pancham Baruah, Arti Mohanpurkar
2015 International Journal of Computer Applications  
Cloud elasticity is essential because it allows servers to resize the virtual machines deployed in the system, thereby meeting the demand for new resources.  ...  With cloud computing, service providers can offer on-demand services to users as needed.  ...  The framework reads the latest state of the application along with the load on the servers and the performance parameters.  ... 
doi:10.5120/21296-4325 fatcat:ooa6qk7yjneetmk67tgs7bnaei

Large Scale Distributed Deep Networks

Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew W. Senior, Paul A. Tucker, Ke Yang, Andrew Y. Ng
2012 Neural Information Processing Systems  
In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores.  ...  We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models.  ...  Because DistBelief models are themselves partitioned across multiple machines, each machine needs to communicate with just the subset of parameter server shards that hold the model parameters relevant  ... 
dblp:conf/nips/DeanCMCDLMRSTYN12 fatcat:n2xrsnu6jvdannhqp54gm3gzse

Autonomic Provisioning with Self-Adaptive Neural Fuzzy Control for Percentile-Based Delay Guarantee

Palden Lama, Xiaobo Zhou
2013 ACM Transactions on Autonomous and Adaptive Systems  
NFC is a hybrid of control-theoretical and machine learning techniques. It is capable of self-constructing its structure and adapting its parameters through fast online learning.  ...  We demonstrate the feasibility and performance of the NFC-based approach with a testbed implementation in virtualized blade servers hosting a multi-tier online auction benchmark.  ...  ACKNOWLEDGMENTS The authors thank the National Institute of Science, Space and Security centers for providing blade server equipment for conducting the case study.  ... 
doi:10.1145/2491465.2491468 fatcat:kmeage4fajcotdb7buv26g5nba