Toward Understanding Soft Faults in High Performance Cluster Networks [chapter]

Jeffrey J. Evans, Seongbok Baik, Cynthia S. Hood, William Gropp
2003 Integrated Network Management VIII  
Fault management in high performance cluster networks has been focused on the notion of hard faults (i.e., link or node failures).  ...  Network degradations that negatively impact performance but do not result in failures often go unnoticed. In this paper, we classify such degradations as soft faults.  ...  Acknowledgments This work was supported in part by the U.S. Department of Energy, under Contract W-31-109-Eng-38 and NSF 9984811.  ... 
doi:10.1007/978-0-387-35674-7_14 fatcat:h7doiiiil5cg7goigkeycysouy

Toward understanding soft faults in high performance cluster networks

J.R.J. Evans, Seongbok Baik, C.S. Hood, W. Gropp
IFIP/IEEE Eighth International Symposium on Integrated Network Management, 2003.  
Fault management in high performance cluster networks has been focused on the notion of hard faults (i.e., link or node failures).  ...  Network degradations that negatively impact performance but do not result in failures often go unnoticed. In this paper, we classify such degradations as soft faults.  ...  Acknowledgments This work was supported in part by the U.S. Department of Energy, under Contract W-31-109-Eng-38 and NSF 9984811.  ... 
doi:10.1109/inm.2003.1194169 fatcat:4utktc7esjef5heueaaprzn5fu

A Heuristic Approach Applied to Time Reversal MUSIC Method for Soft Fault Location in Noisy Transmission Line Networks

M. Kafal, A. Cozza
2019 2019 PhotonIcs & Electromagnetics Research Symposium - Spring (PIERS-Spring)  
Time-Reversal multiple signal classification (TR-MUSIC) has emerged as a promising technique to locate multiple soft faults in complex wire networks, thanks to its location accuracy and sub-millimeter  ...  Accordingly, we will study in this paper the effect noise can bring on the fault location accuracy in different complex wire networks.  ...  This state of affair worsens whenever soft faults occur within multiple-branch networks.  ... 
doi:10.1109/piers-spring46901.2019.9017914 fatcat:z4od6fho2nailfw7uvar4ilhxy

Quantifying the performability of cluster-based services

Kiran Nagaraja, G. Gama, R. Bianchini, R.P. Martin, W. Meira, T.D. Nguyen
2005 IEEE Transactions on Parallel and Distributed Systems  
In particular, we evaluate the performance and availability of three soft state maintenance strategies for an online bookstore service in the presence of seven classes of faults.  ...  In this paper, we propose a two-phase methodology for systematically evaluating the performability (performance and availability) of cluster-based Internet services.  ...  In terms of performance, we indeed observe that storing the soft state in the database produces lower throughput under high load.  ... 
doi:10.1109/tpds.2005.61 fatcat:rrsacvxpabge3gtkht2ohhsvhu

EBSCN: An Error Backtracking Method for Soft Errors Based on Clustering and a Neural Network

Nan Zhang, Jianjun Xu, Xiankai Meng, Qingping Tan
2019 IEEE Access  
INDEX TERMS Error backtracking, fault tolerant, hierarchical clustering, neural networks, soft error, reliability.  ...  This work is licensed under a Creative Commons Attribution 4.0 License.  ...  The EBSCN method includes a feature extraction method based on clustering and a feature analysis method based on a deep neural network.  ...  Therefore, in this paper, we present an error backtracking method for soft errors based on clustering and neural network (EBSCN).  ... 
doi:10.1109/access.2019.2947005 fatcat:gdze52g6cbbdvbzvwurdl4yjsq

Enhancing the Analysis of Software Failures in Cloud Computing Systems with Deep Learning [article]

Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella
2021 arXiv   pre-print
The results show that the performance of the proposed approach, in terms of purity of clusters, is comparable to, or in some cases even better than manually fine-tuned clustering, thus avoiding the need  ...  In all cases, the proposed approach provides better performance than unsupervised clustering when no feature engineering is applied to the data.  ...  We are grateful to Gabriella Karamanolis for her help in the early stage of this work.  ... 
arXiv:2106.15182v1 fatcat:n77ysqsxybhybnuomnre5edsv4

A guided tour of data-center networking

Dennis Abts, Bob Felderman
2012 Communications of the ACM  
Large-scale parallel computers are grounded in HPC (high-performance computing) where kilo-processor  ...  Each cluster is homogeneous in both the processor type and speed.  ...  As a result, a data-center cluster may use virtualization for both performance and fault isolation, and Web applications are programmed with this sharing in mind.  ...  He then helped found Myricom, which became a leader in cluster-computing networking technology. After seven years there, he moved to Packet Design where he applied high-performance computing ideas to the  ... 
doi:10.1145/2184319.2184335 fatcat:lqw2xn2dcrd3vpcrcb3bmceghq

A Guided Tour through Data-center Networking

Dennis Abts, Bob Felderman
2012 Queue  
In recent years, Ethernet networks have made significant progress toward bridging the performance and scalability gap between capacity-oriented clusters built using COTS (commodity off-the-shelf) components  ...  While the network plays a central role in the overall system performance, it typically represents only 10-15 percent of the cluster cost.  ...  Alternatively, implementing QoS (quality of service) policies to segregate traffic into distinct classes and provide performance isolation and high-level traffic engineering is a step toward ensuring that  ... 
doi:10.1145/2208917.2208919 fatcat:6qdyiq7kufckrh5tmfuxvoirwm

Toward a Multi-Hop, Multi-Path Fault-Tolerant and Load Balancing Hierarchical Routing Protocol for Wireless Sensor Network

Mokhtar Beldjehem
2013 Wireless Sensor Network  
This paper describes a novel energy-aware multi-hop cluster-based fault-tolerant load balancing hierarchical routing protocol for a self-organizing wireless sensor network (WSN), which takes into account  ...  The main idea is using hierarchical fuzzy soft clusters enabling non-exclusive overlapping clusters, thus allowing partial multiple membership of a node to more than one cluster, whereby for each cluster  ...  Energy efficiency is often gained by accepting a reduction in network performance.  ... 
doi:10.4236/wsn.2013.511025 fatcat:lrbbzcljjfhatjfr4hj63jal6y

State maintenance and its impact on the performability of multi-tiered Internet services

G. Gama, K. Nagaraja, R. Bianchini, R.P. Martin, W. Meira, T.D. Nguyen
2004 Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004.  
In this paper, we evaluate the performance, availability, and combined performability of four soft state maintenance strategies in two multi-tier Internet services, an online book store and an auction  ...  Based on these results, we conclude that service designers need to provision the cluster and balance the load with availability and cost, as well as performance, in mind.  ...  They focused on demonstrating high performance and the ability of the tier maintaining the soft state to tolerate faults.  ... 
doi:10.1109/reldis.2004.1353015 dblp:conf/srds/GamaNBMMN04 fatcat:tr73urklozgntjzqfmav6ejfvi

A Survey on Fault Diagnosis in Wireless Sensor Networks

Zeyu Zhang, Amjad Mehmood, Lei Shu, Zhiqiang Huo, Yu Zhang, Mithun Mukherjee
2018 IEEE Access  
To improve data quality, shorten response time, strengthen network security, and prolong network lifespan, many studies have focused on fault diagnosis.  ...  Wireless sensor networks (WSNs) often consist of hundreds of sensor nodes that may be deployed in relatively harsh and complex environments.  ...  In order to quickly understand the current state of the literature, we present the most important fault diagnosis terms in Fig. 2.  ... 
doi:10.1109/access.2018.2794519 fatcat:gtus35aotrd65ptpkxicw75j4a

A Survey on Fault Tolerance Based Clustering Evolution in WSN

Hitesh Mohapatra, Amiya Kumar Rath
2020 IET Networks  
In this line of thought, clustering has been proven as an efficient strategy for prolonging the sensor network lifetime and selecting the correct topological structure for the sensor network.  ...  In our survey work, we focus on low R_e-based fault and its related fault-tolerant algorithms. The preferences of the clustering approach-based fault are illustrated in Fig. 2.  ...  Our study on 30 selected algorithms provided us an evolutionary approach of clustering towards fault tolerance approach.  ... 
doi:10.1049/iet-net.2019.0155 fatcat:syvobur3czbejem6rmgumzheuq

A cloud middleware for assuring performance and high availability of soft real-time applications

Kyoungho An, Shashank Shekhar, Faruk Caglar, Aniruddha Gokhale, Shivakumar Sastry
2014 Journal of Systems Architecture  
First, it describes an architecture for a fault-tolerant framework that can be used to automatically deploy replicas of virtual machines in data centers in a way that optimizes resources while assuring  ...  Second, it describes the design of a pluggable framework within the fault-tolerant architecture that enables plugging in different placement algorithms for VM replica deployment.  ...  Lately, however, a class of soft real-time applications that demand both high availability and predictable response times are moving towards cloud-based hosting [2, 3, 4].  ... 
doi:10.1016/j.sysarc.2014.01.009 fatcat:h6umd3tgyfcsjda34j7zeg3vlu

A Survey on Proactive, Active and Passive Fault Diagnosis Protocols for WSNs: Network Operation Perspective

Amjad Mehmood, Nabil Alrajeh, Mithun Mukherjee, Salwani Abdullah, Houbing Song
2018 Sensors  
Although wireless sensor networks (WSNs) have been the object of research focus for the past two decades, fault diagnosis in these networks has received little attention.  ...  In addition to illuminating the details of past efforts, this survey also reveals new research challenges and strengthens our understanding of the field of fault diagnosis.  ...  Generally, two types of faults are detected while performing fault diagnosis operations in a network: permanent faults and soft faults. The following subsection describes how to address both.  ... 
doi:10.3390/s18061787 pmid:29865210 pmcid:PMC6021939 fatcat:342xuyhfbrhfzfejptp6odhcqa

Resolving Wireless Sensor Networks Issues using Machine Learning Techniques: A Review

Harshitha S
2021 International Journal for Research in Applied Science and Engineering Technology  
...  the performance of WSN will change dynamically, and therefore it requires depreciating dispensable redesign of the network.  ...  In this paper, Machine learning techniques for solving various issues in WSN are presented; we discussed machine learning techniques for anomaly, fault, and event detection.  ...  To find optimal cluster heads for routing the data towards to base station in WSN, k-means clustering is the simplest clustering which will be more useful.  ... 
doi:10.22214/ijraset.2021.37085 fatcat:tb4uujt5pbgftintfjbkezncfy