730 Hits in 2.2 sec

Structured Differential Learning for Automatic Threshold Setting [article]

Jonathan Connell, Benjamin Herta
2018 arXiv   pre-print
arXiv:1808.00361v1 fatcat:bclkdigtxfh2tbzis3ey5kzcxm

M3R: Increased performance for in-memory Hadoop jobs [article]

Avraham Shinnar, David Cunningham, Benjamin Herta, Vijay Saraswat
2012 arXiv   pre-print
Main Memory Map Reduce (M3R) is a new implementation of the Hadoop Map Reduce (HMR) API targeted at online analytics on high mean-time-to-failure clusters. It does not support resilience, and supports only those workloads which can fit into cluster memory. In return, it can run HMR jobs unchanged -- including jobs produced by compilers for higher-level languages such as Pig, Jaql, and SystemML and interactive front-ends like IBM BigSheets -- while providing significantly better performance than the Hadoop engine on several workloads (e.g. 45x on some input sizes for sparse matrix vector multiply). M3R also supports extensions to the HMR API which can enable Map Reduce jobs to run faster on the M3R engine, while not affecting their performance under the Hadoop engine.
arXiv:1208.4168v1 fatcat:lnsnqc2ak5adblsp4ojzctbfb4
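
To make "run HMR jobs unchanged" concrete, here is a minimal, standard Hadoop MapReduce word-count job, the kind of unmodified HMR program the abstract claims M3R executes in memory. It is ordinary Hadoop API code with nothing M3R-specific in it; a sketch for illustration, not code from the paper.

```java
// A standard Hadoop MapReduce word-count job. The abstract's claim is that
// programs written against this API run unchanged on the M3R engine.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);   // emit (word, 1) for each token
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();  // sum the 1s per word
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```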

M3R

Avraham Shinnar, David Cunningham, Vijay Saraswat, Benjamin Herta
2012 Proceedings of the VLDB Endowment  
Main Memory Map Reduce (M3R) is a new implementation of the Hadoop Map Reduce (HMR) API targeted at online analytics on high mean-time-to-failure clusters. It does not support resilience, and supports only those workloads which can fit into cluster memory. In return, it can run HMR jobs unchanged -- including jobs produced by compilers for higher-level languages such as Pig, Jaql, and SystemML and interactive front-ends like IBM BigSheets -- while providing significantly better performance than the Hadoop engine on several workloads (e.g. 45x on some input sizes for sparse matrix vector multiply). M3R also supports extensions to the HMR API which can enable Map Reduce jobs to run faster on the M3R engine, while not affecting their performance under the Hadoop engine.
doi:10.14778/2367502.2367513 fatcat:bnh6bmmorfdstgcbmczabbo5tu

Resilient X10

David Cunningham, David Grove, Benjamin Herta, Arun Iyengar, Kiyokuni Kawachiya, Hiroki Murata, Vijay Saraswat, Mikio Takeuchi, Olivier Tardieu
2014 SIGPLAN notices  
[From the talk slides] Node failure is a reality on commodity clusters: hardware failures, memory errors, leaks, race conditions (including in the kernel), and evictions; the popularity of Hadoop is evidence. Ignoring failures causes serial MTBF aggregation: a 24-hour run on 1000 nodes with a 6-month node MTBF gives under a 1% success rate, while transparent checkpointing causes significant overhead. Resilient X10 instead offers failure awareness with helpful semantics: failure reporting, continuing execution on unaffected nodes, and preservation of synchronization via the Happens-Before Invariance (HBI) principle. Failure recovery is application-level and uses domain knowledge: if the computation is approximate, trade accuracy for reliability (e.g. Rinard, ICS06); if it is repeatable, replay it; if lost data is unmodified, reload it; if data is mutated, checkpoint it. Libraries can hide, abstract, or expose faults (e.g. containment domains), and common patterns (e.g. map reduce) can be captured via application frameworks. There are no changes to the language, but substantial changes to the runtime implementation: exceptions report failure, and existing exception semantics give strong synchronization guarantees. Performance is within 90% of non-resilient X10. The evaluation kernel, found in a number of algorithms (e.g. GNMF, PageRank), multiplies an N x N sparse (0.1% nonzero) matrix G by a dense vector V of length N, with the resulting vector used as V in the next iteration; the matrix block size is 1000x1000 in double precision, and G is distributed into row blocks. Every place starts with the entire V and computes a fragment of V'; the fragments are sent to place 0 for aggregation, and the new V is broadcast from place 0 for the next iteration (G is never modified).
doi:10.1145/2692916.2555248 fatcat:up5khhdg2rahdnne3zwcw752qe
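
The evaluation kernel described above is simple enough to sketch. Below is a schematic, single-process Java rendering of the iterated sparse matrix-vector multiply: G is split into row blocks (one per X10 place in the paper), each block computes its fragment of V', and the fragments are concatenated into the next V. The COO storage format and the sequential loop over blocks are illustrative assumptions; the paper's actual kernel is distributed X10 code.

```java
// Schematic rendering of the Resilient X10 benchmark kernel: iterated
// sparse matrix-vector multiply with G partitioned into row blocks.
public class IterativeSpmv {
  // One row block of G in coordinate (COO) form, owned by one "place".
  static final class RowBlock {
    final int rowOffset;      // first global row this block owns
    final int[] rows, cols;   // local row index, global column index
    final double[] vals;
    RowBlock(int rowOffset, int[] rows, int[] cols, double[] vals) {
      this.rowOffset = rowOffset;
      this.rows = rows; this.cols = cols; this.vals = vals;
    }
    // Every place starts with the entire V and computes its fragment of V'.
    double[] multiply(double[] v, int blockRows) {
      double[] fragment = new double[blockRows];
      for (int k = 0; k < vals.length; k++) {
        fragment[rows[k]] += vals[k] * v[cols[k]];
      }
      return fragment;
    }
  }

  // Assumes n is divisible by the number of blocks, as in the 1000x1000
  // blocking the slides describe.
  static double[] iterate(RowBlock[] places, int n, double[] v, int steps) {
    int blockRows = n / places.length;
    for (int iter = 0; iter < steps; iter++) {
      double[] next = new double[n];       // aggregation at "place 0"
      for (RowBlock p : places) {          // in X10: one async per place
        double[] frag = p.multiply(v, blockRows);
        System.arraycopy(frag, 0, next, p.rowOffset, blockRows);
      }
      v = next;                            // "broadcast" new V; G unchanged
    }
    return v;
  }
}
```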

NGS read classification using AI

Benjamin Voigt, Oliver Fischer, Christian Krumnow, Christian Herta, Piotr Wojciech Dabrowski, Yanbin Yin
2021 PLoS ONE  
Software: Benjamin Voigt, Oliver Fischer, Christian Krumnow. Supervision: Christian Herta, Piotr Wojciech Dabrowski. Writing (original draft): Benjamin Voigt, Oliver Fischer, Christian Krumnow, Christian Herta, Piotr Wojciech Dabrowski.
doi:10.1371/journal.pone.0261548 pmid:34936673 pmcid:PMC8694450 fatcat:vvywhxwklvacfebaze2d3nnnpi

GLB: Lifeline-based Global Load Balancing library in X10 [article]

Wei Zhang, Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, Mikio Takeuchi
2013 arXiv   pre-print
We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to statically load balance. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well, achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) -- up to 16K cores.
arXiv:1312.5691v1 fatcat:jdt65j6mbbg4bchrqdzvingzgu
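
The abstract's division of labor (users write sequential pieces, GLB handles distribution) can be suggested with a small interface sketch. The Java interface below paraphrases that contract: the framework repeatedly calls the user's sequential processing step, splits off work for thieves and lifelines, merges stolen work, and collects results. Method names and signatures here are assumptions for illustration, not the actual X10 GLB API.

```java
// Illustrative contract between user code and a GLB-style runtime.
// B is the type of a transferable bag of work, R the per-worker result.
public interface TaskQueue<B, R> {
  // Sequential user code: process up to n pending tasks, possibly
  // generating new ones; return false once the local queue is empty.
  boolean process(int n);

  // Give away a chunk of pending work to a thief (or down a lifeline).
  // Returning null signals there is nothing worth splitting off.
  B split();

  // Absorb a chunk of work received from another worker.
  void merge(B bag);

  // Local partial result, combined by the framework at termination.
  R result();
}
```

Everything else named in the abstract (inter-node communication, startup, load balancing, termination detection, result collection) would live behind this boundary, in the framework.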

X10 and APGAS at Petascale

Olivier Tardieu, Benjamin Herta, David Cunningham, David Grove, Prabhanjan Kambadur, Vijay Saraswat, Avraham Shinnar, Mikio Takeuchi, Mandana Vaziri
2014 SIGPLAN notices  
doi:10.1145/2692916.2555245 fatcat:l57fpwlkyjciti4frjljsljhs4

SatX10: A Scalable Plug&Play Parallel SAT Framework [chapter]

Bard Bloom, David Grove, Benjamin Herta, Ashish Sabharwal, Horst Samulowitz, Vijay Saraswat
2012 Lecture Notes in Computer Science  
We propose a framework for SAT researchers to conveniently try out new ideas in the context of parallel SAT solving without the burden of dealing with all the underlying system issues that arise when implementing a massively parallel algorithm. The framework is based on the parallel execution language X10, and allows the parallel solver to easily run on both a single machine with multiple cores and across multiple machines, sharing information such as learned clauses.
doi:10.1007/978-3-642-31612-8_38 fatcat:cozucflfgjhj3amyurhclgmvkq
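
The clause sharing the abstract mentions can be illustrated with a small sketch: parallel solver instances publish short learned clauses to per-peer inboxes and periodically import what the others have shared. The class below shows only this generic pattern; the names and the length-based sharing heuristic are assumptions, and SatX10's real plumbing is X10 code wrapping existing SAT solvers.

```java
// Generic learned-clause sharing among parallel SAT solver instances.
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public final class ClauseBoard {
  private final ConcurrentLinkedQueue<int[]>[] inbox; // one inbox per solver
  private final int maxLen;

  @SuppressWarnings("unchecked")
  public ClauseBoard(int solvers, int maxLen) {
    inbox = new ConcurrentLinkedQueue[solvers];
    for (int i = 0; i < solvers; i++) inbox[i] = new ConcurrentLinkedQueue<>();
    this.maxLen = maxLen;
  }

  // Solver `from` learned a clause; copy it to every other solver's inbox.
  // Sharing only short clauses is a common heuristic to keep traffic useful.
  public void publish(int from, int[] clause) {
    if (clause.length > maxLen) return;
    for (int i = 0; i < inbox.length; i++) {
      if (i != from) inbox[i].offer(clause.clone());
    }
  }

  // Solver `self` periodically drains clauses shared by its peers.
  public void drain(int self, List<int[]> sink) {
    int[] c;
    while ((c = inbox[self].poll()) != null) sink.add(c);
  }
}
```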

NeuNetS: An Automated Synthesis Engine for Neural Network Design [article]

Atin Sood, Benjamin Elder, Benjamin Herta, Chao Xue, Costas Bekas, A. Cristiano I. Malossi, Debashish Saha, Florian Scheidegger, Ganesh Venkataraman, Gegi Thomas, Giovanni Mariani, Hendrik Strobelt (+8 others)
2019 arXiv   pre-print
Application of neural networks to a vast variety of practical problems is transforming the way AI is applied in practice. Pre-trained neural network models available through APIs, and the capability to custom-train pre-built neural network architectures with customer data, have made the consumption of AI by developers much simpler and resulted in broad adoption of these complex AI models. While pre-built network models exist for certain scenarios, to meet the constraints that are unique to each application, AI teams need to develop custom neural network architectures that can meet the tradeoff between accuracy and memory footprint to achieve the tight constraints of their unique use-cases. However, only a small proportion of data science teams have the skills and experience needed to create a neural network from scratch, and the demand far exceeds the supply. In this paper, we present NeuNetS: an automated neural network synthesis engine for custom neural network design that is available as part of IBM's AI OpenScale product. NeuNetS is available for both text and image domains and can build neural networks for specific tasks in a fraction of the time it takes today with human effort, and with accuracy similar to that of human-designed AI models.
arXiv:1901.06261v1 fatcat:w4e4celkwfawpdu7vlwgcnzrqe

GLB

Wei Zhang, Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, Mikio Takeuchi
2014 Proceedings of the first workshop on Parallel programming for analytics applications - PPAA '14  
We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to statically load balance. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al [25]. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well, achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) -- up to 16K cores.
doi:10.1145/2567634.2567639 dblp:conf/ppopp/ZhangTGHKST14 fatcat:e2qr3awy7zfqnfjwslnks7khku

Dependability in a Multi-tenant Multi-framework Deep Learning as-a-Service Platform [article]

Scott Boag, Parijat Dube, Kaoutar El Maghraoui, Benjamin Herta, Waldemar Hummer, K. R. Jayaram, Rania Khalaf, Vinod Muthusamy, Michael Kalantar, Archit Verma
2018 arXiv   pre-print
Deep learning (DL), a form of machine learning, is becoming increasingly popular in several application domains. As a result, cloud-based Deep Learning as a Service (DLaaS) platforms have become an essential infrastructure in many organizations. These systems accept, schedule, manage and execute DL training jobs at scale. This paper explores dependability in the context of a DLaaS platform used in IBM. We begin by explaining how DL training workloads are different, and what features ensure dependability in this context. We then describe the architecture, design and implementation of a cloud-based orchestration system for DL training. We show how this system has been architected with dependability in mind while also being horizontally scalable, elastic, flexible and efficient. We also present an initial empirical evaluation of the overheads introduced by our platform, and discuss tradeoffs between efficiency and dependability.
arXiv:1805.06801v1 fatcat:pstnbqbpk5gonfqttjlkp5ji7a
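
One dependability ingredient the abstract implies, surviving worker failures while a training job is in flight, can be sketched as a job-lifecycle state machine with a bounded retry budget. The states and policy below are illustrative assumptions, not IBM's actual design.

```java
// Toy job-lifecycle tracking for a DLaaS-style orchestrator: a worker
// crash requeues the job until its retry budget is exhausted.
public final class TrainingJob {
  enum State { SUBMITTED, SCHEDULED, RUNNING, COMPLETED, FAILED }

  private State state = State.SUBMITTED;
  private int attempts = 0;
  private final int maxAttempts;

  public TrainingJob(int maxAttempts) { this.maxAttempts = maxAttempts; }

  public void schedule() { state = State.SCHEDULED; }
  public void start()    { state = State.RUNNING; attempts++; }
  public void complete() { state = State.COMPLETED; }

  // A worker crash or node eviction is not fatal to the job: resubmit it
  // until the retry budget runs out, then mark it failed.
  public void onWorkerFailure() {
    state = (attempts < maxAttempts) ? State.SUBMITTED : State.FAILED;
  }

  public State state() { return state; }
}
```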

Resilient X10

David Cunningham, David Grove, Benjamin Herta, Arun Iyengar, Kiyokuni Kawachiya, Hiroki Murata, Vijay Saraswat, Mikio Takeuchi, Olivier Tardieu
2014 Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '14  
[From the talk slides] Node failure is a reality on commodity clusters: hardware failures, memory errors, leaks, race conditions (including in the kernel), and evictions; the popularity of Hadoop is evidence. Ignoring failures causes serial MTBF aggregation: a 24-hour run on 1000 nodes with a 6-month node MTBF gives under a 1% success rate, while transparent checkpointing causes significant overhead. Resilient X10 instead offers failure awareness with helpful semantics: failure reporting, continuing execution on unaffected nodes, and preservation of synchronization via the Happens-Before Invariance (HBI) principle. Failure recovery is application-level and uses domain knowledge: if the computation is approximate, trade accuracy for reliability (e.g. Rinard, ICS06); if it is repeatable, replay it; if lost data is unmodified, reload it; if data is mutated, checkpoint it. Libraries can hide, abstract, or expose faults (e.g. containment domains), and common patterns (e.g. map reduce) can be captured via application frameworks. There are no changes to the language, but substantial changes to the runtime implementation: exceptions report failure, and existing exception semantics give strong synchronization guarantees. Performance is within 90% of non-resilient X10. The evaluation kernel, found in a number of algorithms (e.g. GNMF, PageRank), multiplies an N x N sparse (0.1% nonzero) matrix G by a dense vector V of length N, with the resulting vector used as V in the next iteration; the matrix block size is 1000x1000 in double precision, and G is distributed into row blocks. Every place starts with the entire V and computes a fragment of V'; the fragments are sent to place 0 for aggregation, and the new V is broadcast from place 0 for the next iteration (G is never modified).
doi:10.1145/2555243.2555248 dblp:conf/ppopp/CunninghamGHIKMSTT14 fatcat:zumxyvkjhneztervvth7hjokou

Stronger Pharmacological Cortisol Suppression and Anticipatory Cortisol Stress Response in Transient Global Amnesia

Martin Griebe, Frauke Nees, Benjamin Gerber, Anne Ebert, Herta Flor, Oliver T. Wolf, Achim Gass, Michael G. Hennerici, Kristina Szabo
2015 Frontiers in Behavioral Neuroscience  
† Martin Griebe and Frauke Nees have contributed equally to this work. Transient global amnesia (TGA) is a disorder characterized by a sudden attack of severe anterograde memory disturbance that is frequently preceded by emotional or physical stress and resolves within 24 h. By using MRI following the acute episode in TGA patients, small lesions in the hippocampus have been observed. Hence, it has been hypothesized that the disorder is caused by a stress-related transient inhibition of memory formation in the hippocampus. To study the factors that may link stress and TGA, we measured the cortisol day-profile, the dexamethasone feedback inhibition and the effect of experimental exposure to stress on cortisol levels (using the socially evaluated cold pressor test and a control procedure) in 20 patients with a recent history of TGA and in 20 healthy controls. We used self-report scales of depression, anxiety and stress, and a detailed neuropsychological assessment to characterize our collective. We did not observe differences in mean cortisol levels in the cortisol day-profile between the two groups. After administration of low-dose dexamethasone, TGA patients showed significantly stronger cortisol suppression in the daytime profile compared to the control group (p = 0.027). The mean salivary cortisol level was significantly higher in the TGA group prior to and after the experimental stress exposure (p = 0.008 and 0.010, respectively), as well as prior to and after the control condition (p = 0.022 and 0.024, respectively). The TGA group had higher scores of depressive symptomatology (p = 0.021) and anxiety (p = 0.007), but the groups did not differ in the neuropsychological assessment. Our findings of a stronger pharmacological suppression and higher cortisol levels in anticipation of experimental stress in participants with a previous TGA indicate a hypersensitivity of the HPA axis. This suggests that an individual stress sensitivity might play a role in the pathophysiology of TGA.
doi:10.3389/fnbeh.2015.00063 pmid:25805980 pmcid:PMC4353300 fatcat:jcymklfsg5bbnlo3z5qecnl2e4

X10 and APGAS at Petascale

Olivier Tardieu, Benjamin Herta, David Cunningham, David Grove, Prabhanjan Kambadur, Vijay Saraswat, Avraham Shinnar, Mikio Takeuchi, Mandana Vaziri
2014 Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '14  
doi:10.1145/2555243.2555245 dblp:conf/ppopp/TardieuHCGKSSTV14 fatcat:kyw5snbvu5abjlk2eehtdx74ge

Expression Profiling and Glycan Engineering of IgG Subclass 1–4 in Nicotiana benthamiana

Somanath Kallolimath, Thomas Hackl, Raphaela Gahn, Clemens Grünwald-Gruber, Wilhelm Zich, Benjamin Kogelmann, Anja Lux, Falk Nimmerjahn, Herta Steinkellner
2020 Frontiers in Bioengineering and Biotechnology  
IgG, the main serum immunoglobulin isotype, exists in four subclasses which selectively appear with distinctive glycosylation profiles. However, very little is known about the biological consequences, mainly due to the difficulties in the generation of distinct IgG subtypes with targeted glycosylation. Here, we show a comprehensive expression and glycan modulation profiling of IgG variants in planta that are identical in their antigen binding domain but differ in their subclass appearance. While IgG1, 2, and 4 exhibit similar expression levels and purification yields, IgG3 is generated only at low levels due to the in planta degradation of the heavy chain. All IgG subtypes are produced with four distinct N-glycosylation profiles, differing in sugar residues previously shown to impact IgG activities, i.e., galactosylation, sialylation and core fucosylation. Affinity purified IgG variants are shown to be fully assembled to heterodimers but display different biochemical/physical features. All subtypes are equally well amenable to targeted glycosylation, except sialylated IgG4, which frequently accumulates substantial fractions of unusual oligo-mannosidic structures. IgG variants show significant differences in aggregate formation and endotoxin contamination, which are eliminated by additional polishing steps (size exclusion chromatography, endotoxin removal treatments). Collectively, we demonstrate the generation of 16 IgG variants at high purity and large glycan homogeneity, which constitute an excellent toolbox to further study the biological impact of the two main Fc features, subclass and glycosylation.
doi:10.3389/fbioe.2020.00825 pmid:32793574 pmcid:PMC7393800 fatcat:rols6kb72va2teluxrfgc4gpii
Showing results 1 — 15 out of 730