Coiled: Dask as a Service

Matthew Rocklin
2021 Zenodo  
Pangeo Showcase seminar series talk. Coiled is a company that provides Dask as a service. In this talk we provide a short motivation for and introduction to the company and product, then give a live demo.
doi:10.5281/zenodo.4964489 fatcat:kayii7fworctvoeafoe53a4jge

Zarr in Pangeo [article]

Richard Signell, Alistair Miles, Ryan Abernathey, Joe Hamman, Matthew Rocklin
2019 Figshare  
This presentation was given in July 2019 at the Earth Science Information Partners (ESIP) Summer Meeting held in Tacoma, Washington.
doi:10.6084/m9.figshare.9701684.v1 fatcat:zk75mlwfg5aq3d2g34ohl6ex6m

On Clustering on Graphs with Multiple Edge Types [article]

Matthew Rocklin, Ali Pinar
2011 arXiv   pre-print
We study clustering on graphs with multiple edge types. Our main motivation is that similarities between objects can be measured in many different metrics. For instance similarity between two papers can be based on common authors, where they are published, keyword similarity, citations, etc. As such, graphs with multiple edges is a more accurate model to describe similarities between objects. Each edge/metric provides only partial information about the data; recovering full information requires aggregation of all the similarity metrics. Clustering becomes much more challenging in this context, since in addition to the difficulties of the traditional clustering problem, we have to deal with a space of clusterings. We generalize the concept of clustering in single-edge graphs to multi-edged graphs and investigate problems such as: Can we find a clustering that remains good, even if we change the relative weights of metrics? How can we describe the space of clusterings efficiently? Can we find unexpected clusterings (a good clustering that is distant from all given clusterings)? If given the ground-truth clustering, can we recover how the weights for edge types were aggregated? In this paper, we discuss these problems and the underlying algorithmic challenges and propose some solutions. We also present two case studies: one based on papers on Arxiv and one based on CIA World Factbook.
arXiv:1109.1605v1 fatcat:hhkh7zl4yjdxfguson3zcxl3w4

On Clustering on Graphs with Multiple Edge Types

Matthew Rocklin, Ali Pinar
2013 Internet Mathematics  
We study clustering on graphs with multiple edge types. Our main motivation is that similarities between objects can be measured by many different metrics. For instance, similarity between two papers can be based on common authors, where they were published, keyword similarity, citations, etc. As such, graphs with multiple edges give a more accurate model to describe similarities between objects than models using single-edge graphs. Each edge/metric provides only partial information about the data; recovering full information requires aggregation of all the similarity metrics. Clustering becomes much more challenging in this context, since in addition to the difficulties of the traditional clustering problem, we have to deal with a space of clusterings. Reducing the multidimensional space into a single dimension poses significant challenges. At the same time, the multidimensional space can contain latent structures, and searching this multidimensional space can reveal important information about the graph. We generalize the concept of clustering in single-edge graphs to multiedged graphs and investigate problems such as the following: Can we find a clustering that remains good, even if we change the relative weights of metrics? How can we describe the space of clusterings efficiently? Can we find unexpected clusterings (a good clustering that is distant from all given clusterings)? If we are given the ground-truth clustering, can we recover how the weights for edge types were aggregated?
doi:10.1080/15427951.2012.678191 fatcat:durcqj6lcjcddnzoi7vzzwwgga

Latent Clustering on Graphs with Multiple Edge Types [chapter]

Matthew Rocklin, Ali Pinar
2011 Lecture Notes in Computer Science  
We study clustering on graphs with multiple edge types. Our main motivation is that similarities between objects can be measured in many different metrics, and so allowing graphs with multivariate edges significantly increases modeling power. In this context the clustering problem becomes more challenging. Each edge/metric provides only partial information about the data; recovering full information requires aggregation of all the similarity metrics. We generalize the concept of clustering in single-edge graphs to multi-edged graphs and discuss how this generates a space of clusterings. We describe a meta-clustering structure on this space and propose methods to compactly represent the metaclustering structure. Experimental results on real and synthetic data are presented.
doi:10.1007/978-3-642-21286-4_4 fatcat:n22twfsjvbewhetqqsbzjiecwa

Pangeo NSF Earthcube Proposal

Ryan Abernathey, Kevin Paul, Joe Hamman, Matthew Rocklin, Chiara Lepore, Michael Tippett, Naomi Henderson, Richard Seager, Ryan May, Davide Del Vento
2017 Figshare  
Rocklin.  ...  The creator of Dask, Matt Rocklin of Continuum Analytics, is a PI on this proposal. Dask can be used at either a high level or low level.  ... 
doi:10.6084/m9.figshare.5361094.v1 fatcat:lgj5vrhhnfa45haoj7kfizfgfi

Computing an Aggregate Edge-Weight Function for Clustering Graphs with Multiple Edge Types [chapter]

Matthew Rocklin, Ali Pinar
2010 Lecture Notes in Computer Science  
We investigate the community detection problem on graphs in the existence of multiple edge types. Our main motivation is that similarity between objects can be defined by many different metrics and aggregation of these metrics into a single one poses several important challenges, such as recovering this aggregation function from ground-truth, investigating the space of different clusterings, etc. In this paper, we address how to find an aggregation function to generate a composite metric that best resonates with the ground-truth. We describe two approaches: solving an inverse problem where we try to find parameters that generate a graph whose clustering gives the ground-truth clustering, and choosing parameters to maximize the quality of the ground-truth clustering. We present experimental results on real and synthetic benchmarks.
doi:10.1007/978-3-642-18009-5_4 fatcat:7rjttfqrkraqxckcatfttkc7xm
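
The idea of an aggregation function in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible form, a weighted linear combination of per-type edge weights; the abstract does not state that the authors use this form, and the function name `aggregate_edge_weights`, the coefficient dictionary `alphas`, and the toy edge types are all hypothetical.

```python
from collections import defaultdict


def aggregate_edge_weights(typed_edges, alphas):
    """Combine per-type edge weights into one composite weight per edge.

    typed_edges: {edge_type: {(u, v): weight}}
    alphas:      {edge_type: coefficient}  (assumed linear aggregation)
    """
    composite = defaultdict(float)
    for edge_type, edges in typed_edges.items():
        alpha = alphas.get(edge_type, 0.0)  # unknown types contribute nothing
        for (u, v), weight in edges.items():
            key = (min(u, v), max(u, v))  # treat edges as undirected
            composite[key] += alpha * weight
    return dict(composite)


# Toy data: two similarity metrics over four papers (hypothetical values).
typed_edges = {
    "coauthor": {("a", "b"): 1.0, ("b", "c"): 0.5},
    "citation": {("b", "a"): 2.0, ("c", "d"): 1.0},
}
alphas = {"coauthor": 0.5, "citation": 0.25}

composite = aggregate_edge_weights(typed_edges, alphas)
```

The resulting single-weight graph could then be passed to any ordinary clustering routine; choosing `alphas` so that the clustering matches a ground truth is the harder problem the entry describes, and this sketch does not attempt it.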

AGU2018- IN53A-03: Pangeo and Binder: Scalable, shareable and reproducible scientific computing environments for the geosciences (Invited) [article]

Joseph Hamman, Ryan Abernathey, Chris Holdgraf, Yuvi Panda, Matthew Rocklin
2018 Figshare  
Abstract: Cloud computing and containerization offer a new paradigm for scientific research by providing a platform for scalable computing and frameworks that can be used to improve reproducibility. In this presentation, we will describe how Pangeo, a community driven effort for open-source big-data approaches in the geosciences, is enabling scalable cloud-based workflows using tools such as Kubernetes, Jupyter, Dask, and Xarray. We will also demonstrate how the Pangeo approach can be combined with data-proximate deployments of BinderHub, a tool that packages and deploys software onto a cloud-based JupyterHub, to make those scalable workflows easy to share and reproduce.
doi:10.6084/m9.figshare.7492661.v1 fatcat:aqxz3uxmsreotod2pspqbuspse

Computing an Aggregate Edge-Weight Function for Clustering Graphs with Multiple Edge Types [article]

Matthew Rocklin, Ali Pinar
2011 arXiv   pre-print
We investigate the community detection problem on graphs in the existence of multiple edge types. Our main motivation is that similarity between objects can be defined by many different metrics and aggregation of these metrics into a single one poses several important challenges, such as recovering this aggregation function from ground-truth, investigating the space of different clusterings, etc. In this paper, we address how to find an aggregation function to generate a composite metric that best resonates with the ground-truth. We describe two approaches: solving an inverse problem where we try to find parameters that generate a graph whose clustering gives the ground-truth clustering, and choosing parameters to maximize the quality of the ground-truth clustering. We present experimental results on real and synthetic benchmarks.
arXiv:1103.0368v2 fatcat:ijqqrk54fbg43kuqotrrvkkuum

A Computational Framework for Uncertainty Quantification and Stochastic Optimization in Unit Commitment With Wind Power Generation

Emil M. Constantinescu, Victor M. Zavala, Matthew Rocklin, Sangmin Lee, Mihai Anitescu
2011 IEEE Transactions on Power Systems  
Matthew Rocklin is a Ph.D. computational mathematics student in the Computer Science Department at the University of Chicago.  ... 
doi:10.1109/tpwrs.2010.2048133 fatcat:cslwklj55nbt5jjrft7msgmj2u

SymPy: symbolic computing in Python

Aaron Meurer, Christopher P. Smith, Mateusz Paprocki, Ondřej Čertík, Sergey B. Kirpichev, Matthew Rocklin, AMiT Kumar, Sergiu Ivanov, Jason K. Moore, Sartaj Singh, Thilina Rathnayake, Sean Vig (+15 others)
2017 PeerJ Computer Science  
SymPy is an open source computer algebra system written in pure Python. It is built with a focus on extensibility and ease of use, through both interactive and programmatic applications. These characteristics have led SymPy to become a popular symbolic library for the scientific Python ecosystem. This paper presents the architecture of SymPy, a description of its features, and a discussion of select submodules. The supplementary material provides additional examples and further outlines details of the architecture and features of SymPy.
doi:10.7717/peerj-cs.103 fatcat:f2mwkmqosrd5lepcej76cwalt4

Large-scale design and refinement of stable proteins using sequence-only models [article]

Jedediah M Singer, Scott Novotney, Devin Strickland, Hugh K Haddox, Nicholas Leiby, Gabriel J Rocklin, Cameron M Chow, Anindya Roy, Asim K Bera, Francis C Motta, Longxing Cao, Eva-Maria Strauch (+11 others)
2021 bioRxiv   pre-print
(2013) ; Rocklin et al. (2017) ).  ...  Library Name of dataset Source of designs Source of experimental data Library 1 Rocklin Rocklin et al., 2017 Rocklin et al., 2017 Library 2 Eva1 Linsky et al., 2021 This paper Eva2  ... 
doi:10.1101/2021.03.12.435185 fatcat:uxkfj2fuebb3xohix76b4dxfs4

Computational design of a synthetic PD-1 agonist

Cassie M. Bryan, Gabriel J. Rocklin, Matthew J. Bick, Alex Ford, Sonia Majri-Morrison, Ashley V. Kroll, Chad J. Miller, Lauren Carter, Inna Goreshnik, Alex Kang, Frank DiMaio, Kristin V. Tarbell (+1 others)
2021 Proceedings of the National Academy of Sciences of the United States of America  
Programmed cell death protein-1 (PD-1) expressed on activated T cells inhibits T cell function and proliferation to prevent an excessive immune response, and disease can result if this delicate balance is shifted in either direction. Tumor cells often take advantage of this pathway by overexpressing the PD-1 ligand PD-L1 to evade destruction by the immune system. Alternatively, if there is a decrease in function of the PD-1 pathway, unchecked activation of the immune system and autoimmunity can result. Using a combination of computation and experiment, we designed a hyperstable 40-residue miniprotein, PD-MP1, that specifically binds murine and human PD-1 at the PD-L1 interface with a Kd of ∼100 nM. The apo crystal structure shows that the binder folds as designed with a backbone RMSD of 1.3 Å to the design model. Trimerization of PD-MP1 resulted in a PD-1 agonist that strongly inhibits murine T cell activation. This small, hyperstable PD-1 binding protein was computationally designed with an all-beta interface, and the trimeric agonist could contribute to treatments for autoimmune and inflammatory diseases.
doi:10.1073/pnas.2102164118 pmid:34272285 pmcid:PMC8307378 fatcat:7yp4oefx2zg2padq47z6h2jdh4

Deformation of Crystals: Connections with Statistical Physics

James P. Sethna, Matthew K. Bierbaum, Karin A. Dahmen, Carl P. Goodrich, Julia R. Greer, Lorien X. Hayden, Jaron P. Kent-Dobias, Edward D. Lee, Danilo B. Liarte, Xiaoyue Ni, Katherine N. Quinn, Archishman Raju (+3 others)
2017 Annual review of materials research (Print)  
We give a bird's-eye view of the plastic deformation of crystals aimed at the statistical physics community, and a broad introduction into the statistical theories of forced rigid systems aimed at the plasticity community. Memory effects in magnets, spin glasses, charge density waves, and dilute colloidal suspensions are discussed in relation to the onset of plastic yielding in crystals. Dislocation avalanches and complex dislocation tangles are discussed via a brief introduction to the renormalization group and scaling. Analogies to emergent scale invariance in fracture, jamming, coarsening, and a variety of depinning transitions are explored. Dislocation dynamics in crystals challenges non equilibrium statistical physics. Statistical physics provides both cautionary tales of subtle memory effects in nonequilibrium systems, and systematic tools designed to address complex scale-invariant behavior on multiple length and time scales.
doi:10.1146/annurev-matsci-070115-032036 fatcat:llwsa3phljeedpshd25zxdom4q

Uncertainty Modeling with SymPy Stats

Matthew Rocklin
2012 Proceedings of the 11th Python in Science Conference   unpublished
We add a random variable type to a mathematical modeling language. We demonstrate through examples how this is a highly separable way to introduce uncertainty and produce and query stochastic models. We motivate the use of symbolics and thin compilers in scientific computing.
doi:10.25080/majora-54c7f2c8-009 fatcat:m53ycuaorbb3toihjp7ejq6ffe
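
As a brief illustration of the random variable type described in the entry above, the following sketch uses the `sympy.stats` module as it exists in current SymPy releases. The dice example is an assumption chosen for illustration, not one taken from the paper, and it requires SymPy to be installed.

```python
from sympy import Rational
from sympy.stats import Die, E, P

# Two independent fair six-sided dice as symbolic random variables.
X = Die("X", 6)
Y = Die("Y", 6)

# Random variables combine like ordinary symbolic expressions.
total = X + Y

# Query the stochastic model: expectation and the probability of an event.
expected_total = E(total)   # exact rational arithmetic, no sampling
prob_high = P(total > 10)   # outcomes (5,6), (6,5), (6,6) out of 36

assert expected_total == 7
assert prob_high == Rational(1, 12)
```

Because the queries are evaluated symbolically, the results are exact values and not Monte Carlo estimates.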