Do CIFAR-10 Classifiers Generalize to CIFAR-10? [article]

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar
2018 arXiv   pre-print
Acknowledgements Benjamin Recht and Vaishaal Shankar are supported by ONR award N00014-17-1-2502.  ... 
arXiv:1806.00451v1 fatcat:vluhmz6sdbd6vpnwvrwuuehe5u

numpywren: serverless linear algebra [article]

Vaishaal Shankar, Karl Krauth, Qifan Pu, Eric Jonas, Shivaram Venkataraman, Ion Stoica, Benjamin Recht, Jonathan Ragan-Kelley
2018 arXiv   pre-print
Linear algebra operations are widely used in scientific computing and machine learning applications. However, it is challenging for scientists and data analysts to run linear algebra at scales beyond a single machine. Traditional approaches either require access to supercomputing clusters, or impose configuration and cluster management challenges. In this paper we show how the disaggregation of storage and compute resources in so-called "serverless" environments, combined with compute-intensive workload characteristics, can be exploited to achieve elastic scalability and ease of management. We present numpywren, a system for linear algebra built on a serverless architecture. We also introduce LAmbdaPACK, a domain-specific language designed to implement highly parallel linear algebra algorithms in a serverless setting. We show that, for certain linear algebra algorithms such as matrix multiply, singular value decomposition, and Cholesky decomposition, numpywren's performance (completion time) is within 33% of ScaLAPACK, and its compute efficiency (total CPU-hours) is up to 240% better due to elasticity, while providing an easier to use interface and better fault tolerance. At the same time, we show that the inability of serverless runtimes to exploit locality across the cores in a machine fundamentally limits their network efficiency, which limits performance on other algorithms such as QR factorization. This highlights how cloud providers could better support these types of computations through small changes in their infrastructure.
arXiv:1810.09679v1 fatcat:lxv65s42rzd6vomkg2pff2akhm

Neural Kernels Without Tangents [article]

Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Ludwig Schmidt, Jonathan Ragan-Kelley, Benjamin Recht
2020 arXiv   pre-print
We investigate the connections between neural networks and simple building blocks in kernel space. In particular, using well established feature space tools such as direct sum, averaging, and moment lifting, we present an algebra for creating "compositional" kernels from bags of features. We show that these operations correspond to many of the building blocks of "neural tangent kernels (NTK)". Experimentally, we show that there is a correlation in test error between neural network architectures and the associated kernels. We construct a simple neural network architecture using only 3x3 convolutions, 2x2 average pooling, ReLU, and optimized with SGD and MSE loss that achieves 96% accuracy on CIFAR10, and whose corresponding compositional kernel achieves 90% accuracy. We also use our constructions to investigate the relative performance of neural networks, NTKs, and compositional kernels in the small dataset regime. In particular, we find that compositional kernels outperform NTKs and neural networks outperform both kernel methods.
arXiv:2003.02237v2 fatcat:gmynxkneszb2pbbtpyq3qehryu

Serverless Straggler Mitigation using Local Error-Correcting Codes [article]

Vipul Gupta, Dominic Carrano, Yaoqing Yang, Vaishaal Shankar, Thomas Courtade, Kannan Ramchandran
2020 arXiv   pre-print
Inexpensive cloud services, such as serverless computing, are often vulnerable to straggling nodes that increase end-to-end latency for distributed computation. We propose and implement simple yet principled approaches for straggler mitigation in serverless systems for matrix multiplication and evaluate them on several common applications from machine learning and high-performance computing. The proposed schemes are inspired by error-correcting codes and employ parallel encoding and decoding over the data stored in the cloud using serverless workers. This creates a fully distributed computing framework without using a master node to conduct encoding or decoding, which removes the computation, communication and storage bottleneck at the master. On the theory side, we establish that our proposed scheme is asymptotically optimal in terms of decoding time and provide a lower bound on the number of stragglers it can tolerate with high probability. Through extensive experiments, we show that our scheme outperforms existing schemes such as speculative execution and other coding theoretic methods by at least 25%.
arXiv:2001.07490v1 fatcat:ptbzh4ld3jezphosqkylgpadni

Do ImageNet Classifiers Generalize to ImageNet? [article]

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar
2019 arXiv   pre-print
We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.
arXiv:1902.10811v2 fatcat:d32xzqv56fbxfh26omlfy6g4bi

Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction [article]

Alyssa Morrow, Vaishaal Shankar, Devin Petersohn, Anthony Joseph, Benjamin Recht, Nir Yosef
2017 arXiv   pre-print
We present a simple and efficient method for prediction of transcription factor binding sites from DNA sequence. Our method computes a random approximation of a convolutional kernel feature map from DNA sequence and then learns a linear model from the approximated feature map. Our method outperforms state-of-the-art deep learning methods on five out of six test datasets from the ENCODE consortium, while training in less than one eighth the time.
arXiv:1706.00125v1 fatcat:36buw2z3rvao7deae3a62nxzke

Predicting with Confidence on Unseen Distributions [article]

Devin Guillory, Vaishaal Shankar, Sayna Ebrahimi, Trevor Darrell, Ludwig Schmidt
2021 arXiv   pre-print
Recent work has shown that the performance of machine learning models can vary substantially when models are evaluated on data drawn from a distribution that is close to but different from the training distribution. As a result, predicting model performance on unseen distributions is an important challenge. Our work connects techniques from domain adaptation and predictive uncertainty literature, and allows us to predict model accuracy on challenging unseen distributions without access to labeled data. In the context of distribution shift, distributional distances are often used to adapt models and improve their performance on new domains; however, accuracy estimation, or other forms of predictive uncertainty, are often neglected in these investigations. Through investigating a wide range of established distributional distances, such as Frechet distance or Maximum Mean Discrepancy, we determine that they fail to induce reliable estimates of performance under distribution shift. On the other hand, we find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts. We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference. DoC reduces predictive error by almost half (46%) on several realistic and challenging distribution shifts, e.g., on the ImageNet-Vid-Robust and ImageNet-Rendition datasets.
arXiv:2107.03315v2 fatcat:ewlcswqpljfmjmsranrap6dkw4
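
The difference of confidences described in the abstract above can be illustrated with a minimal sketch. It assumes that "confidence" means the largest predicted class probability per example, and that the estimated accuracy on the shifted data is the base accuracy minus DoC. Both choices, and all function names, are assumptions made for illustration and are not taken from the paper's code.

```python
def average_confidence(probs):
    """Mean of the largest class probability over all examples.

    `probs` is a list of per-example probability rows (assumed to sum to 1).
    """
    if not probs:
        raise ValueError("need at least one prediction")
    return sum(max(row) for row in probs) / len(probs)


def difference_of_confidences(base_probs, target_probs):
    """DoC: average confidence on base data minus average confidence on shifted data."""
    return average_confidence(base_probs) - average_confidence(target_probs)


def estimate_target_accuracy(base_accuracy, base_probs, target_probs):
    """Assumed use of DoC: subtract it from the known base accuracy.

    No labels for the shifted data are needed; the result is clamped to [0, 1].
    """
    doc = difference_of_confidences(base_probs, target_probs)
    return min(1.0, max(0.0, base_accuracy - doc))
```

For instance, if a classifier is on average 0.85 confident on held-out base data and 0.65 confident on shifted data, this sketch would predict an accuracy drop of about 0.20 relative to the measured base accuracy.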

Ground Control to Major Tom: the importance of field surveys in remotely sensed data analysis [article]

Ian Bolliger, Jonathan Kadish, Vaishaal Shankar
2017 arXiv   pre-print
In this project, we build a modular, scalable system that can collect, store, and process millions of satellite images. We test the relative importance of both of the key limitations constraining the prevailing literature by applying this system to a data-rich environment. To overcome classic data availability concerns, and to quantify their implications in an economically meaningful context, we operate in a data-rich environment and work with an outcome variable directly correlated with key indicators of socioeconomic well-being. We collect public records of sale prices of homes within the United States, and then gradually degrade our rich sample in a range of different ways which mimic the sampling strategies employed in actual survey-based datasets. Pairing each house with a corresponding set of satellite images, we use image-based features to predict housing prices within each of these degraded samples. To generalize beyond any given featurization methodology, our system contains an independent featurization module, which can be interchanged with any preferred image classification tool. Our initial findings demonstrate that while satellite imagery can be used to predict housing prices with considerable accuracy, the size and nature of the ground truth sample is a fundamental determinant of the usefulness of imagery for this category of socioeconomic prediction. We quantify the returns to improving the distribution and size of observed data, and show that the image classification method is a second-order concern. Our results provide clear guidance for the development of adaptive sampling strategies in data-sparse locations where satellite-based metrics may be integrated with standard survey data, while also suggesting that advances from image classification techniques for satellite imagery could be further augmented by more robust sampling strategies.
arXiv:1710.09342v1 fatcat:onko3dislrajrlqnubppfdtasm

A Generalizable and Accessible Approach to Machine Learning with Global Satellite Imagery [article]

Esther Rolf, Jonathan Proctor, Tamma Carleton, Ian Bolliger, Vaishaal Shankar, Miyabi Ishihara, Benjamin Recht, Solomon Hsiang
2020 arXiv   pre-print
[57] Ian Bolliger, Tamma Carleton, Solomon Hsiang, Jonathan Kadish, Jonathan Proctor, Benjamin Recht, Esther Rolf, and Vaishaal Shankar.  ...  [48] Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do ImageNet Classifiers Generalize to ImageNet? In International Conference on Machine Learning, pages 5389-5400, 2019.  ... 
arXiv:2010.08168v1 fatcat:pdxqlxejabdabbqozgdrcjwt7e

Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) [article]

Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt
2022 arXiv   pre-print
Contrastively trained image-text models such as CLIP, ALIGN, and BASIC have demonstrated unprecedented robustness to multiple challenging natural distribution shifts. Since these image-text models differ from previous training approaches in several ways, an important question is what causes the large robustness gains. We answer this question via a systematic experimental investigation. Concretely, we study five different possible causes for the robustness gains: (i) the training set size, (ii)
more » ... he training distribution, (iii) language supervision at training time, (iv) language supervision at test time, and (v) the contrastive loss function. Our experiments show that the more diverse training distribution is the main cause for the robustness gains, with the other factors contributing little to no robustness. Beyond our experimental results, we also introduce ImageNet-Captions, a version of ImageNet with original text annotations from Flickr, to enable further controlled experiments of language-image training.
arXiv:2205.01397v1 fatcat:qrjrygxpwzhupcm26x4omhplbe

Do Image Classifiers Generalize Across Time? [article]

Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
2019 arXiv   pre-print
We study the robustness of image classifiers to temporal perturbations derived from videos. As part of this study, we construct two datasets, ImageNet-Vid-Robust and YTBB-Robust, containing a total of 57,897 images grouped into 3,139 sets of perceptually similar images. Our datasets were derived from ImageNet-Vid and Youtube-BB respectively and thoroughly re-annotated by human experts for image similarity. We evaluate a diverse array of classifiers pre-trained on ImageNet and show a median classification accuracy drop of 16 and 10 on our two datasets. Additionally, we evaluate three detection models and show that natural perturbations induce both classification as well as localization errors, leading to a median drop in detection mAP of 14 points. Our analysis demonstrates that perturbations occurring naturally in videos pose a substantial and realistic challenge to deploying convolutional neural networks in environments that require both reliable and low-latency predictions.
arXiv:1906.02168v3 fatcat:trmemkurqndffb7i553eaknfma

Measuring Robustness to Natural Distribution Shifts in Image Classification [article]

Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
2020 arXiv   pre-print
This is the "pm-k" metric introduced by Shankar et al. [76]. Dataset shifts.  ...  In CVPR, 2017. [68] Recht, B., Roelofs, R., Schmidt, L., and Shankar, V. Do imagenet classifiers generalize to imagenet?  ... 
arXiv:2007.00644v2 fatcat:ef6s3w4ignam7a2cvbvkps6ixm

Flare Prediction Using Photospheric and Coronal Image Data

Eric Jonas, Monica Bobra, Vaishaal Shankar, J. Todd Hoeksema, Benjamin Recht
2018 Solar Physics  
The precise physical process that triggers solar flares is not currently understood. Here we attempt to capture the signature of this mechanism in solar image data of various wavelengths and use these signatures to predict flaring activity. We do this by developing an algorithm that [1] automatically generates features in 5.5 TB of image data taken by the Solar Dynamics Observatory of the solar photosphere, chromosphere, transition region, and corona during the time period between May 2010 and May 2014, [2] combines these features with other features based on flaring history and a physical understanding of putative flaring processes, and [3] classifies these features to predict whether a solar active region will flare within a time period of T hours, where T = 2 and 24. We find that when optimizing for the True Skill Score (TSS), photospheric vector magnetic field data combined with flaring history yields the best performance, and when optimizing for the area under the precision-recall curve, all the data are helpful. Our model performance yields a TSS of 0.84 ± 0.03 and 0.81 ± 0.03 in the T = 2 and 24 hour cases, respectively, and a value of 0.13 ± 0.07 and 0.43 ± 0.08 for the area under the precision-recall curve in the T = 2 and 24 hour cases, respectively. These relatively high scores are similar to, but not greater than, other attempts to predict solar flares. Given the similar values of algorithm performance across various types of models reported in the literature, we conclude that we can expect a certain baseline predictive capacity using these data. This is the first attempt to predict solar flares using photospheric vector magnetic field data as well as multiple wavelengths of image data from the chromosphere, transition region, and corona.
doi:10.1007/s11207-018-1258-9 fatcat:pswgzof6xndbdpbyv3wpqkhcga
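
The True Skill Score named in the abstract above is conventionally defined as the true positive rate minus the false positive rate. The sketch below computes it from confusion-matrix counts. The function name and argument order are illustrative assumptions, and the abstract itself does not spell out the formula.

```python
def true_skill_score(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN).

    Ranges from -1 to 1; 0 corresponds to no skill, and the value does not
    depend on the ratio of flaring to non-flaring examples.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative example")
    true_positive_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return true_positive_rate - false_positive_rate
```

Because flares are rare events, a score that is insensitive to class imbalance is a natural optimization target in this setting.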

Approximate Subgraph Isomorphism for Image Localization

Vaishaal Shankar, Jordan Zhang, Jerry Chen, Christopher Dinh, Matthew Clements, Avideh Zakhor
2016 IS&T International Symposium on Electronic Imaging Science and Technology  
We propose a system for user-aided image localization in urban regions by exploiting the inherent graph-like structure of urban streets, buildings and intersections. In this graph the nodes represent buildings, intersections and roads. The edges represent "logical links" such as two buildings being next to each other, or a building being on a road. We generate this graph automatically for large areas using publicly available road and building footprint data. To localize a query image, a user generates a similar graph manually by identifying the buildings, intersections and roads in the image. We then run a subgraph isomorphism algorithm to find candidate locations for the query image. We evaluate our system on regions of multiple sizes ranging from 2 km² to 47 km² in Amman, Jordan and Berkeley, CA, USA. We have found that in many cases we reduce the uncertainty in the query's location by as much as 90 percent.
doi:10.2352/issn.2470-1173.2016.15.ipas-191 fatcat:6vjijfmgpndavn74u7r5ae2mn4

Reviewer Integration and Performance Measurement for Malware Detection [article]

Brad Miller, Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Rekha Bachwani, Riyaz Faizullabhoy, Ling Huang, Vaishaal Shankar, Tony Wu, George Yiu, Anthony D. Joseph, J. D. Tygar
2016 arXiv   pre-print
We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource. We demonstrate that even in small numbers, reviewers can vastly improve the system's ability to keep pace with evolving threats. We conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years and containing 1.1 million binaries with 778GB of raw feature data. Without reviewer assistance, we achieve 72% detection at a 0.5% false positive rate, performing comparably to the best vendors on VirusTotal. Given a budget of 80 accurate reviews daily, we improve detection to 89% and are able to detect 42% of malicious binaries undetected upon initial submission to VirusTotal. Additionally, we identify a previously unnoticed temporal inconsistency in the labeling of training datasets. We compare the impact of training labels obtained at the same time training data is first seen with training labels obtained months later. We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points. We release our cluster-based implementation, as well as a list of all hashes in our evaluation and 3% of our entire dataset.
arXiv:1510.07338v2 fatcat:kr6r3uocrjgulcfwme4gxenyi4