Hierarchical Infinite Relational Model [article]

Feras A. Saad, Vikash K. Mansinghka
2021 arXiv   pre-print
et al., 2017), as well as enable fast exact inference (Saad et al., 2021) for the broad range of probabilistic queries that the HIRM can handle.  ...  (7)) has been considered in other settings, including non-relational tabular data, multivariate time series (Saad and Mansinghka, 2018), topic modeling (Blei et al., 2010), and computer vision  ... 
arXiv:2108.07208v1 fatcat:455leyuanbf7rexhpa2xmffxhi

Probabilistic Data Analysis with Probabilistic Programming [article]

Feras Saad, Vikash Mansinghka
2016 arXiv   pre-print
Probabilistic techniques are central to data analysis, but different approaches can be difficult to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include hierarchical Bayesian models, multivariate kernel methods, discriminative machine learning, clustering algorithms,
... nsionality reduction, and arbitrary probabilistic programs. We also demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling language and a structured query language. The practical value is illustrated in two ways. First, CGPMs are used in an analysis that identifies satellite data records which probably violate Kepler's Third Law, by composing causal probabilistic programs with non-parametric Bayes in under 50 lines of probabilistic code. Second, for several representative data analysis tasks, we report on lines of code and accuracy measurements of various CGPMs, plus comparisons with standard baseline solutions from Python and MATLAB libraries.
arXiv:1608.05347v1 fatcat:cy3ddgzb5rdzxctz7lfzoecm4u

Probabilistic Search for Structured Data via Probabilistic Programming and Nonparametric Bayes [article]

Feras Saad, Leonardo Casarsa, Vikash Mansinghka
2017 arXiv   pre-print
As shown by Saad and Mansinghka (2017), structural dependencies induced by CrossCat's variable partition are related to an upper-bound on the probability there exists a statistical dependence between  ...  This assumption suffices for our development of predictive relevance, and is applicable to a broad class of statistical data types (Saad and Mansinghka, 2016) with conjugate prior-likelihood representations  ...  ., 2015; Saad and Mansinghka, 2016), a probabilistic programming platform for probabilistic data analysis.  ... 
arXiv:1704.01087v1 fatcat:6nye6jpv2fg5zlvt6rvfwezfda

Intelligent Key (IKey) The key of Securing Cars

Sultan Saad Alshamrani, Abdulaziz Othman, Ahmed Saad, Salem Alaidaros, Feras Abo Alaoun, Mohammed Sadiq, Ramiz Farouq
2020 INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY  
With the advancement of technology, applications have become an indispensable part of daily life to solve many problems. Technology is one of the important contents that promotes communication between people and enhances the acquisition of information. Therefore, the development of information technology is the reason for the major scientific and knowledge revolution, which has an impact on the convenience of human life in various scientific journals. To recognize our
... ity, we came up with the idea of replacing physical objects with applications that increase usability in our life. As we strive to develop a smart car control application, the application can save the driver from the car key. Therefore, the application controls certain functions in the car, such as opening and closing doors and starting and stopping the engine. The app connects to the car, so the user does not need to carry the car key. As a result, now we are able to provide convenience and luxury to our customers and end-users.
doi:10.24297/ijct.v20i.8859 fatcat:fex3vewa55gv5batq2q7rh6vca

A Probabilistic Programming Approach To Probabilistic Data Analysis

Feras Saad, Vikash K. Mansinghka
2016 Neural Information Processing Systems  
Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms,
... d arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler's Third Law by composing causal probabilistic programs with nonparametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries.
dblp:conf/nips/SaadM16 fatcat:kb2luygjfnbjtjmo7qhw6pneea

A Family of Exact Goodness-of-Fit Tests for High-Dimensional Discrete Distributions [article]

Feras A. Saad, Cameron E. Freer, Nathanael L. Ackerman, Vikash K. Mansinghka
2019 arXiv   pre-print
The objective of goodness-of-fit testing is to assess whether a dataset of observations is likely to have been drawn from a candidate probability distribution. This paper presents a rank-based family of goodness-of-fit tests that is specialized to discrete distributions on high-dimensional domains. The test is readily implemented using a simulation-based, linear-time procedure. The testing procedure can be customized by the practitioner using knowledge of the underlying data domain. Unlike most
existing test statistics, the proposed test statistic is distribution-free and its exact (non-asymptotic) sampling distribution is known in closed form. We establish consistency of the test against all alternatives by showing that the test statistic is distributed as a discrete uniform if and only if the samples were drawn from the candidate distribution. We illustrate its efficacy for assessing the sample quality of approximate sampling algorithms over combinatorially large spaces with intractable probabilities, including random partitions in Dirichlet process mixture models and random lattices in Ising models.
arXiv:1902.10142v1 fatcat:if3sqzvoenedxgjdi7dp53voxi

Application of Time-Voltage Characteristics in Overcurrent Scheme to Reduce Arc-Flash Incident Energy for Safety and Reliability of Microgrid Protection

Feras Alasali, Saad M. Saad, Naser El-Naily, Anis Layas, Abdelsalam Elhaffar, Tawfiq Hussein, Faisal A. Mohamed
2021 Energies  
In Saad et al. [35], the primary investigation, which used OCRs to minimize the AFIE, was presented by employing the water cycle algorithm.  ... 
doi:10.3390/en14238074 fatcat:mijwfigy2zcrxeeik6sin7qexe

Time Series Structure Discovery via Probabilistic Program Synthesis [article]

Ulrich Schaechtle, Feras Saad, Alexey Radul, Vikash Mansinghka
2017 arXiv   pre-print
There is a widespread need for techniques that can discover structure from time series data. Recently introduced techniques such as Automatic Bayesian Covariance Discovery (ABCD) provide a way to find structure within a single time series by searching through a space of covariance kernels that is generated using a simple grammar. While ABCD can identify a broad class of temporal patterns, it is difficult to extend and can be brittle in practice. This paper shows how to extend ABCD by
... it in terms of probabilistic program synthesis. The key technical ideas are to (i) represent models using abstract syntax trees for a domain-specific probabilistic language, and (ii) represent the time series model prior, likelihood, and search strategy using probabilistic programs in a sufficiently expressive language. The final probabilistic program is written in under 70 lines of probabilistic code in Venture. The paper demonstrates an application to time series clustering that involves a non-parametric extension to ABCD, experiments for interpolation and extrapolation on real-world econometric data, and improvements in accuracy over both non-parametric and standard regression baselines.
arXiv:1611.07051v3 fatcat:6pehz3plczhvvo3qadflpsmsxu

Elements of a stochastic 3D prediction engine in larval zebrafish prey capture [article]

Andrew D Bolton, Martin Haesemeyer, Josua Jordi, Ulrich Schaechtle, Feras A Saad, Vikash K Mansinghka, Joshua B Tenenbaum, Florian Engert
2019 bioRxiv   pre-print
We used the BayesDB software library (Mansinghka et al. 2015, Saad et al. 2016) to implement the computations needed to build these models and generate conditional simulations.  ... 
doi:10.1101/755777 fatcat:vetulsago5efhgulcs3tyh2m5u

MORBIDITY ASSOCIATED WITH OBESITY

Amjad Meshal Allahyani, Fatimah Abdullah Alrabeh, Hanan Abdullah Alhajji, Fatmah Mohsen Alhejji, Sameer Ayed Almaghamsi, Feras Abdulwahab Alghamdi, Zeyad Saad Aljohani, Faisal Ali Alghamdi, Abdullah Saleh Almuslam, Hussain Abdullah Alkhamis, Saleh Abdulaziz Abubaker
2018 Zenodo  
Body mass index (BMI) is the most commonly used parameter for measuring fatness. It is calculated from an individual's weight and height by dividing the weight in kilograms by the squared height in meters (kg/m2). The normal BMI differs slightly between genders. However, a BMI ranging from 25.0 to 29.9 kg/m2 is defined as adult overweight. A BMI of 30 kg/m2 or more is considered obesity. Aim of work: In this review, we will discuss the comorbidities associated with obesity
... ology: We did a systematic search for the comorbidities associated with obesity using the PubMed search engine (http://www.ncbi.nlm.nih.gov/) and the Google Scholar search engine (https://scholar.google.com). All relevant studies were retrieved and discussed. We only included full articles. Conclusions: With its burden on the healthcare system and individuals' lifestyle, obesity is an important concern. The view of obesity as a character flaw has shifted to a more in-depth understanding of its nature as a disease. Obesity is the result of a complex interaction between multiple co-variables. Genes, socioeconomic status, cultural beliefs, and environmental factors are associated with the development of, and difficulty treating, obesity. Key words: Morbidity, obesity, risk factors, complications.
doi:10.5281/zenodo.1845304 fatcat:xpktd666u5c4hhwvqex5ko57qa

Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes [article]

Feras Saad, Vikash Mansinghka
2017 arXiv   pre-print
Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false positives. This paper proposes an approach that combines probabilistic programming, information theory, and non-parametric Bayes. It shows how to use Bayesian non-parametric modeling to (i) build an ensemble of joint probability models for all the variables; (ii)
efficiently detect marginal independencies; and (iii) estimate the conditional mutual information between arbitrary subsets of variables, subject to a broad class of constraints. Users can access these capabilities using BayesDB, a probabilistic programming platform for probabilistic data analysis, by writing queries in a simple, SQL-like language. This paper demonstrates empirically that the method can (i) detect context-specific (in)dependencies on challenging synthetic problems and (ii) yield improved sensitivity and specificity over baselines from statistics and machine learning, on a real-world database of over 300 sparsely observed indicators of macroeconomic development and public health.
arXiv:1611.01708v2 fatcat:skzsiytqijeabludcmogli2ew4

Elements of a stochastic 3D prediction engine in larval zebrafish prey capture

Andrew D Bolton, Martin Haesemeyer, Josua Jordi, Ulrich Schaechtle, Feras A Saad, Vikash K Mansinghka, Joshua B Tenenbaum, Florian Engert
2019 eLife  
We used the BayesDB software library (Mansinghka et al., 2015; Saad and Mansinghka, 2016) to implement the computations needed to build these models and generate conditional simulations.  ...  The mixture models generated via a DPMM prior can be converted to probabilistic programs for inference to generate the kinds of conditional simulations used in Figure 6 (Saad et al., 2019).  ... 
doi:10.7554/elife.51975 pmid:31769753 pmcid:PMC6930116 fatcat:qxj5rd7kkvhzzmksofrydo5xsm

COVID-19 Patient Count Prediction Using LSTM

Muhammad Iqbal, Feras Al-Obeidat, Fahad Maqbool, Saad Razzaq, Sajid Anwar, Abdallah Tubaishat, Muhammad Shahrose Khan, Babar Shah
2021 IEEE Transactions on Computational Social Systems  
In December 2019, a pandemic named COVID-19 broke out in Wuhan, China, and in a few weeks, it spread to more than 200 countries worldwide. Every country infected with the disease started taking necessary measures to stop the spread and provide the best possible medical facilities to infected patients and take precautionary measures to control the spread. As the infection spread was exponential, there arose a need to model infection spread patterns to estimate the patient volume computationally.
Such patients' estimation is the key to the necessary actions that local governments may take to counter the spread, control hospital load, and resource allocations. This article has used long short-term memory (LSTM) to predict the volume of COVID-19 patients in Pakistan. LSTM is a particular type of recurrent neural network (RNN) used for classification, prediction, and regression tasks. We have trained the RNN model on Covid-19 data (March 2020 to May 2020) of Pakistan and predict the Covid-19 Percentage of Positive Patients for June 2020. Finally, we have calculated the mean absolute percentage error (MAPE) to find the model's prediction effectiveness on different LSTM units, batch size, and epochs. Predicted patients are also compared with a prediction model for the same duration, and results revealed that the predicted patients' count of the proposed model is much closer to the actual patient count.
doi:10.1109/tcss.2021.3056769 fatcat:mqzun2co5ndcje453egecpbo3q

Exact Symbolic Inference in Probabilistic Programs via Sum-Product Representations [article]

Feras A. Saad, Martin C. Rinard, Vikash K. Mansinghka
2020 arXiv   pre-print
In our query language, computing (condition ) or P is linear time in the size of whenever normalize is a single Conjunction (as in the restricted query interface from Saad and Mansinghka [2016]): a sufficient  ...  Probabilistic Program Synthesis: The synthesis methods from Chasins and Phothilimthana [2017] and Saad et al. [2019, Sec. 6] generate programs in DSLs that are subsets of Sppl, thereby providing approaches  ... 
arXiv:2010.03485v1 fatcat:gezscr56fnduvp4cruul6jiddi

Temporally-Reweighted Chinese Restaurant Process Mixtures for Clustering, Imputing, and Forecasting Multivariate Time Series [article]

Feras A. Saad, Vikash K. Mansinghka
2018 arXiv   pre-print
This article proposes a Bayesian nonparametric method for forecasting, imputation, and clustering in sparsely observed, multivariate time series data. The method is appropriate for jointly modeling hundreds of time series with widely varying, non-stationary dynamics. Given a collection of N time series, the Bayesian model first partitions them into independent clusters using a Chinese restaurant process prior. Within a cluster, all time series are modeled jointly using a novel
... hted" extension of the Chinese restaurant process mixture. Markov chain Monte Carlo techniques are used to obtain samples from the posterior distribution, which are then used to form predictive inferences. We apply the technique to challenging forecasting and imputation tasks using seasonal flu data from the US Center for Disease Control and Prevention, demonstrating superior forecasting accuracy and competitive imputation accuracy as compared to multiple widely used baselines. We further show that the model discovers interpretable clusters in datasets with hundreds of time series, using macroeconomic data from the Gapminder Foundation.
arXiv:1710.06900v2 fatcat:cxlskdx3bned3ncjdv3joi7lam