Feature subset selection for logistic regression via mixed integer optimization

Toshiki Sato, Yuichi Takano, Ryuhei Miyashiro, Akiko Yoshise
2016 Computational Optimization and Applications  
This paper concerns a method of selecting a subset of features for a logistic regression model.  ...  The feature subset selection problem is formulated as a mixed integer linear optimization problem, which can be solved with standard mathematical optimization software, by using a piecewise linear approximation  ...  The feature subset selection problem for the logistic regression model (1) is framed as a combinatorial optimization problem, IC_opt = min{IC(S) | S ⊆ {1, 2, ..., p}}. (5) Mixed Integer Optimization  ... 
doi:10.1007/s10589-016-9832-2 fatcat:asuc4bzmyncbzmvtgrfwllhrfi

Logistic Regression: From Art to Science

Dimitris Bertsimas, Angela King
2017 Statistical Science  
Motivated by this speedup, we propose modeling logistic regression problems algorithmically with a mixed integer nonlinear optimization (MINLO) approach in order to explicitly incorporate these properties  ...  In the period 1991-2015, algorithmic advances in Mixed-Integer Linear Optimization (MILO) coupled with hardware improvements have resulted in an astonishing 450 billion factor speedup in solving MILO problems  ...  Sato et al. [48] is the only work that solves a penalized version of (2) via mixed integer optimization (MIO).  ... 
doi:10.1214/16-sts602 fatcat:hgyh5eucrjbqngcqaucr5dd2na

Feature Subset Selection for Ordered Logit Model via Tangent-Plane-Based Approximation

2019 IEICE Transactions on Information and Systems  
For feature subset selection in the sequential logit model, Sato et al. [22] recently proposed a mixed-integer linear optimization (MILO) formulation.  ...  This paper is concerned with a mixed-integer optimization (MIO) approach to selecting a subset of relevant features from among many candidates.  ...  Mixed-Integer Optimization Approach This section presents an MILO formulation for feature subset selection in an ordered logit model.  ... 
doi:10.1587/transinf.2018edp7188 fatcat:nkzsognurfb7ljnulrfycvlnku

Feature Selection on Noisy Twitter Short Text Messages for Language Identification [article]

Mohd Zeeshan Ansari, Tanvir Ahmad, Ana Fatima
2020 arXiv   pre-print
Therefore, feature selection methods are significant in choosing features that are most relevant for an efficient model.  ...  Various n-gram profiles are examined with different feature selection algorithms over many classifiers.  ...  Different variations of logistic regression are binomial and multinomial logistic regression. For binomial logistic regression, the dependent variable can have two possible outcomes, i.e., '0' or '1'.  ... 
arXiv:2007.05727v1 fatcat:7x7fa2343fa7lhgmm3s3okjkwy

Feature subset selection for kernel SVM classification via mixed-integer optimization [article]

Ryuta Tamura, Yuichi Takano, Ryuhei Miyashiro
2022 arXiv   pre-print
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification.  ...  We propose a mixed-integer linear optimization (MILO) formulation based on the kernel-target alignment for feature subset selection, and this MILO problem can be solved to optimality using optimization  ...  We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification.  ... 
arXiv:2205.14325v1 fatcat:q7xhwan6qjff3pddzalmkaayvq

Simple rules for complex decisions [article]

Jongbin Jung, Connor Concannon, Ravi Shroff, Sharad Goel, Daniel G. Goldstein
2017 arXiv   pre-print
Here we present a new method, select-regress-and-round, for constructing simple rules that perform well for complex decisions.  ...  We find that simple rules significantly outperform judges and are on par with decisions derived from random forests trained on all available features.  ...  ACKNOWLEDGMENTS We thank Avi Feller, Andrew Gelman, Gerd Gigerenzer, Art Owen, and Berk Ustun for helpful conversations.  ... 
arXiv:1702.04690v3 fatcat:bfu7yvoyrnd6vhfuvo57hkbhyy

Optimization Models for Machine Learning: A Survey [article]

Claudio Gambella, Bissan Ghaddar, Joe Naoum-Sawaya
2020 arXiv   pre-print
Particularly, mathematical optimization models are presented for regression, classification, clustering, deep learning, and adversarial learning, as well as new emerging applications in machine teaching  ...  This paper surveys the machine learning literature and presents in an optimization framework several commonly used machine learning approaches.  ...  Acknowledgement We are very grateful to four anonymous referees for their valuable feedback and comments that helped improve the content and presentation of the paper.  ... 
arXiv:1901.05331v4 fatcat:3bwfbl34rrf2tkpqeidl5hfoxu

Wasserstein Logistic Regression with Mixed Features [article]

Aras Selvi and Mohammad Reza Belbasi and Martin B Haugh and Wolfram Wiesemann
2022 arXiv   pre-print
In this paper, we show that distributionally robust logistic regression with mixed (i.e., numerical and categorical) features, despite amounting to an optimization problem of exponential size, admits a  ...  We show that our method outperforms both the unregularized and the regularized logistic regression on categorical as well as mixed-feature benchmark instances.  ...  . , 1 2 } are selected via 5-fold cross-validation.  ... 
arXiv:2205.13501v1 fatcat:yok255pj3nakbkmknwjrkfpdc4

Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives [article]

Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder
2021 arXiv   pre-print
We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features.  ...  Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) ℓ_0-regularized regression problems at scales much larger than what was conventionally considered possible  ...  For the hinge loss with q = 1, the problem is a Mixed Integer Linear Program (MILP). For q = 2 and the hinge loss function, (19) becomes a Mixed Integer Quadratic Program (MIQP).  ... 
arXiv:2001.06471v2 fatcat:ruakr33w2nhijibhc3ukxgtpha

A clinical decision tool for predicting patient care characteristics: patients returning within 72 hours in the emergency department

Eva K Lee, Fan Yuan, Daniel A Hirsh, Michael D Mallory, Harold K Simon
2012 AMIA Annual Symposium Proceedings  
We began with a large pool of potentially important factors, and used particle swarm optimization techniques for feature selection coupled with an optimization-based discriminant analysis model (DAMIP)  ...  The analysis involves using a subset of the patient cohort for training and establishment of the predictive rule, and blind predicting the return of the remaining patients.  ...  Acknowledgement The authors acknowledge the AMIA reviewers for providing useful comments to improve the paper.  ... 
pmid:23304321 pmcid:PMC3540516 fatcat:wb7baetsgnbrzlbt7h3dp7ncyy


Keiji Kimura
2019 Journal of the Operations Research Society of Japan  
We proposed a mixed integer nonlinear programming approach to AIC minimization for linear regression and showed that the approach outperformed existing approaches in terms of computational time [13].  ...  We implement the proposed approach via SCIP, which is a noncommercial optimization software and a branch-and-bound framework.  ...  Acknowledgements I would like to thank Hayato Waki for useful discussions and carefully reading the manuscript. I am grateful to the referees for their helpful comments.  ... 
doi:10.15807/jorsj.62.15 fatcat:ps5ehyqsofhx5ezglsz5xrtteu

Predicting 72-hour reattendance in emergency departments using discriminant analysis via mixed integer programming with electronic medical records

Fanwen Meng, Kiok Liang Teow, Kelvin Wee Sheng Teo, Chee Kheong Ooi, Seow Yian Tay
2019 Journal of Industrial and Management Optimization  
With these factors in combination with suggestions from ED clinicians, a mixed integer programming model based on discriminant analysis is proposed to determine a classification rule for 72-hour reattendance  ...  In numerical experiments, various small subsets of risk factors are used for classification and prediction.  ...  mixed integer programming (MIP) and linear programming (LP) [8, 20].  ... 
doi:10.3934/jimo.2018079 fatcat:pgfkca3syrdizcltjppzdyw42a

Holistic Generalized Linear Models [article]

Benjamin Schwendinger, Florian Schwendinger, Laura Vana
2022 arXiv   pre-print
Holistic linear regression extends the classical best subset selection problem by adding additional constraints designed to improve the model quality.  ...  By making use of state-of-the-art conic mixed-integer solvers, the package can reliably solve GLMs for Gaussian, binomial and Poisson responses with a multitude of holistic constraints.  ...  In particular, this can be achieved by utilizing quadratic mixed-integer optimization, where the integer constraints are used to place cardinality constraints on the linear regression model.  ... 
arXiv:2205.15447v1 fatcat:z552zwu335cexjtihfazkh6i54

Semi-supervised remote sensing image classification via maximum entropy

Ayse Naz Erkan, Gustavo Camps-Valls, Yasemin Altun
2010 2010 IEEE International Workshop on Machine Learning for Signal Processing  
In this paper, we evaluate semi-supervised logistic regression (SLR), a recent information theoretic semi-supervised algorithm, for remote sensing image classification problems.  ...  These characteristics make SLR a strong alternative to the widely used semi-supervised variants of SVM for the segmentation of remote sensing images.  ...  In [11], a mixed integer programming approach was proposed to find the labeling with the lowest objective function. The optimization, however, is intractable for large data sets.  ... 
doi:10.1109/mlsp.2010.5589199 fatcat:sfzutkmgvzcvzm3hpj325nzhxq

An Aggregation Method for Sparse Logistic Regression [article]

Zhe Liu
2015 arXiv   pre-print
L_1 regularized logistic regression has now become a workhorse of data mining and bioinformatics: it is widely used for many classification problems, particularly ones with many features.  ...  In this paper, we demonstrate and analyze an aggregation method for sparse logistic regression in high dimensions.  ...  For the penalized logistic regression and elastic net methods, the tuning parameter λ and elastic net mixing parameter α are chosen via cross-validations, as implemented in the R glmnet package (Friedman  ... 
arXiv:1410.6959v2 fatcat:sh3qu3od6rdk7p5dtehbyrqtdm