
Group lasso with overlap and graph lasso

Laurent Jacob, Guillaume Obozinski, Jean-Philippe Vert
2009 Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09  
We propose a new penalty function which, when used as regularization for empirical risk minimization procedures, leads to sparse estimators.  ...  The support of the sparse vector is typically a union of potentially overlapping groups of covariates defined a priori, or a set of covariates which tend to be connected to each other when a graph of covariates  ...  Acknowledgments This work was supported by ANR grant ANR-07-BLAN-0311 and the France-Berkeley fund. The authors thank Bin Yu and Michael Jordan for useful discussions.  ... 
doi:10.1145/1553374.1553431 dblp:conf/icml/JacobOV09 fatcat:cd36nwzvzbfhzip25bzlytl7ee
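A standard way to realize the overlapping-group penalty described in this abstract is to duplicate covariates shared between groups, so that an ordinary non-overlapping group lasso solver can be run on an expanded design matrix. A minimal sketch of that reduction, with a hypothetical function name and toy data (not code from the paper):

```python
import numpy as np

def expand_overlapping_groups(X, groups):
    """Duplicate columns shared by overlapping groups so a standard
    (non-overlapping) group lasso solver can be applied to the
    expanded design matrix. Returns the expanded matrix and the
    corresponding disjoint group index sets."""
    cols, new_groups, start = [], [], 0
    for g in groups:
        cols.append(X[:, g])
        new_groups.append(list(range(start, start + len(g))))
        start += len(g)
    return np.hstack(cols), new_groups

# Hypothetical toy design: 3 covariates, overlapping groups {0,1} and {1,2}.
X = np.arange(12.0).reshape(4, 3)
X_exp, new_groups = expand_overlapping_groups(X, [[0, 1], [1, 2]])
# Covariate 1 now appears as two columns, one per group it belongs to.
```

Selected supports in the expanded problem are unions of the original groups, which is exactly the support structure the abstract describes.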

The benefit of group sparsity

Junzhou Huang, Tong Zhang
2010 Annals of Statistics  
Moreover, the theory predicts some limitations of the group Lasso formulation that are confirmed by simulation studies.  ...  This provides a convincing theoretical justification for using group sparse regularization when the underlying group structure is consistent with the data.  ...  Intuitively, group Lasso favors large sized groups because the 2-norm regularization for large group size is weaker.  ... 
doi:10.1214/09-aos778 fatcat:4nltvkkdtzbg7fldstz7snnhcu
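The snippet's intuition — that unweighted 2-norm regularization penalizes large groups relatively less per coefficient — can be checked numerically; the common sqrt-of-group-size weighting corrects for exactly this effect. A small illustration (the numbers are hypothetical, not from the paper):

```python
import numpy as np

# For a group whose coefficients all equal 1, the l2 norm grows like
# sqrt(group size), so the penalty *per coefficient* shrinks as the
# group grows; weighting each group norm by sqrt(|g|) removes the bias.
small, large = np.ones(4), np.ones(64)
per_coef_small = np.linalg.norm(small) / 4       # sqrt(4)/4  = 0.5
per_coef_large = np.linalg.norm(large) / 64      # sqrt(64)/64 = 0.125
ratio = per_coef_large / per_coef_small          # sqrt(4/64) = 0.25

# With sqrt(|g|) weights the per-coefficient penalty is equalized:
w_small = np.sqrt(4) * np.linalg.norm(small) / 4     # = 1.0
w_large = np.sqrt(64) * np.linalg.norm(large) / 64   # = 1.0
```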

Training Structured Neural Networks Through Manifold Identification and Variance Reduction [article]

Zih-Syuan Huang, Ching-pei Lee
2022 arXiv   pre-print
the regularizer at the stationary point of asymptotic convergence, even in the presence of engineering tricks like data augmentation and dropout that complicate the training process.  ...  For unstructured sparsity, RMDA also outperforms a state-of-the-art pruning method, validating the benefits of training structured NNs through regularization.  ...  Acknowledgements This work was supported in part by MOST of R.O.C. grant 109-2222-E-001-003-MY3, and the AWS Cloud Credits for Research program of Amazon Inc.  ... 
arXiv:2112.02612v3 fatcat:zaslevsawzdovke3oggvcfdsvy

Fenchel duality of Cox partial likelihood and its application in survival kernel learning [article]

Christopher Wilson, Kaiqiao Li, Qiang Sun, Pei-Fen Kuan, Xuefeng Wang
2020 bioRxiv   pre-print
However, the optimization problem becomes intractable when more complicated regularization is employed with the Cox loss function.  ...  The Cox proportional hazard model is the most widely used method in modeling time-to-event data in the health sciences.  ...  group lasso penalty terms [10] .  ... 
doi:10.1101/2020.05.04.077263 fatcat:tygvslitajdmvohlvokz5at37m

Towards Ultrahigh Dimensional Feature Selection for Big Data [article]

Mingkui Tan and Ivor W. Tsang and Li Wang
2019 arXiv   pre-print
Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures and 2) nonlinear feature selection with explicit feature mappings  ...  The feature generating paradigm can guarantee that the solution converges globally under mild conditions and achieve lower feature selection bias.  ...  Acknowledgments We would like to acknowledge the valuable comments and useful suggestions by the Action Editor and the four anonymous reviewers. We would like to express our gratitude to Dr.  ... 
arXiv:1209.5260v2 fatcat:h4o4nhfctvcpvbbyovd46qqqge

Interpretable Convolution Methods for Learning Genomic Sequence Motifs [article]

Matthew S Ploenzke, Rafael A Irizarry
2018 bioRxiv   pre-print
We additionally leverage regularization to encourage learning highly-representative motifs with low inter-filter redundancy.  ...  with the given filter.  ...  We utilize the sparse group lasso penalty with each filter defined as a group [24].  ... 
doi:10.1101/411934 fatcat:wilsrvvzvreurm3pddej2a5lqe
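The sparse group lasso penalty mentioned in the snippet combines an l1 term over all coefficients with a group lasso term over the filters. A minimal sketch, where the group layout and weights are hypothetical, not taken from the paper:

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam1, lam2):
    """Sparse group lasso: lam1 * ||beta||_1 plus lam2 times the sum of
    l2 norms over groups (here, e.g., one group per convolutional filter).
    Yields sparsity both across and within groups."""
    l1 = lam1 * np.abs(beta).sum()
    group = lam2 * sum(np.linalg.norm(beta[g]) for g in groups)
    return l1 + group

# Hypothetical: two "filters" of two weights each.
beta = np.array([3.0, 4.0, 0.0, 1.0])
penalty = sparse_group_lasso_penalty(beta, [[0, 1], [2, 3]], lam1=0.1, lam2=0.2)
```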

Group Variable Selection Methods and Their Applications in Analysis of Genomic Data [chapter]

Jun Xie, Lingmin Zeng
2010 Frontiers in Computational and Systems Biology  
The group Lasso estimators are obtained by minimizing ||y − Σ_{j=1}^J X_j β_j||² + λ Σ_{j=1}^J ||β_j||_{K_j}, where λ is the regularization parameter and ||z||_K = (zᵀKz)^{1/2} with a symmetric k_j × k_j positive definite matrix K_j  ...  When people have prior knowledge on variable groups, group Lasso proposed by Yuan and Lin [22] is designed to select pre-defined groups of predictors.  ... 
doi:10.1007/978-1-84996-196-7_12 fatcat:dsgmyd7dsna75otkivt4yaspz4
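The objective in the snippet can be written down directly. A sketch with each K_j taken as the identity matrix, so that ||β_j||_{K_j} reduces to the Euclidean norm; the toy data below is hypothetical:

```python
import numpy as np

def group_lasso_objective(y, X, beta, groups, lam):
    """||y - sum_j X_j beta_j||^2 + lam * sum_j ||beta_j||_2, i.e. the
    group lasso objective with the kernel K_j of each group norm taken
    as the identity matrix."""
    resid = y - X @ beta
    penalty = sum(np.linalg.norm(beta[g]) for g in groups)
    return resid @ resid + lam * penalty

# Hypothetical toy problem: 3 samples, 4 predictors in 2 groups.
X = np.eye(3, 4)
y = np.array([1.0, 0.0, 0.0])
beta = np.array([1.0, 0.0, 0.0, 0.0])
obj = group_lasso_objective(y, X, beta, [[0, 1], [2, 3]], lam=0.5)
# Zero residual, so the objective equals lam * ||beta_1||_2 = 0.5.
```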

Rare Feature Selection in High Dimensions [article]

Xiaohan Yan, Jacob Bien
2020 arXiv   pre-print
We apply our method to data from TripAdvisor, in which we predict the numerical rating of a hotel based on the text of the associated review.  ...  We show, both theoretically and empirically, that not explicitly accounting for the rareness of features can greatly reduce the effectiveness of an analysis.  ...  Acknowledgments The authors thank Andy Clark for calling our attention to the challenge of rare features. This work was supported by NSF CAREER grant, DMS-1653017.  ... 
arXiv:1803.06675v2 fatcat:gnx74z5v7fbz5bxqf3mwjddkqe

Fenchel duality of Cox partial likelihood with an application in survival kernel learning

Christopher M. Wilson, Kaiqiao Li, Qiang Sun, Pei Fen Kuan, Xuefeng Wang
2021 Artificial Intelligence in Medicine  
However, due to the nature of censored data, the optimization problem becomes intractable when more complicated regularization is employed, which is necessary when dealing with high dimensional omic data  ...  The Cox proportional hazard model is one of the most widely used methods in modeling time-to-event data in the health sciences.  ...  This work has also been supported in part by the Miles for Moffitt Foundation Funds and by the Biostatistics and Bioinformatics Shared Resource at the H.  ... 
doi:10.1016/j.artmed.2021.102077 pmid:34020756 fatcat:7nnetudqdfaqfc2pfxq6hoqtzi

Fast Iteratively Reweighted Least Squares Algorithms for Analysis-Based Sparsity Reconstruction [article]

Chen Chen, Junzhou Huang, Lei He, Hongsheng Li
2015 arXiv   pre-print
It can solve the generalized problem by structured sparsity regularization with an orthogonal basis and total variation regularization.  ...  The convergence rate of the proposed algorithm is almost the same as that of the traditional IRLS algorithms, that is, exponentially fast.  ...  In addition, it is unknown how to extend these IRLS based algorithms to solve the overlapping group Lasso problems.  ... 
arXiv:1411.5057v3 fatcat:jsxkjkljsbehpj6hoih2b7kr7i

Too many covariates and too few cases? - a comparative study

Qingxia Chen, Hui Nian, Yuwei Zhu, H. Keipp Talbot, Marie R. Griffin, Frank E. Harrell
2016 Statistics in Medicine  
Those methods, however, have been less frequently used when p ≈ n, and in this situation, there is no guidance on choosing among regular logistic regression models, propensity score methods, and shrinkage  ...  Recent work on shrinkage approaches such as the lasso was motivated by the critical need to develop methods for the p >> n situation, where p is the number of parameters and n is the sample size.  ...  Acknowledgments The authors wish to thank the editor, the associate editor and two referees for several suggestions and editorial changes which have greatly improved the paper. Dr.  ... 
doi:10.1002/sim.7021 pmid:27357163 pmcid:PMC5050102 fatcat:lr6oud65gnbhjhu3lquxrtptwm

Regularized outcome weighted subgroup identification for differential treatment effects

Yaoyao Xu, Menggang Yu, Ying-Qi Zhao, Quefeng Li, Sijian Wang, Jun Shao
2015 Biometrics  
The function uses patient outcomes as weights rather than modeling targets.  ...  We demonstrate the advantages of our method in simulation studies and in analyses of two real data sets.  ...  Acknowledgments The research efforts were partly supported by the University of Wisconsin Carbone Cancer Center support grant, NIH/NIA P30 CA014520 (for Drs.  ... 
doi:10.1111/biom.12322 pmid:25962845 pmcid:PMC5395466 fatcat:aywo77sjgzcadceoq5zcfps6lm

Ensembled sparse-input hierarchical networks for high-dimensional datasets [article]

Jean Feng, Noah Simon
2020 arXiv   pre-print
The proposed method, Ensemble by Averaging Sparse-Input Hierarchical networks (EASIER-net), appropriately prunes the network structure by tuning only two L1-penalty parameters, one that controls the input  ...  The method selects variables from the true support if the irrelevant covariates are only weakly correlated with the response; otherwise, it exhibits a grouping effect, where strongly correlated covariates  ...  Acknowledgments The authors thank Frederick A. Matsen IV for helpful discussions and suggestions. This work was supported by NIH Grant DP5OD019820.  ... 
arXiv:2005.04834v1 fatcat:j4pfrcikgjgvho2qskyuytlhvi

Learning Social Networks from Text Data using Covariate Information [article]

Xiaoyi Yang, Nynke M.D. Niezink, Rebecca Nugent
2020 arXiv   pre-print
The Local Poisson Graphical Lasso model leverages the number of co-mentions in the text to measure relationships between people and uses a conditional independence structure to model a social network.  ...  This structure will reduce the tendency to overstate the relationship between "friends of friends", but given the historical high frequency of common names, without additional distinguishing information  ...  Li et al. (2015) extended this method to a multivariate sparse group Lasso to incorporate arbitrary and group structures in the data.  ... 
arXiv:2010.08076v1 fatcat:3t6qupgssfa6xgzdbc6jtm4lta

High-dimensional Ising model selection using ℓ1-regularized logistic regression

Pradeep Ravikumar, Martin J. Wainwright, John D. Lafferty
2010 Annals of Statistics  
When these same conditions are imposed directly on the sample matrices, we show that a reduced sample size of n = Ω(d² log p) suffices for the method to estimate neighborhoods consistently.  ...  We describe a method based on ℓ1-regularized logistic regression, in which the neighborhood of any given node is estimated by performing logistic regression subject to an ℓ1-constraint.  ...  The convex program (48) is the multiclass logistic analog of the group Lasso, a type of relaxation that has been studied in previous and on-going work on linear and logistic regression (e.g., [19, 22,  ... 
doi:10.1214/09-aos691 fatcat:uqtb3hfjhrdz7cpaujmnk43ole
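The neighborhood-selection idea — regress each node on all the others with an ℓ1 penalty and read the neighborhood off the nonzero coefficients — can be sketched with a small proximal-gradient (ISTA) solver. The solver, step size, and toy data below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_logistic(X, y, lam, step=0.1, iters=500):
    """Minimize average logistic loss + lam * ||w||_1 by proximal
    gradient (ISTA); y is a 0/1 vector. A minimal sketch, not tuned."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (1.0 / (1.0 + np.exp(-X @ w)) - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

def estimate_neighborhood(X, node, lam=0.05):
    """Estimated neighborhood of `node`: predictors with nonzero
    coefficients when regressing X[:, node] on the other columns."""
    others = np.delete(np.arange(X.shape[1]), node)
    w = l1_logistic(X[:, others], X[:, node], lam)
    return set(others[np.abs(w) > 1e-6])

# Toy binary data: variable 1 copies variable 0, variable 2 is independent,
# so variable 1 should appear in the estimated neighborhood of variable 0.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=300).astype(float)
noise = rng.integers(0, 2, size=300).astype(float)
X = np.column_stack([z, z, noise])
neighbors = estimate_neighborhood(X, 0)
```

Running the same regression for every node and combining the estimated neighborhoods (by union or intersection) recovers the graph structure.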
Showing results 1 — 15 out of 37 results