
Learning the intensity of time events with change-points [article]

Mokhtar Zahdi Alaya, Agathe Guilloux
2015 arXiv   pre-print
We consider the problem of learning the inhomogeneous intensity of a counting process, under a sparse segmentation assumption. We introduce a weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval. We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-points detection. This provides first theoretical guarantees for segmentation with a convex proxy beyond the standard i.i.d signal + white noise setting. We introduce a fast algorithm to solve this convex problem. Numerical experiments illustrate our approach on simulated and on a high-frequency genomics dataset.
arXiv:1507.00513v1 fatcat:jy3qyjn4p5ecdbh2yro5khrfh4

High-dimensional robust regression and outliers detection with SLOPE [article]

Alain Virouleau, Agathe Guilloux, Stéphane Gaïffas, Malgorzata Bogdan
2017 arXiv   pre-print
The problems of outliers detection and robust regression in a high-dimensional setting are fundamental in statistics, and have numerous applications. Following a recent set of works providing methods for simultaneous robust regression and outliers detection, we consider in this paper a model of linear regression with individual intercepts, in a high-dimensional setting. We introduce a new procedure for simultaneous estimation of the linear regression coefficients and intercepts, using two dedicated sorted-ℓ_1 penalizations, also called SLOPE. We develop a complete theory for this problem: first, we provide sharp upper bounds on the statistical estimation error of both the vector of individual intercepts and regression coefficients. Second, we give an asymptotic control on the False Discovery Rate (FDR) and statistical power for support selection of the individual intercepts. As a consequence, this paper is the first to introduce a procedure with guaranteed FDR and statistical power control for outliers detection under the mean-shift model. Numerical illustrations, with a comparison to recent alternative approaches, are provided on both simulated and several real-world datasets. Experiments are conducted using an open-source software written in Python and C++.
arXiv:1712.02640v1 fatcat:nmluadcbzvfznnjh3dgt64ruia

High-dimensional additive hazard models and the Lasso [article]

Stéphane Gaïffas, Agathe Guilloux
2012 arXiv   pre-print
We consider a general high-dimensional additive hazard model in a non-asymptotic setting, including regression for censored-data. In this context, we consider a Lasso estimator with a fully data-driven ℓ_1 penalization, which is tuned for the estimation problem at hand. We prove sharp oracle inequalities for this estimator. Our analysis involves a new "data-driven" Bernstein's inequality, that is of independent interest, where the predictable variation is replaced by the optional variation.
arXiv:1106.4662v2 fatcat:l6kxnr2ljvayhcpibkrn6bih4m

Maximum likelihood estimator for cumulative incidence functions under proportionality constraint

Ségolen Geffray, Agathe Guilloux
2011 Sankhya A  
. , X_J) be the r.v. which stands for the overall failure time of the individual with distribution function (d.f.) F.  ...  We have for t ≥ 0:  ...  Table 2: nMISE* multiplied by 10^3 (CP = censoring proportion)  ...  1 − F)(1 − G) where G is the exponential distribution with parameter a.  ...  . , J such that k ≠ j and for s, t ≥ 0 by:  ...  where B_n, B_n^(1) and B_n^(1,j) for j = 1, . . . , J are the correlated Brownian bridges of Lemma 7.2.  ... 
doi:10.1007/s13171-011-0013-1 fatcat:rx2lqr4xavarjlk7hwwxd2vdrq

High-dimensional additive hazards models and the Lasso

Stéphane Gaïffas, Agathe Guilloux
2012 Electronic Journal of Statistics  
Together with (24), this gives P U t ≥ c 1,ϵ θ t (x +l) n + c 3,ϵ x + 1 +l n ≤ 4 + 3(log(1 + ϵ)) −c ℓ j≥1 j −c ℓ e −x , where c 3,ϵ = 2 max(c 0 , 2(1 + ϵ)(4/3 + ϵ)) + 1/3.  ...  Now, writing again (23) for Ũ t with the fact that |H i t | ≤ 1 and using the same arguments as before, we arrive at P |θ t − ϑ t | ≥ φ(λ/n) λ nϑ t + x λ ≤ 2e −x and P |θ t − ϑ t | ≥ 2wϑ t x vn  ... 
doi:10.1214/12-ejs681 fatcat:c4fu3jomnjgq7komiack2hazb4

Nonparametric inference under competing risks and selection-biased sampling

Jean-Yves Dauxois, Agathe Guilloux
2008 Journal of Multivariate Analysis  
The aim of this paper is to carry out statistical inference in a competing risks setup when only selection-biased observation of the data of interest is available. We introduce estimators of the cumulative incidence functions and study their joint large sample behavior.
doi:10.1016/j.jmva.2007.02.001 fatcat:v7fvlbd2w5gc3nesv7ua4foc5y

Adaptive kernel estimation of the baseline function in the Cox model, with high-dimensional covariates [article]

Agathe Guilloux, Marie-Luce Taupin
2015 arXiv   pre-print
We refer to Guilloux et al. (2015) for a proof of Proposition 3.3 in the general case. $|\hat\beta - \beta_0|_1 \le C(s)\sqrt{\log(pn^k)/n}$ (7) where C(s) > 0 is a constant depending on the sparsity index s.  ...  oracle inequality depends on the non-asymptotic control of $|\hat\beta - \beta_0|_1$ deduced from an estimation inequality stated by Huang et al. (2013) and extended to the case of unbounded counting processes (see Guilloux  ... 
arXiv:1507.01397v1 fatcat:wo5hymz3yjdcdisxbpi7zn23re

Learning the Intensity of Time Events With Change-Points

Mokhtar Z. Alaya, Stephane Gaiffas, Agathe Guilloux
2015 IEEE Transactions on Information Theory  
We consider the problem of learning the inhomogeneous intensity of a counting process, under a sparse segmentation assumption. We introduce a weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval. We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-points detection. This provides first theoretical guarantees for segmentation with a convex proxy beyond the standard independent identically distributed signal + white noise setting. We introduce a fast algorithm to solve this convex problem. Numerical experiments illustrate our approach on simulated and on a high-frequency genomics data set.
doi:10.1109/tit.2015.2448087 fatcat:3k7xofwjbzeznpamjduvtyzro4

Estimation/Imputation Strategies for Missing Data in Survival Analysis [chapter]

Elodie Brunel, Fabienne Comte, Agathe Guilloux
2014 Statistical Models and Methods for Reliability and Survival Analysis  
doi:10.1002/9781118826805.ch15 fatcat:kerg7m553jd3vihnf5ondcuisq

Supplemental material for C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data

Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
2018 Figshare  
Supplemental material for C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data by Simon Bussy, Agathe Guilloux, Stéphane Gaïffas and Anne-Sophie Jannot in Statistical  ... 
doi:10.25384/sage.6148508.v2 fatcat:h3gl4iehv5bwvio26h7qp3hvs4

Supplemental material for C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data

Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
2018 Figshare  
Supplemental material for C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data by Simon Bussy, Agathe Guilloux, Stéphane Gaïffas and Anne-Sophie Jannot in Statistical  ... 
doi:10.25384/sage.6148508.v1 fatcat:4p744trhonaklijidjn4gh5tuq

ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection [article]

Maryan Morel, Emmanuel Bacry, Stéphane Gaïffas, Agathe Guilloux, Fanny Leroy
2018 arXiv   pre-print
With the increased availability of large databases of electronic health records (EHRs) comes the chance of enhancing health risks screening. Most post-marketing detections of adverse drug reaction (ADR) rely on physicians' spontaneous reports, leading to under-reporting. To take up this challenge, we develop a scalable model to estimate the effect of multiple longitudinal features (drug exposures) on a rare longitudinal outcome. Our procedure is based on a conditional Poisson model also known as self-controlled case series (SCCS). We model the intensity of outcomes using a convolution between exposures and step functions, which are penalized using a combination of group-Lasso and total-variation. This approach does not require the specification of precise risk periods, and allows several exposures to be studied in the same model at the same time. We illustrate the fact that this approach improves the state-of-the-art for the estimation of the relative risks both on simulations and on a cohort of diabetic patients, extracted from the large French national health insurance database (SNIIRAM), a SQL database built around medical reimbursements of more than 65 million people. This work has been done in the context of a research partnership between Ecole Polytechnique and CNAMTS (in charge of SNIIRAM).
arXiv:1712.08243v2 fatcat:umjl3uif2vd4netzujpx6ebdwu

Estimation in a competing risks proportional hazards model under length-biased sampling with censoring

Jean-Yves Dauxois, Agathe Guilloux, Syed N. U. A. Kirmani
2013 Lifetime Data Analysis  
Geffray & Guilloux (2011) for details.  ...  Dauxois & Guilloux (2008) have considered the problem of the nonparametric inference of the Cumulative Incidence Functions under competing risks and selection-biased sampling.  ...  From Theorem 3 of Dauxois & Guilloux (2008), we have, under Assumption A, in D^2[0, ∞], where Z(·) is defined in Theorem 3 and Z_1 is a mean-zero Gaussian process defined on [0, ∞] with covariance function  ... 
doi:10.1007/s10985-013-9248-6 pmid:23456312 fatcat:rjucn2ormfeexluhoepxw5jjdu

Adaptive kernel estimation of the baseline function in the Cox model with high-dimensional covariates

Agathe Guilloux, Sarah Lemler, Marie-Luce Taupin
2016 Journal of Multivariate Analysis  
[17] and extended to the case of unbounded counting processes (see Guilloux et al. [16] for details). The paper is organized as follows.  ...  We refer to Guilloux et al. [16] for a proof of Proposition 3.4 in the general case. Proposition 3.4. Let k > 0, c > 0, and let s be the sparsity index of β 0 .  ... 
doi:10.1016/j.jmva.2016.03.002 fatcat:lsprggamwnh3bmwydojctvxbfa

C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data

Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
2018 Statistical Methods in Medical Research  
We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognosis and order them based on their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. Indeed, we penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the relevant covariates for the survival prediction. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. The statistical performance of the method is examined on an extensive Monte Carlo simulation study, and finally illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely both the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t) and survival prediction. Thus, we propose a powerful tool for personalized medicine in cancerology. This is equivalent to saying that conditionally on a latent variable Z = k ∈ {0, . . . , K − 1}, the density of T at time t ≥ 0 is f_k(t; α_k), and we have P[Z = k | X = x] = π_k(x) = π_{β_k}(x)
doi:10.1177/0962280218766389 pmid:29658407 fatcat:ynn6nt4kmveyvd6cugbkcgfhqq