Making relative survival analysis relatively easy
Computers in Biology and Medicine
In survival analysis we are interested in the time from the beginning of observation until a certain event (death, relapse, etc.). We assume that the final event is well defined, so that we are never in doubt whether it has occurred or not. In practice this is not always true. If we are interested in cause-specific deaths, then it may sometimes be difficult or even impossible to establish the cause of death, or there may be different causes of death, making it impossible to assign
death to just one cause. Suicides of terminal cancer patients are a typical example. In such cases, standard survival techniques cannot be used to estimate mortality due to a certain cause. The remedy is relative survival techniques, which compare the survival experience of a study cohort with the survival that would be expected if the cohort followed the background population mortality rates. This enables the estimation of the proportion of deaths due to a certain cause. In this paper, we briefly review some of the techniques for modelling relative survival and outline a new fitting method for the additive model, which removes the dependence of the parameter estimates on the assumption about the baseline excess hazard. We then direct the reader's attention to our R package relsurv, which provides functions for easy and flexible fitting of all the commonly used relative survival regression models. The basic features of the package have been described in detail elsewhere; here we additionally explain the usage of the new fitting method and the interface for using population mortality data freely available on the Internet. The combination of the package and the data sets provides a powerful informational tool in the hands of a skilled statistician/informatician.

Motivation

If a person with an incurable disease commits suicide, the cause of death recorded on the death certificate will be suicide. If there were many such cases, the mortality statistics would show a much lower proportion of deaths due to the disease in question than they should. Suicides are an obvious, more or less hypothetical, example. It is less well known that it is often difficult or even impossible to choose among different possible causes of death, or to assign a cause at all. People with a certain condition (e.g. diabetes, high blood pressure, etc.) may die of natural causes, but it is quite possible that they would have lived longer without that condition.
In such cases we need methods of relative survival to estimate the proportion of people dying due to a certain cause. These methods are widely used in cancer registries, but almost never in other areas of medicine. The goal of this paper is to bring the methods of relative survival to a wider, possibly less statistically oriented, audience by giving an overview of the existing methods, outlining some new methods, describing new possibilities for acquiring population data, and presenting a software package that includes functions for easy and flexible fitting of all the commonly used relative survival regression models.
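
The comparison between a study cohort and the background population can be stated compactly. The block below is a minimal sketch in standard relative survival notation, not notation taken from this paper: the symbols $S_O$, $S_P$, $r$, $\lambda_O$, $\lambda_P$ and $\lambda_E$ are assumptions introduced here for illustration, and the numbers in the last line are hypothetical.

```latex
% Assumed notation (not from the paper):
%   S_O(t): observed survival of the study cohort at time t
%   S_P(t): survival expected from the background population mortality rates
%   r(t):   relative survival ratio
r(t) = \frac{S_O(t)}{S_P(t)}

% Additive model: the observed hazard is the population hazard
% plus an excess hazard attributed to the condition under study.
\lambda_O(t) = \lambda_P(t) + \lambda_E(t)

% Hypothetical illustration: observed 5-year survival 0.60,
% expected 5-year survival 0.90.
r(5) = \frac{0.60}{0.90} \approx 0.67
```

In this illustration, the cohort's survival is about two thirds of what would be expected for comparable people in the general population, and no cause-of-death information is needed to arrive at that figure.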