Modeling high-dimensional dependence in astronomical data

R. Vio, T.W. Nagler, P. Andreani
2020 Astronomy and Astrophysics  
Fixing the relationship of a set of experimental quantities is a fundamental issue in many scientific disciplines. In the 2D case, the classical approach is to compute the linear correlation coefficient ρ from a scatterplot. This method, however, implicitly assumes a linear relationship between the variables. Such an assumption is not always correct. With the use of the partial correlation coefficients, an extension to the multidimensional case is possible. However, the problem of the assumed
more » ... tual linear relationship of the variables remains. A relatively recent approach that makes it possible to avoid this problem is the modeling of the joint probability density function of the data with copulas. These are functions that contain all the information on the relationship between two random variables. Although in principle this approach also can work with multidimensional data, theoretical as well computational difficulties often limit its use to the 2D case. In this paper, we consider an approach based on so-called vine copulas, which overcomes this limitation and at the same time is amenable to a theoretical treatment and feasible from the computational point of view. We applied this method to published data on the near-IR and far-IR luminosities and atomic and molecular masses of the Herschel reference sample, a volume-limited sample in the nearby Universe. We determined the relationship of the luminosities and gas masses and show that the far-IR luminosity can be considered as the key parameter relating the other three quantities. Once removed from the 4D relation, the residual relation among the latter is negligible. This may be interpreted as the correlation between the gas masses and near-IR luminosity being driven by the far-IR luminosity, likely by the star formation activity of the galaxy.
doi:10.1051/0004-6361/202038585 fatcat:o6zfwerpzzetjji6txyis3opyq