Institut für Mathematische Stochastik

DFG Project Mu 1230/8-1 and 8-2:
Statistical methods of model selection in regression analysis



Project Management:

  • Prof. Dr. A. Munk, Göttingen
  • Prof. Dr. C. Czado, München


  • Dr. Hajo Holzmann



Regression analysis is one of the most important statistical methods in quantitative empirical research and is nowadays employed in all disciplines that work with empirical data. Its main goal is to reduce complex data structures to simple relationships between the measured quantities by means of a regression model. Fields of application range from economic time series, survival studies in medicine and data from social panel studies to data from the natural sciences. The quality of the scientific conclusions drawn depends heavily on the validity of the underlying regression model. Therefore, during the last century a large variety of data-driven procedures was developed, which can be summarized by the keywords Goodness of Fit Test, Specification Test, Model Selection Procedure and Model Diagnostics. Most of these procedures rely either on the idea of classical hypothesis testing or on penalizing the complexity of the model appropriately.

The goal of our project is to investigate the problem of model selection in regression analysis under a unified principle, namely the concept of p-value curves as introduced by Munk & Czado (1998). The main idea is to replace the classical concept of p-values by curves that visualize the evidence for a model simultaneously under a whole scenario of competing models. To this end, theoretical foundations have to be developed, and effective algorithms and implementations of the often computationally complex procedures have to be provided.

The project is divided into three parts.

P1. Methods of model selection. In this part, mathematical and statistical questions concerning the analysis, asymptotics and interpretation of p-value curves are investigated, and p-value curves are compared with known methods of model selection.

P2. Model selection in generalized linear models (GLMs). Development and implementation of procedures for generalized linear models and models with overdispersion. Application of the methods to socio-economic panel studies, to the modeling of car insurance and to the evaluation of credit risk.

P3. Inverse regression models and their applications in physics. Development and implementation of model selection procedures for inverse regression models and their application to problems in physics and astrophysics, in particular to modeling the Milky Way.

This project is carried out in cooperation with the TU Munich and is supported by the DFG.