Thesis
Risk bounds for improper prediction procedures
- Abstract:
Statistical Learning Theory studies the problem of learning an unknown relationship between observed input-output pairs sampled independently from some unknown and arbitrary probability distribution. The quality of the inferred relationship is judged by its excess risk -- a measure of prediction capability on unseen data compared to the best predictor in some predefined reference class of functions. A learning procedure is said to be improper if it is allowed to output a prediction rule outside the chosen reference class. This thesis presents four contributions to analyzing the prediction performance of improper learning algorithms.
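In standard notation (a sketch using generic symbols, not necessarily the thesis's own definitions), the excess risk of a procedure outputting a predictor $\hat{f}$, measured against a reference class $\mathcal{F}$ under a loss $\ell$, can be written as:

```latex
% Excess risk of an estimator \hat{f} relative to a reference class \mathcal{F}
% (standard notation; illustrative only).
\mathcal{E}(\hat{f})
  \;=\;
  \mathbb{E}\,\ell\big(\hat{f}(X), Y\big)
  \;-\;
  \inf_{f \in \mathcal{F}} \mathbb{E}\,\ell\big(f(X), Y\big)
% The procedure is proper if \hat{f} \in \mathcal{F} always holds,
% and improper if \hat{f} may lie outside \mathcal{F}.
```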
Our first result contributes to developing the mathematical machinery suitable for the analysis of improper learning procedures. We obtain exponential-tail excess risk bounds in terms of offset Rademacher complexity for which only in-expectation guarantees were previously obtained.
Our second result shows that offset Rademacher complexity yields upper bounds on the excess risk of iterative regularization schemes characterized by mirror descent algorithms. Moreover, by providing a unified analysis, our proposed proof technique circumvents the limitations of some previous analyses tailored to exploit the exact form of specific iterative schemes.
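For context, a generic mirror descent iteration with mirror map $\psi$ and step size $\eta$ takes the following standard form (a sketch of the textbook update; the exact iterative schemes analyzed in the thesis may differ):

```latex
% Standard mirror descent update for minimizing f, with step size \eta,
% mirror map \psi, and Bregman divergence D_\psi (illustrative only).
x_{t+1}
  \;=\;
  \operatorname*{arg\,min}_{x}
  \Big\{ \eta \,\langle \nabla f(x_t),\, x \rangle + D_{\psi}(x, x_t) \Big\},
\qquad
D_{\psi}(x, y)
  \;=\;
  \psi(x) - \psi(y) - \langle \nabla \psi(y),\, x - y \rangle
```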
Our third contribution concerns the analysis of the constrained linear least squares algorithm. We find that this classical and widely studied statistical estimator is suboptimal: known improper procedures improve on its excess risk by dimension-dependent factors.
In our fourth contribution, we investigate improperness in the linear regression problem with the squared loss, without imposing any assumptions on the distribution of the covariates and under only a minimal assumption on the conditional distribution of the response variable. We first establish the in-expectation optimality of the truncated least squares estimator and then show that it can fail with constant probability. We conclude by proposing a deviation-optimal procedure. The considered setup admits heavy-tailed distributions while falling outside the scope of the typically studied procedures for heavy-tailed linear regression.
Authors
Contributors
- Role:
- Supervisor
- Role:
- Supervisor
- ORCID:
- 0000-0001-7772-4160
- Funder identifier:
- http://dx.doi.org/10.13039/501100000265
- Grant:
- EP/L016710/1
- Programme:
- EPSRC and MRC Centre for Doctoral Training in Next Generation Statistical Science: The Oxford-Warwick Statistics Programme.
- Funder identifier:
- http://dx.doi.org/10.13039/501100000266
- Grant:
- EP/L016710/1
- Programme:
- EPSRC and MRC Centre for Doctoral Training in Next Generation Statistical Science: The Oxford-Warwick Statistics Programme.
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2022-04-16
Terms of use
- Copyright holder:
- Vaskevicius, T
- Copyright date:
- 2021