Thesis
Systems and methods for improving large language models with increased inference compute
- Abstract:
- Scaling the amount of compute used to train language models has led to massive increases in their capability. A relatively under-explored direction has been scaling the amount of compute used at inference time. This thesis explores the design space for test-time compute algorithms, focusing on the simple method of repeated sampling. Notably, this thesis demonstrates that coverage, the fraction of problems solved by at least one sample, often scales log-linearly with the number of samples over four orders of magnitude. In domains where automatic verification tools are available, this allows Llama-8b to outperform the single-attempt performance of GPT4o across all tasks studied. These increases in coverage can often be modeled with an exponentiated power law, which enables forecasting coverage up to 10k samples (the largest sample size attempted) using an order of magnitude fewer samples, with an average error of 3.84% across all models and datasets. In domains where tools for verification are available, these gains in coverage directly translate into increased capability. In other domains, methods for selecting a correct sample from a large sample collection are required. This thesis investigates this selection problem in the salient application of software engineering. Using a combination of unit-test based voting and model-based selection, half of the accuracy gap between the single-attempt lower bound and the coverage upper bound can be recovered. Finally, this thesis explores the systems implications of test-time compute scaling and presents an exact implementation of attention that accelerates inference throughput in these settings by over 10x.
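- The coverage metric described above is essentially pass@k averaged over problems. As a minimal illustrative sketch (not taken from the thesis itself), the snippet below estimates coverage from a fixed pool of n samples per problem using the standard unbiased pass@k estimator of Chen et al. (2021); the estimator choice, function names, and example numbers are assumptions for illustration only.

```python
# Sketch: estimating coverage (fraction of problems solved by at least one of k samples)
# from repeated-sampling results. Assumes n samples were drawn per problem and the
# number of correct samples per problem is known; uses the unbiased pass@k estimator.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of P(at least one of k samples is correct),
    given n total samples of which c are correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def coverage(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; `results` holds (n_samples, n_correct) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical example: 3 problems, 100 samples each, with 0, 2, and 40 correct samples.
results = [(100, 0), (100, 2), (100, 40)]
for k in (1, 10, 100):
    print(f"coverage@{k} = {coverage(results, k):.3f}")
```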
- DOI:
- Type of award:
- MSc by Research
- Level of award:
- Masters
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2025-11-21