Universal in-context approximation

Petrov, A

Abstract:: The explosive rise of large language model capabilities has shifted research and practice from task-specific training to in-context learning via prompting. This raises a fundamental question: can a model with fixed weights solve novel tasks as effectively as a fine-tuned one? This thesis investigates the theoretical capabilities and limitations of in-context learning across major sequence model architectures. It introduces the formal notion of universal in-context approximation, where a single model can approximate any function from a given class solely by selecting an appropriate input prompt.
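The notion named above can be written down roughly as follows. This is a sketch of one plausible formalisation, not the thesis's own definition: the symbols $\mathcal{F}$, $\mathcal{X}$, $\mathcal{Y}$, $g$, $p$ and the distance $d$ are assumptions introduced here, and the abstract does not specify the function class or the error measure.

```latex
% Assumed notation: \mathcal{F} is a class of functions f : \mathcal{X} \to \mathcal{Y},
% g is one model with fixed weights taking a prompt p and a query x,
% and d is some distance on \mathcal{Y}.
g \text{ is a universal in-context approximator for } \mathcal{F}
\quad\Longleftrightarrow\quad
\forall f \in \mathcal{F},\ \forall \varepsilon > 0,\ \exists\, p :\quad
\sup_{x \in \mathcal{X}} d\bigl(g(p, x),\, f(x)\bigr) \le \varepsilon .
```

The point of the quantifier order is that $g$ is chosen once, before $f$ and $\varepsilon$; only the prompt $p$ may depend on the target function and the precision.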

Our investigation begins by exploring the limitations of prompting in transformers. We first prove a significant restriction: prompting and prefix-tuning are fundamentally incapable of changing a model’s learned attention patterns over the user-provided content. This finding initially suggests that prompting is strictly less powerful than fine-tuning. However, we then demonstrate the contrary: transformers are universal in-context approximators. This thesis resolves the apparent contradiction by showing that universality is achieved not by altering attention over the content, but by leveraging the ability of the attention mechanism to approximate smooth functions to arbitrary precision. Most notably, the required model size does not grow with the target precision.
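A toy illustration of a restriction of this flavour, for a single softmax attention head with one fixed query. This is a sketch under stated assumptions, not the thesis's proof: the vectors are hypothetical values chosen for the example, and the check only covers one layer, where prepending prefix keys rescales the attention placed on the content but leaves its relative distribution unchanged.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_weights(query, keys):
    # Attention distribution of one query over a list of key vectors.
    return softmax([dot(query, k) for k in keys])

# Hypothetical toy values (assumptions for illustration only).
query = [0.5, -1.0]
content_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prefix_keys = [[2.0, -3.0], [-1.0, 0.5]]

without_prefix = attention_weights(query, content_keys)
with_prefix = attention_weights(query, prefix_keys + content_keys)

# Attention that lands on the content positions once a prefix is present.
content_part = with_prefix[len(prefix_keys):]
content_mass = sum(content_part)

# Renormalising over the content recovers the prefix-free pattern.
renormalised = [w / content_mass for w in content_part]

for before, after in zip(without_prefix, renormalised):
    print(f"{before:.6f}  {after:.6f}")
```

The two printed columns agree: in this single-layer setting the prefix can take attention mass away from the content, but it cannot reorder or reshape how the remaining mass is shared among the content positions.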

Furthermore, we extend this inquiry beyond attention-based models to fully recurrent architectures, including RNNs, LSTMs and modern State Space Models (SSMs). As these models lack an attention mechanism, we develop a different method for proving their in-context capabilities: a compiler that translates high-level procedural programs into the parameters of recurrent models. Using this framework, we prove that these fully recurrent architectures are also universal in-context approximators, possibly even more efficiently so than the transformer.

Collectively, these results establish that, from a representational standpoint, in-context learning can be as expressive as full model retraining. This work provides a rigorous foundation for understanding the emergent capabilities of large-scale models and shows that, at least in theory, a carefully crafted prompt can go a long way.


Cite this record

APA Style

Petrov, A. (2025). Universal in-context approximation [PhD thesis]. University of Oxford.

MLA Style

Petrov, A. Universal in-Context Approximation. 2025. University of Oxford, PhD thesis.

Chicago Style

Petrov, A. 2025. “Universal in-Context Approximation.” PhD thesis, University of Oxford.

Access Document

Files:: Petrov_2025_Universal_in-context_approximation.pdf

(Dissemination version, pdf, 17.0MB)

Authors

+ Petrov, A

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Contributors

+ Torr, PHS

Role:: Supervisor

+ Bibi, A

Role:: Supervisor

Funding

+ Engineering and Physical Sciences Research Council

Funder identifier:: https://ror.org/0439y7842
Grant:: EP/S024050/1
Programme:: EPSRC Centre for Doctoral Training in Autonomous Intelligent Machines and Systems

DOI:: 10.5287/ora-xqvqkezd7
Type of award:: DPhil
Level of award:: Doctoral
Awarding institution:: University of Oxford

Language:: English
Keywords:: learning theory, state-space models, deep learning, in-context, transformer
Subjects:: Deep learning (Machine learning)
Deposit date:: 2025-12-20
ARK identifier:: ark:/29072/ora_64bd101871294acebed52d83bfdc8c3e

Terms of use

Copyright holder:: Aleksandar Petrov

Licence:: Terms and Conditions of Use for Oxford University Research Archive
