Thesis

Transfer in sequential decision making

Abstract:

Transfer learning is critical for improving the data efficiency and applicability of deep learning models in sequential decision-making. However, determining what knowledge transfers and how to effectively leverage it remains an open challenge. Recent breakthroughs in representation learning, especially in language and vision domains, demonstrate the power of transfer from large-scale datasets. Meanwhile, progress in simulation platforms and environment designs has opened up new possibilities for collecting diverse, realistic training data. Against this backdrop, the four works contained in this thesis explore transfer techniques in various aspects of sequential decision-making.

First, we provide a comprehensive survey of prior work on integrating natural language data and representations into sequential decision-making. Our survey reveals open challenges and charts promising research directions, advocating greater use of large language models and the development of more semantically complex environments. Second, we propose and study a modular architecture design for compositional generalization in multi-modal, multi-task settings. Controlled experiments demonstrate zero-shot transfer to held-out compositions of observation, action, and instruction spaces, as well as efficient integration of new observation modalities. Third, we propose a method for directing unsupervised skill discovery toward more useful behaviors by transferring knowledge about value-relevant state features from source tasks. Experiments in continuous control domains show that our method yields superior coverage of the relevant dimensions of the state space and improved performance on downstream tasks. Finally, our analysis of meta-gradients in non-stationary environments demonstrates that learning optimizers as functions of contextual features enables faster adaptation and increased lifetime performance.

Overall, the thesis offers novel insights and strategies for effective knowledge transfer in sequential decision-making. The works illustrate the benefits of incorporating language, targeted inductive biases, modest supervision, and meta-learned adaptation.

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Author

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Supervisor
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Role:
Examiner
Role:
Examiner


Funding agency for:
Luketina, J
Grant:
OUCS/JL/1126383
Programme:
Oxford-DeepMind Graduate Scholarship


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford

