Preprint

Decode-gLM: tools to interpret, audit, and steer genomic language models

Abstract:

While genomic language models are enabling the de novo design of entire genomes, they remain challenging to interpret, limiting their trustworthiness. Here, we show that sparse autoencoders (SAEs) trained on Nucleotide Transformer activations decompose hidden representations into interpretable biological features without supervision. Across layers and model sizes, SAEs identified over 100 diverse functional annotations encoded in the model’s activations. These included viral regulatory elements such as the CMV enhancer, despite viral genomes being excluded from the training data. Tracing this signal revealed contamination in reference databases, demonstrating that interpretability methods can audit training data and identify hidden data leakage. We then show that Meta-SAEs, trained on the decoder weights of another SAE, can identify conceptual hierarchies encoded in the model, including a more abstract feature related to multiple HIV annotations. By probing a randomly initialised model, we confirmed that the features identified by our SAEs were learned during pretraining. Finally, we demonstrate that our SAEs allow us to steer model predictions in biologically meaningful ways: using an antibiotic-resistance SAE feature, we steer the model toward the A1408G aminoglycoside-resistance mutation in the 16S rRNA gene. Together, these results establish SAEs as a method for both discovery and auditing, providing a toolkit for interpretable and trustworthy genomic foundation models. Readers can explore our findings at https://interpretglm.netlify.app/.
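
As a rough illustration of the two operations the abstract relies on, the sketch below shows a generic ReLU sparse autoencoder applied to a batch of activations, and steering by adding a feature's decoder direction back into those activations. It is a minimal sketch under stated assumptions, not the authors' implementation: this record does not specify the SAE architecture, so the function names (`sae_encode`, `sae_decode`, `steer`), the toy dimensions, the random untrained weights, and the choice to unit-normalise the steering direction are all hypothetical.

```python
import numpy as np


def sae_encode(x, W_enc, b_enc):
    """Map model activations to non-negative SAE feature activations (ReLU encoder)."""
    return np.maximum(0.0, x @ W_enc + b_enc)


def sae_decode(f, W_dec, b_dec):
    """Reconstruct model activations as a weighted sum of decoder directions."""
    return f @ W_dec + b_dec


def steer(x, W_dec, feature_idx, strength):
    """Shift activations along the unit-normalised decoder direction of one feature."""
    direction = W_dec[feature_idx] / np.linalg.norm(W_dec[feature_idx])
    return x + strength * direction


# Toy dimensions and random weights (hypothetical; a real SAE would be trained
# with a reconstruction loss plus a sparsity penalty on the features).
rng = np.random.default_rng(0)
d_model, n_features = 8, 32
W_enc = rng.normal(size=(d_model, n_features))
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(n_features, d_model))
b_dec = np.zeros(d_model)

x = rng.normal(size=(4, d_model))            # stand-in for 4 token activations
features = sae_encode(x, W_enc, b_enc)       # shape (4, 32), all entries >= 0
recon = sae_decode(features, W_dec, b_dec)   # shape (4, 8)
steered = steer(x, W_dec, feature_idx=3, strength=2.0)
```

In this framing, each row of `W_dec` is the direction a feature writes into the model's activation space. A Meta-SAE, as described in the abstract, would take those rows as its own training inputs, and steering amounts to moving the activations a chosen distance along one such row.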

Publication status:
Published
Peer review status:
Not peer reviewed

Access Document

Preprint server copy:
10.1101/2025.10.31.685860

Authors

More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Biology
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Oxford college:
Green Templeton College
Role:
Author
ORCID:
0000-0003-1731-8405


More from this funder
Funder identifier:
https://ror.org/0439y7842
Grant:
EP/S024093/1


Preprint server:
bioRxiv
Publication date:
2025-11-03
Acceptance date:
2025-11-03
DOI:
Server owner:
Cold Spring Harbor Laboratory


Language:
English
Pubs id:
2336884
UUID:
uuid_62a5fd92-5bb6-48ca-89bf-f70c0325778e
Local pid:
pubs:2336884
Deposit date:
2025-11-28
ARK identifier:
