Journal article

ANARCII enables alignment-free antigen receptor numbering using a generalised language model

Abstract:
Antigen receptor numbering allows the antigen-binding regions of antibodies and T cell receptors to be delineated from sequence alone. Numbering is currently achieved by aligning to a reference set. This approach may produce different numbering depending on the reference set used, or fail on sequences from rare species or formats. We present a method, ANARCII, which requires no alignment step and is based on a Seq2Seq language model. ANARCII improves upon existing methods through more consistent numbering of key regions, robustness to truncations, generalisation to unseen species, and easier user installation. The lightweight architecture allows numbering of 90,000 sequences per minute on a high-end GPU. The software is available as a web app (https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred/anarcii/) and as a package (https://github.com/oxpig/ANARCII). Ultimately, ANARCII allows numbering of more antibody-like sequences, with better recovery of full-length regions from existing databases, and enables comparative analysis of new receptors not numbered by existing tools.
Publication status:
Published
Peer review status:
Peer reviewed

Access Document

Publisher copy:
10.1038/s42003-026-10186-z

Authors

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-8740-9823

Institution:
University of Oxford
Role:
Author

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0003-2500-0173

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0001-9544-0890

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-8259-9111


Publisher:
Nature Research
Journal:
Communications Biology
Publication date:
2026-05-21
Acceptance date:
2026-04-23
DOI:
10.1038/s42003-026-10186-z
EISSN:
2399-3642
ISSN:
2399-3642


Language:
English
Keywords:
Pubs id:
2422363
Local pid:
pubs:2422363
Source identifiers:
W7161762559
Deposit date:
2026-05-29
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
