Journal article

Exploring the effectiveness of instruction tuning in biomedical language processing

Abstract:: Large Language Models (LLMs), particularly those similar to ChatGPT, have significantly influenced the field of Natural Language Processing (NLP). While these models excel in general language tasks, their performance in domain-specific downstream tasks such as biomedical and clinical Named Entity Recognition (NER), Relation Extraction (RE), and Medical Natural Language Inference (NLI) is still evolving. In this context, our study investigates the potential of instruction tuning for biomedical language processing, applying this technique to two general LLMs of substantial scale. We present a comprehensive, instruction-based model trained on a dataset that consists of approximately 200,000 instruction-focused samples. This dataset represents a carefully curated compilation of existing data, meticulously adapted and reformatted to align with the specific requirements of our instruction-based tasks. This initiative represents an important step in utilising such models to achieve results on par with specialised encoder-only models like BioBERT and BioClinicalBERT for various classical biomedical NLP tasks. Our work includes an analysis of the dataset's composition and its impact on model performance, providing insights into the intricacies of instruction tuning. By sharing our codes, models, and the distinctively assembled instruction-based dataset, we seek to encourage ongoing research and development in this area.

Publication status:: Published

Peer review status:: Peer reviewed


Cite this record

APA Style

Rohanian, O., Nouriborji, M., Kouchaki, S., Nooralahzadeh, F., Clifton, L., & Clifton, D. A. (2024). Exploring the effectiveness of instruction tuning in biomedical language processing. Artificial Intelligence in Medicine, 158.

MLA Style

Rohanian, O, et al. “Exploring the Effectiveness of Instruction Tuning in Biomedical Language Processing.” Artificial Intelligence in Medicine, vol. 158, 2024.

Chicago Style

Rohanian, O, M Nouriborji, S Kouchaki, F Nooralahzadeh, L Clifton, and DA Clifton. 2024. “Exploring the Effectiveness of Instruction Tuning in Biomedical Language Processing.” Artificial Intelligence in Medicine 158.

Access Document

Files:: Rohanian_et_al_2024_Exploring_the_effectiveness.pdf

(Version of record, pdf, 1.6MB)

Publisher copy:: 10.1016/j.artmed.2024.103007

Authors

Rohanian, O

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author
ORCID:: 0000-0002-8771-8386

Nouriborji, M

Role:: Author

Kouchaki, S

Role:: Author

Nooralahzadeh, F

Role:: Author

Clifton, L

Institution:: University of Oxford
Division:: MSD
Department:: Primary Care Health Sciences
Role:: Author
ORCID:: 0000-0001-5595-8468

More authors...

Medical Research Council

Funder identifier:: https://ror.org/03x94j517

Publisher:: Elsevier
Journal:: Artificial Intelligence in Medicine
Volume:: 158
Article number:: 103007
Publication date:: 2024-11-07
Acceptance date:: 2024-10-23
DOI:: 10.1016/j.artmed.2024.103007
EISSN:: 1873-2860
ISSN:: 0933-3657
Pmid:: 39541861

Language:: English
Keywords:: instruction tuning

biomedical NLP

named entity recognition

relation extraction

medical NLI

Llama2-MedTuned
Pubs id:: 2063140
Local pid:: pubs:2063140
Deposit date:: 2025-01-06
ARK identifier:: ark:/29072/ora_47dc20ab39364de1b0bdd02609f5f3e2

Terms of use

Copyright holder:: Rohanian et al
Copyright date:: 2024
Rights statement:: © 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Licence:: CC Attribution (CC BY)
