More than 17,000 tree species are at risk from rapid global change

Boonman, CCF; Serra-Diaz, JM; Hoeks, S; Guo, W-Y; Enquist, BJ; Maitner, B; Malhi, Y; Merow, C; Buitenwerf, R; Svenning, J-C

Journal article

Abstract:: The vast volume of currently available unstructured text data, such as research papers, news, and technical reports, shows great potential for ecological research. However, manual processing of such data is labour-intensive, posing a significant challenge. In this study, we aimed to assess the application of three state-of-the-art prompt-based large language models (LLMs), GPT-3.5, GPT-4, and LLaMA-2-70B, to automate the identification, interpretation, extraction, and structuring of relevant ecological information from unstructured textual sources. We focused on species distribution data from two sources: news outlets and research papers. We assessed the LLMs on four key tasks: classification of documents with species distribution data, identification of regions where species are recorded, generation of geographical coordinates for these regions, and supply of results in a structured format. GPT-4 consistently outperformed the other models, demonstrating a high capacity to interpret textual data and extract relevant information, with the percentage of correct outputs often exceeding 90% (average accuracy across tasks: 87–100%). Its performance also depended on the data source type and task, with better results achieved with news reports, in the identification of regions with species reports, and in the presentation of structured output. Its predecessor, GPT-3.5, exhibited slightly lower accuracy across all tasks and data sources (average accuracy across tasks: 81–97%), whereas LLaMA-2-70B showed the worst performance (37–73%). These results demonstrate the potential benefit of integrating prompt-based LLMs into ecological data assimilation workflows as essential tools to efficiently process large volumes of textual data.

Funding information:: AC was supported by a grant (PRT/BD/152100/2021) financed by the Portuguese Foundation for Science and Technology (FCT) under the MIT Portugal Program. AC and CC acknowledge support from FCT through support to the CEG/IGOT Research Unit (UIDB/00295/2020 and UIDP/00295/2020). JP was funded by FCT through funds to GHTM (UID/04413/2020). LR was funded through the FCT contract ‘CEECIND/00445/2017’ under the ‘Stimulus of Scientific Employment - Individual Support’ and by the FCT ‘UNRAVEL’ project (PTDC/BIA-ECO/0207/2020; https://doi.org/10.54499/PTDC/BIA-ECO/0207/2020). PP acknowledges support from the Czech Science Foundation (project no. 23-07278S).

Publisher copyright:: © 2024 The Authors

Publication status:: Published

Peer review status:: Peer reviewed


Cite this record

APA Style

Boonman, C. C. F., Serra-Diaz, J. M., Hoeks, S., Guo, W.-Y., Enquist, B. J., Maitner, B., Malhi, Y., Merow, C., Buitenwerf, R., & Svenning, J.-C. (2024). More than 17,000 tree species are at risk from rapid global change. Nature Communications, 15(1), 166–166.

MLA Style

Boonman, CCF, et al. “More than 17,000 Tree Species Are at Risk from Rapid Global Change.” Nature Communications, vol. 15, no. 1, 2024, pp. 166–66.

Chicago Style

Boonman, CCF, JM Serra-Diaz, S Hoeks, et al. 2024. “More than 17,000 Tree Species Are at Risk from Rapid Global Change.” Nature Communications 15 (1): 166–66.

Access Document

Files:: Boonman_et_al_2024_More_than_17000.pdf

(Version of record, pdf, 9.5MB)

Publisher copy:: 10.1038/s41467-023-44321-9

Publication website:: https://run.unl.pt/bitstream/10362/172961/1/Large_language_models_overcome_the_challenges_of_unstructured_text_data_in_ecology.pdf

Authors

Boonman, CCF

Role:: Author
ORCID:: 0000-0003-2417-1579

Serra-Diaz, JM

Role:: Author
ORCID:: 0000-0003-1988-1154

Hoeks, S

Role:: Author
ORCID:: 0000-0001-5619-3233

Guo, W-Y

Role:: Author

Enquist, BJ

Role:: Author
ORCID:: 0000-0002-6124-7096

Funding

Danmarks Grundforskningsfond

Funder identifier:: 10.13039/501100001732
Grant:: DNRF173

NSF | National Science Board

Funder identifier:: 10.13039/100005716
Grant:: 2225076

Agence Nationale de la Recherche

Funder identifier:: 10.13039/501100001665
Grant:: ANR-21-CE32-0003

Leverhulme Trust

Funder identifier:: 10.13039/501100000275

Jackson Foundation

Funder identifier:: 10.13039/100002158

Publisher:: Nature Research
Journal:: Nature Communications
Volume:: 15
Issue:: 1
Pages:: 166-166
Article number:: 166
Publication date:: 2024-01-02
DOI:: 10.1038/s41467-023-44321-9
EISSN:: 2041-1723
ISSN:: 2041-1723

Language:: English
Keywords:: Biology; Combinatorics; Tree (set theory); Ecology; Mathematics; Climate change
Pubs id:: 1595481
Local pid:: pubs:1595481
Source identifiers:: W4390511178
Deposit date:: 2026-06-04
ARK identifier:: ark:/29072/ora_aaacd5e605284a92a1901e28280e8d5a

Terms of use

Licence:: CC Attribution (CC BY)
