Journal article
Derivational morphology reveals analogical generalization in large language models
- Abstract:
- What mechanisms underlie linguistic generalization in large language models (LLMs)? This question has attracted considerable attention, with most studies analyzing the extent to which the language skills of LLMs resemble rules. As of yet, it is not known whether linguistic generalization in LLMs could equally well be explained as the result of analogy. A key shortcoming of prior research is its focus on regular linguistic phenomena, for which rule-based and analogical approaches make the same predictions. Here, we instead examine derivational morphology, specifically English adjective nominalization, which displays notable variability. We introduce a method for investigating linguistic generalization in LLMs: Focusing on GPT-J, we fit cognitive models that instantiate rule-based and analogical learning to the LLM training data and compare their predictions on a set of nonce adjectives with those of the LLM, allowing us to draw direct conclusions regarding underlying mechanisms. As expected, rule-based and analogical models explain the predictions of GPT-J equally well for adjectives with regular nominalization patterns. However, for adjectives with variable nominalization patterns, the analogical model provides a much better match. Furthermore, GPT-J's behavior is sensitive to the individual word frequencies, even for regular forms, a behavior that is consistent with an analogical account but not a rule-based one. These findings refute the hypothesis that GPT-J's linguistic generalization on adjective nominalization involves rules, suggesting analogy as the underlying mechanism. Overall, our study suggests that analogical processes play a bigger role in the linguistic generalization of LLMs than previously thought.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher copy:
- 10.1073/pnas.2423232122
Authors
+ European Research Council
- Funder identifier:
- https://ror.org/0472cxd90
- Grant:
- 740516
+ Engineering and Physical Sciences Research Council
- Funder identifier:
- https://ror.org/0439y7842
- Grant:
- EP/T023333/1
- Publisher:
- National Academy of Sciences
- Journal:
- Proceedings of the National Academy of Sciences
- Volume:
- 122
- Issue:
- 19
- Article number:
- e2423232122
- Publication date:
- 2025-05-09
- Acceptance date:
- 2025-02-18
- DOI:
- 10.1073/pnas.2423232122
- EISSN:
- 1091-6490
- ISSN:
- 0027-8424
- Language:
- English
- Pubs id:
- 2124012
- Local pid:
- pubs:2124012
- Deposit date:
- 2025-05-15
Terms of use
- Copyright holder:
- Hofmann et al.
- Copyright date:
- 2025
- Rights statement:
- © 2025 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution License 4.0 (CC BY).
- Licence:
- CC Attribution (CC BY)