Journal article
Foundational challenges in assuring alignment and safety of large language models
- Abstract:
- This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Version of record (pdf, 1.7MB)
- Publication website:
- https://openreview.net/forum?id=oVTkOs8Pka
Authors
- Publisher:
- Journal of Machine Learning Research
- Journal:
- Transactions on Machine Learning Research
- Volume:
- 2024
- Publication date:
- 2024-09-17
- Acceptance date:
- 2024-09-02
- EISSN:
- 2835-8856
- Language:
- English
- Pubs id:
- 2102156
- Local pid:
- pubs:2102156
- Deposit date:
- 2025-04-09
Terms of use
- Copyright holder:
- Anwar et al.
- Copyright date:
- 2025
- Rights statement:
- © 2025 The Authors. This paper is an open access article distributed under the terms of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/)
- Licence:
- CC Attribution (CC BY)