Journal article

Taxonomizing synthetic data for law

Abstract:
Synthetic data is increasingly important in data usage and AI design, creating novel legal and policy dilemmas. All too often, discussions of synthetic data treat it as entirely distinct from “real,” collected data, overlooking the risks posed by different kinds and uses of synthetic data. This piece comments on work by Michal Gal and Orla Lynskey, which persuasively argues that synthetic data will transform information privacy, market competition, and data quality. While the risks posed by synthetic data depend on its connection to collected data, we argue that the background knowledge and ground-truth assumptions used to create it are at least as important. We bring that focus to Gal and Lynskey’s taxonomy of synthetic data, arguing that it is essential to grasping synthetic data’s legal and policy implications. Accordingly, we divide synthetic data into (1) transformed data, which modifies collected data to preserve certain statistical properties for an end use; (2) augmented data, which relies on assumptions to bolster a collected dataset’s fidelity to the ground truth; and (3) simulated data, which relies almost entirely on background knowledge and ground-truth assumptions. As policymakers weigh whether to incentivize, mandate, or discourage the use of synthetic data, they should consider the validity of the ground-truth assumptions used in producing that data.
Publication status:
Published
Peer review status:
Peer reviewed

Authors


Institution:
University of Oxford
Division:
SSD
Department:
Law
Oxford college:
Reuben College
Role:
Author
ORCID:
0000-0001-9067-7231


Publisher:
University of Iowa College of Law
Journal:
Iowa Law Review
Volume:
110
Article number:
217
Publication date:
2025-10-21
Acceptance date:
2025-10-06
EISSN:
1556-5068


Language:
English
Pubs id:
2302252
Local pid:
pubs:2302252
Deposit date:
2025-10-27
