Journal article

A metrics-based look at disk images: insights and applications

Abstract:
There is currently no systematic method for evaluating digital forensic datasets. This makes it difficult to judge their suitability for specific use cases in digital forensic education and training. Additionally, there is limited comparability in the quality of synthetic datasets or the strengths and weaknesses of different data synthesis approaches. In this paper, we propose the concept of a quantitative, metrics-based assessment of forensic datasets as a first step toward a systematic evaluation approach. As a concrete implementation of this approach, we introduce Mass Disk Processor, a tool that automates the collection of metrics from large sets of disk images. It enables a privacy-preserving retrieval of high-level disk image characteristics, facilitating the assessment of not only synthetic but also real-world disk images. We demonstrate two applications of our tool. First, we create a comprehensive datasheet for publicly available, scenario-based synthetic disk images. Second, we propose a formal definition of synthetic data realism that compares properties of synthetic data to properties of real-world data and present results from an examination of the realism of current scenario-based disk images.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1016/j.fsidi.2025.301874

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Author
ORCID:
0009-0008-1088-8302


Funder identifier:
https://ror.org/018mejw64
Grant:
393 541 319/ GRK2475/2-2024
Programme:
Cybercrime and Forensic Computing


Publisher:
Elsevier
Journal:
Forensic Science International: Digital Investigation
Volume:
52
Issue:
Supplement
Article number:
301874
Publication date:
2025-03-24
DOI:
10.1016/j.fsidi.2025.301874
EISSN:
2666-2817
ISSN:
2666-2825


Language:
English
Pubs id:
2118786
Local pid:
pubs:2118786
Deposit date:
2025-06-12