Journal article
A metrics-based look at disk images: insights and applications
- Abstract:
- There is currently no systematic method for evaluating digital forensic datasets. This makes it difficult to judge their suitability for specific use cases in digital forensic education and training. Additionally, there is limited comparability in the quality of synthetic datasets or the strengths and weaknesses of different data synthesis approaches. In this paper, we propose the concept of a quantitative, metrics-based assessment of forensic datasets as a first step toward a systematic evaluation approach. As a concrete implementation of this approach, we introduce Mass Disk Processor, a tool that automates the collection of metrics from large sets of disk images. It enables privacy-preserving retrieval of high-level disk image characteristics, facilitating the assessment of not only synthetic but also real-world disk images. We demonstrate two applications of our tool. First, we create a comprehensive datasheet for publicly available, scenario-based synthetic disk images. Second, we propose a formal definition of synthetic data realism that compares properties of synthetic data to properties of real-world data, and we present results from an examination of the realism of current scenario-based disk images.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher copy:
- 10.1016/j.fsidi.2025.301874
Authors
- Funder:
- Deutsche Forschungsgemeinschaft
- Funder identifier:
- https://ror.org/018mejw64
- Grant:
- 393 541 319/ GRK2475/2-2024
- Programme:
- Cybercrime and Forensic Computing
- Publisher:
- Elsevier
- Journal:
- Forensic Science International: Digital Investigation
- Volume:
- 52
- Issue:
- Supplement
- Article number:
- 301874
- Publication date:
- 2025-03-24
- DOI:
- 10.1016/j.fsidi.2025.301874
- EISSN:
- 2666-2817
- ISSN:
- 2666-2825
- Language:
- English
- Keywords:
- Pubs id:
- 2118786
- Local pid:
- pubs:2118786
- Deposit date:
- 2025-06-12
- ARK identifier:
Terms of use
- Copyright holder:
- Voigt et al.
- Copyright date:
- 2025
- Rights statement:
- © 2025 The Author(s). Published by Elsevier Ltd on behalf of DFRWS. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).