Journal article
Application of machine learning with MALDI-TOF MS for rapid differentiation between methicillin-susceptible and methicillin-resistant Staphylococcus aureus
- Abstract:
- Background: Combining machine learning with matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry may allow rapid differentiation between methicillin-susceptible (MSSA) and methicillin-resistant Staphylococcus aureus (MRSA) and enable earlier antibiotic use guided by antimicrobial susceptibility testing (AST), but prior studies reported limited model performance. This study aims to apply novel machine learning techniques to a large dataset to create a prediction model with potential for clinical applications. Methods: This study used one of the largest datasets to date: 24,487 Staphylococcus aureus isolates (13,776 MRSA and 10,711 MSSA) collected between January 2021 and May 2024 in Hong Kong. The spectra were randomly divided into an 80:20 training-validation split to develop models of various structures. The top models, namely a large-scale neural network (NN), the LightGBM gradient boosting framework (LGBM), and a weight-averaging ensemble (“ensemble”) of NN and LGBM, underwent prospective testing on 2,975 additional clinical isolates (1,867 MRSA and 1,108 MSSA) and external validation on 1,000 spectra (500 MRSA and 500 MSSA) from Taiwan. Results: The NN, LGBM, and ensemble models all achieved high performance during prospective testing, with accuracy of 0.9284-0.9388 and AUPRC of 0.9843-0.9866. The models were well calibrated, and applying confidence thresholds raised accuracy to 0.9697-0.9777 when 20% of predictions were rejected as low-confidence. External validation yielded accuracy of 0.695-0.723 and AUPRC of 0.8409-0.8765, with an increased number of false negatives. Shapley additive explanations revealed top feature groups consistent with previous studies, but feature importance was found to be geographically specific. Conclusions: We present new machine learning models with high performance in differentiating between MRSA and MSSA. Model performance can be further boosted with confidence thresholds, but the models do not generalize across geographical areas. Clinical applications should use geographically specific models, with fallback to traditional AST methods for low-confidence predictions.
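The weight-averaging ensemble and the confidence-threshold step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal model weights, the use of distance from 0.5 as the confidence score, and the fixed-fraction rejection rule are all assumptions made for illustration, and the probabilities are toy values rather than study data.

```python
import numpy as np


def ensemble_proba(p_nn, p_lgbm, w_nn=0.5):
    """Weighted average of two models' predicted MRSA probabilities.

    w_nn is a hypothetical weight; the abstract does not state the weights used.
    """
    p_nn = np.asarray(p_nn, dtype=float)
    p_lgbm = np.asarray(p_lgbm, dtype=float)
    return w_nn * p_nn + (1.0 - w_nn) * p_lgbm


def predict_with_rejection(p_mrsa, reject_fraction=0.2):
    """Return (labels, accepted) after rejecting the least confident predictions.

    Confidence is assumed to be the distance of the probability from 0.5,
    rescaled to [0, 1]. Rejected cases would fall back to conventional AST.
    """
    p = np.asarray(p_mrsa, dtype=float)
    confidence = np.abs(p - 0.5) * 2.0
    n_reject = int(np.floor(reject_fraction * p.size))
    order = np.argsort(confidence, kind="stable")  # least confident first
    accepted = np.ones(p.size, dtype=bool)
    accepted[order[:n_reject]] = False
    labels = (p >= 0.5).astype(int)  # 1 = MRSA, 0 = MSSA
    return labels, accepted


def accuracy(y_true, labels, mask=None):
    """Accuracy over all cases, or only over the cases selected by mask."""
    y_true = np.asarray(y_true)
    labels = np.asarray(labels)
    if mask is None:
        mask = np.ones(y_true.size, dtype=bool)
    return float(np.mean(labels[mask] == y_true[mask]))


# Toy example (made-up probabilities, not data from the study).
p_nn = [0.9, 0.8, 0.6, 0.3, 0.1, 0.2, 0.6, 0.7, 0.95, 0.05]
p_lgbm = [0.8, 0.9, 0.5, 0.3, 0.2, 0.1, 0.45, 0.9, 0.9, 0.1]
y_true = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]

p_ens = ensemble_proba(p_nn, p_lgbm)
labels, accepted = predict_with_rejection(p_ens, reject_fraction=0.2)
print("accuracy, all predictions:     ", accuracy(y_true, labels))
print("accuracy, accepted predictions:", accuracy(y_true, labels, accepted))
```

In this toy case the two rejected predictions are the two that the ensemble gets wrong, which mirrors the pattern reported in the abstract: accuracy on the accepted predictions is higher than accuracy on all predictions, at the cost of deferring a share of isolates to conventional AST.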
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher copy:
- 10.1371/journal.pcbi.1013760
Authors
- Publisher:
- Public Library of Science
- Journal:
- PLoS Computational Biology
- Volume:
- 22
- Issue:
- 5
- Pages:
- e1013760
- Article number:
- e1013760
- Publication date:
- 2026-05-05
- Acceptance date:
- 2025-11-17
- DOI:
- 10.1371/journal.pcbi.1013760
- EISSN:
- 1553-7358
- ISSN:
- 1553-734X
- Language:
- English
- Keywords:
- Source identifiers:
- 4038145
- Deposit date:
- 2026-05-12
- ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
Terms of use
- Copyright date:
- 2026
- Licence:
- CC Attribution (CC BY)