Analysis of primary care computerized medical records (CMR) data with deep autoencoders (DAE)

Thomas, SA; Smith, NA; Livina, V; Yonova, I; Webb, R; de Lusignan, S

Journal article

Abstract:: Deep learning is becoming increasingly important in the analysis of medical data, for example in pattern recognition for classification. Primary healthcare computerized medical records (CMR) data are vital for predicting infection prevalence across a population and for decision making at a national scale. To date, machine learning algorithms remain under-utilized on CMR data, despite their potential impact in diagnostics or in the prevention of epidemics such as outbreaks of influenza. A particular challenge in epidemiology is how to differentiate incident cases from follow-ups for the same condition. Furthermore, CMR data are typically heterogeneous, noisy, high-dimensional and incomplete, making automated analysis difficult. We introduce a methodology for converting heterogeneous data so that they are compatible with a deep autoencoder for reduction of CMR data. This approach provides a tool for real-time visualization of these high-dimensional data, revealing previously unknown dependencies and clusters. Our unsupervised nonlinear reduction method can be used to identify the features driving the formation of these clusters, which can aid decision making in healthcare applications. The results in this work demonstrate that our methods can cluster more than 97.84% of the data (clusters >5 points), with each cluster uniquely described by three attributes in the data: Clinical System (CMR system), Read Code (as recorded) and Read Term (standardized coding). Further, we propose the use of Shannon Entropy as a means to analyse the dispersion of clusters and the contribution from the underlying attributes, to gain further insight from the data. Our results demonstrate that Shannon Entropy is a useful metric for analysing both the low-dimensional clusters of CMR data and the features in the original heterogeneous data. Finally, we find that the entropy of the low-dimensional clusters is directly representative of the entropy of the input data (Pearson Correlation = 0.99, R² = 0.98), and therefore the reduced data from the deep autoencoder reflect the variability of the original CMR data.
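To make the entropy metric in the abstract concrete, the following is a minimal Python sketch of Shannon entropy over a categorical attribute, together with a plain Pearson correlation. It is an illustration only, not the authors' implementation: the function names, the toy Read Code values and the entropy lists are all invented for the example, and the abstract does not state which logarithm base was used (base 2 is assumed here).

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical example: Read Code values of the records in two clusters.
# A cluster described by a single code has zero entropy; a cluster split
# evenly between two codes has an entropy of one bit.
pure_cluster = ["H27", "H27", "H27", "H27"]
mixed_cluster = ["H27", "H27", "H06", "H06"]
print(shannon_entropy(pure_cluster), shannon_entropy(mixed_cluster))

# Hypothetical per-attribute entropies of the input data and of the
# reduced (low-dimensional) clusters, compared by Pearson correlation.
input_entropy = [0.2, 0.9, 1.5, 2.1]
reduced_entropy = [0.25, 0.8, 1.6, 2.0]
print(pearson(input_entropy, reduced_entropy))
```

A correlation close to 1 between the two entropy lists would indicate, in this toy setting, that the dispersion of the reduced data tracks the dispersion of the input attributes.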

Publication status:: Published

Peer review status:: Peer reviewed

Cite

Cite this record

APA Style

Thomas, S. A., Smith, N. A., Livina, V., Yonova, I., Webb, R., & de Lusignan, S. (2019). Analysis of primary care computerized medical records (CMR) data with deep autoencoders (DAE). Frontiers in Applied Mathematics and Statistics, 5.

MLA Style

Thomas, S. A., et al. “Analysis of Primary Care Computerized Medical Records (CMR) Data with Deep Autoencoders (DAE).” Frontiers in Applied Mathematics and Statistics, vol. 5, Frontiers Media, 2019.

Chicago Style

Thomas, SA, NA Smith, V Livina, I Yonova, R Webb, and S de Lusignan. 2019. “Analysis of Primary Care Computerized Medical Records (CMR) Data with Deep Autoencoders (DAE).” Frontiers in Applied Mathematics and Statistics 5.
Access Document

Files:: fams-05-00042.pdf

(Preview, Version of record, 3.9MB, Terms of use)

Publisher copy:: 10.3389/fams.2019.00042

Authors

Thomas, SA

Role:: Author

Smith, NA

Role:: Author

Livina, V

Role:: Author

Yonova, I

Role:: Author

Webb, R

Role:: Author

de Lusignan, S

Role:: Author

Publisher:: Frontiers Media
Journal:: Frontiers in Applied Mathematics and Statistics
Volume:: 5
Article number:: 42
Publication date:: 2019-08-06
Acceptance date:: 2019-07-23
DOI:: 10.3389/fams.2019.00042
EISSN:: 2297-4687

Language:: English
Keywords:: dimensionality reduction, primary healthcare, FFR, deep learning, computerized medical records, visualization, heterogeneous data
Pubs id:: 1084431
Local pid:: pubs:1084431
Deposit date:: 2020-08-17

Terms of use

Copyright holder:: Thomas, SA et al.
Rights statement:: © 2019 Thomas, Smith, Livina, Yonova, Webb and de Lusignan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Licence:: CC Attribution (CC BY)
