Deep learning sonographer visual attention

Cai, Y

Abstract:: Current automated fetal ultrasound (US) analysis methods are heavily influenced by the recent success of deep learning in computer vision tasks. Models built on convolutional neural networks (CNNs) for fetal biometry plane detection have surpassed classic models built on hand-crafted features, but training such networks requires large datasets, and in particular sonographer annotations, which are not normally available in US image analysis. Meanwhile, sonographer visual attention has proven to be a strong prior for human interpretation of US video frames. This thesis attempts to utilize sonographer visual attention, in the form of gaze-tracking data, in deep learning frameworks to assist US image analysis tasks.

We created a single-sweep dataset of fetal abdominal videos with retrospective gaze-tracking, then implemented deep learning frameworks that utilize gaze-tracking data to assist fetal biometry plane detection. We first developed a CNN called SonoEyeNet for standardized abdominal circumference plane (ACP) detection informed by sonographer visual attention. We demonstrate that, with the assistance of human visual attention information, ACP detection performance improves relative to models that do not use gaze information.

We extended this framework by proposing a novel multi-task CNN called Multi-task SonoEyeNet (MSEN) that learns to generate clinically relevant spatial visual attention maps using sonographer gaze-tracking data, and used the predicted visual attention maps to assist ACP detection. This framework expands the potential clinical usefulness of the previous framework by eliminating the requirement for input gaze-tracking data during inference without compromising its ACP detection performance.

With the availability of a novel dataset containing real-time screen recordings of US anomaly scans coupled with simultaneous gaze-tracking, we further extended the CNN framework by introducing a bi-directional convolutional long short-term memory (LSTM) network as a recurrent module to model spatio-temporal visual attention and to detect the standard biometry planes of the fetal abdomen (ACP), head (HCP) and femur (FLP). We demonstrate that modeling spatio-temporal visual attention further improves standard biometry plane detection performance.

This work constitutes the first demonstration that learning sonographer visual attention in an ultrasound video in a deep learning framework is an efficient method to assist other US image analysis tasks.


Cite this record

APA Style

Cai, Y. (2019). Deep learning sonographer visual attention [PhD thesis]. University of Oxford.

MLA Style

Cai, Y. Deep Learning Sonographer Visual Attention. 2019. University of Oxford, PhD thesis.

Chicago Style

Cai, Y. 2019. “Deep Learning Sonographer Visual Attention.” PhD thesis, University of Oxford.

Access Document

Files:: Cai_2019_Deep_learning_sonographer.pdf

(Preview, Dissemination version, pdf, 32.3MB, Terms of use)

Cai_2019_Supplementary_materials.zip

(Supplementary materials, zip, 251.7MB, Terms of use)

Authors

Cai, Y

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Contributors

Noble, J

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Supervisor
ORCID:: 0000-0002-3060-3772

DOI:: 10.5287/ora-zbvrrer0e
Type of award:: DPhil
Level of award:: Doctoral
Awarding institution:: University of Oxford

Language:: English
Keywords:: machine learning, deep learning, gaze tracking
Subjects:: Machine learning, Deep learning (Machine learning)
Deposit date:: 2026-04-28
ARK identifier:: ark:/29072/ora_2d8a85cc244547eeb821833288fd12ca

Terms of use

Copyright holder:: Yifan Cai

Licence:: Terms and Conditions of Use for Oxford University Research Archive
