Thesis

Barriers to generalisation in acoustic hive monitoring

Abstract:
In industrial apiculture, honeybee colonies are exposed to a range of interacting stressors that affect their wellbeing and productivity. Machine learning (ML) promises to detect colony condition from sensor data, helping to optimise pollination performance and honey yields while maintaining colony health. However, these methods have rarely been tested robustly, which limits real-world application. In this work, I develop best practices for colony state prediction and attempt to characterise the major barriers to cross-hive generalisation.

In Chapter 2, I develop a pipeline to extract handcrafted and deep-learned features from honeybee audio samples and train neural networks for binary classification. Using a dataset of 10 queenless and queenright hives, I test the effect of different feature extraction methods, sensor positions, training set diversity, and hive ID on prediction accuracy. Crucially, I demonstrate how test set choice can artificially inflate model performance and discuss best practices for model testing.

In Chapter 3, I investigate how different hive metrics shape the in-hive soundscape using UMAP embeddings. I then examine the prediction time series of individual models, testing different methods for increasing prediction fidelity. Finally, I use SHAP analysis to identify which audio features contribute to detecting queenlessness across hives and relate these features to existing knowledge of honeybee vibrational communication.

In Chapter 4, I present a dataset of 25 colonies undergoing varying levels of pollen starvation in combination with chalkbrood infection and propose an intuitive visualisation of soundscape differences over time. I analyse the effect of periods of pollen dearth on brood production and test the ability of neural networks to predict nutritional state from a multi-modal set of audio, temperature, and humidity features.

Overall, I identify an overwhelming lack of robust testing in the literature, leading to many overstated claims. I demonstrate a strong confounding effect of hive ID and encounter difficulties with cross-hive generalisation in my own models, the extent of which depends on the dataset and prediction task. Large multi-hive, multi-context datasets will be required to one day achieve generalisation across apiaries, seasons, and climates.

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Biology
Research group:
Oxford Bee Lab
Role:
Author
ORCID:
0000-0002-0150-8389

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Biology
Research group:
Oxford Bee Lab
Oxford college:
Jesus College
Role:
Supervisor
ORCID:
0000-0002-2749-021X
Institution:
University of Oxford
Division:
MPLS
Department:
Biology
Research group:
Animal Vibration Lab
Oxford college:
Hertford College
Role:
Supervisor
ORCID:
0000-0002-7230-3647
Role:
Supervisor
Role:
Supervisor
Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Oxford college:
Kellogg College
Role:
Examiner
ORCID:
0000-0001-5716-3941


Funder identifier:
https://ror.org/00cwqg982
Programme:
Oxford BBSRC DTP


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford
