Thesis
Barriers to generalisation in acoustic hive monitoring
- Abstract:
In industrial apiculture, honeybee colonies are exposed to a range of interacting stressors that affect their wellbeing and productivity. Machine Learning (ML) promises to detect colony condition from sensor data, helping to optimise pollination performance and honey yields while maintaining colony health. However, robust testing of these methods is lacking, which limits real-world applications. In this work, I develop best practices for colony state prediction and attempt to characterise the major barriers to cross-hive generalisation.
In Chapter 2, I develop a pipeline to extract handcrafted and deep-learned features from honeybee audio samples and train neural networks for binary classification. Using a dataset of 10 queenless and queenright hives, I test the effect of different feature extraction methods, sensor positions, training set diversity, and hive ID on prediction accuracy. Crucially, I demonstrate how test set choice can artificially inflate model performance and discuss best practices for model testing.
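The point about test set choice can be illustrated with a hive-grouped split, in which every sample from one hive is held out together so that no hive contributes to both training and testing. The following is a minimal sketch under stated assumptions, not the pipeline used in the thesis: the function name `leave_one_hive_out` and the toy hive labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_hive_out(
    hive_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_hive, train_indices, test_indices), one fold per hive."""
    # Group sample indices by the hive they were recorded in.
    by_hive: Dict[str, List[int]] = defaultdict(list)
    for idx, hive in enumerate(hive_ids):
        by_hive[hive].append(idx)

    # Hold out each hive in turn; all of its samples go to the test set.
    for held_out in sorted(by_hive):
        test_idx = by_hive[held_out]
        train_idx = [i for i, h in enumerate(hive_ids) if h != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: six audio samples recorded in three hives.
sample_hives = ["hive_a", "hive_a", "hive_b", "hive_b", "hive_c", "hive_c"]

folds = list(leave_one_hive_out(sample_hives))
for held_out, train_idx, test_idx in folds:
    train_hives = {sample_hives[i] for i in train_idx}
    # The held-out hive never contributes training samples.
    assert held_out not in train_hives
```

By contrast, a random split at the sample level would usually place recordings from the same hive on both sides, so a model could score well by recognising the hive rather than the colony state.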
In Chapter 3, I investigate how different hive metrics shape the in-hive soundscape using UMAP embeddings. I then examine the prediction time series of individual models, testing different methods for increasing prediction fidelity. Finally, I use SHAP analysis to identify which audio features contribute to detecting queenlessness across hives and relate these features to existing knowledge of honeybee vibrational communication.
In Chapter 4, I present a dataset of 25 colonies undergoing varying levels of pollen starvation in combination with chalkbrood infection and propose an intuitive visualisation of soundscape differences over time. I analyse the effect of periods of pollen dearth on brood production and test the ability of neural networks to predict nutritional state from a multi-modal set of audio, temperature, and humidity features.
Overall, I identify an overwhelming lack of robust testing in the literature, leading to many overstated claims. I demonstrate a strong confounding effect of hive ID and encounter difficulties with cross-hive generalisation in my own models, the extent of which depends on the dataset and prediction task. Large multi-hive, multi-context datasets will be required to achieve generalisation across apiaries, seasons, and climates.
Access Document
- Files:
- Dissemination version (pdf, 246.7MB)
Authors
Contributors
+ Wright, G
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Biology
- Research group:
- Oxford Bee Lab
- Oxford college:
- Jesus College
- Role:
- Supervisor
- ORCID:
- 0000-0002-2749-021X
+ Mortimer, E
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Biology
- Research group:
- Animal Vibration Lab
- Oxford college:
- Hertford College
- Role:
- Supervisor
- ORCID:
- 0000-0002-7230-3647
+ Evans, H
- Role:
- Supervisor
+ Evans, S
- Role:
- Supervisor
+ Markham, A
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Computer Science
- Oxford college:
- Kellogg College
- Role:
- Examiner
- ORCID:
- 0000-0001-5716-3941
+ Biotechnology and Biological Sciences Research Council
- Funder identifier:
- https://ror.org/00cwqg982
- Programme:
- Oxford BBSRC DTP
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Deposit date:
- 2026-04-28
Terms of use
- Copyright holder:
- Stella Marie Felsinger
- Copyright date:
- 2025