Towards trustworthy machine learning models in vision, physics, and language applications

Girbal Eiras, F

Thesis

Abstract: The wide-ranging impact of machine learning, particularly deep neural networks, cannot be overstated. These highly capable models are now deployed in critical domains such as autonomous driving, medical diagnosis, finance, and manufacturing. While their adoption is driven by superior performance on benchmark tasks, their data-driven nature often renders them unpredictable when encountering non-standard inputs. This unpredictability poses a significant challenge in safety-critical applications, where interactions with humans or the potential for system failures could lead to severe consequences. This underscores the need for trustworthy machine learning, where models must not only excel in standard metrics but also prove to be reliable and robust in real-world settings.

Our work addresses this need by improving methods that either certify the robustness of these systems or, at a minimum, provide strong empirical evaluations of robustness and safety to support responsible deployment. We present advancements in probabilistic certification for image classification via randomized smoothing; introduce a general framework for verifying the partial derivatives of neural networks, with applications in certifying the correctness of physics-informed neural networks; and analyse the safety risks involved in fine-tuning large language models on task-specific data, along with mitigation strategies. Additionally, we explore the broader implications of open-source generative AI models for improving trustworthiness. These contributions mark a step forward in developing trustworthy machine learning systems, and we conclude by discussing their strengths and limitations, as well as key open questions that remain for the field.

Cite this record

APA Style

Girbal Eiras, F. (2024). Towards trustworthy machine learning models in vision, physics, and language applications [PhD thesis]. University of Oxford.

MLA Style

Girbal Eiras, F. Towards Trustworthy Machine Learning Models in Vision, Physics, and Language Applications. University of Oxford, 2024.

Chicago Style

Girbal Eiras, F. 2024. “Towards Trustworthy Machine Learning Models in Vision, Physics, and Language Applications.” PhD thesis, University of Oxford.