Thesis

Crystallization properties of molecular materials: prediction and rule extraction by machine learning

Abstract:

Crystallization is an increasingly important process in applications ranging from drug development to structure determination by single-crystal X-ray diffraction. However, although there is considerable research into the prediction of molecular crystal structure, the factors that make a molecule crystallizable have so far remained poorly understood.

The aim of this project was to answer the seemingly straightforward question: can we predict how easily a molecule will crystallize? The Cambridge Structural Database contains almost a million examples of materials from the scientific literature that have crystallized. Models for predicting the crystallization propensity of organic molecular materials were developed by training machine learning algorithms on carefully curated sets of molecules that are either observed or not observed to crystallize, extracted from a database of commercially available molecules. The models were validated computationally and experimentally, while feature extraction methods and high-resolution powder diffraction studies were used to understand the molecular and structural features that determine the ease of crystallization. This led to the development of a new molecular descriptor that encodes information about the conformational flexibility of a molecule.

The best models gave error rates of less than 5% for both cross-validation data and previously unseen test data, demonstrating that crystallization propensity can be predicted with a high degree of accuracy. Molecular size, flexibility and nitrogen atom environments were found to be the most influential factors in determining the ease of crystallization, while microstructural features determined by powder diffraction showed almost no correlation with the model predictions. Further predictions on co-crystals show scope for extending the methodology to other relevant applications.
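
The two error rates quoted above (cross-validation and previously unseen test data) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two descriptors, the synthetic data, the arbitrary class centres, and the nearest-centroid classifier, which is only a simple stand-in and not the models or descriptors used in the thesis.

```python
import random


def nearest_centroid_fit(X, y):
    """Return the mean descriptor vector (centroid) of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def error_rate(centroids, X, y):
    """Fraction of samples whose predicted class differs from the true class."""
    wrong = sum(nearest_centroid_predict(centroids, r) != t for r, t in zip(X, y))
    return wrong / len(y)


def cross_validated_error(X, y, k=5):
    """Mean error over k interleaved folds; each fold is held out exactly once."""
    n = len(y)
    fold_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_fit = [X[i] for i in range(n) if i not in held_out]
        y_fit = [y[i] for i in range(n) if i not in held_out]
        X_val = [X[i] for i in sorted(held_out)]
        y_val = [y[i] for i in sorted(held_out)]
        model = nearest_centroid_fit(X_fit, y_fit)
        fold_errors.append(error_rate(model, X_val, y_val))
    return sum(fold_errors) / k


def make_toy_data(n, rng):
    """Synthetic samples: two hypothetical scaled descriptors per molecule.

    Label 1 stands for "observed to crystallize", label 0 for "not observed".
    The class centres are arbitrary and carry no chemical meaning.
    """
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        centre = (-1.0, -1.0) if label == 1 else (1.0, 1.0)
        X.append([rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(600, rng)

# Keep the last 100 samples aside as previously unseen test data.
X_train, y_train = X[:500], y[:500]
X_test, y_test = X[500:], y[500:]

cv_error = cross_validated_error(X_train, y_train, k=5)
final_model = nearest_centroid_fit(X_train, y_train)
test_error = error_rate(final_model, X_test, y_test)

print(f"cross-validation error: {cv_error:.3f}")
print(f"held-out test error:    {test_error:.3f}")
```

Reporting both numbers matters because the cross-validation error is computed on data that also guided model development, whereas the held-out set is touched only once, so close agreement between the two suggests the model is not overfitted.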

Authors


Division: MPLS
Department: Chemistry
Sub department: Inorganic Chemistry
Role: Author

Contributors

Role: Supervisor
Role: Supervisor


Type of award: DPhil
Level of award: Doctoral
Awarding institution: University of Oxford


Language: English
UUID: uuid:34beef4e-e499-4248-8fa6-7e8d8344f02c
Deposit date: 2018-03-20
