Thesis
Crystallization properties of molecular materials: prediction and rule extraction by machine learning
- Abstract:
-
Crystallization is an increasingly important process in a variety of applications, from drug development to single-crystal X-ray diffraction structure determination. However, while there is a good deal of research into the prediction of molecular crystal structure, the factors that make a molecule crystallizable have so far remained poorly understood.
The aim of this project was to answer the seemingly straightforward question: can we predict how easily a molecule will crystallize? The Cambridge Structural Database contains almost a million examples of materials from the scientific literature that have crystallized. Models for the prediction of crystallization propensity of organic molecular materials were developed by training machine learning algorithms on carefully curated sets of molecules that are either observed or not observed to crystallize, extracted from a database of commercially available molecules. The models were validated computationally and experimentally, while feature extraction methods and high-resolution powder diffraction studies were used to understand the molecular and structural features that determine the ease of crystallization. This led to the development of a new molecular descriptor that encodes information about the conformational flexibility of a molecule.
The best models gave error rates of less than 5% for both cross-validation data and previously unseen test data, demonstrating that crystallization propensity can be predicted with a high degree of accuracy. Molecular size, flexibility and nitrogen atom environments were found to be the most influential factors in determining the ease of crystallization, while microstructural features determined by powder diffraction showed almost no correlation with the model predictions. Further predictions on co-crystals show scope for extending the methodology to other relevant applications.
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
-
English
- UUID:
-
uuid:34beef4e-e499-4248-8fa6-7e8d8344f02c
- Deposit date:
-
2018-03-20
Terms of use
- Copyright holder:
- Wicker, J
- Copyright date:
- 2017