On the role of gradients for machine learning of molecular energies and forces

Christensen, Anders S and von Lilienfeld, O Anatole (2020) On the role of gradients for machine learning of molecular energies and forces. Machine Learning: Science and Technology, 1 (4). 045018. ISSN 2632-2153

Abstract

The accuracy of any machine learning potential can only be as good as the data used in the fitting process. The most efficient model therefore selects the training data that will yield the highest accuracy compared to the cost of obtaining the training data. We investigate the convergence of prediction errors of quantum machine learning models for organic molecules trained on energy and force labels, two common data types in molecular simulations. When training models for the potential energy surface of a single molecule, we find that the inclusion of atomic forces in the training data increases the accuracy of the predicted energies and forces 7-fold, compared to models trained on energy only. Surprisingly, for models trained on sets of organic molecules of varying size and composition in non-equilibrium conformations, inclusion of forces in the training does not improve the predicted energies of unseen molecules in new conformations. Predicted forces, however, improve about 7-fold. For the systems studied, we find that force labels and energy labels contribute equally per label to the convergence of the prediction errors. The optimal choice of what type of training data to include depends on several factors: the computational cost of acquiring the force and energy labels for training, the application domain, the property of interest and the complexity of the machine learning model. Based on our observations we describe key considerations for the creation of new datasets for potential energy surfaces of molecules which maximize the efficiency of the resulting machine learning models.
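As background for the two label types compared in the abstract: the force on each atom is the negative gradient of the potential energy with respect to that atom's coordinates, so a single conformation of N atoms supplies one energy label and 3N Cartesian force components. The sketch below illustrates this on a toy one-dimensional Morse potential. The potential, its parameters, and all function names are illustrative assumptions and are not taken from the paper.

```python
import math


def morse_energy(r, d_e=1.0, a=1.5, r_e=1.0):
    """Toy 1D potential energy (Morse form); parameters are illustrative only."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2


def morse_force(r, d_e=1.0, a=1.5, r_e=1.0):
    """Analytic force label for the toy potential: F = -dE/dr."""
    x = math.exp(-a * (r - r_e))
    return -2.0 * d_e * a * (1.0 - x) * x


def numerical_force(energy_fn, r, h=1e-5):
    """Central finite-difference estimate of -dE/dr, used as a consistency check."""
    return -(energy_fn(r + h) - energy_fn(r - h)) / (2.0 * h)


def label_counts(n_atoms):
    """Labels from one conformation: one energy and 3N Cartesian force components."""
    return {"energy": 1, "force": 3 * n_atoms}


if __name__ == "__main__":
    r = 1.3
    print("analytic force:  ", morse_force(r))
    print("numerical force: ", numerical_force(morse_energy, r))
    print("labels for a 9-atom molecule:", label_counts(9))
```

Because the number of force components grows with molecule size while each conformation yields only one energy, the per-label comparison made in the abstract depends on how many labels of each kind a dataset actually contains.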

Item Type: Article
Subjects: Impact Archive > Multidisciplinary
Depositing User: Managing Editor
Date Deposited: 30 Jun 2023 04:21
Last Modified: 25 Oct 2023 03:50
URI: http://research.sdpublishers.net/id/eprint/2630
