open force field

An open and collaborative approach to better force fields

Publications

Scientific publications from the Open Force Field Initiative.
  • Improving force field accuracy by training against condensed phase mixture properties

    Simon Boothroyd, Owen C. Madin, David L. Mobley, Lee-Ping Wang, John D. Chodera and Michael R. Shirts

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 18:6:3577–3592, 2022 [DOI]

    Code accompanying the publication: SimonBoothroyd/binary-mixture-publication

    Developing a sufficiently accurate classical force field representation of molecules is key to realizing the full potential of molecular simulation as a route to gaining fundamental insight into a broad spectrum of chemical and biological phenomena. This is only possible, however, if the many complex interactions between molecules of different species in the system are accurately captured by the model. Historically, the intermolecular van der Waals (vdW) interactions have primarily been trained against densities and enthalpies of vaporization of pure (single-component) systems, with occasional usage of hydration free energies. In this study, we demonstrate how including physical property data of binary mixtures can better inform these parameters, encoding more information about the underlying physics of the system in complex chemical mixtures. To demonstrate this, we re-train a select number of the Lennard-Jones parameters describing the vdW interactions of the OpenFF 1.0.0 (Parsley) fixed charge force field against training sets composed of densities and enthalpies of mixing for binary liquid mixtures as well as densities and enthalpies of vaporization of pure liquid systems, and assess the performance of each of these combinations. We show that retraining against the mixture data almost universally improves the force field’s ability to reproduce both pure and mixture properties, reducing some systematic errors that exist when training vdW interactions against properties of pure systems only.
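
    For readers unfamiliar with the functional form being retrained, the following is a minimal sketch of the standard 12-6 Lennard-Jones pair energy. The function names and the numerical parameters are illustrative assumptions and are not taken from the publication or from any OpenFF release.

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """12-6 Lennard-Jones pair energy at separation r (same length unit as sigma)."""
    if r <= 0.0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_minimum(sigma: float) -> float:
    """Separation at which the pair energy reaches its minimum value of -epsilon."""
    return 2.0 ** (1.0 / 6.0) * sigma


# Made-up parameters for illustration only (kcal/mol and angstrom).
EPSILON = 0.1
SIGMA = 3.4

r_min = lj_minimum(SIGMA)
print(f"r_min = {r_min:.3f} A, U(r_min) = {lennard_jones(r_min, EPSILON, SIGMA):.3f} kcal/mol")
```

    The well depth (epsilon) and size (sigma) are the two quantities per interaction type that such a fit would adjust.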

  • The Open Force Field Evaluator: An automated, efficient, and scalable framework for the estimation of physical properties from molecular simulation

    Simon Boothroyd, Lee-Ping Wang, David Mobley, John Chodera, and Michael Shirts

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 18:6:3566–3576, 2022 [DOI]

    Code accompanying the publication: openforcefield/openff-evaluator

    Parameterization and assessment of force fields against high-quality, condensed phase physical property data is an integral aspect of force field development. Here we present an entirely automated, highly scalable framework for evaluating physical properties and their gradients in terms of force field parameters. It is written as a modular and extensible Python framework, which employs an intelligent multiscale estimation approach that allows for the automated estimation of properties from simulation and cached simulation data, and a pluggable API for estimation of new properties. In this study we demonstrate the utility of the framework by benchmarking the OpenFF 1.0.0 small molecule force field, GAFF 1.8 and GAFF 2.1 force fields against a test set of binary density and enthalpy of mixing measurements curated using the framework’s utilities. Further, we demonstrate the framework’s utility as part of force field optimization by using it alongside ForceBalance, a framework for systematic force field optimization, to retrain a set of non-bonded van der Waals parameters against a training set of density and enthalpy of vaporization measurements.
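
    As a small illustration of one of the benchmarked properties, the sketch below applies the textbook definition of the molar enthalpy of mixing: the mixture enthalpy minus the mole-fraction-weighted pure-component enthalpies. The function name and all numbers are hypothetical; this is not the openff-evaluator API.

```python
from typing import Sequence


def enthalpy_of_mixing(
    h_mixture: float,
    mole_fractions: Sequence[float],
    h_pure: Sequence[float],
) -> float:
    """Molar enthalpy of mixing from molar enthalpies of the mixture and pure components."""
    if len(mole_fractions) != len(h_pure):
        raise ValueError("need one pure-component enthalpy per mole fraction")
    if abs(sum(mole_fractions) - 1.0) > 1e-8:
        raise ValueError("mole fractions must sum to one")
    ideal = sum(x * h for x, h in zip(mole_fractions, h_pure))
    return h_mixture - ideal


# Made-up molar enthalpies (kJ/mol) for an equimolar binary mixture.
dh_mix = enthalpy_of_mixing(-40.0, [0.5, 0.5], [-38.0, -44.0])
print(f"enthalpy of mixing = {dh_mix:.2f} kJ/mol")
```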

  • Bayesian inference-driven model parameterization and model selection for 2CLJQ fluid models

    Owen C. Madin, Simon Boothroyd, Richard A. Messerly, John D. Chodera, Josh Fass, and Michael R. Shirts

    Preprint ahead of publication: arXiv CC BY 4.0

    Published: Journal of Chemical Information and Modeling 62:4:874-889, 2022 [DOI]

    Code accompanying the publication: SimonBoothroyd/bayesiantesting

    A high level of physical detail in a molecular model improves its ability to perform high-accuracy simulations, but can also significantly affect its complexity and computational cost. In some situations, it is worthwhile to add additional complexity to a model to capture properties of interest; in others, additional complexity is unnecessary and can make simulations computationally infeasible. In this work we demonstrate the use of Bayes factors for molecular model selection, using Monte Carlo sampling techniques to evaluate the evidence for different levels of complexity in the two-centered Lennard-Jones + quadrupole (2CLJQ) fluid model. Examining three levels of nested model complexity, we demonstrate that the use of variable quadrupole and bond length parameters in this model framework is justified only sometimes. We also explore the effect of the Bayesian prior distribution on the Bayes factors, as well as ways to propose meaningful prior distributions. This Bayesian Markov Chain Monte Carlo (MCMC) process is enabled by the use of analytical surrogate models that accurately approximate the physical properties of interest. This work paves the way for further atomistic model selection work via Bayesian inference and surrogate modeling.
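
    The idea of a Bayes factor between nested models can be sketched on a toy problem that is unrelated to the 2CLJQ model: a Gaussian with its mean fixed at zero (model 0) versus a Gaussian with a free mean (model 1). The estimator below simply averages the likelihood ratio over draws from the prior, a deliberately naive stand-in for the samplers used in the publication. The data, prior width and function names are all assumptions made for illustration.

```python
import math
import random

DATA = [0.8, 1.1, 0.9, 1.3, 0.7]  # made-up observations
SIGMA = 1.0  # known noise standard deviation (assumed)
TAU = 1.0    # width of the N(0, TAU^2) prior on the free mean (assumed)


def log_likelihood(mu: float) -> float:
    """Gaussian log-likelihood of DATA for a given mean."""
    norm = -0.5 * math.log(2.0 * math.pi * SIGMA ** 2)
    return sum(norm - (x - mu) ** 2 / (2.0 * SIGMA ** 2) for x in DATA)


def bayes_factor_mc(n_samples: int = 100_000, seed: int = 0) -> float:
    """Estimate Z1/Z0 by averaging L(mu)/L(0) over prior draws of mu."""
    rng = random.Random(seed)
    ll0 = log_likelihood(0.0)
    total = 0.0
    for _ in range(n_samples):
        mu = rng.gauss(0.0, TAU)
        total += math.exp(log_likelihood(mu) - ll0)
    return total / n_samples


def bayes_factor_exact() -> float:
    """Closed-form Z1/Z0 for this conjugate Gaussian toy problem."""
    n = len(DATA)
    xbar = sum(DATA) / n
    s2, t2 = SIGMA ** 2, TAU ** 2
    return math.sqrt(s2 / (s2 + n * t2)) * math.exp(
        (n * xbar) ** 2 * t2 / (2.0 * s2 * (s2 + n * t2))
    )


print(f"Monte Carlo: {bayes_factor_mc():.3f}   exact: {bayes_factor_exact():.3f}")
```

    Changing TAU shows how strongly the Bayes factor depends on the prior, which is the sensitivity the abstract refers to. Prior sampling of this kind becomes inefficient as the number of parameters grows.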

  • Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks

    David F. Hahn, Christopher I. Bayly, Hannah E. Bruce Macdonald, John D. Chodera, Antonia S. J. S. Mey, David L. Mobley, Laura Perez Benito, Christina E.M. Schindler, Gary Tresadern, Gregory L. Warren

    Preprint ahead of publication: arXiv

    Code accompanying the publication: openforcefield/FE-Benchmarks-Best-Practices

    As new methods, force fields, and implementations are developed, assessing the expected accuracy of free energy calculations on real-world systems (benchmarking) becomes critical to provide users with an assessment of the accuracy expected when these methods are applied within their domain of applicability, and developers with a way to assess the expected impact of new methodologies. Here, we present guidelines for (1) curating experimental data to develop meaningful benchmark sets, (2) preparing benchmark inputs according to best practices to facilitate widespread adoption, and (3) analysis of the resulting predictions to enable statistically meaningful comparisons among methods and force fields.
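
    One common way to make such comparisons statistically meaningful is to attach a bootstrap confidence interval to an error statistic. The sketch below computes a percentile bootstrap interval for the RMSE between predicted and experimental values. It is an illustrative assumption about how point (3) might be carried out, with made-up data and hypothetical function names, and is not code from the publication.

```python
import math
import random
from typing import List, Sequence, Tuple


def rmse(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Root-mean-square error between paired predictions and measurements."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


def bootstrap_rmse_ci(
    predicted: Sequence[float],
    experimental: Sequence[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the RMSE, resampling pairs with replacement."""
    rng = random.Random(seed)
    n = len(predicted)
    stats: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([predicted[i] for i in idx], [experimental[i] for i in idx]))
    stats.sort()
    lo_index = max(0, int((alpha / 2.0) * n_boot))
    hi_index = min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot) - 1)
    return stats[lo_index], stats[hi_index]


# Made-up binding free energies in kcal/mol.
experiment = [-9.1, -8.4, -10.2, -7.7, -9.8, -8.9, -11.0, -7.2]
prediction = [-8.5, -9.0, -9.6, -8.8, -9.1, -9.9, -10.1, -6.5]

low, high = bootstrap_rmse_ci(prediction, experiment)
print(f"RMSE = {rmse(prediction, experiment):.2f} kcal/mol, 95% CI [{low:.2f}, {high:.2f}]")
```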

  • Development and Benchmarking of Open Force Field v1.0.0, the Parsley Small Molecule Force Field

    Yudong Qiu, Daniel Smith, Simon Boothroyd, Hyesu Jang, Jeffrey Wagner, Caitlin C. Bannan, Trevor Gokey, Victoria T. Lim, Chaya Stern, Andrea Rizzi, Xavier Lucas, Bryon Tjanaka, Michael R. Shirts, Michael Gilson, John Chodera, Christopher I. Bayly, David Mobley, Lee-Ping Wang

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 17:10:6262-6280, 2021 [DOI]

    Code accompanying the publication: openforcefield/openforcefield-forcebalance/tree/v1.0.0-RC2

    We describe the structure and optimization of the Open Force Field 1.0.0 small molecule force field, code-named Parsley. Parsley uses the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism in which parameter types are assigned directly by chemical perception, in contrast to traditional atom type-based approaches. This method provides a natural means to incorporate increasingly diverse chemistry without needlessly increasing force field complexity. In this work, we present essentially a full optimization of the valence parameters in the force field. The optimization was carried out with the ForceBalance tool and was informed by reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These data were computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. Tests of the resulting force field against compounds and data types outside the training set show improvements in optimized geometries and conformational energetics and demonstrate that Parsley’s accuracy for liquid properties is similar to that of other general force fields.

  • End-to-end differentiable molecular mechanics force field construction

    Yuanqing Wang, Josh Fass, and John D. Chodera

    Preprint ahead of publication: arXiv CC BY 4.0

    Code accompanying the publication: choderalab/espaloma

    Molecular mechanics force fields have been a workhorse for computational chemistry and drug discovery. Here, we propose a new approach to force field parameterization in which graph convolutional networks are used to perceive chemical environments and assign molecular mechanics (MM) force field parameters. The entire process of chemical perception and parameter assignment is differentiable end-to-end with respect to model parameters, allowing new force fields to be easily constructed from MM or QM force fields, extended, and applied to arbitrary biomolecules.

  • Capturing non-local through-bond effects when fragmenting molecules for quantum chemical torsion scans

    Chaya D. Stern, Christopher I. Bayly, Daniel G. A. Smith, Josh Fass, Lee-Ping Wang, David L. Mobley, and John D. Chodera

    Preprint ahead of publication: bioRxiv CC BY 4.0

    Code accompanying the publication: openforcefield/fragmenter

    We show how the Wiberg Bond Order (WBO) can be used to construct small molecule fragmentation schemes that will avoid disrupting the chemical environment around torsions. The resulting fragmentation scheme powers the QCSubmit tool used to fragment and inject small molecule datasets into the QCFractal computation pipeline for deposition into the QCArchive quantum chemistry archive, which the Open Force Field Initiative uses for constructing force fields. It also powers bespoke torsion refitting for individual molecules.

  • Improving Small Molecule Force Fields by Identifying and Characterizing Small Molecules with Inconsistent Parameters

    Jordan Ehrman, Victoria T. Lim, Caitlin C. Bannan, Nam Thi, Daisy Kyu, and David Mobley

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Computer-Aided Molecular Design 35:271-284, 2021 [DOI]

    Code accompanying the publication: mobleylab/off-ffcompare

    We present a pipeline for comparing the geometries of small molecule conformers optimized with different force fields. We aimed to identify molecules or chemistries that are particularly informative for future force field development because they display inconsistencies between force fields. We applied our pipeline to a subset of the eMolecules database, and highlighted molecules that appear to be parameterized inconsistently across different force fields. We then identified over-represented functional groups in these molecule sets. The molecules and moieties identified by this pipeline may be particularly helpful for future force field parameterization.

  • Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning/molecular mechanics potentials

    Dominic Rufa, Hannah Bruce Macdonald, Josh Fass, Marcus Wieder, Patrick Grinaway, Adrian Roitberg, Olexandr Isayev and John Chodera

    Preprint ahead of publication: bioRxiv CC BY 4.0

    Code accompanying the publication: choderalab/perses

    This study combines a new generation of hybrid ML/MM potentials and a nonequilibrium perturbation approach to predict protein-ligand binding affinities. With this approach, a standard, GPU-accelerated MM alchemical free energy calculation can be corrected in a simple post-processing step to efficiently recover ML/MM free energies, while delivering a significant accuracy improvement with small additional computational effort. The authors show that it is possible to significantly reduce the error in absolute binding free energies with this new hybrid ML/MM approach (ANI2x/AMBER14SB/TIP3P) on the Tyk2 benchmarking system. The same set of FE calculations performed with OpenFF-1.0.0 instead of ANI2x to model ligands achieves an RMSE statistically indistinguishable from the Schrödinger JACS result for the tested system, which suggests that even better results can be expected with the latest Parsley update (OpenFF-1.2.0).

  • Benchmark Assessment of Molecular Geometries and Energies from Small Molecule Force Fields

    Victoria T. Lim and David L. Mobley

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: F1000Research 9:1390, 2020 [DOI]

    Code accompanying the publication: mobleylab/benchmarkff

    In this work, we aim to compare six force fields: GAFF, GAFF2, MMFF94, MMFF94S, SMIRNOFF99Frosst, and the openff-1.0.0 (Parsley) force field by focusing on small molecules (< 50 heavy atoms). On a dataset comprising over 26,000 molecular structures, we analyzed their force field-optimized geometries and conformer energies compared to reference quantum mechanical (QM) data. We show that most of these force fields are comparable in accuracy at reproducing gas-phase QM geometries and energetics, but that GAFF/GAFF2/Parsley do slightly better in reproducing QM energies and that MMFF94/MMFF94S perform slightly better in geometries. Parsley version OpenFF-1.0.0 shows considerable improvement over its predecessor SMIRNOFF99Frosst, while OpenFF-1.2.0 performs even better with accuracy comparable to other available general force fields. We identify particular outlying chemical groups for further force field improvement.

  • Driving torsion scans with wavefront propagation

    Yudong Qiu, Daniel G. A. Smith, Chaya D. Stern, Mudong Feng, Hyesu Jang, and Lee-Ping Wang

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: The Journal of Chemical Physics 152:244116, 2020 [DOI]

    Code accompanying the publication: lpwgroup/torsiondrive/

    In this paper, we propose a systematic and versatile workflow called TorsionDrive to generate energy-minimized structures on a grid of torsion constraints by means of a recursive wavefront propagation algorithm, which resolves the deficiencies of conventional scanning approaches and generates higher quality QM data for force field development. The method is implemented in an open-source software package that is compatible with many QM software packages and energy minimization codes. The paper also describes integration with the MolSSI QCArchive distributed computing ecosystem.
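
    The wavefront idea can be sketched on a toy problem: whenever a constrained optimization at a grid point finds a lower energy than previously recorded, its final structure is used to start optimizations at the neighboring grid points. In the sketch below a single hidden coordinate y stands in for all other degrees of freedom, and the energy surface, optimizer and function names are invented for illustration. This is not the TorsionDrive API, and real runs use QM constrained optimizations.

```python
import math
from collections import deque
from typing import Dict, Tuple

GRID = list(range(0, 360, 15))  # torsion grid in degrees


def energy(phi_deg: float, y: float) -> float:
    """Toy surface: a double well in y that is tilted by the torsion angle."""
    return (y * y - 1.0) ** 2 + 2.0 * math.cos(math.radians(phi_deg)) * y


def constrained_minimize(phi_deg: float, y: float) -> Tuple[float, float]:
    """Relax y by gradient descent with the torsion frozen; return (y, energy)."""
    tilt = 2.0 * math.cos(math.radians(phi_deg))
    for _ in range(100_000):
        grad = 4.0 * y * (y * y - 1.0) + tilt
        if abs(grad) < 1e-10:
            break
        y -= 0.05 * grad
    return y, energy(phi_deg, y)


def sequential_scan(y_start: float) -> Dict[int, float]:
    """Conventional scan: each grid point starts from the previous structure."""
    energies: Dict[int, float] = {}
    y = y_start
    for phi in GRID:
        y, energies[phi] = constrained_minimize(phi, y)
    return energies


def wavefront_scan(seed_phi: int, y_start: float, tol: float = 1e-6) -> Dict[int, float]:
    """Propagate structures to neighbors whenever a lower energy is found."""
    best = {phi: math.inf for phi in GRID}
    queue = deque([(seed_phi, y_start)])
    while queue:
        phi, y0 = queue.popleft()
        y, e = constrained_minimize(phi, y0)
        if e < best[phi] - tol:  # new lowest energy: record it and spread outward
            best[phi] = e
            i = GRID.index(phi)
            for j in (i - 1, i + 1):
                queue.append((GRID[j % len(GRID)], y))
    return best


scan = sequential_scan(1.0)
wave = wavefront_scan(0, 1.0)
worst = max(GRID, key=lambda p: scan[p] - wave[p])
print(f"largest improvement at {worst} deg: {scan[worst] - wave[worst]:.3f}")
```

    On this toy surface the one-directional scan stays trapped in a metastable well over part of the grid, while the wavefront version recovers the lower-energy structure from the other direction.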

  • Binding thermodynamics of host-guest systems with SMIRNOFF99Frosst 1.0.5 from the Open Force Field Initiative

    David R. Slochower, Niel Henriksen, Lee-Ping Wang, John D. Chodera, David L. Mobley, and Michael K. Gilson

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 15:6225, 2019 [DOI]

    Code accompanying the publication: slochower/smirnoff-host-guest-manuscript

    We evaluate the accuracy of SMIRNOFF99Frosst using free energy calculations of 43 α- and β-cyclodextrin host-guest pairs and compare with matched calculations using two versions of GAFF. These results suggest that SMIRNOFF99Frosst performs competitively with existing small molecule force fields and is a parsimonious starting point for optimization.

  • Graph Nets for Partial Charge Prediction

    Yuanqing Wang, Josh Fass, Chaya D. Stern, and John Chodera

    Preprint ahead of publication: arXiv arXiv-1.0

    Code accompanying the publication: choderalab/gimlet

    Graph convolutional and message-passing networks can be a powerful tool for predicting physical properties of small molecules when coupled to a simple physical model that encodes the relevant invariances. Here, we show the ability of graph nets to predict partial atomic charges for use in molecular dynamics simulations and physical docking.

  • ChemPer: An Open Source Tool for Automatically Generating SMIRKS Patterns

    Caitlin C. Bannan, David Mobley

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Code accompanying the publication: MobleyLab/chemper

    In this work, we present ChemPer – a new tool for generating SMIRKS patterns based on clustered fragments (i.e. bonds, angles, or torsions) which should be assigned the same force field parameter. We demonstrate the utility of ChemPer by clustering fragments based on a reference force field, and then regenerating those parameters starting with a simple set of alkanes, ethers, and alcohols.

  • Systematic Optimization of Water Models Using Liquid/Vapor Surface Tension Data

    Yudong Qiu, Paul S. Nerenberg, Teresa Head-Gordon, Lee-Ping Wang

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: The Journal of Physical Chemistry B 123:7061, 2019 [DOI]

    Code accompanying the publication: leeping/forcebalance

    This work investigates whether experimental surface tension measurements, which are less sensitive to quantum and self-polarization corrections, are able to replace the usual reliance on the heat of vaporization as experimental reference data for fitting force field models of molecular liquids.

  • Uncertainty quantification confirms unreliable extrapolation toward high pressures for united-atom Mie λ-6 force field

    Richard A. Messerly, Michael R. Shirts, and Andrei F. Kazakov

    Published: The Journal of Chemical Physics 149:114109, 2018 [DOI]

    We demonstrate how Bayesian approaches can be used to estimate the reliability of predictions made with molecular mechanics force fields.

  • Toward learned chemical perception of force field typing rules

    Camila Zanette, Caitlin C. Bannan, Christopher I. Bayly, Josh Fass, Michael K. Gilson, Michael R. Shirts, John Chodera, and David L. Mobley

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 15:402, 2019 [DOI]

    Code accompanying the publication: openforcefield/smarty

    Here, we introduce a proof of principle for automatically sampling chemical perception and compare the results with traditional atom-typed force fields and our SMIRNOFF format.

  • Facile Synthesis of a Diverse Library of Mono-3-substituted β-Cyclodextrin Analogues

    Kathryn Kellett, Brendan M. Duggan and Michael K. Gilson

    Preprint ahead of publication: chemRxiv CC BY 4.0

    Published: Supramolecular Chemistry 31:251, 2019 [DOI]

    We show the facile synthesis of a library of diverse mono-3-substituted β-cyclodextrin analogues that have the potential to be used to collect host-guest binding data to test and improve simulation force fields.

  • Escaping atom types using direct chemical perception

    David Mobley, Caitlin C. Bannan, Andrea Rizzi, Christopher I. Bayly, John D. Chodera, Victoria T Lim, Nathan M. Lim, Kyle A. Beauchamp, Michael R. Shirts, Michael K. Gilson, and Peter K. Eastman

    Preprint ahead of publication: bioRxiv CC BY 4.0

    Published: Journal of Chemical Theory and Computation 14:6076, 2018 [DOI] PMC6245550

    This paper introduces the SMIRNOFF format in the context of traditional force fields, explains the development and validation of our new small molecule force field smirnoff99Frosst, and highlights some directions in which the initiative is headed.

  • Toward Expanded Diversity of Host–Guest Interactions via Synthesis and Characterization of Cyclodextrin Derivatives

    Kathryn Kellett, S. A. Kantonen, Brendan M. Duggan and Michael K. Gilson

    Preprint ahead of publication: chemRxiv CC BY-NC-ND 4.0

    Published: The Journal of Solution Chemistry 47:1597, 2018 [DOI]

    This paper describes the synthesis of a mono-3-functionalized cyclodextrin and uses 1D/2D NMR and ITC experiments to show how cyclodextrin derivatives can affect the binding of guest molecules.

  • Approaches for Calculating Solvation Free Energies and Enthalpies Demonstrated with an Update of the FreeSolv Database

    Guilherme Duarte Ramos Matos, Daisy Y. Kyu, Hannes H. Loeffler, John D. Chodera, Michael R. Shirts, and David L. Mobley

    Preprint ahead of publication: bioRxiv CC BY 4.0

    Published: Journal of Chemical & Engineering Data 62:1559, 2017 [DOI] PMC5648357

    Code accompanying the publication: mobleylab/freesolv

    We review alchemical methods for computing solvation free energies and present an update (version 0.5) to the FreeSolv database of experimental and calculated hydration free energies of neutral compounds.

  • Towards Automated Benchmarking of Atomistic Forcefields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive

    Kyle A. Beauchamp, Julie M. Behr, Ariën S. Rustenburg, Christopher I. Bayly, Kenneth Kroenlein, and John D. Chodera

    Preprint ahead of publication: arXiv arXiv-1.0

    Published: Journal of Physical Chemistry B 119:12912, 2015 [DOI] PMC4667959

    Code accompanying the publication: choderalab/LiquidBenchmark

    Progress in forcefield validation and parameterization has been hindered by the limited availability of high-quality machine-readable physical property data for small organic molecules. We show how the NIST ThermoML dataset provides a solution to this problem, and demonstrate its utility in benchmarking the GAFF/AM1-BCC small molecule forcefield on neat liquid densities and static dielectric constants to uncover problems in the representation of low-dielectric environments.