open force field

An open and collaborative approach to better force fields

Data

All our datasets are available on GitHub and on QCArchive. Those used in force field optimization and benchmarking are also available on Zenodo. Feel free to contact us if you have any questions!



Quantum chemistry data

The Open Force Field Initiative uses QCArchive infrastructure to compute, store and access quantum chemistry data. Our data generation and submission scripts for each dataset are available in our OpenFF QCArchive Dataset Submission repository.

A select number of datasets, namely those used to train or benchmark our flagship force field and those routinely used by our collaborators, are also available as Zenodo records. These records contain “dataset views” and Docker images equipped with Jupyter notebook entry points that contain examples of how to access the data we use to fit our force fields. The dataset views are SQLite files exported from QCArchive, with calculation records serialized with msgpack and compressed with zstandard, a combination that reduces file size without losing any data.
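The storage pattern behind a dataset view (serialize a record, compress it, store it as a binary column in SQLite, then reverse the steps on read) can be illustrated with a minimal sketch. This is not the actual dataset view schema or the QCArchive API: the table name, column names, and record layout below are hypothetical, and the standard-library `json` and `zlib` modules stand in for msgpack and zstandard so that the example runs without extra dependencies. For real dataset views, follow the Jupyter notebooks shipped with the Zenodo records.

```python
import json
import sqlite3
import zlib


def encode_record(record: dict) -> bytes:
    """Serialize, then compress. json + zlib are stand-ins for msgpack + zstandard."""
    return zlib.compress(json.dumps(record).encode("utf-8"))


def decode_record(blob: bytes) -> dict:
    """Reverse of encode_record: decompress, then deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical record; real QCArchive calculation records are much richer.
record = {"id": 1, "molecule": "CCO", "energies": [-154.09] * 50}

# Hypothetical single-table schema held in memory; a real view is a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload BLOB)")
conn.execute(
    "INSERT INTO records (id, payload) VALUES (?, ?)",
    (record["id"], encode_record(record)),
)

# Read the compressed payload back and restore the original record.
(blob,) = conn.execute(
    "SELECT payload FROM records WHERE id = ?", (record["id"],)
).fetchone()
restored = decode_record(blob)

raw_size = len(json.dumps(record).encode("utf-8"))
stored_size = len(blob)
conn.close()
```

Here `restored` equals `record`, which is what "lossless" means in this context, and `stored_size` is smaller than `raw_size` because the payload is compressed before it is written.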

Datasets available on Zenodo as dataset views

Flagship Force Field Datasets

Benchmarking and Other Datasets



Physical properties

We use the NIST ThermoML Archive to access condensed-phase physical properties of the compounds included in our force field optimization and benchmarking. The utilities for automated selection and curation of these datasets are available as part of OpenFF Evaluator, developed by Simon Boothroyd.

An older version of selected physical properties datasets can be found in our Open Forcefield Data repository.



Protein-ligand free energies

Our protein-ligand benchmarking dataset for calculating binding free energies can be accessed in our ProteinLigandBenchmarks repository.



MiniDrugBank

Our MiniDrugBank repository tracks the creation and evolution of the MiniDrugBank Molecule set, filtered from DrugBank Release Version 5.0.1.