May 11, 2020 Open Force Field Initiative Advisory Board Meeting

The first meeting of the OpenFF Initiative Advisory Board took place on May 11, 2020. The meeting notes are summarized below:

As we push into biopolymer FFs, what data should we be using to fit force fields?
- We are in the process of porting AMBER ff14sb into SMIRNOFF format.
- We have a plan for making an “AMBER-like” protein force field using the current small molecule bond/angle/vdw, with an AMBER protein charge model, then fitting torsions.
What data should we use to evaluate the quality of proteins/nucleic acids? Data for other molecules (lipids/carbohydrates - though those will likely be tackled next year)?
- High-quality protein-ligand binding affinity datasets with rigid ligands
- Conformational decoys? (What experimental datasets inform this?)
  - Similar sequences but different folds? (e.g. “Paracelsus Challenge”)
- Peptide data: NMR data on cyclic peptides
  - Scott Lokey: logD and NMR data for cyclic peptides
- Check with Alan Mark on automated protein benchmark set

TIP3P is likely to cause problems. OPC or OPC3 might be good candidates. PC3/TIP4P-Ew performed well, according to Onufriev’s study.
Import data used in fitting TIP3P-FB; ultimately co-optimize water model
Important properties:
- Hydration free energies insensitive to water model (but entropy/enthalpy decomposition is); polarization issues
- Diffusion coefficient would likely be more important, water model dependent
- Enthalpy of vaporization: Difficult property to reproduce; unclear where discrepancy arises from (e.g. quantum dynamical effect)
- Mixing properties: Enthalpies of mixing; partial molar volumes
- Partition coefficients: OCHEM (large database of octanol-water partition coefficients and other properties) ~ 20K [follow up with Sereina Riniker]; however, significant cancellation. Also OCHEM has a lot of license terms (OK for non-commercial, but otherwise a lot of restrictions: https://ochem.eu/home/show.do)
- Solvation free energies in solvents other than water
- Don’t neglect charged molecules (relative values, not absolute)

How can we best collaborate with the biopolymer FF development communities and provide/share resources (software, data, infrastructure)?
- Making it easy for novice Python programmers to extend the OFF toolkit – at the moment, understanding how various bits and pieces of OpenFF work together is a steep learning curve, although modifying code seems to be quite straightforward.
- Support for new parameter types / functional forms
How best can we share curated standard train/test datasets? We want to make it programmatically accessible, but what other formats are good too?
- Flexibility – everyone has their own favourite way of dealing with data
- Data accessible with an API to grab data in batches
- Offer bulk download
- Identifiers within dataset are critical (names, SMILES, CAS, InChI, etc.)
How can we best communicate with biopolymer FF development communities?
- Try attending community user group meetings
- Paris CHARMM / Tinker-HP/AMOEBA meeting is a great model
Support for different functional forms?
- e.g. site-specific multipoles

Is there a way to work with MD package developers to get support for these into their packages?
- Show package developers persuasive data that the force field is valuable to motivate implementation of new functional forms (especially protein-ligand binding energies) – demonstrate that it works
What about a library that provides advanced potentials until they can be implemented internally, like OpenKIM?
- This could make it easy for developers to try new forces with minimal effort
How can we best support Quantum-trained ML potentials? Other ML potentials?
- TensorFlow support (for ANI) already integrated into gromos

Should we focus on existing converters (e.g. ParmEd/InterMl), object models (an OFF/MolSSI common system object model?), libraries to read common object model serialization that other codes can plug into?
- Low-level converters for parameterized systems (like ParmEd) are likely to be the most successful
- New system construction tools (especially if broadly deployed, e.g. via CHARMM-GUI) could have impactful
- CanDo - Chris Schafmeister - “this is what people are going to use”
Currently, we’re aiming to build a common biopolymer + small molecule Topology object model that will ideally replace/augment OpenMM or other force field representations, Topology objects.
Probably out of scope of what OpenFF is developing are general trajectory representations, but we plan to work with other developers on these to help provide input/assistance.
Ditto with any replacements for Molecule SDF/mol2/PDB files: out of scope for our effort, but we would like to be involved in larger efforts.

open force field