Skip to content

Commit

Permalink
first pass at charge tutorial updates
Browse files Browse the repository at this point in the history
  • Loading branch information
jthorton committed Feb 6, 2025
1 parent cf41975 commit 3e25848
Show file tree
Hide file tree
Showing 3 changed files with 171 additions and 7 deletions.
103 changes: 103 additions & 0 deletions cli_tutorials/cli_charge_molecules.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# Generating Partial Charges with the OpenFE CLI

This tutorial will show you how to use the OpenFE CLI to generate and store partial charges for a series of ligands
which can be used with OpenFE protocols. It is recommended to use a single set of charges for each ligand to ensure reproducibility
between repeats or consistent charges between different legs of a calculation involving the same ligand, like a relative binding
affinity calculation for example. As such both the `plan-rbfe-network` and `plan-rhfe-network` commands will calculate
partial charges for ligands making it expensive to run multiple network mappings while finding the optimal one for the resources
available. Hence, the `charge-molecules` command offers a way to generate and store the charges into an SDF file which
can be used with the rest of the OpenFE CLI and python API.

## Charging Molecules

The `charge-molecules` command allows you to generate partial charges a series of small molecules saved in SDF or MOL2
format using the `am1bcc` method calculated using `ambertools`:

```bash
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf
```

This will result in a new SDF file `charged_tyk2_ligands.sdf` which contains the same ligands and their partial charges
stored in a new SD tag like so:

```text
lig_ejm_42
RDKit 3D
35 36 0 0 0 0 0 0 0 0999 V2000
-4.7651 -2.8327 -16.5085 H 0 0 0 0 0 0 0 0 0 0 0 0
-5.3566 -3.6931 -16.2274 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.7703 -4.9699 -16.2000 C 0 0 0 0 0 0 0 0 0 0 0 0
[continues]
28 31 1 0
M END
> <ofe-name>
lig_ejm_42
> <atom.dprop.PartialCharge>
0.14794282857142857 -0.096057171428571425 -0.12905717142857143 ...
[continues]
```

Generating partial charges with the `am1bcc` method can be slow as they require a semi-empirical quantum chemical calculation,
we can however take advantage of multiprocessing to calculate the charges in parallel for each ligand which offers a
significant speed-up. The number of processors available for the workflow can be specified using the `-n` flag:

```bash
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf -n 4
```

## Customizing the Charge Method

There are a wide range of partial charge generation methods available with `am1bcc` based schemes being most commonly
used with OpenFF force fields. The choice of charge scheme can be easily customised by providing a settings file in `.yaml` format.
For example to recreate the current default settings in the workflow you would do the following:

1. Provide a file like `settings.yaml` with the desired settings:

```yaml
partial_charge:
method: am1bcc
settings:
off_toolkit_backend: ambertools
```
2. Charge the ligands with an additional `-s` flag for passing the settings:

```bash
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf -n 4 -s settings.yaml
```

3. The output of the CLI program will now reflect the changes made:

```text
SMALL MOLECULE PARTIAL CHARGE GENERATOR
_________________________________________
Parsing in Files:
Got input:
Small Molecules: SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_ejm_48) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_jmc_28)
Using Options:
Partial Charge Generation: am1bcc
```

The full range of partial charge settings can be found in the snipit bellow, note that some may require installing extra packages.

```yaml
partial_charge:
method: am1bcc
# method: am1bccelf10
# method: espaloma
# method: nagl
settings:
off_toolkit_backend: ambertools
# off_toolkit_backend: openeye # required for the am1bccelf10 method
number_of_conformers: null # null specifies the use of the input conformer, a value requests that a new conformer be generated
# nagl_model: null # null specifies the use of the latest nagl model
```

## Overwriting Charges

By default, the `charge-molecules` command will only assign partial charges to ligands which **do not** already have charges,
this behaviour can be changed via the `--overwrite-charges` flag which will assign new charges using the specified settings.
37 changes: 31 additions & 6 deletions rbfe_tutorial/cli_tutorial.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,9 +50,21 @@ we do the following:
- Instruct `openfe` to output files into a directory called `network_setup`
with the `-o network_setup` option.

Planning the campaign may take some time, as it tries to find the best
network from all possible transformations. This will create a directory called
`network_setup/`, which is structured like this:
Planning the campaign may take some time due to the complex series of tasks involved:

- partial charges are generated for each of the ligands to ensure reproducibility, by default this requires a semi-empirical quantum
chemical calculation to calculate `am1bcc` charges
- atom mappings are created and scored based on the perceived difficulty for all possible ligand pairs
- an optimal network is extracted from all possible pairwise transformations which balances edge redundancy and the total difficulty score of the network

The partial charge generation can take advantage of multiprocessing which offers a significant speed-up, you can specify
the number of processors available using the `-n` flag:

```bash
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -n 4
```

This will result in a directory called `network_setup/`, which is structured like this:

<!-- top lines from `tree network_setup` -->

Expand Down Expand Up @@ -87,7 +99,7 @@ The files that describe each individual simulation we will run are located withi
leg to run and contains all the necessary information to run that leg.
Filenames indicate ligand names as taken from the SDF; for example, the file
`easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg
associated with the tranformation of the ligand `lig_ejm_31` into `lig_ejm_42`
associated with the transformation of the ligand `lig_ejm_31` into `lig_ejm_42`
while in complex with the protein.

A single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files):
Expand All @@ -112,8 +124,9 @@ OpenFE contains many different options and methods for setting up a simulation c
The options can be easily accessed and modified by providing a settings
file in the `.yaml` format.
Let's assume you want to exchange the LOMAP atom mapper with the Kartograf
atom mapper and the Minimal Spanning Tree
Network Planner with the Maximal Network Planner, then you could do the following:
atom mapper, the Minimal Spanning Tree
Network Planner with the Maximal Network Planner and the am1bcc charge method with the am1bccelf10 version from openeye,
then you could do the following:

1. provide a file like `settings.yaml` with the desired changes:

Expand All @@ -123,6 +136,11 @@ mapper:

network:
method: generate_maximal_network

partial_charge:
method: am1bccelf10
settings:
off_toolkit_backend: openeye
```
2. Plan your rbfe network with an additional `-s` flag for passing the settings:
Expand All @@ -148,6 +166,7 @@ Using Options:
Mapper: <kartograf.atom_mapper.KartografAtomMapper object at 0x7fea079de790>
Mapping Scorer: <function default_lomap_score at 0x7fea1b423d80>
Networker: functools.partial(<function generate_maximal_network at 0x7fea18371260>)
Partial Charge Generation: am1bccelf10
```

That concludes the straightforward process of tailoring your OpenFE setup to your specifications.
Expand All @@ -166,6 +185,12 @@ network:
# method: generate_radial_network
# method: generate_maximal_network
# method: generate_minimal_redundant_network
partial_charge:
method: am1bcc
# method: am1bccelf10
# settings:
# off_toolkit_backend: openeye # required for the am1bccelf10 method
```

**Customize away!**
Expand Down
38 changes: 37 additions & 1 deletion rbfe_tutorial/python_tutorial.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,42 @@
"ligands = [openfe.SmallMoleculeComponent.from_rdkit(mol) for mol in supp]"
]
},
{
"cell_type": "markdown",
"id": "8e5de19a",
"metadata": {},
"source": [
"## Charging the ligands\n",
"\n",
"It is recommended to use a single set of charges for each ligand to ensure reproducibility between repeats or consistent charges between different legs of a calculation involving the same ligand, like a relative binding affinity calculation for example. \n",
"\n",
"Here we will use some utility functions from OpenFE which can assign partial charges to a series of molecules with a variety of methods which can be configured via the `OpenFFPartialChargeSettings` class. In this example \n",
"we will charge the ligands using the `am1bcc` method from `ambertools` which is the default charge scheme used by OpenFE."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5219106c",
"metadata": {},
"outputs": [],
"source": [
"from openfe.protocols.openmm_utils.omm_settings import OpenFFPartialChargeSettings\n",
"from openfe.protocols.openmm_utils.charge_generation import bulk_assign_partial_charges\n",
"\n",
"charge_settings = OpenFFPartialChargeSettings(partial_charge_method=\"am1bcc\", off_toolkit_backend=\"ambertools\")\n",
"\n",
"charged_ligands = bulk_assign_partial_charges(\n",
" molecules=ligands,\n",
" overwrite=False, \n",
" method=charge_settings.partial_charge_method,\n",
" toolkit_backend=charge_settings.off_toolkit_backend,\n",
" generate_n_conformers=charge_settings.number_of_conformers,\n",
" nagl_model=charge_settings.nagl_model,\n",
" processors=1\n",
")"
]
},
{
"cell_type": "markdown",
"id": "6963be83",
Expand Down Expand Up @@ -93,7 +129,7 @@
"outputs": [],
"source": [
"ligand_network = network_planner(\n",
" ligands=ligands,\n",
" ligands=charged_ligands,\n",
" mappers=[mapper],\n",
" scorer=scorer\n",
")"
Expand Down

0 comments on commit 3e25848

Please sign in to comment.