Initial commit
nnn911 committed Aug 30, 2024
1 parent b4837a3 commit bc1c07d
Showing 6 changed files with 225 additions and 77 deletions.
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) [[Year]] [[Author]]
Copyright (c) 2024 Daniel Utt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
91 changes: 68 additions & 23 deletions README.md
@@ -1,34 +1,79 @@
# Python Modifier Template
# Match Molecule
Match parts of molecules using query strings.

Template for a custom Python-based modifier that hooks into OVITO and can easily be shared with other users.
## Description / Examples
This modifier allows you to select sections of molecules using query strings. The query strings use a simplified form of [SMILES](https://en.wikipedia.org/wiki/Simplified_Molecular_Input_Line_Entry_System):

This repository contains a template for creating your own [Python script modifier](https://docs.ovito.org/python/introduction/custom_modifiers.html),
which can be installed into *OVITO Pro* or the [`ovito`](https://pypi.org/project/ovito/) Python module using *pip*.
![Smile explanation image](https://upload.wikimedia.org/wikipedia/commons/0/00/SMILES.png)
> Original by Fdardel, slight edit by DMacks, CC BY-SA 3.0 <http://creativecommons.org/licenses/by-sa/3.0/>, via Wikimedia Commons
## Getting Started
where molecules can be defined by strings.

1. Click the "Use this template" button to create your own repository based on this template.
2. Rename `src/PackageName` to reflect the name of your modifier.
3. Implement your [modifier](https://docs.ovito.org/python/introduction/custom_modifiers.html#advanced-interface) in [`src/PackageName/__init__.py`](src/PackageName/__init__.py). If your modifier needs access to more than one frame of a trajectory, you can uncomment and implement the `input_caching_hints` method. Otherwise, you can delete it. More details on this method can be found in the [OVITO Python docs](https://www.ovito.org/docs/current/python/introduction/custom_modifiers.html#writing-custom-modifiers-advanced-interface).
4. Fill in the [`pyproject.toml`](pyproject.toml) file. Fields that need to be replaced with your information are enclosed in descriptive `[[field]]` tags. Please make sure to include ovito>=3.9.1 as a dependency. Depending on your needs, you can add additional fields to the `pyproject.toml` file. Information can be found [here](https://setuptools.pypa.io/en/latest/userguide/index.html).
5. Fill in the [`README_Template.md`](README_Template.md) file. Again, the `[[fields]]` placeholders should guide you. Feel free to add other sections like "Images", "Citation", or "References" as needed.
6. Add meaningful examples and data sample files to the `examples` directory to help others understand the use of your modifier.
7. Pick a license for your project and replace the current (MIT) [`LICENSE`](LICENSE) file with your license. If you keep the MIT license, please update the name and year in the current file.
8. Once you're done, rename `README_Template.md` to `README.md`, replacing this file.
### Selecting linear molecules
In the simplest form, `HOH` can be used to define the water molecule (`H-O-H`).

## Testing
This repository is configured to enable automated testing using the [pytest](https://docs.pytest.org/en/7.4.x/) framework. Tests are automatically executed after each push to the main branch. To set up and activate automated testing, follow these two steps:
### Adding side chains
To define more complex molecules, use parentheses `()` to add side chains. To select this submolecule,
```
  O
   \
    N - C - C -
   /
H-O
```
one might use the query string `ON(OH)CC`. Here, `(OH)` denotes a side chain that branches off from the preceding `N` atom.
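
As a rough illustration of how side chains map to bonds, the following standalone sketch turns a query of single-letter elements into an atom list and a bond list. It is a simplified, standard-library-only stand-in, not the modifier's code: the function name `parse_query` is hypothetical, it uses an explicit stack where the package's `parseBranch` method recurses, and it returns plain lists instead of a `networkx` graph.

```python
def parse_query(query: str):
    """Parse a query made of single-letter elements and '()' side chains.

    Returns (tags, edges): tags[i] is the element of atom i, and edges is a
    list of (i, j) index pairs for bonded atoms.
    Assumption: no quoted multi-letter elements, wildcards or numeric tags.
    """
    tags = []
    edges = []
    stack = []       # branch points saved at each "("
    previous = None  # index of the atom the next atom will bond to

    for char in query:
        if char == "(":
            # Remember the branch point so we can return to it at ")".
            stack.append(previous)
        elif char == ")":
            # Side chain finished: continue from the saved branch point.
            previous = stack.pop()
        else:
            tags.append(char)
            current = len(tags) - 1
            if previous is not None:
                edges.append((previous, current))
            previous = current
    return tags, edges


tags, edges = parse_query("ON(OH)CC")
print(tags)   # ['O', 'N', 'O', 'H', 'C', 'C']
print(edges)  # [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]
```

Atom 1 (`N`) appears in three bonds: to the leading `O`, to the `O` of the side chain, and to the first `C` of the main chain.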

1. Write your tests in the `test/test_modifier.py` file. You can also use other filenames that adhere to the pytest requirements.
2. Open the `.github/workflows/python-tests.yml` file and remove the `if: ${{ false }}` condition on line 15.
### Selecting multi-letter elements
To select this group of atoms,
```
- C - Fe -
  |   |
  H   O
      |
      H
```
you could write the query `C(H)"Fe"(OH)`. Note that multi-letter chemical elements need to be enclosed in double quotes (`""`). An equivalent formulation would be `C(H)"Fe"OH`.
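
To make the quoting rule concrete, here is a minimal sketch of how such a query could be split into tokens. The function name `tokenize` is hypothetical and the logic is a simplified assumption, not the modifier's own `tokenizer` method: it only recognizes double quotes and treats every other character as a token of its own.

```python
def tokenize(query: str):
    """Split a query into tokens, treating "..." as one multi-letter element."""
    tokens = []
    i = 0
    while i < len(query):
        if query[i] == '"':
            # Everything up to the closing quote is a single element name.
            end = query.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated quote in query")
            tokens.append(query[i + 1 : end])
            i = end + 1
        else:
            # Single-letter element, parenthesis, wildcard or digit.
            tokens.append(query[i])
            i += 1
    return tokens


print(tokenize('C(H)"Fe"(OH)'))
# ['C', '(', 'H', ')', 'Fe', '(', 'O', 'H', ')']
```

Without the quotes, this sketch would read `Fe` as two separate tokens, `F` and `e`.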

If needed, you can also adjust the operating system and Python versions by modifying the following lines:
```yaml
os: [ubuntu-latest, macos-latest, windows-latest]
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
### Adding wildcards / placeholders
If you want to match multiple sub-molecules, you can use the `?` wildcard character. `H?H` would match both the `H-O-H` and the `H-N-H` molecules (and any other molecule in which two H atoms are connected by a single bridging atom).

### Creating additional bonds
This syntax can be limiting, so you might need to add bonds to your query string manually. If you want to select this group of atoms:
```
- C - N
     / \*
    C   C - C -
     \ /
    C - C
```
Here you could write `CNCCCCC`. This would select all atoms; however, you would be missing the bond marked by the `*` in the picture. In such cases, you can use numbers to tag atoms. Atoms with the same numerical tag will be connected by a bond. The query string `CN1CCCC1C` correctly selects all atoms and bonds shown in the picture. Here, the two atoms tagged `1` are connected to form the `*`-highlighted bond.
```
- C - N1
     / \*
    C   C1 - C -
     \ /
    C - C
```

## Parameters
- `query` / "Query": Query string used to select the atoms and bonds.
- `selectParticles` / "Select particles": Create a selection for the particles selected by the query string.
- `selectBonds` / "Select bonds": Create a selection for the bonds defined by the query string.

## Installation
- OVITO Pro [integrated Python interpreter](https://docs.ovito.org/python/introduction/installation.html#ovito-pro-integrated-interpreter):
```
ovitos -m pip install --user git+https://github.com/ovito-org/MatchMolecule.git
```
The `--user` option is recommended and [installs the package in the user's site directory](https://pip.pypa.io/en/stable/user_guide/#user-installs).

An example can be found [here](https://github.com/ovito-org/GenerateRandomSolution).
- Other Python interpreters or Conda environments:
```
pip install git+https://github.com/ovito-org/MatchMolecule.git
```

As of August 16, 2023, according to the [GitHub documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions), *"GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners."* Please refer to the GitHub documentation if you are uncertain about incurring costs.
## Technical information / dependencies
- Tested on OVITO version 3.10.6

## Contact
- Daniel Utt ([email protected])
29 changes: 0 additions & 29 deletions README_Template.md

This file was deleted.

21 changes: 11 additions & 10 deletions pyproject.toml
@@ -3,24 +3,25 @@ requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "[[PackageName]]"
version = "[[Version number]]"
description = "[[Short description]]"
name = "MatchMolecule"
version = "2024.1"
description = "Match parts of molecules using query strings"
keywords = ["ovito", "ovito-extension"]
authors = [{name = "[[Author 1 name]]", email = "[[Author 1 email]]"}, {name = "[[Author 2 name]]", email = "[[Author 2 email]]"}]
maintainers = [{name = "[[Maintainer 1 name]]", email = "[[Maintainer 1 email]]"}]
license = {text = "[[License]]"}
authors = [{name = "Daniel Utt", email = "[email protected]"}]
maintainers = [{name = "Daniel Utt", email = "[email protected]"}]
license = {text = "MIT"}
readme = "README.md"
requires-python = ">=3.7"
requires-python = ">=3.9"
dependencies = [
"ovito >= 3.9.1",
"ovito >= 3.10.6",
"networkx >= 3.0",
]

[project.urls]
repository = "[[Repository Link]]"
repository = "https://github.com/ovito-org/MatchMolecule"

[project.entry-points.'OVITO.Modifier']
"[[Human readable modifier name]]" = "[[PackageName]]:[[ModifierName]]"
"Match Molecule" = "MatchMolecule:MatchMolecule"

[tool.setuptools.packages.find]
where = ["src"]
145 changes: 145 additions & 0 deletions src/MatchMolecule/__init__.py
@@ -0,0 +1,145 @@
#### Match Molecule ####
# Match parts of molecules using query strings.

import networkx as nx
import numpy as np
from ovito.data import DataCollection
from ovito.pipeline import ModifierInterface
from traits.api import Bool, Str


class MatchMolecule(ModifierInterface):
    query = Str("", label="Query", ovito_invalidate_cache=False)
    selectParticles = Bool(True, label="Select particles", ovito_invalidate_cache=False)
    selectBonds = Bool(True, label="Select bonds", ovito_invalidate_cache=False)

    def tokenizer(self):
        tokens = []
        offset = 0
        for i in range(0, len(self.query)):
            idx = offset + i
            if idx >= len(self.query):
                break
            elif self.query[idx] == '"' or self.query[idx] == "'":
                j = 1
                while (
                    idx + j < len(self.query)
                    and self.query[idx + j] != '"'
                    and self.query[idx + j] != "'"
                ):
                    j += 1
                offset += j
                tokens.append(self.query[idx + 1 : idx + j])
            elif idx + 1 < len(self.query) and self.query[idx + 1].islower():
                j = 1
                while idx + j < len(self.query) and self.query[idx + j].islower():
                    j += 1
                offset += j - 1
                tokens.append(self.query[idx : idx + j])
            else:
                tokens.append(self.query[idx])
        return tokens

    def parseBranch(self, tokens, G, con, start=0, connect=-1):
        str_offset = 0
        for i in range(start, len(tokens)):
            idx = i + str_offset
            if idx >= len(tokens):
                return
            elif tokens[idx].isdigit():
                if tokens[idx] in con:
                    G.add_edge(connect, con[tokens[idx]])
                else:
                    con[tokens[idx]] = connect
            elif tokens[idx] == "(":
                str_offset += self.parseBranch(
                    tokens, G, con, start=idx + 1, connect=connect
                )
            elif tokens[idx] == ")":
                return i - start + 1 + str_offset
            else:
                G.add_node(len(G.nodes), tag=tokens[idx])
                if connect != -1:
                    G.add_edge(connect, len(G.nodes) - 1)
                connect = len(G.nodes) - 1

    def read_query(self, data_cache, frame):
        cache_key = f"query_{frame}"
        self.query.strip()
        if not (
            cache_key in data_cache.attributes
            and data_cache.attributes[cache_key] == self.query
        ):
            data_cache.attributes[f"matches_{frame}"] = None
            connections = {}
            G = nx.Graph()
            self.parseBranch(self.tokenizer(), G, connections)
            data_cache.attributes[cache_key] = self.query
            data_cache.attributes[cache_key + "_graph"] = G
            data_cache.attributes["matches"] = None
        return data_cache.attributes[cache_key + "_graph"]

    @staticmethod
    def parseStructure(data, data_cache, frame):
        cache_key = f"molecule_graph_{frame}"
        if cache_key not in data_cache.attributes:
            G = nx.Graph()
            pTypes = data.particles["Particle Type"]
            for i, (a, b) in enumerate(data.particles.bonds.topology):
                name_a = data.particles.particle_types.type_by_id(pTypes[a]).name
                name_b = data.particles.particle_types.type_by_id(pTypes[b]).name
                G.add_node(a, tag=name_a)
                G.add_node(b, tag=name_b)
                G.add_edge(a, b, idx=i)
                yield i / data.particles.bonds.count
            data_cache.attributes[cache_key] = G
        return data_cache.attributes[cache_key]

    @staticmethod
    def node_matcher(n1, n2):
        if n1["tag"] == "?" or n2["tag"] == "?":
            return True
        return n1["tag"] == n2["tag"]

    @staticmethod
    def getMatches(moleculeG, queryG, data_cache, frame):
        cache_key = f"matches_{frame}"
        if (
            cache_key not in data_cache.attributes
            or data_cache.attributes[cache_key] is None
        ):
            matcher = nx.algorithms.isomorphism.GraphMatcher(
                moleculeG, queryG, node_match=__class__.node_matcher
            )
            data_cache.attributes[cache_key] = set()
            for match in matcher.subgraph_monomorphisms_iter():
                data_cache.attributes[cache_key].add(frozenset(match.keys()))
        return data_cache.attributes[cache_key]

    def modify(
        self, data: DataCollection, frame: int, data_cache: DataCollection, **kwargs
    ):
        if not self.query:
            return
        moleculeG = yield from self.parseStructure(data, data_cache, frame)
        queryG = self.read_query(data_cache, frame)

        if self.selectParticles:
            selection = data.particles_.create_property("Selection")
        if self.selectBonds:
            bond_selection = data.particles_.bonds_.create_property("Selection")
            bond_selection[:] = 0
            topo = data.particles.bonds.topology

        for match in self.getMatches(moleculeG, queryG, data_cache, frame):
            match = list(match)
            if self.selectParticles:
                selection[match] = 1

            if self.selectBonds:
                bond_selection[...] = np.logical_or(
                    bond_selection,
                    np.logical_and(
                        np.isin(topo[:, 0], match), np.isin(topo[:, 1], match)
                    ),
                )
14 changes: 0 additions & 14 deletions src/PackageName/__init__.py

This file was deleted.
