Initial commit

ovito-org · Aug 30, 2024 · bc1c07d
1 parent b4837a3

Showing 6 changed files with 225 additions and 77 deletions.
diff --git a/LICENSE b/LICENSE
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) [[Year]] [[Author]]
+Copyright (c) 2024 Daniel Utt 
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

diff --git a/README.md b/README.md
@@ -1,34 +1,79 @@
-# Python Modifier Template
+# Match Molecule
+Match parts of molecules using query strings.
 
-Template for a custom Python-based modifier that hooks into OVITO and can easily be shared with other users.
+## Description / Examples
+This modifier allows you to select sections of molecules using query strings. The query strings use a simplified form of [SMILES](https://en.wikipedia.org/wiki/Simplified_Molecular_Input_Line_Entry_System):
 
-This repository contains a template for creating your own [Python script modifier](https://docs.ovito.org/python/introduction/custom_modifiers.html), 
-which can be installed into *OVITO Pro* or the [`ovito`](https://pypi.org/project/ovito/) Python module using *pip*.
+![Smile explanation image](https://upload.wikimedia.org/wikipedia/commons/0/00/SMILES.png)
+> Original by Fdardel, slight edit by DMacks, CC BY-SA 3.0 <http://creativecommons.org/licenses/by-sa/3.0/>, via Wikimedia Commons
 
-## Getting Started
+In SMILES, molecules are defined by strings.
 
-1. Click the "Use this template" button to create your own repository based on this template.
-2. Rename `src/PackageName` to reflect the name of your modifier.
-3. Implement your [modifier](https://docs.ovito.org/python/introduction/custom_modifiers.html#advanced-interface) in [`src/PackageName/__init__.py`](src/PackageName/__init__.py). If your modifier needs access to more than one frame of a trajectory, you can uncomment and implement the `input_caching_hints` method. Otherwise, you can delete it. More details on this method can be found in the [OVITO Python docs](https://www.ovito.org/docs/current/python/introduction/custom_modifiers.html#writing-custom-modifiers-advanced-interface). 
-4. Fill in the [`pyproject.toml`](pyproject.toml) file. Fields that need to be replaced with your information are enclosed in descriptive `[[field]]` tags. Please make sure to include ovito>=3.9.1 as a dependency. Depending on your needs, you can add additional fields to the `pyproject.toml` file. Information can be found [here](https://setuptools.pypa.io/en/latest/userguide/index.html).
-5. Fill in the [`README_Template.md`](README_Template.md) file. Again, the `[[fields]]` placeholders should guide you. Feel free to add other sections like "Images", "Citation", or "References" as needed.
-6. Add meaningful examples and data sample files to the `examples` directory to help others understand the use of your modifier.
-7. Pick a license for your project and replace the current (MIT) [`LICENSE`](LICENSE) file with your license. If you keep the MIT license, please update the name and year in the current file.
-8. Once you're done, rename `README_Template.md` to `README.md`, replacing this file.
+### Selecting linear molecules
+In its simplest form, the query `HOH` defines the water (`H-O-H`) molecule.
 
-## Testing
-This repository is configured to enable automated testing using the [pytest](https://docs.pytest.org/en/7.4.x/) framework. Tests are automatically executed after each push to the main branch. To set up and activate automated testing, follow these two steps:
+### Adding side chains
+To define branched molecules, use parentheses `()`. To select this submolecule,
+``` 
+  O
+   \
+     N - C - C -
+   /
+H-O
+```
+one might use the query string `ON(OH)CC`. Here `(OH)` denotes a side chain that branches off from the preceding `N` atom.
 
-1. Write your tests in the `test/test_modifier.py` file. You can also use other filenames that adhere to the pytest requirements.
-2. Open the `.github/workflows/python-tests.yml` file and remove the `if: ${{ false }}` condition on line 15.
+### Selecting multi-letter elements
+To select this group of atoms,
+```
+- C - Fe -
+  |   |
+  H   O
+      |
+      H
+```
+you could write the query `C(H)"Fe"(OH)`. Note that multi-letter chemical elements need to be enclosed in double quotes (`""`). An equivalent formulation would be `C(H)"Fe"OH`.
 
-If needed, you can also adjust the operating system and Python versions by modifying the following lines:
-```yaml
-os: [ubuntu-latest, macos-latest, windows-latest]
-python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
+### Adding wildcards / placeholders
+If you want to match multiple sub-molecules, you can use the `?` wildcard character. `H?H` would match both the `H-O-H` and the `H-N-H` molecules (and any other molecule in which two H atoms are connected by a single bridging atom).
+
+### Creating additional bonds
+This syntax can be limiting, so you might need to add bonds to your string manually. Suppose you want to select this group of atoms:
+```
+- C - N 
+    /   \*
+   C      C - C -
+   \     /
+    C - C
 ```
+Here you could write `CNCCCCC`. This would select all atoms, but the bond marked with `*` in the picture would be missing. In such cases you can use numbers to tag atoms: atoms with the same numerical tag are connected by a bond. The query string `CN1CCCC1C` correctly selects all atoms and bonds shown in the picture. Here the two atoms tagged `1` are connected to form the bond highlighted by `*`.
+```
+- C - N1
+    /   \*
+   C      C1 - C -
+   \     /
+    C - C
+```
+
+## Parameters 
+- `query` / "Query": Query string used to select the atoms and bonds.
+- `selectParticles` / "Select particles": Create a selection for the particles selected by the query string. 
+- `selectBonds` / "Select bonds": Create a selection for the bonds defined by the query string. 
+
+## Installation
+- OVITO Pro [integrated Python interpreter](https://docs.ovito.org/python/introduction/installation.html#ovito-pro-integrated-interpreter):
+  ```
+  ovitos -m pip install --user git+https://github.com/ovito-org/MatchMolecule.git
+  ``` 
+  The `--user` option is recommended and [installs the package in the user's site directory](https://pip.pypa.io/en/stable/user_guide/#user-installs).
 
-An example can be found [here](https://github.com/ovito-org/GenerateRandomSolution).
+- Other Python interpreters or Conda environments:
+  ```
+  pip install git+https://github.com/ovito-org/MatchMolecule.git
+  ```
 
-As of August 16, 2023, according to the [GitHub documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions), *"GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners."* Please refer to the GitHub documentation if you are uncertain about incurring costs.
+## Technical information / dependencies
+- Tested on OVITO version 3.10.6
 
+## Contact
+- Daniel Utt ([email protected])
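
Review note: the query syntax described in the README above (linear chains, `()` side chains, quoted multi-letter elements, `?` wildcards, and numeric tags for extra bonds) can be illustrated with a small standalone parser. This is a minimal sketch using only the standard library, not the code added in this commit: the function names `tokenize` and `parse` are hypothetical, only double-quoted multi-letter elements are handled, and malformed input (such as unbalanced parentheses) is not checked.

```python
def tokenize(query):
    """Split a query into element symbols, '?', digits and parentheses."""
    tokens, i = [], 0
    while i < len(query):
        ch = query[i]
        if ch == '"':
            # Quoted multi-letter element such as "Fe".
            end = query.index('"', i + 1)
            tokens.append(query[i + 1:end])
            i = end + 1
        else:
            tokens.append(ch)
            i += 1
    return tokens


def parse(query):
    """Return (tags, edges): one tag per atom and a set of bonded index pairs."""
    tags, edges = [], set()
    branch_points = []  # atoms at which a "(" side chain was opened
    numeric_tags = {}   # numeric tag -> first atom carrying that tag
    current = None      # atom that the next atom will be bonded to

    def bond(a, b):
        edges.add((min(a, b), max(a, b)))

    for tok in tokenize(query):
        if tok == "(":
            branch_points.append(current)
        elif tok == ")":
            # Continue the main chain from the atom where the branch started.
            current = branch_points.pop()
        elif tok.isdigit():
            if tok in numeric_tags:
                bond(numeric_tags[tok], current)
            else:
                numeric_tags[tok] = current
        else:
            tags.append(tok)
            new = len(tags) - 1
            if current is not None:
                bond(current, new)
            current = new
    return tags, edges


tags, edges = parse("ON(OH)CC")
print(tags, sorted(edges))
```

For `CN1CCCC1C`, the two atoms tagged `1` (the `N` at index 1 and the `C` at index 5) receive an additional bond, which closes the ring shown in the README diagram.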
diff --git a/README_Template.md b/README_Template.md
diff --git a/pyproject.toml b/pyproject.toml
@@ -3,24 +3,25 @@ requires = ["setuptools", "wheel"]
 build-backend = "setuptools.build_meta"
 
 [project]
-name = "[[PackageName]]"
-version = "[[Version number]]"
-description = "[[Short description]]"
+name = "MatchMolecule"
+version = "2024.1"
+description = "Match parts of molecules using query strings"
 keywords = ["ovito", "ovito-extension"]
-authors = [{name = "[[Author 1 name]]", email = "[[Author 1 email]]"}, {name = "[[Author 2 name]]", email = "[[Author 2 email]]"}]
-maintainers = [{name = "[[Maintainer 1 name]]", email = "[[Maintainer 1 email]]"}]
-license = {text = "[[License]]"}
+authors = [{name = "Daniel Utt", email = "[email protected]"}]
+maintainers = [{name = "Daniel Utt", email = "[email protected]"}]
+license = {text = "MIT"}
 readme = "README.md"
-requires-python = ">=3.7"
+requires-python = ">=3.9"
 dependencies = [
-    "ovito >= 3.9.1",
+    "ovito >= 3.10.6",
+    "networkx >= 3.0",
 ]
 
 [project.urls]
-repository = "[[Repository Link]]"
+repository = "https://github.com/ovito-org/MatchMolecule"
 
 [project.entry-points.'OVITO.Modifier']
-"[[Human readable modifier name]]" = "[[PackageName]]:[[ModifierName]]"
+"Match Molecule" = "MatchMolecule:MatchMolecule"
 
 [tool.setuptools.packages.find]
 where = ["src"]
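
Review note: after installing the package, one way to check that the `OVITO.Modifier` entry point declared above is registered in the current Python environment is to list that entry-point group with the standard library. This is a minimal sketch: the helper name `list_ovito_modifiers` is made up for illustration, and it only reports what the environment has registered. How OVITO itself consumes these entries is not shown in this commit.

```python
from importlib.metadata import entry_points


def list_ovito_modifiers():
    """Return {display name: "module:object"} for the OVITO.Modifier group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and newer expose a selectable interface.
        group = eps.select(group="OVITO.Modifier")
    else:
        # Python 3.9 returns a plain dict keyed by group name.
        group = eps.get("OVITO.Modifier", [])
    return {ep.name: ep.value for ep in group}


print(list_ovito_modifiers())
```

If the package is installed, the result would be expected to contain a `"Match Molecule"` key mapped to `"MatchMolecule:MatchMolecule"`, mirroring the `pyproject.toml` entry. In an environment without the package the dictionary simply lacks that key.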

diff --git a/src/MatchMolecule/__init__.py b/src/MatchMolecule/__init__.py
@@ -0,0 +1,145 @@
+#### Match Molecule ####
+# Match parts of molecules using query strings.
+
+import networkx as nx
+import numpy as np
+from ovito.data import DataCollection
+from ovito.pipeline import ModifierInterface
+from traits.api import Bool, Str
+
+
+class MatchMolecule(ModifierInterface):
+    query = Str("", label="Query", ovito_invalidate_cache=False)
+    selectParticles = Bool(True, label="Select particles", ovito_invalidate_cache=False)
+    selectBonds = Bool(True, label="Select bonds", ovito_invalidate_cache=False)
+
+    def tokenizer(self):
+        tokens = []
+        offset = 0
+        for i in range(0, len(self.query)):
+            idx = offset + i
+            if idx >= len(self.query):
+                break
+            elif self.query[idx] == '"' or self.query[idx] == "'":
+                j = 1
+                while (
+                    idx + j < len(self.query)
+                    and self.query[idx + j] != '"'
+                    and self.query[idx + j] != "'"
+                ):
+                    j += 1
+                offset += j
+                tokens.append(self.query[idx + 1 : idx + j])
+            elif idx + 1 < len(self.query) and self.query[idx + 1].islower():
+                j = 1
+                while idx + j < len(self.query) and self.query[idx + j].islower():
+                    j += 1
+                offset += j - 1
+                tokens.append(self.query[idx : idx + j])
+            else:
+                tokens.append(self.query[idx])
+        return tokens
+
+    def parseBranch(self, tokens, G, con, start=0, connect=-1):
+        str_offset = 0
+        for i in range(start, len(tokens)):
+            idx = i + str_offset
+            if idx >= len(tokens):
+                return
+            elif tokens[idx].isdigit():
+                if tokens[idx] in con:
+                    G.add_edge(connect, con[tokens[idx]])
+                else:
+                    con[tokens[idx]] = connect
+            elif tokens[idx] == "(":
+                str_offset += self.parseBranch(
+                    tokens, G, con, start=idx + 1, connect=connect
+                )
+            elif tokens[idx] == ")":
+                return i - start + 1 + str_offset
+            else:
+                G.add_node(len(G.nodes), tag=tokens[idx])
+                if connect != -1:
+                    G.add_edge(connect, len(G.nodes) - 1)
+                connect = len(G.nodes) - 1
+
+    def read_query(self, data_cache, frame):
+        cache_key = f"query_{frame}"
+        self.query.strip()  # NOTE: no-op, str.strip() returns a new string and the result is discarded
+        if not (
+            cache_key in data_cache.attributes
+            and data_cache.attributes[cache_key] == self.query
+        ):
+            data_cache.attributes[f"matches_{frame}"] = None
+            connections = {}
+            G = nx.Graph()
+            self.parseBranch(self.tokenizer(), G, connections)
+            data_cache.attributes[cache_key] = self.query
+            data_cache.attributes[cache_key + "_graph"] = G
+            data_cache.attributes["matches"] = None  # NOTE: never read; getMatches uses f"matches_{frame}" (reset above)
+        return data_cache.attributes[cache_key + "_graph"]
+
+    @staticmethod
+    def parseStructure(data, data_cache, frame):
+        cache_key = f"molecule_graph_{frame}"
+        if cache_key not in data_cache.attributes:
+            G = nx.Graph()
+            pTypes = data.particles["Particle Type"]
+            for i, (a, b) in enumerate(data.particles.bonds.topology):
+                name_a = data.particles.particle_types.type_by_id(pTypes[a]).name
+                name_b = data.particles.particle_types.type_by_id(pTypes[b]).name
+                G.add_node(a, tag=name_a)
+                G.add_node(b, tag=name_b)
+                G.add_edge(a, b, idx=i)
+                yield i / data.particles.bonds.count
+            data_cache.attributes[cache_key] = G
+        return data_cache.attributes[cache_key]
+
+    @staticmethod
+    def node_matcher(n1, n2):
+        if n1["tag"] == "?" or n2["tag"] == "?":
+            return True
+        return n1["tag"] == n2["tag"]
+
+    @staticmethod
+    def getMatches(moleculeG, queryG, data_cache, frame):
+        cache_key = f"matches_{frame}"
+        if (
+            cache_key not in data_cache.attributes
+            or data_cache.attributes[cache_key] is None
+        ):
+            matcher = nx.algorithms.isomorphism.GraphMatcher(
+                moleculeG, queryG, node_match=__class__.node_matcher
+            )
+            data_cache.attributes[cache_key] = set()
+            for match in matcher.subgraph_monomorphisms_iter():
+                data_cache.attributes[cache_key].add(frozenset(match.keys()))
+        return data_cache.attributes[cache_key]
+
+    def modify(
+        self, data: DataCollection, frame: int, data_cache: DataCollection, **kwargs
+    ):
+        if not self.query:
+            return
+        moleculeG = yield from self.parseStructure(data, data_cache, frame)
+        queryG = self.read_query(data_cache, frame)
+
+        if self.selectParticles:
+            selection = data.particles_.create_property("Selection")
+        if self.selectBonds:
+            bond_selection = data.particles_.bonds_.create_property("Selection")
+            bond_selection[:] = 0
+            topo = data.particles.bonds.topology
+
+        for match in self.getMatches(moleculeG, queryG, data_cache, frame):
+            match = list(match)
+            if self.selectParticles:
+                selection[match] = 1
+
+            if self.selectBonds:
+                bond_selection[...] = np.logical_or(
+                    bond_selection,
+                    np.logical_and(
+                        np.isin(topo[:, 0], match), np.isin(topo[:, 1], match)
+                    ),
+                )
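
Review note: `getMatches` above delegates the search to networkx (`GraphMatcher.subgraph_monomorphisms_iter` with a `node_match` callback that treats `?` as a wildcard) and stores each match as a `frozenset` of atom indices. To make that matching step concrete, here is a plain-Python backtracking stand-in. It is a sketch for illustration only, not the algorithm networkx uses: the function name `find_matches` and the dictionary-based molecule representation are assumptions, and it is not tuned for large systems.

```python
def find_matches(mol_tags, mol_adj, query_tags, query_edges):
    """Return the distinct sets of molecule atoms that match the query.

    mol_tags:    dict, atom index -> element name
    mol_adj:     dict, atom index -> set of bonded atom indices
    query_tags:  list of element names, where "?" matches any element
    query_edges: set of (i, j) pairs of bonded query atoms
    """
    n = len(query_tags)
    q_adj = {i: set() for i in range(n)}
    for a, b in query_edges:
        q_adj[a].add(b)
        q_adj[b].add(a)

    matches = set()

    def extend(assignment):
        # assignment[p] is the molecule atom chosen for query atom p.
        q = len(assignment)
        if q == n:
            matches.add(frozenset(assignment))
            return
        for atom in mol_tags:
            if atom in assignment:
                continue
            if query_tags[q] != "?" and query_tags[q] != mol_tags[atom]:
                continue
            # Every query bond to an already placed atom must exist in the molecule.
            if all(assignment[p] in mol_adj[atom] for p in q_adj[q] if p < q):
                extend(assignment + [atom])

    extend([])
    return matches


# One water molecule (atoms 0-2) and one H-N-H fragment (atoms 3-5).
mol_tags = {0: "H", 1: "O", 2: "H", 3: "H", 4: "N", 5: "H"}
mol_adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3, 5}, 5: {4}}
chain = {(0, 1), (1, 2)}

water_only = find_matches(mol_tags, mol_adj, ["H", "O", "H"], chain)
any_bridge = find_matches(mol_tags, mol_adj, ["H", "?", "H"], chain)
print(len(water_only), len(any_bridge))
```

Collecting matches as frozensets, as the commit does, collapses the symmetric assignments of one fragment (for example, the two ways of mapping `HOH` onto the same water molecule) into a single entry.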
diff --git a/src/PackageName/__init__.py b/src/PackageName/__init__.py