
Sweep: Update to G1rN/v8 #49

Open
marcosfelt opened this issue Jun 16, 2023 · 2 comments · May be fixed by #51
Labels
sweep Assigns Sweep to an issue or pull request.

Comments

Collaborator

marcosfelt commented Jun 16, 2023

Update .github/workflows/ci.yml to use Gr1N/setup-poetry@v8 instead of v7

@sweep-ai sweep-ai bot added the sweep Assigns Sweep to an issue or pull request. label Jun 16, 2023

sweep-ai bot commented Jun 16, 2023

Hey @marcosfelt,

I've started working on this issue. The plan is to update the version of setup-poetry action used in our GitHub workflow from v7 to v8. This change will be made in two places within the workflow file, specifically in the build and publish jobs.

Give me a minute!
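The edit itself is a one-token replacement in the workflow file. As a rough sketch of the change (the workflow path comes from the issue; `bump_action_version` is a hypothetical helper for illustration, not part of pura or Sweep):

```python
from pathlib import Path


def bump_action_version(workflow: Path, old: str, new: str) -> bool:
    """Replace an action pin in a workflow file; return True if anything changed."""
    text = workflow.read_text()
    updated = text.replace(old, new)
    if updated != text:
        workflow.write_text(updated)
        return True
    return False


# Usage on the file named in the issue:
# bump_action_version(Path(".github/workflows/ci.yml"),
#                     "Gr1N/setup-poetry@v7", "Gr1N/setup-poetry@v8")
```

A plain string replace is enough here because the pin `Gr1N/setup-poetry@v7` appears verbatim in both the build and publish jobs.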

Some code snippets I looked at. If some file is missing from here, you can mention the path in the ticket description.

# # )
# # correct_inchi = "ULNUEACLWYUUMO-UHFFFAOYSA-N"
# # # correct_smiles = Chem.CanonSmiles(correct_smiles)
# # input_compound, resolved_identifiers = resolved[0]
# # assert resolved_identifiers[0].value == correct_inchi


# def test_resolve_backup_identifiers():
#     logging.basicConfig(
#         level=logging.DEBUG,
#         format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
#     )
#     logger = logging.getLogger(__name__)
#     resolved = resolve_identifiers(
#         [
#             "Pd(OAc)2",
#             # "Josiphos SL-J001-1",
#             # "Rh(NBD)2BF4",
#             # "Dichloro(p-cymene)ruthenium(II) dimer",
#             # "DuPhos",
#         ],
#         input_identifer_type=CompoundIdentifierType.NAME,
#         output_identifier_type=CompoundIdentifierType.SMILES,
#         backup_identifier_types=[
#             CompoundIdentifierType.INCHI_KEY,
#             CompoundIdentifierType.CAS_NUMBER,
#         ],
#         services=[PubChem(autocomplete=False), CIR(), CAS()],
#         agreement=1,
#         silent=True,
#     )
#     print("\nResults\n")
#     for input_compound, resolved_identifiers in resolved:
#         print(input_compound, resolved_identifiers, "\n")
# # Josiphos SL-J001-1 [CompoundIdentifier(identifier_type=<CompoundIdentifierType.SMILES: 2>, value='C1CCCC1.CC(C1CCCC1P(c1ccccc1)c1ccccc1)P(C1CCCCC1)C1CCCCC1.[Fe]', details=None)]

def main():
    # loop = asyncio.new_event_loop()
    # asyncio.set_event_loop(loop)
    # loop.run_until_complete(_main())
    results = resolve_identifiers(
        list(catalyst_replacements.keys())[:2],
        output_identifier_type=CompoundIdentifierType.SMILES,
        input_identifer_type=CompoundIdentifierType.NAME,
        services=[LocalDatabase(return_canonical_only=True)],
    )
    for res in results:
        print(res)


if __name__ == "__main__":
    # import logging
    # logging.basicConfig(level=logging.DEBUG)
    main()
    # from rdkit import Chem
    # for name, smi in catalyst_replacements.items():

import logging
from pura.services import Service
from pura.compound import CompoundIdentifier, CompoundIdentifierType
from aiohttp import ClientSession
from typing import List, Optional, Tuple, Union
from databases import Database as AsyncDatabase
import sqlalchemy
import pandas as pd
import asyncio
import nest_asyncio
from rdkit import Chem
from sqlalchemy.dialects.sqlite import insert
from importlib_resources import files
import pathlib
DATA_PATH = pathlib.Path(files("pura.data").joinpath("pura.db"))
metadata = sqlalchemy.MetaData()
dialect = sqlalchemy.dialects.sqlite.dialect()
# Schema
# compound
# --------------
# id: INT, KEY
# inchi_key: STR
# identifiers
# --------------
# id: INT, KEY
# identifier_type: ENUM
# identifier_value: STR
# compound_id: Foreign Key to compound(id)
# canonical: BOOL -> True if this is the canonical identifier for the compound
metadata = sqlalchemy.MetaData()
compound_table = sqlalchemy.Table(
    "compound",
    metadata,
    sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column("inchi", sqlalchemy.String, nullable=False),
    sqlalchemy.UniqueConstraint("inchi", sqlite_on_conflict="IGNORE"),
)

pura/pura/resolvers.py

Lines 260 to 327 in e208bca

    )
)

async def _resolve(
    self,
    input_compounds: List[Compound],
    output_identifier_type: CompoundIdentifierType,
    backup_identifier_types: Optional[List[CompoundIdentifierType]] = None,
    agreement: Optional[int] = 1,
    batch_size: Optional[int] = None,
    n_retries: Optional[int] = 3,
    **kwargs,
) -> List[Tuple[Compound, Union[List[CompoundIdentifier], None]]]:
    """This is the async function with the same API as resolve"""
    # Run setup for services
    for service in self.services:
        await service.setup()
    n_identifiers = len(input_compounds)
    if batch_size is None:
        batch_size = 10 if n_identifiers >= 10 else n_identifiers
    n_batches = n_identifiers // batch_size
    n_batches += 0 if n_identifiers % batch_size == 0 else 1
    resolved_identifiers = []
    backup_identifier_types = (
        backup_identifier_types if backup_identifier_types is not None else []
    )
    progress_bar_type = kwargs.get("progress_bar_type", "tqdm")
    if progress_bar_type == "tqdm":
        progress_bar = tqdm
    elif progress_bar_type == "streamlit":
        from stqdm import stqdm

        progress_bar = stqdm
    # Iterate through batches
    for batch in progress_bar(range(n_batches), position=0, desc="Batch"):
        # Get subset of data
        start = batch * batch_size
        batch_identifiers = input_compounds[start : start + batch_size]
        # Start aiohttp session
        async with ClientSession() as session:
            # Create series of tasks to run in parallel
            tasks = [
                self._resolve_one_compound(
                    session,
                    compound_identifier,
                    output_identifier_type,
                    backup_identifier_types,
                    agreement,
                    n_retries=n_retries,
                )
                for compound_identifier in batch_identifiers
            ]
            batch_bar = progress_bar(
                asyncio.as_completed(tasks),
                total=len(tasks),
                desc=f"Batch {batch} Progress",
                position=1,
                leave=True,
            )
            resolved_identifiers.extend([await f for f in batch_bar])
            batch_bar.clear()
    for service in self.services:
        await service.teardown()
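For reference, the batch count above is a ceiling division, `n_batches = ceil(n_identifiers / batch_size)`, written as floor division plus a remainder check. A standalone sketch of that arithmetic (an illustration, not part of the pura API):

```python
from typing import Optional


def n_batches(n_identifiers: int, batch_size: Optional[int] = None) -> int:
    """Mirror of the batching arithmetic above: default batch size of 10,
    capped at the input count, with ceiling division."""
    if batch_size is None:
        batch_size = 10 if n_identifiers >= 10 else n_identifiers
    n = n_identifiers // batch_size  # full batches
    n += 0 if n_identifiers % batch_size == 0 else 1  # partial final batch
    return n
```

Note that, like the snippet above, an empty input list would make the default `batch_size` zero and raise a division error.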

name: Test and Publish
on:
  push:
  pull_request:
    branches:
      # Branches from forks have the form 'user:branch-name' so we only run
      # this job on pull_request events for branches that look like fork
      # branches. Without this we would end up running this job twice for non
      # forked PRs, once for the push and then once for opening the PR.
      - '**:**'
jobs:
  # Build the package
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Install python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'
      - name: Install poetry
        uses: Gr1N/setup-poetry@v7
      - name: Build package
        run: poetry build
      - name: Upload built package
        uses: actions/upload-artifact@v3
        with:
          name: dist
          path: dist/
          retention-days: 1
  # Run pytest using built package
  test:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python: ["3.8", "3.9", "3.10"]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Install python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}
          cache: 'pip'
          cache-dependency-path: "poetry.lock"
      - name: Download built package
        uses: actions/download-artifact@v3
        with:
          name: dist
      - name: Install pura and pytest
        shell: bash
        run: |
          WHL_NAME=$(ls pura-*.whl)
          pip install ${WHL_NAME}[experiments,entmoot] pytest
      - name: Run tests
        shell: bash
        run: pytest
  # Publish to pypi on version change
  # This is based on https://github.com/coveooss/pypi-publish-with-poetry
  publish:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Install python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'
      - name: Download built package
        uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist/
      - name: Install poetry
        uses: Gr1N/setup-poetry@v7
      - name: Install coveo-pypi-cli
        run: pip install coveo-pypi-cli
      - name: Determine the version for this release from the build
        id: current
        run: |
          BUILD_VER="$(ls dist/pura-*.tar.gz)"
          echo "Path: $BUILD_VER"
          if [[ $BUILD_VER =~ (pura-)([^,][0-9.]{4}) ]]; then
            echo "::set-output name=version::${BASH_REMATCH[2]}"
            echo "Version of build: ${BASH_REMATCH[2]}"
          else
            echo "No version found"
          fi
      - name: Get latest published version
        id: published
        run: |
          PUB_VER="$(pypi current-version pura)"
          echo "::set-output name=version::$PUB_VER"
          echo "Latest published version: $PUB_VER"
      - name: Publish to pypi if new version
        if: (steps.current.outputs.version != steps.published.outputs.version)
        shell: bash
        run: |
          poetry config pypi-token.pypi ${{ secrets.PYPI_TOKEN }}
          if [[ '${{ github.ref_name }}' == 'main' ]]; then
            poetry publish
          else
            echo "Dry run of publishing the package"
            poetry publish --dry-run
          fi
      - name: Tag repository
        shell: bash
        id: get-next-tag
        if: (steps.current.outputs.version != steps.published.outputs.version)
        run: |
          TAG_NAME=${{ steps.current.outputs.version }}
          echo "::set-output name=tag-name::$TAG_NAME"
          echo "This release will be tagged as $TAG_NAME"
          git config user.name "github-actions"
          git config user.email "[email protected]"
          git tag --annotate --message="Automated tagging system" $TAG_NAME ${{ github.sha }}
      - name: Push the tag
        if: (steps.current.outputs.version != steps.published.outputs.version)
        env:
          TAG_NAME: ${{ steps.current.outputs.version }}
        run: |
          if [[ ${{ github.ref_name }} == 'main' ]]; then
            git push origin $TAG_NAME
          else
            echo "If this was the main branch, I would push a new tag named $TAG_NAME"
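One caveat in the quoted workflow: the bash pattern `(pura-)([^,][0-9.]{4})` captures exactly five characters, so it only handles versions like `0.1.0` and would truncate longer ones like `0.10.1`. A looser extraction, sketched in Python for clarity (the regex is a suggested alternative, not what the workflow currently uses):

```python
import re
from typing import Optional

# One or more dot-separated numeric components after "pura-"
VERSION_RE = re.compile(r"pura-([0-9]+(?:\.[0-9]+)+)")


def version_from_sdist(path: str) -> Optional[str]:
    """Extract the version from an sdist path like dist/pura-0.2.1.tar.gz."""
    m = VERSION_RE.search(path)
    return m.group(1) if m else None
```

The equivalent bash pattern would be `(pura-)([0-9]+(\.[0-9]+)+)`, which works for any number of version components.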


I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!

@marcosfelt marcosfelt added sweep Assigns Sweep to an issue or pull request. and removed sweep Assigns Sweep to an issue or pull request. labels Jun 24, 2023

sweep-ai bot commented Jun 24, 2023

Hey @marcosfelt,

I've started working on the PR for this issue. The plan is pretty straightforward: I'll be updating the version of Gr1N/setup-poetry from v7 to v8 in the .github/workflows/ci.yml file.

Give me a minute!

Best,
Sweep bot

Some code snippets I looked at. If some file is missing from here, you can mention the path in the ticket description.

# Dictionary for manual canonicalization
# initialize a dict that maps catalysts to the humanly cleaned smiles

from pura.compound import (

(Same imports and database-schema snippet as in the previous comment.)

if isinstance(identifier, int):
    identifier = str(identifier)
if not isinstance(identifier, text_types):
    identifier = ",".join(str(x) for x in identifier)
# Filter None values from kwargs
kwargs = dict((k, v) for k, v in kwargs.items() if v is not None)
# Build API URL
urlid, postdata = None, {}
if namespace == "sourceid":
    identifier = identifier.replace("/", ".")
if (
    namespace in ["listkey", "formula", "sourceid"]
    or searchtype == "xref"
    or (searchtype and namespace == "cid")
    or domain == "sources"
):
    urlid = quote(identifier.encode("utf8"))
else:
    # postdata = urlencode([(namespace, identifier)]).encode("utf8")
    postdata = {namespace: identifier}
comps = filter(
    None, [api_base, domain, searchtype, namespace, urlid, operation, output]
)
apiurl = "/".join(comps)
if kwargs:
    apiurl += "?%s" % urlencode(kwargs)
# Make request
logger.debug("Request URL: %s", apiurl)
logger.debug("Request data: %s", postdata)
(Same ci.yml workflow snippet as in the previous comment, including the two `Gr1N/setup-poetry@v7` pins.)


I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!

@sweep-ai sweep-ai bot linked a pull request Jun 24, 2023 that will close this issue