[SCHEMATIC-192] V24.12.1 into Main (#1562) #1563

Merged 1 commit into main on Dec 16, 2024
2 changes: 1 addition & 1 deletion .github/workflows/api_test.yml
@@ -13,7 +13,7 @@ on:

jobs:
test:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
env:
POETRY_VERSION: 1.3.0
strategy:
2 changes: 1 addition & 1 deletion .github/workflows/pdoc.yml
@@ -25,7 +25,7 @@ concurrency:
cancel-in-progress: true
jobs:
build:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
env:
POETRY_VERSION: 1.3.0
PYTHON_VERSION: "3.10"
9 changes: 5 additions & 4 deletions .github/workflows/publish.yml
@@ -7,9 +7,10 @@ on:

jobs:
pypi_release:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
env:
POETRY_VERSION: 1.3.0
PYTHON_VERSION: "3.10"
if: github.event_name == 'push' && contains(github.ref, 'refs/tags')
steps:
#----------------------------------------------
@@ -18,10 +19,10 @@ jobs:
- name: Check out repository
uses: actions/checkout@v4

- name: Set up Python ${{ matrix.python-version }}
- name: Set up Python ${{ env.PYTHON_VERSION }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ env.PYTHON_VERSION }}

#----------------------------------------------
# install & configure poetry
@@ -48,7 +49,7 @@ jobs:
- name: Get current pushed tag
run: |
echo "RELEASE_VERSION=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
echo ${{ env.RELEASE_VERSION }}
echo "$RELEASE_VERSION"

#----------------------------------------------
# override version tag
3 changes: 3 additions & 0 deletions .github/workflows/scan_repo.yml
@@ -12,6 +12,9 @@ jobs:
trivy:
name: Trivy
runs-on: ubuntu-latest
env:
TRIVY_DB_REPOSITORY: public.ecr.aws/aquasecurity/trivy-db:2
TRIVY_JAVA_DB_REPOSITORY: public.ecr.aws/aquasecurity/trivy-java-db:1
steps:
- name: Checkout code
uses: actions/checkout@v4
29 changes: 28 additions & 1 deletion .github/workflows/test.yml
@@ -25,7 +25,7 @@ concurrency:
cancel-in-progress: true
jobs:
test:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
env:
POETRY_VERSION: 1.3.0
strategy:
@@ -127,11 +127,31 @@ jobs:
#----------------------------------------------
# run integration test suite
#----------------------------------------------

- name: Retrieve telemetry access token from IDP
if: ${{ contains(fromJSON('["3.10"]'), matrix.python-version) }}
id: retrieve-telemetry-access-token
run: |
response=$(curl --request POST \
--url ${{ vars.TELEMETRY_AUTH_CLIENT_URL }} \
--header 'content-type: application/json' \
--data '{"client_id":"${{ vars.TELEMETRY_AUTH_CLIENT_ID }}","client_secret":"${{ secrets.TELEMETRY_AUTH_CLIENT_SECRET }}","audience":"${{ vars.TELEMETRY_AUTH_AUDIENCE }}","grant_type":"client_credentials"}')
access_token=$(echo $response | jq -r .access_token)
echo "::add-mask::$access_token"
echo "TELEMETRY_ACCESS_TOKEN=$access_token" >> "$GITHUB_OUTPUT"
- name: Run integration tests
if: ${{ contains(fromJSON('["3.10"]'), matrix.python-version) }}
env:
SYNAPSE_ACCESS_TOKEN: ${{ secrets.SYNAPSE_ACCESS_TOKEN }}
SERVICE_ACCOUNT_CREDS: ${{ secrets.SERVICE_ACCOUNT_CREDS }}
OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer ${{ steps.retrieve-telemetry-access-token.outputs.TELEMETRY_ACCESS_TOKEN }}"
DEPLOYMENT_ENVIRONMENT: ${{ vars.DEPLOYMENT_ENVIRONMENT }}
OTEL_EXPORTER_OTLP_ENDPOINT: ${{ vars.OTEL_EXPORTER_OTLP_ENDPOINT }}
TRACING_EXPORT_FORMAT: ${{ vars.TRACING_EXPORT_FORMAT }}
LOGGING_EXPORT_FORMAT: ${{ vars.LOGGING_EXPORT_FORMAT }}
TRACING_SERVICE_NAME: ${{ vars.TRACING_SERVICE_NAME }}
LOGGING_SERVICE_NAME: ${{ vars.LOGGING_SERVICE_NAME }}
SERVICE_INSTANCE_ID: ${{ github.head_ref || github.ref_name }}
run: >
poetry run pytest --durations=0 --cov-append --cov-report=term --cov-report=html:htmlcov --cov-report=xml:coverage.xml --cov=schematic/
-m "not (rule_benchmark or single_process_execution)" --reruns 4 -n 8 --ignore=tests/unit
@@ -141,6 +161,13 @@ jobs:
env:
SYNAPSE_ACCESS_TOKEN: ${{ secrets.SYNAPSE_ACCESS_TOKEN }}
SERVICE_ACCOUNT_CREDS: ${{ secrets.SERVICE_ACCOUNT_CREDS }}
OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer ${{ steps.retrieve-telemetry-access-token.outputs.TELEMETRY_ACCESS_TOKEN }}"
DEPLOYMENT_ENVIRONMENT: ${{ vars.DEPLOYMENT_ENVIRONMENT }}
OTEL_EXPORTER_OTLP_ENDPOINT: ${{ vars.OTEL_EXPORTER_OTLP_ENDPOINT }}
TRACING_EXPORT_FORMAT: ${{ vars.TRACING_EXPORT_FORMAT }}
LOGGING_EXPORT_FORMAT: ${{ vars.LOGGING_EXPORT_FORMAT }}
TRACING_SERVICE_NAME: ${{ vars.TRACING_SERVICE_NAME }}
LOGGING_SERVICE_NAME: ${{ vars.LOGGING_SERVICE_NAME }}
run: >
poetry run pytest --durations=0 --cov-append --cov-report=term --cov-report=html:htmlcov --cov-report=xml:coverage.xml --cov=schematic/
-m "single_process_execution" --reruns 4 --ignore=tests/unit
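Note for reviewers: the new "Retrieve telemetry access token from IDP" step is a standard OAuth2 client-credentials exchange done with `curl` and `jq`. For debugging the same flow outside CI, an equivalent sketch in Python is below; the `TELEMETRY_AUTH_*` environment variable names simply mirror the repository variables/secrets referenced in `test.yml` and are assumptions here, not part of this PR.

```python
# Sketch: reproduce the workflow's client_credentials token exchange locally.
# The TELEMETRY_AUTH_* values are placeholders mirroring the repository
# variables/secrets used in test.yml.
import os

import requests


def fetch_telemetry_access_token() -> str:
    """Perform the client_credentials grant used by the CI step."""
    response = requests.post(
        os.environ["TELEMETRY_AUTH_CLIENT_URL"],
        json={
            "client_id": os.environ["TELEMETRY_AUTH_CLIENT_ID"],
            "client_secret": os.environ["TELEMETRY_AUTH_CLIENT_SECRET"],
            "audience": os.environ["TELEMETRY_AUTH_AUDIENCE"],
            "grant_type": "client_credentials",
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["access_token"]


if __name__ == "__main__":
    token = fetch_telemetry_access_token()
    # The workflow forwards the token to the OTLP exporter as a header:
    os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Bearer {token}"
```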
14 changes: 7 additions & 7 deletions CONTRIBUTION.md
@@ -6,7 +6,7 @@ Please note we have a [code of conduct](CODE_OF_CONDUCT.md), please follow it in

## How to report bugs or feature requests

You can **create bug and feature requests** through [Sage Bionetwork's FAIR Data service desk](https://sagebionetworks.jira.com/servicedesk/customer/portal/5/group/8). Providing enough details to the developers to verify and troubleshoot your issue is paramount:
You can **create bug and feature requests** through [Sage Bionetwork's DPE schematic support](https://sagebionetworks.jira.com/servicedesk/customer/portal/5/group/7/create/225). Providing enough details to the developers to verify and troubleshoot your issue is paramount:
- **Provide a clear and descriptive title as well as a concise summary** of the issue to identify the problem.
- **Describe the exact steps which reproduce the problem** in as many details as possible.
- **Describe the behavior you observed after following the steps** and point out what exactly is the problem with that behavior.
@@ -25,7 +25,7 @@ For new features, bugs, enhancements:

#### 1. Branch Setup
* Pull the latest code from the develop branch in the upstream repository.
* Checkout a new branch formatted like so: `develop-<feature/fix-name>` from the develop branch
* Checkout a new branch formatted like so: `<JIRA-ID>-<feature/fix-name>` from the develop branch

#### 2. Development Workflow
* Develop on your new branch.
@@ -35,22 +35,22 @@ For new features, bugs, enhancements:
* You can choose to create a draft PR if you prefer to develop this way

#### 3. Branch Management
* Push code to `develop-<feature/fix-name>` in upstream repo:
* Push code to `<JIRA-ID>-<feature/fix-name>` in upstream repo:
```
git push <upstream> develop-<feature/fix-name>
git push <upstream> <JIRA-ID>-<feature/fix-name>
```
* Branch off `develop-<feature/fix-name>` if you need to work on multiple features associated with the same code base
* Branch off `<JIRA-ID>-<feature/fix-name>` if you need to work on multiple features associated with the same code base
* After feature work is complete and before creating a PR to the develop branch in upstream
a. ensure that code runs locally
b. test for logical correctness locally
c. run `pre-commit` to style code if the hook is not installed
d. wait for git workflow to complete (e.g. tests are run) on GitHub

#### 4. Pull Request and Review
* Create a PR from `develop-<feature/fix-name>` into the develop branch of the upstream repo
* Create a PR from `<JIRA-ID>-<feature/fix-name>` into the develop branch of the upstream repo
* Request a code review on the PR
* Once code is approved merge in the develop branch. The **"Squash and merge"** strategy should be used for a cleaner commit history on the `develop` branch. The description of the squash commit should include enough information to understand the context of the changes that were made.
* Once the actions pass on the main branch, delete the `develop-<feature/fix-name>` branch
* Once the actions pass on the main branch, delete the `<JIRA-ID>-<feature/fix-name>` branch

### Updating readthedocs documentation
1. Navigate to the docs directory.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -29,4 +29,4 @@ RUN poetry install --no-interaction --no-ansi --no-root

COPY . ./

RUN poetry install --only-root
RUN poetry install --only-root
2 changes: 2 additions & 0 deletions env.example
@@ -11,6 +11,8 @@ SERVICE_ACCOUNT_CREDS='Provide service account creds'
# LOGGING_EXPORT_FORMAT=otlp
# TRACING_SERVICE_NAME=schematic-api
# LOGGING_SERVICE_NAME=schematic-api
## Instance ID is used during integration tests export to identify the git branch
# SERVICE_INSTANCE_ID=schematic-1234
## Other examples: dev, staging, prod
# DEPLOYMENT_ENVIRONMENT=local
# OTEL_EXPORTER_OTLP_ENDPOINT=https://..../telemetry
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "schematicpy"
version = "24.11.2"
version = "24.12.1"
description = "Package for biomedical data model and metadata ingress management"
authors = [
"Milen Nikolov <[email protected]>",
20 changes: 14 additions & 6 deletions schematic/__init__.py
@@ -10,7 +10,13 @@
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.sdk.resources import DEPLOYMENT_ENVIRONMENT, SERVICE_NAME, Resource
from opentelemetry.sdk.resources import (
DEPLOYMENT_ENVIRONMENT,
SERVICE_INSTANCE_ID,
SERVICE_NAME,
SERVICE_VERSION,
Resource,
)
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, Span
from opentelemetry.sdk.trace.sampling import ALWAYS_OFF
@@ -20,6 +26,7 @@

from schematic.configuration.configuration import CONFIG
from schematic.loader import LOADER
from schematic.version import __version__
from schematic_api.api.security_controller import info_from_bearer_auth

Synapse.allow_client_caching(False)
@@ -91,16 +98,14 @@ def set_up_tracing(session: requests.Session) -> None:
Synapse.enable_open_telemetry(True)
tracing_service_name = os.environ.get("TRACING_SERVICE_NAME", "schematic-api")
deployment_environment = os.environ.get("DEPLOYMENT_ENVIRONMENT", "")
service_instance_id = os.environ.get("SERVICE_INSTANCE_ID", "")
trace.set_tracer_provider(
TracerProvider(
resource=Resource(
attributes={
SERVICE_INSTANCE_ID: service_instance_id,
SERVICE_NAME: tracing_service_name,
# TODO: Revisit this portion later on. As of 11/12/2024 when
# deploying this to ECS or running within a docker container,
# the package version errors out with the following error:
# importlib.metadata.PackageNotFoundError: No package metadata was found for schematicpy
# SERVICE_VERSION: package_version,
SERVICE_VERSION: __version__,
DEPLOYMENT_ENVIRONMENT: deployment_environment,
}
)
@@ -122,11 +127,14 @@ def set_up_logging(session: requests.Session) -> None:
logging_export = os.environ.get("LOGGING_EXPORT_FORMAT", None)
logging_service_name = os.environ.get("LOGGING_SERVICE_NAME", "schematic-api")
deployment_environment = os.environ.get("DEPLOYMENT_ENVIRONMENT", "")
service_instance_id = os.environ.get("SERVICE_INSTANCE_ID", "")
if logging_export == "otlp":
resource = Resource.create(
{
SERVICE_INSTANCE_ID: service_instance_id,
SERVICE_NAME: logging_service_name,
DEPLOYMENT_ENVIRONMENT: deployment_environment,
SERVICE_VERSION: __version__,
}
)

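Note for reviewers: the removed TODO explained that `SERVICE_VERSION` could not be populated from installed package metadata when running in Docker/ECS. The sketch below illustrates the difference; it assumes `schematic/version.py` (whose contents are not shown in this diff) exposes a plain `__version__` constant.

```python
# Why importing a constant avoids the error in the removed TODO: reading the
# installed distribution's metadata fails when the code runs from a source
# checkout inside a container, whereas a module constant is always importable.
from importlib.metadata import PackageNotFoundError, version

try:
    package_version = version("schematicpy")  # previous approach
except PackageNotFoundError:
    # Raised when deployed to ECS or run in Docker without an installed dist.
    package_version = "unknown"

# This PR's approach. The exact contents of schematic/version.py are not shown
# in the diff; a minimal compatible shape is a single assignment such as
#     __version__ = "24.12.1"
from schematic.version import __version__
```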
2 changes: 2 additions & 0 deletions schematic/__main__.py
@@ -13,6 +13,7 @@
from schematic.visualization.commands import (
viz as viz_cli,
) # viz generation commands
from schematic import __version__

logger = logging.getLogger()
click_log.basic_config(logger)
@@ -24,6 +25,7 @@
# invoke_without_command=True -> forces the application not to show aids before losing them with a --h
@click.group(context_settings=CONTEXT_SETTINGS, invoke_without_command=True)
@click_log.simple_verbosity_option(logger)
@click.version_option(version=__version__, prog_name="schematic")
def main():
"""
Command line interface to the `schematic` backend services.
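Note for reviewers: the added decorator wires the package version into a standard `--version` flag on the CLI. A self-contained sketch of the same pattern follows; the inline `__version__` constant stands in for the real `from schematic import __version__`, and the printed text follows click's default template.

```python
# Minimal sketch of the click pattern added in schematic/__main__.py.
import click

__version__ = "24.12.1"  # stand-in for `from schematic import __version__`


@click.group(invoke_without_command=True)
@click.version_option(version=__version__, prog_name="schematic")
def main():
    """Command line interface to the `schematic` backend services."""


if __name__ == "__main__":
    main()

# Usage:
#   $ schematic --version
#   schematic, version 24.12.1
```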
2 changes: 2 additions & 0 deletions schematic/manifest/generator.py
@@ -1904,6 +1904,8 @@ def get_manifest(
# TODO: avoid explicitly exposing Synapse store functionality
# just instantiate a Store class and let it decide at runtime/config
# the store type
# TODO: determine which parts of fileview are necessary for `get` operations
# and pass query parameters at object instantiation to avoid having to re-query
if access_token:
# for getting an existing manifest on AWS
store = SynapseStorage(access_token=access_token)
7 changes: 5 additions & 2 deletions schematic/models/validate_attribute.py
@@ -17,6 +17,7 @@

from schematic.schemas.data_model_graph import DataModelGraphExplorer
from schematic.store.synapse import SynapseStorage
from schematic.utils.df_utils import read_csv
from schematic.utils.validate_rules_utils import validation_rule_info
from schematic.utils.validate_utils import (
comma_separated_list_regex,
@@ -868,7 +869,7 @@ def _get_target_manifest_dataframes(
entity: File = self.synStore.getDatasetManifest(
datasetId=dataset_id, downloadFile=True
)
manifests.append(pd.read_csv(entity.path))
manifests.append(read_csv(entity.path))
return dict(zip(manifest_ids, manifests))

def get_target_manifests(
@@ -2119,7 +2120,9 @@ def filename_validation(

where_clauses = []

dataset_clause = f"parentId='{dataset_scope}'"
dataset_clause = SynapseStorage.build_clause_from_dataset_id(
dataset_id=dataset_scope
)
where_clauses.append(dataset_clause)

self._login(
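Note for reviewers: the hard-coded `parentId='...'` string is replaced by a `SynapseStorage` helper. Its implementation is not part of this diff; purely as an illustration, the simplest equivalent of the removed inline code would look like the fragment below (the real helper may add logic, e.g. scoping by nested container ids), and the `syn12345678` id is a made-up example.

```python
class SynapseStorage:  # illustrative fragment only; the real class is far larger
    @staticmethod
    def build_clause_from_dataset_id(dataset_id: str) -> str:
        """Build a WHERE clause restricting a fileview query to one dataset.

        The previous inline code produced exactly this clause; the real helper
        may behave differently for nested datasets.
        """
        return f"parentId='{dataset_id}'"


# Usage matching the diff above:
dataset_clause = SynapseStorage.build_clause_from_dataset_id(dataset_id="syn12345678")
where_clauses = [dataset_clause]
```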
3 changes: 2 additions & 1 deletion schematic/store/database/synapse_database_wrapper.py
@@ -8,6 +8,7 @@
from opentelemetry import trace

from schematic.store.synapse_tracker import SynapseEntityTracker
from schematic.utils.df_utils import read_csv


class SynapseTableNameError(Exception):
@@ -108,7 +109,7 @@ def execute_sql_query(
pandas.DataFrame: The queried table
"""
result = self.execute_sql_statement(query, include_row_data)
table = pandas.read_csv(result.filepath)
table = read_csv(result.filepath)
return table

def execute_sql_statement(
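Note for reviewers: several call sites in this PR switch from `pandas.read_csv` to `schematic.utils.df_utils.read_csv`. The wrapper's body is not shown in this diff; the usual motivation for such a wrapper is to pin CSV-reading defaults in one place, roughly along these lines (purely illustrative, the real defaults may differ).

```python
# Purely illustrative -- the body of schematic.utils.df_utils.read_csv is not
# part of this diff, and its real defaults may differ.
from typing import Any

import pandas as pd


def read_csv(path_or_buffer: Any, **load_args: Any) -> pd.DataFrame:
    """Read a CSV with project-wide defaults applied in one place."""
    defaults = {"keep_default_na": True, "encoding": "utf8", "low_memory": False}
    defaults.update(load_args)
    return pd.read_csv(path_or_buffer, **defaults)


# Call sites such as execute_sql_query above then stay uniform:
#     table = read_csv(result.filepath)
```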