Merge pull request #68 from dynotx/make-validators-decorators
refactor validators to be decorator
ndamania00 authored Feb 5, 2025
2 parents cd8c049 + 29135a7 commit 003b055
Showing 15 changed files with 295 additions and 225 deletions.
30 changes: 13 additions & 17 deletions README.md
@@ -52,30 +52,26 @@ With your schemas defined in code, you can now take advantage of the additional
1. Entity validation: Easily create custom validation rules for your Benchling entities.

```python
from liminal.validation import BenchlingValidator, BenchlingValidatorReport, BenchlingReportLevel
from liminal.orm.base_model import BaseModel

class CookTempValidator(BenchlingValidator):
    """Validates that a field value is a valid enum value for a Benchling entity"""

    def validate(self, entity: type[BaseModel]) -> BenchlingValidatorReport:
        valid = True
        message = None
        if entity.cook_time is not None and entity.cook_temp is None:
            valid = False
            message = "Cook temp is required if cook time is set"
        if entity.cook_time is None and entity.cook_temp is not None:
            valid = False
            message = "Cook time is required if cook temp is set"
        return self.create_report(valid, BenchlingReportLevel.MED, entity, message)
from liminal.validation import ValidationSeverity, liminal_validator

class Pizza(BaseModel, CustomEntityMixin):
    ...

    @liminal_validator(ValidationSeverity.MED)
    def cook_time_and_temp_validator(self) -> None:
        if self.cook_time is not None and self.cook_temp is None:
            raise ValueError("Cook temp is required if cook time is set")
        if self.cook_time is None and self.cook_temp is not None:
            raise ValueError("Cook time is required if cook temp is set")

validation_reports = Pizza.validate(session)
```

2. Strongly typed queries: Write type-safe queries using SQLAlchemy to access your Benchling entities.

```python
with BenchlingSession(benchling_connection, with_db=True) as session:
    pizza = session.query(Pizza).filter(Pizza.name == "Margherita").first()
    print(pizza)
```

3. CI/CD integration: Use Liminal to automatically generate and apply your revision files to your Benchling tenant(s) as part of your CI/CD pipeline.
38 changes: 0 additions & 38 deletions docs/reference/entity-schemas.md
@@ -193,44 +193,6 @@ multi_relationship(target_class_name: str, current_class_name: str, entity_link_

The `query()` method must be implemented for the entity schema class to define a custom query. This is useful if you want to add additional filtering or joins to the query.

## Validators: [class](https://github.com/dynotx/liminal-orm/blob/main/liminal/validation/__init__.py)

As seen in the example above, the `get_validators` method is used to define a list of validators for the entity schema. These validators run on entities of the schema that are queried from Benchling's Postgres database. For example:

```python
pizza_entity = Pizza.query(session).first()

# Validate a single entity from a query
report = CookTempValidator().validate(pizza_entity)

# Validate all entities for a schema
reports = Pizza.validate(session)
```

The list of validators within `get_validators` are used to run on all entities of the schema.

The `BenchlingValidator` object is used to define the validator classes, that can be defined with custom logic to validate entities of a schema. Refer to the [Validators](./validators.md) page to learn more about how to define validators.

## Additional Functionality

Below is additional functionality that is provided by the Liminal BaseModel class.

```python
connection = BenchlingConnection(...)
benchling_service = BenchlingService(connection, use_db=True)

with benchling_service as session:

    # Get all entities for a schema and return a dataframe
    df = Pizza.df(session)

    # Validate all entities for a schema and return a list of ValidatorReports
    reports = Pizza.validate(session)

    # Validate all entities for a schema and return a dataframe
    validated_df = Pizza.validate_to_df(session)
```

## Notes

- Note that the Entity Schema definition in Liminal does not cover 100% of the properties that can be set through the Benchling website. However, the goal is to have 100% parity! If you find any missing properties that are not covered in the definition or migration service, please open an issue on [Github](https://github.com/dynotx/liminal-orm/issues). In the meantime, you can manually set the properties through the Benchling website.
141 changes: 141 additions & 0 deletions docs/reference/validation.md
@@ -0,0 +1,141 @@
When using Benchling to store essential data, it is important to validate that data to ensure it is accurate and consistent. Liminal lets you check that entities follow key business logic by defining validators. Below is an example of a Liminal validator that checks the cook time and cook temp of a pizza. Validators must be defined within the entity schema class, and they run on data queried from the warehouse.

## Defining a Liminal Validator [decorator](https://github.com/dynotx/liminal-orm/blob/main/liminal/validation/__init__.py#L61)

Any function decorated with `liminal_validator` is detected as a validator for the entity schema.
Each validator returns one `BenchlingValidatorReport` object per entity it is run on, with either `valid=True` or `valid=False`.

```python
from liminal.validation import ValidationSeverity, liminal_validator

class Pizza(BaseModel, CustomEntityMixin):
    ...

    @liminal_validator(ValidationSeverity.MED)
    def cook_time_and_temp_validator(self) -> None:
        if self.cook_time is not None and self.cook_temp is None:
            raise ValueError("Cook temp is required if cook time is set")
        if self.cook_time is None and self.cook_temp is not None:
            raise ValueError("Cook time is required if cook temp is set")
```
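
The validator above signals a problem by raising `ValueError`, while the text says each validator yields a report with `valid=True` or `valid=False`. The following is a minimal, self-contained sketch of how a decorator could bridge the two. It is not Liminal's implementation: `Severity`, `Report`, and `sketch_validator` are hypothetical stand-ins, and the assumption that a raised exception maps to `valid=False` is inferred from the example rather than confirmed by the source.

```python
from dataclasses import dataclass
from enum import Enum
from functools import wraps
from typing import Optional


class Severity(Enum):
    """Hypothetical stand-in for liminal's ValidationSeverity."""
    LOW = "LOW"
    MED = "MED"


@dataclass
class Report:
    """Hypothetical, trimmed-down stand-in for BenchlingValidatorReport."""
    valid: bool
    model: str
    level: Severity
    validator_name: str
    message: Optional[str] = None


def sketch_validator(level: Severity = Severity.LOW):
    """Wrap a method so that a raised exception becomes a failing report."""
    def decorator(func):
        @wraps(func)
        def wrapper(self) -> Report:
            model = type(self).__name__
            try:
                func(self)
            except Exception as exc:
                # Assumption: any exception marks the entity as invalid.
                return Report(False, model, level, func.__name__, str(exc))
            return Report(True, model, level, func.__name__)
        return wrapper
    return decorator


class Pizza:
    """Plain class standing in for an entity schema, for illustration only."""

    def __init__(self, cook_time, cook_temp):
        self.cook_time = cook_time
        self.cook_temp = cook_temp

    @sketch_validator(Severity.MED)
    def cook_time_and_temp_validator(self) -> None:
        if self.cook_time is not None and self.cook_temp is None:
            raise ValueError("Cook temp is required if cook time is set")
        if self.cook_time is None and self.cook_temp is not None:
            raise ValueError("Cook time is required if cook temp is set")


ok_report = Pizza(cook_time=10, cook_temp=450).cook_time_and_temp_validator()
bad_report = Pizza(cook_time=10, cook_temp=None).cook_time_and_temp_validator()
print(ok_report.valid, bad_report.valid, bad_report.message)
```

Raising inside the validator keeps the business rule readable, and the wrapper is the only place that needs to know how a report is built.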

### Parameters

**validator_level: ValidationSeverity**

> The severity of the validator. Defaults to `ValidationSeverity.LOW`.

**validator_name: str | None**

> The name of the validator. Defaults to the pascalized name of the function.

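
To illustrate what "pascalized" means for the default `validator_name`, here is a small sketch of a snake_case to PascalCase conversion. The helper `pascalize` is hypothetical and written for this example only; Liminal's own conversion may handle edge cases differently.

```python
def pascalize(name: str) -> str:
    """Convert a snake_case identifier to PascalCase."""
    # Upper-case the first letter of each underscore-separated part,
    # leaving the remaining characters untouched.
    return "".join(part[:1].upper() + part[1:] for part in name.split("_") if part)


default_name = pascalize("cook_time_and_temp_validator")
print(default_name)
```

Under this sketch, the function `cook_time_and_temp_validator` gets the name `CookTimeAndTempValidator`, which matches the example given for the report's `validator_name` field.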
## BenchlingValidatorReport: [class](https://github.com/dynotx/liminal-orm/blob/main/liminal/validation/__init__.py#L13)

### Parameters

**valid : bool**

> Indicates whether the validation passed or failed.

**model : str**

> The name of the model being validated. (eg: Pizza)

**level : ValidationSeverity**

> The severity level of the validation report.

**validator_name : str | None**

> The name of the validator that generated this report. (eg: CookTimeAndTempValidator)

**entity_id : str | None**

> The entity ID of the entity being validated.

**registry_id : str | None**

> The registry ID of the entity being validated.

**entity_name : str | None**

> The name of the entity being validated.

**message : str | None**

> A message describing the result of the validation.

**creator_name : str | None**

> The name of the creator of the entity being validated.

**creator_email : str | None**

> The email of the creator of the entity being validated.

**updated_date : datetime | None**

> The date the entity was last updated.

## Running Validation

To run validation using Liminal, you can call the `validate()` method on the entity schema.

```python
with BenchlingSession(benchling_connection, with_db=True) as session:
    reports = Pizza.validate(session)
```

### Parameters

**session : Session**

> The Benchling database session.

**base_filters: BaseValidatorFilters | None**

> Filters to apply to the query.

**only_invalid: bool**

> If True, only returns reports for entities that failed validation.

### Returns

**list[BenchlingValidatorReport]**

> List of reports from running all validators on all entities returned from the query.

!!! note
    The `validate_to_df` method returns a pandas dataframe with all the reports.
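
Once a list of reports is available, a common next step is to isolate the failures and count them per validator. The sketch below does this with the standard library only. `ReportRow` is a hypothetical stand-in that carries a few of the `BenchlingValidatorReport` fields listed above, and the sample entity IDs are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReportRow:
    """Hypothetical stand-in carrying a few BenchlingValidatorReport fields."""
    valid: bool
    model: str
    validator_name: Optional[str]
    entity_id: Optional[str]
    message: Optional[str] = None


def failures_by_validator(reports: List[ReportRow]) -> Counter:
    """Count failing reports, grouped by the validator that produced them."""
    return Counter(r.validator_name for r in reports if not r.valid)


# Made-up reports, standing in for the list returned by validate().
reports = [
    ReportRow(True, "Pizza", "CookTimeAndTempValidator", "ent_1"),
    ReportRow(False, "Pizza", "CookTimeAndTempValidator", "ent_2",
              "Cook temp is required if cook time is set"),
    ReportRow(False, "Pizza", "CookTimeAndTempValidator", "ent_3",
              "Cook time is required if cook temp is set"),
]

# Similar in spirit to asking for only the failing reports.
invalid_only = [r for r in reports if not r.valid]
summary = failures_by_validator(reports)
print(len(invalid_only), dict(summary))
```

Filtering in Python like this is only for illustration; passing `only_invalid=True` to `validate()` avoids materializing passing reports in the first place.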

## BaseValidatorFilters: [class](https://github.com/dynotx/liminal-orm/blob/main/liminal/base/base_validation_filters.py)

This class is used to pass base filters to Benchling warehouse database queries.
The filtered columns are found on all tables in the Benchling warehouse database.

### Parameters

**created_date_start: date | None**

> Start date for created date filter.

**created_date_end: date | None**

> End date for created date filter.

**updated_date_start: date | None**

> Start date for updated date filter.

**updated_date_end: date | None**

> End date for updated date filter.

**entity_ids: list[str] | None**

> List of entity IDs to filter by.

**creator_full_names: list[str] | None**

> List of creator full names to filter by.
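
The sketch below shows one way filters of this shape could narrow a set of rows. It is an illustration, not Liminal's query logic: `FiltersSketch` and `matches` are hypothetical, only a subset of the fields is modeled, and the choices that `None` means "no filter" and that date bounds are inclusive are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class FiltersSketch:
    """Hypothetical stand-in mirroring some BaseValidatorFilters fields."""
    created_date_start: Optional[date] = None
    created_date_end: Optional[date] = None
    entity_ids: Optional[List[str]] = None
    creator_full_names: Optional[List[str]] = None


def matches(row: dict, f: FiltersSketch) -> bool:
    """Return True if the row passes every filter that is set."""
    # Assumption: a filter left as None is not applied at all.
    if f.created_date_start is not None and row["created_date"] < f.created_date_start:
        return False
    # Assumption: the end date is inclusive.
    if f.created_date_end is not None and row["created_date"] > f.created_date_end:
        return False
    if f.entity_ids is not None and row["entity_id"] not in f.entity_ids:
        return False
    if f.creator_full_names is not None and row["creator"] not in f.creator_full_names:
        return False
    return True


# Made-up rows standing in for warehouse records.
rows = [
    {"entity_id": "ent_1", "creator": "Creator A", "created_date": date(2025, 1, 10)},
    {"entity_id": "ent_2", "creator": "Creator B", "created_date": date(2025, 2, 1)},
    {"entity_id": "ent_3", "creator": "Creator A", "created_date": date(2025, 2, 3)},
]

filters = FiltersSketch(
    created_date_start=date(2025, 2, 1),
    creator_full_names=["Creator A"],
)
selected = [r["entity_id"] for r in rows if matches(r, filters)]
print(selected)
```

In Liminal itself, a `BaseValidatorFilters` instance is passed as the `base_filters` argument of `validate()`, so the filtering happens in the warehouse query rather than in Python.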
24 changes: 0 additions & 24 deletions docs/reference/validators.md

This file was deleted.

15 changes: 15 additions & 0 deletions liminal/base/base_validation_filters.py
@@ -7,6 +7,21 @@ class BaseValidatorFilters(BaseModel):
"""
This class is used to pass base filters to benchling warehouse database queries.
These columns are found on all tables in the benchling warehouse database.
Parameters
----------
created_date_start: date | None
Start date for created date filter.
created_date_end: date | None
End date for created date filter.
updated_date_start: date | None
Start date for updated date filter.
updated_date_end: date | None
End date for updated date filter.
entity_ids: list[str] | None
List of entity IDs to filter by.
creator_full_names: list[str] | None
List of creator full names to filter by.
"""

created_date_start: date | None = None
2 changes: 1 addition & 1 deletion liminal/base/properties/base_field_properties.py
@@ -65,7 +65,7 @@ def set_warehouse_name(self, wh_name: str) -> BaseFieldProperties:
self.warehouse_name = wh_name
return self

def validate_column(self, wh_name: str) -> bool:
def validate_column_definition(self, wh_name: str) -> bool:
"""If the Field Properties are meant to represent a column in Benchling,
this will validate the properties and ensure that the entity_link and dropdowns are valid names that exist in our code.
"""
2 changes: 1 addition & 1 deletion liminal/entity_schemas/compare.py
@@ -63,7 +63,7 @@ def compare_entity_schemas(
exclude_base_columns=True
)
# Validate the entity_link and dropdown_link reference an entity_schema or dropdown that exists in code.
model.validate_model()
model.validate_model_definition()
# if the model table_name is found in the benchling schemas, check for changes...
if (model_wh_name := model.__schema_properties__.warehouse_name) in [
s.warehouse_name for s, _, _ in benchling_schemas
10 changes: 2 additions & 8 deletions liminal/entity_schemas/generate_files.py
@@ -79,7 +79,6 @@ def generate_all_entity_schema_files(
"from liminal.orm.base_model import BaseModel",
"from liminal.orm.schema_properties import SchemaProperties",
"from liminal.enums import BenchlingEntityType, BenchlingFieldType, BenchlingNamingStrategy",
"from liminal.validation import BenchlingValidator",
f"from liminal.orm.mixins import {get_entity_mixin(schema_properties.entity_type)}",
]
init_strings = [f"{tab}def __init__(", f"{tab}self,"]
@@ -151,11 +150,7 @@ def generate_all_entity_schema_files(
relationship_string = "\n".join(relationship_strings)
import_string = "\n".join(list(set(import_strings)))
init_string = f"\n{tab}".join(init_strings) if len(columns) > 0 else ""
functions_string = """
def get_validators(self) -> list[BenchlingValidator]:
return []"""

content = f"""{import_string}
full_content = f"""{import_string}
class {classname}(BaseModel, {get_entity_mixin(schema_properties.entity_type)}):
@@ -168,7 +163,6 @@ class {classname}(BaseModel, {get_entity_mixin(schema_properties.entity_type)}):
{init_string}
{functions_string}
"""
write_directory_path = write_path / get_file_subdirectory(
schema_properties.entity_type
@@ -181,7 +175,7 @@ class {classname}(BaseModel, {get_entity_mixin(schema_properties.entity_type)}):
)
write_directory_path.mkdir(exist_ok=True)
with open(write_directory_path / filename, "w") as file:
file.write(content)
file.write(full_content)

for subdir, names in subdirectory_map.items():
init_content = (
1 change: 0 additions & 1 deletion liminal/enums/__init__.py
@@ -4,5 +4,4 @@
from liminal.enums.benchling_field_type import BenchlingFieldType
from liminal.enums.benchling_folder_item_type import BenchlingFolderItemType
from liminal.enums.benchling_naming_strategy import BenchlingNamingStrategy
from liminal.enums.benchling_report_level import BenchlingReportLevel
from liminal.enums.benchling_sequence_type import BenchlingSequenceType
1 change: 0 additions & 1 deletion liminal/external/__init__.py
@@ -40,6 +40,5 @@
BenchlingFieldType,
BenchlingFolderItemType,
BenchlingNamingStrategy,
BenchlingReportLevel,
BenchlingSequenceType,
)