Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: validators for cdd,ead, and error pipeline #177

Merged
merged 25 commits into from
Jun 28, 2024

Conversation

johanseto
Copy link
Collaborator

@johanseto johanseto commented Jun 15, 2024

Description

feat: validators pipeline step for cdd and ead
feat: pipelines for validation error handling trigger.

Testing instructions

Tested in stage.

After

  1. Use a user with missing data.
    for eg lastname without data.
    image

  2. Send a CDD request
    image

  3. Check in logs validator errors
    image

  4. Check in audit model info related validation error.
    image
    easteregg: Check sentry and find the error.

Additional information

Jira story
https://edunext.atlassian.net/browse/FUTUREX-775

Checklist for Merge

  • Tested in a remote environment
  • Updated documentation
  • Rebased master/main
  • Squashed commits

@github-actions github-actions bot added test size/m m lines label labels Jun 15, 2024
@johanseto johanseto self-assigned this Jun 15, 2024
eox_nelp/pearson_vue/validators.py Outdated Show resolved Hide resolved
eox_nelp/pearson_vue/validators.py Outdated Show resolved Hide resolved
eox_nelp/pearson_vue/validators.py Outdated Show resolved Hide resolved


@audit_method(action="PearsonVue Error validating data")
def audit_validation_error(*args, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to suggest something different here, because this just audit errors, I mean a success is not an option here, however you can decorate the method validate_cdd_request and validate_ead_request and audit both cases and remove the extra logs, I'm thinking in something like validation_error_pipeline that would be a method that executes multiple task on the error case, at this moment you just need to raise the exception but probably we will need to send an email to inform the user that some fields has to be updated or something similar

eox_nelp/pearson_vue/tests/test_validators.py Outdated Show resolved Hide resolved
try:
EadRequest(**ead_request)
except ValidationError as validation_exception:
logger.info("Validation error for ead_request: %s \n %s", ead_request, validation_exception)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we raise the exception do we need really the log ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from f033834 to 5fceafc Compare June 19, 2024 15:37
@github-actions github-actions bot added size/l and removed size/m m lines label labels Jun 19, 2024
@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from 5fceafc to 9844c63 Compare June 19, 2024 15:39
@johanseto johanseto changed the base branch from master to jlc/refactor-cdd-request-pipeline June 19, 2024 15:40
@github-actions github-actions bot added size/m m lines label django_plugin size/l and removed size/l size/m m lines label labels Jun 19, 2024
@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from b7d98ef to 692bc33 Compare June 19, 2024 19:04
@johanseto johanseto changed the title feat: validators for cdd and ead feat: validators for cdd and ead and error pipeline Jun 19, 2024
@johanseto
Copy link
Collaborator Author

@andrey-canon now there is a new task for error handling.
image

In this case error_validation_task only has one function in the pipe: audit_error_validation This creates one record in eox-audit and no more. (thrown once)
image

@johanseto johanseto changed the title feat: validators for cdd and ead and error pipeline feat: validators for cdd,ead, and error pipeline Jun 19, 2024
@johanseto johanseto changed the base branch from jlc/refactor-cdd-request-pipeline to master June 19, 2024 21:15
@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from 07d0ce0 to a01630a Compare June 19, 2024 21:15
Copy link
Collaborator

@andrey-canon andrey-canon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some code suggestions that works together, basically the suggestion is to raise a custom exception instead of sending the launch_validation_error_pipeline key and then handle that based on the exception we can achieve something similar to that with current implementation and dictionary key, however look this example:

def build_ead_request(
    profile_metadata,
    exam_metadata,
    transaction_type="Add",
    **kwargs
):  # pylint: disable=unused-argument
    """Build the ead_request dict.

    Args:
        profile_metadata (dict): Basic user data.
        exam_metadata (dict): Exam information.
        transaction_type (str): The type of transaction for the authorization (default is "Add").
        **kwargs: A dictionary containing the following key-value pairs:

    Returns:
        dict: dict with ead_request dict
    """
    try:
        ead_request = {
            "@clientAuthorizationID": exam_metadata["client_authorization_id"],
            "@clientID": getattr(settings, "PEARSON_RTI_WSDL_CLIENT_ID"),
            "@authorizationTransactionType": transaction_type,
            "clientCandidateID": f'NELC{profile_metadata["anonymous_user_id"]}',
            "examAuthorizationCount": exam_metadata["exam_authorization_count"],
            "examSeriesCode": exam_metadata["exam_series_code"],
            "eligibilityApptDateFirst": exam_metadata["eligibility_appt_date_first"],
            "eligibilityApptDateLast": exam_metadata["eligibility_appt_date_last"],
            "lastUpdate": timezone.now().strftime("%Y/%m/%d %H:%M:%S GMT"),
        }
    except KeyError:
        raise PearsonKeyError()

    # Validates 
    ead_request = CddRequest(**ead_request)  # lets image that this raises PearsonValidationError and returns the data that we required

    return {
        "ead_request": ead_request
    }

So in the example we have a pipeline that raises two different exceptions we could return a dictionary that indicates the error but we should also catch the validation error, finally if we apply that in the error handler pipeline we could have some pipes that make something based on the exception_type

@@ -54,6 +60,17 @@ def run_pipeline(self):
self.backend_data["pipeline_index"] = len(pipeline) - 1
break

if result.get("launch_validation_error_pipeline"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    def run_pipeline(self):
        """
        Executes the RTI pipeline by iterating through the pipeline functions.
        """
        pipeline = self.get_pipeline()
        pipeline_index = self.backend_data.get("pipeline_index", 0)
        for idx, func in enumerate(pipeline[pipeline_index:]):
            self.backend_data["pipeline_index"] = pipeline_index + idx
            
            try:
                result = func(**self.backend_data) or {}
            except PearsonBaseError as exc : # just handle Pearson exceptions
                result["safely_pipeline_termination"] = True
                tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
                tasks.error_validation_task.delay(exception_type=exc.exception_type, ...)

            self.backend_data.update(result)

            if result.get("safely_pipeline_termination"):
                self.backend_data["pipeline_index"] = len(pipeline) - 1
                break

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove this block

Comment on lines 429 to 435
except ValidationError as validation_exception:
return {
"launch_validation_error_pipeline": True,
"validation_exception": validation_exception.json()
}

return None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except ValidationError as validation_exception:
return {
"launch_validation_error_pipeline": True,
"validation_exception": validation_exception.json()
}
return None
except ValidationError as validation_exception:
raise PearsonValidationError()

Comment on lines 407 to 411
try:
raise_audit_validation_exception(*args, **kwargs)
except ValueError:
pass
logger.error("Validation Error args:%s-kwargs:%s", args, kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
try:
raise_audit_validation_exception(*args, **kwargs)
except ValueError:
pass
logger.error("Validation Error args:%s-kwargs:%s", args, kwargs)
if exception_type == PearsonValidationError.exception_type:
try:
raise_audit_validation_exception(*args, **kwargs)
except ValueError:
logger.error("Validation Error args:%s-kwargs:%s", args, kwargs)

Comment on lines +20 to +22
validate_cdd_request,
validate_ead_request,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are validating after sending the request

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the imports I think so

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah you are right



@shared_task(bind=True)
def error_validation_task(self, pipeline_index=0, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def error_validation_task(self, pipeline_index=0, **kwargs):
def handle_error_task(self, pipeline_index=0, **kwargs):

check_service_availability,
import_candidate_demographics,
]


class ErrorValidationDataImport(RealTimeImport):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class ErrorValidationDataImport(RealTimeImport):
class ErrorRealTimeImportHandler(RealTimeImport):

@johanseto
Copy link
Collaborator Author

@andrey-canon Take a look to the new refactor proposals.
Here is how now, audit saves 2 different errors:
2024-06-21_09-00
2024-06-21_08-59

eox_nelp/pearson_vue/exceptions.py Show resolved Hide resolved
try:
result = func(**self.backend_data) or {}
except PearsonBaseError as pearson_error:
self.backend_data["pipeline_index"] = len(pipeline) - 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a suggestion to avoid this line

Comment on lines 61 to 68
# clean kwargs to dont finish next pipeline launch.
executed__pipeline_kwargs = remove_keys_from_dict(self.backend_data, ["pipeline_index"])
executed__pipeline_kwargs["failed_step_pipeline"] = func.__name__
tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
tasks.rti_error_handler_task.delay(**executed__pipeline_kwargs, **pearson_error.__dict__)
break
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# clean kwargs to dont finish next pipeline launch.
executed__pipeline_kwargs = remove_keys_from_dict(self.backend_data, ["pipeline_index"])
executed__pipeline_kwargs["failed_step_pipeline"] = func.__name__
tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
tasks.rti_error_handler_task.delay(**executed__pipeline_kwargs, **pearson_error.__dict__)
break
tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
tasks.rti_error_handler_task.delay(
failed_step_pipeline=func.__name__,
exception_data=pearson_error.__dict__,
)

The suggestion is to avoid to pass arguments that we don't need, the current implementation sends everything but we don't know the content of that and that content is variable, that depends on which pipe failed,so this suggestion is to pass explicit arguments then we can implement pipe based on arguments that we know

Copy link
Collaborator Author

@johanseto johanseto Jun 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to have the same kwarg that the original pipeline had. Eg to send an email you would need the profile_metadata to know the user...
Then an error pipeline like social core way, you could use it like kwargs.get("profile_metadata")...

The only problem, is for example that the audit_pipe error saves everything but that I could change it.

@@ -54,6 +60,17 @@ def run_pipeline(self):
self.backend_data["pipeline_index"] = len(pipeline) - 1
break

if result.get("launch_validation_error_pipeline"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove this block


return {
"ead_request": ead_request
}


def audit_pearson_error(*args, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have conflict with this pipe, I think it's too general and this shouldn't raise a ValueError I think this should raise the same exception that started the rti_error_handler_task you are also passing the whole kwargs and args that could have sensitive data


return {
"ead_request": ead_request
}


def audit_pearson_error(*args, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have conflict with this pipe, I think it's too general and this shouldn't raise a ValueError I think this should raise the same exception that started the rti_error_handler_task you are also passing the whole kwargs and args that could have sensitive data

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the exception raised. No way, I think that the data audited is similar to the rti import data.
So in terms in sensitive in my opinion is not much difference from this
Also you could compare the audit_pearson_error eg

@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from c2415cf to 54ded90 Compare June 24, 2024 19:18
This is for CddRequest and EadRequest

feat: add native address fields
@johanseto johanseto force-pushed the jlc/add-pydantic-dataclasses branch from 54ded90 to 7a20113 Compare June 24, 2024 19:41
@johanseto
Copy link
Collaborator Author

@andrey-canon I made this 7a20113 to remove sensitive kwargs if needed in the audit step.

Comment on lines 61 to 68
executed_pipeline_kwargs = remove_keys_from_dict(self.backend_data, ["pipeline_index"])
tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
tasks.rti_error_handler_task.delay(
failed_step_pipeline=func.__name__,
exception_data=pearson_error.__dict__,
**executed_pipeline_kwargs,
)
break
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was your previous answer, but the code changed so the thread is not the same

I like to have the same kwarg that the original pipeline had. Eg to send an email you would need the profile_metadata to know the user...
Then an error pipeline like social core way, you could use it like kwargs.get("profile_metadata")...

The only problem, is for example that the audit_pipe error saves everything but that I could change it.

and my suggestion is the same hahahaha, you like to have the same kwargs, on the other hand, I don't like to have that because it's not easy to track which arguments you are sharing and what methods needs, so if you need the profile_metadata make that explicit

tasks.rti_error_handler_task.delay(
                    failed_step_pipeline=func.__name__,
                    exception_data=pearson_error.__dict__,
                    profile_metadata=self.backend_data.get(profile_metadata),
                )

So if a create a new pipe is clear the information that is available, if we share everything I have to analyze every pipe to know which possible arguments are available, another option is to create a context,

context = {
    "profile_metadata": self.backend_data.get(profile_metadata),
    "exam_data": self.backend_data.get("exam_metadata")
}

tasks.rti_error_handler_task.delay(
                    failed_step_pipeline=func.__name__,
                    exception_data=pearson_error.__dict__,
                    context=context,
                )

in this case as developer I just have to check the context to know the available information

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would let you know the intention of who needs the data??. And the dynamic kwargs behaviour.

Let's Imagine a case: I want to use profile_metadata in the error pipeline.

As the error_task is triggered when any step fails I don't know the kwargs executed(pipeline_step) in the initial RTI pipeline.
For eg

    def get_pipeline(self):
        """
        Returns the RTI pipeline, which is a list of functions to be executed.
        """
        return [
            handle_course_completion_status,
            get_user_data,
            get_exam_data,
            build_cdd_request,
            validate_cdd_request,
            build_ead_request,
            validate_ead_request,
            check_service_availability,
            import_candidate_demographics,
            import_exam_authorization,
        ]

Here we have pipe with different inputs and outputs.
So I would add all kwargs definition like:

tasks.rti_error_handler_task.delay(
                    failed_step_pipeline=func.__name__,
                    exception_data=pearson_error.__dict__,
                   course_id=self.backend_data.get("course_id"),
                    user_id=self.backend_data.get("user_id")
                    profile_metadata=self.backend_data.get("profile_metadata"),
                   exam_metadata=self.backend_data.get("exam_metadata"),
                 ead_request=self.backend_data.get("ead_request"),
                 cdd_request=self.backend_data.get("cdd_request"),
                transaction_type=self.backend_data.get("transaction_type"),
                ...
                )

And also here you have to add any other possible return of the pipeline steps(new steps)...

Or only create a pipeline with

     if not  kwargs.get("profile_metadata"):
         return

Now, using **kwargs, when launching the error pipeline you only specify that this pipeline is like the continuation of the first pipeline so it would have the same kwargs.(executed-dynamic~)

Comment on lines 459 to 460
audit_action = "Pearson Vue Exception"
audit_action = f"{audit_action}~{exception_data['exception_type']}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
audit_action = "Pearson Vue Exception"
audit_action = f"{audit_action}~{exception_data['exception_type']}"
audit_action = f"Pearson Vue Exception~{exception_data['exception_type']}"


from eox_nelp.api_clients.pearson_rti import PearsonRTIApiClient
from eox_nelp.edxapp_wrapper.student import anonymous_id_for_user
from eox_nelp.pearson_vue import exceptions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from eox_nelp.pearson_vue import exceptions
from eox_nelp.pearson_vue import exceptions as pearson_vue_exceptions


@audit_method(action=audit_action)
def raise_audit_pearson_exception(exception_data, **audit_kwargs):
raise pearson_exception(exception_data, audit_kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you are passing two positional arguments so if the pearson_exception is PearsonAttributeError that equivalent to pearson_exception(request_type=exception_data, attribute_error=audit_kwargs), does that make sense ?
id f the exception_data contains all the original data that should be pearson_exception(**exception_data),

eox_nelp/pearson_vue/exceptions.py Show resolved Hide resolved
except PearsonBaseError as pearson_error:
# clean kwargs to dont finish next pipeline launch.
executed_pipeline_kwargs = remove_keys_from_dict(self.backend_data, ["pipeline_index"])
tasks = importlib.import_module("eox_nelp.pearson_vue.tasks")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just asking why not from eox_nelp.pearson_vue.tasks import rti_error_handler_task

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the top of the file is a circular python import error.
And I could add there from eox_nelp.pearson_vue.tasks import rti_error_handler_task but I want to avoid
import-outside-toplevel

refactor: init exception also from exception_dict

chore: pylint changes
"""
Module to add data_classes related Pearson Vue Integration
"""
# pylint: disable=missing-class-docstring
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-.-

Comment on lines 22 to 33
"""Init pearson exception.Is mandatory the exception_reasons.
You could init using pipe_frame
Or init using exception_dict representation with **kwargs.
That representation should have the following shape:
exception_dict = {
'exception_type': 'validation-error',
'pipe_args_dict': {
"cdd_request": {}
},
'pipe_function': 'validate_cdd_request',
'exception_reason': "error: ['String to short.']"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update this

""" Test ead_request is built with profile_metadata and exam_metadata.
Expected behavior:
- The result is the expected value.
def setUp(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a blank line before this line

""" Test cdd_request is built with profile_metadata.
Expected behavior:
- The result is the expected value.
def setUp(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a blank line before this line

""" Test ead_request is built with profile_metadata and exam_metadata.
Expected behavior:
- The result is the expected value.
def setUp(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a blank line before this line

@johanseto johanseto merged commit f95f96b into master Jun 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants