Describe the bug
When raised_exception is True, exception_info has an incorrect structure: the expected keys are nested under an additional key.
Instead of {'raised_exception': True, 'exception_traceback': 'The traceback', 'exception_message': 'some message'}, it has the following structure:
{"additional_key": {'raised_exception': True, 'exception_traceback': 'The traceback', 'exception_message': 'some message'}}
To Reproduce
# assumes a Databricks `spark` session and an existing GX `context`
import great_expectations as gx

# df to validate
df = spark.sql("""
    SELECT id, CASE WHEN id % 4 = 0 THEN "NOT NULL" END AS colname
    FROM range(1, 100)""")

# update expectation suite
suite_name = "e_simple_unit_test"
suite = context.suites.add_or_update(gx.ExpectationSuite(name=suite_name))
correct_column_name = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="colname", mostly=1, row_condition="id%2 = 0", condition_parser="spark")
# this column does not exist in df, so the expectation raises an exception
incorrect_column_name = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="___colname___", mostly=1, row_condition="id%2 = 0", condition_parser="spark")
suite.add_expectation(correct_column_name)
suite.add_expectation(incorrect_column_name)
suite.save()
# update validation
data_source_name = data_source_configs["data_source_name"]
data_asset_name = data_source_configs["data_asset_name"]
batch_definition_name = data_source_configs["batch_definition_name"]
batch_definition = (
    context.data_sources.get(data_source_name)
    .get_asset(data_asset_name)
    .get_batch_definition(batch_definition_name)
)
validation_definition_name = "unit_test_validation_definition"
validation_definition = gx.ValidationDefinition(
    data=batch_definition, suite=suite, name=validation_definition_name
)
unit_test_validation_definition = context.validation_definitions.add_or_update(validation_definition)

# run the ValidationDefinition
validation_results = unit_test_validation_definition.run(
    batch_parameters={"dataframe": df},
    result_format="COMPLETE")
results_dict = validation_results.to_json_dict()
# label each result by whether exception_info has the expected flat keys
for dct in results_dict["results"]:
    if "exception_message" in dct["exception_info"]:
        print("\nCorrect exception_info structure:")
    else:
        print("\nIncorrect exception_info structure:")
    print(dct["exception_info"])
Output:
Incorrect exception_info structure:
{"('column_values.nonnull.condition', '242ce27d28b7ac28fe08ad7be0377b1a', ())": {'exception_traceback': 'Traceback.......', 'exception_message': 'Error: The column "___colname___" in BatchData does not exist.', 'raised_exception': True}}
Correct exception_info structure:
{'raised_exception': False, 'exception_traceback': None, 'exception_message': None}
Expected behavior
Correct exception_info structure:
{'raised_exception': True, 'exception_traceback': 'Traceback.......', 'exception_message': 'Error: The column "___colname___" in BatchData does not exist.'}
Correct exception_info structure:
{'raised_exception': False, 'exception_traceback': None, 'exception_message': None}
Environment (please complete the following information):
Great Expectations Version: [e.g. 1.3.1]
Data Source: Spark
Cloud environment: Databricks
I don't use data docs. I parse and save GE results into my log table instead.
# run the ValidationDefinition
validation_results = my_validation_definition.run(
    batch_parameters={"dataframe": data_frame_to_check},
    result_format="COMPLETE")
# convert the results to a dictionary
validation_results_dict = validation_results.to_json_dict()
# create a DF to save later in a log table
df = spark.createDataFrame(validation_results_dict, schema=ge_results_schema)
I used a similar approach with v0.18 and was able to see which checks had failed because of errors, and what exactly those errors were (column not found or something else).
Now, because of this additional key, which is different every time, I can extract neither the error values nor the raised_exception boolean.
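Until the structure is fixed, one possible workaround is to flatten exception_info before writing it to the log table. The sketch below is not part of Great Expectations: `normalize_exception_info` and `FLAT_KEYS` are hypothetical names, and it assumes the nested form is always a dict whose values are flat exception dicts, as in the output shown above. If several nested entries are present, their messages and tracebacks are joined.

```python
from typing import Any, Dict

# The three keys the flat (v0.18-style) exception_info is expected to carry.
FLAT_KEYS = ("raised_exception", "exception_traceback", "exception_message")


def normalize_exception_info(exception_info: Dict[str, Any]) -> Dict[str, Any]:
    """Return exception_info in the flat three-key shape (hypothetical helper)."""
    # Already flat: keep only the expected keys.
    if "raised_exception" in exception_info:
        return {key: exception_info.get(key) for key in FLAT_KEYS}

    # Assumption: otherwise every value is itself a flat exception dict,
    # stored under an opaque key that changes between runs.
    nested = [v for v in exception_info.values() if isinstance(v, dict)]
    messages = [v["exception_message"] for v in nested if v.get("exception_message")]
    tracebacks = [v["exception_traceback"] for v in nested if v.get("exception_traceback")]
    return {
        "raised_exception": any(v.get("raised_exception", False) for v in nested),
        "exception_traceback": "\n".join(tracebacks) or None,
        "exception_message": "; ".join(messages) or None,
    }
```

It could be applied to each entry of results_dict["results"], for example `dct["exception_info"] = normalize_exception_info(dct["exception_info"])`, before building the DataFrame, so that the log table schema sees the same three keys in both cases.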