@AkhilaNenavath123

This PR improves the user experience by making database error messages in the Airflow REST API more user-friendly and less technical. Specifically, it addresses unique constraint violations and general database errors by providing clear, actionable error messages without exposing technical SQL details to the user.

Problem:
When attempting to trigger a second DAG run with the same logical date as the first one, the UI displayed a generic unique_constraint error, which included SQL query details. This error was difficult for users to interpret, as it exposed database internals rather than providing a clear action for resolution.
Screenshot from 2025-04-28 16-37-18

Solution:
This PR enhances the _UniqueConstraintErrorHandler and adds a new _DatabaseErrorHandler to handle database errors in a more user-friendly manner. The changes ensure that:

  • Users are shown clear, concise, and actionable error messages.
  • SQL technical details (e.g., SQL queries) are removed from user-facing messages.
  • Specific guidance is provided to help resolve issues like duplicate DAG run IDs or task IDs.

Key Changes:

Improved Error Messages:

  • Enhanced _UniqueConstraintErrorHandler to provide specific, user-friendly messages for unique constraint violations (e.g., DAG run ID, task ID, logical date).
  • Removed technical SQL details and provided clear guidance on how to resolve the issues.

New Database Error Handler:

  • Added _DatabaseErrorHandler class to handle general database errors (e.g., OperationalError, ProgrammingError) with user-friendly messages like "Please try again later."

Refactor Exception Handling:

  • Refactored exception handling in API routes to use the new user-friendly error messages and provide actionable guidance for database-related issues.

Screenshot from 2025-04-28 12-44-48

Issue Link: #49034

@boring-cyborg

boring-cyborg bot commented Apr 29, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

Contributor

@bugraoz93 bugraoz93 left a comment

Thanks for the PR! Some tests are failing because of the change, pre-commit needs to be run, and some comments need to be addressed :)

FAILED airflow-core/tests/unit/api_fastapi/common/test_exceptions.py::TestUniqueConstraintErrorHandler::test_handle_multiple_columns_unique_constraint_error[Test for postgres only - DagRun] - assert equals failed

  left:
    'This operation would create a duplicate entry. Please ensure all unique fields have unique values.\nType: unique_constraint_violation'

  right:
    {
      'orig_error': 'duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"\nDETAIL:  Key (dag_id, run_id)=(test_dag_id, test_run_id) already exists.\n',
      'reason': 'Unique constraint violation',
      'statement': 'INSERT INTO dag_run (dag_id, queued_at, logical_date, start_date, end_date, state, run_id, creating_job_id, run_type, triggered_by, conf, data_interval_start, data_interval_end, run_after, last_scheduling_decision, log_template_id, updated_at, clear_number, backfill_id, bundle_version, scheduled_by_job_id, context_carrier, created_dag_version_id) VALUES (%(dag_id)s, %(queued_at)s, %(logical_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creating_job_id)s, %(run_type)s, %(triggered_by)s, %(conf)s, %(data_interval_start)s, %(data_interval_end)s, %(run_after)s, %(last_scheduling_decision)s, (SELECT max(log_template.id) AS max_1 \nFROM log_template), %(updated_at)s, %(clear_number)s, %(backfill_id)s, %(bundle_version)s, %(scheduled_by_job_id)s, %(context_carrier)s, %(created_dag_version_id)s) RETURNING dag_run.id',
    }

error_message = self._get_user_friendly_message(exc)
raise HTTPException(
    status_code=status.HTTP_409_CONFLICT,
    detail={
Contributor

I don't think we should replace this with a string. Some logic in the API, as well as the API tests, expects this to be a dictionary, not a string. Is there a reason for this change?

error_message = self._get_user_friendly_message(exc)
raise HTTPException(
    status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
    detail=f"{error_message}\nType: database_error"
Contributor

Same here. I think tailoring the text message to each case and making it user-friendly is a good idea, but the other two fields are also useful for debugging:

    "statement": str(exc.statement),
    "orig_error": str(exc.orig),
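
A minimal sketch of that combination, assuming the handler keeps returning a dictionary. The `reason`, `statement` and `orig_error` keys come from the right-hand value in the failing test above; the `message` key and the `build_conflict_detail` helper are hypothetical names used only for illustration. Plain strings stand in for `exc.statement` and `exc.orig` so the snippet runs without a database.

```python
from typing import Dict


def build_conflict_detail(statement: str, orig_error: str, message: str) -> Dict[str, str]:
    """Build a dictionary detail that adds a friendly message without dropping debug fields."""
    return {
        "reason": "Unique constraint violation",
        # Hypothetical extra key carrying the user-friendly text.
        "message": message,
        # Kept unchanged so callers and tests can still inspect the cause.
        "statement": statement,
        "orig_error": orig_error,
    }


detail = build_conflict_detail(
    statement="INSERT INTO dag_run (dag_id, run_id) VALUES (%(dag_id)s, %(run_id)s)",
    orig_error='duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"',
    message="A DAG run with this ID already exists. Please use a different run ID.",
)
print(detail["message"])
```

With this shape the UI could display `message` while existing consumers that read `reason`, `statement` or `orig_error` keep working.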

Member

@pierrejeambrun pierrejeambrun left a comment

Thanks for the PR.

I left a few comments.

We need more tests, and the existing ones need to be adjusted.

Comment on lines +79 to +96
def _get_user_friendly_message(self, exc: IntegrityError) -> str:
    """Convert database error to user-friendly message."""
    exc_orig_str = str(exc.orig)

    # Handle DAG run unique constraint
    if "dag_run.dag_id" in exc_orig_str and "dag_run.run_id" in exc_orig_str:
        return "A DAG run with this ID already exists. Please use a different run ID."

    # Handle task instance unique constraint
    if "task_instance.dag_id" in exc_orig_str and "task_instance.task_id" in exc_orig_str:
        return "A task instance with this ID already exists. Please use a different task ID."

    # Handle DAG run logical date unique constraint
    if "dag_run.dag_id" in exc_orig_str and "dag_run.logical_date" in exc_orig_str:
        return "A DAG run with this logical date already exists. Please use a different logical date."

    # Generic unique constraint message
    return "This operation would create a duplicate entry. Please ensure all unique fields have unique values."
Member

This seems to handle only duplicate-entry errors (unique constraint), while many other kinds of IntegrityError are possible.
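
One way to make that distinction explicit, sketched under assumptions: `classify_integrity_error` is a hypothetical helper that takes the text of `exc.orig` and returns a coarse category, so that only real duplicates get a "already exists" message. The marker strings are the usual PostgreSQL, SQLite and MySQL wordings and are not exhaustive. Note also that the quoted code looks for `dag_run.dag_id`, which does not appear in the PostgreSQL error text from the failing test (it reports `Key (dag_id, run_id)`), which would be consistent with the generic message showing up there.

```python
def classify_integrity_error(orig_message: str) -> str:
    """Return a coarse category for the text of a database integrity error."""
    text = orig_message.lower()
    unique_markers = (
        "violates unique constraint",  # PostgreSQL wording
        "unique constraint failed",    # SQLite wording
        "duplicate entry",             # MySQL wording
    )
    if any(marker in text for marker in unique_markers):
        return "unique_constraint"
    if "foreign key constraint" in text:
        return "foreign_key"
    if "not-null constraint" in text or "not null constraint failed" in text:
        return "not_null"
    # Anything unrecognised stays generic instead of being reported as a duplicate.
    return "other"


print(classify_integrity_error('duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"'))
print(classify_integrity_error("NOT NULL constraint failed: dag_run.run_id"))
```

Matching on message text is fragile across database backends and versions, so treat this as an illustration of the branching rather than a recommended detection method.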

Comment on lines +112 to +119
def _get_user_friendly_message(self, exc: SQLAlchemyError) -> str:
    """Convert database error to user-friendly message."""
    if isinstance(exc, OperationalError):
        return "A database operation failed. Please try again later."
    elif isinstance(exc, ProgrammingError):
        return "An error occurred while processing your request. Please check your input and try again."
    else:
        return "An unexpected database error occurred. Please try again later."
Member

This obfuscates the cause of the error. From the perspective of a user receiving such messages, there's no way to tell what is happening.
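
A rough sketch of one alternative, assuming the friendly text is kept but the cause travels with it. `build_database_error_detail`, the `message` and `error_type` keys, and `FakeOperationalError` are all hypothetical names for illustration; a real handler would receive a SQLAlchemy exception instead of the stand-in class used here.

```python
class FakeOperationalError(Exception):
    """Hypothetical stand-in for a database exception, so the sketch runs without a database."""


def build_database_error_detail(exc: Exception) -> dict:
    """Pair a friendly message with enough information to identify the cause."""
    return {
        "reason": "Database error",
        "message": "A database operation failed. Please try again later.",
        # The exception class name and original text keep the cause visible.
        "error_type": type(exc).__name__,
        "orig_error": str(exc),
    }


detail = build_database_error_detail(FakeOperationalError("connection refused"))
print(detail["error_type"], "-", detail["orig_error"])
```

Whether the original error text should be returned in a 500 response or only logged is a separate design decision that this sketch does not settle.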

Author

Apologies, I’ve recently been assigned to another project and will be unable to work on this.

Contributor

I'll pick it up

@pierrejeambrun
Member

closing for now, @rawwar will pick it up

Labels

area:API Airflow's REST/HTTP API
