@potiuk
Member

@potiuk potiuk commented Mar 14, 2025

This is a continuation of the separation of the Airflow codebase into
separate distributions. This one splits airflow into two of them:

  • apache-airflow - becomes an empty, no-code meta distribution that
    only has dependencies on the apache-airflow-core and task-sdk
    distributions, and it has the preinstalled provider distributions
    added in the standard "wheel" distribution. All "extras" lead
    either to "apache-airflow-core" extras or to providers. The
    dependencies and optional dependencies are calculated differently
    depending on "editable" or "standard" mode: in editable mode,
    just the provider dependencies are installed for the preinstalled
    providers; in standard mode, those preinstalled providers are
    themselves dependencies.

  • the apache-airflow-core distribution contains all airflow core
    sources (previously in apache-airflow) and it has no provider
    extras. Thanks to that, the apache-airflow distribution does not
    have any dynamically calculated dependencies.

  • the apache-airflow-core distribution has "hatch_build_airflow_core.py"
    build hooks that add a custom build target and implement custom
    cleanup in order to compile assets as part of the build.

  • During the move, the following changes were applied for consistency:

    • packages, when used in the context of distribution packages, have
      been renamed to "distributions" - including all documentation and
      commands in breeze - to avoid confusion with import packages
      (see
      https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/)

    • all tests in airflow-core now follow the same convention,
      where tests are in the unit, system and integration packages.
      No extra package has been added as a second level, because all the
      provider tests have "<PROVIDER>" there, so we just have to avoid
      naming airflow unit."<PROVIDER>" with the same name as a provider.

    • all tooling in CI/DEV has been updated to follow the new
      structure. We should always build two packages now when we
      are building them using breeze.
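
The "editable" versus "standard" rule described above for the apache-airflow meta distribution can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the description, not the actual build code of this PR: the function name `meta_distribution_dependencies`, the mode strings, and the provider name and requirement in the usage example are all assumptions made for the sketch.

```python
from typing import Mapping, Sequence


def meta_distribution_dependencies(
    mode: str,
    preinstalled_providers: Sequence[str],
    provider_dependencies: Mapping[str, Sequence[str]],
) -> list[str]:
    """Return the dependency list of the meta distribution for a given mode.

    Hypothetical helper: it only mirrors the rule stated in the PR description.
    """
    # The meta distribution always depends on the core and task-sdk distributions.
    deps = ["apache-airflow-core", "apache-airflow-task-sdk"]
    if mode == "standard":
        # Standard mode: the preinstalled providers themselves are dependencies.
        deps.extend(preinstalled_providers)
    elif mode == "editable":
        # Editable mode: only the dependencies of the preinstalled providers
        # are installed, not the provider distributions.
        for provider in preinstalled_providers:
            for requirement in provider_dependencies.get(provider, ()):
                if requirement not in deps:
                    deps.append(requirement)
    else:
        raise ValueError(f"Unknown mode: {mode!r}")
    return deps


# Illustrative (made-up) provider and requirement names.
providers = ["apache-airflow-providers-example"]
requirements = {"apache-airflow-providers-example": ["example-lib>=1.0"]}
print(meta_distribution_dependencies("standard", providers, requirements))
print(meta_distribution_dependencies("editable", providers, requirements))
```

The two calls differ only in their last element: the provider distribution itself in standard mode, and that provider's own requirement in editable mode.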


@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from 4c98a0c to 8dbbcec Compare March 14, 2025 18:40
@jscheffl
Contributor

Wow! Impressive metric! More files moved than code changed :-D

@potiuk
Member Author

potiuk commented Mar 15, 2025

> Wow! Impressive metric! More files moved than code changed :-D

Indeed :)

@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch 11 times, most recently from 37c0179 to 317027d Compare March 16, 2025 15:32
@potiuk
Member Author

potiuk commented Mar 16, 2025

> Wow! Impressive metric! More files moved than code changed :-D

This has changed now :). While making the PR green, and since I was changing the package preparation steps anyway, I took the opportunity to finally fix the long-standing "what is a package" problem.

In the current incarnation of the change I consistently name what we produce "distributions", not "packages", because "package" is really ambiguous.

See https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/

So we will be:

  • preparing distribution (in a given distribution format - wheel or sdist)
  • releasing distribution
  • verifying distribution
  • working on distribution

We will have provider distributions rather than provider packages. This mostly means renaming "package" to "distribution" in breeze, in our code, in documentation links, etc.

I think this is the right time to do it, especially since we are going to have these distributions:

  • apache-airflow
  • apache-airflow-core
  • apache-airflow-task-sdk
  • apache-airflow-providers-* -> 90+ of those

And most of those will have the Python airflow package inside.

@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from 317027d to 9ee4b70 Compare March 16, 2025 16:37
@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch 11 times, most recently from e996e86 to a5df6d2 Compare March 18, 2025 12:30
@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch 4 times, most recently from 3d2dc17 to 664c660 Compare March 20, 2025 23:03
@potiuk
Member Author

potiuk commented Mar 20, 2025

OK. That one has a big chance of being green

@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from 664c660 to ff4612a Compare March 21, 2025 00:17
Member

@gopidesupavan gopidesupavan left a comment

Wooho its green :)

@potiuk
Member Author

potiuk commented Mar 21, 2025

Last rebase I hope.

@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from ff4612a to fbd7e4f Compare March 21, 2025 08:27
@potiuk
Member Author

potiuk commented Mar 21, 2025

Damn... have to rebase AGAIN :)

@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from fbd7e4f to 1f73817 Compare March 21, 2025 09:37
@potiuk potiuk force-pushed the move-airflow-sources-to-airflow-core branch from 1f73817 to 36a986d Compare March 21, 2025 13:24
@potiuk
Member Author

potiuk commented Mar 21, 2025

YOLO! Merging after rebase without waiting.

@potiuk potiuk merged commit 243fe86 into apache:main Mar 21, 2025
9 checks passed
@potiuk potiuk deleted the move-airflow-sources-to-airflow-core branch March 21, 2025 13:25
agupta01 pushed a commit to agupta01/airflow that referenced this pull request Mar 21, 2025
shubham-pyc pushed a commit to shubham-pyc/airflow that referenced this pull request Mar 22, 2025
@zlonghofer

@potiuk the Apache Airflow website has a GitHub "Suggest a change on this page" widget that still links to the old location and returns a 404.

This is true for all of the pages I've tried, and it probably means that a templating update to the main documentation website is required to resolve the broken links. Is it appropriate to open an issue on the Airflow repo to implement this change, or does a different entity manage the links on the main documentation site?

@amoghrajesh
Contributor

@zlonghofer thanks for reporting, I created an issue for this: #48178

This change will have to go in airflow-site repository, not here.

nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Apr 4, 2025