Prepare 2.0.2 #15406
Merged
Conversation
(cherry picked from commit 5090fb0)
(cherry picked from commit 69801f5)
(cherry picked from commit 6fd93ba)
`attachable` is only a property of compose version 3.1 files, but we are still on 2.2.
This was failing on self-hosted runners with the error:
`networks.example.com value Additional properties are not allowed ('attachable' was unexpected)`
(cherry picked from commit 5dbabdd)
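For illustration, this is the kind of fragment the 2.2 schema validator rejects (the network name is taken from the error above; the rest of the fragment is a hypothetical sketch, not the actual file changed in this PR):

```yaml
version: "2.2"
networks:
  example.com:
    driver: bridge
    attachable: true   # valid only in compose file format 3.1+; 2.2 rejects it
```

The fix is to drop the `attachable` key (or upgrade the file's `version`) so the 2.2 validator accepts it.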
(cherry picked from commit 0c1ad94)
(cherry picked from commit 64280ab)
* Run CI (Build Image and committer PRs) on self-hosted runners. The queue for GitHub Actions is starting to get painful. This PR changes it to run any build of main (push or scheduled), or a PR from a committer, on the self-hosted runner. However, since this setting is configured _in_ the repo, it could be changed by a PR, so we can't rely on it alone; we also use a custom build of the self-hosted runner binary that enforces that PRs are from contributors, otherwise it fails the build.
* Don't free space on self-hosted runners. Freeing space deletes the images we have pre-cached (defeating the point of pre-loading them), and we manage freeing space via another process. The exception is the K8s tests, where we don't need the pre-cached images but do want the space in our tmpfs.
* Some jobs take longer on self-hosted runners.
* When running KinD tests on CI, pass the host docker creds through to KinD. This enables self-hosted runners (which are logged in to a Docker Hub user) to pull images without running into pull limits.
* Work around spurious blocking IO failures. This seems to be an occasional problem on the build-provider-package job, which produces a lot of output all at once. It appears as the job simply failing with an exit code (sometimes 1, other times 120) but no error message.
(cherry picked from commit 4816707)
The change to this file was untestable until it was merged to master :( (cherry picked from commit 9005f28)
When the image is prepared, PIP installation produces progress bars, which are annoying - especially in the CI environment. This PR adds an argument to control the progress bar and sets it to "off" for CI builds. (cherry picked from commit 9b7852e)
When users want to test pull requests or - especially - master builds in their own fork, they will likely not have self-hosted runners configured, and the builds will fail. This change uses self-hosted runners only if the build runs in the context of the `apache/airflow` repository. (cherry picked from commit 14890f0)
This change attempts to stabilize pylint checks. Since we have recently added self-hosted runners with multiple CPUs, re-enabling parallel mode for pylint makes perfect sense: we will finally be able to use the parallelism and speed up static checks significantly. Previously the checks ran in single-processor mode to avoid random mistakes where different files were processed in different processes. This led to random pylint or mypy problems - either false positives or false negatives - especially with circular dependencies. But since we are now past the heavy refactoring, this should be no problem for future changes, and an occasional false positive/negative is less disruptive than long checks.

An attempt is made to apply a sorting order to the files passed to pylint. This should provide more stability in the results of running the checks in PRs and in master.

We had some custom pylint plugins that prevented using pylint parallelism. For now we are giving up on one of the plugins (no use of asserts) and rely on committer review for that (we have a rule in place to only use asserts in tests). The other plugin was replaced by returning to the separation of "main code" and "test code" and applying different rules to each - we now have two configuration files, pylintrc and pylintrc-tests, to control the settings for those two cases. Mypy and flake8 have been parallelized at the level of pre-commits.

By implementing these changes we are likely to speed up the checks on self-hosted runners 6x-8x. Caching of pre-commit environments (including the pre-commit installation) is also improved. Previously we only cached the environment created by pre-commit, but with this change we also cache the pre-commit installation in the .local directory - this should save 1-2 minutes per job. (cherry picked from commit 82cb041)
…14227) There are two types of constraints now:

* default constraints - containing all dependencies of airflow, all the provider packages released at the time of the release of that version, as well as all transitive dependencies. Following those constraints, you can be sure Airflow's installation is repeatable.
* no-providers constraints - containing only the dependencies needed for core airflow installation. This allows installing/upgrading airflow without also forcing the providers to be installed at a specific version of Airflow, enabling flexible management of Airflow and provider packages separately.

Documentation about this has been added. Also, the provider 'extras' for apache-airflow no longer keep direct dependencies on the packages needed by the provider. Those dependencies are now transitive only - a 'provider' extra depends only on the 'apache-airflow-provider-EXTRA' package, and all its dependencies are transitive. This will help in the future to avoid conflicts when installing newer providers using extras. (cherry picked from commit d524cec)
(cherry picked from commit cc9827d)
* Fix caching of python images during builds. After the GHCR change, GitHub image caching for python images was broken. GHCR requires each image that we push to be tagged with the "org.opencontainers.image.source" label pointing to the right project repository; otherwise the image will not appear among the repository images and we will not be able to manage its permissions. So we have been extending the python base image with the appropriate label, but not in all the places necessary. As a result, the builds in CI were using different images than the cache image in DockerHub, which in turn caused full image rebuilds every time. This change makes sure that the "airflow-tainted" python images - including the label - are used everywhere. All images (when using breeze) are now built using such an airflow-tainted image, which should save up to 5-8 minutes of image building per job (CI and production).

The change implements the following behaviours:

1) When the image is built on CI or locally without the UPGRADE_TO_NEWER_DEPENDENCIES flag and without the FORCE_PULL_IMAGES flag, the latest image from the GitHub Registry or DockerHub registry is used.
2) When both FORCE_PULL_IMAGES and UPGRADE_TO_NEWER_DEPENDENCIES are set to something other than "false", the latest Python images are used - they are pulled first and tagged appropriately. This way, whenever UPGRADE_TO_NEWER_DEPENDENCIES is set (including master pushes), such builds use the latest version of the python base images published on DockerHub.

* Update _push_pull_remove_images.sh (cherry picked from commit b7d9c34)
This change enables easy switching between GitHub Package Registry and GitHub Container Registry by simply setting the GITHUB_REGISTRY secret to either `docker.package.github.com` or `ghcr.io`. This makes the switch easy, but only for builds run in the Apache Airflow repository, as it requires preparation of the images (making them public and adding permissions to manage them) after they are created for the first time. GitHub Package Registry works out of the box but is less stable and considered legacy; it also does not allow image retention. Documentation has been updated to reflect the reasoning for choosing this solution and to describe the maintenance processes around the images (including adding a new Python version). (cherry picked from commit 25fa309)
The pre-commit cache did not work when the host python version changed. When the python version changed from 3.6.12 to 3.6.13, the cache was rendered useless. This change adds the python version to the cache key specification, with the fallback value also including the python version, so that even after changing pre-commit, the cache can be rebuilt from the last good cache for the same python version. (cherry picked from commit f3bc8ab)
) Upgrade to newer dependencies is now set to something other than false in only two cases:

* when setup.py or setup.cfg change
* when we are running a push build

Previously - mistakenly - it was also set when the "full tests needed" label was set on a PR. This caused, for example, the #19628 PR to fail before moto was limited to <2 in #14433. (cherry picked from commit fe31137)
Because we specified `--rcfile` as an argument to the script, the check for "no files" was failing and pylint was being run with no files, resulting in an error. (cherry picked from commit ab72b05)
…14224) Co-authored-by: Kamil Bregula <[email protected]> (cherry picked from commit ffc68bc)
It takes an insane 9:30 on the self-hosted runner and likely incurs a lot of cost pushing the docs (and its value is limited). (cherry picked from commit 20f391e)
* Update hadolint from v1.22.1 to v1.18.0 * fixup! Update hadolint from v1.22.1 to v1.18.0 * fixup! fixup! Update hadolint from v1.22.1 to v1.18.0 Co-authored-by: Kamil Breguła <[email protected]> (cherry picked from commit cc7260a)
* Production image can be run as root * fixup! Production image can be run as root * fixup! fixup! Production image can be run as root Co-authored-by: Kamil Bregula <[email protected]> Co-authored-by: Kamil Breguła <[email protected]> (cherry picked from commit 7979b75)
The asset recompilation message did not work well in the case of tests, where we did not mount local sources: it always showed the instructions to recompile the assets. This is now fixed, and the "OK" message is green. (cherry picked from commit 9f35cff)
Currently, as long as the argument '-p' is present, the code tries to mask it. However, '-p' may mean something other than a password, like a boolean flag. Such cases may result in an exception. (cherry picked from commit 486b764)
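A minimal sketch of the safer behaviour (the helper name and mask string are hypothetical, not Airflow's actual implementation): mask the value following `-p` only when a value actually follows it, so a trailing boolean `-p` flag no longer raises.

```python
def mask_password_args(args):
    """Mask the value after '-p' only if one exists.

    A trailing '-p' (used as a boolean flag) is left untouched instead
    of causing an IndexError when we try to mask the 'next' argument.
    """
    masked = list(args)
    for i, arg in enumerate(masked):
        if arg == "-p" and i + 1 < len(masked):
            masked[i + 1] = "******"
    return masked
```

Called with `["mysql", "-p", "secret"]` this masks the password, while `["tool", "-p"]` passes through unchanged.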
…eated Pods (#15204) closes: #15193 Currently the condition of whether the pod was created by Airflow is not considered. This commit fixes that. We decide whether the Pod was created by Airflow by checking if it has all the labels added in PodGenerator.construct_pod() or KubernetesPodOperator.create_labels_for_pod(). (cherry picked from commit c594d9c)
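The shape of that check can be sketched as a subset test on the pod's labels (the label names below are illustrative assumptions; the real set comes from PodGenerator.construct_pod() and KubernetesPodOperator.create_labels_for_pod()):

```python
# Hypothetical label set; the actual labels are defined by Airflow itself.
REQUIRED_AIRFLOW_LABELS = {"dag_id", "task_id", "try_number"}

def is_airflow_created_pod(pod_labels: dict) -> bool:
    """Treat a pod as Airflow-created only if it carries *all* the
    labels Airflow adds at pod-creation time."""
    return REQUIRED_AIRFLOW_LABELS.issubset(pod_labels.keys())
```

A pod with only some of the labels (or none) is then left alone rather than adopted or deleted by Airflow.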
We've seen instances of connection resets happening, particularly in Azure, that are remedied by enabling tcp_keepalive. Enabling it by default should be safe and sane regardless of where we are running. (cherry picked from commit 6e31465)
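Enabling TCP keepalive on a socket looks roughly like this (a generic sketch using the stdlib, not Airflow's actual code; the 120s/30s tuning values are illustrative):

```python
import socket

def make_keepalive_socket() -> socket.socket:
    """Create a TCP socket with keepalive enabled.

    SO_KEEPALIVE makes the kernel probe idle connections so silently
    dropped ones (e.g. by a cloud load balancer) are detected instead
    of hanging forever.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific tuning (Linux): start probing after 120s idle,
    # then probe every 30s. Guarded because the constants vary by OS.
    if hasattr(socket, "TCP_KEEPIDLE"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 120)
    if hasattr(socket, "TCP_KEEPINTVL"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30)
    return s
```

The defaults are conservative (hours of idle time on Linux), which is why turning keepalive on by default is a safe, low-risk change.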
(cherry picked from commit 97b7780)
The current implementation imports Connection at import time, which causes a circular import when a model class needs to reference a hook class. With this fix, the airflow.hooks package is completely decoupled from airflow.models at import time, so model code can reference hooks. Hooks, on the other hand, generally don't reference model classes. Fixes #15325. (cherry picked from commit 7560316)
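The general pattern here is deferring the import into the function body, so the dependency is only resolved at call time rather than at module-import time. A toy illustration of the technique (using a stdlib module rather than Airflow's actual Connection import):

```python
def parse_timestamp(value: str):
    """Deferred import: 'datetime' is loaded only when the function
    runs, not when this module is imported. Moving the Connection
    import into method bodies breaks the hooks -> models -> hooks
    cycle the same way."""
    from datetime import datetime  # local import, resolved at call time
    return datetime.fromisoformat(value)
```

At module-import time no cross-module dependency exists, so either module can safely import the other.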
This module was removed/renamed in mysqlclient 2.0 -- this new name works for both (cherry picked from commit a162f2b)
(cherry picked from commit f69bb82)
There is some problem where the stackdriver logging "leaks" out of this test and remains configured (whereas the S3 one doesn't), but given this test is not actually testing logs, a quick fix is to change which remote logging handler we use.
After stabilizing the builds on CI, the master builds finally started to get green - except for pushing to the prod image cache, which continued to fail because the python image to push was missing. This PR fixes it. (cherry picked from commit 1dfbb8d)
This bug was introduced in #14909. Removed sorting from list and tuple, as lists and tuples preserve order, unlike sets. The following DAG errors with `TypeError: '<' not supported between instances of 'dict' and 'dict'`:

```python
from airflow import models
from airflow.operators.dummy import DummyOperator
from datetime import datetime, timedelta

params = {
    "staging_schema": [{"key:": "foo", "value": "bar"},
                       {"key:": "this", "value": "that"}]
}

with models.DAG(dag_id='test-dag',
                start_date=datetime(2019, 2, 14),
                schedule_interval='30 13 * * *',
                catchup=False,
                max_active_runs=1,
                params=params
                ) as dag:
    my_task = DummyOperator(
        task_id='task1'
    )
```

Full error:

```
  File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 210, in <dictcomp>
    return cls._encode({str(k): cls._serialize(v) for k, v in var.items()}, type_=DAT.DICT)
  File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 212, in _serialize
    return sorted(cls._serialize(v) for v in var)
TypeError: '<' not supported between instances of 'dict' and 'dict'

During handling of the above exception, another exception occurred:
...
```

This is because `sorted()` does not work with dicts, as it can't compare them. Removing the sorting from lists and tuples fixes it. It also fails when we have a set with multiple types. (cherry picked from commit d115040)
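The essence of the fix can be sketched as: sort only sets (whose iteration order is unstable), and pass lists and tuples through in their original order, since they may contain unorderable items such as dicts (the function name below is illustrative, not Airflow's actual `_serialize`):

```python
def serialize_collection(var):
    """Sketch of the fix from the commit message above.

    Sets have no stable iteration order, so they are sorted to make
    serialization deterministic. Lists and tuples already preserve
    order, so sorting them is unnecessary - and it raises TypeError
    when the elements (e.g. dicts) cannot be compared with '<'.
    Note: a set mixing multiple types would still fail to sort here.
    """
    if isinstance(var, set):
        return sorted(var)
    if isinstance(var, (list, tuple)):
        return list(var)  # keep original order, no comparison needed
    return var
```

With this version, `serialize_collection([{"key:": "foo"}, {"key:": "this"}])` returns the list unchanged instead of raising.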
(cherry picked from commit a0b217a)
After merging the constraints, the 'recursive' mode was not added to checkout, resulting in a non-checked-out github push action. This commit fixes it and adds color to the diff output in the commit to better show differences when pushing. (cherry picked from commit 6b78394)
(cherry picked from commit adbab36)
kaxil approved these changes - Apr 16, 2021
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
Labels
- area:API (Airflow's REST/HTTP API)
- area:dev-tools
- full tests needed (We need to run full set of tests for this PR to merge)
- DO NOT MERGE (For approval only.)