base-images: release a base image for our java connectors #49831

Merged
32 changes: 30 additions & 2 deletions airbyte-ci/connectors/base_images/README.md
@@ -6,7 +6,7 @@ Our connector build pipeline ([`airbyte-ci`](https://github.com/airbytehq/airbyt
Our base images are declared in code, using the [Dagger Python SDK](https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/).

- [Python base image code declaration](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/base_images/base_images/python/bases.py)
- ~Java base image code declaration~ *TODO*
- [Java base image code declaration](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/base_images/base_images/java/bases.py)


## Where are the Dockerfiles?
@@ -39,6 +39,20 @@ RUN mkdir -p 755 /usr/share/nltk_data



### Example for `airbyte/java-connector-base`:
```dockerfile
FROM docker.io/amazoncorretto:21-al2023@sha256:5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33
RUN sh -c set -o xtrace && yum update -y --security && yum install -y tar openssl findutils && yum clean all
ENV AIRBYTE_SPEC_CMD=/airbyte/javabase.sh --spec
ENV AIRBYTE_CHECK_CMD=/airbyte/javabase.sh --check
ENV AIRBYTE_DISCOVER_CMD=/airbyte/javabase.sh --discover
ENV AIRBYTE_READ_CMD=/airbyte/javabase.sh --read
ENV AIRBYTE_WRITE_CMD=/airbyte/javabase.sh --write
ENV AIRBYTE_ENTRYPOINT=/airbyte/base.sh
```



## Base images


@@ -59,6 +73,17 @@ RUN mkdir -p 755 /usr/share/nltk_data
| 1.0.0 | ✅| docker.io/airbyte/python-connector-base:1.0.0@sha256:dd17e347fbda94f7c3abff539be298a65af2d7fc27a307d89297df1081a45c27 | Initial release: based on Python 3.9.18, on slim-bookworm system, with pip==23.2.1 and poetry==1.6.1 |


### `airbyte/java-connector-base`

| Version | Published | Docker Image Address | Changelog |
|---------|-----------|--------------|-----------|
| 1.0.0 | ✅| docker.io/airbyte/java-connector-base:1.0.0@sha256:be86e5684e1e6d9280512d3d8071b47153698fe08ad990949c8eeff02803201a | Create a base image for our java connectors based on Amazon Corretto. |
| 1.0.0-rc.4 | ✅| docker.io/airbyte/java-connector-base:1.0.0-rc.4@sha256:be86e5684e1e6d9280512d3d8071b47153698fe08ad990949c8eeff02803201a | Bundle yum calls in a single RUN |
| 1.0.0-rc.3 | ✅| docker.io/airbyte/java-connector-base:1.0.0-rc.3@sha256:be86e5684e1e6d9280512d3d8071b47153698fe08ad990949c8eeff02803201a | |
| 1.0.0-rc.2 | ✅| docker.io/airbyte/java-connector-base:1.0.0-rc.2@sha256:fca66e81b4d2e4869a03b57b1b34beb048e74f5d08deb2046c3bb9919e7e2273 | Set entrypoint to base.sh |
| 1.0.0-rc.1 | ✅| docker.io/airbyte/java-connector-base:1.0.0-rc.1@sha256:886a7ce7eccfe3c8fb303511d0e46b83b7edb4f28e3705818c090185ba511fe7 | Create a base image for our java connectors. |


## How to release a new base image version (example for Python)

### Requirements
@@ -102,6 +127,9 @@ poetry run mypy base_images --check-untyped-defs

## CHANGELOG

### 1.4.0
- Declare a base image for our java connectors.

### 1.3.1
- Update the crane image address. The previous address was deleted by the maintainer.

@@ -120,4 +148,4 @@ poetry run mypy base_images --check-untyped-defs

### 1.0.1

- Bumped dependencies ([#42581](https://github.com/airbytehq/airbyte/pull/42581))
- Bumped dependencies ([#42581](https://github.com/airbytehq/airbyte/pull/42581))
@@ -0,0 +1,3 @@
#
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.
#
102 changes: 102 additions & 0 deletions airbyte-ci/connectors/base_images/base_images/java/bases.py
@@ -0,0 +1,102 @@
#
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.
#
from __future__ import annotations

from typing import Callable, Final

import dagger
from base_images import bases, published_image
from base_images import sanity_checks as base_sanity_checks
from base_images.python import sanity_checks as python_sanity_checks
from base_images.root_images import AMAZON_CORRETTO_21_AL_2023
from base_images.utils.dagger import sh_dash_c


class AirbyteJavaConnectorBaseImage(bases.AirbyteConnectorBaseImage):
    # TODO: remove this once we want to build the base image with the airbyte user.
Contributor

This I expect mimics the current setup?

Contributor Author

Yes, the current user is root. The parent class sets it to airbyte, so I override USER with root to keep root as the expected user for v1.

    USER: Final[str] = "root"

    root_image: Final[published_image.PublishedImage] = AMAZON_CORRETTO_21_AL_2023
    repository: Final[str] = "airbyte/java-connector-base"

    DD_AGENT_JAR_URL: Final[str] = "https://dtdg.co/latest-java-tracer"
    BASE_SCRIPT_URL = "https://raw.githubusercontent.com/airbytehq/airbyte/6d8a3a2bc4f4ca79f10164447a90fdce5c9ad6f9/airbyte-integrations/bases/base/base.sh"
Contributor

consider lifting the git commit sha to a dedicated constant

    JAVA_BASE_SCRIPT_URL: Final[
        str
    ] = "https://raw.githubusercontent.com/airbytehq/airbyte/6d8a3a2bc4f4ca79f10164447a90fdce5c9ad6f9/airbyte-integrations/bases/base-java/javabase.sh"

    def get_container(self, platform: dagger.Platform) -> dagger.Container:
        """Returns the container used to build the base image for java connectors.
        We currently use the Amazon Corretto image as a base.
        We install some packages required to build java connectors.
        We also download the datadog java agent jar and the javabase.sh script.
        We set some env variables used by the javabase.sh script.

        Args:
            platform (dagger.Platform): The platform this container should be built for.

        Returns:
            dagger.Container: The container used to build the base image.
        """

        return (
            # TODO: Call this when we want to build the base image with the airbyte user
            # self.get_base_container(platform)
            self.dagger_client.container(platform=platform)
            .from_(self.root_image.address)
            # Bundle RUN commands together to reduce the number of layers.
            .with_exec(
                sh_dash_c(
                    [
                        # Update first, but in the same .with_exec step as the package installation.
                        # Otherwise, we risk caching stale package URLs.
                        "yum update -y --security",
                        # tar is required to untar java connector binary distributions.
                        # openssl is required because we need to ssh and scp sometimes.
                        # findutils is required for xargs, which is shipped as part of findutils.
                        f"yum install -y tar openssl findutils",
                        # Remove any dangly bits.
                        "yum clean all",
                    ]
                )
            )
            .with_workdir("/airbyte")
            # Copy the datadog java agent jar from the internet.
            .with_file("dd-java-agent.jar", self.dagger_client.http(self.DD_AGENT_JAR_URL))
            # Copy base.sh from the git repo.
            .with_file("base.sh", self.dagger_client.http(self.BASE_SCRIPT_URL))
            # Copy javabase.sh from the git repo.
            .with_file("javabase.sh", self.dagger_client.http(self.JAVA_BASE_SCRIPT_URL))
            # Set a bunch of env variables used by base.sh.
            .with_env_variable("AIRBYTE_SPEC_CMD", "/airbyte/javabase.sh --spec")
            .with_env_variable("AIRBYTE_CHECK_CMD", "/airbyte/javabase.sh --check")
            .with_env_variable("AIRBYTE_DISCOVER_CMD", "/airbyte/javabase.sh --discover")
            .with_env_variable("AIRBYTE_READ_CMD", "/airbyte/javabase.sh --read")
            .with_env_variable("AIRBYTE_WRITE_CMD", "/airbyte/javabase.sh --write")
            .with_env_variable("AIRBYTE_ENTRYPOINT", "/airbyte/base.sh")
            .with_entrypoint(["/airbyte/base.sh"])
        )

    async def run_sanity_checks(self, platform: dagger.Platform):
        """Runs sanity checks on the base image container.
        This method is called before image publication.
        Consider it like a pre-flight check before take-off to the remote registry.

        Args:
            platform (dagger.Platform): The platform on which the sanity checks should run.
        """
        container = self.get_container(platform)
        await base_sanity_checks.check_user_can_read_dir(container, self.USER, self.AIRBYTE_DIR_PATH)
        await base_sanity_checks.check_user_can_write_dir(container, self.USER, self.AIRBYTE_DIR_PATH)
        await base_sanity_checks.check_file_exists(container, "/airbyte/dd-java-agent.jar")
        await base_sanity_checks.check_file_exists(container, "/airbyte/base.sh")
        await base_sanity_checks.check_file_exists(container, "/airbyte/javabase.sh")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_SPEC_CMD", "/airbyte/javabase.sh --spec")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_CHECK_CMD", "/airbyte/javabase.sh --check")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_DISCOVER_CMD", "/airbyte/javabase.sh --discover")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_READ_CMD", "/airbyte/javabase.sh --read")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_WRITE_CMD", "/airbyte/javabase.sh --write")
        await base_sanity_checks.check_env_var_with_printenv(container, "AIRBYTE_ENTRYPOINT", "/airbyte/base.sh")
        await base_sanity_checks.check_a_command_is_available_using_version_option(container, "tar")
        await base_sanity_checks.check_a_command_is_available_using_version_option(container, "openssl", "version")
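
The review suggestion in this file, lifting the pinned git commit SHA into a dedicated constant, could look roughly like the sketch below. The names `AIRBYTE_REPO_COMMIT_SHA` and `RAW_REPO_URL_PREFIX` are hypothetical and not part of this PR; the SHA and the script paths are the ones already used in `bases.py`.

```python
from typing import Final

# Hypothetical constant names, for illustration only (not part of this PR).
AIRBYTE_REPO_COMMIT_SHA: Final[str] = "6d8a3a2bc4f4ca79f10164447a90fdce5c9ad6f9"
RAW_REPO_URL_PREFIX: Final[str] = f"https://raw.githubusercontent.com/airbytehq/airbyte/{AIRBYTE_REPO_COMMIT_SHA}"

# Both script URLs are derived from the same pinned commit.
BASE_SCRIPT_URL: Final[str] = f"{RAW_REPO_URL_PREFIX}/airbyte-integrations/bases/base/base.sh"
JAVA_BASE_SCRIPT_URL: Final[str] = f"{RAW_REPO_URL_PREFIX}/airbyte-integrations/bases/base-java/javabase.sh"

print(BASE_SCRIPT_URL)
print(JAVA_BASE_SCRIPT_URL)
```

With this shape, bumping the pinned commit would be a one-line change and the two scripts could not drift to different commits by accident.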
@@ -24,3 +24,10 @@
    tag="3.10.14-slim-bookworm",
    sha="2407c61b1a18067393fecd8a22cf6fceede893b6aaca817bf9fbfe65e33614a3",
)

AMAZON_CORRETTO_21_AL_2023 = PublishedImage(
    registry="docker.io",
    repository="amazoncorretto",
    tag="21-al2023",
    sha="5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33",
)
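
To illustrate how these four fields relate to the digest-pinned `FROM` line in the README's Dockerfile example, here is a minimal sketch. The real `PublishedImage` class is not shown in this diff, so `PinnedImage` below is a hypothetical stand-in, and its `address` property is an assumption modelled on the `registry/repository:tag@sha256:digest` shape of that `FROM` line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedImage:
    """Hypothetical stand-in for PublishedImage; the real class is not shown in this diff."""

    registry: str
    repository: str
    tag: str
    sha: str

    @property
    def address(self) -> str:
        # Pinning by digest means a moved tag cannot silently change the root image.
        return f"{self.registry}/{self.repository}:{self.tag}@sha256:{self.sha}"


corretto = PinnedImage(
    registry="docker.io",
    repository="amazoncorretto",
    tag="21-al2023",
    sha="5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33",
)
print(corretto.address)
```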
16 changes: 16 additions & 0 deletions airbyte-ci/connectors/base_images/base_images/sanity_checks.py
@@ -178,3 +178,19 @@ async def check_user_can_write_dir(container: dagger.Container, user: str, dir_p
        await container.with_user(user).with_exec(["touch", f"{dir_path}/foo.txt"])
    except dagger.ExecError:
        raise errors.SanityCheckError(f"{dir_path} is not writable by the {user}.")


async def check_file_exists(container: dagger.Container, file_path: str):
    """Check that a file exists in the container.

    Args:
        container (dagger.Container): The container on which the sanity checks should run.
        file_path (str): The file path to check.

    Raises:
        errors.SanityCheckError: Raised if the file does not exist.
    """
    try:
        await container.with_exec(["test", "-f", file_path])
    except dagger.ExecError:
        raise errors.SanityCheckError(f"{file_path} does not exist.")
@@ -6,7 +6,7 @@ Our connector build pipeline ([`airbyte-ci`](https://github.com/airbytehq/airbyt
Our base images are declared in code, using the [Dagger Python SDK](https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/).

- [Python base image code declaration](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/base_images/base_images/python/bases.py)
- ~Java base image code declaration~ *TODO*
- [Java base image code declaration](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/base_images/base_images/java/bases.py)


## Where are the Dockerfiles?
@@ -79,6 +79,12 @@ poetry run mypy base_images --check-untyped-defs

## CHANGELOG

### 1.4.0
- Declare a base image for our java connectors.

### 1.3.1
- Update the crane image address. The previous address was deleted by the maintainer.

### 1.2.0
- Improve new version prompt to pick bump type with optional pre-release version.

@@ -0,0 +1,6 @@
# Copyright (c) 2024 Airbyte, Inc., all rights reserved.


def sh_dash_c(lines: list[str]) -> list[str]:
    """Wrap sequence of commands in shell for safe usage of dagger Container's with_exec method."""
    return ["sh", "-c", " && ".join(["set -o xtrace"] + lines)]
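
As a quick illustration of what this helper produces, the sketch below repeats the function so it runs on its own and feeds it the three `yum` commands from `get_container`. Nothing here is new API; the command strings are copied from `bases.py`.

```python
def sh_dash_c(lines: list[str]) -> list[str]:
    """Same helper as in the diff, repeated so the snippet is self-contained."""
    return ["sh", "-c", " && ".join(["set -o xtrace"] + lines)]


args = sh_dash_c(
    [
        "yum update -y --security",
        "yum install -y tar openssl findutils",
        "yum clean all",
    ]
)
# The whole script is a single argv element passed to `sh -c`.
print(args)
```

Because the commands are chained with `&&`, a failing step stops the rest of the chain, and all of them run inside one `with_exec` call. The joined string also matches the `RUN sh -c set -o xtrace && ...` line in the README's Dockerfile example.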
@@ -13,11 +13,12 @@
import semver
from base_images import consts, published_image
from base_images.bases import AirbyteConnectorBaseImage
from base_images.java.bases import AirbyteJavaConnectorBaseImage
from base_images.python.bases import AirbyteManifestOnlyConnectorBaseImage, AirbytePythonConnectorBaseImage
from base_images.utils import docker
from connector_ops.utils import ConnectorLanguage # type: ignore

MANAGED_BASE_IMAGES = [AirbytePythonConnectorBaseImage]
MANAGED_BASE_IMAGES = [AirbytePythonConnectorBaseImage, AirbyteJavaConnectorBaseImage]


@dataclass
@@ -270,6 +271,12 @@ async def get_manifest_only_registry(
    )


async def get_java_registry(
    dagger_client: dagger.Client, docker_credentials: Tuple[str, str], cache_ttl_seconds: int = 0
) -> VersionRegistry:
    return await VersionRegistry.load(AirbyteJavaConnectorBaseImage, dagger_client, docker_credentials, cache_ttl_seconds=cache_ttl_seconds)


async def get_registry_for_language(
    dagger_client: dagger.Client, language: ConnectorLanguage, docker_credentials: Tuple[str, str], cache_ttl_seconds: int = 0
) -> VersionRegistry:
@@ -291,12 +298,15 @@ async def get_registry_for_language(
        return await get_python_registry(dagger_client, docker_credentials, cache_ttl_seconds=cache_ttl_seconds)
    elif language is ConnectorLanguage.MANIFEST_ONLY:
        return await get_manifest_only_registry(dagger_client, docker_credentials, cache_ttl_seconds=cache_ttl_seconds)
    elif language is ConnectorLanguage.JAVA:
        return await get_java_registry(dagger_client, docker_credentials, cache_ttl_seconds=cache_ttl_seconds)
    else:
        raise NotImplementedError(f"Registry for language {language} is not implemented yet.")


async def get_all_registries(dagger_client: dagger.Client, docker_credentials: Tuple[str, str]) -> List[VersionRegistry]:
    return [
        await get_python_registry(dagger_client, docker_credentials),
        # await get_java_registry(dagger_client),
        await get_java_registry(dagger_client, docker_credentials),
        # await get_manifest_only_registry(dagger_client, docker_credentials),
    ]
@@ -0,0 +1,22 @@
[
{
"version": "1.0.0",
"changelog_entry": "Create a base image for our java connectors based on Amazon Corretto.",
"dockerfile_example": "FROM docker.io/amazoncorretto:21-al2023@sha256:5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33\nRUN sh -c set -o xtrace && yum update -y --security && yum install -y tar openssl findutils && yum clean all\nENV AIRBYTE_SPEC_CMD=/airbyte/javabase.sh --spec\nENV AIRBYTE_CHECK_CMD=/airbyte/javabase.sh --check\nENV AIRBYTE_DISCOVER_CMD=/airbyte/javabase.sh --discover\nENV AIRBYTE_READ_CMD=/airbyte/javabase.sh --read\nENV AIRBYTE_WRITE_CMD=/airbyte/javabase.sh --write\nENV AIRBYTE_ENTRYPOINT=/airbyte/base.sh"
},
{
"version": "1.0.0-rc.4",
"changelog_entry": "Bundle yum calls in a single RUN",
"dockerfile_example": "FROM docker.io/amazoncorretto:21-al2023@sha256:5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33\nRUN sh -c set -o xtrace && yum update -y --security && yum install -y tar openssl findutils && yum clean all\nENV AIRBYTE_SPEC_CMD=/airbyte/javabase.sh --spec\nENV AIRBYTE_CHECK_CMD=/airbyte/javabase.sh --check\nENV AIRBYTE_DISCOVER_CMD=/airbyte/javabase.sh --discover\nENV AIRBYTE_READ_CMD=/airbyte/javabase.sh --read\nENV AIRBYTE_WRITE_CMD=/airbyte/javabase.sh --write\nENV AIRBYTE_ENTRYPOINT=/airbyte/base.sh"
},
{
"version": "1.0.0-rc.2",
"changelog_entry": "Set entrypoint to base.sh",
"dockerfile_example": "FROM docker.io/amazoncorretto:21-al2023@sha256:5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33\nRUN yum update -y --security\nRUN yum install -y tar openssl findutils\nENV AIRBYTE_SPEC_CMD=/airbyte/javabase.sh --spec\nENV AIRBYTE_CHECK_CMD=/airbyte/javabase.sh --check\nENV AIRBYTE_DISCOVER_CMD=/airbyte/javabase.sh --discover\nENV AIRBYTE_READ_CMD=/airbyte/javabase.sh --read\nENV AIRBYTE_WRITE_CMD=/airbyte/javabase.sh --write\nENV AIRBYTE_ENTRYPOINT=/airbyte/base.sh"
},
{
"version": "1.0.0-rc.1",
"changelog_entry": "Create a base image for our java connectors.",
"dockerfile_example": "FROM docker.io/amazoncorretto:21-al2023@sha256:5454cb606e803fce56861fdbc9eab365eaa2ab4f357ceb8c1d56f4f8c8a7bc33\nRUN yum update -y --security\nRUN yum install -y tar openssl findutils\nENV AIRBYTE_SPEC_CMD=/airbyte/javabase.sh --spec\nENV AIRBYTE_CHECK_CMD=/airbyte/javabase.sh --check\nENV AIRBYTE_DISCOVER_CMD=/airbyte/javabase.sh --discover\nENV AIRBYTE_READ_CMD=/airbyte/javabase.sh --read\nENV AIRBYTE_WRITE_CMD=/airbyte/javabase.sh --write\nENV AIRBYTE_ENTRYPOINT=/airbyte/base.sh"
}
]
2 changes: 1 addition & 1 deletion airbyte-ci/connectors/base_images/pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "airbyte-connectors-base-images"
version = "1.3.1"
version = "1.4.0"
description = "This package is used to generate and publish the base images for Airbyte Connectors."
authors = ["Augustin Lafanechere <[email protected]>"]
readme = "README.md"