Setting up a local kubernetes cluster on macos, and executing distributed sql queries over iceberg + s3(minio) via trino.
Separating the storage and processing resources allows them to be more efficiently used/scaled.
The Iceberg table format scales better than the Hive table format; data files are effectively "write once", yet the table still supports schema evolution.
Trino is more "realtime" than Spark's batch-oriented model, and is lighter weight.
Take advantage of the bandwidth between S3 and the processing nodes.
Kubernetes - Container orchestration across multiple nodes.
Minikube - Helper to bootstrap a single node local development kubernetes cluster.
S3 - An object store. Amazon's Simple Storage Service.
Minio - An object store. Self-hosted, compatible with the S3 API. Used for local development in place of S3.
Iceberg - A specification of a table format, i.e. how to manage the metadata of data files.
Trino - A distributed query engine (cluster), similar to spark.
Spark - The de facto distributed query / data science library and engine (cluster).
- socket_vmnet (`brew install socket_vmnet`)
- hyperkit (`brew install hyperkit`)
- kubectl (`brew install kubernetes-cli`)
- minikube (`brew install minikube`)
- trino cli (`brew install trino`)
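If you prefer, the prerequisites can be installed in one go — the same formulae as listed above:

```sh
# Install all prerequisites at once (same formulae as listed above)
brew install socket_vmnet hyperkit kubernetes-cli minikube trino
```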
socket_vmnet and hyperkit are used on macos to make the linux vm experience behave as expected (containers on macos run inside a vm, so they need extra network management/forwarding, etc.).
There are four main "deployables" organized into four namespaces.
You basically just need to `kubectl create -f <file>` them all in order. There is no need to add any namespace arguments, as the yamls create the namespaces themselves.
Minio should be deployed before everything else.
Postgres should be deployed before Hive. Hive deployment runs an init job to initialize the postgres db schema.
Hive should be deployed before Trino. Trino+Iceberg requires some metadata to be stored on a hive metastore.
As stated in the Trino documentation:
The Iceberg table format manages most metadata in metadata files in the object storage itself. A small amount of metadata, however, still requires the use of a metastore.
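For context, the iceberg catalog configuration that trino.yaml injects into the coordinator and workers looks roughly like the following. This is only a sketch: the in-cluster service DNS names, ports, and s3 property names are assumptions, not copied from the actual manifests — the real values live in trino.yaml.

```sh
# Sketch of a trino iceberg catalog properties file pointing at the hive
# metastore and minio. Hostnames, ports, and credentials below are assumptions.
cat <<'EOF' > iceberg.properties
connector.name=iceberg
hive.metastore.uri=thrift://hive-metastore.hive.svc.cluster.local:9083
hive.s3.endpoint=http://minio.minio.svc.cluster.local:9000
hive.s3.path-style-access=true
hive.s3.aws-access-key=admin
hive.s3.aws-secret-key=adminpassword123
EOF
```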
minikube start --network socket_vmnet --cpus='4' --memory='24GB' --disk-size='250GB'
minikube tunnel
(leave this running to access LoadBalancer services via their external ips)
kubectl create -f minio.yaml
kubectl get services -n minio
- Navigate to the external ip of the minio service, on port 9090 (e.g. http://10.100.138.239:9090/buckets)
- Login with the credentials (admin / adminpassword123)
- Create a bucket named test; no need to modify any other options. Note, this is the only "manual" step. We could create a job to do this using the aws cli (see the sketch below), but it serves as a good checkpoint to verify the configuration.
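For reference, the command such a job would run looks like this — a sketch, assuming minio's S3 API is served on port 9000 of the same external ip and reusing the credentials above:

```sh
# Create the "test" bucket against minio's S3 API instead of the web console.
# The endpoint ip/port is an assumption; use whatever `kubectl get services -n minio` reports.
AWS_ACCESS_KEY_ID=admin AWS_SECRET_ACCESS_KEY=adminpassword123 \
  aws --endpoint-url http://10.100.138.239:9000 s3 mb s3://test
```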
kubectl create -f postgres.yaml
kubectl get pods -n postgres
- Wait for postgres to be up
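If you would rather block until it is ready than poll, something like this works (the app=postgres label selector is an assumption about how postgres.yaml labels its pod):

```sh
# Wait until the postgres pod reports Ready (label selector is an assumption)
kubectl wait --for=condition=Ready pod -l app=postgres -n postgres --timeout=180s
```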
kubectl create -f hive.yaml
kubectl create -f trino.yaml
- Trino is started with 1 coordinator and 2 workers
kubectl get services -n trino
- Take note of the external ip reported for trino
trino --server http://<external ip of trino>:8080 --catalog iceberg --debug
- You are now in a sql repl
Input `quit` or `exit` to stop and exit the trino cli.
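The cli can also run a single statement non-interactively via --execute, which is handy for scripting (server address as above):

```sh
# Run one statement and exit, instead of opening the repl
trino --server http://<external ip of trino>:8080 --catalog iceberg \
  --execute 'show schemas from iceberg'
```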
Refer to the Trino CLI, SQL Language, and SQL Statement Syntax documentation.
show schemas from iceberg;
create schema iceberg.demo with (location = 's3a://test/demo');
create table iceberg.demo.one(country varchar, state varchar, population bigint) with (format = 'PARQUET');
show tables from demo;
insert into demo.one values('usa','delaware',2000);
insert into demo.one values('usa','new york',1000);
select * from demo.one;
delete from demo.one where state='new york';
select * from demo."one$snapshots";
This `$snapshots` suffix is accessing iceberg metadata; refer to the trino iceberg connector documentation for the other iceberg metadata tables.
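A couple of those metadata tables, queried the same way — the table names ($files, $history) come from the trino iceberg connector docs:

```sh
# Inspect the data files and history behind demo.one (run these from the repl,
# or non-interactively as shown here)
trino --server http://<external ip of trino>:8080 --catalog iceberg \
  --execute 'select * from demo."one$files"'
trino --server http://<external ip of trino>:8080 --catalog iceberg \
  --execute 'select * from demo."one$history"'
```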
kubectl scale statefulset trino-worker -n trino --replicas=4
kubectl scale statefulset trino-worker -n trino --replicas=2
Take a look around the minio webui and check out the test bucket.
Experiment and see what the data/metadata looks like when a schema is created in hive vs iceberg, or when the file format is PARQUET vs AVRO vs ORC.
Explore the trino webui!
If trino is running with `--debug`, you will also get a url link to the query, like:
http://10.107.143.42:8080/ui/query.html?20230525_143631_00014_6xyhs
Try to scale the trino workers, and watch as the queries are distributed to them.
You are also able to access the hive catalog that is used by the standalone hive metastore; Trino is configured to access this same catalog.
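For example, assuming the catalog is simply named hive (as the text above suggests), you can browse it through trino the same way:

```sh
# List schemas in the hive catalog via trino (catalog name is an assumption)
trino --server http://<external ip of trino>:8080 --catalog hive \
  --execute 'show schemas from hive'
```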
watch -n 1 kubectl get services --all-namespaces
watch -n 1 kubectl get pods -n trino
- https://trino.io/docs/current/connector/iceberg.html#metadata-tables
- https://github.com/alexcpn/presto_in_kubernetes
- https://kubernetes.io/docs/reference/kubectl/cheatsheet/
Minio/s3 and postgres credentials are passed into the apps using secrets and environment variables. Hive and Trino both have configuration files that are injected into their containers; these configurations have the credentials templated in via environment variables. In other words, if you need to change the minio/postgres credentials, you will need to recreate the 'minio-admin' and 'postgres-admin' secrets. There are duplicate copies of these secrets in each namespace (secrets are bound to namespaces). A production environment would have a better secret-to-namespace management strategy.
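Rotating the minio credentials would therefore look roughly like this — a sketch, where the secret key names (user/password) are assumptions about what the manifests expect:

```sh
# Recreate the minio-admin secret in one namespace with new credentials.
# Repeat for every namespace holding a copy (secrets are namespace-scoped),
# then restart the affected pods so the env vars are re-read.
kubectl delete secret minio-admin -n trino
kubectl create secret generic minio-admin -n trino \
  --from-literal=user=admin \
  --from-literal=password=newpassword456
```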
A minio user/pass can be used as an s3 key/secret.
Iceberg has two format versions, V1 and V2. V2 is needed for row-level deletes. This is only supported when iceberg uses a hive metastore for the catalog; jdbc/rest will not work.
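If you want to be explicit about the version, recent trino releases expose it as a table property — a sketch reusing the demo schema from above (format_version is the trino iceberg connector's property name):

```sh
# Explicitly request an iceberg V2 table
trino --server http://<external ip of trino>:8080 --catalog iceberg \
  --execute "create table iceberg.demo.two(country varchar, state varchar, population bigint) with (format = 'PARQUET', format_version = 2)"
```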
I've referenced bits and pieces to get this all hooked up together. The repo from alexcpn was helpful for figuring out a lot of the configuration, especially the standalone hive metastore. I've used the image built by alexcpn, as it is heavily inspired by the official iceberg-spark configurations and adds some jars we need for trino/postgres. Checking the dockerfile, I wouldn't do anything different, and this image is already publicly available. This image is used for the metastore, and for initializing the postgres db (via the apache schematool).