RANGER-5310: Include Apache Tez as the process framework for ranger-hive docker #660
base: master
… docker Signed-off-by: Ramesh Mani <[email protected]>
Pull Request Overview
This PR integrates Apache Tez as the processing framework for the ranger-hive Docker setup to enable faster data processing through DAG execution and resolve issues with INSERT commands in beeline.
- Adds Tez binary distribution and configuration files for Hive integration
- Updates Hadoop YARN configuration to support Tez execution
- Creates comprehensive Tez configuration across all Hive database variants
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tez-site.xml | New Tez configuration template with memory and execution settings |
ranger-hive-setup.sh | Adds Tez setup, YARN configuration, and HDFS directory creation |
ranger-hadoop-setup.sh | Enhances YARN configuration and installs Tez JARs for NodeManager |
hive-site-*.xml | Adds Tez execution engine configuration to all database variants |
hive-site-metastore-mysql.xml | New metastore-specific configuration with Tez support |
create-users.sh | New script for creating test users (alice, abram) |
download-archives.sh | Adds Tez binary download support |
docker-compose files | Updates build arguments and environment variables for Tez |
Dockerfiles | Integrates Tez installation and user creation across containers |
.env | Updates Hadoop version compatibility and adds Tez version |
Comments suppressed due to low confidence (1)
dev-support/ranger-docker/.env:1
- The KAFKA_VERSION line appears to be missing after the HIVE_HADOOP_VERSION change. This could break Kafka-related builds that depend on this environment variable.
BUILD_HOST_SRC=true
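A quick way to confirm or dismiss the suppressed comment above is to check the `.env` file for the keys the builds rely on. The following is a minimal sketch, not part of the PR: `missing_env_keys` is a hypothetical helper, and the list of keys to check is an assumption that would need to match what the compose files actually reference.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): print each required key that
# has no "KEY=" line in the given env file.
# Usage: missing_env_keys <env-file> <KEY>...
missing_env_keys() {
  local env_file="$1"
  shift
  local key
  for key in "$@"; do
    # A key counts as present only if a line starts with "KEY=".
    grep -q "^${key}=" "$env_file" || echo "$key"
  done
}
```

For example, `missing_env_keys dev-support/ranger-docker/.env BUILD_HOST_SRC HIVE_HADOOP_VERSION KAFKA_VERSION` prints nothing when all three keys are defined, and prints `KAFKA_VERSION` if that line was dropped.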
@@ -0,0 +1,43 @@
#!/bin/bash
@rameeshm - the following users and groups are created in the ranger-base image, using https://github.com/apache/ranger-tools/blob/main/docker/Dockerfile#L50. It might be useful to add users alice and abram in the ranger-base image itself, so that these users are available in all Ranger images.
users: ranger rangeradmin rangerusersync rangertagsync rangerkms hdfs yarn hive hbase kafka ozone knox
groups: ranger hadoop knox
With this approach, updates to many Dockerfiles in this PR can be eliminated.
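As an illustration of the suggestion, user creation in a single base image could be made idempotent along these lines. This is only a sketch: `ensure_user` is a hypothetical helper, the `hadoop` group assignment for alice and abram is an assumption, and the actual ranger-base Dockerfile may create users differently.

```shell
#!/bin/bash
# Hypothetical helper (not from ranger-base): create a user only if it does
# not already exist. With DRY_RUN=1 the command is printed instead of run.
ensure_user() {
  local user="$1" group="$2"
  if id -u "$user" >/dev/null 2>&1; then
    echo "exists: $user"
  elif [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: useradd -m -g $group $user"
  else
    # -m creates the home directory, -g sets the primary group.
    useradd -m -g "$group" "$user"
  fi
}

# Dry run for the two test users; the "hadoop" group is an assumption.
for u in alice abram; do
  DRY_RUN=1 ensure_user "$u" hadoop
done
```

Running the real command requires root and an existing group, which is why the loop above stays in dry-run mode.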
@mneethiraj Addressed this review comment to use ranger-base image and create users. Thanks
Thank you @rameeshm for the patch. I believe this was tested with the Ubuntu base image; please see if it can be tested with the UBI base image as well. This change needs to be made in …
… docker - changes to use ranger base image for user creation, fix issue with usage of ranger base image in other containers
@kumaab The current patch, with the review comments addressed, was tested with RANGER_BASE_VERSION=[20250712-1-ubi-8]
ARG RANGER_BASE_IMAGE
ARG RANGER_BASE_VERSION
FROM ${RANGER_BASE_IMAGE}:${RANGER_BASE_VERSION}
FROM ranger
Why is it necessary to change the base image (FROM ${RANGER_BASE_IMAGE}) to ranger? ranger is the image containing the Ranger admin server; usersync shouldn't use ranger as the base.
What changes were proposed in this pull request?
Include Apache Tez as the process framework for ranger-hive docker
How was this patch tested?
Tested in Docker by running beeline against HiveServer2 and executing an "INSERT" statement to trigger a Tez DAG.
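A test of this kind could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands actually used for the patch: the JDBC URL, the `hive` login user, and the `tez_smoke_test` table name are all hypothetical. It uses beeline's `-u`, `-n`, and `-f` options to run a small SQL file.

```shell
#!/bin/bash
# Assumptions (not from the PR): HiveServer2 at localhost:10000, login user
# "hive", and a hypothetical table named tez_smoke_test.
JDBC_URL="${JDBC_URL:-jdbc:hive2://localhost:10000}"
HIVE_USER="${HIVE_USER:-hive}"

# Print the SQL for the smoke test. The first statement shows the configured
# execution engine; the INSERT is the statement expected to launch a DAG.
smoke_sql() {
  cat <<'SQL'
SET hive.execution.engine;
CREATE TABLE IF NOT EXISTS tez_smoke_test (id INT, name STRING);
INSERT INTO tez_smoke_test VALUES (1, 'alice');
SELECT COUNT(*) FROM tez_smoke_test;
SQL
}

# Write the SQL to a temporary file and run it through beeline.
run_smoke_test() {
  local sql_file rc
  sql_file="$(mktemp)"
  smoke_sql > "$sql_file"
  beeline -u "$JDBC_URL" -n "$HIVE_USER" -f "$sql_file"
  rc=$?
  rm -f "$sql_file"
  return "$rc"
}
```

Calling `run_smoke_test` inside the ranger-hive container, with `JDBC_URL` adjusted to the actual HiveServer2 address, would exercise the INSERT path described above.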
