[Feature]: Module to support execution for spark programs using testcontainers #5859
hariohmprasath started this conversation in Ideas
That sounds like an interesting idea for a module @hariohmprasath. Have you tried using https://hub.docker.com/r/apache/spark in conjunction with …
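The truncated suggestion presumably refers to Testcontainers' `GenericContainer`. A minimal sketch under that assumption — the image tag, command, and ports are illustrative defaults for a standalone Spark master, not anything confirmed by the thread:

```java
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

public class SparkMasterSketch {
    public static void main(String[] args) {
        // Standalone Spark master from the apache/spark image; 7077 is the
        // master RPC port and 8080 the web UI (standard Spark defaults).
        try (GenericContainer<?> master =
                 new GenericContainer<>(DockerImageName.parse("apache/spark:3.3.0")) // assumed tag
                     .withCommand("/opt/spark/bin/spark-class",
                                  "org.apache.spark.deploy.master.Master")
                     .withExposedPorts(7077, 8080)) {
            master.start();
            String masterUrl = "spark://" + master.getHost() + ":" + master.getMappedPort(7077);
            System.out.println("Submit jobs to " + masterUrl);
        }
    }
}
```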
Module
New Module
Problem
We don't have a simple way to run integration tests for Spark programs. In my case, I either mock them to test the business logic, which defeats the purpose, or end up with a large docker-compose file, manually updating the ports, environment variables, memory, and the number of worker nodes depending on my use case. If a co-worker wants to run the same program, I either need to share the docker-compose file, or we end up setting up a Spark cluster (like EMR) with one of the cloud providers. It would be great if testcontainers could support this use case, so we have a consistently reproducible environment for running distributed Spark programs with custom configurations. I would be happy to contribute this feature if we agree to go with it.
Solution
A new Spark module that would allow users to create custom Spark clusters based on their business needs, roughly as sketched below.
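The module does not exist yet, but what it would encapsulate can be approximated today with two `GenericContainer`s on a shared network. A minimal sketch — the image tag, commands, and ports are assumptions, and a real module would replace this hand wiring with configuration options:

```java
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
import org.testcontainers.utility.DockerImageName;

public class ManualSparkClusterSketch {
    public static void main(String[] args) {
        DockerImageName image = DockerImageName.parse("apache/spark:3.3.0"); // assumed tag
        Network network = Network.newNetwork();

        try (GenericContainer<?> master = new GenericContainer<>(image)
                 .withNetwork(network)
                 .withNetworkAliases("spark-master")
                 .withCommand("/opt/spark/bin/spark-class",
                              "org.apache.spark.deploy.master.Master")
                 .withExposedPorts(7077);
             GenericContainer<?> worker = new GenericContainer<>(image)
                 .withNetwork(network)
                 .withCommand("/opt/spark/bin/spark-class",
                              "org.apache.spark.deploy.worker.Worker",
                              "spark://spark-master:7077")) {
            master.start();
            worker.start();
            // A proper module would bundle this wiring and expose knobs such
            // as worker count and per-worker memory instead of raw containers.
            System.out.println("Cluster master at spark://" + master.getHost()
                               + ":" + master.getMappedPort(7077));
        }
    }
}
```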
Benefit
Here are a few benefits:
Alternatives
Here are a few alternatives that we follow:
A docker-compose.yml file that creates different configurations of the Spark cluster depending on the business needs (see the sketch below).
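The compose-based alternative can itself be driven from Testcontainers via `DockerComposeContainer`. A sketch — the file path and the `spark-master` service name are assumptions about a local compose setup:

```java
import java.io.File;
import org.testcontainers.containers.DockerComposeContainer;

public class ComposeSparkSketch {
    public static void main(String[] args) {
        // "docker-compose.yml" and the "spark-master" service name are
        // assumptions about the team's compose file; 7077 is the standard
        // Spark master port.
        try (DockerComposeContainer compose =
                 new DockerComposeContainer(new File("docker-compose.yml"))
                     .withExposedService("spark-master", 7077)) {
            compose.start();
            String host = compose.getServiceHost("spark-master", 7077);
            Integer port = compose.getServicePort("spark-master", 7077);
            System.out.println("spark://" + host + ":" + port);
        }
    }
}
```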
Would you like to help contribute this feature?
Yes