running two instances of spark-perf ? #91
Comments
Hi @msifalakis,
My hunch is that this is a longstanding bug. It wouldn't surprise me if nobody has tried running two instances of
If you'd like to try to fix this yourself, here are a few starting points:
The right fix is probably to figure out how to only monitor shutdown of executors associated with the previous test run, but this could be tricky to do.
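As a rough illustration of that idea, here is a minimal sketch that waits only for executors belonging to one application rather than for any executor on the node. It is not spark-perf's actual code. It assumes that executor processes can be recognized by `ExecutorBackend` in their command line and that the command line carries an `--app-id <id>` argument; the function names and the `list_ps_lines` callback are hypothetical.

```python
import time

# Assumption: executor JVMs have "ExecutorBackend" somewhere in their command line.
EXECUTOR_MARKER = "ExecutorBackend"


def executors_for_app(ps_lines, app_id):
    """Return the process lines that look like executors of the given application.

    Assumes each executor command line contains "--app-id <id>".
    """
    matches = []
    for line in ps_lines:
        tokens = line.split()
        if not any(EXECUTOR_MARKER in token for token in tokens):
            continue  # not an executor process
        if "--app-id" in tokens:
            idx = tokens.index("--app-id")
            if idx + 1 < len(tokens) and tokens[idx + 1] == app_id:
                matches.append(line)
    return matches


def wait_for_app_executors(list_ps_lines, app_id, poll_seconds=10,
                           max_polls=30, sleep=time.sleep):
    """Poll until no executor of `app_id` is left; ignore other applications.

    `list_ps_lines` is a hypothetical callback returning the current process
    listing (for example, collected over ssh from each slave).
    Returns True once the executors are gone, False if polling gave up.
    """
    for _ in range(max_polls):
        if not executors_for_app(list_ps_lines(), app_id):
            return True
        sleep(poll_seconds)
    return False
```

With a check like this, executors of a second, concurrently running benchmark would not keep the first one waiting, since only lines matching the previous run's application ID are counted.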
Hello Josh,
Thanks for the pointers! Mostly helpful. I will have a look at them. Meanwhile there is another relevant question I have, to which you may be
Thanks again for the pointers and any further suggestions.
Manolis.

From: Josh Rosen
Hi @msifalakis,
spark-perf/lib/sparkperf/cluster.py Line 46 in 79f8cfa
In your case, this message probably came from
I have been running one instance of the spark-perf benchmark, using the core tests and the MLlib tests, on a small-ish cluster (only 4 nodes, yet quite powerful ones: 64 GB RAM, 8 cores each), with scale-factor=1.
The benchmark is occupying just 4 executors (1 core each).
Now I've tried to launch a second, scaled-down (0.1) configuration of the benchmark suite, using only the core tests, at the same time. Although there are both memory and executors/cores available, the benchmark fails to start! Or, more precisely, it fails to engage workers, giving me the following message:
"Spark is still running on some slaves ... sleeping for 10 seconds". That is even though I have set USE_CLUSTER_SPARK = True and RESTART_SPARK_CLUSTER = False, so I guess it tries to use my existing cluster.
On the other hand, if I start a spark-shell or another application, it seems to get admitted just fine!
Any ideas what this means? Given the very spartan information about what the benchmark does/uses, it is rather difficult to know in which direction to start looking.
TIA
Manolis.
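
For reference, the settings described above for the second, scaled-down run might look roughly like the fragment below. `USE_CLUSTER_SPARK` and `RESTART_SPARK_CLUSTER` are the names quoted in the report; `SCALE_FACTOR` is assumed to be the variable behind "scale-factor", and the rest of the configuration file is omitted.

```python
# Hypothetical excerpt of the benchmark configuration for the second run.
USE_CLUSTER_SPARK = True        # use the already-running cluster
RESTART_SPARK_CLUSTER = False   # do not stop/start the cluster between tests
SCALE_FACTOR = 0.1              # assumed name for the "scale-factor" setting
```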