Deprecation of lxplus7
#3730
Comments
I am also not so sure whether we could have a dirty workaround by setting
Hi @DickyChant, thanks a lot for these checks! I think another solution is to use https://gitlab.cern.ch/cms-cat/cmssw-lxplus/, which emulates lxplus7 with condor support.
There is also a slightly modified version of the above (https://gitlab.cern.ch/lviliani/mg_cmssw_docker/) that I started working on. It also includes genproductions, with the idea of having a container with everything we need to run gridpacks, but it's just a preliminary test for now.
Thanks for the heads up! Is the container from CAT unpacked to cvmfs? If so, that's a nice addition! I suffered a lot getting mine set up at lxplus… One worrisome thing is support for the condor Python API: if it requires a strict IP address check, problems are inevitable when using a container. (I never checked this for singularity, but for docker it is a well-known mess. I never thought I'd experience something similar, because I decided to avoid docker as much as possible…)
Having a container setup is actually nice. I started with the dask-lxplus container because I thought it would have better Python API support and be usable out of the box, but then I had to add a full set of dependencies copied from the CMSSW containers… Having genproductions as part of it is actually not a bad idea, though I am afraid that we need at least two things: 1: for NLO we often have libraries compiled and installed on the fly, which does seem ridiculous because we basically lose many of the benefits of using a container. Therefore I believe Dominic's new PR should come before this!
@sihyunjeon I thought you told me about making a release of genproductions. Maybe instead of making a legacy form of release, i.e. a code tarball of genproductions, we could release containers through ghcr. Getting a release would require at least downloading and untarring/unzipping it, whereas a published container seems easier to access and to maintain as well (I could imagine that a natural setup is to have a CI job build the container and a follow-up CI job test whether it is usable).
Yes, the container is unpacked on cvmfs: I agree with you in case we decide to include genproductions. The nice thing about such a container is that it can also easily be used within a CI job to produce gridpacks using gitlab runners.
Actually, I tested my container on a desktop that I had at CERN. Since it is within the CERN network, it has access to the CERN htcondor schedds and ran condor_q successfully, so in principle it should also be able to submit condor jobs. Basically, dask-lxplus does the same thing as instructed in the CERN ABP twiki, especially the part on how to get a local htcondor setup that can access the CERN htcondor pool. After a quick glance, I think CAT's image does the same.
Now, if we think of the powheg CI jobs in this very repo, which depend on a VM from CERN OpenStack (I guess @mseidel42 knows more details), such a VM would have afs as long as we activate it via locmaps, as well as the CERN internal web environment. So I do not see a technical issue with a CI job being able to use htcondor, if we make the interactive solution work at lxplus. This should not be technically impossible if we could get an account that has condor authorities like the pdmvserv account. Note that you can basically achieve the same thing with reana.
And for sure there is no issue with doing it in CI. In fact, I would imagine our common background team would benefit more, since what has been done there is basically also a CI: new cards are pushed first, and then a machine picks them up and executes them by submitting condor jobs.
Right, technically it is indeed possible. We just have to figure out some details in case we want to do that.
I could give a short report in GEN some time to show how to scale up to 1000 jobs at reana with gitlab CI in the context of tuning, but here is a spoiler: it doesn't scale up well. We've been trying to fine-tune it (with @sihyunjeon and @shimashimarin), and I won't waste this chance to comment on its super inconvenient condor submission, which requires you to upload a krb5 keytab by hand!
Today I realized that lxplus7 is no longer there...
For MadGraph gridpack generation, the issue is that we need to set up a CMSSW as the working environment on the fly, which means
genproductions/bin/MadGraph5_aMCatNLO/gridpack_generation.sh
Line 429 in 3c15d3b
I have 3 solutions in mind right now:
Those are the options that I feel are feasible (some are already available, some need a little bit of work), but I'd like to go with the recommendation from GEN, since some of them do not really fit the roadmap.
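To make the OS dependency a bit more concrete, here is a minimal sketch of how a wrapper could decide whether the host can run the gridpack setup natively or needs an lxplus7-like container. It assumes the gridpack environment expects an EL7 (CentOS 7) host, as lxplus7 was, and that the host exposes the standard `VERSION_ID` field in `/etc/os-release`. The function names `el_major_version` and `needs_el7_container` are hypothetical and not part of genproductions.

```python
import re


def el_major_version(os_release_text):
    """Return the major version from the VERSION_ID field of os-release text.

    Returns None when no VERSION_ID line is present.
    """
    match = re.search(r'^VERSION_ID="?(\d+)', os_release_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def needs_el7_container(os_release_text):
    """Return True when the host is not EL7-like.

    Assumption (not confirmed by the thread): the gridpack environment
    expects an EL7 host, so any other or unknown OS needs a container.
    """
    return el_major_version(os_release_text) != 7
```

On a real machine one would pass in the contents of `/etc/os-release`, e.g. `needs_el7_container(open("/etc/os-release").read())`. Which container to start when this returns `True` is left open here, since that is exactly the choice under discussion.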