Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Singularity 4 OCI : fix for working directory #4484

Merged
merged 21 commits into from
Nov 8, 2023

Conversation

marcodelapierre
Copy link
Member

Fix for OCI behaviour for current directory, including unit tests

Tested with rnaseq-nf and blast-example, both in OCI and standard modes

pditommaso and others added 15 commits October 25, 2023 11:58
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
@pditommaso
Copy link
Member

Not sure to understand what this solves? In my tests doesn't look it is needed

@marcodelapierre
Copy link
Member Author

Practical example: try and run the pipeline nextflow-io/blast-example, and without this fix you would get an error.

Root cause, the conditional to env in this block of code:

if( containerBuilder ) {
String cmd = env ? 'eval $(nxf_container_env); ' + launcher : launcher
if( env && !containerConfig.entrypointOverride() ) {
if( containerBuilder instanceof SingularityBuilder && !containerConfig.singularityOciMode() )
cmd = 'cd $PWD; ' + cmd
cmd = "/bin/bash -c \"$cmd\""
}
launcher = containerBuilder.getRunCommand(cmd)
}

Here cd $PWD is conditional to the value of env, which can be true or false depending on the pipeline.

Why does this happen in OCI mode? Because in this mode the work dir for the container is not the host PWD any more, similar to Docker. To compensate for this behavioural change, devs have introduced the --pwd <DIR>/--cwd <DIR> options.

This is what I have implemented in this PR.

@pditommaso
Copy link
Member

pditommaso commented Nov 6, 2023

Interesting. so this is essentially the same as -w in docker.

If so, I do not understand why most of the pipeline works. Also, instead of introducing a OCI specific flag, it may be preferable to keep the cd <dir> trick and use NXF_TASK_WORKDIR instead that hold the task work directory.

Third (unrelated this PR changes), not understanding why in the condition if( env && !containerConfig.entrypointOverride() ) there's the env check (I wonder if this may be a bug).

@marcodelapierre
Copy link
Member Author

These 4 CI errors are all of the same kind:

https://github.com/nextflow-io/nextflow/actions/runs/6767234982/job/18389698270

They come out of the execution of multiqc in the test, which fails with:

ModuleNotFoundError: No module named 'typing_extensions'`, size: 705 (max: 255)

@pditommaso thoughts?

@marcodelapierre
Copy link
Member Author

Interesting. so this is essentially the same as -w in docker.

exactly

If so, I do not understand why most of the pipeline works. Also, instead of introducing a OCI specific flag, it may be preferable to keep the cd <dir> trick and use NXF_TASK_WORKDIR instead that hold the task work directory.

Singularity always changes dir to the host current dir by default, and so cd $PWD is typically not needed.
To be honest it is not clear to me why that cd $PWD is in the codebase at all, i.e. I cannot imagine when it was needed before OCI mode.

Why nextflow-io/rnaseq-nf works and nextflow-io/blast-nf does not?
Because in the former env is true, which happens when the pipeline has a bin/ subdirectory. So, to my eyes it essentially works by chance (or, for the wrong reason) in OCI mode.

There is a problem with cd $PWD. The conditional block for env && !containerConfig.entrypointOverride() also does cmd = "/bin/bash -c \"$cmd\"" . Now, when env is false:

/bin/bash -ue <task-dir> /.command.sh

whereas when env is true:

/bin/bash -c "cd $PWD; eval $(nxf_container_env); /bin/bash -ue <task-dir>/.command.sh"

The problem is, when env is false and there is no wrapping bash -c " .. ", we cannot just add cd $PWD ; , otherwise the following command will not be executed by singularity, but by the host instead. Using cd $PWD requires the wrapper bash -c, which is conditional.
And even if we get rid of the condition on env, there is still the one on containerConfig.entrypointOverride().

That's why in the case of oci I am using the new --pwd flag, to ensure the right work dir is picked unconditionally.

In all of this, I myself have no understanding of how those conditionals were born, hence I went for the most conservative change, that would not break previous runs while being able to execute the OCI mode correctly.
Before simplifying the conditionals I think it would be wise to track back their origins.

Third (unrelated this PR changes), not understanding why in the condition if( env && !containerConfig.entrypointOverride() ) there's the env check (I wonder if this may be a bug).

as above, I myself have no idea of the origin of it.

@marcodelapierre
Copy link
Member Author

Oh I forgot,

it may be preferable to [...] use NXF_TASK_WORKDIR instead that hold the task work directory.

    export NXF_TASK_WORKDIR="$PWD"
    nxf_stage

    set +e
    (set -o pipefail; (nxf_launch | tee .command.out) 3>&1 1>&2 2>&3 | tee .command.err) &

NXF_TASK_WORKDIR is defined as $PWD a couple of lines before nxf_launch being executed.
Substituting $PWD with NXF_TASK_WORKDIR seems doable and safe.

@pditommaso
Copy link
Member

Substituting $PWD with NXF_TASK_WORKDIR seems doable and safe

Let's go for this. It's more self-documenting what's needed for

@pditommaso
Copy link
Member

pditommaso commented Nov 6, 2023

In all of this, I myself have no understanding of how those conditionals were born, hence I went for the most conservative change

Indeed. However I think the overall idea was that this bash is only needed to evaluate the container env.

But, later the cd $PWD was added.

Maybe this can be more consistent

            final isSingularity = containerBuilder instanceof SingularityBuilder
            if( (env || isSingularity) && !containerConfig.entrypointOverride() ) {
                if( isSingularity )
                    cmd = 'cd $NXF_TASK_WORKDIR; ' + cmd
                cmd = "/bin/bash -c \"$cmd\""
            }

@marcodelapierre
Copy link
Member Author

marcodelapierre commented Nov 6, 2023

While implementing the change, I realised why the conditional has env .. exactly for the same reason why we need it now for OCI. env results in

String cmd = env ? 'eval $(nxf_container_env); ' + launcher : launcher

and, as above, for both commands to execute within the containers, the bash wrapping is required!

I have paraphrased your suggestion to make this reasoning visible from the code itself:

[edit]: nope this is wrong, it changes the logic! reverted to your syntax

            String cmd = env ? 'eval $(nxf_container_env); ' + launcher : launcher

            final isSingularity = containerBuilder instanceof SingularityBuilder
            cmd = isSingularity ? 'cd $NXF_TASK_WORKDIR; ' + cmd : cmd

            if( (env || isSingularity) && !containerConfig.entrypointOverride() ) {
                cmd = "/bin/bash -c \"$cmd\""
            }

…D changed to NXF_TASK_WORKDIR

Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
@marcodelapierre
Copy link
Member Author

I have also found the cause of the failing CI tests.... the rnaseq-nf container

https://github.com/nextflow-io/rnaseq-nf/edit/master/docker/Dockerfile

unfortunately when installing with micromamba, there are some issues with the installation with multiqc: nextflow-io/rnaseq-nf#22

@pditommaso
Copy link
Member

Yeah, it seems multiqc, but @ewels was mentioning version 1.17 should solve

…ated flags

Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
@marcodelapierre
Copy link
Member Author

OK @pditommaso , last commit, to include all substitutions of PWD into NXF_TASK_WORKDIR for container related flags.

Done cum grano salis .

make test locally successful.

@marcodelapierre
Copy link
Member Author

This one is ready for review.

Two sets of changes:

  • fix for working direectory (key)
  • replacement of PWD into NXF_TASK_WORKDIR (does not change any behaviour)

Base automatically changed from singularity-oci to master November 6, 2023 17:50
Signed-off-by: Paolo Di Tommaso <[email protected]>
Copy link

netlify bot commented Nov 6, 2023

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 0585bad
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/654baf3d64fe9200085fbd66

Signed-off-by: Paolo Di Tommaso <[email protected]>
@marcodelapierre marcodelapierre self-assigned this Nov 7, 2023
@marcodelapierre marcodelapierre added this to the 23.11.0-edge milestone Nov 7, 2023
@pditommaso pditommaso merged commit 48ee3c6 into master Nov 8, 2023
19 checks passed
@pditommaso pditommaso deleted the singularity-oci-pwd-fix branch November 8, 2023 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants