Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

adegomme · 2019-06-28T11:09:02Z

I'm getting
engine/daemon/execmanager.py", line 464, in retrieve_files_from_list
to_append = remote_names.split(os.path.sep)[-depth:] if depth > 0 else []
AttributeError: 'list' object has no attribute 'split'

when I try to retrieve the content of a folder over ssh. It seems that the folder name is turned into a list here:
https://github.com/aiidateam/aiida_core/blob/e232f946e2b5c1c55c8f8b2c903f05355fe9deea/aiida/engine/daemon/execmanager.py#L463
and then immediately fed to split, which only operates on strings.
Removing the [] around remote_names seems to allow it to go further. Is this correct ?

sphuber · 2019-06-28T11:31:35Z

Thanks for the report. This definitely seems like a bug. Just for reference, what are the contents of your the retrieve_list? Can you please lode the calculation node that is excepting in a shell and report the output of the following:

node = load_node(<pk>)  # Replace the pk here with the pk of the excepted calculation
print(node.get_retrieve_list())
print(node.get_retrieve_temporary_list())

adegomme · 2019-06-28T11:49:45Z

retrieve list was
[u'log.yaml', u'time.yaml', u'forces*', u'final*', [u'./debug', u'bigdft-err*', 1], u'_scheduler-stdout.txt', u'_scheduler-stderr.txt']
(the debug folder fails)

I'm able to avoid entering this branch by putting the wildcards in the first item of the tuple, as stated in the documentation ( ["./debug/bigdft-err*",".",1] ), but the bug is still there in the code, I'm just avoiding it.

sphuber · 2019-06-28T12:02:15Z

Thanks for the update. You are right that moving the wildcard to the first element will avoid the bug because then you do not hit the branch in the if transport.has_magic(tmp_rname): branch. However, the second element should not contain wildcards.

The format of the retrieve_list is not particularly well documented, I admit, but it should adhere to the following rules, as per the aiida.common.datastructures.CalcInfo docstring:

`retrieve_list`: a list of strings or tuples that indicate files that are to be retrieved from the remote
after the calculation has finished and stored in the repository in a FolderData.
If the entry in the list is just a string, it is assumed to be the filepath on the remote and it will
be copied to '.' of the repository with name os.path.split(item)[1]
If the entry is a tuple it is expected to have the following format

    ('remotepath', 'localpath', depth)

If the 'remotepath' is a file or folder, it will be copied in the repository to 'localpath'.
However, if the 'remotepath' contains file patterns with wildcards, the 'localpath' should be set to '.'
and the depth parameter should be an integer that decides the localname. The 'remotepath' will be split on
file separators and the local filename will be determined by joining the N last elements, where N is
given by the depth variable.

Example: ('some/remote/path/files/pattern*[0-9].xml', '.', 2)

Will result in all files that match the pattern to be copied to the local repository with path

    'files/pattern*[0-9].xml'

So the directive [u'./debug', u'bigdft-err*', 1] is not valid. The first element should refer to the remote file path. I assume, the remote folder will contain multiple bigdft-err* files that you would like to retrieve. This should be then the first element. Per the docstring, the second element should then be .. Most likely then, your directive should be ('bigdft-err*', '.', 0).

adegomme · 2019-06-28T12:20:54Z

Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now.
But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.

sphuber · 2019-06-28T12:35:39Z

Indeed, I agree in the wildcard case I was trying, and this will be enough for me to go on from now.
But that doesn't change the fact that the other branch creates a list and immediately tries to split() it (if depth>0), which will never work. It crashes if I set ["a/b", "c", 1], as well.

Absolutely, I will keep this ticket open so a patch can be submitted. Thanks again for the report

sphuber · 2019-07-01T16:42:55Z

By the way, for posterity and a laugh: you have unearthed a bug that was introduced in this commit and has been there for 6 years, since version v0.2.1. Well done 😄

sphuber · 2020-07-20T13:55:59Z

I actually fixed this "accidentally" in PR #4196 . In that PR I removed all remaining files from the pre-commit exclude list, among which was the file that causes the bug describes here. The bug was actually found by pylint and so I fixed it. Of course I didn't think straight away of this issue but only just now it snapped into place. Shows the value of linting and how it can make up (a bit) for bad test coverage. The changes of #4196 will be merged in the next release so I am closing this. I found another bug in the same branch of code, that is due to another reason, for which I have opened #4273

Edt: oh dear lord, as it turns out, in the ultimate irony, the change in #4196 that fixed this bug, is the very reason of #4273 's existence. Time to add some badly needed tests there I guess.

sphuber added aiida-core 1.x priority/important topic/calc-jobs topic/engine topic/transports type/bug labels Jun 28, 2019

sphuber changed the title ~~retrieved folder over ssh failure~~ Bug in aiida.engine.daemon.execmanager.retrieve_files_from_list Jun 28, 2019

sphuber removed the aiida-core 1.x label Jun 8, 2020

sphuber added this to the v1.3.1 milestone Jul 20, 2020

sphuber self-assigned this Jul 20, 2020

sphuber closed this as completed Jul 20, 2020

sphuber modified the milestones: v1.3.1, v1.3.2 Aug 27, 2020

sphuber modified the milestones: v1.3.2, v1.4.0 Sep 24, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

sphuber commented Jul 1, 2019 •

edited by CasperWA

Loading

sphuber commented Jul 20, 2020 •

edited

Loading

Bug in aiida.engine.daemon.execmanager.retrieve_files_from_list #3095

Bug in aiida.engine.daemon.execmanager.retrieve_files_from_list #3095

Comments

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

adegomme commented Jun 28, 2019

sphuber commented Jun 28, 2019

sphuber commented Jul 1, 2019 • edited by CasperWA Loading

sphuber commented Jul 20, 2020 • edited Loading

Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

Bug in `aiida.engine.daemon.execmanager.retrieve_files_from_list` #3095

sphuber commented Jul 1, 2019 •

edited by CasperWA

Loading

sphuber commented Jul 20, 2020 •

edited

Loading