Log Directory Issue in wrap_experiment #2302

Open
ahalev opened this issue Sep 30, 2021 · 4 comments

Comments

@ahalev
Contributor

ahalev commented Sep 30, 2021

There appears to be a bug in wrap_experiment: make_launcher_archive, the function that creates an archive of the launcher's git repo, fails.

Traceback:

tar (child): data/local/experiment/PONGNoFrameskip-v4_2/launch_archive.tar.xz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: data/local/experiment/PONGNoFrameskip-v4_2/launch_archive.tar.xz: Cannot write: Broken pipe
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Traceback (most recent call last):
  File "/home/ahalev/.conda/envs/remote_env/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ahalev/.conda/envs/remote_env/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ahalev/.conda/envs/remote_env/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ahalev/repos/garage/src/garage/examples/torch/dqn_atari.py", line 103, in main
    **hyperparams)
  File "/home/ahalev/repos/garage/src/garage/experiment/experiment.py", line 368, in __call__
    ctxt = self._make_context(self._get_options(*args), **kwargs)
  File "/home/ahalev/repos/garage/src/garage/experiment/experiment.py", line 324, in _make_context
    make_launcher_archive(git_root_path=git_root_path, log_dir=log_dir)
  File "/home/ahalev/repos/garage/src/garage/experiment/experiment.py", line 559, in make_launcher_archive
    check=True)
  File "/home/ahalev/.conda/envs/remote_env/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '('tar', '--null', '--files-from', '-', '--xz', '--create', '--file', 'data/local/experiment/PONGNoFrameskip-v4_2/launch_archive.tar.xz')' returned non-zero exit status 2.
python-BaseException

Steps to reproduce:

Run

python examples/torch/dqn_atari.py PONG

My garage package info:

Metadata-Version: 2.1
Name: garage
Version: 2020.9.0rc2.dev0

@krzentner
Contributor

That's strange. It might be best to change the default value of archive_launch_repo to False.
I imagine this could be caused by the data directory not being writable, or perhaps by tar not being able to use xz.
The tar command itself looks correct, although it's also conceivable that git produced an unusual file list.
There are several tests that check that this feature works, so it should work in this case.

@ahalev
Contributor Author

ahalev commented Sep 30, 2021

It's writable as far as I can tell -- if I run

if [ -w "$(pwd)" ]; then echo "WRITABLE"; else echo "NOT WRITABLE"; fi

in $git_root_path$/data/local/experiment
it spits out WRITABLE.

Not sure how to check whether tar can use xz.
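
One way to check this, as a minimal sketch: ask tar to build a small .tar.xz archive in a temporary directory and see whether the command succeeds. The helper name tar_supports_xz is hypothetical and not part of garage; the tar options used are the same --xz, --create and --file options that appear in the failing command above, plus -C to change directory.

```python
import os
import shutil
import subprocess
import tempfile


def tar_supports_xz():
    """Return True if `tar --xz --create` succeeds on a tiny probe file.

    Hypothetical helper for illustration only; not part of garage.
    """
    # Without a tar executable on PATH there is nothing to test.
    if shutil.which("tar") is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        # Create a one-line file to put into the archive.
        with open(os.path.join(tmp, "probe.txt"), "w") as f:
            f.write("probe\n")
        archive = os.path.join(tmp, "probe.tar.xz")
        # Archive only probe.txt, using tmp as the working directory for tar.
        result = subprocess.run(
            ["tar", "--xz", "--create", "--file", archive,
             "-C", tmp, "probe.txt"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)
        return result.returncode == 0 and os.path.exists(archive)


if __name__ == "__main__":
    print("tar can use xz:", tar_supports_xz())
```

If this prints False while tar itself is installed, a missing or broken xz is a plausible cause; if it prints True, the failure is more likely related to the archive path.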

@krzentner
Contributor

Oh, I see what happened. wrap_experiment expects log_dir to be an absolute path (and sets it to an absolute path by default), but this example explicitly sets it to a relative path. The tar command is always run from the git repo root, so if the example is run from inside a git repo but not from its root, the log directory doesn't exist relative to the root and the tar command fails.
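
The mismatch can be illustrated with a small sketch that does not involve garage or tar at all. It assumes a hypothetical layout in which the script is launched from an examples subdirectory of the repo; the directory names and the helper log_dir_visible_from are made up for the illustration.

```python
import os
import tempfile


def log_dir_visible_from(cwd, relative_log_dir):
    """Return True if relative_log_dir exists when resolved against cwd."""
    return os.path.isdir(os.path.join(cwd, relative_log_dir))


with tempfile.TemporaryDirectory() as repo_root:
    # Hypothetical directory the script was launched from (not the repo root).
    launch_dir = os.path.join(repo_root, "examples")
    rel_log_dir = os.path.join("data", "local", "experiment", "run_1")

    # The relative log directory is created relative to the launch directory.
    os.makedirs(os.path.join(launch_dir, rel_log_dir))

    # The same relative path is then interpreted from two different places.
    seen_from_launch_dir = log_dir_visible_from(launch_dir, rel_log_dir)
    seen_from_repo_root = log_dir_visible_from(repo_root, rel_log_dir)

print(seen_from_launch_dir, seen_from_repo_root)  # True False
```

A process whose working directory is the repo root, as the tar command's is here, resolves the relative path to a location that was never created, which matches the "Cannot open: No such file or directory" message in the traceback.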

ExperimentWrapper should probably always convert log_dir to an absolute path when it is given a relative one.
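
A minimal sketch of that kind of normalization, assuming the relative path should be resolved against the working directory at launch time. The function name resolve_log_dir and its base_dir parameter are hypothetical; this is not the garage implementation or an actual patch.

```python
import os


def resolve_log_dir(log_dir, base_dir=None):
    """Return log_dir as a normalized absolute path.

    Hypothetical helper for illustration only; not part of garage.
    A relative log_dir is resolved against base_dir, which defaults to
    the current working directory at call time.
    """
    if os.path.isabs(log_dir):
        return os.path.normpath(log_dir)
    if base_dir is None:
        base_dir = os.getcwd()
    return os.path.normpath(os.path.join(base_dir, log_dir))
```

With the path made absolute before any subprocess is started, a command that runs with a different working directory (such as the repo root) would still refer to the directory that was actually created.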

@ahalev
Contributor Author

ahalev commented Oct 1, 2021

Yes, it works with archive_launch_repo=False. Bizarrely, I copied the entire content of dqn_atari.py to a different folder outside of the repo and it works there with archive_launch_repo=True.
