SheepRL Dreamer v3 - ValueError #309

Open
ogulcankertmen opened this issue Jul 7, 2024 · 3 comments

Comments

@ogulcankertmen

I ran "sheeprl exp=dreamer_v3 env=gym env.id=CartPole-v1" and got the following error:

"ValueError:
you tried to log -1 which is currently not supported. Try a dict or a scalar/tensor.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace."

I couldn't solve this problem. Do you have any suggestions?

@belerico
Member

Hi @ogulcankertmen, I ran the exact same command on the main branch on my machine and training runs fine. Can you please share more information about the error, ideally the entire stack trace?

@ogulcankertmen
Author

ogulcankertmen commented Jul 11, 2024

@belerico Here is the stack trace:

C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\sheeprl\utils\logger.py:22: UserWarning: The specified root directory for the TensorBoardLogger is different from the experiment one, so the logger one will be ignored and replaced with the experiment root directory
  warnings.warn(
2024-07-11 11:20:41.583052: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-11 11:20:43.817109: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
Error executing job with overrides: ['exp=dreamer_v3', 'env=gym', 'env.id=CartPole-v1']
Traceback (most recent call last):
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\loggers\tensorboard.py", line 215, in log_metrics
    self.experiment.add_scalar(k, v, step)
    ^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\loggers\logger.py", line 118, in experiment
    return fn(self)
           ^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\loggers\tensorboard.py", line 197, in experiment
    self._experiment = SummaryWriter(log_dir=self.log_dir, **self._kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\tensorboard\writer.py", line 249, in __init__
    self._get_file_writer()
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\tensorboard\writer.py", line 281, in _get_file_writer
    self.file_writer = FileWriter(
                       ^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\tensorboard\writer.py", line 75, in __init__
    self.event_writer = EventFileWriter(
                        ^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorboard\summary\writer\event_file_writer.py", line 72, in __init__
    tf.io.gfile.makedirs(logdir)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\lib\io\file_io.py", line 513, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: logs\runs\dreamer_v3/CartPole-v1\2024-07-11_11-20-40_dreamer_v3_CartPole-v1_42 is not a directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\sheeprl\cli.py", line 366, in run
    run_algorithm(cfg)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\sheeprl\cli.py", line 199, in run_algorithm
    fabric.launch(reproducible(command), cfg, **kwargs)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\fabric.py", line 845, in launch
    return self._wrap_and_launch(function, self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\fabric.py", line 931, in _wrap_and_launch
    return to_run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\fabric.py", line 936, in _wrap_with_setup
    return to_run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\sheeprl\cli.py", line 195, in wrapper
    return func(fabric, cfg, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\sheeprl\algos\dreamer_v3\dreamer_v3.py", line 379, in main
    fabric.logger.log_hyperparams(cfg)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\utilities\rank_zero.py", line 70, in wrapped_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\loggers\tensorboard.py", line 249, in log_hyperparams
    self.log_metrics(metrics, 0)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\utilities\rank_zero.py", line 70, in wrapped_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning\fabric\loggers\tensorboard.py", line 218, in log_metrics
    raise ValueError(
ValueError:
 you tried to log -1 which is currently not supported. Try a dict or a scalar/tensor.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

There is config information on them.

@belerico
Member

belerico commented Jul 12, 2024

Hi @ogulcankertmen, I tried on my Windows machine and the error does not occur, so I'm not able to reproduce it.
Could you please also share your environment?
I see from your error that the log_dir path has mixed separators. I've created a branch where we normalize the separators on Windows. Can you try it?
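
For illustration, separator normalization could look like the minimal sketch below. This is not the code from the branch: the helper name `normalize_windows_path` is hypothetical, and `ntpath` is used only so that the Windows rules apply on any OS (on Windows itself, `os.path.normpath` behaves the same way).

```python
import ntpath


def normalize_windows_path(path: str) -> str:
    """Hypothetical helper: apply Windows path rules to `path`,
    converting every forward slash into a backslash."""
    return ntpath.normpath(path)


# The mixed-separator log directory reported in the traceback.
mixed = r"logs\runs\dreamer_v3/CartPole-v1\2024-07-11_11-20-40_dreamer_v3_CartPole-v1_42"
normalized = normalize_windows_path(mixed)
print(normalized)
```

After normalization the path contains only backslashes, so every component is interpreted the same way by whichever filesystem layer ends up creating the directory.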
Also, why is torch's TensorBoard writer calling TensorFlow to create the log directories?
I'm referring to these lines:

  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\tensorboard\writer.py", line 75, in __init__
    self.event_writer = EventFileWriter(
                        ^^^^^^^^^^^^^^^^
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorboard\summary\writer\event_file_writer.py", line 72, in __init__
    tf.io.gfile.makedirs(logdir)
  File "C:\Users\Oğulcan\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\lib\io\file_io.py", line 513, in recursive_create_dir_v2
    _pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))

What happens if you uninstall TensorFlow?
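
The traceback suggests that TensorBoard's event writer goes through TensorFlow's `tf.io.gfile` when TensorFlow is importable in the same environment. A quick way to check which of these packages are present is the sketch below. It uses only the standard library, and the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the packages seen in the traceback are available.
for name in ("tensorflow", "tensorboard", "torch"):
    status = "installed" if is_installed(name) else "not installed"
    print(f"{name}: {status}")
```

If `tensorflow` shows up as installed, uninstalling it (or testing in a clean virtual environment without it) is a cheap way to see whether the directory-creation error goes away.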

A similar issue: tensorflow/tensorflow#60682 (comment)
