Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: The server socket has failed to listen on any local network address. useIpv6: 0, code: -48, name: EADDRINUSE, message: address already in use #210

Open
rtrad89 opened this issue Nov 7, 2024 · 0 comments

Comments

@rtrad89
Copy link

rtrad89 commented Nov 7, 2024

Trying to follow the README on M1 chip, but receiving the error in the title

W1107 10:04:31.990000 8471214144 torch/distributed/elastic/multiprocessing/redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "/Users/myuser/.pyenv-3-12/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 255, in launch_agent
    result = agent.run()
             ^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/metrics/api.py", line 124, in wrapper
    result = f(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/agent/server/api.py", line 680, in run
    result = self._invoke_run(role)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/agent/server/api.py", line 829, in _invoke_run
    self._initialize_workers(self._worker_group)
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/metrics/api.py", line 124, in wrapper
    result = f(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/agent/server/api.py", line 652, in _initialize_workers
    self._rendezvous(worker_group)
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/metrics/api.py", line 124, in wrapper
    result = f(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/agent/server/api.py", line 489, in _rendezvous
    rdzv_info = spec.rdzv_handler.next_rendezvous()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/.pyenv-3-12/lib/python3.12/site-packages/torch/distributed/elastic/rendezvous/static_tcp_rendezvous.py", line 66, in next_rendezvous
    self._store = TCPStore(  # type: ignore[call-arg]
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The server socket has failed to listen on any local network address. useIpv6: 0, code: -48, name: EADDRINUSE, message: address already in use
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant