Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Fix: Update the port scanning logic to properly skip ports in use #6560

Merged
merged 2 commits into from
Aug 29, 2023

Conversation

rileyajones
Copy link
Contributor

@rileyajones rileyajones commented Aug 28, 2023

Motivation for features / changes

#6552

Technical description of changes

The port scanning logic originally added in #1851 does not seem to work anymore. WerkzeugServer no longer raises an Error when attempting to bind to a port which is in used and instead handles the issue silently.

The current Werkzeug version is 2.3.7 but I had 2.3.4 installed when I first verified this. I tried installing 2.2.0 and the issues was not present so I assume it was introduced in 2.3.0.
https://pypi.org/project/Werkzeug

Detailed steps to verify changes work correctly (as executed by you)

  1. Start TensorBoard, note the port it is bound to (it should be 6006)
  2. In a separate terminal start TensorBoard again.
  3. TensorBoard should successfully bind to a different port (this time 6007)

@rileyajones rileyajones requested a review from nfelt August 28, 2023 19:35
@rileyajones rileyajones marked this pull request as ready for review August 28, 2023 20:22
Copy link
Contributor

@nfelt nfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing this! I wonder a little whether this may be quite slow in some cases (e.g. if connecting to the "test" socket will time out or otherwise fail in a less-than-clean way) but I suppose users can always pass a hardcoded port or script it (as you had in the issue) as a workaround.

The current Werkzeug version is 3.3.7

Just checking, I assume you mean 2.3.7? (that's the latest I can find)

I looked at the Werkzeug code and this commit looks like probably the culprit (moves the sys.exit() call introduced in this commit from a codepath I believe we don't call into one that we do).

Presumably Werkzeug is fine with this change being unfriendly to our use case, since they don't want people using their WSGI server for non-development purposes anyway 😒

for port in range(base_port, base_port + max_attempts):
subflags = argparse.Namespace(**vars(flags))
subflags.port = port
if is_port_in_use(port):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing the port-in-use check here won't respect the should_scan logic (based on flags.port).

Can we instead move it closer to where the logic previously happened (when binding to the socket would raise the exception)? I.e. move it into WerkzeugServer.__init__() before the call to the superclass init (which is where we'll attempt binding the socket) and make it raise TensorBoardPortInUseError, which will then trigger the existing except logic on the lines below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This portion of the code is a bit complicated so while I think I've made the suggested change, please let me know if I've missed something.

@rileyajones rileyajones requested a review from nfelt August 29, 2023 01:02
@rileyajones
Copy link
Contributor Author

rileyajones commented Aug 29, 2023

Just checking, I assume you mean 2.3.7? (that's the latest I can find)

I do mean 2.3.7, I've update the description.

@rileyajones rileyajones merged commit 26f5e33 into tensorflow:master Aug 29, 2023
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants