Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-38096: [Java] FlightStream with metadata can cause error when closing #38110

Merged
merged 5 commits into from
Oct 12, 2023

Conversation

BryanCutler
Copy link
Member

@BryanCutler BryanCutler commented Oct 6, 2023

Rationale for this change

The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice.

java.lang.IllegalStateException: RefCnt has gone negative
	at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:130)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:104)
	at org.apache.arrow.memory.ArrowBuf.close(ArrowBuf.java:1044)
	at org.apache.arrow.util.AutoCloseables.close(AutoCloseables.java:97)
	at org.apache.arrow.flight.FlightStream.close(FlightStream.java:208)

What changes are included in this PR?

When FlightStream is closed, remove any reference of previous metadata to prevent reference count going negative if closed again. Also added ExchangeReaderWriter.getResult() for convenience and clear up ambiguity on error handling.

Are these changes tested?

Unit tests added for closing with metadata and

Are there any user-facing changes?

Added ExchangeReaderWriter.getResult()

@BryanCutler
Copy link
Member Author

@lidavidm I was playing around with the DoExchange call and noticed a couple of issues. At first I was a little confused as to why getWriter().getResult() wasn't being called until I realized that errors were coming through the reader. If getWriter().getResult() is called before getReader().next() is called a final time, then it seems to hang.

So I noticed that you really need to call getReader().next() a final time, so it will return false or raise any errors that happened after exchanging all the batches. To hopefully make this more clear to the user I added ExchangeReaderWriter.getResult() so it will be obvious to call this instead. I'm not 100% sure that it's a good idea, so let me know what you think. Thanks!

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just some nits

*/
public void getResult() {
// After exchange is complete, make sure stream is drained to propagate errors through reader
reader.next();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly loop while reader.next()?

@@ -207,6 +207,8 @@ public void close() throws Exception {
} else {
AutoCloseables.close(closeables);
}
// Remove any metadata after closing to prevent negative refcnt
applicationMetadata = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go in the finally instead? I suppose if this throws the application is unlikely to close again.

Copy link
Member Author

@BryanCutler BryanCutler Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it here because if something throws and error before this, it's likely the buffer was not closed then. So at least this way, the buffer could be examined, but either way there is probably some resources leaked.

// After exchange is complete, make sure stream is drained to propagate errors through reader
reader.next();
}

/** Shut down the streams in this call. */
@Override
public void close() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we drain the stream in close as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is drained already in the call to reader.close(). From what I can tell, it will suppress any exception though. So probably better off draining with the call to getResult() before closing to handle any errors, wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@lidavidm
Copy link
Member

At first I was a little confused as to why getWriter().getResult() wasn't being called until I realized that errors were coming through the reader.

We try to expose 'independent' readers/writers but in actuality in gRPC they are tied + the reader is the only source of errors (which in some sense makes sense, I suppose) so this is perhaps more of a mistake in our API design

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Oct 11, 2023
@BryanCutler
Copy link
Member Author

We try to expose 'independent' readers/writers but in actuality in gRPC they are tied + the reader is the only source of errors (which in some sense makes sense, I suppose) so this is perhaps more of a mistake in our API design

Makes sense, thanks for the explanation

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels Oct 11, 2023
@BryanCutler BryanCutler force-pushed the java-flight-stream-meta-close-38096 branch from 9849256 to 7740c76 Compare October 11, 2023 21:16
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Oct 11, 2023
@lidavidm lidavidm merged commit 71a4ef4 into apache:main Oct 12, 2023
@lidavidm lidavidm removed the awaiting change review Awaiting change review label Oct 12, 2023
@github-actions github-actions bot added the awaiting merge Awaiting merge label Oct 12, 2023
llama90 pushed a commit to llama90/arrow that referenced this pull request Oct 12, 2023
…n closing (apache#38110)

### Rationale for this change

The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice.

```
java.lang.IllegalStateException: RefCnt has gone negative
	at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:130)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:104)
	at org.apache.arrow.memory.ArrowBuf.close(ArrowBuf.java:1044)
	at org.apache.arrow.util.AutoCloseables.close(AutoCloseables.java:97)
	at org.apache.arrow.flight.FlightStream.close(FlightStream.java:208)
```

### What changes are included in this PR?

When FlightStream is closed, remove any reference of previous metadata to prevent reference count going negative if closed again. Also added `ExchangeReaderWriter.getResult()` for convenience and clear up ambiguity on error handling.

### Are these changes tested?

Unit tests added for closing with metadata and 

### Are there any user-facing changes?

Added `ExchangeReaderWriter.getResult()`
* Closes: apache#38096

Authored-by: Bryan Cutler <[email protected]>
Signed-off-by: David Li <[email protected]>
@BryanCutler BryanCutler deleted the java-flight-stream-meta-close-38096 branch October 13, 2023 23:32
@BryanCutler
Copy link
Member Author

Thanks @lidavidm !

@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 71a4ef4.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
…n closing (apache#38110)

### Rationale for this change

The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice.

```
java.lang.IllegalStateException: RefCnt has gone negative
	at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:130)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:104)
	at org.apache.arrow.memory.ArrowBuf.close(ArrowBuf.java:1044)
	at org.apache.arrow.util.AutoCloseables.close(AutoCloseables.java:97)
	at org.apache.arrow.flight.FlightStream.close(FlightStream.java:208)
```

### What changes are included in this PR?

When FlightStream is closed, remove any reference of previous metadata to prevent reference count going negative if closed again. Also added `ExchangeReaderWriter.getResult()` for convenience and clear up ambiguity on error handling.

### Are these changes tested?

Unit tests added for closing with metadata and 

### Are there any user-facing changes?

Added `ExchangeReaderWriter.getResult()`
* Closes: apache#38096

Authored-by: Bryan Cutler <[email protected]>
Signed-off-by: David Li <[email protected]>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…n closing (apache#38110)

### Rationale for this change

The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice.

```
java.lang.IllegalStateException: RefCnt has gone negative
	at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:130)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:104)
	at org.apache.arrow.memory.ArrowBuf.close(ArrowBuf.java:1044)
	at org.apache.arrow.util.AutoCloseables.close(AutoCloseables.java:97)
	at org.apache.arrow.flight.FlightStream.close(FlightStream.java:208)
```

### What changes are included in this PR?

When FlightStream is closed, remove any reference of previous metadata to prevent reference count going negative if closed again. Also added `ExchangeReaderWriter.getResult()` for convenience and clear up ambiguity on error handling.

### Are these changes tested?

Unit tests added for closing with metadata and 

### Are there any user-facing changes?

Added `ExchangeReaderWriter.getResult()`
* Closes: apache#38096

Authored-by: Bryan Cutler <[email protected]>
Signed-off-by: David Li <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…n closing (apache#38110)

### Rationale for this change

The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice.

```
java.lang.IllegalStateException: RefCnt has gone negative
	at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:130)
	at org.apache.arrow.memory.BufferLedger.release(BufferLedger.java:104)
	at org.apache.arrow.memory.ArrowBuf.close(ArrowBuf.java:1044)
	at org.apache.arrow.util.AutoCloseables.close(AutoCloseables.java:97)
	at org.apache.arrow.flight.FlightStream.close(FlightStream.java:208)
```

### What changes are included in this PR?

When FlightStream is closed, remove any reference of previous metadata to prevent reference count going negative if closed again. Also added `ExchangeReaderWriter.getResult()` for convenience and clear up ambiguity on error handling.

### Are these changes tested?

Unit tests added for closing with metadata and 

### Are there any user-facing changes?

Added `ExchangeReaderWriter.getResult()`
* Closes: apache#38096

Authored-by: Bryan Cutler <[email protected]>
Signed-off-by: David Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Java] Flight stream with metadata gives an error if closed twice
2 participants