Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overlapped io #55

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

DoumanAsh
Copy link

Subj
This is a similar request to original repo https://gitlab.com/susurrus/serialport-rs/-/merge_requests/91

I'll need to do some testing myself, but overall it is the same code as in that PR with just some minor improvements

@smihica
Copy link

smihica commented Jun 21, 2022

The message on the original MergeRequest.

I used this library on a Windows machine and found some strange behavior.
My program runs two threads and each thread runs port.read() and port.write() simultaneously. (using port.try_clone())
Then port.read() sometimes has an extremely delayed response (about 3s~5s).
Windows seems to not be able to read the same port simultaneously in other threads even if those are created by DuplicateHandle().
So I supported OVERLAPPED IO. https://docs.microsoft.com/en-us/windows/win32/sync/synchronization-and-overlapped-input-and-output
With this PR, It turned to be working well on my project.

@jessebraham
Copy link
Member

Thanks for the PR. I don't use Windows but I will get a colleague to verify this as soon as possible so we can get it merged!

@mlsvrts
Copy link
Contributor

mlsvrts commented Jun 28, 2022

I think switching to overlapped io on Windows is the right choice, but we need to be careful about (and probably expose) the COMMTIMEOUTs. It's confusing and painful to get the expected system behavior. For reference see my own struggles with tokio-serial:

berkowski/tokio-serial#55

And also what may be the only correct description of COMMTIMEOUTs that exists:

https://gitlab.com/susurrus/serialport-rs/-/merge_requests/78#note_343695538

It's likely impossible to use a single timeout value that meets many use cases, so a think that we need:

  • A default timeout behavior that works in the 'least surprising' way
  • A way to change the timeouts to support better latency/filling the buffer/etc.

Comment on lines 226 to 272
let bytes_to_read = self.bytes_to_read()? as usize;

if self.timeout.as_millis() == 0 && bytes_to_read < read_size {
read_size = bytes_to_read;
}
if read_size == 0 {
return Ok(0);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this belongs in the Read implementation. It:

  • Invokes an additional syscall on every Read
  • Prevents expressing 'block until the buffer is full'

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm there is actually case where ClearCommError is useful, but in this library code it is probably not the case
It is mentioned here https://github.com/serialport/serialport-rs/blob/main/src/windows/dcb.rs#L54=

bytes_to_read, if flag is set to True would be necessary to resume read operations.

In any case it can be removed since we initialize this flag to false, but I'm generally not sure what is best approach here as I took PR https://gitlab.com/susurrus/serialport-rs/-/merge_requests/91 as reference

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On side note I think maybe we should only do this check if timeout is zero

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in case of zero timeout, overlapped IO actually performs indefinite awaiting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in case of zero timeout, overlapped IO actually performs indefinite awaiting

Agreed, but in some cases this might actually be the desired behavior. I think with Option<Duration> (see: #50) we could better express 'never timeout' vs 'never block':

  • None: Wait until buf.len() bytes have been read
  • Some(0): Read any pending data, up-to buf.len(). Return immediately if there is no data.
  • Some(x): Read until buf.len() bytes have been returned, or timeout has expired

So None and Some(x) would be deferred to setting COMMTIMEOUTS, while Some(0) would rely on bytes_to_read. I don't know if this is a perfect solution, but I think that serialport-rs should be able to support the 'never timeout' case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I don't mind to improve timeout logic, but this would be a breaking change.
I think we need to improve situation at least in current 4.x.y version first

@DoumanAsh
Copy link
Author

Thanks for the PR. I don't use Windows but I will get a colleague to verify this as soon as possible so we can get it merged!

Ah thanks, I'm not sure I'm able to do comprehensive tests myself so it would be good to verify it by someone who is more well versed

@DoumanAsh DoumanAsh force-pushed the overlapped_io branch 2 times, most recently from e05c174 to 69529a7 Compare June 28, 2022 17:08
@DoumanAsh
Copy link
Author

@mlsvrts I added set_timeouts native method that allows to set all timeouts, not only read timeout for your use case

@xobs
Copy link
Contributor

xobs commented Jul 13, 2022

I'm not sure if this is expected or not, but when I use read_all() on a stream with this patch it panics:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }', src\main.rs:21:44
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\core\src\panicking.rs:142
   2: core::result::unwrap_failed
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\core\src\result.rs:1785
   3: enum$<core::result::Result<tuple$<>,std::io::error::Error>, 1, 18446744073709551615, Err>::unwrap<tuple$<>,std::io::error::Error>
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc\library\core\src\result.rs:1078
   4: mdi_tools::main::closure$0
             at .\src\main.rs:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

The error message of "failed to fill whole buffer" is confusing, since the purpose of .read_all() is to repetedly call .read() until the buffer is full. The code that's failing is:

    std::thread::spawn(move || {
        let mut buf = [0u8; 3];
        loop {
            println!("Reading data");
            read_half.read_exact(&mut buf).unwrap();
            println!("Got data: {:02x} {:02x} {:02x}", buf[0], buf[1], buf[2]);
        }
    });

@DoumanAsh
Copy link
Author

DoumanAsh commented Jul 13, 2022

@xobs Read::read_exact default implementation fails if it cannot fill buffer completely
If you set zero timeout, Read::read returns 0 when there is nothing to read.
If you want to make sure to wait for data, you have to set timeout to wait for read to complete

Actually looking at code base, it doesn't seem to set FNDELAY on port there so maybe it should block indefinitely too (the problem though that if you pass too big of buffer, it will block forever until it fills buffer)

@xobs
Copy link
Contributor

xobs commented Jul 13, 2022

My mistake, thanks. In that case, I suppose this patch is an improvement on the situation, since I get the error with this patchset. However, it doesn't seem to actually solve the problem.

If I set the timeout to something large, then the above code gets blocked at Reading data and never fnishes. Even if buf is a [0u8; 1].

However, if I open Tera Term and connect to that serial port after my Rust program closes, the buffer drains into the terminal window and I get all of the data that should have gone to my program.

This is in contrast to the current implementation which will just block forever.

@DoumanAsh
Copy link
Author

The patch goal is to enable proper split on windows.
Without overlapping IO, cloning serial port would result in inability efficiently write/read from different threads

I don't exactly follow what is the issue you're encountering, API should block only as long as it cannot fill buffer, so if doesn't fill even [u8; 1] I assume there is no data 🤔

@xobs
Copy link
Contributor

xobs commented Jul 13, 2022

My mistake -- the issue in my code was that on a Raspberry Pi Pico they don't send data until the DTR bit is asserted. Adding port.write_data_terminal_ready(true)?; solved my problem.

I suppose then the only question is whether it should be nonblocking by default. Structs like TcpStream are nonblocking, and will wait until data is available before returning. Aside from that, this patchset is now working for me.

@DoumanAsh
Copy link
Author

@xobs TcpStream by default is blocking actually
AFAIK serial port behavior on windows is a bit different in a sense that it is always tries to fill buffer that you give it, instead of returning any data available.
Unfortunately I do not know if it can be fixed somehow other than setting timeout

@DoumanAsh
Copy link
Author

@jessebraham
Hey, just wanted to remind you about this PR
I'm not in hurry though I just use my github repo for now

@jessebraham
Copy link
Member

Sorry, I've had some vacation time lately (and have also just been pretty busy in general) and sort of lost track of this conversation! I'll try to get to it this week.

@silverjam
Copy link

@xobs TcpStream by default is blocking actually AFAIK serial port behavior on windows is a bit different in a sense that it is always tries to fill buffer that you give it, instead of returning any data available. Unfortunately I do not know if it can be fixed somehow other than setting timeout

I think it would be possible to simulate blocking, but you'd probably need to use IOCP (I/O completion ports) and a thread which would fill a channel/queue of data from the serial port. You'd wait on this channel to be filled in order to simulate blocking reads.

This is because, as far as I understand things, with Windows overlapped I/O you're expected to give it a pool of threads and a handle to watch for I/O-- those threads are woken up and handed data once available, so it's pretty fundamentally different than typical non-blocking I/O on Linux (though io_uring on newer kernels provides similar features).

I guess the key point @DoumanAsh mentioned is that GetOverlappedResult isn't going signal completion until the whole buffer is filled or a timeout is reached?

If doing this "simulated blocking" then you could always set a timeout for the completion port and configure a small buffer size, re-queuing the read in the event of a timeout.

@DoumanAsh
Copy link
Author

@jessebraham kindly remind you about PR 😄

@mlsvrts
Copy link
Contributor

mlsvrts commented Nov 7, 2022

@jessebraham

Okay, I've done some testing with this on a Windows 11 box with an FTDI USB->serial adapter and things seem to be working well. (See #69 for the loopback test I was running).

My only remaining note here is to make sure we version control this as a breaking change, as some code that worked before will stop working after this is merged (I've verified this can happen with the timeout behavior, but there may be other subtly broken scenarios).

Other than that, it looks good to me! Thanks @DoumanAsh!

@DoumanAsh
Copy link
Author

I rebased this PR.
I would hope that it eventually gets merged, but if you do not need it please let me know I'll just close it

@eldruin eldruin added this to the 5.0.0 milestone Mar 17, 2023
win: Perform zero read size check only when timeout zero

win: Add note on how COM port is created

Add COM::set_timeouts
@RossSmyth
Copy link
Contributor

I was thinking about this PR and realized that is it a semver breaking change, and it would actually effect me sadly. Personally I'm fine with it and can maintain a fork internally. But that is one thing to keep in mind.

The reason this is a breaking change is because this impl

@RossSmyth
Copy link
Contributor

Oh it was already noted above. That's what I get for not reading. I am generally in favor of it though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants