
Timeout when reading more than max_recv_size bytes #23

Closed

eltos opened this issue Mar 6, 2017 · 17 comments

Comments

@eltos
Contributor

eltos commented Mar 6, 2017

When trying to read raw binary data with more than 16*1024 = 16384 bytes (the value of max_recv_size), I get I/O TIMEOUT errors.

I have seen in the source code that data is received in chunks of up to max_recv_size bytes and then joined.

I tried manually setting smaller and larger values for max_recv_size, without success.
However, I found that in fact only a few bytes are missing, which causes the last call to client.device_read(...) to time out.
For example, if I set max_recv_size=4096 and try to read 32000 bytes, I receive 7 chunks of 4096 bytes (28672 bytes in total), but the last read call raises a timeout exception because it cannot read all of the expected remaining 3328 bytes (though it probably receives some or even most of them).

It is just an idea, but maybe bytes are lost in the time window between the calls to client.device_read(...) where the data is appended to what was previously read?

Any thoughts on that?
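For anyone trying to reproduce this: the chunked read loop described above can be sketched roughly like this. The function and parameter names are illustrative assumptions, not the actual library source.

```python
# Illustrative sketch of a chunked read loop; names are assumptions,
# not the actual implementation.

MAX_RECV_SIZE = 16 * 1024  # 16384 bytes, the value mentioned above


def read_raw(device_read, num_bytes):
    """Assemble num_bytes by repeatedly reading chunks of at most
    MAX_RECV_SIZE bytes and joining them."""
    chunks = []
    received = 0
    while received < num_bytes:
        chunk = device_read(min(num_bytes - received, MAX_RECV_SIZE))
        if not chunk:
            # A real client would raise its I/O timeout error here.
            raise TimeoutError("instrument stopped sending data")
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)
```

If any chunk arrives short and the instrument has nothing left to send, the loop keeps asking for the missing bytes and the final read times out, which matches the symptom described above.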

@alexforencich
Contributor

Interesting; I'll take a look at that. What device are you interfacing with?

@eltos
Contributor Author

eltos commented Mar 9, 2017

I am using a Keysight E5810 LAN/GPIB gateway connected to a GPIB device designed specifically for use at our facility.
A cross-check using LabVIEW's VISA interface to control the gateway showed no problems reading out that much data.

@eltos
Contributor Author

eltos commented Mar 9, 2017

I did some more testing and found that, whatever max_recv_size is set to, exactly n-1 bytes are always missing, where n is the number of chunks required to receive the data. To me that looks like a single byte is lost every time a chunk is received via client.device_read.

I also improved the read-out process so that it is ~50 times faster, though this does not solve the issue. Since I think it is a major performance improvement nonetheless, I created pull request #24.
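For concreteness, here is the arithmetic behind the n-1 pattern described above, using the numbers from the earlier example (the one-byte-per-chunk loss is the hypothesis being tested, not an established fact):

```python
# Worked example of the n-1 pattern, with max_recv_size=4096 and
# 32000 bytes requested, as in the earlier comment.

max_recv_size = 4096
requested = 32000

# Number of chunks needed to transfer the data (ceiling division).
n = -(-requested // max_recv_size)

# Hypothesis: one byte lost per completed chunk, so n - 1 bytes never
# arrive, and the final device_read waits for them and times out.
missing = n - 1
```

With these numbers, n works out to 8 chunks and the missing count to 7 bytes.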

@bobmcnamara

bobmcnamara commented Mar 9, 2017 via email

@alexforencich
Contributor

Well, fortunately I have several E5810A boxes to test against. Does your box have the latest firmware installed? It's pretty easy to update via the web interface, though the button is in a weird spot: in the help pages.

@alexforencich
Contributor

@bobmcnamara that's how it works; it just uses max_recv_size as the default read_len. I don't think it should make a difference, though. It used to be a very large value, but this caused issues with some older devices. I figured setting it to the max_recv_size reported by the instrument would be reasonable. It's also capped at 1 MB. I think that's reasonable: for larger transfers, it shouldn't add much overhead to read in 1 MB blocks, as long as reading those blocks is fast.
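The default-length policy described here (use the instrument-reported max_recv_size, capped at 1 MB) can be sketched as follows; the function name is made up for illustration:

```python
ONE_MB = 1024 * 1024  # the 1 MB cap mentioned above


def default_read_len(instrument_max_recv_size):
    """Default read length: the maximum receive size reported by the
    instrument, but never more than 1 MB."""
    return min(instrument_max_recv_size, ONE_MB)
```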

@bobmcnamara

bobmcnamara commented Mar 9, 2017 via email

@alexforencich
Contributor

It may still make sense to cap read_len, but maybe not max_recv_size. But let's figure out this off-by-one issue before making that change.

@eltos
Contributor Author

eltos commented Mar 10, 2017

@bobmcnamara Unfortunately that has no effect, as the E5810 does not send more than 16384 bytes at once, even when requested to do so.

@alexforencich I updated the firmware, still no success. That's really a weird spot to place that button...

@alexforencich
Contributor

I know, right? I'm going to see what I can do about replicating the issue. Do you know what the data you're reading is supposed to be? Can you grab a wireshark trace?

@eltos
Contributor Author

eltos commented Mar 10, 2017

I haven't grabbed a Wireshark trace yet, but I looked into the error, reason and data returned by self.client.device_read instead. It turns out that the error flag of the last call is set to 15 (TIMEOUT). From my understanding this is a value sent by the E5810, so the timeout actually happens because the E5810 cannot read the requested amount of data from the GPIB bus, am I right?
Also, by comparing the data fragments with what I expect them to be, I could verify that the byte is indeed lost between these calls.

So I guess it's an issue with the E5810 rather than with your library. What do you think?
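For reference, checking the reply fields the way described above might look like this. Error code 15 is the "I/O timeout" error defined in the VXI-11 specification; the helper name and argument layout are assumptions for illustration:

```python
# VXI-11 device_read error codes (from the VXI-11 specification).
ERR_NO_ERROR = 0
ERR_IO_TIMEOUT = 15


def check_read_reply(error, reason, data):
    """Raise if the instrument reported an I/O timeout, i.e. it could
    not read the requested bytes from the GPIB side in time."""
    if error == ERR_IO_TIMEOUT:
        raise TimeoutError(
            "device reported I/O timeout after %d bytes" % len(data))
    if error != ERR_NO_ERROR:
        raise IOError("device_read failed with error %d" % error)
    return data
```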

@alexforencich
Contributor

You said it works with LabVIEW, so that seems to indicate some sort of issue in python-ivi. Let's see what's in the trace. If the bytes are missing in the trace, that would indicate a problem in the E5810. Also, if you can, grab a trace while LabVIEW performs a similar read operation, for comparison.

@eltos
Contributor Author

eltos commented Mar 10, 2017 via email

@alexforencich
Contributor

Yes, but doesn't VISA use VXI11 to talk to the E5810?

@alexforencich
Contributor

@bobmcnamara After doing some experimenting with Rigol scopes, I definitely agree that limiting read_len to max_recv_size is a bad idea. On the Rigol DS2000 series, max_recv_size seems to be 1500, and reading out, say, a screenshot in 1500-byte chunks takes a VERY long time, apparently due to per-request overhead in the scope. Limiting read_len to something more reasonable, like 1 MB or 10 MB, makes a huge difference. I will definitely make that change once we get to the bottom of the off-by-one issue.
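A quick back-of-envelope count shows why the chunk size matters so much when each request carries fixed per-request overhead in the scope. The transfer size below is an illustrative assumption, not a measurement:

```python
def n_requests(total_bytes, chunk_size):
    """Number of device_read round trips needed (ceiling division)."""
    return -(-total_bytes // chunk_size)


# An illustrative screenshot-sized transfer of ~1.15 MB:
small_chunks = n_requests(1_152_000, 1500)         # 1500-byte chunks
large_chunks = n_requests(1_152_000, 1024 * 1024)  # 1 MB chunks
```

With 1500-byte chunks this takes 768 round trips versus 2 with 1 MB chunks, so any fixed per-request cost in the instrument is multiplied several hundred times.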

@bobmcnamara

bobmcnamara commented Apr 1, 2017 via email

@eltos
Contributor Author

eltos commented May 12, 2017

I recently got to test the data transmission on the latest revision of the gateway, the E5810B (note the "B" instead of "A"), and no longer experienced the issue. I have not taken the time to track it down in the byte stream, but it's working fine for my application.

@eltos eltos closed this as completed May 12, 2017