
Timeout when reading more than max_recv_size bytes #23

Closed

eltos opened this issue Mar 6, 2017 · 17 comments

Comments

@eltos
Contributor

eltos commented Mar 6, 2017

When trying to read raw binary data with more than 16*1024 = 16384 bytes (the value of max_recv_size), I get I/O TIMEOUT errors.

I have seen in the source code that data is received in chunks of up to max_recv_size bytes and then joined.

I tried manually setting smaller and larger values for max_recv_size, without success.
However, I found that in fact only a few bytes are missing, which causes the last call to client.device_read(...) to time out.
For example, if I set max_recv_size=4096 and try to read 32000 bytes, I receive 7 chunks of 4096 bytes (28672 bytes in total), but the last read call raises a timeout exception because it cannot read all of the expected remaining 3328 bytes (though it probably receives some or even most of them).

It is just an idea, but maybe bytes are lost in the time window between the calls to client.device_read(...) where the data is appended to what was previously read?

Any thoughts on that?
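For anyone trying to reproduce this: the chunked read loop described above can be sketched roughly like this. The function and parameter names are illustrative assumptions, not the actual library source.

```python
# Illustrative sketch of a chunked read loop; names are assumptions,
# not the actual implementation.

MAX_RECV_SIZE = 16 * 1024  # 16384 bytes, the value mentioned above


def read_raw(device_read, num_bytes):
    """Assemble num_bytes by repeatedly reading chunks of at most
    MAX_RECV_SIZE bytes and joining them."""
    chunks = []
    received = 0
    while received < num_bytes:
        chunk = device_read(min(num_bytes - received, MAX_RECV_SIZE))
        if not chunk:
            # A real client would raise its I/O timeout error here.
            raise TimeoutError("instrument stopped sending data")
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)
```

If any chunk arrives short and the instrument has nothing left to send, the loop keeps asking for the missing bytes and the final read times out, which matches the symptom described above.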

@alexforencich
Contributor

Interesting; I'll take a look at that. What device are you interfacing with?

@eltos
Contributor Author

eltos commented Mar 9, 2017

I am using a Keysight E5810 LAN/GPIB gateway connected to a GPIB device designed specifically for use at our facility.
A cross-check using LabVIEW's VISA interface to control the gateway showed no problems reading out that much data.

@eltos
Contributor Author

eltos commented Mar 9, 2017

I did some more testing and found that, whatever max_recv_size is set to, exactly n-1 bytes are always missing, where n is the number of chunks required to receive the data. To me that looks like a single byte is lost every time a chunk is received via client.device_read.

I also improved the read-out process so that it is ~50 times faster, though this does not solve the issue. Since I think it is a major performance improvement nonetheless, I created pull request #24.
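For concreteness, here is the arithmetic behind the n-1 pattern described above, using the numbers from the earlier example (the one-byte-per-chunk loss is the hypothesis being tested, not an established fact):

```python
# Worked example of the n-1 pattern, with max_recv_size=4096 and
# 32000 bytes requested, as in the earlier comment.

max_recv_size = 4096
requested = 32000

# Number of chunks needed to transfer the data (ceiling division).
n = -(-requested // max_recv_size)

# Hypothesis: one byte lost per completed chunk, so n - 1 bytes never
# arrive, and the final device_read waits for them and times out.
missing = n - 1
```

With these numbers, n works out to 8 chunks and the missing count to 7 bytes.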

@bobmcnamara

bobmcnamara commented Mar 9, 2017 via email

@alexforencich
Contributor

Well, fortunately I have several E5810A boxes to test against. Does your box have the latest firmware installed? It's pretty easy to update via the web interface, though the button is in a weird spot: in the help pages.

@alexforencich
Contributor

@bobmcnamara that's how it works; it just uses max_recv_size as the default read_len. I don't think it should make a difference, though. It used to be a very large value, but this caused issues with some older devices. I figured setting it to the max_recv_size reported by the instrument would be reasonable. It's also capped at 1 MB. I think that's reasonable: for larger transfers, it shouldn't add much overhead to read in 1 MB blocks, as long as reading those blocks is fast.
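The default-length policy described here (use the instrument-reported max_recv_size, capped at 1 MB) can be sketched as follows; the function name is made up for illustration:

```python
ONE_MB = 1024 * 1024  # the 1 MB cap mentioned above


def default_read_len(instrument_max_recv_size):
    """Default read length: the maximum receive size reported by the
    instrument, but never more than 1 MB."""
    return min(instrument_max_recv_size, ONE_MB)
```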

@bobmcnamara

bobmcnamara commented Mar 9, 2017 via email

@alexforencich
Contributor

It may still make sense to cap read_len, but maybe not max_recv_size. But let's figure out this off-by-one issue before making that change.

@eltos
Contributor Author

eltos commented Mar 10, 2017

@bobmcnamara Unfortunately that has no effect, as the E5810 does not send more than 16384 bytes at once, even when requested to do so.

@alexforencich I updated the firmware, still no success. That's really a weird spot to place that button...

@alexforencich
Contributor

I know, right? I'm going to see what I can do about replicating the issue. Do you know what the data you're reading is supposed to be? Can you grab a wireshark trace?

@eltos
Contributor Author

eltos commented Mar 10, 2017

I haven't grabbed a Wireshark trace yet, but I looked into the error, reason and data returned by self.client.device_read instead. It turns out that the error flag of the last call is set to 15 (TIMEOUT). From my understanding this is a value sent by the E5810, so the timeout actually happens because the E5810 cannot read the requested amount of data from the GPIB bus, am I right?
Also, by comparing the data fragments with what I expect them to be, I could verify that the byte is indeed lost between these calls.

So I guess it's an issue with the E5810 rather than with your library. What do you think?
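For reference, checking the reply fields the way described above might look like this. Error code 15 is the "I/O timeout" error defined in the VXI-11 specification; the helper name and argument layout are assumptions for illustration:

```python
# VXI-11 device_read error codes (from the VXI-11 specification).
ERR_NO_ERROR = 0
ERR_IO_TIMEOUT = 15


def check_read_reply(error, reason, data):
    """Raise if the instrument reported an I/O timeout, i.e. it could
    not read the requested bytes from the GPIB side in time."""
    if error == ERR_IO_TIMEOUT:
        raise TimeoutError(
            "device reported I/O timeout after %d bytes" % len(data))
    if error != ERR_NO_ERROR:
        raise IOError("device_read failed with error %d" % error)
    return data
```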

@alexforencich
Contributor

You said it works with LabVIEW, so that seems to indicate some sort of issue in python-ivi. Let's see what's in the trace. If the bytes are missing in the trace, that would indicate a problem in the E5810. Also, if you can, grab a trace while LabVIEW performs a similar read operation, for comparison.

@eltos
Contributor Author

eltos commented Mar 10, 2017 via email

@alexforencich
Contributor

Yes, but doesn't VISA use VXI11 to talk to the E5810?

@alexforencich
Contributor

@bobmcnamara After doing some experimenting with Rigol scopes, I definitely agree that limiting read_len to max_recv_size is a bad idea. On the Rigol DS2000 series, max_recv_size seems to be 1500, and reading out, say, a screenshot in 1500-byte chunks takes a VERY long time, apparently due to per-request overhead in the scope. Limiting read_len to something more reasonable, like 1 MB or 10 MB, makes a huge difference. I will definitely make that change once we get to the bottom of the off-by-one issue.
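A quick back-of-envelope count shows why the chunk size matters so much when each request carries fixed per-request overhead in the scope. The transfer size below is an illustrative assumption, not a measurement:

```python
def n_requests(total_bytes, chunk_size):
    """Number of device_read round trips needed (ceiling division)."""
    return -(-total_bytes // chunk_size)


# An illustrative screenshot-sized transfer of ~1.15 MB:
small_chunks = n_requests(1_152_000, 1500)         # 1500-byte chunks
large_chunks = n_requests(1_152_000, 1024 * 1024)  # 1 MB chunks
```

With 1500-byte chunks this takes 768 round trips versus 2 with 1 MB chunks, so any fixed per-request cost in the instrument is multiplied several hundred times.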

@bobmcnamara

bobmcnamara commented Apr 1, 2017 via email

@eltos
Contributor Author

eltos commented May 12, 2017

I recently got to test the data transmission on the latest revision of the gateway, the E5810B (note the "B" instead of "A"), and no longer experienced the issue. I have not taken the time to track it down in the byte stream, but it's working fine for my application.

@eltos eltos closed this as completed May 12, 2017