Unsuccessful read with SS-SUCESS exception using Tree.getNode(...).getData() in Python #2726
Comments
Hi @alkhwarizmi -- Thanks for the detailed bug report. Would appreciate a few more details, as that will aid us in reproducing and debugging the issue. Specifically would like to know the type (Windows, Ubuntu Linux, RedHat Linux, etc.) and version of the operating system on the two client PCs. And same thing regarding the two servers. Also note that the |
Thank you very much for the support. Here is the information you asked for: Regarding my PC: {'platform': 'Windows', My colleague's PC: {'platform': 'Windows', Regarding the GNU/Linux server: Regarding the Windows server: |
Hi @alkhwarizmi -- Thanks for the information. Based on that, I will use Windows 10 for my testing. Regarding the Linux server, please login to the computer and at the shell prompt type |
Hi @alkhwarizmi -- The initial bug report indicates that this issue only exists with old data. Thus, here are some related questions:
It is also interesting that the Windows 10 client PCs have this warning: |
Hi @alkhwarizmi -- I am configuring some systems so I can investigate this issue. Will post here when I have succeeded in reproducing the error. |
Some comments:
|
Hi @alkhwarizmi -- Thanks for the additional detail. Current conjecture is that this is a cross-version compatibility issue (i.e., client is newer than the Windows server). Reason for this conjecture is as follows:
Note that the networking protocol that is part of MDSplus, namely To test the above conjecture, I am setting up a server with stable_7.50.1 and a client with stable_7.132.0 and seeing if I can reproduce the errors described in the bug report. When I am able to reproduce the error, I will then be able to provide the following:
|
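For readers comparing client and server releases while following this conjecture, here is a minimal sketch in plain Python. It is not part of MDSplus; the function names `parse_release` and `client_is_newer` are hypothetical, and the tags are the ones quoted in this thread. It only assumes that a release tag contains a `major.minor.patch` number.

```python
import re

def parse_release(tag):
    """Extract (major, minor, patch) as integers from a tag such as 'stable_7.132.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no version number found in {tag!r}")
    return tuple(int(part) for part in match.groups())

def client_is_newer(client_tag, server_tag):
    """Compare numerically, field by field, rather than as strings."""
    return parse_release(client_tag) > parse_release(server_tag)

client = "stable_7.132.0"   # client version used in the test setup
server = "stable_7.50.1"    # Windows server version used in the test setup
print(parse_release(client), parse_release(server))
print(client_is_newer(client, server))
```

Comparing the parsed tuples matters because a plain string comparison would rank "7.50.1" above "7.132.0".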
Thank you very much for the help. First a question, then I will tell you what I would do. Q: I guess this incompatibility concerns how data is transferred, not how data is stored, right? So past data will always remain readable with a newer client. If the answer to the above question is yes, then I would not develop any hack to try to maintain compatibility with 7.50. We will try to update the server. The point is that we are very worried about everything that could go wrong... It is a production system, any delay costs quite a lot, and we need to plan the upgrade in advance. I cannot do it at will. |
Hi @alkhwarizmi, You are correct. If the An easy way to determine if your site has encountered a
Many sites that use MDSplus also update their server side software infrequently. And for the same reasons (i.e., don't want to break a working system, the expense of disrupting or delaying scheduled experiments, the effort involved in testing a new version before placing it in production and so forth). While you are considering whether to upgrade your Win10 server to a newer version, I will continue to investigate the issue. If I am able to reproduce the errors, then you will have the facts needed to proceed with your upgrade plans. |
Hi @alkhwarizmi, More questions . . .
For my first test of the issue, everything worked fine. Did not reproduce the errors shown in the bug report. Here is the configuration that was used:
Will now repeat with a signal that has floats and quads (as per the screenshot in the initial bug report). And if that doesn't reproduce the error, will then switch to a Win10 server with stable_7.50.1 version. I have also examined the code in the MDSplus API for Python and it definitely is detecting an error. |
Hi @alkhwarizmi -- Was unable to reproduce the errors by repeating the above experiment using signals that contained float data and quadword dimension. And those signals also displayed fine using jTraverser2 on the client. |
Hi @alkhwarizmi, More questions . . .
|
Hi @alkhwarizmi, It appears that Python for Windows does have a 32-bit version. If so, that would simplify the configuration of your Win10 computers as you would only need the 32-bit version of MDSplus. (Unless your site has a Python program that must use a 64-bit library to analyze data read from MDSplus.) https://www.python.org/downloads/windows/ |
Hi @alkhwarizmi, The initial bug report states that jTraverser isn't displaying the expected output (i.e., that it differs from the Python output). Some questions about that observation . . .
|
Hi @alkhwarizmi, Switched to Windows 10 for the server and was unable to reproduce the error. Here was the configuration:
Next will configure the Windows server with 32-bit MDSplus. |
Hi there, |
I would advise against Windows as the final tree host, as its native file systems do not support partial file locking at the system level. This makes it slow and less useful for multithreaded/multiprocess writes. At W7X we made a lot of tests on how to store data efficiently and found it is best to:
This will limit the amount of concurrent writes to a file and the number of sources for a write issue. |
Hi @alkhwarizmi, My newest conjecture is that this issue might be associated with running both 32-bit and 64-bit MDSplus on the same Windows 10 computer. I will do some experiments with that configuration in the coming week. To determine if your trees are undamaged, probably best to login to the Linux server and use its MDSplus and Python to check the old trees. As that will eliminate all the variables associated with your site's unusual MDSplus installation on Windows 10. Hi @zack-vii, Thanks for the tips. Much appreciated! |
Hi @zack-vii, Regarding the C code, it does indeed still have This is an interesting bug because of the cross-version aspect (i.e., multiple versions of MDSplus spanning a ~4 year period), plus 32-bit and 64-bit versions installed on the same computers (needed for 32-bit LabVIEW and 64-bit Python). |
Hi @alkhwarizmi, I was wrong. Turns out that when doing an install of MDSplus on 64-bit Windows computers, both the 32-bit and 64-bit MDSplus *.dll files are installed. (I was surprised to see that when I examined the configuration script for the installer.). And I have confirmed that for So now, I will see what happens if I use mdsip with the 32-bit *.dll files. |
Hello, thank you for the work and sorry for the late reply.
I don't think this is relevant because the real-time controller makes use only of the 32-bit LabView API. We don't log in there to run the tree creation scripts (I know it would be faster, but I wanted to avoid problems due to machine size altogether); when we need to update the model tree, we re-create it from our PC using the Tree class. However, we do have Python installed there, which is quite old (Python 3.6.8 from the Anaconda3 distribution) precisely because we don't use it.
When we run the script to re-create the model, we normally run it from a shell within the Spyder IDE
The plant control software (PCS) running on the Windows server (which is a National Instruments PXI rack) is the entity that creates new runs from the model when the operator decides to start a new experiment (which I call "run"). This is done by the PCS using the LabView API 32-bit of the "create pulse"
No, this is done sometimes when we need to apply structural changes to the tree, and it is not done by the PCS software itself but by our management scripts, written mostly in Python, although we also have LabView "scripts". As I said above, they are not even run from the Windows server but from our PC. Cheers, |
Hello zack, thanks for the tip. At the moment, from the point of view of MDS+, we don't have any subsystems because our distributed IO does not independently send data to the central node (Windows Server, aka PXI). Instead, the PCS itself collects data from the actual subsystems and then writes the data into the tree in a very linear for loop. In other words, we have only one writer. As the system has grown over time and we are starting to experience latency problems, I am curious to know your opinion about parallelizing the MDS+ writes in different threads to make them faster. However, I think that we will still have one writer. Cheers, |
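One way to picture "parallel collection, still one writer" is a queue that funnels every write through a single thread. The sketch below uses only the Python standard library. `run_single_writer`, `fake_write` and the `NODE_xx` paths are hypothetical placeholders, and `fake_write` stands in for whatever call actually stores a value in the tree. It does not claim to make MDSplus writes themselves any faster; it only decouples data collection from writing.

```python
import queue
import threading

def run_single_writer(samples, write_fn, n_collectors=4):
    """Collect samples in several threads, but perform every write in one thread."""
    work = queue.Queue()
    stop = object()  # sentinel telling the writer thread to finish

    def writer():
        while True:
            item = work.get()
            if item is stop:
                break
            write_fn(*item)  # the only place where a write happens

    def collector(chunk):
        for node_path, value in chunk:
            work.put((node_path, value))  # hand the sample over to the writer

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    chunks = [samples[i::n_collectors] for i in range(n_collectors)]
    collectors = [threading.Thread(target=collector, args=(c,)) for c in chunks]
    for t in collectors:
        t.start()
    for t in collectors:
        t.join()
    work.put(stop)
    writer_thread.join()

# Demonstration with a stand-in write function (no MDSplus involved).
written = []
writer_threads = set()

def fake_write(node_path, value):
    written.append((node_path, value))
    writer_threads.add(threading.get_ident())

samples = [(f"NODE_{i:02d}", float(i)) for i in range(20)]
run_single_writer(samples, fake_write)
print(len(written), "writes from", len(writer_threads), "thread(s)")
```

Because only the writer thread touches the storage call, the file still sees a single source of writes, in line with the advice above to limit concurrent writers.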
Regarding the Python versions: PC of mine: 3.11.5 Cheers, |
Hi @alkhwarizmi, Three topics: your old trees, your recent posts, and my continuing experiments. Old Trees Your Recent Posts My understanding is that the Windows 10 server is only running 32-bit MDSplus, correct? And that the problems your colleague encountered installing 32-bit and 64-bit MDSplus only apply to the two Win10 client PCs, correct? And that the 32-bit version is from the same MDSplus release as the 64-bit version, correct? I am curious to know what your colleague did to work around the installer issue (that was mentioned in your post of 13-Mar-2024). If you use Windows Explorer to search your entire C:\ drive for My Experiments |
Hello, first some answers to your last post:
yes < And that the problems your colleague encountered installing 32-bit and 64-bit MDSplus only applies to the two Win10 client PCs, correct? yes
yes
Ahh, you mean 7.b... Can't answer now; I think he had to change the system path variable every time he had to switch operations between Python and LabView. However, I also experienced the problem, and my method to overcome it was "wait and pick a newer MDS+ installer" in the hope that it solved the problem, which is in fact what happened. :) So I did not raise anything here, thinking that this problem was already solved (which I think it is). Instead, regarding the original point (old trees), I was about to confirm things and I stumbled upon a very surprising "feature" that could even point to some unexpected problem. Here is how the story goes. I copied the 3 old data files referring to run 1934 from the Windows server, placed them in my local test tree and renamed them appropriately.
To my great surprise, however, the script did actually access all the data correctly, as you can see from the output below.
Later I found out how to reproduce the bug: just run the above commands one by one in the shell!
Moreover, I noticed that sometimes, after a kernel reset, the script behaves unpredictably and generates errors, for example (sometimes there are even more errors):
My conclusion is that maybe something in the Python interface is not working as intended, or that network parameters (like timeouts) affect the communication. I see a possible bad interaction among different multi-thread or multi-process libraries. Let me know what you think, and thanks again for the support! Cheers, |
Hi @alkhwarizmi -- Thank you for the additional information! You have found an important clue. I will install the Spyder IDE and see if I can reproduce the problem. Now for the details . . . Old Trees New Conjecture Workaround
|
Hello, this seems to make sense to me. For the moment, the options at our site are to run the scripts in a non-interactive environment (no ipython) or to use the "run" command. We also have the option to upgrade the software from 7.50 to a more recent version. I think this is recommended, but it is not really the source of the problem. In the long run, however, I also think that it is an opportunity for the MDS+ community to upgrade mdsip to support multi-threaded applications. Sincerely, |
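The "non-interactive" option can be sketched as launching the script in a fresh Python process, outside IPython/Spyder. This uses only the standard library. `run_noninteractive` and the file name `read_run.py` are hypothetical, and the one-line script body is a placeholder for a site's real read script.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_noninteractive(script_path, timeout_s=60):
    """Run a script in a new, plain Python interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,  # inspect returncode ourselves instead of raising
    )

# Demonstration with a placeholder script; a real script would open the tree
# and read the nodes of interest.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "read_run.py"
    script.write_text("print('read ok')\n")
    result = run_noninteractive(script)

print(result.returncode, result.stdout.strip())
```

Each call starts from a clean interpreter, so no state left over from an earlier interactive session or kernel can affect the read.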
Hi @alkhwarizmi, You are correct that You are also correct that upgrading your site's Windows server to a new version (instead of the existing Even though you have identified the root cause of the problem, I will keep this issue open until I do the following:
|
Actually mdsip does support a kind of multi-threaded access. The thing is that it had to be backwards compatible, and hence one could not simply create a connection per thread automatically. Nonetheless, this would still require server-side support for system-wide ranged file locking (like OFD locks), as mdsip may spawn processes to serve multiple connections. edit: according to the TestDevice's comment:
The copy is used to transform a global tdi context into a local context. The way it is used here would not create a thread private connection. Anyway, mdstcpip/mdsipshr/mdsipthreadstatic.h still suggests that we may have thread private connections. |
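As a general illustration of what "thread private connections" means, here is a minimal Python sketch using `threading.local`. It does not show how mdsip or the MDSplus Python API implements this. `FakeConnection`, `get_connection` and the host name `server.example` are hypothetical stand-ins.

```python
import threading

class FakeConnection:
    """Hypothetical stand-in for a remote connection; not an MDSplus class."""
    def __init__(self, host):
        self.host = host

_local = threading.local()  # each thread sees its own attributes on this object

def get_connection(host, factory=FakeConnection):
    """Return this thread's connection to host, creating it on first use."""
    conns = getattr(_local, "conns", None)
    if conns is None:
        conns = _local.conns = {}
    if host not in conns:
        conns[host] = factory(host)
    return conns[host]

seen = {}

def worker(name):
    first = get_connection("server.example")
    second = get_connection("server.example")
    seen[name] = (first, first is second)  # same object within one thread

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct = len({id(conn) for conn, _ in seen.values()})
print(distinct, "distinct connections for", len(seen), "threads")
```

Within a thread the connection is reused, while different threads never share one, which is the property a thread-private connection is meant to provide.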
Hi @zack-vii -- Thanks for the additional details. I will follow up with Josh. |
Client = Ubuntu 20 with recent MDSplus; Server = same. A quick test with ipython3 was able to do Will repeat this experiment on Windows. |
Dear all,
I don't feel comfortable debugging the following puzzling condition we recently discovered at our site. The problem is not blocking so far. For unknown reasons our control software (mainly written in Labview) keeps going. However, I find this quite alarming.
I need some help to debug and fix it, if possible.
Thank you very much in advance.
Sincerely,
Affiliation
Eurac Research, Institute for Renewable Energy, Heat Pumps and Energy Exchange Laboratories
Version(s) Affected
Client side: 'stable_release_7.132.0' (my PC), 'alpha_release_7.139.8' (my colleague's PC)
Server Side: 7.50.1 (Windows), 7.132.0 (GNU/Linux)
Platform
Client side Windows, server side Windows and GNU/Linux
Describe the bug
We recently discovered that access to some past run data seems to be broken.
To Reproduce
Steps to reproduce the behavior:
From my colleague's PC towards Windows server:
Today from my PC towards Windows server, same run number:
Another example also tested with jTraverser, same run number different node (WFRM00 is broken also in the jTraverser):
Today from my PC towards GNU/Linux server, which is OK (yesterday it was not, if I am not mistaken):
Expected behavior
Like the third example in all conditions.
Screenshots
Access via thin client (new jTraverser seems to provide different results)
Additional context