Some calling routines do not properly handle SsINTERNAL, C_ERROR, and/or FALSE returned by low-level routines #2741
Comments
If the goal of The issue with the current code is that it masks errors by treating |
SsINTERNAL is used to flag a situation that requires internal handling. I think it is used in the remote access code of treeshr to indicate that a remote operation failed because x is not supported, so the code will try method y... something like that.
That said, I agree that ok, error, true, and false should be consolidated to follow one scheme. But earlier attempts failed because it was either unclear whether the methods are used by users (so we cannot change behaviour), or it was not always clear whether 0 means success or false. We ended up replacing 0, 1, and -1 with the names you listed in order to help with that. However, no one dared to switch true/false to success/error or vice versa because we were afraid we would break code. Today we have a better separation of public and private headers, so it's worth checking whether the former concerns still hold.
Hi @zack-vii -- Thanks for the history. If you would prefer to discuss this via a Zoom meeting, let me know and I'll arrange to be available at whatever time is convenient for you. (There are times when a complex topic is easier to handle with a discussion than with the written word.) Reverting to the written word, I will try again to communicate the issue. In my opinion, there is a difference between what we expect There are three separate issues here.
Affiliation
MIT PSFC
Version(s) Affected
all
Platform
all
Describe the bug
The entire MDSplus code base uses different standards for error codes.
- -1 to indicate an error (i.e., access errno for the code)
- C_OK (=0) and C_ERROR (= -1)
- TRUE (=1) or FALSE (=0)
- B_TRUE (=1) or B_FALSE (=0)
- SsINTERNAL (= -1)
- MDSplusSUCCESS and MDSplusERROR
- (there are many more too)
Often the status returned by a function is immediately checked with one of the following "define" macros.
- IS_OK
- STATUS_OK
- IS_NOT_OK
- STATUS_NOT_OK
However, the above macros only work with MDSplus codes, which consist of a 32-bit integer containing 3 fields (facility ID, message ID, and severity code). Note that the low-order bit is important. Success = 1 in the bit, Failure = 0 in the bit.
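A minimal sketch of the low-order-bit test described above, assuming the macros reduce to checking bit 0 of the status. The names SKETCH_IS_OK, SKETCH_IS_NOT_OK, and sketch_low_bit_says_ok are hypothetical and exist only for illustration; the real MDSplus macros may be defined differently.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the status macros. Assumption: the real
 * macros only inspect the low-order bit of the status value. */
#define SKETCH_IS_OK(status) ((((uint32_t)(status)) & 1u) == 1u)
#define SKETCH_IS_NOT_OK(status) (!SKETCH_IS_OK(status))

/* Returns 1 when the low-order bit marks the value as success, else 0.
 * The check knows nothing about where the value came from, so any odd
 * integer passes, including a non-MDSplus value such as -1. */
static int sketch_low_bit_says_ok(int32_t status)
{
  return SKETCH_IS_OK(status) ? 1 : 0;
}
```

Under this assumption, a status of 1 is accepted and 0 is rejected, but -1 (all bits set) is also accepted because its low-order bit is 1.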
The problems arise when non-MDSplus codes are passed to the macros, specifically the -1 of SsINTERNAL, C_ERROR, or an operating system error status. The value -1 = 0xFFFFFFFF has the low-order bit set, so the macros incorrectly treat -1 as a flavor of success. This can cause error handling code to produce results different from what was intended when the code was originally written.
This mix of error code conventions is a systemic source of errors in the entire code base. It would probably be worthwhile to at least make sure that all routines that return SsINTERNAL or C_ERROR are in turn called by functions that properly detect those return values and then map them into the appropriate MDSplus error codes.
To Reproduce
This is an example that was spotted during the investigation of Issue #2731 and PR #2740. The low-level routine send_bytes() can return SsINTERNAL, and potentially it can ripple up the call stack all the way to the top level of mdstcl.
The following links are to lines of code starting in send_bytes() and then climbing up the call stack.
send_bytes()
SendMdsMsgC()
SendArg()
- bypasses error handling
GetAnswerInfoTS() can also return SsINTERNAL
- bypasses error handling
ServerSendMessage()
- bypasses error handling
ServerDispatchAction()
ServerDispatchPhase()
- bypasses error handling
TclDispatch_phase()
- bypasses error handling
Because the macros treat SsINTERNAL as success, the error bypasses several error handling sections in the call chain. (The analysis of GetAnswerInfoTS is not shown above, but is similar to that for SendArg.)
Expected behavior
Any function that can return SsINTERNAL, C_ERROR, or other non-MDSplus codes should be followed by a status check that also includes those values. In the above example, the following changes should be made.
- if (STATUS_OK) becomes if ((status != SsINTERNAL) && STATUS_OK)
- if (STATUS_NOT_OK) becomes if ((status == SsINTERNAL) || STATUS_NOT_OK)
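The two guarded checks can be sketched as small helper functions. This is only an illustration under stated assumptions: SKETCH_SsINTERNAL uses the value -1 quoted in this issue, SKETCH_LOW_BIT_OK stands in for the existing low-order-bit macros, and sketch_status_ok / sketch_status_not_ok are hypothetical names, not functions in the MDSplus source.

```c
#include <stdint.h>

/* Value quoted in this issue; the name is a hypothetical stand-in. */
#define SKETCH_SsINTERNAL (-1)

/* Assumption: the existing macros test only the low-order bit. */
#define SKETCH_LOW_BIT_OK(status) ((((uint32_t)(status)) & 1u) == 1u)

/* Guarded success check: reject the -1 sentinel first, then apply the
 * usual low-order-bit test. */
static int sketch_status_ok(int32_t status)
{
  return (status != SKETCH_SsINTERNAL) && SKETCH_LOW_BIT_OK(status);
}

/* Guarded failure check: the exact logical negation of the above. */
static int sketch_status_not_ok(int32_t status)
{
  return (status == SKETCH_SsINTERNAL) || !SKETCH_LOW_BIT_OK(status);
}
```

Because C_ERROR is also -1 according to the list in this issue, the same guard would catch it as well; other non-MDSplus values, such as positive errno codes, would still need their own handling.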
The above example is the fifth or so instance of this bug that I have seen in the MDSplus source code. There are undoubtedly more bugs like this lurking in the source code. However, they are likely in seldom exercised error handling code sections. (Bugs like this that affected the mainstream workflow would have been reported by customers years ago and are surely already fixed.)
Screenshots
n/a
Additional context
There are three strategies for dealing with this issue.
- "Don't Propagate": Find every use of SsINTERNAL and C_ERROR in the source code. Create a list of all the associated functions. Then create a second list of all routines that call the functions on the first list. Make sure that all functions on the second list correctly process SsINTERNAL and C_ERROR and only return MDSplus error codes. This is the safest approach, but also the most time consuming to implement.
- [...] SsINTERNAL and C_ERROR. A simple and easy change, but riskier because it is a global change.
- Change SsINTERNAL to be -2 (i.e. 0xFFFFFFFE) so that if it is inadvertently passed into the macros, it will be treated as a failure and not as a success. (This is the same suggestion made in the comments on PR 2740, Fix: reduce open files due to dispatcher #2740 (comment).) Also simple and easy, but risky because it is a global change.
The "Don't Propagate" strategy should be feasible because the number of functions is fairly small.
- SsINTERNAL occurs 36 times, but in just 13 files
- C_OK occurs 23 times in 14 files
- C_ERROR occurs 67 times in 14 files
- B_TRUE occurs 20 times in 12 files
- B_FALSE occurs 36 times in 13 files
- FALSE and TRUE are too numerous to inspect each occurrence
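The effect of giving SsINTERNAL the value -2 can be checked with a short sketch. It assumes the status macros test only the low-order bit; SKETCH_IS_OK, SKETCH_SsINTERNAL_OLD, SKETCH_SsINTERNAL_NEW, and the two helper functions are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Assumption: the existing macros test only the low-order bit. */
#define SKETCH_IS_OK(status) ((((uint32_t)(status)) & 1u) == 1u)

#define SKETCH_SsINTERNAL_OLD (-1) /* 0xFFFFFFFF, low-order bit set   */
#define SKETCH_SsINTERNAL_NEW (-2) /* 0xFFFFFFFE, low-order bit clear */

/* Exposes the 32-bit pattern of a status value for inspection. */
static uint32_t sketch_bits(int32_t status)
{
  return (uint32_t)status;
}

/* Returns 1 if the low-order-bit test would accept the value, else 0. */
static int sketch_is_ok(int32_t status)
{
  return SKETCH_IS_OK(status) ? 1 : 0;
}
```

With -2 the low-order-bit test falls through to the failure path without any change to the macros. C_ERROR and operating system statuses of -1 would be unaffected by this change, so they would still need separate handling.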