Incorrect interpretation of Quality flag unsigned v signed int #37
Looks like it could be handled with a special case here (lines 672 to 678 in d4ab3b3).

Agree. This is the query in the example above:

from drms.client import Client
c = Client()
res = c.query('hmi.B_720s[2017.09.06_05:40:00_TAI-2017.09.06_06:30:00_TAI]', ['T_REC', 'TELESCOP', 'QUALITY'])

However, as @samaloney pointed out, if we query for records with a QUALITY value less than zero, then the same T_REC shows up:

res = c.query('hmi.B_720s[2017.09.06_05:40:00_TAI-2017.09.06_06:30:00_TAI][? QUALITY < 0 ?]', ['T_REC', 'TELESCOP', 'QUALITY'])

So QUALITY should be cast as

res['QUALITY'][0].astype('int64')
> 3221225472
res['QUALITY'][0].astype('int32')
> -1073741824

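The two cast results above are the same 32 bits read in two ways. As a minimal sketch in plain Python (the helper names `to_signed32` and `to_unsigned32` are made up for illustration and are not part of drms), the reinterpretation can be written as:

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as two's-complement signed."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in an unsigned 32-bit integer")
    # If the highest bit is set, the signed reading is value - 2**32.
    return value - 0x100000000 if value & 0x80000000 else value


def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned."""
    if not -0x80000000 <= value <= 0x7FFFFFFF:
        raise ValueError("value does not fit in a signed 32-bit integer")
    # Masking with 32 one-bits maps negative values back to 0..2**32-1.
    return value & 0xFFFFFFFF


print(to_signed32(3221225472))     # -1073741824
print(to_unsigned32(-1073741824))  # 3221225472
```

The two numbers are the values shown in the `astype` example above; nothing here depends on how drms stores the column internally.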
The Python client just converts string representations of integers into actual integers (without any sign conversion). In the case of HMI QUALITY keywords, the strings contain (unsigned) hexadecimal integers and need some special handling, because Pandas apparently only supports converting strings containing decimal numbers.

The QUALITY keyword for HMI data defines a number of flags that can be represented by an unsigned 32-bit integer, where each bit represents a certain condition. For example: Having the highest bit ( This means that one can check for missing data with
or for images where an eclipse has occurred with
An extensive list of additional HMI Level 1.5 flags is available at: http://jsoc.stanford.edu/jsocwiki/Lev1qualBits

Unfortunately, JSOC introduced an inconsistent way of querying this keyword by interpreting the QUALITY bit field as signed (two's-complement) 32-bit integers, while returning unsigned hexadecimal numbers as a result. The signed number used in queries does not really have any special meaning; it is still just a combination of bits, where setting the highest bit results in the number being negative, which is why you can select all records with existing images using

@mbobra already pointed out that you could convert the QUALITY using the
But I'm not sure if it would be a good idea to do this by default. As far as I can see, the only reason for this would be to check with
and
if the
and
respectively, which makes more sense in my opinion, because the

Another issue is that this could also introduce some other inconsistencies: The actual results from JSOC are unsigned integers, which you can see by skipping the conversion for the QUALITY keyword:
Here the last entry is consistent with
while the negative signed integer would also result in a negative hexadecimal number:
Maybe we could just update the documentation and point out the inconsistency between

What do you think?

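To make the bit-field view concrete, here is a small sketch of parsing a hexadecimal QUALITY string and testing individual flags with masks. It assumes the raw string looks like `0xc0000000` (an assumption about the format), the helper names are hypothetical, and the two flag values are the ones named elsewhere in this thread.

```python
QUAL_NODATA = 0x80000000        # highest bit, as named in this thread
QUAL_LOWINTERPNUM = 0x00010000  # as named in this thread


def parse_quality(text: str) -> int:
    """Parse a hexadecimal QUALITY string into an unsigned integer."""
    return int(text, 16)  # base 16 accepts an optional "0x" prefix


def has_flag(quality: int, flag: int) -> bool:
    """Return True if every bit of `flag` is set in `quality`."""
    return (quality & flag) == flag


quality = parse_quality("0xc0000000")
print(quality)                               # 3221225472
print(has_flag(quality, QUAL_NODATA))        # True
print(has_flag(quality, QUAL_LOWINTERPNUM))  # False

# The unsigned value round-trips to the same hexadecimal string, while
# the signed reading of the same bits gives a negative hexadecimal number.
print(hex(quality))          # 0xc0000000
print(hex(quality - 2**32))  # -0x40000000
```

Checking flags with masks works the same way on the unsigned value regardless of how the server compares the number in queries.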
Thanks for the quick response @mbobra and @kbg! I probably should have linked to the original issue that started all this, which was to ignore/skip missing data when trying to download JSOC data using the SunPy FIDO client, which is built on drms (sunpy/sunpy#3735). I remembered our previous conversation about adding

Yeah, this was the source of a lot of the confusion: I didn't see how an unsigned 32-bit integer could be less than zero when doing something like

What operations are supported inside

For example, let's say I want to filter out missing data:

res = c.query(
'hmi.B_720s[2017.09.06_05:40:00_TAI-2017.09.06_06:30:00_TAI][? QUALITY > -2147483648 ?]',
['T_OBS', 'TELESCOP', 'INSTRUME', 'QUALITY']
)
print(res)
                     T_OBS TELESCOP      INSTRUME     QUALITY
0  2017.09.06_05:36:04_TAI  SDO/HMI  HMI_COMBINED           0
1  2017.09.06_05:48:04_TAI  SDO/HMI  HMI_COMBINED           0
2  2017.09.06_06:00:04_TAI  SDO/HMI  HMI_COMBINED       65536
3                  MISSING  SDO/HMI       MISSING  3221225472

Now I know why this happens: the additional flag

res = c.query(
'hmi.B_720s[2017.09.06_05:40:00_TAI-2017.09.06_06:30:00_TAI][? QUALITY > -1073741824 ?]',
['T_OBS', 'TELESCOP', 'INSTRUME', 'QUALITY']
)
print(res)
                     T_OBS TELESCOP      INSTRUME  QUALITY
0  2017.09.06_05:36:04_TAI  SDO/HMI  HMI_COMBINED        0
1  2017.09.06_05:48:04_TAI  SDO/HMI  HMI_COMBINED        0
2  2017.09.06_06:00:04_TAI  SDO/HMI  HMI_COMBINED    65536

As by definition for HMI

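The difference between the two query results can be reproduced locally. The sketch below assumes that the server compares QUALITY as a signed 32-bit integer, as described in this thread; it only emulates that comparison on the four QUALITY values shown above and does not contact JSOC. The function names are hypothetical.

```python
import struct


def as_signed32(value: int) -> int:
    """Read the 32 bits of an unsigned integer as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def select_greater_than(qualities, threshold):
    """Emulate `[? QUALITY > threshold ?]` with a signed comparison."""
    return [q for q in qualities if as_signed32(q) > threshold]


# QUALITY values of the four records returned by the first query.
qualities = [0, 0, 65536, 3221225472]

print(select_greater_than(qualities, -2147483648))
# [0, 0, 65536, 3221225472]
print(select_greater_than(qualities, -1073741824))
# [0, 0, 65536]
```

The record with QUALITY 3221225472 reads as -1073741824 when signed, so it passes the first threshold and fails the second.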
A specific question: how do I query for HMI data that does not have the QUAL_NODATA (0x80000000) or QUAL_LOWINTERPNUM (0x00010000) bits set, but may have other bits set?
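One possible workaround, sketched here as client-side filtering after the query rather than as a server-side answer to the question, is to combine the two flags into a mask and keep only values where none of the masked bits are set. The names `REJECT_MASK` and `is_acceptable` are hypothetical, and the sample values are the QUALITY numbers from the output earlier in this thread plus one made-up value with an unrelated bit set.

```python
QUAL_NODATA = 0x80000000
QUAL_LOWINTERPNUM = 0x00010000

# Bits that must NOT be set; all other bits are ignored.
REJECT_MASK = QUAL_NODATA | QUAL_LOWINTERPNUM


def is_acceptable(quality: int, reject_mask: int = REJECT_MASK) -> bool:
    """Return True if none of the bits in `reject_mask` are set."""
    return (quality & reject_mask) == 0


# 0x00000400 is a made-up example of "some other bit" being set.
qualities = [0, 0, 65536, 3221225472, 0x00000400]
kept = [q for q in qualities if is_acceptable(q)]
print(kept)  # [0, 0, 1024]
```

With a pandas result, the same test should be expressible as a boolean mask along the lines of `res[(res['QUALITY'] & REJECT_MASK) == 0]`, assuming the QUALITY column holds integers.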
So it looks like drms is interpreting the quality flag as an unsigned int, but the backend service views it as a signed 32-bit int.
So if I add
[? QUALITY >= 0 ?]
I would expect to get the same results?
OK, what if I try
[? QUALITY < 0 ?]
The only way I can make sense of this is if QUALITY is a signed 32-bit int, as
and