-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more helpful error when a non-existent id is passed to extractors #3052
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works well for me, I have one clarifying question but LGTM!
@@ -128,8 +128,18 @@ def ids_to_indices( | |||
indices = np.arange(len(self._main_ids)) | |||
else: | |||
assert isinstance(ids, (list, np.ndarray, tuple)), "'ids' must be a list, np.ndarray or tuple" | |||
|
|||
non_existent_ids = [id for id in ids if id not in self._main_ids] | |||
if non_existent_ids: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if non_existent_ids: | |
if not np.all(np.isin(ids, self._main_ids)): | |
non_existent_ids = ids[~np.isin(ids, self._main_ids)] | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem to work, I really don't think it will make a differnce in performance fo this kind of sizes:
ids
[0, 1, 2]
self._main_ids
array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
'13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23',
'24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34',
'35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45',
'46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56',
'57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67',
'68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78',
'79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89',
'90', '91', '92', '93', '94', '95', '96', '97', '98', '99'],
dtype='<U2')
np.isin(ids, self._main_ids)
array([ True, True, True])
Do you want me to push in that direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a very myserious behavior:
np.isin([0, 1], np.asarray(["0", "1", "2", "20"]))
array([False, False])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, Sam, as this behavior is buyg in numpy let's keep the list based thing, it is half a millisecond for a reasonable call:
And I also profiled the case with 1000 channels and it is around 1 millisecond. This means we will have to make thousands of such calls to make a dent.
Once the numpy thing is fix we can profile and change.
Sorry man for the time you lost on this really small details but at least you will be a numpy contributor! |
Je vis pour le débat, mon ami |
As in the title, related to
#3049
Looks like this: