"compute_all_importances_cy" data type mismatch #3

Open
babak-kananpour opened this issue May 29, 2022 · 5 comments

@babak-kananpour

Thanks for this great library. I followed the instructions in the readme.md file and ran the setup.

I get the following error when running the notebook "DataScope-Demo-1.ipynb":

...\datascope\importance\shapley.py in compute_shapley_1nn_mapfork(distances, utilities, provenance, units, world, null_scores, simple_provenance)
    219     n_test = distances.shape[1]
    220     null_scores = null_scores if null_scores is not None else np.zeros((1, n_test))
--> 221     all_importances = compute_all_importances_cy(unit_distances, unit_utilities, null_scores)
    222 
    223     # Aggregate results.

datascope/importance/shapley_cy.pyx in datascope.importance.shapley_cy.compute_all_importances_cy()

ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'

It seems there is a type mismatch when calling the compute_all_importances_cy function. It expects an integer but receives a float (double?).

I tried modifying compute_all_importances_cy in shapley_cy.pyx, but I could not fix the bug.
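
For context, both type names in the error message are integer types: as far as I know, Cython's NumPy declarations map `int_t` to C `long`, while the array that arrived holds C `long long` values. The sketch below uses only the standard library and is not part of datascope; `c_int_widths` and `same_width` are hypothetical helper names. It reports the width of both C types on the current platform. On 64-bit Windows, C `long` is typically 4 bytes while `long long` is 8, which would be consistent with the mismatch appearing there and not on Linux.

```python
import ctypes


def c_int_widths():
    """Return the byte widths of C long and C long long on this platform."""
    return {
        "long": ctypes.sizeof(ctypes.c_long),
        "long long": ctypes.sizeof(ctypes.c_longlong),
    }


def same_width(widths):
    """True when C long and C long long have the same size."""
    return widths["long"] == widths["long long"]


widths = c_int_widths()
print(widths)
print("long and long long match:", same_width(widths))
```

If the two widths differ, a buffer declared with one type cannot accept an array of the other without a cast.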

@xzyaoi
Member

xzyaoi commented Jun 16, 2022

Hi, I just ran the notebook again and did not encounter the same issue. Could you please try again with the latest notebook from the readme file? If the error persists, I can look deeper into this.

Best regards,

@babak-kananpour
Author

Hi @xzyaoi, I still have the same problem with the new version. In the demo notebook, the error occurs when this line runs:
importances = importance.fit(X_train_dirty, y_train_dirty).score(X_test, y_test)
To work around the problem, I used the pure-Python function "compute_all_importances" in "datascope/importance/shapley" instead of the Cython version "compute_all_importances_cy".
The variables "unit_distances, unit_utilities, null_scores" are floats, which is correct, but shapley_cy.pyx expects these variables to be integers.
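
The workaround described above, falling back to the pure-Python routine when the compiled one rejects its inputs, could be sketched roughly as follows. `call_with_fallback`, `fast_impl`, and `slow_impl` are hypothetical names used only for illustration; in datascope the two callables would presumably be `compute_all_importances_cy` and `compute_all_importances`.

```python
def call_with_fallback(fast, slow, *args):
    """Try the compiled routine first; fall back on a buffer dtype mismatch."""
    try:
        return fast(*args)
    except ValueError as exc:
        # Only handle the specific Cython buffer error; re-raise anything else.
        if "Buffer dtype mismatch" not in str(exc):
            raise
        return slow(*args)


# Hypothetical stand-ins for the Cython and pure-Python implementations.
def fast_impl(values):
    raise ValueError("Buffer dtype mismatch, expected 'int_t' but got 'long long'")


def slow_impl(values):
    return sum(values)


print(call_with_fallback(fast_impl, slow_impl, [1, 2, 3]))
```

This keeps the fast path on platforms where it works, at the cost of hiding the underlying type mismatch rather than fixing it.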

@xzyaoi
Member

xzyaoi commented Jun 17, 2022

@babak-1990 Interesting, I still cannot reproduce this error, even with a newly created colab environment (see https://colab.research.google.com/drive/1RdArqm0ZpYR_Tq5rKMDu8U7KxsgNEhNl#scrollTo=8b974636-7c3e-4b82-8401-ff541a47a002).

I now think this is due to your local compiler, which may treat np.int differently (are you on Windows or macOS?). I have found a possible solution: eragonruan/text-detection-ctpn#380

However, I don't have a Windows PC at hand. Could you please try changing np.int to np.int64 (or np.int32, if np.int64 does not work) on this line: https://github.com/easeml/datascope/blob/main/datascope/importance/shapley_cy.pyx#L30 After recompiling, it should work.

If it works, please let me know so I can release a stable fix for this. If it doesn't, please feel free to reach out as well!

Best regards,
Xiaozhe
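
On the Python side, one way to sidestep platform-dependent integer widths is to cast integer arrays to a fixed-width dtype before they reach compiled code. This is a minimal sketch, assuming the Cython buffer is declared with the matching fixed-width type (for example `np.int64_t`); `as_int64` is a hypothetical helper, not a datascope function.

```python
import numpy as np


def as_int64(arr):
    """Return arr as a C-contiguous array with a fixed 64-bit integer dtype."""
    return np.ascontiguousarray(arr, dtype=np.int64)


units = as_int64([0, 2, 1])
print(units.dtype, units.dtype.itemsize)
```

A fixed-width dtype such as np.int64 has the same size on every platform, which is the motivation for preferring it over np.int in buffer declarations.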

@babak-kananpour
Author

Hi @xzyaoi, I had already read this potential solution and tried to fix the problem by assigning a different DTYPE, but it didn't work. My local OS is Windows, and you are right that this is due to my local compiler. Changing this line https://github.com/easeml/datascope/blob/main/datascope/importance/shapley_cy.pyx#L30 won't fix the problem, because the error happens before entering the function compute_all_importances_cy in /datascope/importance/shapley_cy.pyx. However, I gave it a try to be sure.

At the time, I thought that maybe changing the lines


and
ctypedef np.float_t DTYPE_t

would fix the problem, but it didn't.

@xzyaoi
Member

xzyaoi commented Jun 19, 2022

@babak-1990, thanks for your reply! I can reproduce this error with a Windows setup. I am working on a fix, and it should be ready soon. I will post an update here when I make progress :)

xzyaoi self-assigned this Jun 20, 2022