-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug in explain_forecast()
#425
Conversation
…ich was present in shapley_est data.table. Added id and horizon column. 2. Updated the print to remove both explain_idx and horizon from the printouts for forecast. Maybe we want the horizons, but then we need to rewrite the print function as horizon does not have a standard deviation. 3. Fixed that created matrix did not take into consider that the number of rows is not equal to n_explain when forecasting, but rather n_explain times horizon.
To see that bug no longer occurs, run:
|
…he shap column names with and without none.
I am confident that the test that fails now are actually correct, in the sense that the output of the old runs were incorrect. Meaning that @martinju can accept the changes to the test output and merge. |
explain_forecast()
when verbose = "shapley"
explain_forecast()
In this PR, we fix 2 main bugs in
explain_forecast()
:A: Using
verbose = "shapley"
, caused the function to crash.B: Inconsistencies related to the number of coaltions used and shapley values computed with
explain_forecast()
.Bug A stemmed from :
There was an inconsistency in that
internal$iter_list[[iter]]$dt_shapley_est
contained the index columns"explain_idx", "horizon"
whileinternal$iter_list[[iter]]$dt_shapley_sd
did not. We added the index columns to the latter.When creating a data.table to print with the iterative Shapley values estimates with standard deviations, the code did not take into consideration that the number of rows is not
n_explain
when doing forecast, but rathern_times
xhorizon
. This is now fixed by using the number of rows ininternal$iter_list[[iter]]$dt_shapley_est
instead.To be consistent with regular Shapley value, we remove
"explain_idx", "horizon"
when printing the iterative Shapley value estimates and standard deviation. In regular Shapley we omit"explain_idx"
.Note. It might be a good thing to include
"explain_idx"
(and"horizon"
), but then the print function needs to be rewritten to reflect that"explain_idx"
(and"horizon"
) is not an estimate with a standard deviation.Note 2. There is an inconsistency between
explain
andexplain_forecast
, where the former useexplain_id
, while the latter useexplain_idx
.Bug B was fixed by defining the number of coaltions to be the number of coalitions used per horizon. We also introduced a new parameter
n_sampled_coalitions
which for explain_forecast() is the number of coalitions per horizon -2 (zero and full prediction), and for regular explain() it is the number of coalitions. A number of additional minor fixed related to the number of shapley values computed with/without grouping was also added. As a side-effect this also fixed the incorrect verbose printout forexplain_forecast()
.PS.
print_dt <- as.data.table(matrix(paste(matrix1, " (", matrix2, ") ", sep = ""), nrow = nrow(matrix1)))
produce extra whitespace at the end. Should change to, e.g.,
print_dt <- as.data.table(matrix(paste0(matrix1, " (", matrix2, ")"), nrow = nrow(matrix1)))
.