MergeError when executing "woebin" function #70
Comments
@kendalvictor I think you have to downgrade pandas to 0.25.0. But, before you downgrade, in Pandas
Hi @Okroshiashvili, the solution was to downgrade pandas to 1.1.3. Ideally, though, this error should be addressed in a new version of this library, since its "woebin" function currently does not work with pandas 1.2.0.
I think it's not surprising to run into a version incompatibility. I hope the maintainers will solve this problem, but until then, if your problem is solved, please close this issue :)
Solved after changing the pandas version from 1.2.0 to 1.1.3.
The bug should be fixed. Please check the latest version on GitHub.
I am still having this problem with pandas 1.3.4. Is there a new workaround?
Please install the latest version from GitHub and try again. It should be fixed.
Hi,
A few days ago, after updating the pandas library to version 1.2.0, the "woebin" function of scorecardpy version '0.1.9.2' stopped working.
Calling it now raises the following error:
MergeError Traceback (most recent call last)
in
----> 1 cortes = sc.woebin(
2 data[
3 (data[col_target].notnull())
4 ].drop(
5 [col for col in data.columns if 'target' in col and col != col_target] + col_no_review,
C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy\woebin.py in woebin(dt, y, x, var_skip, breaks_list, special_values, stop_limit, count_distr_limit, bin_num_limit, positive, no_cores, print_step, method, ignore_const_cols, ignore_datetime_cols, check_cate_num, replace_blank, save_breaks_list, **kwargs)
956 print(('{:'+str(len(str(xs_len)))+'.0f}/{} {}').format(i, xs_len, x_i), flush=True)
957 # woebining on one variable
--> 958 bins[x_i] = woebin2(
959 dtm = pd.DataFrame({'y':dt[y], 'variable':x_i, 'value':dt[x_i]}),
960 breaks=breaks_list[x_i] if (breaks_list is not None) and (x_i in breaks_list.keys()) else None,
C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy\woebin.py in woebin2(dtm, breaks, spl_val, init_count_distr, count_distr_limit, stop_limit, bin_num_limit, method)
720 if method == 'tree':
721 # 2.tree-like optimal binning
--> 722 bin_list = woebin2_tree(
723 dtm, init_count_distr=init_count_distr, count_distr_limit=count_distr_limit,
724 stop_limit=stop_limit, bin_num_limit=bin_num_limit, breaks=breaks, spl_val=spl_val)
C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy\woebin.py in woebin2_tree(dtm, init_count_distr, count_distr_limit, stop_limit, bin_num_limit, breaks, spl_val)
482 '''
483 # initial binning
--> 484 bin_list = woebin2_init_bin(dtm, init_count_distr=init_count_distr, breaks=breaks, spl_val=spl_val)
485 initial_binning = bin_list['initial_binning']
486 binning_sv = bin_list['binning_sv']
C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy\woebin.py in woebin2_init_bin(dtm, init_count_distr, breaks, spl_val)
274
275 # dtm $ binning_sv
--> 276 dtm_binsv_list = dtm_binning_sv(dtm, breaks, spl_val)
277 dtm = dtm_binsv_list['dtm']
278 binning_sv = dtm_binsv_list['binning_sv']
C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy\woebin.py in dtm_binning_sv(dtm, breaks, spl_val)
113 # sv_df = sv_df.assign(value = lambda x: x.value.astype(dtm['value'].dtypes))
114 # dtm_sv & dtm
--> 115 dtm_sv = pd.merge(dtm.fillna("missing"), sv_df[['value']].fillna("missing"), how='inner', on='value', right_index=True)
116 dtm = dtm[~dtm.index.isin(dtm_sv.index)].reset_index() if len(dtm_sv.index) < len(dtm.index) else None
117 # dtm_sv = dtm.query('value in {}'.format(sv_df['value'].tolist()))
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
72 validate=None,
73 ) -> "DataFrame":
---> 74 op = _MergeOperation(
75 left,
76 right,
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
648 warnings.warn(msg, UserWarning)
649
--> 650 self._validate_specification()
651
652 cross_col = None
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py in _validate_specification(self)
1301 )
1302 if self.left_index or self.right_index:
-> 1303 raise MergeError(
1304 'Can only pass argument "on" OR "left_index" '
1305 'and "right_index", not a combination of both.'
MergeError: Can only pass argument "on" OR "left_index" and "right_index", not a combination of both.
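For anyone who wants to reproduce this outside of scorecardpy, here is a minimal sketch of what seems to be going on at line 115 of woebin.py. The `dtm` and `sv_df` frames below are made-up stand-ins, not the library's real data, and the workaround is only one possible way to keep the original row labels without combining `on` with `right_index`; it is not necessarily what the maintainers did in their fix.

```python
import pandas as pd

# Hypothetical stand-ins for the `dtm` and `sv_df` frames seen in the traceback.
dtm = pd.DataFrame({
    "y": [0, 1, 0, 1],
    "variable": "x",
    "value": ["a", "special", "b", None],
})
sv_df = pd.DataFrame({"value": ["special", None]})

left = dtm.fillna("missing")
right = sv_df[["value"]].fillna("missing")

# Same argument combination as woebin.py line 115: `on` together with `right_index`.
# Per this issue, pandas 1.2.0 rejects it with MergeError; 1.1.3 accepted it.
try:
    pd.merge(left, right, how="inner", on="value", right_index=True)
    combination_accepted = True
except pd.errors.MergeError as err:
    combination_accepted = False
    print("MergeError:", err)

# Possible workaround (an assumption, not the library's code): move the row
# labels into a column, merge on `value` only, then restore the labels.
dtm_sv = (
    left.reset_index()
    .merge(right.drop_duplicates(), how="inner", on="value")
    .set_index("index")
)

# Rows that did not match a special value, selected by label as on line 116.
dtm_rest = dtm[~dtm.index.isin(dtm_sv.index)]

print(dtm_sv)
print(dtm_rest)
```

The `reset_index()` / `set_index("index")` pair keeps the row labels of `dtm` on the matched rows, which is what the following line in woebin.py relies on when it filters with `dtm.index.isin(dtm_sv.index)`.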