Add link to open notebooks in mybinder #190
base: master
Conversation
I think opening the notebooks via mybinder is the easiest way to get started: there is no need to install anything, and the code is executable so you can play around with it.
This could perhaps be combined with #105, which proposes something similar but using another service.
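For illustration only, the README change could boil down to a single Binder badge line. The URL scheme below follows the current mybinder.org convention and is an assumption, not taken from this PR's diff:

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ageron/handson-ml/master)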
Hi @BioGeek, thanks for your contribution! I really love Binder, and I actually started the repo with it activated, but sadly I had to remove it a year ago because it was too unstable at the time (see 002fd66). Moreover, it was pretty hard to make chapter 16 work properly because it requires a headless X server and other tweaks; I really struggled to get a deterministic output. I would need to spend some time testing this again, and unfortunately I don't have much time right now. However, if you would be so kind as to make sure that all the notebooks work well within Binder, especially chapter 16, then I would more than happily merge this PR. Thanks again!
Hi, sorry for commenting here, but somehow I was not able to post this in the right place. It seems the updated code for the end-to-end machine learning project (chapter 2) is not working for the snippet below. I have included the error as well. Sorry, I am very new to machine learning.

cat_attribs = ["ocean_proximity"]

old_cat_pipeline = Pipeline([
    ('selector', OldDataFrameSelector(cat_attribs)),
    ('cat_encoder', OneHotEncoder(sparse=False)),
])

# Join both of the above pipelines into a single pipeline
from sklearn.pipeline import FeatureUnion

old_full_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", old_num_pipeline),
    ("cat_pipeline", old_cat_pipeline),
])

old_housing_prepared = old_full_pipeline.fit_transform(housing)
old_housing_prepared

Error:

File "/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 1809, in _transform_selected
    X = check_array(X, accept_sparse='csc', copy=copy, dtype=FLOAT_DTYPES)
File "/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: 'NEAR BAY'
Hi @Jiltedboy, it looks like you're importing OneHotEncoder from sklearn.preprocessing. That version does not yet support string categorical inputs, so import OneHotEncoder from the future_encoders module provided here: https://github.com/ageron/handson-ml/blob/master/future_encoders.py
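For reference, a minimal sketch of that workaround, assuming future_encoders.py has been downloaded next to the notebook and the cat_attribs list and OldDataFrameSelector class from the earlier snippet are in scope:

from sklearn.pipeline import Pipeline
from future_encoders import OneHotEncoder  # backported encoder that accepts string categories

old_cat_pipeline = Pipeline([
    ('selector', OldDataFrameSelector(cat_attribs)),  # selects the "ocean_proximity" column
    ('cat_encoder', OneHotEncoder(sparse=False)),     # one-hot encodes the string categories
])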
@daniel-s-ingram Ok, got it. But LabelBinarizer() should work; that's also not working when executing this line. Is there no way to use LabelEncoder and OneHotEncoder inside the Pipeline? I thought LabelBinarizer was one option, but it does not seem to work.
Hi @Jiltedboy, one workaround is to wrap the LabelBinarizer like this:

from sklearn.preprocessing import LabelBinarizer

class PipelineFriendlyLabelBinarizer(LabelBinarizer):
    # Accept the (X, y) call signature that Pipeline expects, then flatten X
    # into the 1D array that LabelBinarizer works on.
    def fit_transform(self, X, y=None):
        return super().fit_transform(y=X.ravel())
    def transform(self, X):
        return super().transform(y=X.ravel())

Note that this transformer will only be able to handle one column at a time, because that's one limitation of the LabelBinarizer class.
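As an illustrative sketch (not part of the original comment), the wrapper could then replace the encoder in the earlier categorical pipeline, assuming the same cat_attribs and OldDataFrameSelector:

old_cat_pipeline = Pipeline([
    ('selector', OldDataFrameSelector(cat_attribs)),        # pick the "ocean_proximity" column
    ('label_binarizer', PipelineFriendlyLabelBinarizer()),  # one-hot encode that single column
])

The FeatureUnion would then be rebuilt with this pipeline before calling fit_transform on the housing data.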
No problem, @ageron!
@daniel-s-ingram @ageron Thank you both for your time and replies. It really means a lot to me. And yes, I'm going to use future_encoders.py for now :) Sorry for bothering you :)
@Jiltedboy Is the problem solved?
@rovin235 Yes, the problem is solved.