include conf parameters and specific small data files in kedro package #1607
Comments
Hi @vitorpbarbosa7,
This actually is possible because you could configure |
For example, this seems to work in setup.py:
I think the key might be that you need to specify a full path to anything that's above setup.py in the directory hierarchy (hence the use of |
I just thought of another way of doing this: if you move everything from |
Thank you for your answers, they really helped. What worked for me was moving everything from src into the root of the Kedro project, as you suggested, setting source_dir = ".", and using MANIFEST.in together with include_package_data in setup.py to recursively include the files I wanted. To make this actually work, I also had to create an __init__.py file in the root directory of the whole Kedro project.
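For anyone landing here later, an alternative to MANIFEST.in can be sketched as below. This is not the exact setup used above: it assumes that conf/ and data/06_models/ already sit inside the importable package directory, and the package name my_project and the helper collect_package_data are hypothetical. It builds an explicit file list for the package_data argument of setuptools, whose entries are interpreted relative to the package directory.

```python
from pathlib import Path


def collect_package_data(package_root, subdirs=("conf", "data/06_models")):
    """Return every file under the given subdirs, as POSIX paths relative to package_root.

    Hypothetical helper: the default subdirs mirror the folders discussed in this issue.
    """
    root = Path(package_root)
    found = []
    for sub in subdirs:
        base = root / sub
        if not base.is_dir():
            # Skip folders that do not exist instead of failing the build.
            continue
        for path in sorted(base.rglob("*")):
            if path.is_file():
                found.append(path.relative_to(root).as_posix())
    return found


# Sketch of how this could be used in setup.py ("my_project" is an assumed name):
#
# from setuptools import setup, find_packages
#
# setup(
#     name="my_project",
#     packages=find_packages(),
#     package_data={"my_project": collect_package_data("my_project")},
# )
```

Listing files explicitly avoids relying on recursive glob support in package_data, at the cost of evaluating the file list at build time.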
Awesome, thanks very much for letting us know! Glad to hear it worked.
@datajoely thinks we should consider again whether |
Not by default, but provide a way to do so if they really want to.
I certainly think there's a good argument for this. From a QB user, I hear that some tools need you to provide everything in a single .whl file. Hence I think we need at least an official recommendation on how to get This relates very closely to the question of exactly what |
Easing the option to package conf would definitely be really useful. At the moment there are 2-3 options to package conf, and each needs 2-3 changes; simplifying this would decrease the deviation. Possibly moving |
Also note that setup.py is deprecated and we will have to move away from it in general |
@vitorpbarbosa7 would you be able to explain a bit more why you want to include the model pickles inside the package? |
Hi @vitorpbarbosa7 , the |
You will need to specify the config location via |
Description
The package I'm developing needs to include some parameters from yml files and some pickles from the data/06_models folder.
Context
I'm developing a machine learning pipeline that, when deployed, receives some parameters and the model serialized as a pickle as input.
Possible Implementation
I tried to use package_data, data_files, and package_dir in setup.py, as well as MANIFEST.in, but even with those I could not include the data files and conf yml files in the final wheel.
From my understanding of how setup.py works, it only searches for resources inside the source's parent directory, which is, I think, why the implementations I mentioned are not working.
Possible Alternatives
If it's not possible to refer to files above the source directory in the directory tree, would it be possible to move the data and conf folders into src and change a parameter in Kedro so that it looks for those folders at the new location? (I know this is probably too much of a change to how the framework was thought out.)
I read the documentation which says:
"Recipients of the .egg and .whl files need to have Python and pip on their machines, but do not need to have Kedro installed. The project is installed to the root of a folder with the relevant conf/, data/ and logs/ subfolders, by navigating to the root and calling:"
https://kedro.readthedocs.io/en/latest/tutorial/package_a_project.html
but that's exactly what I do not want. I really need more files to be packed together with source code.
I tried to solve this problem with some of the suggestions here:
https://stackoverflow.com/questions/32609248/setuptools-adding-additional-files-outside-package
But that did not work.