Evaluating CLI command usage #1293
Comments
I have been thinking about this ticket because I think we should base it on how useful these dependencies are for users. Here's how our users have used CLI-commands which have dependencies in the project's
If you look at this table, we can ask some questions like:
|
Additionally @AntonyMilneQB messaged with this comment: With that said, part of Kedro’s current offering is that it simplifies setting up a data science project because it bundles all of the things you need ( If it turns out that we do want to keep some of these commands then I think they really belong in a CLI group together, say So personally I would be very happy if, as part of the CLI cleanup, we could either remove or at least hide away these commands 😄 |
And I'll respond with:
|
Just a quick note on telemetry data for |
Also @yetudada did you deliberately not look at use of |
One more point for consideration. Keeping these requirements in requirements.txt adds a maintenance burden for keeping on top of dependencies, e.g. #1327 #1322. Moving dependencies to dev_requirements.txt would improve the situation, and obviously removing them altogether would remove the maintenance burden completely. |
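To make the dev_requirements.txt idea above concrete, here is a minimal sketch of partitioning a requirements list into runtime and dev-only entries. The `DEV_ONLY` set and the function names are assumptions for illustration only; the thread does not say which packages would actually move.

```python
import re

# Hypothetical set of dev-only tools. black, isort and Flake8 are named in
# this thread; treating pytest as dev-only is an assumption for illustration.
DEV_ONLY = {"black", "flake8", "isort", "pytest"}


def package_name(requirement: str) -> str:
    """Return the lower-cased package name at the start of a requirement line."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0).lower().replace("_", "-") if match else ""


def split_requirements(lines, dev_only=DEV_ONLY):
    """Partition requirement lines into (runtime, dev) lists, preserving order."""
    runtime, dev = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        target = dev if package_name(stripped) in dev_only else runtime
        target.append(stripped)
    return runtime, dev
```

The two returned lists could then be written to requirements.txt and dev_requirements.txt respectively. The sketch only looks at the leading package name, so extras and version specifiers are carried over unchanged.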
Outcome of this ticket would be to check with users whether we can remove the mentioned CLI commands or not. That would give us evidence about whether we need dev-requirements or not. |
I think
|
Summary of survey results + telemetry data
Comments from users about
Comments from users about
Comments from users about
Comments from users about
Comments from users about
Comments from users about
|
Based on the above results my suggestions for next steps are as follows:
|
Thanks for all the research and the most excellent write up of all this! Do you have the number of unique calls to Based on this research, personally I would remove all these commands in favour of my alternative proposal (to follow in next post). If we don't want to go down that route then:
If we do keep some of these commands in kedro core then I think we should consider hiding them away inside a new group (name tbc) so they clutter the CLI less:
|
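As a rough illustration of tucking these commands into a single group, here is a sketch using the standard-library `argparse` as a stand-in for Kedro's actual CLI framework. The group name `dev` is hypothetical (the comment above leaves the name tbc), and the list of commands placed under it is an assumption drawn from commands mentioned elsewhere in this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy CLI where developer tooling lives under one sub-group."""
    parser = argparse.ArgumentParser(prog="kedro")
    top = parser.add_subparsers(dest="command", required=True)

    # Core command stays at the top level.
    top.add_parser("run", help="Run the pipeline")

    # Hypothetical group that collects the tooling commands in one place.
    dev = top.add_parser("dev", help="Developer tooling (hypothetical group)")
    dev_sub = dev.add_subparsers(dest="dev_command", required=True)
    for name in ("lint", "test", "build-docs", "build-reqs"):
        dev_sub.add_parser(name)

    return parser
```

With this layout the top-level help lists only `run` and `dev`, and the tooling commands appear only under `kedro dev --help`.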
Proposal
We strip out all these commands from core kedro, all our current starters, remove packages from requirements.txt, etc. We (or someone else) make a new plugin
Important note: I say "we or someone else" because I think could be a very good candidate for an unofficial community-maintained plugin. Or we could get it started and then hand over, or we could maintain it but be much more relaxed in accepting updates than we are with the core kedro template.
User journey
Note that
Starting a new project:
In existing project:
Pros
References that inspired this idea: #826; #844 (reply in thread); kedro-org/kedro-starters#40
Cons
Overall I think both the above can be addressed by documentation and deprecation warnings. |
@AntonyMilneQB Thanks for sharing and elaborating on your thoughts! I like the idea of moving the commands we want to keep to a separate plugin if we're keeping a reasonable amount. If we decide we're removing most of them anyway, I'm not sure it's even worth creating this plugin. From your earlier comment it sounded like you'd only really want to keep |
Just to clarify: if we (or someone else) made this plugin, I think it could still contain the build-docs, build-reqs and test functionality. If it's not part of the core kedro package I feel much more relaxed about putting in functionality that is deemed useful for software engineering best practice but not commonly used at the moment. Also, slightly simpler than a whole plugin: my original thought was that this could be done with just a custom (ideally community-maintained) starter that would provide:
I still think this is fine, just doing it as a plugin is slightly neater since that way you also get a nice |
Thanks @MerelTheisenQB and @AntonyMilneQB for the research, and excellent summary. |
This is stellar research @MerelTheisenQB 🎉 Thank you so much for this. I've read through your thoughts; thanks for the additions @MerelTheisenQB, @AntonyMilneQB and @NeroOkwa. Here are my summary thoughts:
So with these changes, we still maintain some of our value proposition. But it's clear our users like other parts of Kedro and not necessarily our support of tooling that helps them create better software. |
@AntonyMilneQB--and everybody else in favor of killing I do think a non-trivial part of the value proposition of Kedro is introducing users to and enforcing basic software engineering best practices. Anecdotally, I remember being one of the first people on the client-facing side in QuantumBlack (at least in North America?) to enforce coding standards on projects over 3 years ago. More recently, I think a much larger percentage of projects do so--yes, in parallel to greater adoption of tools like Black and isort by the broader community, but also I think at higher rates due to it being bundled with Kedro. In my humble opinion, Kedro has always been opinionated, and part of the value comes from this opinionated nature. I think removing the However, going so far as to remove linter configuration needs to be considered much more carefully:
When it comes to With regards to making these commands part of an optional plugin, if anything, I would make them part of a plugin installed by default. This doesn't reduce the maintenance burden, and you can't outsource the development of such a core plugin to the broader community, but it does make it optional for those who may have good reason to not want the functionality. Finally, FWIW, I think the set of core tools included as part of the project template is actually pretty good. While I personally would include things like mypy and Prettier, I think the black/isort/Flake8 suite is a good representation of "the bare minimum" to say you're following software engineering best practices on this front, without being too difficult for the majority of users to follow. Outsourced to the community, I'm sure somebody will throw in random things like |
I've created follow up tickets for all actions mentioned in the discussion above:
Please add any further comments to the specific follow ups. |
As per discussions #1077 and #844, we want to separate dev requirements into their own file outside requirements.txt. This means any requirements that are not required for `kedro run` or for running a packaged kedro project (which only exposes `kedro run`). Remember that in v0.18 there is no `kedro install` command.

- … (`kedro build-docs`) to do the right thing
- … (`kedro lint`) do the right thing

Things to check and think about:

- Do `kedro lint`, `kedro build-docs` etc. commands need to run a pip install to install dev requirements? Is this annoying and slow? Is Waylon's idea of pre-commit a better alternative?
- Do `kedro pipeline`, `kedro catalog` have any dependencies outside kedro, and if so where should those dependencies go (requirements or dev requirements)?
- … `kedro build-reqs` do?

Also worth checking whether any kedro requirements should be moved to project requirements, e.g. pip-tools is there but after deprecation of `kedro install` can presumably be moved to a project requirement? Do we need jupyter_client in kedro requirements? etc.