Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Reinstallation of models does not work as expected #253

Open
edwardchalstrey1 opened this issue May 18, 2022 · 14 comments · May be fixed by #346
Open

[BUG] Reinstallation of models does not work as expected #253

edwardchalstrey1 opened this issue May 18, 2022 · 14 comments · May be fixed by #346
Labels
bug Something isn't working
Milestone

Comments

@edwardchalstrey1
Copy link
Contributor

Describe the bug

Previous attempts #223 & #249 have not resulted in expected behaviour.

How To Reproduce

First, install a model from scivision catalog, where model_repo is the link to. the pip installable GitHub repo.

load_pretrained_model(model_repo)

Then, modify the code in the model_repo, push changes to GitHub and reinstall in scivision:

load_pretrained_model(model_repo, allow_install="force")

Expected behaviour

Changes made to the model code should be present when the model is used after reinstallation.

Currently, you have to manually do pip uninstall model_package_name and then re-run load_pretrained_model for changes to be picked up.

@edwardchalstrey1 edwardchalstrey1 added the bug Something isn't working label May 18, 2022
@edwardchalstrey1 edwardchalstrey1 changed the title [BUG] [BUG] Reinstallation of models does not work as expected May 18, 2022
@edwardchalstrey1
Copy link
Contributor Author

@ots22 I noticed our previous solution still doesn't quite have the desired effect, I can have a look at this a bit later

@edwardchalstrey1
Copy link
Contributor Author

Participants in a workshop where the aim was to add a model package to the scivision catalog found that when they were editing (new commits) the model package, then using load_pretrained_model <- it didn't automatically reinstall when there were newer commits of the model package (github repo). We should probably make this the case to help developers. Ideally, we probably don't want to force them to remember to have to set the allow_install flag.

Can we make it so the behaviour of load_pretained_model is:

  1. Checks if the package exists in the python environment and installs if not
  2. If is package already exists, checks the url for new commits and installs if so

Perhaps the only way to make the above work is if the model developer is careful to increment their package version with each new commit (or PR, feature, bug fix etc) - then this might be easier to get pip to make sure it installs the latest version (if installing the latest GH commit is not a feature pip supports), but we could then ensure the model developer docs are clear that this is what you should do

@ots22
Copy link
Member

ots22 commented Oct 6, 2022

Is the problem that the PRs were buggy, or that we now want different behaviour?

Actually, is load_pretrained_model without allow_install ever useful? The distinction between alllow_install=True and "force" is also not so clear to me now. The explicit opt-in to a regular python function messing with your installation does seem like a good idea, but maybe the whole function could be renamed to make this clearer.

Can the desired behaviour be achieved just with the various arguments to pip? This seems preferable to doing any kind of version checking and reinstallation of packages ourselves.

@edwardchalstrey1
Copy link
Contributor Author

edwardchalstrey1 commented Oct 6, 2022

The "force" option means that the package will be re-installed, even if it already exists (as will dependencies) - what I think I found previously (testing this now) was that this didn't result in any newer version of the package being installed if it existed, despite the fact we're giving pip these args: pip_install_args=["--force-reinstall", "--no-cache-dir"] - Also it really slows the process down

Instead, perhaps we could simply do pip install <package_name> --upgrade behind the scenes when calling load_pretrained_model (and get rid of allow_install completely)

@edwardchalstrey1
Copy link
Contributor Author

The explicit opt-in to a regular python function messing with your installation does seem like a good idea, but maybe the whole function could be renamed to make this clearer.

hmmm, good point, well we could separate into 2 separate functions, or even just encourage to do pip install directly... guess this comes down to the idea of scivision having minimum lines of code

@edwardchalstrey1
Copy link
Contributor Author

edwardchalstrey1 commented Oct 6, 2022

In fact, if you don't provide the allow_install flag and the package isn't already installed, we raise this exception - so perhaps this should be the way to install a model?

Screenshot 2022-10-06 at 11 22 39

@edwardchalstrey1
Copy link
Contributor Author

edwardchalstrey1 commented Oct 6, 2022

^ could even have a function like

get_install_code('https://github.com/alan-turing-institute/scivision_classifier')

which simply prints out:

Copy the following line of code into a new notebook cell (or remove the "!" and run in a terminal window):
!pip install -e git+https://github.com/alan-turing-institute/scivision_classifier@main#egg=scivision_classifier

@ots22
Copy link
Member

ots22 commented Oct 6, 2022

Some other things that seem like advantages of installing packages 'outside' scivision:

  • It makes the responsibility for package installation clearer, especially if something goes wrong.
  • Error reporting is more familiar (directly from pip, rather than via our function)
  • Less surprising to users (a function that messes with packages/environments from within python is a bit unusual)
  • It would remove some code from scivision, and let us focus on other things

Disadvantage:

  • means there is a 'manual' installation step (and get_install_code doesn't work in a script)

@ots22
Copy link
Member

ots22 commented Oct 6, 2022

As things are now, whether we auto-install or not, we still have to load the model yml file and import the right package (it just might not be necessary to install it automatically), so that stays the same I think

@ots22
Copy link
Member

ots22 commented Oct 6, 2022

In fact, if you don't provide the allow_install flag and the package isn't already installed, we raise this exception - so perhaps this should be the way to install a model?

It seems like in pretty much all of the examples, we can't assume the model is already installed, so need this argument anyway.

@ots22
Copy link
Member

ots22 commented Oct 6, 2022

The "force" option means that the package will be re-installed, even if it already exists (as will dependencies) - what I think I found previously (testing this now) was that this didn't result in any newer version of the package being installed if it existed, despite the fact we're giving pip these args: pip_install_args=["--force-reinstall", "--no-cache-dir"] - Also it really slows the process down

Instead, perhaps we could simply do pip install <package_name> --upgrade behind the scenes when calling load_pretrained_model (and get rid of allow_install completely)

The "force" way definitely seems like a kludge. I wonder why we didn't think of --upgrade before, and why the various pip arguments don't get the latest version?

@edwardchalstrey1
Copy link
Contributor Author

Agree with your thoughts, I'll work on a PR for this and can see what people think

@ots22
Copy link
Member

ots22 commented Aug 29, 2023

Did #533 fix this?

@IFenton

@IFenton
Copy link
Contributor

IFenton commented Aug 29, 2023

I haven't tested it, but I wouldn't expect removing force reinstall to mean that model updates are installed correctly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
4 participants