Fix all the models #266
Can we get PRs for the hotfixes?
I suggest we also rationalize the examples. We could remove a few as redundant or as not adding much value (in the last case). Also, we could rename some of them.
Just for reference, what's the approach here? Are you following the steps in FluxML/FluxML-Community-Call-Minutes#9? So, is this to track hot fixes to get things running, and later we'll refresh the files to use the latest packages in the ecosystem?
Yes, steps as mentioned in FluxML/FluxML-Community-Call-Minutes#9. This issue is to keep track of the test results, but I plan to note down other issues found along the way. For the models that fail to run, I do update the dependencies before writing a fix. And if a package is the source of the problem, I intend to replace it with the latest ecosystem package. Take for example the cifar10 model: it cannot be used with the newest Flux release due to its dependency on Metalhead for the dataset, but with MLDatasets it can actually be tested on Flux 0.11.4.
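For illustration, a minimal sketch of that swap (assuming the MLDatasets and Flux APIs of the 0.11 era; exact calls should be checked against the pinned versions):

```julia
# Sketch: loading CIFAR-10 from MLDatasets instead of Metalhead's bundled
# dataset, using the MLDatasets API available around Flux 0.11.4.
using MLDatasets, Flux

# 32×32×3×50000 array of Float32 pixels, plus a vector of integer labels 0-9
train_x, train_y = MLDatasets.CIFAR10.traindata(Float32)
test_x,  test_y  = MLDatasets.CIFAR10.testdata(Float32)

# One-hot encode the labels for use with a crossentropy loss
train_labels = Flux.onehotbatch(train_y, 0:9)

# Batch iterator feeding (image, label) pairs to Flux.train!
loader = Flux.Data.DataLoader((train_x, train_labels), batchsize = 128, shuffle = true)
```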
I wouldn't remove those models; they are some of the better maintained ones, compared to how some others have bloated argument handling that takes away from the core of the problem being solved. Removing those as redundant seems like the better approach, no?
I do agree, some of the "simpler" models could actually serve as great examples of a "my first model". But I do think we should have that discussion in the coordination tracker issue, and keep this issue about the quick fixes to get the models running again.
The other/flux-next model is a bit problematic. It is rather old and seems to explain a 'new' API for the optimisers, but this no longer works. It probably needs a full rewrite, but then it wouldn't really add anything beyond the official Flux docs. I would suggest just removing it from the zoo, or does anyone think it still has a place here?
For now, we can leave it be; it will be part of the Optimisers.jl release. It's not meant as a guarantee, but maybe adding a readme saying so would be alright.
Are you sure? It does not seem to match the API from the Optimisers.jl package (it uses a step! function).
It's from a Flux PR that we plan to merge, but it's a breaking release, so we may not do it immediately.
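For comparison, the explicit-state interface that Optimisers.jl exposes looks roughly like the sketch below; this is a non-authoritative example (the `step!` API the flux-next tutorial describes is the older proposal), and the exact calls should be checked against the released package:

```julia
# Sketch of the explicit-state optimiser interface from Optimisers.jl,
# as opposed to the step!-based API described in other/flux-next.
using Flux, Optimisers

model = Chain(Dense(2 => 8, relu), Dense(8 => 1))

# setup builds an optimiser state tree mirroring the model's structure
state = Optimisers.setup(Optimisers.Descent(0.1), model)

x, y = rand(Float32, 2, 16), rand(Float32, 1, 16)
grads = gradient(m -> Flux.Losses.mse(m(x), y), model)

# update returns both a new state and an updated model
state, model = Optimisers.update(state, model, grads[1])
```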
Can we add an item in the tracker about adding guiding texts with the models that talk about:
cc @darsnack ^
Sure, did you mean zoo tutorials?
Basically having text pointing users to how to use certain Flux features, and also guiding them through the different problems being solved (recurrent nets, basic features and regression, custom train loops, etc.). If we can use the Literate.jl format, that would stand us in good stead to automate moving them to the site.
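As a rough illustration of the Literate.jl format (a hypothetical tutorial file; `Literate.markdown` is the real conversion entry point):

```julia
# # Linear regression with Flux
# Prose lines start with `# `; Literate.jl turns them into markdown text,
# while plain lines like the ones below become executable code cells.

using Flux

W, b = rand(Float32, 1, 3), zeros(Float32, 1)
predict(x) = W * x .+ b

# To publish this hypothetical `tutorial.jl` on the site:
#     using Literate
#     Literate.markdown("tutorial.jl", "output/")
```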
We can remove contrib/games/cartpole and pendulum; curated Flux-based examples are available in https://github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl.
Those are not there for the RL; they are there to show what the ∂P equivalent would look like. So I would update them per the literature and maybe add them as CI examples, because we should make sure recent changes still support this.
Now that all the models are tested, I created a new issue (#280) to start giving some direction to the next steps and address the last comments made here.
We still need to update Metalhead, I take it.
My comment wasn't meant to skip steps. I'm just trying to prevent doing work now that will be flushed down the drain later in the process.
Sure, I was just making sure it's noted as a pending item that's blocking some of the updates.
Metalhead doesn't put an upper bound on Flux; also, I have released patches to both. We might want to consolidate the environments and only have one at the top level; it's hard to keep track otherwise.
Do you mean a top-level Project and Manifest file for all the models/tutorials? That could become problematic when one of the tutorials depends on a package that is not compatible with the newest Flux/CUDA. It could hold back updating all the models until that external package has been made compatible. Transformers.jl is an example: if one tutorial depended on it, everything would be held back.
That hasn't been a big issue, since we only need to activate the env, and updating the global env is usually painless, whereas individual envs usually become inconsistent very quickly and have the same problem of holding back updates, with the added annoyance of having to update many of them for every minor change in a dependency.
We used to have this pattern before, and that's where this learning comes from.
I agree that a single Project.toml and Manifest.toml is better. Allowing certain tutorials to fall out of compatibility with the latest Flux is not what we want. Every tutorial should run on the latest Flux, or else the tutorial should be fixed. The flow should be "Release Flux" --> "Update zoo" --> "Release zoo."
My only concern is that a user would need to install lots of packages to run one specific example. Maybe it is better to have a script that automates bumping Flux for all the tutorials; a sketch follows.
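Something like this could do it (a hypothetical, untested sketch, assuming each model keeps its own Project.toml):

```julia
# Hypothetical helper: walk every directory that carries its own
# Project.toml and bump Flux inside that environment.
using Pkg

for (root, dirs, files) in walkdir(".")
    if "Project.toml" in files
        Pkg.activate(root)
        Pkg.update("Flux")   # resolves Flux up to the env's compat bounds
        Pkg.instantiate()    # ensure the refreshed Manifest is installable
    end
end
```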
Yeah, mine too, but that's still not too bad. For instance, things like CUDA would dominate the space/network usage for most users, and adding packages doesn't take too long these days.
It depends; we would end up doing some benchmarking as well, and seeing changes with individual PRs with #278.
What's the last column for?
I really like the MNIST models, so please do not remove them if you can help it. The reason being, they are probably the only models that can run reasonably on a CPU; other vision models need GPUs for good accuracy. I got this expectation from the MNIST examples.
Ideally, any model in the zoo that can run on GPU should also converge just as well on CPU. I don't think the MNIST models are going anywhere either! As for the autoencoder example, we should probably have both dense and conv examples. That may be redundant with the VAE models, though.
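To make the "runs reasonably on a CPU" point concrete, a minimal MLP along the lines of vision/mnist/mlp might look like this (a sketch using the Flux 0.11-era API and MLDatasets; hyperparameters are illustrative):

```julia
# Minimal sketch of a "my first model": an MLP on MNIST that trains
# in minutes on a CPU (Flux 0.11-era API).
using Flux, MLDatasets

x, y = MLDatasets.MNIST.traindata(Float32)
x = Flux.flatten(x)                       # 28×28×60000 -> 784×60000
y = Flux.onehotbatch(y, 0:9)

model = Chain(Dense(784, 32, relu), Dense(32, 10))
loss(x, y) = Flux.Losses.logitcrossentropy(model(x), y)

data = Flux.Data.DataLoader((x, y), batchsize = 128, shuffle = true)
Flux.train!(loss, Flux.params(model), data, ADAM())
```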
I started testing all the models in this repo against the Flux 0.11.4 release on Julia 1.5 (ref FluxML/FluxML-Community-Call-Minutes#9). In this issue I will collect all encountered problems and try to list all other open issues for each model. The tests are run against my fork, which contains the updated manifests and hotfixes needed to actually run some of the models.
👍 = runs without errors on Julia 1.5 + Flux 0.11.4 and shows decent results (converging loss, OK-looking generated images, ...)
❕ = does not run on Julia 1.5 + Flux 0.11.4 or gives wrong results, but a PR with a fix is available
❗ = does not run on Julia 1.5 + Flux 0.11.4 or gives wrong results
- all models: switched from Parameters.jl to `Base.@kwdef` (ref mauro3/Parameters.jl#89). None of the models seem to depend on features not provided by the Base implementation; see the sketch after this list.
- vision/cdcgan_mnist: 👍
- vision/cifar10: 👍
- vision/cppn: 👍
- vision/dcgan_mnist: 👍
- vision/lenet_mnist: 👍 (could use `Flux.outputsize` instead of manually calculating the output size of the layers; see the sketch after this list)
- vision/mnist/mlp: 👍
- vision/mnist/conv: 👍
- vision/mnist/autoencoder: 👍
- vision/vae_mnist: 👍
- text/char-rnn: 👍
- text/lang-detection: 👍
- text/phonemes: 👍
- text/treebank: 👍
- other/housing: 👍
- other/iris: 👍
- other/fizzbuzz: 👍
- other/flux-next: ❗ (relies on the `step!` optimiser API)
- other/bitstring-parity: 👍
- tutorial/60-minute-blitz: 👍
- tutorial/transfer_learning: ❕
- contrib/audio/speech-mlstm: ❗
- contrib/games/cartpole: ❗
- contrib/games/pendulum: ❗
- contrib/games/trebuchet: ❗
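As referenced in the list above, the `Base.@kwdef` change boils down to defining the hyperparameter structs with the Base macro instead of Parameters.jl's `@with_kw`. A minimal sketch (field names are illustrative, not taken from a specific script):

```julia
# Hyperparameters as a keyword-constructed struct via Base.@kwdef,
# in place of Parameters.jl's @with_kw.
Base.@kwdef mutable struct Args
    lr::Float64    = 3e-3   # learning rate
    batchsize::Int = 128    # batch size
    epochs::Int    = 10     # number of epochs
end

args = Args(lr = 1e-3)      # override one field, keep the defaults for the rest
```

And for the lenet_mnist note, `Flux.outputsize` (available in newer Flux releases) can infer the feature count feeding the first Dense layer instead of computing it by hand. A hedged sketch with LeNet-like layers (not the exact zoo model):

```julia
using Flux

# LeNet-style convolutional front end
convs = Chain(
    Conv((5, 5), 1 => 6, relu), MaxPool((2, 2)),
    Conv((5, 5), 6 => 16, relu), MaxPool((2, 2)),
)

# Shape of the conv stack's output for a 28×28 grayscale input
sz = Flux.outputsize(convs, (28, 28, 1, 1))   # (4, 4, 16, 1)

model = Chain(convs, Flux.flatten, Dense(prod(sz[1:3]), 10))
```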