-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feat] Add dynamic data and caching to data manager #398
Conversation
# Conflicts: # vizro-core/docs/pages/user-guides/data.md
for more information, see https://pre-commit.ci
|
||
Unlike static data, dynamic data cannot be supplied directly into the `data_frame` argument of a `figure`. Instead, it must first be added to the Data Manager and then referred to by name. | ||
|
||
!!! example "Dynamic data" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a terrible example of using dynamic data because the iris.csv file is static and we have a static png anyway so you can't see anything update live.
We should come up with a better example and an animated gif to show off the live update but not in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understand that we won't update the example in this PR, but before I forget - when we add an example, some weather forecast dashboard might be nice. Then, we can also show them how to refresh the data daily.
https://openweathermap.org/api
Or some model interaction example, where we update the model output after each run (this might be a bit more complex though)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea maybe for this PR still could be to make this function time dependant, and filter the DF for even or odd data given the hour of the day. Then at some point we could make a nice real example with stock market data, weather data or the likes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a quick thing for now I will do something where it just plots the iris dataset but with different random points selected or something like that. And then in the future we can make this a more interesting example like the weather where we call some external API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in acb546a - see what you think.
@maxschulz-COL you don't know of any reason why we wouldn't be able to host iris.csv
on RTD and download it as done here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I also haven't tried it. But I like the new approach. The only thing I am unsure about is the following:
Should the code example really have pd.read_csv
(I know it is to make it clear you are fetching from a file) or actually px.data.iris()
, ie something that will always work?
I kinda prefer the former, but with a message similar to the one we already have, ie that here we kinda simulate the dynamic case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had this done with px.data.iris()
originally and then decided that wasn't clear enough because it's too far removed from a "real" working example where the data can change. So better to use pd.read_csv
even though that means downloading a file to get the example working.
The download works ok on the automatically built docs for this PR so presumably it will also work with the released docs 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WOOOOOOOOOW!!! 🚀 🔥 🦸♂️
I haven't properly tested this out, as I plan to do that after the unit tests have been added. But let me know if you still want me to test this prior to having all the required tests in.
Just wanted to say, that I love the documentation! Wrote down some notes for myself 👩🎓 Not only the user guides but also the documentation and code examples left in the docstrings of the relevant classes. It was very clear and easy to understand how I need to configure static/dynamic data and cache 👍
I have a few questions, but I think we can all discuss them during TL 👥 📚
|
||
Unlike static data, dynamic data cannot be supplied directly into the `data_frame` argument of a `figure`. Instead, it must first be added to the Data Manager and then referred to by name. | ||
|
||
!!! example "Dynamic data" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understand that we won't update the example in this PR, but before I forget - when we add an example, some weather forecast dashboard might be nice. Then, we can also show them how to refresh the data daily.
https://openweathermap.org/api
Or some model interaction example, where we update the model output after each run (this might be a bit more complex though)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me. I made a couple of minor suggestions based on my pedantry about data updates -- see what you think. I'll make a vale update on the branch if that's OK. (Vale isn't running at present as I need to finish PR #391 with @maxschulz-COL but it's in the codebase)
for more information, see https://pre-commit.ci
Co-authored-by: Jo Stichbury <[email protected]> Signed-off-by: Antony Milne <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approval from my side, given the pending docs discussion.
for more information, see https://pre-commit.ci
For the documentation purpose (as requested), here is a set of cases I tested:
|
@antonymilne - just note: The PR initially couldn't be merged automatically because they failed the Snyk tests. I've double-checked and most of them are low security issues connected to the newly added requirements redis, flask-cache etc. I've marked the snyk tests as passing on snyk, such that you can merge but we might want to come back to the requirements at some point or decide whether these snyk tests make sense at all, especially the one on flask-caching we might have to double-check again. |
@huong-li-nguyen ah, thank you for merging - I was actually battling snyk and trying to figure out what to do the last couple of hours though. The result is #417. |
Description
Vizro's most overdue PR is finally here! I will make some nice diagrams and explain this more in a learning session, but for now the best way to understand this change is to forget everything you currently know about the data manager and read the new docs: https://vizro--398.org.readthedocs.build/en/398/pages/user-guides/data/.
Reviewers
Don't be scared by the number of files changed. Most of this is find and replace in the docs. The only Python code that's changed substantially is data_manager.py, and even that is mainly lengthy comments. Don't worry too much about poring over the code. What's important to review here is:
Things to test @petar-qb @l0uden
TODO in this PR
request
should be possible but need to checkpage.build
, seefeat/data-manager-enhancements
. Overriding*args, **kwargs
should also be possible. Do have access to Dashctx
in data loading function.timeout=0
dynamic data. Seefeat/data-manager-enhancements
.TODO in future PRs
.
in id and other characters?). Note need for things that appear in page title though. Limit dataset names tovalid_chars = set(string.ascii_letters + string.digits + "_.")
so works with flask cachingNotice
I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":