Support for Barplots #179

Open
EricESeverson opened this issue Apr 14, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@EricESeverson

My intended use case is for visualizing data that gives counts of various states over a number of time snapshots. I want to show a barplot where the bar heights are a certain recorded snapshot, and to have a slider that lets me range over all recorded snapshots to see how these counts change over time.

Maybe I missed a way to do this with the library, but I didn't see any specific compatibility with matplotlib barplots. It would be nice to get some of the extra features, like easily saving an animation, that I wouldn't get from just that matplotlib.widgets Slider object.

@EricESeverson EricESeverson added the enhancement New feature or request label Apr 14, 2021
@ianhi
Collaborator

ianhi commented Apr 14, 2021

Hi @EricESeverson that sounds like a reasonable use case and certainly belongs in this library. Ideally all of the pyplot functions would have slider generating equivalents, but I haven't gotten to them all yet.

A few questions for you:

  1. Are you using plt.bar to generate your static barplots?
  2. Do you have code that you use to animate barplots? If so we can use that as a basis for the function for this library.
  3. Are you interested in making a PR adding this functionality? If you are I'm happy to help you through it, if not then no worries.

Thoughts on implementing iplt.bar
Unfortunately the returned object from plt.bar doesn't have a set_data method (https://matplotlib.org/stable/api/container_api.html#matplotlib.container.BarContainer) which always makes animating these things a bit more annoying (for example the implementation of plt.hist in this library is certainly not complete)
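
To make the point about BarContainer concrete, here is a minimal sketch of updating bar heights without a set_data method. The helper name set_bar_heights is hypothetical and not part of matplotlib or this library; the sketch only relies on BarContainer being iterable over its Rectangle patches.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def set_bar_heights(bars, heights):
    """Hypothetical helper: give each rectangle in a BarContainer a new height."""
    if len(bars) != len(heights):
        raise ValueError("need exactly one height per bar")
    for rect, h in zip(bars, heights):
        rect.set_height(h)


fig, ax = plt.subplots()
bars = ax.bar(["A", "B", "C"], [3, 1, 2])  # returns a BarContainer

new_heights = [5, 0, 4]
set_bar_heights(bars, new_heights)
ax.set_ylim(0, max(new_heights) * 1.1)  # keep the tallest bar in view
```

This only changes heights; widths, positions, colors and error bars would each need similar handling, which is why starting with the simple case seems reasonable.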

Also it looks as though bar can be super flexible https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.bar.html#examples-using-matplotlib-axes-axes-bar so it may be a bit of a bear to fully support updating bars in all those different ways. So I propose starting simple with what I imagine are the basic use cases of a standard bar plot and then adding more as people want them or if someone has the energy to.

It will probably also be helpful to borrow good chunks of the code from https://github.com/matplotlib/matplotlib/blob/a051169c1b51a36c270b7b696ec86fae8f2f23e8/lib/matplotlib/axes/_axes.py#L2252

(I really should write a basic guide on how to add a new function - explaining how all the kwarg handling happens etc)

@EricESeverson
Author

  1. No, so far I've been using the seaborn library https://seaborn.pydata.org/generated/seaborn.barplot.html
  2. The code is still not as general-purpose as I would like. If I generate a simple bar plot from an array of labels and array of heights, then I loop over all the rect objects in ax.patches and manually change all their heights, which seems to work fine.
    But for static images seaborn can do some nice stuff like making nested barplots that are great for visualizing the count of something that's the cartesian product of two different variables. What I want to be able to do better is make a plot like this and still be able to change all the heights. The difficulty is being sure exactly how the heights are ordered, so that I know the mapping from data to heights. It's pretty likely that the best general purpose answer is doing everything from scratch with matplotlib, but I do like that seaborn has some very pretty defaults and talks to pandas dataframes nicely.
  3. Sure. A robust animated barplotter would accomplish most of the work I'm still trying to figure out for my exact animation task, so it sounds strictly better to make that part of things available. And I would very much appreciate help through it, as this project I've been working on has been a big python learning experience.

@EricESeverson
Author

Another thing I want for my exact use case is to make this barplot before the simulation data has been generated, then have it update live while the simulation is running, with the slider left over to go through the data that has been generated.
The %matplotlib widget backend seems to be handling that part ok, but ideally there would be better concurrency where I could use a slider to go through the data that's already been generated while new datapoints are still being added to the end.

And I have a toggle for the y-axis scale to change to 'symlog' to visualize if counts are 0 or close to 0. This seems like it would be easy for your package to do.
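
For reference, a minimal sketch of such a toggle in plain matplotlib. The function name toggle_yscale is hypothetical; it only uses Axes.get_yscale and Axes.set_yscale, and how it would be wired to a button or checkbox is left open.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def toggle_yscale(ax):
    """Hypothetical helper: flip the y-axis between 'linear' and 'symlog'."""
    new_scale = "linear" if ax.get_yscale() == "symlog" else "symlog"
    ax.set_yscale(new_scale)
    return new_scale


fig, ax = plt.subplots()
ax.bar(["A", "B", "C"], [0, 1, 1000])  # counts spanning several orders of magnitude

first = toggle_yscale(ax)   # switches away from the default linear scale
second = toggle_yscale(ax)  # switches back
```

'symlog' is linear in a small region around zero, so bars with a count of 0 can still be drawn, unlike with a plain 'log' scale.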

@ianhi
Collaborator

ianhi commented Apr 15, 2021

and talks to pandas dataframes nicely.

One thing to note is that matplotlib can actually do this as well! You just need to use the data argument; see https://matplotlib.org/stable/users/prev_whats_new/whats_new_1.5.html#working-with-labeled-data-like-pandas-dataframes
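
A small sketch of what that looks like for a bar plot. The column names "state" and "count" are made up for illustration, and a plain dict stands in for a DataFrame here so the example has no pandas dependency; a DataFrame with the same column names should be usable the same way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Any object that supports data["name"] lookups works, e.g. a dict or a DataFrame.
snapshot = {"state": ["A", "B", "C"], "count": [4, 0, 7]}

fig, ax = plt.subplots()
# The strings are looked up in `snapshot` instead of being plotted literally.
bars = ax.bar("state", "count", data=snapshot)
```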

ideally there would be better concurrency where I could use a slider to go through the data that's already been generated while new datapoints are still being added to the end.

This may actually be really tricky to achieve due to the current limitations of how widgets work. If your simulation is running continuously then the kernel will always be busy, and it won't be able to process the changes to the slider. You can see the issue with this simplified example. If you move the slider around before the sleep is done then the messages won't immediately print out.

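import time

import ipywidgets as widgets
from IPython.display import display  # available by default in notebooks; imported for clarity

out = widgets.Output()

@out.capture()
def cb(change):
    # print the new slider value into the Output widget
    print(change['new'])

slider = widgets.IntSlider()
slider.observe(cb, names='value')
display(slider)
display(out)

for i in range(5):
    # faking doing lots of work for the simulation; the kernel is busy during each sleep
    time.sleep(2)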

Although if you are running from a script and using matplotlib sliders it may actually work better - I'm not totally sure. The only way I can be sure you'd be able to have this happen live is to do something with threads, but that can get out of hand pretty easily.
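
As a rough illustration of the threads idea, here is a standard-library-only sketch: a worker thread appends snapshots to a lock-protected store while the main thread stays free. SnapshotStore and run_simulation are hypothetical names, the "simulation" is just a sleep, and nothing here touches widgets or plotting.

```python
import threading
import time


class SnapshotStore:
    """Hypothetical thread-safe, append-only list of bar-height snapshots."""

    def __init__(self):
        self._lock = threading.Lock()
        self._snapshots = []

    def append(self, heights):
        with self._lock:
            self._snapshots.append(list(heights))

    def __len__(self):
        with self._lock:
            return len(self._snapshots)

    def get(self, index):
        with self._lock:
            return list(self._snapshots[index])  # copy so callers cannot mutate the store


def run_simulation(store, n_steps, delay=0.01):
    """Stand-in for the real simulation: produce one snapshot per step."""
    for step in range(n_steps):
        time.sleep(delay)  # faking doing lots of work
        store.append([step, 2 * step, 3 * step])


store = SnapshotStore()
worker = threading.Thread(target=run_simulation, args=(store, 5), daemon=True)
worker.start()
# The main thread is not blocked here, so a slider callback could call
# store.get(i) for any i < len(store) while the worker keeps appending.
worker.join()  # only so this sketch finishes deterministically
```

Matplotlib itself is not thread-safe, so in a design like this the worker would only produce data and all drawing would stay on the main thread.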

And I have a toggle for the y-axis scale to change to 'symlog' to visualize if counts are 0 or close to 0. This seems like it would be easy for your package to do.

I sure hope so! If it ends up not being easy, please either open an issue or ask for help on https://discourse.matplotlib.org/c/3rdparty/18

@ianhi
Collaborator

ianhi commented Apr 15, 2021

No, so far I've been using the seaborn library https://seaborn.pydata.org/generated/seaborn.barplot.html

It looks as though that uses ax.bar internally and doesn't do anything too fancy with the patches; mostly it seems to manage converting the way you supply data from what seaborn expects to what matplotlib expects. So you may not get the same nice coloring as easily, but that function should be almost entirely replicable once we have a good iplt.bar.

But for static images seaborn can do some nice stuff like making nested barplots that are great for visualizing the count of something that's the cartesian product of two different variables.

Yeah seaborn can make some awesome stuff. Unfortunately they don't return the underlying matplotlib objects, so you can't really animate them; see for example (https://github.com/mwaskom/seaborn/blob/10aa7a82130f0560f2b39f857349442f553e5e1a/seaborn/categorical.py#L1554-L1595). My naive guess is that if you want something to be both a complex visualization and animated you're going to end up needing to do one of those two parts manually. I don't think any library really hits both. You could, with some work, probably animate seaborn, but I think you are correct that the easiest thing will be to construct your visualization from pure matplotlib and then animate that (potentially using this library).


The code is still not as general-purpose as I would like. If I generate a simple bar plot from an array of labels and array of heights, then I loop over all the rect objects in ax.patches and manually change all their heights, which seems to work fine.

That sounds like a great start! This is basically what hist does (except I go even more manual and create the patches from scratch). So I'd recommend starting with just this and getting all the integration with the controls object working - and then we can work on changing more than just the heights. Don't let the perfect be the enemy of the good!

If you post what you have so far I can help with how to integrate it into a form that will work with this library.

Sure. A robust animated barplotter would accomplish most of the work I'm still trying to figure out for my exact animation task, so it sounds strictly better to make that part of things available. And I would very much appreciate help through it, as this project I've been working on has been a big python learning experience.

Right on. FWIW making this package has been a huge python learning experience for me, so I totally get it.

As for help through the process my first piece of advice is to not be afraid to post partially working code here, or to open an unfinished PR. It's way easier to discuss these things when we're both looking at the same thing.

I also have two resources for learning how to contribute to this library:

  1. https://mpl-interactions.readthedocs.io/en/stable/Contributing.html
  2. The hot off the presses: How to add an interactive function - create a docs page #180

(I just wrote #180, inspired by this thread. My hope is that it can make some of the internals clear to someone who would like to contribute. If you end up reading that and find that anything is unclear or could be improved, please let me know!)

@EricESeverson
Author

Sorry about the delay, I have been caught up with getting my project up and running on PyPI.
The plotting code I have is functional, but maybe still not as elegant as I would like. The biggest thing I would actually like to change now is to get to where I could share my example notebooks on colab. But given that colab doesn't support any of the interactive matplotlib backends, the StatePlotter objects that I have don't work, and it doesn't seem like mpl-interactions would work there either.

If you post what you have so far I can help with how to integrate it into a form that will work with this library.

My StatePlotter class is tied into the Simulator class, and most of the code is involved with manipulating data structures in these classes. The actual bar plot logic is just a couple lines. The labels of the bars come from a field categories, then I instantiate the plot with

self.ax = sns.barplot(x=[str(c) for c in self.categories], y=np.zeros(len(self.categories)))

and then the update function creates an array heights that comes from Simulation data, and updates the plot with the lines

        for i, rect in enumerate(self.ax.patches):
            rect.set_height(heights[i])

This seems to be a pretty standard way to do it, as in this Stack Overflow post.

My update function gets called by the Simulator periodically while the simulation itself is running. Then afterward the Simulator class has a method which makes a slider widget that calls the update function to show stored values from the simulation.
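
For anyone following along, the slider part of that workflow could look roughly like the sketch below. It is not the StatePlotter code: it uses plain ax.bar rather than sns.barplot, a hard-coded list of snapshots in place of Simulator data, and matplotlib.widgets.Slider rather than a notebook widget.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

# Made-up stand-ins for the stored simulation data.
categories = ["A", "B", "C"]
snapshots = [[1, 0, 2], [2, 1, 1], [0, 3, 4]]

fig, (ax, slider_ax) = plt.subplots(2, 1, gridspec_kw={"height_ratios": [8, 1]})
bars = ax.bar(categories, snapshots[0])
ax.set_ylim(0, max(max(s) for s in snapshots))  # fixed limits so bars are comparable


def update(val):
    """Show the snapshot selected by the slider."""
    heights = snapshots[int(val)]
    for rect, h in zip(bars, heights):
        rect.set_height(h)
    fig.canvas.draw_idle()


slider = Slider(slider_ax, "snapshot", 0, len(snapshots) - 1, valinit=0, valstep=1)
slider.on_changed(update)
```

With an interactive backend, dragging the slider calls update; calling slider.set_val(2) triggers the same callback from code.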

The main thing this still doesn't have that I would like at some point is a good way to get nested barplots. seaborn's barplot function can make these really well with the hue parameter, but if I try to rely on this function, it's not easy to have control over how to manually adjust all the heights in the same way.
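
One possible way around the ordering problem, sketched in pure matplotlib: draw one ax.bar call per hue level and keep each returned BarContainer in a dict, so the mapping from data to rectangles is explicit. The group and hue names, the counts, and the helper update_grouped are all made up for illustration, and this does not reproduce seaborn's styling.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

groups = ["x", "y", "z"]   # first variable (one cluster of bars each)
hues = ["low", "high"]     # second variable (one bar per cluster each)
counts = np.array([[1, 2], [3, 4], [5, 6]])  # rows follow groups, columns follow hues

fig, ax = plt.subplots()
positions = np.arange(len(groups))
width = 0.8 / len(hues)

containers = {}
for j, hue in enumerate(hues):
    # Shift each hue level sideways so the bars in a cluster sit next to each other.
    offset = (j - (len(hues) - 1) / 2) * width
    containers[hue] = ax.bar(positions + offset, counts[:, j], width=width, label=hue)

ax.set_xticks(positions)
ax.set_xticklabels(groups)
ax.legend()


def update_grouped(containers, new_counts):
    """Hypothetical helper: new_counts[i, j] is the height for groups[i], hues[j]."""
    for j, hue in enumerate(hues):
        for rect, h in zip(containers[hue], new_counts[:, j]):
            rect.set_height(h)


update_grouped(containers, np.array([[6, 5], [4, 3], [2, 1]]))
```

Because each container belongs to exactly one hue level, containers["high"][1] is always the bar for group "y" and hue "high", regardless of the order in ax.patches.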
