Replace broken Plotly plots in widget with matplotlib #17
Merged
Changes from 20 commits
23 commits
32c1cb4 changed html widget to dom output widget for rendering matplotlib plots (shwinnn)
fe767e1 amended typo widget->widgets (shwinnn)
e46474f Fixed display of matplotlib plots within output widget (shwinnn)
85e0187 Set plot size in both matplotlib figure and Output widget (zblz)
85786c2 Add plot_pairdensity_mpl method to plotting (zblz)
2a72cbe re-enabled pairdensity plot using output widgets (shwinnn)
ddd4666 added plot_correlation_mpl to create correlation plots in matplotlib (shwinnn)
5efba7b renamed figure, plot area and updated docstring in update_plot (shwinnn)
4767c9b fixed correlation plot maximum width & height (shwinnn)
f18eec9 moved axis ticks & labels, inverted colour map for correlation matrix… (shwinnn)
657aef2 changed colourmap name (shwinnn)
7b3bfba enabled correlation matrix plots to be displayed in output widgets (shwinnn)
a7d39d0 corrected plot sizing in update_plot (shwinnn)
b99b67d fixed the cells in correlation plots to be squares (shwinnn)
dde199b changed number display format on correlation plots (shwinnn)
76163f5 modified aspect ratio for correlation plots displayed in widget (shwinnn)
1157969 Fix alignment of bar chart in plot_distribution (zblz)
692e29b Remove alpha for ax.bar (zblz)
28c2ce5 Formatting and order fixes (zblz)
5a14ff9 enforced PEP8 compliance (shwinnn)
074f0f8 Enforces at least version 6 of ipywidgets (shwinnn)
771e04f Added comment explaining 'magic numbers' for enforcing plot width (shwinnn)
b95118c Simplify correlation plot width computation (zblz)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
import matplotlib.pyplot as plt | ||
from matplotlib.ticker import FuncFormatter, MaxNLocator | ||
import numpy as np | ||
import plotly.graph_objs as go | ||
import seaborn as sns | ||
try: | ||
import plotly.figure_factory as pff | ||
except ImportError: | ||
|
@@ -48,7 +50,8 @@ def plot_distribution(ls, column, bins=None): | |
|
||
fig, ax = plt.subplots() | ||
|
||
ax.bar(edges[:-1], counts, width=np.diff(edges), label=column, alpha=0.4) | ||
ax.bar(edges[:-1], counts, width=np.diff(edges), label=column, | ||
align='edge') | ||
|
||
ax.set_ylim(bottom=0) | ||
|
||
|
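A note on why `align='edge'` matters in this hunk: `ax.bar` centers each bar on its x position by default, so passing the left bin edges `edges[:-1]` without it shifts every bar left by half a bin width. The sketch below is an illustration, not code from this PR; the sample data and the names `centered` and `aligned` are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data, binned the same way a histogram would be.
rng = np.random.default_rng(0)
data = rng.normal(size=500)
counts, edges = np.histogram(data, bins=10)

fig, ax = plt.subplots()

# Default behaviour (align='center'): each bar is centered on its left edge.
centered = ax.bar(edges[:-1], counts, width=np.diff(edges))

# With align='edge', each bar starts at its left edge and spans the bin.
aligned = ax.bar(edges[:-1], counts, width=np.diff(edges), align='edge')

# Left x coordinate of every bar rectangle in each container.
centered_left = np.array([patch.get_x() for patch in centered])
aligned_left = np.array([patch.get_x() for patch in aligned])
```

Comparing `aligned_left` with `edges[:-1]` confirms that only the `align='edge'` bars line up with the bins.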
@@ -60,6 +63,161 @@ def plot_distribution(ls, column, bins=None): | |
return fig | ||
|
||
|
||
def _set_integer_tick_labels(axis, labels): | ||
"""Use labels dict to set labels on axis""" | ||
axis.set_major_formatter(FuncFormatter(lambda x, _: labels.get(x, ''))) | ||
axis.set_major_locator(MaxNLocator(integer=True)) | ||
|
||
|
||
def plot_pairdensity_mpl(ls, column1, column2): | ||
"""Plot the pairwise density between two columns. | ||
|
||
This plot is an approximation of a scatterplot through a 2D Kernel | ||
Density Estimate for two numerical variables. When one of the variables | ||
is categorical, a 1D KDE for each of the categories is shown, | ||
normalised to the total number of non-null observations. For two | ||
categorical variables, the plot produced is a heatmap representation of | ||
the contingency table. | ||
|
||
Parameters | ||
---------- | ||
ls : :class:`~lens.Summary` | ||
Lens `Summary`. | ||
column1 : str | ||
First column. | ||
column2 : str | ||
Second column. | ||
|
||
Returns | ||
------- | ||
:class:`plt.Figure` | ||
Matplotlib figure containing the pairwise density plot. | ||
""" | ||
pair_details = ls.pair_details(column1, column2) | ||
pairdensity = pair_details['pairdensity'] | ||
|
||
x = np.array(pairdensity['x']) | ||
y = np.array(pairdensity['y']) | ||
Z = np.array(pairdensity['density']) | ||
|
||
fig, ax = plt.subplots() | ||
|
||
if ls.summary(column1)['desc'] == 'categorical': | ||
idx = np.argsort(x) | ||
x = x[idx] | ||
Z = Z[:, idx] | ||
# Create labels and positions for categorical axis | ||
x_labels = dict(enumerate(x)) | ||
_set_integer_tick_labels(ax.xaxis, x_labels) | ||
x = np.arange(-0.5, len(x), 1.0) | ||
|
||
if ls.summary(column2)['desc'] == 'categorical': | ||
idx = np.argsort(y) | ||
y = y[idx] | ||
Z = Z[idx] | ||
y_labels = dict(enumerate(y)) | ||
_set_integer_tick_labels(ax.yaxis, y_labels) | ||
y = np.arange(-0.5, len(y), 1.0) | ||
|
||
X, Y = np.meshgrid(x, y) | ||
|
||
ax.pcolormesh(X, Y, Z, cmap=DEFAULT_COLORSCALE.lower()) | ||
|
||
ax.set_xlabel(column1) | ||
ax.set_ylabel(column2) | ||
|
||
ax.set_title(r'$\it{{ {} }}$ vs $\it{{ {} }}$'.format(column1, column2)) | ||
|
||
return fig | ||
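To illustrate the categorical branch of `plot_pairdensity_mpl`: the categories are sorted, mapped to integer positions 0..n-1 for the tick labels, and the mesh receives n+1 cell edges at half-integer positions so that each cell is centered on its label. This is a minimal sketch under assumptions: the category names, the density values and `y_edges` are invented, and the formatter and locator calls are inlined rather than going through `_set_integer_tick_labels`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator
import numpy as np

# Hypothetical inputs: three categories on x, two numeric bins on y.
categories = np.array(["pear", "apple", "fig"])
density = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])  # rows follow y, columns follow x

# Sort the categories and reorder the density columns to match.
order = np.argsort(categories)
categories = categories[order]
density = density[:, order]

# Integer position -> label, and n + 1 cell edges centered on the integers.
labels = dict(enumerate(categories))
x_edges = np.arange(-0.5, len(categories), 1.0)
y_edges = np.array([0.0, 1.0, 2.0])  # assumed numeric bin edges

fig, ax = plt.subplots()
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, _: labels.get(x, "")))
ax.xaxis.set_major_locator(MaxNLocator(integer=True))

X, Y = np.meshgrid(x_edges, y_edges)
mesh = ax.pcolormesh(X, Y, density)
```

Ticks that fall on a non-integer position get an empty label, because the lookup in `labels` falls back to `''`.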
|
||
|
||
def plot_correlation_mpl(ls, include=None, exclude=None): | ||
"""Plot the correlation matrix for numeric columns | ||
|
||
Plot a Spearman rank order correlation coefficient matrix showing the | ||
correlation between columns. The matrix is reordered to group together | ||
columns that have a higher correlation coefficient. The columns to be | ||
plotted in the correlation plot can be selected through either the | ||
``include`` or ``exclude`` keyword arguments. Only one of them can be | ||
given. | ||
|
||
Parameters | ||
---------- | ||
ls : :class:`~lens.Summary` | ||
Lens `Summary`. | ||
include : list of str | ||
List of columns to include in the correlation plot. | ||
exclude : list of str | ||
List of columns to exclude from the correlation plot. | ||
|
||
Returns | ||
------- | ||
:class:`plt.Figure` | ||
Matplotlib figure containing the correlation matrix plot. | ||
""" | ||
|
||
columns, correlation_matrix = ls.correlation_matrix(include, exclude) | ||
num_cols = len(columns) | ||
|
||
if num_cols > 10: | ||
annotate = False | ||
else: | ||
annotate = True | ||
|
||
fig, ax = plt.subplots() | ||
sns.heatmap(correlation_matrix, annot=annotate, fmt='.2f', ax=ax, | ||
xticklabels=columns, yticklabels=columns, vmin=-1, vmax=1, | ||
cmap='RdBu_r', square=True) | ||
|
||
ax.xaxis.tick_top() | ||
|
||
w = len(columns) * 2.5 | ||
while w > 10: | ||
w /= np.sqrt(1.4) | ||
Review comment: Why not just set |
||
|
||
fig.set_size_inches(w, w) | ||
|
||
return fig | ||
|
||
|
||
def plot_cdf(ls, column, N_cdf=100): | ||
"""Plot the empirical cumulative distribution function of a column. | ||
|
||
Creates a plotly plot with the empirical CDF of a column. | ||
|
||
Parameters | ||
---------- | ||
ls : :class:`~lens.Summary` | ||
Lens `Summary`. | ||
column : str | ||
Name of the column. | ||
N_cdf : int | ||
Number of points in the CDF plot. | ||
|
||
Returns | ||
------- | ||
:class:`~matplotlib.Axes` | ||
Matplotlib axes containing the distribution plot. | ||
""" | ||
tdigest = ls.tdigest(column) | ||
|
||
cdfs = np.linspace(0, 100, N_cdf) | ||
xs = [tdigest.percentile(p) for p in cdfs] | ||
|
||
fig, ax = plt.subplots() | ||
|
||
ax.set_ylabel('Percentile') | ||
ax.set_xlabel(column) | ||
ax.plot(xs, cdfs) | ||
|
||
if ls._report['column_summary'][column]['logtrans']: | ||
ax.set_xscale('log') | ||
|
||
ax.set_title('Empirical Cumulative Distribution Function') | ||
|
||
return fig | ||
|
||
|
||
def plot_pairdensity(ls, column1, column2): | ||
"""Plot the pairwise density between two columns. | ||
|
||
|
@@ -190,41 +348,3 @@ def plot_correlation(ls, include=None, exclude=None): | |
fig.data[0]['showscale'] = True | ||
|
||
return fig | ||
|
||
|
||
def plot_cdf(ls, column, N_cdf=100): | ||
"""Plot the empirical cumulative distribution function of a column. | ||
|
||
Creates a plotly plot with the empirical CDF of a column. | ||
|
||
Parameters | ||
---------- | ||
ls : :class:`~lens.Summary` | ||
Lens `Summary`. | ||
column : str | ||
Name of the column. | ||
N_cdf : int | ||
Number of points in the CDF plot. | ||
|
||
Returns | ||
------- | ||
:class:`~matplotlib.Axes` | ||
Matplotlib axes containing the distribution plot. | ||
""" | ||
tdigest = ls.tdigest(column) | ||
|
||
cdfs = np.linspace(0, 100, N_cdf) | ||
xs = [tdigest.percentile(p) for p in cdfs] | ||
|
||
fig, ax = plt.subplots() | ||
|
||
ax.set_ylabel('Percentile') | ||
ax.set_xlabel(column) | ||
ax.plot(xs, cdfs) | ||
|
||
if ls._report['column_summary'][column]['logtrans']: | ||
ax.set_xscale('log') | ||
|
||
ax.set_title('Empirical Cumulative Distribution Function') | ||
|
||
return fig |
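For readers unfamiliar with the approach in `plot_cdf`: the empirical CDF is drawn by asking for the value at evenly spaced percentiles and then plotting percentile against value. The sketch below uses `np.percentile` on raw data as a stand-in for `tdigest.percentile`. That substitution is an assumption made to keep the example self-contained; the t-digest used in the PR answers the same query from a compressed summary and returns approximate values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up skewed sample standing in for a column of the dataset.
rng = np.random.default_rng(1)
data = rng.exponential(size=1000)

N_cdf = 100
cdfs = np.linspace(0, 100, N_cdf)   # percentiles from 0 to 100
xs = np.percentile(data, cdfs)      # stand-in for tdigest.percentile(p)

fig, ax = plt.subplots()
ax.plot(xs, cdfs)
ax.set_xlabel("value")
ax.set_ylabel("Percentile")
ax.set_title("Empirical Cumulative Distribution Function")
```

Because the percentile values are non-decreasing, the resulting curve rises monotonically from the sample minimum to the sample maximum.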
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -42,5 +42,6 @@ def read_version(): | |
'plotly', | ||
'scipy', | ||
'tdigest', | ||
'seaborn', | ||
], | ||
) |
Some pretty heavy use of magic numbers here. Why those numbers? What does this do?
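In answer to the question above, as far as the diff shows: the width starts at 2.5 inches per column and, while it exceeds 10 inches, is repeatedly divided by sqrt(1.4). The result is therefore never more than 10 inches, and for more than four columns it lands between 10/sqrt(1.4) (about 8.45) and 10. The sketch below wraps the loop in a helper named `correlation_plot_width`, a hypothetical name that does not appear in the PR.

```python
import numpy as np


def correlation_plot_width(num_cols):
    """Reproduce the width loop from the diff (hypothetical helper name)."""
    w = num_cols * 2.5           # 2.5 inches per column to start with
    while w > 10:                # shrink until the figure is at most 10 inches
        w /= np.sqrt(1.4)
    return float(w)


# Widths in inches for a few column counts, rounded for readability.
widths = {n: round(correlation_plot_width(n), 2) for n in (2, 4, 5, 8, 20)}
```

Up to four columns the loop never runs, so the width is simply 2.5 inches per column; beyond that the figure size stays roughly constant while the cells shrink.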