Example of a correlation map #1945

firasm · 2020-02-03T03:36:09Z

Here's a PR of a correlation map that my students (@vcuspinera and @AndresPitta) created.

The output of this example is:

I'm not sure whether this is worth adding to the examples; admittedly, it is similar to the "Layered heat map with text" example.

I think it would be worth adding if I could show only half of the correlation matrix, like this example from here.

Original authors @vcuspinera and @AndresPitta

eitanlees · 2020-02-06T14:57:01Z

altair/examples/correlation_matrix.py

+heatmap = alt.Chart(corrMatrix_line).encode(
+    alt.Y('Var1:N', title = ''),
+    alt.X('Var2:N', title = '', axis=alt.Axis(labelAngle=20))
+).mark_rect().encode(


I would move mark_rect() to directly after alt.Chart() and only have a single call to encode() like many of the other examples.

eitanlees · 2020-02-06T14:59:01Z

altair/examples/correlation_matrix.py

@@ -0,0 +1,52 @@
+"""
+Correlation matrix
+--------------


I think it's important that the length of the underline matches the length of the title when the docs are compiled in Sphinx. You just need to add a few more dashes.
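As a quick check, the underline can be generated from the title so the lengths always match. This is only an illustrative helper (`rst_title` is a hypothetical name, not part of Altair or Sphinx):

```python
def rst_title(title: str, underline: str = "-") -> str:
    """Return a reStructuredText title with an underline of matching length."""
    return f"{title}\n{underline * len(title)}"


header = rst_title("Correlation matrix")
print(header)
# The title has 18 characters, so the underline needs 18 dashes
# (the docstring in the diff has 14).
```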

firasm · 2020-02-07T17:53:58Z

Thanks @eitanlees, I'll address your comments soon!

…lower triangle

firasm · 2020-02-08T20:04:59Z

In the commit above, I now have examples of a full correlation matrix, as well as a less redundant one with the diagonal and upper triangle removed:

jakevdp · 2020-03-29T14:45:34Z

Sorry - this fell off my radar.

Looking at it, it seems like a fairly immense amount of code to create a relatively straightforward chart, so I'm hesitant to add this example as-is to the main example gallery.

jakevdp · 2020-03-29T14:58:50Z

Maybe simplify it to something like this?

import altair as alt
from vega_datasets import data

df_iris = data.iris()
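# numeric_only=True skips the non-numeric 'species' column (required on pandas >= 2.0)
corrMatrix = df_iris.corr(numeric_only=True).reset_index().melt('index')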
corrMatrix.columns = ['var1', 'var2', 'correlation']

base = alt.Chart(corrMatrix).transform_filter(
    alt.datum.var1 < alt.datum.var2
).encode(
    x='var1',
    y='var2',
).properties(
    width=alt.Step(100),
    height=alt.Step(100)
)

rects = base.mark_rect().encode(
    color='correlation'
)

text = base.mark_text(
    size=30
).encode(
    text=alt.Text('correlation', format=".2f"),
    color=alt.condition(
        "datum.correlation > 0.5",
        alt.value('white'),
        alt.value('black')
    )
)

rects + text
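To see what the `var1 < var2` filter keeps, here is a plain-Python sketch of the same comparison. It assumes the four numeric iris column names and that the string comparison in the filter orders these ASCII names the same way Python does.

```python
from itertools import product

# Numeric column names of the iris dataset, as used elsewhere in this thread.
names = ["petalLength", "petalWidth", "sepalLength", "sepalWidth"]

# All (var1, var2) cells of the melted correlation matrix.
all_pairs = list(product(names, repeat=2))

# The filter keeps a cell only when var1 sorts strictly before var2,
# which drops the diagonal and one of the two mirrored triangles.
kept = [(v1, v2) for v1, v2 in all_pairs if v1 < v2]

for v1, v2 in kept:
    print(v1, v2)
```

Of the 16 cells, 6 remain: each unordered pair of distinct variables appears exactly once.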

jakevdp · 2020-03-29T15:07:46Z

Or, if you want both versions of the chart together:

import altair as alt
from vega_datasets import data

df_iris = data.iris()
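# numeric_only=True skips the non-numeric 'species' column (required on pandas >= 2.0)
corrMatrix = df_iris.corr(numeric_only=True).reset_index().melt('index')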
corrMatrix.columns = ['var1', 'var2', 'correlation']

chart = alt.Chart(corrMatrix).mark_rect().encode(
    x=alt.X('var1', title=None),
    y=alt.Y('var2', title=None),
    color=alt.Color('correlation', legend=None),
).properties(
    width=alt.Step(80),
    height=alt.Step(80)
)

chart += chart.mark_text(size=25).encode(
    text=alt.Text('correlation', format=".2f"),
    color=alt.condition(
        "datum.correlation > 0.5",
        alt.value('white'),
        alt.value('black')
    )
)

chart | chart.transform_filter("datum.var1 < datum.var2")

firasm · 2020-04-08T19:32:42Z

Thanks, that is indeed much cleaner! I'm happy with the above and can submit a commit once the term is over.

harabat · 2021-03-10T12:21:16Z

@jakevdp @firasm Assuming that wanting to sort the labels of a heatmap in non-alphabetical order is not rare (I spent a lot of time on this personally), would it make sense to modify this example to allow for a custom sort?

For example, to sort the rows and columns in the order 'petalWidth', 'petalLength', 'sepalWidth', 'sepalLength':

import altair as alt
from vega_datasets import data

# create corr map
source = data.iris()
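# numeric_only=True skips the non-numeric 'species' column (required on pandas >= 2.0)
source_corr = source.corr(numeric_only=True).reset_index().melt(id_vars='index')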

# create dummy ordinal var
sort = {'petalWidth': 0, 'petalLength': 1, 'sepalWidth': 2, 'sepalLength': 3}

heatmap = alt.Chart(source_corr)\
.mark_rect()\
.transform_calculate(
    order_rows='%s [datum.index]' % sort,
    order_cols='%s [datum.variable]' % sort
)\
.transform_filter(alt.datum.order_rows <= alt.datum.order_cols)\
.encode(
    alt.X('index:N', title=None, sort=list(sort.keys())),
    alt.Y('variable:N', title=None, sort=list(sort.keys())),
    alt.Color('value:Q', legend=None)
)\
.properties(width=300, height=300)

text = heatmap\
.mark_text(size=25)\
.encode(
    alt.Text('value:Q', format='.2f'),
    color=alt.condition(
        'datum.value > 0.5',
        alt.value('white'),
        alt.value('black')
    )
)

heatmap + text

Adapted from this StackOverflow question.
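The `transform_calculate` step above works by formatting the Python dict into the expression string, where it acts as an object literal indexed by the field value. A small sketch of what that string looks like; the `json.dumps` variant is an assumed alternative, not something from the original comment, that avoids relying on Python's dict repr being a valid expression.

```python
import json

sort = {'petalWidth': 0, 'petalLength': 1, 'sepalWidth': 2, 'sepalLength': 3}

# What the '%s [datum.index]' % sort pattern produces:
expr_repr = '%s [datum.index]' % sort
print(expr_repr)

# Assumed alternative: serialize the lookup table as JSON instead of repr.
expr_json = f"{json.dumps(sort)}[datum.index]"
print(expr_json)

# Plain-Python emulation of the filter order_rows <= order_cols:
# it keeps the diagonal plus one triangle in the custom order.
kept = [(r, c) for r in sort for c in sort if sort[r] <= sort[c]]
```

With four variables, the emulated filter keeps 10 cells: the 4 on the diagonal and 6 from one triangle.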

joelostblom · 2021-03-14T04:49:33Z

I started working on a package to facilitate creating these plots that might be too complex for the gallery, and that you would want to have easily accessible when doing EDA etc. I included correlation plots, even if they look somewhat different from what is suggested here:

You can see some more examples here. I haven't created a release on PyPI yet and I still need to fix some things, but am happily accepting suggestions for what to include. Also @jakevdp, let me know if you want me to name it something else, in case altair_ally sounds too official and you want that pattern reserved for packages in the altair-viz repo.

pedromorais007 · 2023-01-07T09:05:58Z

Is it possible to change the color scheme of jakevdp's chart from blue to red?

mattijn · 2023-01-07T11:30:52Z

@pedromorais007 see this answer: #2779

pedromorais007 · 2023-01-07T12:29:57Z

Thanks, mattijn, for your suggestion.
I tried adding this line:
color=alt.Color('z:Q', scale=alt.Scale(scheme="reds"))
to my Altair correlation matrix, but the colors are still blue. Nothing has changed.

    base = alt.Chart(corrMatrix).transform_filter(alt.datum.var1 < alt.datum.var2).encode(  
        x='var1', 
        y='var2', 
        color=alt.Color('z:Q', scale=alt.Scale(scheme="reds"))
        ).properties(
            width=alt.Step(100), height=alt.Step(100), )   
    rects = base.mark_rect().encode(color='correlation')    
    text = base.mark_text(size=20).encode(
        text=alt.Text('correlation', format=".2f"),
        color=alt.condition("datum.correlation > 0.5", alt.value('white'),alt.value('black'),)
        ) 
    st.altair_chart(rects + text)

mattijn · 2023-01-07T13:33:49Z

With a normal heatmap, this works:

import altair as alt
import numpy as np
import pandas as pd

# Compute x^2 + y^2 across a 2D grid
x, y = np.meshgrid(range(-5, 5), range(-5, 5))
z = x**2 + y**2

# Convert this grid to columnar data expected by Altair
source = pd.DataFrame({"x": x.ravel(), "y": y.ravel(), "z": z.ravel()})

c = alt.Chart(source, height=alt.Step(12), width=alt.Step(12)).mark_rect().encode(
    x="x:O",
    y="y:O", 
    color=alt.Color("z:Q", scale=alt.Scale(scheme='reds'))
)
c + c.mark_text(size=7).encode(text=alt.Text("z"), color=alt.value("white"))

I suspect something is overriding the color scheme in Streamlit, which you seem to be using (st.altair_chart()).

ChristopherDavisUCI · 2023-01-07T13:52:57Z

@pedromorais007 Does it work the way you want if you remove color='correlation' from rects? Like @mattijn said, I believe that is overruling the color definition in base.

firasm added 3 commits November 2, 2019 15:09

Merge pull request #1 from altair-viz/master

78e5380

Pull most recent changes

Merge branch 'master' of github.com:altair-viz/altair

e8bdcdd

Add an example for a correlation map

cc7c155

Original authors @vcuspinera and @AndresPitta

eitanlees reviewed Feb 6, 2020

Updated based on code review, added an example of correlation matrix …

cd5290c

…lower triangle
