Tuesday, 21 June 2016

Seaborn FacetGrid Heatmap with row missing

I'm attempting to create a FacetGrid of heatmaps, and I've noticed that if the last heatmap is missing all the data in one row, all the plots get truncated. This doesn't happen when any of the other heatmaps is missing a row.

The code below is adapted from the question "Getting a legend in a seaborn FacetGrid heatmap plot" (I can't link to it because I don't have enough reputation to include more than 2 links):

import pandas as pd
import numpy as np
import itertools
import seaborn as sns

methods=['method 1', 'method2', 'method 3', 'method 4']
times = range(0, 100, 10)
data = pd.DataFrame(list(itertools.product(methods, times, times)))
data.columns = ['method', 'dtsi','rtsi']
data['nw_score'] = np.random.sample(data.shape[0])

data = data.iloc[10:]

def facet_heatmap(data, color, **kws):
    data = data.pivot(index="dtsi", columns='rtsi', values='nw_score')
    sns.heatmap(data, cmap='Blues', **kws)

# Note: seaborn 0.9 renamed the `size` argument to `height`.
g = sns.FacetGrid(data, col="method", col_wrap=2, height=3, aspect=1)
g = g.map_dataframe(facet_heatmap)

The above plots correctly, with the first row of the first plot masked in grey to show there's no data (image: correct plotting).

But if we cut out data from the last facet:

data = data.iloc[:-10]
g = sns.FacetGrid(data, col="method", col_wrap=2, height=3, aspect=1)
g = g.map_dataframe(facet_heatmap)

Everything gets shifted instead (image: incorrect plotting). Note that in this image, we've removed the first row of data from the first facet and the last row from the last facet, yet it looks like we have no missing data.

It looks like the heatmaps are getting shifted up, and the labels are no longer correct. Given the data, the first facet should run from 10 to 90, the two middle facets from 0 to 90, and the last facet from 0 to 80.
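
To check which rows each facet actually receives, the distinct 'dtsi' values per method can be counted with pandas alone. This is a minimal sketch that rebuilds the example data with both slices applied; `row_counts` and `rows_per_method` are names introduced here for illustration and are not part of the original code.

```python
import itertools

import numpy as np
import pandas as pd

# Rebuild the example data from the question.
methods = ['method 1', 'method2', 'method 3', 'method 4']
times = range(0, 100, 10)
data = pd.DataFrame(list(itertools.product(methods, times, times)),
                    columns=['method', 'dtsi', 'rtsi'])
data['nw_score'] = np.random.sample(data.shape[0])

# Apply both slices: drop the first 10 rows and the last 10 rows.
data = data.iloc[10:-10]

# Number of distinct 'dtsi' values (heatmap rows) in each facet.
row_counts = data.groupby('method')['dtsi'].nunique()

# The distinct 'dtsi' values themselves, as sorted lists.
rows_per_method = data.groupby('method')['dtsi'].unique().apply(sorted)

print(row_counts)
print(rows_per_method)
```

The first and last facets each hold 9 rows while the middle two hold 10, so the pivoted tables passed to sns.heatmap do not all have the same shape.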

Turning off sharex and sharey is one potential work-around:

g = sns.FacetGrid(data, col="method", col_wrap=2, height=3, aspect=1, sharex=False, sharey=False)
g = g.map_dataframe(facet_heatmap)

This plots all the data, but without shared axes it is hard to compare across facets, so it is not ideal. (I can't post the link because I don't have enough reputation.)

The other solution I came up with is to simply reorder the columns so that the one with the missing data is not the last one. But that doesn't seem like a good general solution and, in my specific case, there's a column ordering I want to use that results in the facet with missing data being at the end.

Is there any way to make this work?
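
One direction that might be worth trying, offered as an untested sketch rather than a confirmed fix: pad each facet's pivot table to the full grid of 'dtsi' and 'rtsi' values before plotting, so that every heatmap has the same shape and the missing rows appear as NaN (which sns.heatmap masks by default). The helper names `pivot_on_full_grid` and `facet_heatmap_full` are hypothetical, and the sketch assumes the full set of row and column values (`times`) is known in advance.

```python
import itertools

import numpy as np
import pandas as pd

# Rebuild the example data from the question, with both slices applied.
methods = ['method 1', 'method2', 'method 3', 'method 4']
times = list(range(0, 100, 10))
data = pd.DataFrame(list(itertools.product(methods, times, times)),
                    columns=['method', 'dtsi', 'rtsi'])
data['nw_score'] = np.random.sample(data.shape[0])
data = data.iloc[10:-10]


def pivot_on_full_grid(facet_data, index_values, column_values):
    """Pivot one facet, then pad it so it covers the full grid."""
    grid = facet_data.pivot(index='dtsi', columns='rtsi', values='nw_score')
    # Rows or columns absent from this facet are added and filled with NaN.
    return grid.reindex(index=index_values, columns=column_values)


def facet_heatmap_full(data, color=None, **kws):
    """Hypothetical replacement for facet_heatmap that pads missing rows."""
    import seaborn as sns  # imported lazily; only needed when plotting
    grid = pivot_on_full_grid(data, times, times)
    sns.heatmap(grid, cmap='Blues', **kws)
```

Passing facet_heatmap_full to g.map_dataframe in place of facet_heatmap, with the shared axes left on, would then give every facet a 10 x 10 table regardless of which facet is plotted last.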
