Monday, 20 June 2016

How to plot a function on the secondary y-axis of a histogram in Python?

Let's use the famous Titanic dataset, found here:

http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls

Read it in as a pandas DataFrame named df.
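A minimal sketch of that loading step, assuming pandas is installed together with an Excel reader for the legacy .xls format (xlrd). The helper name load_titanic and the constant TITANIC_URL are hypothetical, introduced only for illustration:

```python
import pandas as pd

# URL quoted in the question; whether it is still reachable is not guaranteed.
TITANIC_URL = "http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls"


def load_titanic(source=TITANIC_URL):
    """Read the Titanic spreadsheet into a DataFrame.

    `source` may be a URL or a local file path; pandas.read_excel
    accepts both. Reading .xls files requires the xlrd package.
    """
    return pd.read_excel(source)
```

Calling df = load_titanic() should then give the dataframe used in the rest of the question; if the URL is unavailable, a local copy of the file can be passed instead.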

The features that I'm interested in are 'age' (a float) and 'survived' (also a float, but binary 1 or 0).

I know how to plot a histogram to show the distribution of ages for the passengers:

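import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("white")

# Keep only rows with a known age (the comparison is False for NaN)
df_age = df[df.age >= 0]
age = df_age.age

fig = plt.figure(figsize=(12, 6))
plt.hist(age.values, facecolor='gray', bins=20, alpha=.8)
# Dashed vertical line at the median age
plt.axvline(age.median(), color='gray', linestyle='dashed', linewidth=2)
plt.xlabel('age')
plt.ylabel('passenger count')
plt.show()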

[Figure: histogram of passenger ages with a dashed line at the median]

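Easy! But what I want on top of that is a line chart of the survival rate for the passengers in each histogram bin, plotted against a secondary y-axis, like this:

[Figure: the age histogram with a survival-rate line overlaid on a secondary y-axis]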

How can I do that?
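One possible approach, offered as a sketch rather than a definitive answer: reuse the bin edges returned by the histogram, assign each passenger to a bin with np.digitize, average 'survived' per bin, and draw the result on a second y-axis created with twinx(). The synthetic dataframe below is only an assumed stand-in for the real Titanic data, so the block runs on its own:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop this line for a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the Titanic dataframe (assumption, for illustration only)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.uniform(0, 80, size=500),
    "survived": rng.integers(0, 2, size=500).astype(float),
})

df_age = df[df.age >= 0]
age = df_age.age.values

fig, ax = plt.subplots(figsize=(12, 6))
# hist returns the per-bin counts and the bin edges it used
counts, edges, _ = ax.hist(age, bins=20, facecolor="gray", alpha=.8)
ax.set_xlabel("age")
ax.set_ylabel("passenger count")

# Map each age to its bin index using the same edges as the histogram.
# digitize puts the maximum value one past the last bin, so clip it back in.
n_bins = len(edges) - 1
bin_idx = np.clip(np.digitize(age, edges) - 1, 0, n_bins - 1)

# Mean of the 0/1 'survived' flag per bin is the survival rate;
# reindex so that empty bins show up as NaN instead of disappearing.
survival_rate = df_age.survived.groupby(bin_idx).mean().reindex(range(n_bins))

# Plot the rate at the bin centers on a secondary y-axis
centers = (edges[:-1] + edges[1:]) / 2
ax2 = ax.twinx()
ax2.plot(centers, survival_rate.values, color="black", marker="o")
ax2.set_ylabel("survival rate")
ax2.set_ylim(0, 1)
```

With the real data, the synthetic df would be replaced by the dataframe read from the spreadsheet, and plt.show() or fig.savefig(...) would display or save the figure. Using np.digitize with the histogram's own edges keeps the line's bins identical to the bars underneath it.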
