I've got a DataFrame of student test results, where the two columns that interest me are country and result, as in:
country  result
FR       Pass
FR       Fail
US       Pass
US       Pass
DK       Fail
DK       Fail
SE       Pass
...      ...
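To make the sample reproducible, here is a minimal sketch that rebuilds just the seven rows shown above. The variable name `df` is an assumption, and the real data has many more rows and other columns.

```python
import pandas as pd

# Only the seven sample rows shown above; the real DataFrame is much larger.
df = pd.DataFrame({
    "country": ["FR", "FR", "US", "US", "DK", "DK", "SE"],
    "result": ["Pass", "Fail", "Pass", "Pass", "Fail", "Fail", "Pass"],
})
print(df)
```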
What I'm trying to figure out is how to get the relative "Fail" frequency per country, in descending order (that is, the students from each country who failed, as a proportion of all the students from that country), but only for countries where more than, say, 200 students took the test:
country  % fail  students
FR       0.056   997
US       0.051   855
DK       0.042   627
NL       0.032   511
I've seen colleagues at work do it with a very short SQL query, but for the life of me I can't figure out how to do it with pandas!
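For reference, one possible way to express this in pandas is a `groupby` on a boolean "failed" flag, roughly what a SQL `GROUP BY ... HAVING ... ORDER BY` would do. This is only a sketch under assumptions: the function name `fail_rate_by_country` and the `min_students` parameter are hypothetical, the columns are assumed to be named exactly `country` and `result`, and a failure is assumed to be the exact string "Fail".

```python
import pandas as pd

def fail_rate_by_country(df, min_students=200):
    """Hypothetical helper: fail rate and student count per country."""
    # True where the student failed; the mean of a boolean is the fail rate.
    is_fail = df["result"].eq("Fail")
    summary = (
        is_fail.groupby(df["country"])
        .agg(["mean", "count"])
        .rename(columns={"mean": "% fail", "count": "students"})
    )
    # Keep only countries with more than `min_students` test takers.
    summary = summary[summary["students"] > min_students]
    # Highest fail rate first.
    return summary.sort_values("% fail", ascending=False)

# Tiny demo on the sample rows, with the threshold lowered to 1
# because the sample is far smaller than the real data.
sample = pd.DataFrame({
    "country": ["FR", "FR", "US", "US", "DK", "DK", "SE"],
    "result": ["Pass", "Fail", "Pass", "Pass", "Fail", "Fail", "Pass"],
})
print(fail_rate_by_country(sample, min_students=1))
```

On the real data the call would presumably just be `fail_rate_by_country(df)`, keeping the default threshold of 200. The values in "% fail" are fractions between 0 and 1, matching the desired output above.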