markdown source builds
Auto-generated via {sandpaper}
Source  : e2b184e
Branch  : main
Author  : Allen Lee <alee@users.noreply.github.com>
Time    : 2024-03-05 20:54:39 +0000
Message : Merge pull request #674 from alee/fix_mean_invocation

fix: clarify pandas usage with non-numeric columns
actions-user committed Mar 5, 2024
1 parent cf0c3f3 commit b0b89c4
Showing 2 changed files with 17 additions and 5 deletions.
20 changes: 16 additions & 4 deletions 14-looping-data-sets.md
@@ -180,7 +180,10 @@ What other special strings does the [`float` function][float-function] recognize

Write a program that reads in the regional data sets
and plots the average GDP per capita for each region over time
-in a single chart.
+in a single chart. Pandas will raise an error if it encounters
+non-numeric columns in a dataframe computation so you may need
+to either filter out those columns or tell pandas to ignore them.


::::::::::::::: solution

@@ -200,8 +203,17 @@ for filename in glob.glob('data/gapminder_gdp*.csv'):
     # we will split the string using the split method and `_` as our separator,
     # retrieve the last string in the list that split returns (`<region>.csv`),
     # and then remove the `.csv` extension from that string.
+    # NOTE: the pathlib module covered in the next callout also offers
+    # convenient abstractions for working with filesystem paths and could solve this as well:
+    # from pathlib import Path
+    # region = Path(filename).stem.split('_')[-1]
     region = filename.split('_')[-1][:-4]
-    dataframe.mean().plot(ax=ax, label=region)
+    # pandas raises errors when it encounters non-numeric columns in a dataframe computation
+    # but we can tell pandas to ignore them with the `numeric_only` parameter
+    dataframe.mean(numeric_only=True).plot(ax=ax, label=region)
+    # NOTE: another way of doing this selects just the columns with gdp in their name using the filter method
+    # dataframe.filter(like="gdp").mean().plot(ax=ax, label=region)
+
plt.legend()
plt.show()
```
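
Reviewer note: the two approaches named in this hunk (`numeric_only=True` and `filter(like="gdp")`) can be compared on a small frame. This is a minimal sketch, not part of the commit; the data below is made up, and the column names are an assumption chosen to resemble a text `country` column next to GDP columns.

```python
import pandas as pd

# Hypothetical stand-in for one regional file: one text column plus
# two numeric columns whose labels contain "gdp" (assumed naming).
dataframe = pd.DataFrame({
    "country": ["A", "B"],
    "gdpPercap_1952": [100.0, 300.0],
    "gdpPercap_1957": [200.0, 400.0],
})

# Approach 1: ask pandas to skip non-numeric columns when averaging.
means_numeric_only = dataframe.mean(numeric_only=True)

# Approach 2: keep only the columns whose label contains "gdp",
# then average what is left.
means_filtered = dataframe.filter(like="gdp").mean()

# On this frame both approaches produce the same per-column means.
same_result = means_numeric_only.equals(means_filtered)
print(same_result)
```

The two calls agree here only because every numeric column has "gdp" in its name; a frame with other numeric columns would give different results for the two approaches.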
@@ -231,8 +243,8 @@ gapminder_gdp_africa
.csv
```

-**Hint:** It is possible to check all available attributes and methods on the `Path` object with the `dir()`
-function!
+**Hint:** Check all available attributes and methods on the `Path` object with the `dir()`
+function.


::::::::::::::::::::::::::::::::::::::::::::::::::
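
Reviewer note: the `pathlib` alternative mentioned in the added solution comment can be checked against the existing slicing approach. This is a rough sketch; the example filename is an assumption, built from the glob pattern and the `gapminder_gdp_africa` output shown in the diff.

```python
from pathlib import Path

# Assumed example path, matching the 'data/gapminder_gdp*.csv' pattern.
filename = "data/gapminder_gdp_africa.csv"

# Existing approach: take the text after the last `_`, drop the 4-character `.csv`.
region_by_slicing = filename.split("_")[-1][:-4]

# pathlib approach: `stem` is the final path component without its suffix.
region_by_pathlib = Path(filename).stem.split("_")[-1]

# As the hint says, dir() lists the attributes available on a Path object.
has_stem = "stem" in dir(Path(filename))

print(region_by_slicing, region_by_pathlib, has_stem)
```

The slicing version depends on the extension being exactly four characters long, while `stem` drops whatever the final suffix is.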
2 changes: 1 addition & 1 deletion md5sum.txt
@@ -17,7 +17,7 @@
"episodes/11-lists.md" "1257daeb542377a3b04c6bec0d0ffee1" "site/built/11-lists.md" "2023-07-24"
"episodes/12-for-loops.md" "1da6e4e57a25f8d4fd64802c2eb682c4" "site/built/12-for-loops.md" "2023-05-02"
"episodes/13-conditionals.md" "2739086f688f386c32ce56400c6b27e2" "site/built/13-conditionals.md" "2024-02-16"
-"episodes/14-looping-data-sets.md" "e04f11544d1e5f3ca08ddcf22230a3a3" "site/built/14-looping-data-sets.md" "2023-05-02"
+"episodes/14-looping-data-sets.md" "33bc3751e02186ba42ba35d937b03889" "site/built/14-looping-data-sets.md" "2024-03-05"
"episodes/15-coffee.md" "062bae79eb17ee57f183b21658a8d813" "site/built/15-coffee.md" "2023-05-02"
"episodes/16-writing-functions.md" "0f162f45b0072659b0113baf01ade027" "site/built/16-writing-functions.md" "2023-07-24"
"episodes/17-scope.md" "8109afb18f278a482083d867ad80da6e" "site/built/17-scope.md" "2023-05-02"