Here is a set of RDS files that contain sf objects of state county boundaries. We are going to work with these using iteration and functions for some of this week's work.
- Let's warm up with some sf practice. The function readRDS() reads in RDS files. The dplyr function bind_rows() can take rows of a data frame, tibble, or sf object and bind them together properly. Using the purrr library, read in all of the counties files and then combine them into a single data frame. Plot the result.
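As a rough sketch of the read-then-bind pattern described above, the example below uses small stand-in data frames rather than the actual county files. The temporary directory, file names, and column names are all hypothetical and exist only for illustration.

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in data: write three small data frames to a temp directory
tmp_dir <- file.path(tempdir(), "rds_demo")
dir.create(tmp_dir, showWarnings = FALSE)
walk(c("a", "b", "c"), function(id) {
  saveRDS(
    data.frame(id = id, value = 1:2, stringsAsFactors = FALSE),
    file.path(tmp_dir, paste0(id, ".rds"))
  )
})

# List every RDS file, read each one into a list, then stack the pieces
files <- list.files(tmp_dir, pattern = "\\.rds$", full.names = TRUE)
combined <- files %>%
  map(readRDS) %>%
  bind_rows()
```

The same shape should carry over to sf objects, since bind_rows() is described above as handling them, but that is not exercised here.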
- This is great. Now, I'm curious – is there a link between the number of counties in a state and the ratio of area of the largest county in the state to the total state area? Let's find out!
A. Write a function that, given a state name, will use readRDS() to read in a single data file and fix up the CRS (these are all in lat/long – you want a Mollweide projection, in which distance is in meters). Plot Massachusetts to make sure everything works.
B. Write a function that, given an sf object of a single state and its counties, will return a one row data frame with the number of counties, the area of the largest county, the average county area, the state's area, and the ratio of the largest county to total area.
st_area() will help you calculate area – but you will need to wrap the result in as.numeric(). Also, if you take an sf object and use summarize() on it, it will merge all of the polygons into one.
C. Using iteration, make a data frame that has all of the above information for all of the states. +1 EXTRA CREDIT – have a column named state with the state name. (hint:
D. Plot that largest county ratio to number of counties! What do you learn? +1 extra credit for each exploration beyond this.
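The two sf behaviors mentioned in part B (st_area() returning values that need as.numeric(), and summarize() merging polygons) can be seen on a toy example. This is only a sketch on made-up geometry: the unit_square() helper, the two squares, and the column names are hypothetical, and the CRS is set directly to a Mollweide proj string so that lengths are in meters.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a 1 m x 1 m square with its lower-left corner at (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two adjacent squares standing in for two counties
toy <- st_sf(
  name = c("left", "right"),
  geometry = st_sfc(unit_square(0), unit_square(1), crs = "+proj=moll")
)

# st_area() returns a units vector; as.numeric() strips the units
county_areas <- as.numeric(st_area(toy))

# summarize() on an sf object unions the polygons into a single feature
merged <- toy %>% summarize(n_polygons = n())
total_area <- as.numeric(st_area(merged))
```

Here county_areas holds one area per polygon, while total_area is the area of the single merged shape, which is the kind of pairing the largest-county ratio needs.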
- Install and load up the package repurrrsive. It has an object in it, got_chars, with information about the characters from the Game of Thrones series. Notice it is a list of lists. To explore it, check out listviewer::jsonedit(got_chars, mode = "view").
Using purrr functions, make a tibble with the following columns:
- aliases (a list column)
- allegiances (a list column)
- Who has more aliases on average? Men or women? Visualize however you see fit.
- One thing that is cool about list columns is that we can filter on them. We can remove rows with list columns that have a length of 0 with filter(lengths(x) > 0), where x is some column name. Note we are using lengths(), not length(). Another cool thing is that we can always use tidyr::unnest() on columns to expand them out, repeating, say, names or other elements of a data frame.
A. Select just name and aliases. Filter the resulting data down to something usable, and then unnest aliases. Use the resulting data to determine who had the most aliases!
B. Great! Now, let's use this idea of unnesting to build and then visualize a dataset that shows, within each allegiance, whether there are more aliases for men or women. What does this visualization teach you about the different allegiances?
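The list-column workflow used in the last two questions (build a tibble from a list of lists, filter on element lengths, then unnest) can be sketched on toy data. The people list and its name and nicknames fields below are hypothetical stand-ins, not the structure of got_chars.

```r
library(purrr)
library(dplyr)
library(tidyr)

# Hypothetical list of lists, shaped like a small nested record set
people <- list(
  list(name = "Ann", nicknames = c("A", "Annie")),
  list(name = "Bo",  nicknames = character(0)),
  list(name = "Cy",  nicknames = "C")
)

# map_chr() returns one string per element; map() keeps whole vectors,
# which gives a list column
people_tbl <- tibble(
  name = map_chr(people, ~ .x$name),
  nicknames = map(people, ~ .x$nicknames)
)

# lengths() reports the length of each list element, so this keeps
# only the rows whose list column is non-empty
non_empty <- people_tbl %>% filter(lengths(nicknames) > 0)

# unnest() gives one row per nickname, repeating name as needed
long <- non_empty %>% unnest(nicknames)

# Count rows per name, largest first
most <- long %>% count(name, sort = TRUE)
```

Using length() instead of lengths() in the filter would measure the whole column rather than each element, which is why the plural form matters here.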