library(tidyverse)
library(leaflet)
library(sf)
library(tigris)
library(tidycensus)
library(RColorBrewer)
Lab 10: Polygon Mapping in Leaflet
Introduction
This lab is an extension of last week’s lab on Point Mapping in Leaflet. In this lab, we will continue our research into Louisiana’s Cancer Alley, layering in consideration of the demographics of the communities surrounding polluting facilities. This will allow us to assess racial disparities in exposure to environmental toxics. In this lab, we will also consider how choices that we make in mapping can shape the stories that the maps tell.
Learning Goals
- Transform polygon data to an appropriate CRS
- Map polygon data in Leaflet
- Create palettes for polygons on maps
- Adjust layer controls on polygon maps
- Recognize some of the ways that maps can “lie”
Review of Key Terms
You may wish to reference this reference guide when completing this lab.
- Ecological Fallacy
-
A reasoning error that emerges when we assume that conclusions that we draw about a group also apply to individual members of that group
- Modifiable Aerial Unit Problem
-
A problem that occurs when representing spatial data; how we choose to draw administrative boundaries can bias our understanding of the relationships amongst spatial data points
U.S. Census Data
The United States Census is survey of the population of the United States, taken every ten years. This data collection is mandated by the U.S. Constitution. There are countless stories that can be documented about the census. For example: - The census is one of the first examples of “big data” in the U.S.; too large to be computed by hand, special machines had to be developed to tabulate census data in the late 1800s. - The categories available for documenting race and ethnicity have shifted considerably over the past two centuries, mirroring changing perceptions of race in the U.S.
- For more stories, you might consider reading (Bouk 2022) .
The census has a number of data tables available for public analysis. Today, we are just going to look at data documenting the population of Black residents in Louisiana, collected during the 2020 census.
Setting Up Your Environment
- Install the following packages in your Environment:
install.packages("devtools")
install.packages("remotes")
remotes::install_github("walkerke/tidycensus")
Create an API Key for accessing census data here. The key will be emailed to you, and you must activate it with the link you receive in your email. NOTE: It may take several minutes to activate. If you click on the link in your email, and it doesn’t work, try again in a bit.
Run the following code in your terminal to store the census API key in your environment.
library(tidycensus)
census_api_key("[ADD KEY HERE]", overwrite = FALSE, install = TRUE)
Restart R. Don’t skip this step!!
Run the code below to load the packages for today’s lab.
Data Analysis
We are going to start by pulling census data about the population of residents that identify as black in Louisiana at both the census tract and the county level. Note that we are using a specific census variable that counts the population that identifies as Black or African American alone. This is an undercount of Black residents as it does not account for those residents that identify as Black along with another race.
Question
Read the help pages for get_decennial()
. Load census data at the county level for variable
‘P1_004N’ (Total Population: Population of one race: Black or African American alone). You should set the state
to ‘LA’, the geometry
to TRUE
, and the summary_var
to ‘P1_001N’ (Total Population). We will eventually use these two tables to calculate the percentage of the population that identifies as Black on the census.
Repeat and edit the code to get the same data at the tract level.
<- get_decennial(geography = ______,
census_counties variables = ______,
state = _____,
summary_var = _____,
geometry = _____)
<- get_decennial(_____) census_tracts
By setting geometry to TRUE
when we loaded this data, the data imported with a geometry field. We can check the CRS of this data by running the function below.
|>
census_counties st_crs()
Coordinate Reference System:
User input: NAD83
wkt:
GEOGCRS["NAD83",
DATUM["North American Datum 1983",
ELLIPSOID["GRS 1980",6378137,298.257222101,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4269]]
As you recall, we need the CRS of our data to match the CRS of our basemap, which is WGS 84 (EPSG:4326).
Question
In the code block below, transform the CRS of both census_counties
and census_tracts
to 4326 using st_transform()
.
<- #Fill code here
census_counties
<- #Fill code here census_tracts
Projections
Whenever we are mapping data, we need to consider the map’s projection. Recall that there are infinite ways to flatten our spherical globe into a 2-dimensional map. The projection standard that we choose indicates how 3-dimensional geometric shapes are configured on our 2-dimensional map. By default, leaflet expects all latitude, longitude, and shape data to be input as EPSG:4326, and it projects all of that data according to EPSG:3857. All of the basetiles are also projected as EPSG:3857.
…but what if a different standard were used to project data on the map. To demonstrate how a map of Louisiana counties might look different with a different projection, below, I’ve created a custom leaflet projection that projects data according to the US National Atlas Equal Area (EPSG:9311) system.
<- leafletCRS(
epsg9311 crsClass = "L.Proj.CRS",
code = "EPSG:9311",
proj4def = "+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs",
resolutions = 2 ^ (16:7)
)
Here’s how the map looks with leaflet’s default projection.
<- function(opts = leafletOptions()) {
plot_la_map leaflet(census_counties, options = opts) %>%
setView(lat = 30.5191, lng = -91.5209, zoom = 6) %>%
addProviderTiles(providers$CartoDB.Positron, group = "Minimal") %>%
addPolygons(
weight = 1,
color = "black",
)
}
plot_la_map()
Here’s how the map looks when data is projected according to my custom projection. Be sure to Zoom out so you can see where the polygons appear on the map.
plot_la_map(opts = leafletOptions(crs = epsg9311))
Here I’ve told leaflet to project my data according to a projection standard that does not match the projection of my base tiles, which is why the data appears in the wrong location in relation to the base tiles. Additionally, you will notice that the orientation of the polygons on the map look a bit different than our default map. Keep in mind that each of these projection standards has different purposes. There is not a particular way to project data that is more “correct” than another way. However, when working in leaflet, it is important to have our data projected in the same way that our base tiles are projected so that we know where shapes are located on the map.
Symbolization
How we symbolize geographic features on a map can tell a different story about what is going on at a particular location. So far, we have been using a rather minimalist basemap that only shows major streets and major land features in Louisiana. Proximity to major highways and bodies of water can tell important story when it comes to pollution. However, geographic features that don’t appear on this minimalist map also tell an important story. For example, valleys nestled between mountain ranges can trap pollution, leading to poorer air quality in those areas. In the code below, we are going to create a basemap that let’s us toggle between two different ways of symbolizing geography.
To create the option to toggle between these maps, we will need to assign each version of the map to a “group.” We can then use addLayersControl()
function to indicate that we want to allow users to toggle between these two groups.
Question
I’ve initialized a map of Louisiana for you below with our basic base tiles. Set the group for these tiles to “Minimal”. Note that we will call addProviderTiles()
to add a second set of base tiles. Set these base tiles to providers$Esri.WorldImagery
and the group to “Satellite”. Add the addLayersControl()
function and set the base groups to the names of the two groups you just established.
Your map should match mine below.
<- leaflet(width = "100%") %>%
la_map setView(lat = 30.5191, lng = -91.5209, zoom = 6) %>%
addProviderTiles(providers$CartoDB.Positron, group = ______) %>%
addProviderTiles(______, group = ______) %>%
_____(
baseGroups = c("_____","_____"),
options = layersControlOptions(collapsed = FALSE)
)
la_map
Zoom in to the region along the Mississippi River between Baton Rouge and New Orleans and then toggle to the Satellite map. What do you notice about the geographic features of that area that you didn’t notice with the minimalist basemap? Both show that Cancer Alley is situated along the Mississippi River. However, with the satellite map, you may notice that there are several long and skinny plots of land along the Mississippi River. This is giving us a view into the legacies of environmental racism in this space.
Prior to this land being referred to as Cancer Alley, it was often referred to as “Plantation Country” as it was the location of several large sugarcane plantations. Plantations were set up in these long and skinny plots of land along the Mississippi River, making it possible to bring sugar to market. While you might find stately plantation houses facing the river, on the opposite side of these stretches of land, you would find the houses of the enslaved workers that were forced to work on these plantations. In the 1900s, following emancipation and the rising costs of sugarcane production, these plots became attractive to oil and gas companies looking to set up petrochemical refineries with access to a water source near the Gulf of Mexico. Descendants of the Black enslaved population that worked on the sugarcane plantations still live in these communities along the Mississippi River, making this a site of considerable environmental injustices. For more information see (Knowles and Rogers 2020).
Standardization
Now that we have our basemaps, we can begin to visualize the data we have pulled regarding Louisiana’s Black population from the 2020 census. Below I’ve created the first version of that map for you. First I created a color palette spanning from light blue to dark blue, and binning Black population values into 5 buckets. Then I added polygons of Louisiana counties to the map and colored the counties according the palette that I created. Finally, I added a legend to explain the colors appearing on the map.
<- colorBin(palette="Blues",
pal_bin_census_county_pop domain = census_counties$value,
bins = 5)
%>%
la_map addPolygons(data = census_counties,
weight = 1,
color = "black",
fillColor = ~pal_bin_census_county_pop(value),
fillOpacity = 0.5,) %>%
addLegend(data = census_counties,
title = "Population Black Residents",
pal = pal_bin_census_county_pop,
values = ~value)
You may notice that the colors appear a darker shade of blue (indicating a higher Black population) right around Louisiana’s major cities. The problem with using this map to identify predominantly Black communities is this: when we compare the total population of a certain demographic across regions, we don’t know if a larger population is a result of that demographic being more represented in that region, alternatively if it is a result of the total population being higher in that region. To be able to compare how Black communities are represented across Louisiana, we need to standardize our data to account for the total population. In other words, we need to calculate the percentage of the total population that identifies as Black in each region.
Question
Using a data wrangling verb, add a column to each census data frame that calculates the percentage of the population in each census county/tract that identifies as Black. To do so, you will need to divide the Black population stored in value
by the total population summary_value
and multiply by 100.
<- #Fill code here
census_counties
<- #Fill code here census_tracts
Question
Copy the palette and map that I created above, which colored counties by the total Black population. Adjust the code to color the counties by the variable you created to record the percent Black population.
<- #Create county palette here
pal_bin_census_county_perc
#Create county map here
Aggregation
Examining our map of Louisiana counties, we can see that counties along the Mississippi River from Baton Rouge to New Orleans do appear to have a higher percentage of Black residents. However, counties are quite large and can have some considerable racial disparities within them. This map is not giving us much insight into the racial make-up of communities specifically along the Mississippi River. In other words, our data here is aggregated at too broad a spatial scale to see the racial make-up of Cancer Alley. We are going to add another layer to map you created above that allows us to toggle between visualizing the racial make-up of counties and the racial make-up of census tracts, which are smaller units of geography.
Question
Copy the palette and map that you created above. Create a second palette to color polygons by the percent Black population in census tracts.
Pipe a second addPolygons()
function call and a second addLegend()
function call onto your map. Copy arguments from the previous function calls and adjust them to color the polygons in this second call by the percent Black population in census tracts.
Finally, in change the weight
of the county polygons to 2 so that we can differentiate the borders of our census counties from our census tracts.
# Copy the palette you created above for census counties here
# Create palette for census tracts here
# Copy the map you created above here and adjust according to the prompt
We have two layers on our map now, and we want to be able to toggle between them. In other words, I want to be able to turn the county layer off and the tract layer on and vice versa. To do this, we need to assign each layer (which includes our polygons and the corresponding legend) to a group and then add layer controls to the map. We also need to indicate which layers should be turned on and which should be hidden by default when we load the map.
Question
Copy the map that you created above into the code chunk below. Add a group =
argument to each addPolygons()
and addLegend()
call; set the group for your county layers as “County” and the group for your tract layers as “Tract”.
Then pipe the addLayersControl()
function onto your map. Within this function, set the baseGroup =
argument as we did above to c("Minimal","Satellite")
so that we can toggle between basemaps. Then set the overlayGroups =
argument to c("County", "Tract")
so that we can toggle between overlay layers. Add options = layersControlOptions(collapsed = FALSE)
so that the layer controls appear when we load the map. Add position = "bottomleft"
so that the layer controls don’t overlap with the legends.
Finally, pipe the function hideGroup("Tract")
onto the end of your map. This will hide the “Tract” group when we load the map.
# Copy map here and adjust according to the prompt
Question
Zoom in to the region along the Mississippi River between Baton Rouge and New Orleans. Then toggle the Tract layer on. Notice how there is a higher representation of Black residents residing along the Mississippi River compared to the rest of the counties. Below, explain how this is an example of the ecological fallacy.
Layering Toxics Data
As a last step, we are going to add a third layer to our map, displaying the toxic facilities data that we mapped last week. Run the code below to load that data, transform the CRS, create a palette to color the points by total releases, and map the data.
<- read_csv("https://raw.githubusercontent.com/SDS-192-Intro/public-website-fall-25/refs/heads/main/data/2023_la.csv", name_repair = make.names) |>
tri_la_2023 filter(X50..UNIT.OF.MEASURE == "Pounds") |>
arrange(desc(X107..TOTAL.RELEASES)) |>
mutate(chemical_total_text = paste(X37..CHEMICAL, X107..TOTAL.RELEASES, sep = ": ")) |>
group_by(X2..TRIFD) |>
summarize(X4..FACILITY.NAME = first(X4..FACILITY.NAME),
X12..LATITUDE = first(X12..LATITUDE),
X13..LONGITUDE = first(X13..LONGITUDE),
X23..INDUSTRY.SECTOR = first(X23..INDUSTRY.SECTOR),
X107..TOTAL.RELEASES = sum(X107..TOTAL.RELEASES),
X..CHEMICAL.TEXT = paste(chemical_total_text, collapse = " <br> "))
<- tri_la_2023 %>%
tri_la_2023 st_as_sf(coords = c("X13..LONGITUDE", "X12..LATITUDE"), crs = 4269) %>%
st_transform(4326)
<- colorBin(palette="YlOrRd",
pal_bin domain = tri_la_2023$X107..TOTAL.RELEASES,
bins = 6)
%>%
la_map addCircleMarkers(data = tri_la_2023,
weight = 0.5,
radius = 3,
color = "black",
fillColor = ~pal_bin(X107..TOTAL.RELEASES),
fillOpacity = 0.8,
popup = ~paste("<b>",
X4..FACILITY.NAME,"</b><br><i>",
X23..INDUSTRY.SECTOR,"</i><br><br>",
X..CHEMICAL.TEXT),popupOptions = popupOptions(maxHeight = 200,
closeOnClick = TRUE)) %>%
addLegend(data = tri_la_2023,
title = "TRI Releases in Pounds",
pal = pal_bin,
values = ~X107..TOTAL.RELEASES)
Question
Copy the last map that you created above visualizing the percent Black population in LA counties and tracts. Then copy my code above to addCircleMarkers()
for TRI facilities and addLegend()
for TRI facilities. Pipe these functions onto your map after your addPolygons()
function calls but before the addLayersControl()
function call.
Set the group =
argument for the TRI circle markers and the TRI legend to “TRI”. Then add “TRI” as an additional group in the overlayGroups =
argument in your addLayersControl()
function.
# Copy map functions and adjust here
As you play with this map, you might notice one problem. As soon as you turn the tract layer on, it covers the TRI layer, and you can no longer click on points to learn more about each facility. We need to communicate to leaflet how we want our layers ordered. To do that we, are going to set a zIndex, which indicates the order by which layers should be stacked on top of each other. A higher zIndex indicates that the layer will be stacked higher. In leaflet zIndex values range from 400 to 500.
To set the zIndex for each layer, we need to add separate “panes” to our map - one for “circles” and one for “polygons”. We want the facility markers to appear above the polygons (so they remain clickable), so we will want our circle pane to be stacked higher than our polygons pane.
Question
Copy the last map that you created above. At the beginning of your code right after calling la_map
, pipe a addMapPane()
function twice. In the first function call, set name =
argument to “circles” and the zIndex =
argument to 420. In the second function call, set name =
argument to “polygons” and the zIndex =
argument to 410.
Add the argument options = pathOptions(pane = "polygons")
to both of your addPolygons()
function calls, and add the argument options = pathOptions(pane = "circles")
to your addCircleMarkers()
function call.
Adjust your code to hide the “County” map by default.
# Copy map here and adjust according to the prompt
Question
Zoom in to the region along the Mississippi River between Baton Rouge and New Orleans. Notice how there are at least a few census tracts along the Mississippi River where high-polluting industrial facilities are located, but there is a lower percentage Black population. Explain how the modifiable aerial unit problem may be at play here and why we should consider it in relation to pollution data.
For several years, the U.S. Environmental Protection Agency published a map called EJScreen. The map included layers for socioeconomic indicators like race, income, and age, along with layers for environmental indicators such as TRI releases, proximity to superfund sites, proximity to major traffic sources, and more. Users could also map layers for health disparities, such as cancer rates and asthma rates. It helped document how certain communities faced disproportionate environmental burdens across the country. On February 5, 2025, EJScreen was removed from the EPA’s website, at the same time that the EPA was overhauling several diversity, equity, and inclusion (DEI) initiatives under the Trump administration. Luckily, the advocacy organization Public Environmental Data Partners (PEDP) archived a reconstructed version of the map and made it available here. This one example of several government data sources that have been suppressed for political reasons. Why do you think these data removals are happening? What are some of the consequences of removing or defunding public government data like this? What role should non-governmental organizations play in securing access to this data? Are there ever good political reasons to suppress public data, and who should be involved those decisions? Share your ideas on our sds-192-discussions
Slack channel.