#Code hereData Wrangling
Question
Send me a list of all of the lending IDs for the mortgage applications that were originated for borrowers with a debt-to-income ratio of greater than 60%.
hmda_ms_2024 |>
filter(debt_to_income_ratio == ">60%") |>
select(lei)# A tibble: 2,344 × 1
lei
<chr>
1 549300DNWSJUNAWUM003
2 549300DMYENRV818D594
3 549300DMYENRV818D594
4 549300JPGMCMBEJEK584
5 5493000V86JDJLM0OV13
6 5493000V86JDJLM0OV13
7 5493000V86JDJLM0OV13
8 5493000V86JDJLM0OV13
9 5493000V86JDJLM0OV13
10 549300L7HZ4C5U35BO37
# ℹ 2,334 more rows
Question
Send me the lending ID and action taken for the home purchases with the top 10 loan-to value ratios.
# Code Herehmda_ms_2024 |>
filter(loan_purpose == "Home Purchase") |>
arrange(desc(loan_to_value_ratio)) |>
head(10) |>
select(lei, action_taken)# A tibble: 10 × 2
lei action_taken
<chr> <chr>
1 X05BVSK68TQ7YTOSNR22 Application denied
2 X05BVSK68TQ7YTOSNR22 Application denied
3 EQTWLK1G7ODGC2MGLV11 Application denied
4 EQTWLK1G7ODGC2MGLV11 Application denied
5 5493003GQDUH26DNNH17 Application denied
6 X05BVSK68TQ7YTOSNR22 Application denied
7 X05BVSK68TQ7YTOSNR22 Application denied
8 X05BVSK68TQ7YTOSNR22 Application denied
9 EQTWLK1G7ODGC2MGLV11 Application denied
10 2549009A2MMICIE00762 Application approved but not accepted
Question
Send me a list of the loan to value ratios, loan amounts, and property values for each originated loan.
# Code Herehmda_ms_2024 |>
mutate(property_value = loan_amount / (loan_to_value_ratio/100)) |>
select(loan_to_value_ratio, loan_amount, property_value)# A tibble: 34,284 × 3
loan_to_value_ratio loan_amount property_value
<dbl> <dbl> <dbl>
1 80 265000 331250
2 80 205000 256250
3 95 145000 152632.
4 90 485000 538889.
5 68.3 215000 315000.
6 NA 665000 NA
7 95 75000 78947.
8 NA 315000 NA
9 NA 215000 NA
10 80 145000 181250
# ℹ 34,274 more rows
Question
How many NA values were listed in the loan to value ratio variable?
# Code Herehmda_ms_2024 |>
filter(is.na(loan_to_value_ratio)) |>
nrow()[1] 19247
hmda_ms_2024 |>
summarize(total_na = sum(is.na(loan_to_value_ratio)))# A tibble: 1 × 1
total_na
<int>
1 19247
Question
Of applicants with a debt-to-income ratio of less than 36%, what was the mean and median interest rate of originated loans for each race?
# Code Herehmda_ms_2024 |>
filter(debt_to_income_ratio %in% c("<20%", "20%-<30%", "30%-<36%")) |>
group_by(derived_race) |>
summarize(mean_interest = mean(interest_rate, na.rm = TRUE),
median_interest = median(interest_rate, na.rm = TRUE))# A tibble: 9 × 3
derived_race mean_interest median_interest
<chr> <dbl> <dbl>
1 2 or more minority races 8.37 8.22
2 American Indian or Alaska Native 7.81 7.19
3 Asian 6.70 6.62
4 Black or African American 8.81 8.73
5 Free Form Text Only NaN NA
6 Joint 7.59 7.04
7 Native Hawaiian or Other Pacific Islander 10.8 11.4
8 Race Not Available 7.30 6.88
9 White 7.50 7
Question
Which 5 lending institutions lent the most? Send me a list of their IDs, the number of loans they originated, and the total originated.
# Code Herehmda_ms_2024 |>
filter(action_taken == "Loan originated") |>
group_by(lei) |>
summarize(total_loans = n(),
total_amount = sum(loan_amount)) |>
arrange(desc(total_amount)) |>
head(5)# A tibble: 5 × 3
lei total_loans total_amount
<chr> <int> <dbl>
1 01J4SO3XTWZF4PP38209 834 209840000
2 Q7C315HKI8VX0SSKBS64 695 194795000
3 549300SFG15JDKI5MD22 585 157785000
4 549300XQVJ1XBNFA5536 1238 137350000
5 5493005QK4NV0ZZ5EM64 543 120115000
Question
How many loans were denied because of debt-to-income ratio for each race and debt-to-income ratio bracket? Send me a list of the race, debt-to-income ratio, and count of denied loans, sorted first by the derived race, and second by the count in descending order.
# Code Herehmda_ms_2024 |>
filter(denial_reason == "Debt-to-income ratio") |>
group_by(derived_race, debt_to_income_ratio) |>
summarize(denied_for_debt_count = n()) |>
arrange(derived_race, desc(denied_for_debt_count))`summarise()` has grouped output by 'derived_race'. You can override using the
`.groups` argument.
# A tibble: 38 × 3
# Groups: derived_race [9]
derived_race debt_to_income_ratio denied_for_debt_count
<chr> <fct> <int>
1 2 or more minority races >60% 2
2 2 or more minority races 42%-49% 1
3 American Indian or Alaska Native >60% 3
4 American Indian or Alaska Native 42%-49% 1
5 American Indian or Alaska Native 50%-60% 1
6 Asian >60% 14
7 Asian 50%-60% 4
8 Asian 42%-49% 1
9 Black or African American >60% 263
10 Black or African American 50%-60% 124
# ℹ 28 more rows
Question
When applicants were denied, what was the most common denial reason for each race? Send me a list of the race, top denial reason, and count of loans that were denied for that reason.
# Code Herehmda_ms_2024 |>
filter(action_taken == "Application denied") |>
group_by(derived_race, denial_reason) |>
summarize(count = n()) |>
ungroup() |>
group_by(derived_race) |>
filter(count == max(count))`summarise()` has grouped output by 'derived_race'. You can override using the
`.groups` argument.
# A tibble: 9 × 3
# Groups: derived_race [9]
derived_race denial_reason count
<chr> <chr> <int>
1 2 or more minority races Credit history 15
2 American Indian or Alaska Native Multiple 18
3 Asian Debt-to-income ratio 19
4 Black or African American Multiple 2025
5 Free Form Text Only Credit history 2
6 Joint Multiple 43
7 Native Hawaiian or Other Pacific Islander Multiple 5
8 Race Not Available Multiple 213
9 White Multiple 1424