Investigating ICE Detention Data

Methods 1 — Urban Ecologies

Author

Anonymous

Published

December 18, 2025

Introduction

State Research Q:

My final project is an exploratory data investigation that applies statistical and spatial analysis to ICE arrests and detentions under the current administration. The second emphasis of this project calls out the public-private funding partnerships behind detention centers and private prison development overall.

A. What are the mechanisms of ICE arrests and detentions in the context of this current administration’s focus on “Making America Safe Again” and deporting 600,000 people by the end of Trump’s first term?

B. What facilities are used for detention-related activities? What insights can be drawn about the relationship between private funding and detention centers?

C. Overall, how do these exploratory questions open further lines of research into the relationship between federal and private engagement in ICE-related activity?

ICE activity under this current administration is unprecedented, egregious, and a deep signal of concern for democracy for all of us. Looking at this topic through a statistical and temporal lens, I argue that it is essential to begin to understand the mechanisms by which federal and corporate support, collaboration, and, most importantly, profit-driven motives have created a calculated and strategic attack on immigrants and people of color, regardless of their citizenship.

Utilizing the Vera Institute of Justice’s professionally reviewed, cleaned, and organized dashboard, the most detailed analysis of the ICE detention population I could find, I began my own investigation to look deeper into the real “how did we get here” story. Their data sets included facility metadata: a comprehensive list of detention centers, but also the non-dedicated facilities, most notably hospitals and county jails, that have been drawn into cooperation in this unprecedented era of detention. The database records 1,397 facilities engaged with ICE detentions over a 16-year period from 2009 to mid-fiscal year 2025.

3. Sources

https://deportationdata.org/index.html - XLSX files

https://www.vera.org/ice-detention-trends - CSV files

https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html - Shapefile

https://www.prisonpolicy.org/reports/jails_immigration.html - Research database for facility information

Limitations and Bias: Obtaining ICE-related information that is detailed, consistent, and accurate was very difficult. This project relied entirely on the work of data-justice-centered projects such as the Deportation Data Project out of UC Berkeley, the Vera Institute of Justice, and the Prison Policy Initiative. At the federal level, the data challenges are threefold: releases are infrequent and inconsistent; individual case-level details cannot be connected because ICE, CBP, and other federal agencies do not link their data; and documentation is often insufficient, with little realistic accountability. Thanks to the Freedom of Information Act (FOIA), which requires the release of more accurate data, these data-justice-oriented organizations have been able to produce more comprehensive data sets. Nevertheless, at this very moment, FOIA does not guarantee much transparency or accountability for detention and deportation data.

4. Data Prep and Cleaning:

Part A Data Clean and Prep:

Code
# Install packages if you haven't already
# install.packages(c("tidyverse", "readxl", "janitor", "scales"))

library(tidyverse)  # dplyr, ggplot2, tidyr, etc.
library(readxl)     # read_excel()
library(lubridate)  # date helpers
library(janitor)    # clean_names()
library(scales)     # comma and dollar label formatters

# Read in arrest data
ERO_Admin_Arrests <- read_excel("ERO_Admin_Arrests_FINAL.xlsx")

# Keep arrests made during the current Trump administration.
# as.Date() guards against comparing a date-time column with a Date.
trump_arrests <- ERO_Admin_Arrests %>%
  filter(as.Date(`Apprehension Date`) >= as.Date("2025-01-20")) %>%
  clean_names()

#Seeing Criminality Status and Apprehension Method
apprehension_method_tbl <- trump_arrests %>%
  count(apprehension_method, name = "n") %>%
  mutate(
    percent = round(100 * n / sum(n), 1)
  ) %>%
  arrange(desc(n)) %>%
  as_tibble()

apprehension_method_tbl
write.csv(apprehension_method_tbl, "arrests_by_method.csv", row.names = FALSE)

# Plot apprehension/arrest methods for context
ggplot(apprehension_method_tbl,
       aes(x = reorder(apprehension_method, n),
           y = n)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  labs(
    title = "ICE Apprehensions by Method",
    x = "Apprehension Method",
    y = "Number of Apprehensions"
  ) +
  theme_minimal(base_size = 12)

# ---------------------------------------

# Same process for detention data
# Read in detention data
ICE_Detentions <- read_excel(
  "data/ICE Detentions_LESA-STU_FINAL Release_raw.xlsx"
)

# Keep detentions booked during the current Trump administration.
# as.Date() guards against comparing a date-time column with a Date.
trump_detentions <- ICE_Detentions %>%
  filter(as.Date(`Book In Date Time`) >= as.Date("2025-01-20")) %>%
  clean_names()


#Seeing detentions final program
detentions_finalprogram_tbl <- trump_detentions %>%
  count(final_program, name = "n") %>%
  mutate(
    percent = round(100 * n / sum(n), 1)
  ) %>%
  arrange(desc(n)) %>%
  as_tibble()

detentions_finalprogram_tbl

#Plot Apprehension By Enforcement Context
trump_arrests %>%
  count(apprehension_method) %>%
  ggplot(aes(
    x = apprehension_method,
    y = n,
    fill = apprehension_method
  )) +
  geom_col() +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "ICE Apprehensions by Enforcement Context",
    x = NULL,
    y = "Number of Apprehensions"
  ) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text.x = element_text(angle = 30, hjust = 1)
  )

#Plot Detentions by Enforcement Context
trump_detentions %>%
  count(final_program) %>%
  ggplot(aes(
    x = final_program,
    y = n,
    fill = final_program
  )) +
  geom_col() +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "ICE Detentions by Enforcement Context",
    x = NULL,
    y = "Number of Detentions"
  ) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text.x = element_text(angle = 30, hjust = 1)
  )

Part B:

Code
library(sf)
library(dplyr)
library(tmap)
library(janitor)
library(stringr)
library(scales)
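# facility_detention_join (facility metadata joined to average daily population)
# comes from an earlier step that is not shown here
names(facility_detention_join)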



# Prepare spatial facility data

facility_sf <- facility_detention_join %>%
  clean_names() %>%                             
  filter(
    !is.na(longitude),
    !is.na(latitude),
    !is.na(avg_daily_pop)
  ) %>%
  mutate(
    avg_daily_pop = as.numeric(avg_daily_pop),
    ice_guaranteed_minimum =
      as.numeric(gsub("[^0-9.]", "", ice_guaranteed_minimum))
  ) %>%
  st_as_sf(
    coords = c("longitude", "latitude"),
    crs = 4326,
    remove = FALSE
  )


# Clean facility names for matching


facility_sf <- facility_sf %>%
  mutate(
    facility_name_clean = detention_facility_name %>%
      str_replace_all("\\u00A0", " ") %>%          # remove non-breaking spaces
      str_squish() %>%
      str_to_lower()
  )


# Call out the Top 25 Highest Average Daily Population Facilities


highlight_facilities <- c(
  "Adams County Detention Center",
  "Stewart Detention Center",
  "South Texas ICE Processing Center",
  "Winn Corr Institute",
  "Moshannon Valley Processing Center",
  "Eloy Federal Contract Fac",
  "Otay Mesa Detention Center",
  "NW ICE Processing Ctr",
  "Montgomery Processing Ctr",
  "Lasalle ICE Processing Center",
  "Port Isabel SPC",
  "Denver Contract Det. Fac.",
  "Jackson Parish Correctional Center",
  "Bluebonnet Det Fclty",
  "Krome North SPC",
  "Richwood Cor Center",
  "Basile Detention Center",
  "Prairieland Detention Center",
  "Otero Co Processing Center",
  "Pine Prairie ICE Processing Center",
  "Joe Corley Processing Ctr",
  "El Paso SPC",
  "El Valle Detention Facility",
  "Houston Contract Det.Fac.",
  "IAH Secure Adult Det. Facility"
)

highlight_facilities_clean <- highlight_facilities %>%
  str_replace_all("\\u00A0", " ") %>%
  str_squish() %>%
  str_to_lower()

facility_sf <- facility_sf %>%
  mutate(
    highlight = facility_name_clean %in% highlight_facilities_clean
  )


# Incorporate the financial data research (used in the popup only)


facility_sf <- facility_sf %>%
  mutate(
    ice_guaranteed_minimum_popup =
      if_else(
        is.na(ice_guaranteed_minimum),
        "Not available",
        dollar(ice_guaranteed_minimum, accuracy = 1)
      )
  )

# Create tmap


tmap_mode("view")

tm_facilities <-
  # Non-highlighted facilities
  tm_shape(filter(facility_sf, !highlight)) +
  tm_symbols(
    size = "avg_daily_pop",                        # size is the only data-driven encoding
    col = "#2b6cb0",
    alpha = 0.5,
    popup.vars = c(
      "Facility" = "detention_facility_name",
      "Avg Daily Population" = "avg_daily_pop",
      "ICE Guaranteed Minimum" = "ice_guaranteed_minimum_popup"
    )
  ) +
  
  # Highlighted facilities
  tm_shape(filter(facility_sf, highlight)) +
  tm_symbols(
    size = "avg_daily_pop",                        # same size encoding; colour marks the highlight
    col = "#d53f3f",
    alpha = 0.9,
    border.col = "black",
    popup.vars = c(
      "Facility" = "detention_facility_name",
      "Avg Daily Population" = "avg_daily_pop",
      "ICE Guaranteed Minimum" = "ice_guaranteed_minimum_popup"
    )
  ) +
  
  tm_layout(
    title = "ICE Detention Facilities\nAverage Daily Population",
    legend.outside = TRUE
  )

tm_facilities

Part C:

Code
# Temporal mapping of PAC donation money from the top three private prison developers
library(tidyverse)
library(readxl)
library(lubridate)
library(janitor)
library(scales)

# Read in the PAC donations workbook and standardise the column names
Private_Prison_PACDonations_2021_2025 <-
  read_excel("Private_Prison_PACDonations_2021_2025.xlsx") %>%
  clean_names()

# Sort by date
Private_Prison_PACDonations_2021_2025 <- Private_Prison_PACDonations_2021_2025 %>% arrange(date)


# Sum money by each group
# Create year-month variable (as.Date() so that scale_x_date() below accepts it)
donations_ym <- Private_Prison_PACDonations_2021_2025 %>%
  mutate(
    ym = floor_date(as.Date(date), "month")
  )

#Aggregate by contributor × month
donations_monthly <- donations_ym %>%
  group_by(contributor, ym) %>%
  summarise(
    total_amount = sum(amount, na.rm = TRUE),
    .groups = "drop"
  )

#Line graph 
ggplot(donations_monthly,
       aes(x = ym,
           y = total_amount,
           color = contributor,
           group = contributor)) +
  geom_line(linewidth = 1) +
  geom_point(size = 1.5) +
  scale_x_date(
    date_breaks = "1 month",
    date_labels = "%b %Y"
  )+
  scale_y_continuous(labels = dollar) +
  labs(
    title = "Private Prison PAC Contributions by Month (2021–2025)",
    x = "Month",
    y = "Total Contributions",
    color = "PAC / Contributor"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )



#Create year + month
donations_ym <- Private_Prison_PACDonations_2021_2025 %>%
  mutate(
    year  = year(date),
    month = month(date, label = TRUE, abbr = TRUE)
  )

#Aggregate by contributor × year × month
donations_monthly <- donations_ym %>%
  group_by(contributor, year, month) %>%
  summarise(
    total_amount = sum(amount, na.rm = TRUE),
    .groups = "drop"
  )

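# Aggregate by contributor × year.
# Assumption: donations_yearly, used in the summary table later, holds yearly totals.
donations_yearly <- donations_ym %>%
  group_by(contributor, year) %>%
  summarise(
    total_amount = sum(amount, na.rm = TRUE),
    .groups = "drop"
  )

donations_yearly

# --------
# Faceted monthly line plot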
ggplot(
  data = donations_monthly,
  aes(
    x = month,
    y = total_amount,
    color = contributor,
    group = contributor
  )
) +
  geom_line(linewidth = 1) +
  geom_point(size = 1.5) +
  facet_wrap(~ year, ncol = 1) +
  scale_y_continuous(labels = dollar) +
  labs(
    title = "Private Prison PAC Contributions by Month (Faceted by Year)",
    x = "Month",
    y = "Total Contributions",
    color = "PAC / Contributor"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.minor = element_blank()
  )

ggsave(
  "pac_contributions_monthly_faceted.png",
  width = 20,
  height = 14,   
  dpi = 300
)
Code
# Ranking PAC donation money from the top three private prison developers
library(tidyverse)
library(readxl)
library(lubridate)
library(janitor)
library(scales)

# Read the full donations workbook once and standardise the column names
donations_raw <- read_excel("PAC_Donations_Full_2021_2025.xlsx") %>%
  clean_names()

names(donations_raw)

# Keep only the columns needed for the recipient ranking
donations_ccpl <- donations_raw %>%
  select(contributor, candidate_committee_leadership_pac, amount)

top5_recipients_by_contributor <- donations_raw %>%
  group_by(contributor, candidate_committee_leadership_pac) %>%
  summarise(
    total_amount = sum(amount, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  group_by(contributor) %>%
  slice_max(total_amount, n = 5, with_ties = FALSE) %>%
  ungroup()

ggplot(
  top5_recipients_by_contributor,
  aes(
    x = candidate_committee_leadership_pac,
    y = total_amount,
    fill = contributor
  )
) +
  geom_col(position = "dodge") +
  facet_wrap(~ contributor, scales = "free_y") +
  scale_y_continuous(labels = dollar) +
  labs(
    title = "Top 5 Candidate / Leadership PAC Recipients by Contributor",
    x = NULL,
    y = "Total Contributions (USD)"
  ) +
  theme_minimal(base_size = 4) +
  theme(
    axis.text.x = element_text(angle = 35, hjust = 1),
    strip.text = element_text(face = "bold")
  )



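# write_csv() needs an output path; this file name is a placeholder
write_csv(top5_recipients_by_contributor, "top5_recipients_by_contributor.csv")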

4.1 How are people being apprehended by ICE?

This section explores the currently available data about arrests and detentions. Looking at these relationships and summaries, we are able to understand how ICE finds people, what systems and agencies ICE relies on for this work, and who ends up being detained.

This chart looks at the most up-to-date data since the start of Trump’s term. We see that the majority of arrests, over 87%, come from interior, non-border, non-investigative interactions. This signals the effective use of local and federal collaboration, data sharing, and administrative processing. The largest category was Non-Custodial Arrests at nearly 35%. These are arrests of someone who is not already in jail and is apprehended at a check-in, at a court appearance, or in public. Nearly 25% came from the Criminal Alien Program (CAP), through which ICE apprehends people from local jails. This occurs after people have been brought in for minor charges, pretrial detention, or jail holds. We will see in the interactive map the powerful cooperation of local, state, federal, and private actors with ICE. Under the Trump administration, local jails have especially been a focus for expanding contracts and enforcement programs. Custodial Arrests, at nearly 17%, refer to apprehensions of people already in custody at state prisons, federal facilities, or detention centers.

Interestingly, we see a discrepancy in the pipeline from arrests to detentions. CBP (Customs and Border Protection, which includes Border Patrol) makes up the largest share of the detention statistics, roughly 50% of all detentions. Second is the ICE Enforcement and Removal Operations (ERO) Criminal Alien Program at nearly 35%. ERO is the body responsible for carrying out removals and managing detention. It relies on CAP, which focuses on apprehension, and on 287(g) agreements for its enforcement.

An important conclusion of this EDA is a key mechanism in the infrastructure behind this administration’s goals: interior enforcement identifies and apprehends individuals, while the detention infrastructure is concentrated around border enforcement and, as we will see later, around privately owned detention centers established near the border.

5. Part B: Spatial Analysis, Contextualizing Detention Centers and the Role of Private Prison Development

Overview

For my spatial analysis, I mapped all detention-use facilities in the U.S. together with their average daily population. My leaflet map shows key information about the capacity of these facilities, where concentrations of these facilities are located, and, as a stepping stone to a larger project, who funds the most active detention centers.

Spatial Mapping Facilities | Calculating Detention Activity by Detainee Population and Ranking Most Active Facilities

Marking Top 25 facilities with the highest Average Daily Populations

The Vera Institute of Justice provided daily counts for both the daily population, defined as people who checked in within a 24-hour period, and the midnight population, defined as people who checked in after midnight. I originally wanted to sum the total population held at each facility for the entire year. However, because the data was organized in a way that made it difficult to parse, I did two data manipulations: first, I averaged the population at each detention center per month, and then I averaged those monthly values over the six-month period for each facility. This aggregate-and-average step is meant to give the most accurate and fair representation of the numbers. I was then able to rank these averages from the highest count of detainees to the lowest, with 1,397 facilities listed.

To focus my investigation, I sorted the facilities from highest to lowest population and examined the top 25 facilities with the highest average monthly population from January through June.
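The two-step averaging and ranking described above can be sketched as follows. This is a minimal illustration on made-up numbers, not the Vera data: the data frame `daily_counts` and its columns `facility`, `date`, and `daily_pop` are hypothetical names chosen for the example.

```r
library(dplyr)

# Hypothetical daily counts for two made-up facilities (not Vera data)
daily_counts <- data.frame(
  facility  = rep(c("Facility A", "Facility B"), each = 4),
  date      = rep(as.Date(c("2025-01-10", "2025-01-20",
                            "2025-02-10", "2025-02-20")), times = 2),
  daily_pop = c(100, 200, 300, 500, 10, 30, 50, 50)
)

# Step 1: average the daily counts within each facility-month
monthly_avg <- daily_counts %>%
  mutate(month = format(date, "%Y-%m")) %>%
  group_by(facility, month) %>%
  summarise(monthly_avg_pop = mean(daily_pop, na.rm = TRUE), .groups = "drop")

# Step 2: average the monthly values per facility, then rank high to low
facility_rank <- monthly_avg %>%
  group_by(facility) %>%
  summarise(avg_daily_pop = mean(monthly_avg_pop), .groups = "drop") %>%
  arrange(desc(avg_daily_pop)) %>%
  mutate(rank = row_number())

facility_rank
```

Averaging the monthly averages gives every month equal weight, even when some months have fewer recorded days, which is one reason this can differ slightly from a single average over all days.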

Top 25 Most Active Detention Facilities Jan-June 2025

Code
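library(gt)

# Assumes facility_detention_join is already sorted from highest to lowest
# average daily population
gt_table <- facility_detention_join %>%
  slice(1:25) %>%
  gt() %>%
  tab_header(
    title = "Top 25 Most Active Detention Facilities January 2025-June 2025")

gt_table

# Save as PNG
gtsave(gt_table, "Top25_Detention_Facilities.png")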

I was first curious to examine the classifications of these facilities. I grouped the facilities by type to understand the operational support the top 25 most active facilities could have. Within the 25 highest-ranked facilities engaged in detention activity, 80% are listed in official government documentation as a “dedicated” facility. Dedicated facilities consist of locally owned facilities, both private and state/local/independent, that receive funding from the U.S. government. Examples include mixed-use facilities with contracts for detention activities, as well as private companies and local jails under contract with ICE.

“Dedicated” alone was not enough contextualization, so I looked into the Prison Policy Initiative. My skepticism proved right when I compared my top 25 ranked facilities against the wider data from the initiative, which included operator category, U.S. Marshals ADP, ICE ADP, ICE guaranteed minimum, and U.S. Marshals per diem rate. 23 of the 25 highest-ranked active detention centers operate predominantly under private contracts, with GEO Group first, CoreCivic second, and MTC third. These findings were not a huge surprise; however, I was surprised that federal operations did not rank higher (#21 and #23). It is critical to note that these top facilities are concentrated in areas near the border, largely in red states. This part of my project called for a much more serious dive into the social implications and concluded my independent data generation and analysis.
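As a rough sketch of how ranked facilities could be matched to an operator table and then tallied by operator category, consider the example below. Everything in it is illustrative: the facility names, the `operator_lookup` table, and the `clean_name()` helper are hypothetical, and the real matching in this project was done by manual research.

```r
library(dplyr)

# Hypothetical ranked facilities (made-up names and values)
top_facilities <- data.frame(
  facility_name = c("Alpha Detention Center", "BETA  Processing Ctr",
                    "Gamma County Jail"),
  avg_daily_pop = c(1500, 1200, 300)
)

# Hypothetical operator lookup standing in for a research database
operator_lookup <- data.frame(
  facility_name     = c("alpha detention center", "Beta Processing Ctr",
                        "Gamma County Jail"),
  operator_category = c("Private", "Private", "Local government")
)

# Hypothetical helper: normalise names the same way on both sides
clean_name <- function(x) tolower(trimws(gsub("\\s+", " ", x)))

# Join on the cleaned name so spacing and case differences do not break matches
joined <- top_facilities %>%
  mutate(name_key = clean_name(facility_name)) %>%
  left_join(
    operator_lookup %>%
      mutate(name_key = clean_name(facility_name)) %>%
      select(name_key, operator_category),
    by = "name_key"
  )

# Count facilities and compute the share held by each operator category
operator_share <- joined %>%
  count(operator_category, name = "n") %>%
  mutate(percent = round(100 * n / sum(n), 1)) %>%
  arrange(desc(n))

operator_share
```

Any facility left with a missing `operator_category` after the join is a name that failed to match and would still need to be checked by hand.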

Further Research on Operator Category: Where does Private Investment Hide?

In my cumulative research on this theme, I was really interested in further understanding the collaboration within public-private partnerships as it relates to ICE contracts. I began by looking at prison PACs, political action committees that donate funds to political candidates or campaigns. After finding an anonymous data set posted on Reddit that traced PAC donations, I aimed to do a temporal mapping of this financial data. The available data covers 2021 to 2025.

Temporal Mapping of Prison PAC Donations 2021-2025

Code
tibble::as_tibble(donations_yearly)
library(gt)

donations_yearly %>%
  gt() %>%
  tab_header(
    title = "PAC Donations",
    subtitle = "2021–2025"
  ) 
Code
tibble::as_tibble(top5_recipients_by_contributor)
library(gt)

top5_recipients_by_contributor %>%
  gt() %>%
  tab_header(
    title = "PAC Donations",
    subtitle = "2021–2025"
  ) %>%
  fmt_currency(
    columns = where(is.numeric),
    decimals = 0
  )
Part C: Temporal Data Analysis: What Were the Highest Donations and to Whom?

This graph I created shows the breakdown of the top 5 contributions by CoreCivic, GEO Group, and MTC. Further contextualization reveals that 89% of the listed contributions went to Republican-aligned candidates or party infrastructure. Only 19% of funds went to individual candidates; party-building or leadership structures made up over 80% of the PAC contributions. House races received 4x more direct funding than Senate races. The prison PACs indicate relationship building that opens the door to “pay-to-play” scenarios with candidates who favor lobbying and federal and state policies supportive of private prisons.
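Shares like the ones quoted above could be computed along these lines. This is only a sketch on invented numbers: the `party` and `recipient_type` columns are assumptions for illustration and are not confirmed to exist in the PAC data set used here.

```r
library(dplyr)

# Hypothetical donations (made-up values; party and recipient_type are assumed columns)
donations <- data.frame(
  contributor    = c("PAC 1", "PAC 1", "PAC 2", "PAC 2"),
  recipient_type = c("Candidate", "Party committee", "Leadership PAC", "Candidate"),
  party          = c("Republican", "Republican", "Republican", "Democratic"),
  amount         = c(1000, 5000, 3000, 1000)
)

# Share of total dollars by party alignment
party_share <- donations %>%
  count(party, wt = amount, name = "total_amount") %>%
  mutate(percent = round(100 * total_amount / sum(total_amount), 1)) %>%
  arrange(desc(total_amount))

# Share of total dollars by recipient type
type_share <- donations %>%
  count(recipient_type, wt = amount, name = "total_amount") %>%
  mutate(percent = round(100 * total_amount / sum(total_amount), 1)) %>%
  arrange(desc(total_amount))

party_share
type_share
```

Weighting by `amount` rather than counting rows matters here, because a few large party-committee contributions can outweigh many small candidate contributions.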

9. Discussion & Interpretation

The breakdown of this project’s themes was deliberate. By starting with the arrest and detention data, I was able to mark research points to investigate further. The picture is clear: the expansion of ICE activity at the federal level in the U.S. during this current administration relies on many collaborations between the state, local, and private sectors. Moreover, the centrality of private prison development and infrastructure is a key marker of the profit motive. Part C underscores a much more hidden narrative of the “pay-to-play” politics of prisons in the U.S., especially in Republican, southern states. For next steps, I would like to focus mostly on expanding the detail of the tmap of the detention facilities to include more of the financial breakdown points and to tag all of the facilities by operator type, i.e. private/federal/local. That work was difficult to achieve on my own, as I manually researched each individual facility. I am hoping to collaborate with projects focused on similar themes in the future.