Tools

R for Large Datasets

I prefer R for investigating large datasets because of its large user support community and because it is free and open source (I get tired of renewing my IBM-SPSS license every year!).

In winter 2023, I conducted an independent investigation of the question, "Are Nebraska traffic stops disproportionately resulting in searches for different racial groups?"

I used data from the Stanford Open Policing Project to calculate traffic-stop search rates across all 93 of Nebraska's counties (n = 9,031,494 stops). The dataset was too large for SPSS, Excel, or Google Sheets, so I used R (tidyverse) to build a summary of the data.

Why this data?

Data should be used to make the world better. In this project, I worked to illuminate patterns in Nebraska State Patrol data so that the force can improve. Traffic stops are terrifying for BIPOC in the US because of centuries of sanctioned police brutality. The Stanford Open Policing Project is committed to providing data to the public so we can see how our tax dollars are being spent, and whether they are being spent to serve and protect.

Why Nebraska? 

As a fifth-generation Nebraskan, I care deeply about the state's politics and hope to help it enact its vision of being "The Good Life" for all.

Data

9,031,494 traffic stops over 15 years, each with the officer-assigned racial category of the driver and the county where the stop occurred. (Nebraska has 93 counties, many of them predominantly white.)

A Boolean "search conducted" value (TRUE/FALSE) for each stop.
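To illustrate how fields like these can be turned into search rates, here is a minimal sketch in R with dplyr. It is not my original script: the column names (county_name, subject_race, search_conducted) and the six-row toy table are assumptions made for illustration, standing in for the full stop-level file.

```r
library(dplyr)

# Toy stand-in for the stop-level data. Column names and values are
# assumptions for illustration, not the real Nebraska records.
stops <- tibble(
  county_name      = c("Douglas", "Douglas", "Douglas",
                       "Lancaster", "Lancaster", "Lancaster"),
  subject_race     = c("white", "black", "white",
                       "hispanic", "white", "white"),
  search_conducted = c(FALSE, TRUE, FALSE, TRUE, FALSE, NA)
)

# Search rate (in percent) for each county and racial category.
search_rates <- stops %>%
  filter(!is.na(search_conducted)) %>%   # drop stops with no search record
  group_by(county_name, subject_race) %>%
  summarise(
    n_stops         = n(),
    n_searches      = sum(search_conducted),
    search_rate_pct = 100 * mean(search_conducted),
    .groups = "drop"
  )

print(search_rates)
```

Dropping county_name from group_by() would give one statewide rate per racial category. Because the summary only needs counts per group, the same pattern scales to millions of rows where a spreadsheet cannot even load the file.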

Findings

2.398% of White-categorized drivers were searched.

Discussion

For Black and Hispanic drivers, a search was conducted in roughly 2 to 3 more stops out of every 100 than for White drivers, on average. While police officers may use other available information about the driver to decide whether to conduct a search, the available data suggest that the subject's race plays an outsized role. A subject's race alone is not an acceptable indicator of crime (see Bucerius & Tonry, 2013). I argue this warrants consideration by the Nebraska State Patrol and its constituents.

Future Analyses