My cohort-mate and fellow R wonk, Paul Bloom, and I presented these slides for Columbia Foundations for Research Computing. Our presentation focused on cleaning and plotting data on civilian allegations of NYPD misconduct from the New York Civil Liberties Union.

The NYPD Misconduct Complaint Database

From the NYCLU:

The NYPD Misconduct Complaint Database is a repository of complaints made by the public on record at the Civilian Complaint Review Board (CCRB). These complaints span two distinct periods: the time since the CCRB started operating as an independent city agency outside the NYPD in 1994 and the prior period when the CCRB operated within the NYPD. The database includes 323,911 unique complaint records involving 81,550 active or former NYPD officers. The database does not include pending complaints for which the CCRB has not completed an investigation as of July 2020.

Our slides

Part 1: Data Cleaning

Part 2: Plotting, with a focus on using ggalluvial to visualize complaints as they are ruled on by the CCRB, and then by the NYPD itself

Source code

These slides were made with xaringan. They live in their own GitHub repo, should you like to clone the code yourself.

They are packaged with an renv lockfile that should let you install all the dependency packages with a few commands. Please note that the project was written primarily in R 4.0.3. If you have R >= 4.0.0, renv::restore() should install our dependency packages smoothly, but if you have R 3.x.x you may not find it so easy, since some of the locked dependency versions require R 4.0.0 or above.
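
As a rough sketch of what those few commands might look like from a terminal, here is a small shell helper that checks the local R version before calling renv::restore(). The function names r_version_ok and restore_slides_deps are made up for this example, and it assumes you have already cloned the slides repo and are sitting in the directory that holds renv.lock.

```shell
# Succeeds when the given R version string is 4.0.0 or newer,
# judged by its major version (e.g. "4" from "4.0.3").
r_version_ok() {
  major=${1%%.*}
  [ "$major" -ge 4 ]
}

# Hypothetical helper: restore the lockfile's packages only if R is new enough.
# Run it from the root of the cloned slides repo (where renv.lock lives).
restore_slides_deps() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 1
  fi
  # --vanilla skips the project's .Rprofile so only the version is printed.
  r_version=$(Rscript --vanilla -e 'cat(as.character(getRversion()))')
  if r_version_ok "$r_version"; then
    # No --vanilla here: the project's .Rprofile is what activates renv.
    Rscript -e 'renv::restore()'
  else
    echo "R $r_version is older than 4.0.0; some locked versions may not install" >&2
    return 1
  fi
}
```

After sourcing the snippet, calling restore_slides_deps inside the cloned repo would do the restore. From an interactive R session opened in the project, plain renv::restore() does the same job without the version guard.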

Hosting the slides on this website

I used git submodules to keep the main git-tracked repo for this project separate from my personal website repo, while still syncing a copy of its content into the website repo to take advantage of the already-set-up web hosting.

  1. Created cu-nypd-ccrb-data and cloned to my computer as usual
  2. Used the same clone link to initialize a submodule in content/posts of this repo, my personal website repo
  3. Realized I wanted to move the submodule to another subdirectory; used git mv to move my submodule directory to static/posts instead, per Yihui Xie.
    • I originally created this submodule in content/posts so that any Rmd files would be auto-knitted by blogdown every time I rendered my whole site. However, we ended up going with xaringan slides, which need to be knitted on their own rather than with the blogdown::html_page output format. Putting the slides in static ensures that the slide files will still be copied to the public folder, but they won’t be auto-knitted with the wrong output format.
  4. Whenever big changes were made in cu-nypd-ccrb-data, pulled in upstream changes in the submodule directory of my personal website repo
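
The steps above can be sketched as a self-contained shell demo. This is not a transcript of the commands I actually ran: it uses two throwaway local repos as stand-ins for cu-nypd-ccrb-data and my website repo so that it runs without network access, and the submodule directory name and the index.Rmd file are assumptions for illustration. With the real GitHub clone link, the protocol.file.allow flag would not be needed.

```shell
set -e

# Throwaway identity and scratch directory so the demo touches nothing real.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Step 1 stand-in: a local repo playing the role of cu-nypd-ccrb-data.
git init -q "$work/slides"
echo "# slides" > "$work/slides/index.Rmd"
git -C "$work/slides" add index.Rmd
git -C "$work/slides" commit -q -m "Add slides"

# Stand-in for the personal website repo.
git init -q "$work/website"
cd "$work/website"
git commit -q --allow-empty -m "Start site"

# Step 2: add the slides repo as a submodule under content/posts.
# Newer git versions block local-path submodule clones by default, hence the -c flag.
git -c protocol.file.allow=always submodule --quiet add \
  "$work/slides" content/posts/cu-nypd-ccrb-data
git commit -q -m "Add slides submodule"

# Step 3: move the submodule to static/posts; git mv also updates .gitmodules.
mkdir -p static/posts
git mv content/posts/cu-nypd-ccrb-data static/posts/cu-nypd-ccrb-data
git commit -q -m "Move slides submodule to static/posts"

# Step 4: after upstream changes, pull inside the submodule directory,
# then record the new submodule commit in the website repo.
echo "more slides" >> "$work/slides/index.Rmd"
git -C "$work/slides" commit -q -am "Revise slides"
git -C static/posts/cu-nypd-ccrb-data pull -q --ff-only
git add static/posts/cu-nypd-ccrb-data
git commit -q -m "Update slides submodule"
```

The last two commands matter because the website repo only stores a pointer to one specific commit of the submodule. Pulling inside the submodule directory updates the files, but the pointer only moves once the change is added and committed in the parent repo.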

Submodules can be kind of a huge headache, but in this instance they served my needs well. Since I was collaborating with Paul on the slides, it was way easier to have cu-nypd-ccrb-data in its own independent GitHub repository. That way we could collaborate on that repo without me having to give Paul access to my entire personal website repo (don’t want to overwhelm him with all my files!).