COVID-19 Data Cleaning

Introduction

This blog post covers downloading and cleaning the data for an exploratory analysis of COVID-19 in the United States. I will be using COVID-19 case and death data from the New York Times, demographic and spatial data from the US Census Bureau, and social distancing data from the Opportunity Project. The end goal is four separate tidy datasets: two that contain a unique identifier and spatial geometries for the counties and states of the US, and two that contain combined time series and demographic data for US counties and states. Along the way, we will deal with inconsistencies in how the different sources collect their data, and with a small number of missing variables.