Data Collection Process

The datasets our team collected fall into four categories: data on the Promise Zone’s economic characteristics, data on housing insecurities, data on individual property conditions, and data on health outcomes. Our team took a comprehensive and systematic approach to both collecting and “cleaning” the data. Data cleaning refers to the process of organizing and preparing collected data so that the datasets can be analyzed. Below is an overview of each category that explains what the category measures, where its datasets come from, and how they fit into this project.

Data on the Population’s Economic Characteristics (From 2016 American Community Survey, 5-year Estimates)

We used data from the 2016 American Community Survey to create a community profile for the North Hartford Promise Zone. Understanding the economic background of residents is important because it sets the context for the housing conditions and health outcomes found there. Variables in this category include the unemployment rate and the family poverty rate for the Promise Zone. All of these datasets came from the United States Census Bureau’s American Community Survey, 5-year Estimates, meaning that the data measure these conditions over a five-year period, in this case 2012 to 2016. These data aggregate household responses at the Census tract level. For background, a Census tract is roughly the equivalent of a small neighborhood of around 2,000 to 4,000 people, which the Census uses for its own research. Hartford has 40 Census tracts.

Data on Housing Insecurities (From 2016 American Community Survey, 5-year Estimates)

These datasets measure housing insecurities and challenges that are not related to the physical condition of the area’s properties. Examples include homeownership rates and rent as a percentage of household income. We analyzed these variables because we recognize that maintaining a home involves challenges beyond the physical upkeep of the property. All of these datasets also came from the United States Census and are organized at the Census tract level.

Data on Individual Property Conditions (From Community Solutions)

This dataset tracks property conditions for individual residential properties in the North Hartford Promise Zone. It comes from Community Solutions’ Property Condition Survey, which was conducted in the summer of 2017. The volunteers and Community Solutions staff who conducted the survey answered 25 questions about each Promise Zone property (e.g., Are all the foundation walls free of open cracks, breaks, and/or deterioration?). Each property was then given a score based on the answers to those questions. The higher a property’s score, the more issues or conditions of “blight” it was considered to have. For example, a property that received a score of “7” was considered to have more issues than a property that received a score of “4.” This dataset, as will be shown next, has been extremely valuable for going beyond the tract level and getting a detailed, property-by-property understanding of the neighborhood’s housing stock. You can download the full survey here:

Community Solutions Property Conditions Survey

Data on Health Outcomes (From 2015 CDC 500 Cities Project)

These datasets measure a wide variety of health outcomes at the Census tract level, including major physical health outcomes such as the rates of asthma and stroke. They also measure other health outcomes, such as the percentage of the population that gets fewer than seven hours of sleep a night. They come from the 500 Cities Project conducted in 2015 by the Centers for Disease Control and Prevention. The CDC project gathered public health data for the 500 largest cities in the United States so that the data could be used to analyze urban public health.

