Air Quality Sensing and
Neighborhood Health Analysis
Tools: Kepler.gl/Mapbox, Python (pandas, numpy, seaborn), R, sklearn, Tableau
Data sources: Array of Things (28 million+ samples taken in October 2019), Chicago Health Atlas (2017 data), Walkscore.com API
Overview
This research examines the relationship between walkability, pollutant concentrations, and zip code demographics (median area income and asthma-related hospital visits for individuals under 18 years old) in Chicago. The analysis focuses on the following questions:
Do more walkable areas have different pollutant levels than less walkable areas?
Is median area income an indicator of community health and asthma-related hospital visits?
How do pollutant levels vary by time of day and location?
Are there differences in the ozone levels between weekdays and weekends?
This analysis looks at a range of air quality measurements recorded by Array of Things sensors, with a more in-depth look at ozone concentration. ‘Ozone is one of the six common air pollutants identified in the Clean Air Act...It is created by chemical reactions between oxides of nitrogen (NOx) and volatile organic compounds (VOC). This happens when pollutants emitted by cars, power plants, industrial boilers, refineries, chemical plants, and other sources chemically react in the presence of sunlight’ (EPA). Of the pollutants measured by AoT, ozone was among the highest relative to the national maximum 8-hour average (in parts per million). In addition to its negative environmental effects, ‘Breathing ozone can trigger a variety of health problems.’ (EPA)
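To make the 8-hour comparison concrete, the following is a minimal sketch of how a daily maximum 8-hour average could be computed from timestamped ozone readings with pandas. It is an illustration and not the analysis code used here: the function name, the hourly resampling step, and the 6-of-8-hours completeness rule are assumptions, and the regulatory threshold is left out because the source does not quote it.

```python
import pandas as pd


def daily_max_8h_mean(ozone_ppm: pd.Series) -> pd.Series:
    """Daily maximum of the 8-hour rolling mean of ozone readings (ppm).

    `ozone_ppm` is assumed to be indexed by timestamp (DatetimeIndex).
    """
    # Collapse raw sensor samples to hourly means first.
    hourly = ozone_ppm.resample("1h").mean()
    # Trailing 8-hour window; require at least 6 valid hours (assumed rule).
    rolling = hourly.rolling(window=8, min_periods=6).mean()
    # Keep the highest 8-hour mean observed on each calendar day.
    return rolling.resample("1D").max()
```

The resulting daily series can then be compared with whichever standard applies, for example by counting the days on which it is exceeded.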
Methods
Step 1 - Data Collection: Requests were made to the Walk Score (WS) API with the required parameters (URL-encoded address, latitude and longitude, API key), using the node coordinates from Array of Things (AoT), which were acquired from the project’s website. Demographic data was acquired from Chicago Health Atlas.
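A request of this kind could be sketched as follows, using only the Python standard library. The endpoint and the query parameter names (format, address, lat, lon, wsapikey) are given as recalled from Walk Score's public documentation and should be checked against the current docs; the function names and the "walkscore" response field are assumptions for illustration, not the code used in this project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the Walk Score API documentation.
WALKSCORE_ENDPOINT = "https://api.walkscore.com/score"


def build_walkscore_url(address: str, lat: float, lon: float, api_key: str) -> str:
    """Build the request URL for one AoT node location."""
    params = {
        "format": "json",
        "address": address,   # urlencode handles the URL encoding
        "lat": lat,
        "lon": lon,
        "wsapikey": api_key,  # assumed parameter name for the API key
    }
    return f"{WALKSCORE_ENDPOINT}?{urlencode(params)}"


def fetch_walkscore(address: str, lat: float, lon: float, api_key: str,
                    timeout: float = 10.0):
    """Return the walk score for a location, or None if the field is absent."""
    url = build_walkscore_url(address, lat, lon, api_key)
    with urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("walkscore")  # assumed response field
```

In practice this would be called once per node coordinate, with the results stored alongside the node identifiers for the join in Step 2.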
Step 2 - Data Wrangling: Two separate analyses were conducted using R and Python. The WS, AoT, Chicago Health Atlas, and shapefile data were joined into one table on common lat/lon/zipcode values. Walkability groups were derived from the walk scores using a normal distribution. For some analytical methods, values were normalized with a z-score.
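The join and grouping could look roughly like the pandas sketch below. The column names (lat, lon, zipcode, walkscore) and the group boundaries at one standard deviation either side of the mean are assumptions made for illustration; the source does not state the exact cut points or schema.

```python
import pandas as pd


def zscore(values: pd.Series) -> pd.Series:
    """Standardize a column to mean 0 and (population) standard deviation 1."""
    return (values - values.mean()) / values.std(ddof=0)


def join_and_group(nodes: pd.DataFrame, walk: pd.DataFrame,
                   health: pd.DataFrame) -> pd.DataFrame:
    """Join node, walk score and health tables, then bin walkability."""
    merged = (
        nodes.merge(walk, on=["lat", "lon"], how="inner")  # match on coordinates
             .merge(health, on="zipcode", how="left")      # attach demographics
    )
    merged["walk_z"] = zscore(merged["walkscore"])
    # Assumed grouping: more than 1 SD below the mean, within 1 SD, above 1 SD.
    merged["walk_group"] = pd.cut(
        merged["walk_z"],
        bins=[float("-inf"), -1.0, 1.0, float("inf")],
        labels=["low", "medium", "high"],
    )
    return merged
```

Joining on floating-point coordinates only works when both tables carry identical values, so rounding lat/lon to a fixed precision before merging is a reasonable precaution.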
Step 3 - Data Exploration and Visualizations: A normalized correlation matrix, box plots, line graphs, scatter plots, and spatial maps were produced to address the questions above. Additionally, a k-means clustering was run (not included here) to reveal similarities between zip codes.
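As a rough illustration of the correlation step, the sketch below standardizes a set of columns and computes their Pearson correlation matrix with pandas. The function and column names are hypothetical, and this is one plausible reading of "normalized correlation matrix" rather than the exact procedure used.

```python
import pandas as pd


def normalized_correlation(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Z-score the chosen columns, then return their Pearson correlations."""
    subset = df[columns]
    standardized = (subset - subset.mean()) / subset.std(ddof=0)
    return standardized.corr(method="pearson")
```

Pearson coefficients are unchanged by z-scoring, so the standardization mainly matters for putting the variables on a common scale in the accompanying plots and in distance-based methods such as k-means. The resulting matrix can be passed directly to seaborn's heatmap function for display.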
Data Sample
Exploratory Data Analysis
Crude Rate by Zip Code
Median Income by Zip Code
Ozone Concentrations in Chicago as Measured by Array of Things Sensors
Sample Context
Initial analysis conducted with Jen Foran
Next Steps:
AoT has introduced car and pedestrian sensors into its nodes; however, they are not yet calibrated or active as of this writing. This additional dataset would help in understanding both walkability and car emissions.
This analysis was limited to 28 million entries and 16 nodes. A broader time frame with more entries could yield additional insights.
The 16 available nodes with air quality data captured a limited number of zip codes. As additional node data becomes available we will be able to see a bigger picture of demographic trends.
Key Takeaway: There is a clear correlation between median area income and asthma-related hospital visits. Of the zip codes analyzed, those in Southeast Chicago had lower incomes and more hospital visits. Pollutant concentrations in these areas were also sustained over longer periods of time, although there is no clear correlation between ozone levels and hospitalizations. Additional nodes and zip codes should be included in future analyses as data becomes available in order to verify these initial findings.