March 26, 2020
COVID-19 Cases and Air Quality
During the initial influx of information about the coronavirus outbreak, I became interested in the hypothesis that the COVID-19 pandemic, while detrimental to health, was indirectly improving other health outcomes by reducing air pollution (Kimbrough, 2020).
Figure 1. Pollutant drops in Wuhan, from https://edition.cnn.com/2020/03/16/asia/china-pollution-coronavirus-hnk-intl/index.html
Figure 1 shows the difference in nitrogen dioxide readings in Wuhan, China, between 2019 and 2020. The lockdown in Wuhan due to the coronavirus disease 2019 (COVID-19) outbreak occurred in January 2020 (BBC, 2020). As people were required to remain in their homes, the lockdown reduced human economic activity, and with it the pollution that activity generates (ESA, 2020). Others hypothesised that there might be a corresponding reduction in deaths due to the decrease in pollution (Burke, 2020). It was heartening to read that although some people were suffering from COVID-19, others might be spared because of the reduction in air pollution. However, as of March 2020, most visualisations seemed to compare 2020 air pollution data with previous years’ data. There did not seem to be a direct visualisation of the relationship between air pollution and the progression of the COVID-19 outbreak. This paper aims to present such a visualisation.
The Week 9 lab session, Creating Visualisations with Software, is used as a framework for designing and developing the visualisation. Before the visualisation can be designed, the data must be obtained and converted into the appropriate format for the software.
The initial inspiration for this paper’s visualisation was provided by Burke, as shown in Figure 2 (Burke, 2020). The first design idea was to overlay the local COVID-19 case data on the air quality data to see whether air pollution fell as COVID-19 cases increased. The air quality data was obtained from the AirNow website (AirNow, n.d.). Chengdu was chosen as it was the closest city to Wuhan, China, available on the AirNow website. The dataset’s readings were hourly, and there was one CSV file for each year. These were combined into one data frame, and the hourly readings were averaged to daily ones [a process frowned upon by some (aqicn.org, n.d.)]. The final data frame was exported as a CSV file, “Chengdu.csv”.
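The hourly-to-daily averaging described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the author's code (which was written in R and is referenced as Appendix 1). The sample readings, the function name `daily_mean_aqi` and the `-999` missing-value sentinel are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical hourly readings as (timestamp, AQI) pairs. The values are
# invented for illustration and are not taken from the AirNow files.
hourly_readings = [
    ("2020-01-01 00:00", 150),
    ("2020-01-01 01:00", 170),
    ("2020-01-02 00:00", 90),
    ("2020-01-02 01:00", -999),  # assumed sentinel for a missing reading
    ("2020-01-02 02:00", 110),
]


def daily_mean_aqi(readings, missing=-999):
    """Group hourly readings by calendar day and average each group."""
    by_day = defaultdict(list)
    for stamp, value in readings:
        if value == missing:
            continue  # skip missing readings so they do not distort the mean
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date()
        by_day[day.isoformat()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


daily = daily_mean_aqi(hourly_readings)
```

A plain arithmetic mean is the simplest choice; as noted above, some sources discourage averaging air quality readings in this way, so the result is best treated as a rough daily summary.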
Figure 2. PM2.5 concentrations in Chengdu in Jan-Feb 2016-2019 (red lines) vs the same period in 2020 (blue lines) (Burke, 2020).
Figure 3. AQI in Chengdu in Jan-Feb 2016-2019 (red lines) vs the same period in 2020 (blue lines) vs COVID-19 cases (black line).
The COVID-19 data was obtained from GitHub. This data was at the provincial level for China, whereas the AirNow air quality data was at the city level. For the purpose of providing a dataset for this coursework, as Chengdu is the capital of Sichuan, it seemed reasonable to extract the Sichuan COVID-19 data and combine it with the Chengdu air quality data. Figure 3 shows a test ggplot2 (Wickham, 2016) chart made before exporting the dataset for visualisation in third-party tools.
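Combining the city-level air quality series with the provincial case counts amounts to a join on the date. The sketch below shows one way to do this in Python, assuming both series are already keyed by ISO date strings. The figures and the names `daily_aqi`, `sichuan_cases` and `join_on_date` are hypothetical and do not come from the original datasets or the author's R code.

```python
# Hypothetical daily values keyed by ISO date. The numbers are invented
# for illustration and are not the real Chengdu or Sichuan figures.
daily_aqi = {"2020-01-22": 160.0, "2020-01-23": 145.0, "2020-01-24": 130.0}
sichuan_cases = {"2020-01-23": 8, "2020-01-24": 15, "2020-01-25": 28}


def join_on_date(aqi_by_date, cases_by_date):
    """Inner join: keep only the dates present in both series."""
    shared_dates = sorted(aqi_by_date.keys() & cases_by_date.keys())
    return [
        {"date": d, "aqi": aqi_by_date[d], "cases": cases_by_date[d]}
        for d in shared_dates
    ]


combined = join_on_date(daily_aqi, sichuan_cases)
```

An inner join drops days that appear in only one source. If the air quality record should be kept for dates before the first reported case, a left join that fills the missing case counts with zero would be an alternative.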
The data was exported to Datawrapper. However, Datawrapper did not support charts with two different Y axes, as used in Figure 3. In fact, there was a blog post arguing against the practice (Rost, 2018). Figure 4 shows the chart of the data created with Datawrapper.
Another tool investigated in the Week 9 lab was RAWGraphs. After the data import step, the next page displayed the charting options available. Since none seemed to be a good match for the visualisation requirements (a dual Y-axis line chart), this approach was also abandoned.
Since neither tool met the requirements, it was decided to return to ggplot2, which does support separate Y axes (Holtz, 2018). The bbplot package (BBC, 2019) seemed to be an effective way of giving the line chart a modern look. The colour palette chosen was Snorkel from Pantone (Pantone, 2020). Two charts were designed: one with dual Y axes, the other with two separate plots. Appendix 1 shows the data manipulation and the code used to generate the charts.
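A dual Y-axis chart of this kind relies on a simple rescaling: ggplot2 draws a secondary axis as a one-to-one transformation of the primary axis, so the second series has to be mapped onto the primary scale before plotting and the secondary axis labels mapped back. The arithmetic is sketched below in Python rather than R. The max-to-max scale factor, the sample values and the function names are assumptions for illustration and are not taken from Appendix 1.

```python
def scale_factor(primary_values, secondary_values):
    """Factor that maps the secondary maximum onto the primary maximum."""
    return max(primary_values) / max(secondary_values)


# Hypothetical values, invented for illustration.
aqi = [160.0, 145.0, 130.0, 120.0]  # primary axis series
cases = [8, 15, 28, 40]             # secondary axis series

k = scale_factor(aqi, cases)

# Plot this rescaled series against the primary (AQI) axis.
cases_on_aqi_scale = [c * k for c in cases]


def secondary_label(primary_tick):
    """Inverse transform: the case count shown beside a primary-axis tick."""
    return primary_tick / k
```

Because the factor is arbitrary, the apparent relationship between the two lines changes with the choice of `k`, which is one of the objections to dual axes raised by Rost (2018).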
Figure 5. Option 1 – ggplot2 with dual Y axes
Figure 6. Option 2 – ggplot2 in two separate charts
Unfortunately, the author has been unable to thoroughly evaluate the two different options with prospective consumers of the charts. However, in option 1, the Y-axis label colours were changed to reduce ambiguity. In option 2, the AQI Y-axis limit was reduced to enable the line chart to display more centrally and to highlight the perceived slope relative to the lockdown date.
In terms of communicating the data, both visualisations, shown in Figure 5 and Figure 6, show a clear relationship between when the lockdown occurred in Chengdu and the effect it had on cumulative COVID-19 cases. Surprisingly, it was not apparent that there was a similar effect on the AQI. Perhaps AQI is a lagging indicator, and more recent datasets might show an effect? Maybe the model is too simplistic? There are also additional determinants of air quality, such as weather conditions (aqicn.org, n.d.).
When using any tool to facilitate data visualisation, the user is constrained by the functionality provided by that tool and must manipulate the data into a form that the tool can consume. Typically, higher-level tools seem to prioritise a specific set of use cases. If the visualisation use case is not supported, then an alternative tool or design must be chosen. As pointed out in the IM921 lectures, even low-level tools like ggplot2 have constraints. For example, in Figure 5, it might have been clearer to display the month labels between the axis ticks. However, that feature is not specifically supported in ggplot2 (Add option for range ticks (tick labels between tick marks) · Issue #1966 · tidyverse/ggplot2, 2017).
Surprisingly, the charts showed that air quality in Chengdu did not seem to be influenced by the COVID-19 lockdown. Until one realises that the Wuhan lockdown did not occur until January 23rd (Wikipedia, 2020), this finding seems to contradict other visualisations (Figure 1) that displayed a dramatic reduction in pollution due to COVID-19. Further investigation would be needed to determine what proportion of the change in air quality in Chengdu was due to the 2020 COVID-19 outbreak as opposed to other determinants.
Add option for range ticks (tick labels between tick marks) · Issue #1966 · tidyverse/ggplot2. (2017). GitHub. https://github.com/tidyverse/ggplot2/issues/1966
AirNow. (n.d.). Retrieved April 8, 2020, from https://airnow.gov/index.cfm?action=airnow.global_summary#China$Chengdu
aqicn.org. (n.d.). Beginner’s Guide to Air Quality Instant-Cast and Now-Cast. The World Air Quality Index. Retrieved April 8, 2020, from https://aqicn.org/search/vn/
BBC. (2019, February 1). How the BBC Visual and Data Journalism team works with graphics in R. Medium. https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535
BBC. (2020, January 23). Lockdowns rise as China tries to control virus. BBC News. https://www.bbc.com/news/world-asia-china-51217455
Burke, M. (2020). COVID-19 reduces economic activity, which reduces pollution, which saves lives. http://www.g-feed.com/2020/03/covid-19-reduces-economic-activity.html
ESA. (2020). Coronavirus lockdown leading to drop in pollution across Europe. https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-5P/Coronavirus_lockdown_leading_to_drop_in_pollution_across_Europe
Holtz, Y. (2018). Dual Y axis with R and ggplot2. https://www.r-graph-gallery.com/line-chart-dual-Y-axis-ggplot2.html
Kimbrough, L. (2020). Response to one pandemic, COVID-19, has helped ease another: Air pollution. https://news.mongabay.com/2020/03/response-to-one-pandemic-covid-19-has-helped-ease-another-air-pollution/
Pantone. (2020). Pantone Color of the Year 2020 Palette Exploration | PANTONE 19-4052 Classic Blue | Pantone UK. https://store.pantone.com/uk/en/color-of-the-year-2020-palette-exploration
Rost, L. (2018). Why not to use two axes, and what to use instead | Chartable. https://blog.datawrapper.de/dualaxis/
Wickham, H. (2016). Create Elegant Data Visualisations Using the Grammar of Graphics. https://ggplot2.tidyverse.org/
Wikipedia. (2020). Hubei lockdowns. In Wikipedia. https://en.wikipedia.org/w/index.php?title=2020_Hubei_lockdowns&oldid=949385366
COVID-19 is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
As this is a fast-developing topic, coupled with the 2-week submission delay, no material newer than the document date will be referenced.
Particulate matter (PM) is a term used to describe the mixture of solid particles and liquid droplets in the air. PM2.5 is particulate matter with a diameter of less than 2.5 micrometres (µm).