
COVID-19/Coronavirus Tracker

Read this page as a standalone webpage here, where it is formatted much more nicely than GitHub’s READMEs.

This page contains graphs of the spread of coronavirus throughout the world and code to create those graphs.

While this document discusses the effects of coronavirus from a statistical standpoint, its intent is not to reduce those affected by coronavirus to mere statistics. Unfortunately, graphs cannot exhibit the personal and societal impacts of a disaster; they can only show the numerical outcomes.

ℹ️

The COVID Tracking Project will stop releasing their daily data update on March 7, 2021. Since this site relies on that data, this site will also stop updating on that date.

[map signs] Reading the maps

This section contains an interactive 2x2 grid of plots. The four quadrants show how the number of cases and deaths, total and per capita, has progressed over time. Regions that have light gray stripes on a given day had no data available on that day. Otherwise, the colors are logarithmically scaled. Because the range of values in each quadrant is different, each quadrant’s color scale is unique to that quadrant.
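As a rough illustration of what a logarithmic color scale does, the sketch below maps a value onto a 0 to 1 color position. The function name `log_normalize` and the per-quadrant ranges are hypothetical and chosen only for the example; this is not necessarily how the maps on this page compute their colors.

```python
import math


def log_normalize(value, vmin, vmax):
    """Map `value` to a position in [0, 1] on a log scale.

    Returns None for missing data (shown as light gray stripes on the maps).
    """
    if value is None:
        return None
    if vmin <= 0 or vmax <= vmin:
        raise ValueError("log scaling requires 0 < vmin < vmax")
    # Clamp so out-of-range values land on the ends of the color scale.
    clamped = min(max(value, vmin), vmax)
    span = math.log10(vmax) - math.log10(vmin)
    return (math.log10(clamped) - math.log10(vmin)) / span


# Hypothetical ranges: each quadrant gets its own (vmin, vmax), so the same
# color can stand for very different values in different quadrants.
quadrant_ranges = {
    "cases_total": (1, 1_000_000),
    "deaths_per_capita": (1e-7, 1e-3),
}

position = log_normalize(1_000, *quadrant_ranges["cases_total"])
```

With these made-up ranges, 1,000 total cases sits halfway along the color scale even though it is far below the arithmetic midpoint of the range, which is the point of scaling logarithmically.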

Interaction

At the top is a menu for selecting which data you would like to view, and below are the maps themselves. On desktop, zoom by holding the shift key and scrolling over the graph. (Due to what I can only assume is a Safari bug, shift+scroll doesn’t work in Safari. It should work fine in other browsers, though.) Once you’ve zoomed in, you can drag to pan and can reset the view by double-clicking on the graph. On mobile, use two fingers to zoom and pan. Hovering your cursor over a region (desktop) or tapping it (mobile) will show its data for the selected date. Along the bottom are controls for continuous playback of the data over time as well as a slider to pick the date manually.

📆 Interpreting the Axes

The dates on the graphs’ x-axes mark the end of a collection period: each date represents the data collected during the 24 hours from the preceding midnight up to midnight of that date. For instance, the vertical line over March 21 represents data collected from 00:00 March 20 to 23:59:59 March 20. On the last day, fewer than 24 hours of data will have been collected, so the vertical line over today at the current time represents the data collected from midnight (this morning) to now.
The y-axis of the graphs is log scaled. On all graphs, minor y-axis gridlines are spaced linearly between major gridlines.
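The two axis conventions above can be sketched in a few lines of Python. The helper names `collection_window` and `minor_gridlines` are hypothetical and exist only for this illustration; they are not taken from the graphing script.

```python
from datetime import date, datetime, time, timedelta


def collection_window(tick_date):
    """Return (start, end) of the data drawn at the line over `tick_date`.

    The window is the full day before the tick; `end` is exclusive, so it
    covers everything up to 23:59:59 of the previous day.
    """
    start = datetime.combine(tick_date - timedelta(days=1), time.min)
    end = datetime.combine(tick_date, time.min)
    return start, end


def minor_gridlines(major_lo, major_hi, n_minor):
    """Positions of `n_minor` gridlines spaced linearly between two majors."""
    step = (major_hi - major_lo) / (n_minor + 1)
    return [major_lo + step * i for i in range(1, n_minor + 1)]


# The line over March 21 covers 00:00 March 20 up to (not including) 00:00 March 21.
start, end = collection_window(date(2020, 3, 21))

# Between the major gridlines at 10 and 100, eight minor gridlines fall at
# 20, 30, ..., 90: evenly spaced in value, not in on-screen distance.
minors = minor_gridlines(10, 100, 8)
```

Because the minor gridlines are linear in value while the axis itself is logarithmic, they appear to bunch up toward the upper major gridline.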

Interaction

Use the controls at the top to select the data you wish to display.
In each graph, the ten locations with the highest numbers for the selected variable are displayed. You can use the arrow buttons to display lower- or higher-ranked locations.
By default, the legend displays data for the graphed locations on the most recent date (usually within the past two days). Hover your mouse over the graph (desktop) or tap a spot on the graph (mobile) to choose a different date to display in the legend. Hover your mouse over (desktop) or tap (mobile) a row in the legend to select an individual region to view in the chart.
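The ranking behavior described above (show the top ten, then page to lower- or higher-ranked locations) can be sketched as follows. The function `ranked_page` and the sample numbers are hypothetical and used only for illustration; the real graphs may rank and page differently.

```python
def ranked_page(latest_values, page=0, page_size=10):
    """Return the location names shown on a given page of the ranking.

    `latest_values` maps location name -> value of the selected variable.
    Page 0 holds the highest-ranked locations; higher pages hold lower ranks.
    """
    ranked = sorted(latest_values.items(), key=lambda item: item[1], reverse=True)
    start = page * page_size
    return [name for name, _ in ranked[start:start + page_size]]


# Made-up values, with a small page size to keep the example short.
sample = {"A": 5, "B": 9, "C": 1, "D": 7, "E": 3}
first_page = ranked_page(sample, page=0, page_size=2)
second_page = ranked_page(sample, page=1, page_size=2)
```

Here the arrow buttons would correspond to incrementing or decrementing `page`.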

The data sources below, which are what the graphs use, have been aggregated into a single table available here. Refer to the note below for the interpretation of the dates in this table.

In addition, while not in use, Corona Data Scraper seems like a good source as well.

✅ Data Quality

These graphs only convey accurate information when the data feeding them is good; garbage in, garbage out. In particular, the number of confirmed cases in a given region reflects both that region’s true number of cases and its testing capabilities. A rapid initial increase in confirmed cases is likely more indicative of early testing initiatives than of the true rate of spread, and as the true number of cases outpaces a region’s testing capabilities, the reported number of confirmed cases will be an increasingly low estimate of the true number of cases.
Similarly, the number of deaths attributed to COVID may fall short of the true number of deaths COVID has caused. For instance, on April 6, 2020, New York announced that they no longer had the capacity to perform post-mortem coronavirus tests, which means New Yorkers who die of COVID without having been diagnosed with COVID will not be recorded as having died from it. Nor do these numbers include those who died because of coronavirus but not from it: many people who would otherwise have received medical care have not been able to, whether due to the burden coronavirus has placed on the world’s healthcare infrastructure or due to reluctance to go to a hospital because of the risk of catching coronavirus.

For a more in-depth picture of the difficulties of data collection vis-à-vis pandemic modeling, see FiveThirtyEight’s A Comic Strip Tour Of The Wild World Of Pandemic Modeling.

🔄 Data Updates

Because their quality and timeliness change over time, the data sources used for these graphs are subject to change. Additionally, while data sources are expected to update periodically with new, current data, they may also amend their past data as they get more accurate historical data.
One assumption these graphs make is that the population within a region is constant over time: any changes in a region’s population are ignored when computing per-capita numbers. This assumption is problematic. For instance, early on, many New Yorkers left the state for elsewhere in the U.S. If a region’s true population decreases (its residents emigrate), then its per-capita numbers will be artificially deflated. Correspondingly, if a region’s true population increases (people immigrate), then its per-capita numbers will be artificially inflated.
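A small worked example shows the direction of the bias from the constant-population assumption. All numbers and the helper `per_capita` are made up for illustration; they are not drawn from the data or from the graphing script.

```python
def per_capita(count, population, per=100_000):
    """Cases (or deaths) per `per` residents."""
    return count * per / population


cases = 500
assumed_population = 1_000_000  # the fixed population used for the graphs
true_population = 800_000       # hypothetical: 200,000 residents have left

reported_rate = per_capita(cases, assumed_population)  # 50.0 per 100,000
actual_rate = per_capita(cases, true_population)       # 62.5 per 100,000
```

Dividing by the larger, outdated population gives 50 per 100,000 where the true figure is 62.5, so the reported per-capita number understates the real one. The reverse holds when the true population has grown.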

  1. Clone the GitHub repo:

    git clone https://github.com/rben01/covid19.git
    cd covid19
  2. Create the conda environment:

    conda env create -f environment.yml
    ℹ️
    If you do not already have conda installed, you can install it from here.
  3. Activate the environment:

    conda activate covid
  4. Finally, run the graphing script:

    python src/case_tracker.py
  5. The script has a command line interface; check it out:

    python src/case_tracker.py --help

