The EbolaMapper project is all about coming up with computer graphics (charts, interactives, maps, etc.) for visualizing infectious disease outbreaks.
A sign of excellent news for any given outbreak is when the bullseye plot animations go static. For example, consider the WHO’s visualization shown below, which plots data for the 2014 Ebola outbreak in West Africa.
Each bullseye shows two data points: the outer circle is cumulative deaths and the inner circle is new deaths in the last 21 days. Twenty-one days is the accepted maximum incubation period, which is why Hans Rosling tracks new cases for the last 21 days. When the inner circles shrink to zero the outbreak is over.
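To make the two bullseye values concrete, here is a minimal Python sketch of how they could be derived from a cumulative time series. The function name, the dictionary shape, and the sample numbers are all made up for illustration; this is not how the WHO visualization is actually implemented.

```python
from datetime import date, timedelta


def bullseye_values(cumulative_deaths, as_of, window_days=21):
    """Return (outer, inner) for one bullseye.

    cumulative_deaths: dict mapping date -> cumulative death count
                       (a hypothetical input shape, not a real API).
    outer: cumulative deaths as of `as_of`.
    inner: deaths newly reported in the `window_days` days up to `as_of`.
    """
    def latest_on_or_before(day):
        # Use the most recent report on or before `day`; 0 if none exists yet.
        known = [d for d in cumulative_deaths if d <= day]
        return cumulative_deaths[max(known)] if known else 0

    outer = latest_on_or_before(as_of)
    inner = outer - latest_on_or_before(as_of - timedelta(days=window_days))
    return outer, max(inner, 0)  # guard against downward revisions in the data


# Invented sample reports, purely illustrative.
reports = {
    date(2014, 10, 1): 100,
    date(2014, 10, 15): 130,
    date(2014, 11, 5): 140,
}
print(bullseye_values(reports, date(2014, 11, 5)))  # (140, 10)
```

Once reports stop adding deaths for a full window, the inner value drops to zero, which is the "animation goes static" moment described above.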
Yet there are much lower tech ways of presenting information to people that can be quite affecting. On the grim side there are the graves.
Sadder still are memorials such as The Lancet’s obituary for health care workers who died of ebola while caring for others – true fallen heroes.
On the other hand there are signs of positive progress.
The image on the left is from an MSF tweet:
The best part about battling #Ebola is seeing our patients recover. Here, hand prints are left in Monrovia #Liberia
The image on the right is from an UNMEER tweet:
Survivor Tree in Maforki, #SierraLeone holds piece of cloth for each patient who left ETU Ebola-free. #EbolaResponse
Those must be quite uplifting reminders on the front lines of the ebola response. Likewise EbolaMapper should have positive messages, say turning bulls-eyes green when there are no new cases. That will need to be kept in mind.
National Geographic sure knows the power of good visuals. Their ebola tracking report has one particularly neat yet unusual view: a graphical calendar of 2014, one map per week.
They do not have cases mapped to locations over time. For that The New York Times is the best.
I have finally found quality outbreak data with which to work:
Sub-national time series data on Ebola cases and deaths in Guinea, Liberia, Sierra Leone, Nigeria, Senegal and Mali since March 2014
I came to this dataset via a long, convoluted hunt. The difficulty of this search has led me to understand that the problem I am working on is one of data as well as of code. This will need to be addressed with APIs and discoverability, but for now it is time to move on to coding (finally).
After I concluded that the data was usable, I started poking around on its page on HDX a bit more. On the left side of the page there are links to various visualizations of the data. This is how I discovered Google’s Public Data Explorer which is quite a nice tool. Below is one view of the HDX data in the Explorer. Click through the image to interactively explore the data over at Google.
Also among the visualizations on the HDX page was, to my surprise, the NYTimes visualization. Lo and behold, that visualization credits the HDX as its data source:
Source: United Nations Office for the Coordination of Humanitarian Affairs, The Humanitarian Data Exchange
So, that is good enough for me: the data hunt is over. It is time to code.
On November 7th a group of charities including MSF, Red Cross and HOT unveiled MissingMaps.org, a joint initiative to produce free, detailed maps of cities across the developing world—before humanitarian crises erupt, not during them.
I mention this here for multiple reasons.
1) OpenStreetMap is a great resource and HOT has been active with the ebola response. OpenStreetMap is the source for maps in the EbolaMapper project I am working on.
Although the current focus is the ebola outbreak, these open source tools that I am calling EbolaMapper can be easily repurposed for any future outbreaks, as they read their data via the generic Outbreak Time Series APIs I am developing (https://github.com/JohnTigue/EbolaMapper/wiki/Outbreak-Time-Series-Specification-Overview). Next time (and statistically that is likely to occur before 2020) there should be free, quality tools at the ready for people to quickly get started on outbreak monitoring without having to wait for large organizations to mobilize.
As I have gotten to know some of the folks who have been involved with responding to previous epidemic outbreaks, they sound like they are living through a nightmare version of Groundhog Day (Swine flu in 2009, SARS, etc.). Yet now this type of problem can be solved with generic, mature, widely available Web technology, i.e. it does not require complex novel technology that needs to be scaled massively. (On the other hand, we do need to be mindful that currently in Liberia “less than one percent of the population is connected to the internet.” [Vice News]).
With the current established culture of open source it would be shameful for this type of flatfooted, delayed response to occur again. We have the technology to enable local actors to immediately get started by themselves the next time there is an epidemic outbreak.
2) This is a perfect example of one way that funding in this weird space can be successful. Folks (private and public) trying to effectively allocate money can find open source and/or open data projects that are already working and then juice them with cash for scaling, which is always an aspect of the large success stories in open source.
This is a bizarre but exciting variant of the thinking of Steve Blank and the lean start-up folks as applied to open source business models. I say bizarre because the customers (those benefiting from public health and disaster relief projects like the ebola response) cannot pay and have no obvious monetizable value as users. Here the open-source community has found a successful model and now that it is proven out the funding organizations are providing the cash to accelerate tech development to scale, where normally that cash would come via a series A round with venture capitalists.
In many successful open source projects, tech companies are paying talent to produce code that will immediately be placed in essentially the public domain. The value of doing so is expertise status with paying customers and keeping that scarce talent in-house to service those customers.
“The best minds of my generation are thinking about how to make people buy support contracts for free software.” –Anonymous
But who is going to do the funding in disaster relief contexts, specifically for the maps, and do so proactively? So, this MissingMaps.org situation is great news.
3) This blog loves a good map visualization related to the ebola response. The Off The Map story has a neat one. In the above picture the red handle can be dragged left and right to see the before versus after map.
Hans Rosling was recently interviewed on the BBC’s More or Less. He was doing his usual excellent job of entertainingly engaging the public via statistics. The full interview is less than ten minutes. The BBC also did a write-up of the interview.
Rosling reported that in Liberia, at the peak of the outbreak, infections were about 75 per day and are now stuck at around 25 per day. He believes the current (second) stage of the outbreak could well be labeled as endemic, an intermediate-level epidemic that will take some time to put out.
A statistic that he says is important is the reproductive number, the average number of new infections caused by each infected person. At the peak of the outbreak it was almost 2.0; currently it is closer to 1.0. The point is that the reproductive number is a key stat that needs to be tracked.
Later, five minutes into the interview, he has a go at mainstream media’s reportage, specifically the use of cumulative numbers:
It is a bad habit of media. Media just wants as many zeros as possible, you know. So, they would prefer them to tell that in Liberia we have had about 2,700 cases or 3,000 case. The important thing is that it was 28 yesterday. We have to follow cases per day. I can take Lofa province, for instance, that has had 365 cases cumulative and the last week it was zero, zero, zero, zero every day. That is really hopeful that we can see the first infected [county] is where we now have very low numbers because everyone is aware.
Notice that the NYTimes ebola viz uses cases-per-week. We can build out visualization tools which provide a similarly level-headed overview of a situation, which might even help to reduce anxiety in the public compared to cumulative-cases representations.
Take away: the two statistics he pointed out, cases-per-day and reproductive-number, will be visualized in this open source epidemic monitoring dashboard tool set being built out here.
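As a rough illustration of those two statistics, the Python sketch below turns a series of daily cumulative counts into cases-per-day and then computes a deliberately crude stand-in for the reproductive number: the ratio of cases in the most recent window to cases in the window before it. The function names, the window length, and the sample numbers are all assumptions made for this example; real reproductive-number estimation is the job of proper epidemiological models, and this is not what Rosling or anyone else actually uses.

```python
def daily_new_cases(cumulative):
    """Difference a list of daily cumulative counts into new cases per day."""
    # Negative differences (data corrections) are clamped to zero.
    return [max(b - a, 0) for a, b in zip(cumulative, cumulative[1:])]


def crude_reproduction_ratio(new_cases, generation_days):
    """Cases in the latest window divided by cases in the window before it.

    `generation_days` is an assumed window length supplied by the caller.
    Returns None if there is too little data or the earlier window had no cases.
    """
    if len(new_cases) < 2 * generation_days:
        return None
    recent = sum(new_cases[-generation_days:])
    previous = sum(new_cases[-2 * generation_days:-generation_days])
    return recent / previous if previous else None


# Invented cumulative counts, one value per day.
cumulative = [0, 4, 8, 12, 16, 18, 20, 22, 24]
per_day = daily_new_cases(cumulative)
print(per_day)                               # [4, 4, 4, 4, 2, 2, 2, 2]
print(crude_reproduction_ratio(per_day, 4))  # 0.5
```

A ratio above 1 means the latest window saw more cases than the one before it, and a ratio below 1 means fewer, which is the same reading Rosling gives for the reproductive number moving from almost 2.0 toward 1.0.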
The New York Times’ ebola visualization sets the bar for high quality interactivity.
That makes sense as Mike Bostock works at the Times. He is one of the creators of D3.js, which is the open source engine behind most of the gorgeous data-driven visualizations on the Web these days. If you have not yet seen it, the D3.js examples gallery is a whole lot of eye candy.
Take this outbreak visualization as confirmation that any open-source white label outbreak widget should be based on D3.js.
Graphs ‘N’ Waffles is a Twitter feed that delivers just what the name implies. This recently tweeted graph is worth a gander. That is quite an uptick the blue line took.
Another ebola factoid was reported by The Lancet:
During October, there were 21,037,331 tweets about Ebola in the USA, compared with 13,480 about Ebola in Guinea, Liberia, and Sierra Leone combined.
Visualizations on the Web can be classified as interactive or static. The split is not quite binary; is a zoomable map really “interactive”?
I want to produce both interactive and static viz, hopefully with the former being used to generate the latter. SVG is a vector format, so it is good for exporting crisp static images to file or paper. D3.js uses SVG, so interactive D3.js-based visualizations should be able to export excellent static maps and charts (we will see). Some users of this information will be on limited machines, so bandwidth-light static info should be readily available.
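To give a feel for how bandwidth-light a static SVG can be, here is a small Python sketch that writes a single bullseye as standalone SVG markup. It is not D3.js and not the planned export path; the function name, colors, and scaling are invented for illustration. It also borrows the positive-message idea from earlier in this blog: the bullseye turns green when there are no new cases.

```python
import math


def bullseye_svg(cumulative, recent, max_radius=40.0, scale_max=1000):
    """Return a standalone SVG string for one bullseye (illustrative only).

    Circle area, not radius, is proportional to the count, so a count twice
    as large looks twice as big. `scale_max` is the count that fills the
    whole drawing; both defaults are arbitrary choices for this sketch.
    """
    def radius(count):
        fraction = min(max(count, 0) / scale_max, 1.0)
        return max_radius * math.sqrt(fraction)

    size = 2 * max_radius
    center = max_radius
    # Green outer circle when nothing new was reported, otherwise a warm tone.
    outer_fill = "#2e8b57" if recent == 0 else "#f4a582"
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size:g}" height="{size:g}">',
        f'<circle cx="{center:g}" cy="{center:g}" r="{radius(cumulative):.1f}" fill="{outer_fill}"/>',
    ]
    if recent > 0:
        # Inner circle for recent counts, drawn on top of the outer one.
        parts.append(
            f'<circle cx="{center:g}" cy="{center:g}" r="{radius(recent):.1f}" fill="#b2182b"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


print(bullseye_svg(400, 100))
print(len(bullseye_svg(400, 0)), "bytes for the all-clear version")
```

The whole image is a few hundred bytes of text, which is the kind of payload that remains usable on a slow connection.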
I have found very few highly interactive ebola visualizations. Please point out any that I have missed in the comments. The best three found so far are listed here.
All three’s features, pros, and cons are analyzed on the EbolaMapper wiki.
A major goal of this EbolaMapper project is to create the very best visualizations of ebola on the Web. Which leads to the question: what is the high bar? [Update: Spoiler, the answer is The New York Times’ visualization.]
To answer that question I will be curating a collection of links to the best visualizations found on the Web.
For example, The Economist is doing good work:
The curated links can be found on the EbolaMapper wiki.
Note: EbolaMapper is the working title for this project; really it is more like “Reusable Outbreak Monitoring Web Components for a Global Outbreak Monitoring Network Organization.” Right, so EbolaMapper is the working title until a better name comes along, if one did not just pass by a moment ago…