This article is updated regularly.
All data presented are as of July 20, 2020; the last update was on July 22, 2020.


The ongoing COVID-19 pandemic needs no explanation, because the whole world is currently badly affected. The high infectivity of the SARS-CoV-2 pathogen and the potentially severe course of the disease are putting healthcare systems around the world to the test. Analyzing the case numbers is an essential part of taking the right steps to contain the pandemic.

Many of us have now made it a habit to look at the current case numbers at least once a day. The good availability of the data and new visualization technologies enable us to update graphs in the shortest possible time. Data from Johns Hopkins University and the Robert Koch Institute in particular are used intensively for reporting cases in Germany. Various platforms make a large amount of data available to everyone, in the hope that the large global community of data scientists will generate valuable information and predictions from it.

We at StatSoft Europe GmbH offer a variety of R training courses and have also used the COVID-19 data to generate knowledge through visualization. We used the open-source statistics software R and publish the code below to give interested analysts free help with visualizing data in R. We used the data from Johns Hopkins University (downloaded from this Kaggle link). The original data, as used on Kaggle, come from the Git repository of Johns Hopkins University.

At this point we would like to take the opportunity to draw your attention to various support platforms for the Corona crisis. Perhaps you are affected yourself or know people who are particularly at risk and should stay at home for their own safety. To make life a little easier for those people, many support platforms have emerged, for example to provide free grocery shopping. This is a list of German aid platforms where you can register as a person affected or as a potential helper. There you will also find links to Germany-wide help platforms. Please also remember that animal shelters in particular may see a large influx of pets during this time, because their owners are no longer able to take care of them due to illness. You can support the shelters by donating food or money.

Recommended literature

For data analysis and visualization we can warmly recommend the excellent books R for Data Science by Hadley Wickham and Garrett Grolemund and Hands-On Programming with R by Garrett Grolemund. Incidentally, both authors are also the authors of most of the packages that we need in this project.

The Current Situation

In this section we would like to use various graphics to show you how the COVID-19 pandemic is spreading throughout the world. The data are as of July 20, 2020. In the next section, we then show how we created these graphics using R.


The following figure shows the number of confirmed cases, divided into active cases, recoveries and deaths, for all countries that had registered at least 50000 confirmed cases by July 20, 2020.
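As a minimal sketch of the idea behind this figure, the following base R snippet splits confirmed cases into active cases, recoveries and deaths and keeps only the countries above the threshold. The data frame `snapshot`, its column names and all values are invented for illustration, and we assume here that active cases are simply confirmed cases minus recoveries and deaths.

```r
# Toy snapshot of cumulative counts on one date (all values invented for illustration).
snapshot <- data.frame(
  country   = c("A", "B", "C"),
  confirmed = c(120000, 60000, 30000),
  deaths    = c(5000, 2000, 1000),
  recovered = c(80000, 30000, 20000)
)

# Assumption: active cases are those neither recovered nor deceased.
snapshot$active <- snapshot$confirmed - snapshot$deaths - snapshot$recovered

# Keep only countries at or above the threshold used in the figure.
threshold <- 50000
selected <- snapshot[snapshot$confirmed >= threshold, ]

# One column per country, one row per status: the layout barplot() stacks.
counts <- t(as.matrix(selected[, c("active", "recovered", "deaths")]))
colnames(counts) <- selected$country

barplot(counts, legend.text = TRUE, ylab = "Confirmed cases")
```

Because the three parts add up to the confirmed cases, the total height of each stacked bar equals the number of confirmed cases of that country.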

The following figure shows the number of confirmed cases on a log scale by days after the 100th confirmed case was registered in each country. This makes it possible to see how fast the outbreak developed in the respective countries. For this figure, only the countries that had registered the most cases as of July 20, 2020, as well as South Korea, were selected. South Korea is a special case, because its fast reaction and testing regime caused a strong decrease in further infections. The log scale makes it easier to compare the increase in cases irrespective of their order of magnitude.
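The alignment described above can be sketched in base R as follows. This is only an illustration under our own assumptions: the data frame `cases`, its columns and all values are invented, and the series are assumed to be cumulative, so that every day after the 100th case also has at least 100 cases.

```r
# Toy cumulative series for two hypothetical countries (all values invented).
cases <- data.frame(
  country   = rep(c("A", "B"), each = 6),
  date      = rep(as.Date("2020-03-01") + 0:5, times = 2),
  confirmed = c(20, 60, 110, 250, 600, 1300,
                90, 150, 200, 260, 330, 400)
)

# Keep only the days on which a country had at least 100 confirmed cases.
aligned <- cases[cases$confirmed >= 100, ]

# Day 0 is the first retained date of each country.
first_day <- ave(as.numeric(aligned$date), aligned$country, FUN = min)
aligned$days_since_100 <- as.numeric(aligned$date) - first_day

# Set up empty axes with a logarithmic y axis, then add one line per country.
plot(aligned$days_since_100, aligned$confirmed, type = "n", log = "y",
     xlab = "Days since 100th confirmed case",
     ylab = "Confirmed cases (log scale)")
for (cn in unique(aligned$country)) {
  one <- aligned[aligned$country == cn, ]
  lines(one$days_since_100, one$confirmed)
}
```

On the log scale, a constant growth rate appears as a straight line, so the slope of each curve can be compared directly between countries.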

China, Europe and USA

The following figure shows how the number of cases developed in China, Europe, the USA and all other countries (other). It becomes clear how quickly China managed to get the outbreak under control by taking very strict measures. About five weeks after the number of cases in China reached a plateau, the number of cases in Europe was already higher than in China. Just two more weeks later, there were five times as many cases in Europe as in China. The USA also showed a rapid increase, which followed Europe’s trend with a delay of about a week and a half. Europe and the USA are currently the new epicentres of the COVID-19 pandemic.

The following figure shows the daily increase in case numbers in China, Europe, the USA and in all other countries. This clearly shows how quickly the Chinese government reacted and took drastic measures: the quarantine of Hubei began as early as January 23, at a time when there were only 639 confirmed cases and 18 deaths in China. Despite the quarantine, the confirmed cases in China rose to 83613, including 4634 deaths (as of July 20, 2020). This makes it very clear that a strong restriction on travel and freedom of movement (however painful this may be) a) can effectively limit the spread, b) should be carried out sooner rather than later, and c) may flatten the curve only after a certain delay.

In Europe, the highest daily increase in cases (54896) was recorded on April 4, 2020; in the USA, the highest increase was 78310 cases on July 16, 2020. In China, on the other hand, the highest increase was only 15133 cases on February 13, 2020.
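Daily increases like the ones quoted above are the day-to-day differences of the cumulative case counts. The following base R sketch shows this step and how to find the day with the highest increase. The data frame `cumulative` and its values are invented for illustration and are not the real figures.

```r
# Toy cumulative series for one hypothetical region (all values invented).
cumulative <- data.frame(
  date      = as.Date("2020-04-01") + 0:5,
  confirmed = c(100, 130, 190, 260, 300, 320)
)

# The daily increase is the difference to the previous day;
# the first day has no predecessor and is therefore NA.
cumulative$new_cases <- c(NA, diff(cumulative$confirmed))

# Row with the highest daily increase (which.max() skips the NA).
peak <- cumulative[which.max(cumulative$new_cases), ]
peak
```

In this toy series the largest increase is 70 new cases, recorded on the fourth day.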

The following figure shows the total active cases of all defined regions for the respective period. The width of the colored band represents the share of those cases that occurred in the respective region. The spread began in China, then continued in Europe and other countries while decreasing again in China. Around two weeks after the outbreak in Europe, the disease spread to the United States. The United States is the country with the highest number of active cases.

The following figure shows the confirmed cases, deaths and active cases for the different regions on July 20, 2020.


In the following figure we see the confirmed cases of all European countries in which at least 5000 confirmed cases had been registered by March 30, 2020. The width of the colored band shows what proportion of the confirmed cases can be assigned to the respective country and how this proportion changes over time. This shows that Italy was the first country to register a high number of confirmed cases. A few weeks later, the number of infections also rose sharply in the surrounding European countries. By July 20, 2020, most infections had been recorded in Italy, Spain, Germany and France.
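If you want to quantify the proportions that the band widths convey, the share of each country per day can be computed explicitly. The following is a minimal base R sketch under our own assumptions: the data frame `cases`, the country names and all values are invented, and the original figure may well show absolute counts rather than shares.

```r
# Toy cumulative counts for three hypothetical countries (all values invented).
cases <- data.frame(
  date      = rep(as.Date("2020-03-28") + 0:2, times = 3),
  country   = rep(c("A", "B", "C"), each = 3),
  confirmed = c(50, 60, 70,
                30, 60, 100,
                20, 30, 30)
)

# Cross-tabulate into a date x country table of confirmed cases.
wide <- xtabs(confirmed ~ date + country, data = cases)

# Divide every row by its total, so that each day sums to 1.
shares <- prop.table(wide, margin = 1)
round(shares, 2)
```

Each row of `shares` then describes how the confirmed cases of one day are distributed across the countries.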

In the following figure we see the deaths in all European countries in which at least 5000 confirmed cases had been registered by March 30, 2020. The number of deaths in Germany is still very low relative to its high number of confirmed cases.