Incidence vs prevalence and the epidemiologist’s bathtub

Sarah Arnold, Senior Surveillance and Information Scientist in Health Protection, Dr Lynsey Patterson, Head of Health Protection Surveillance, and Dr Claire Neill, Specialty Registrar, Public Health Agency.

Epidemiologist’s bathtub graphic

Many terms that were unknown to the average person before 2020 have, through the COVID-19 pandemic, become commonplace. Before this year, few people knew what the role of an epidemiologist was. Now, they’re leading press conferences and appearing on breakfast television shows.

Even so, a lot of the terminology used daily in news reports and online articles is still difficult to understand. Notable examples are incidence and prevalence. These words are often used interchangeably; however, in surveillance and epidemiological analyses they have distinct meanings and purposes.

To help understand the difference between incidence and prevalence, epidemiologists often refer to the familiar concept of water entering and leaving a bath.

Let’s take the term prevalence first and define it in the context of disease. The prevalence is the number of individuals with the disease either at a specific point in time (the point prevalence) or over a specified time period (the period prevalence). 

The key concept about the prevalence is that it includes both new and existing cases of disease. If we refer to our bathtub, the prevalence is the amount of water in the bathtub at a specified point in time.

As we all know, this quantity is not static and we can easily add more water, by turning on the tap, as well as lose water through evaporation or by opening the plug. Turning back to our disease model, the filled bath represents the total number of cases at a point in time which can change (through evaporation or releasing the plug) as individuals recover from the disease or if they unfortunately pass away.

Therefore, the prevalence of a disease in a population is influenced by a number of parameters: the number of new cases (turning on the tap), the speed of recovery – which is influenced by disease severity and treatment options (evaporation) and the number of people that lose their lives to the disease (releasing the plug).
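The bathtub can be sketched as a simple day-by-day calculation: tomorrow’s water level is today’s level plus what comes in through the tap, minus what evaporates and what drains through the plug. The function name, the daily fractions and the case numbers below are all illustrative assumptions rather than estimates for any real disease.

```python
def simulate_bathtub(initial_cases, new_cases_per_day,
                     recovery_fraction, death_fraction, days):
    """Return the number of active cases (the water level) for each day.

    All parameters are hypothetical, chosen only to illustrate the analogy.
    """
    active = float(initial_cases)
    levels = [active]
    for _ in range(days):
        recovered = active * recovery_fraction  # evaporation
        died = active * death_fraction          # releasing the plug
        # New level = old level + inflow (tap) - outflows.
        active = active + new_cases_per_day - recovered - died
        levels.append(active)
    return levels


# Inflow matches outflow (10 in, 10% of 100 out): the level stays flat.
steady = simulate_bathtub(100, 10, 0.09, 0.01, days=3)

# Turn the tap up (20 in) with the same outflow fractions: the level rises.
rising = simulate_bathtub(100, 20, 0.09, 0.01, days=3)

print([round(x, 1) for x in steady])
print([round(x, 1) for x in rising])
```

In this sketch the prevalence only holds steady while new cases arrive as fast as existing cases recover or die; changing any one of the three flows moves the water level.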

Expressing the prevalence as a raw number makes it difficult to compare, as it doesn’t take into account different population sizes or risk profiles, e.g. a higher proportion of older people or of people with particular co-morbidities. Therefore, we often express the prevalence as a rate using the population as a denominator.
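As a minimal illustration of why the denominator matters, the sketch below compares two made-up areas. The area names, case counts and populations are hypothetical, and the helper `per_100k` is our own naming.

```python
def per_100k(cases, population):
    """Express a case count per 100,000 population."""
    return cases * 100_000 / population


# Hypothetical figures: Area A has more cases, but a much larger population.
area_a = per_100k(cases=500, population=1_000_000)
area_b = per_100k(cases=200, population=250_000)

print(f"Area A: {area_a:.0f} per 100,000")
print(f"Area B: {area_b:.0f} per 100,000")
```

Area A has more cases in raw terms, yet Area B has the higher prevalence once population size is taken into account. Note that this adjusts only for population size; accounting for differing risk profiles, such as age structure, needs further methods that are not shown here.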

Turning to COVID-19, the prevalence of COVID-19 in Northern Ireland is the total number of active cases in the country at a specific time. While it is relatively straightforward to identify new cases of disease (through laboratory testing), knowing whether an individual has recovered is more challenging, particularly for mild cases who self-manage their illness at home.

Therefore, the prevalence of COVID-19 in Northern Ireland is estimated using different assumptions about the proportion of the population reporting symptoms. In the latest weekly report, as of 16 August 2020, the prevalence of active cases is estimated to be 31 per 100,000 population assuming 50% of cases have no symptoms. This changes to 18 per 100,000 if only 15% have no symptoms and 77 per 100,000 if 80% of cases experience no symptoms.
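The three figures above are consistent with a simple scaling: if a fraction of all cases have no symptoms, the total is the symptomatic prevalence divided by one minus that fraction. This is our reading of the numbers, not the method stated in the weekly report, and the symptomatic prevalence of 15.4 per 100,000 used below is back-calculated from the quoted estimates rather than a reported figure.

```python
def total_prevalence(symptomatic_per_100k, asymptomatic_fraction):
    """Scale symptomatic prevalence up to include cases without symptoms.

    Assumes symptomatic cases make up (1 - asymptomatic_fraction) of all cases.
    """
    if not 0 <= asymptomatic_fraction < 1:
        raise ValueError("asymptomatic_fraction must be in [0, 1)")
    return symptomatic_per_100k / (1 - asymptomatic_fraction)


# Assumption: back-calculated from the quoted estimates, not a reported value.
SYMPTOMATIC_PER_100K = 15.4

for fraction in (0.15, 0.50, 0.80):
    estimate = total_prevalence(SYMPTOMATIC_PER_100K, fraction)
    print(f"{fraction:.0%} with no symptoms -> about {estimate:.0f} per 100,000")
```

The more cases we assume go unnoticed, the larger the estimated prevalence for the same number of people reporting symptoms.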

The prevalence is an extremely useful measure in public health, particularly for service planning, as it allows us to understand the burden on the health and social care system at a particular point in time. It is more often used for chronic diseases, such as diabetes, because of the challenges of monitoring recovery and deaths for infectious diseases. Measurements of prevalence do have their limitations though, not least because they are influenced by many other factors of the disease. This is where the incidence can be extremely useful.

The incidence of disease is the number of new cases during a specified time period. It describes how quickly the disease occurs in the population, so it is used to identify increasing transmission and to inform decisions about which public health interventions are required to slow this down. The incidence can be presented as a proportion, for example using the population as a denominator, or as a rate using person-time as the denominator, which requires that individuals are followed up over time.

The former is commonly used in surveillance as we often don’t have access to detailed individual level data that would be available, for example, in a specialised cohort study. Going back to the bathtub, water entering the bath through the tap reflects the addition of new cases of disease to the population, the incidence. If the incidence is high, the water will be gushing into the bathtub whereas if it’s low, water may enter the bathtub a drip at a time.
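To make the difference between the two ways of presenting incidence concrete, here is a small sketch using an invented cohort of five people. The follow-up times and outcomes are hypothetical, and the function names are our own.

```python
def incidence_proportion(new_cases, population_at_risk):
    """New cases divided by the number of people at risk at the start."""
    return new_cases / population_at_risk


def incidence_rate(new_cases, person_time_at_risk):
    """New cases divided by the total time people were followed while at risk."""
    return new_cases / person_time_at_risk


# Hypothetical cohort: (years followed while at risk, became a case?)
cohort = [(2.0, False), (0.5, True), (1.5, False), (1.0, True), (2.0, False)]

new_cases = sum(1 for _, became_case in cohort if became_case)
person_years = sum(years for years, _ in cohort)

proportion = incidence_proportion(new_cases, len(cohort))
rate = incidence_rate(new_cases, person_years)

print(f"Incidence proportion: {proportion:.2f} (2 of 5 people)")
print(f"Incidence rate: {rate:.2f} cases per person-year")
```

The proportion needs only a case count and a population figure, which is why it suits routine surveillance. The rate needs each person’s follow-up time, which is usually only available in a dedicated study.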

In terms of COVID-19, the incidence can be observed using the number of positive cases identified using laboratory data. The most recent monthly bulletin shows that there were 387 new cases in the most recent week. As with the prevalence, to facilitate comparisons we express this number as a rate per 100,000 population (an incidence proportion). The most recent estimate of the incidence is 20.3 per 100,000 population.

The limitation of this value is that it may underestimate the total incidence, because those with very mild symptoms (or none at all) may not be tested and therefore won’t be represented in the observed value. To account for this, an estimate of incidence is derived which uses the assumption (from the University of Southampton) that there are 1.3 individuals infected for every laboratory confirmed case. Using this methodology, it is estimated the weekly incidence is actually 36 per 100,000 people in Northern Ireland (source: weekly bulletin up to 16 August 2020).

Prevalence and incidence are two highly important measures of disease used in epidemiology. We have used the analogy of water entering and leaving a bathtub to try to explain these terms and their limitations. Because each measure has its limitations, both are necessary to give a better understanding of the disease. Decisions made during this pandemic have relied on both the incidence and the estimated prevalence of the disease, and these measures will continue to be used to inform the provision of appropriate public health advice and support measures, such as either relaxing or strengthening local restrictions.
