Formula Explanation

Overall Approach

Humanity's understanding of COVID-19 is constantly evolving and we are actively updating the calculator as scientific understanding improves. See the Changelog below for past updates.

The dynamics at play in contracting any viral disease are complex. We do not consider the math underlying this calculator to be a nuanced model of COVID-19's communicability. Our aim is to provide an actionable approximation of risk given the knowable factors at play in a specific social situation.

Each metric in this calculator therefore represents a factor proven to contribute to your risk of contracting COVID-19 in a situation. We weight each metric as described below and then we multiply them. Given the open-ended nature of some metrics (e.g. you can invite an infinite number of people to an event), the range of scores is quite large, but ~80% of them fall between 0 and 4. The median score is currently around 0.25.
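
As a rough illustration of the weight-then-multiply approach, here is a minimal JavaScript sketch. The function name `combineFactors` and the sample factor values are hypothetical and are not taken from the calculator's source.

```javascript
// Hypothetical sketch: multiply already-weighted factor scores into one score.
// Neither the function name nor the sample values come from the calculator itself.
function combineFactors(factors) {
  // A factor of 1 is neutral; values above 1 raise the score, values below 1 lower it.
  return Object.values(factors).reduce((product, value) => product * value, 1);
}

const exampleScore = combineFactors({
  setting: 0.1, // sample value for a lower-risk setting
  transit: 1.2, // sample value for a modest added risk
  alcohol: 1.3, // sample value for another modest added risk
});
console.log(exampleScore.toFixed(3)); // 0.156
```

Because the factors are multiplied rather than added, a single very small factor pulls the whole score down, and a single open-ended factor can push it very high, which matches the wide score range noted above.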

Cutoff Score

The current cutoff score is . Our methodology for arriving at that score is as follows:

We keep our eyes and ears out for the latest science. When a new consensus begins to form – e.g. around mask wearing or distancing – we dive into the numbers to see how it might make our formula more accurate.

Whenever we update the formula in light of the newest science, we don't just test it in the interface – we also generate thousands of random scenarios, sort them by score, and then ask ourselves and our advisors two questions: 1) More often than not, are riskier scenarios scoring higher than less risky scenarios? and 2) Where in the list would we draw the line between events we'd attend and ones we wouldn't?
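
The review workflow above could be sketched roughly as follows. The scenario fields, their ranges, and the placeholder scoring function are all assumptions made for illustration; they are not the calculator's actual generator or formula.

```javascript
// Hypothetical sketch of the review workflow: generate random scenarios,
// score them, and sort so that the highest-scoring scenarios appear first.
// The scenario fields and ranges are assumptions.
function randomScenario(rng = Math.random) {
  return {
    indoors: rng() < 0.5,
    attendees: 2 + Math.floor(rng() * 99), // 2 to 100 people
    minutes: 5 + Math.floor(rng() * 236),  // 5 to 240 minutes
  };
}

function rankScenarios(count, scoreFn, rng = Math.random) {
  const scenarios = Array.from({ length: count }, () => {
    const scenario = randomScenario(rng);
    return { ...scenario, score: scoreFn(scenario) };
  });
  // Sort descending by score; reviewers then decide where to draw the cutoff line.
  return scenarios.sort((a, b) => b.score - a.score);
}

// Placeholder scorer for demonstration only; it is not the real formula.
const placeholderScore = (s) => (s.indoors ? 1.87 : 0.1) * s.attendees * (s.minutes / 15);
const ranked = rankScenarios(1000, placeholderScore);
console.log(ranked.slice(0, 3)); // the three highest-scoring scenarios
```

Reading down a sorted list like this makes both review questions concrete: ordering problems show up as obviously safer scenarios sitting above riskier ones, and the cutoff becomes a position in the list.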

While we wish it were possible to find the cutoff via pure math or epidemiological principles, it simply isn't yet. We are eager to hear from more specialists on how we could be doing this better, so definitely reach out if you have ideas for us.

Regional Risk

When we first launched the calculator, we used a third-party service to assess risk. Their assessment is a comprehensive blend of important epidemiological factors, but it had two major drawbacks: it was US-only, and their risk assessment (low, medium, high, critical) was not available via their API.

We therefore spent considerable time looking for an alternative, eventually deciding on the data that powers the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). This data is well-researched, reliable, and programmatically available. However, it is far less comprehensive (often because regional reporting is lacking, not due to JHU's efforts!).

We are therefore using a fairly coarse estimate of regional risk based on the data that is consistently available across regions:

First, we look for the most granular possible region – a county/district within a state/province within a country. If the user doesn't know their county/district, or if there isn't data available at that level, we move up a level.

Then we look for an accurate number of currently active cases. We judge whether this number is trustworthy by checking whether the region is also reporting a significant number of recovered cases. If it is, we divide the number of active cases by the population of the region to get a rough sense of the region's current incidence rate.

However, if the number of active cases is suspect because there is no data for recoveries or deaths, we take the total number of confirmed cases and spread it evenly over time. That is, we compute the average number of cases per three-week period since January 1st, 2020, and divide that by the region's population.

We are painfully aware of how imperfect this approach is, but we believe it is better than having no time-bound incidence rate at all and it's better than only having data for the US.

The resulting score is a rough estimate of the percentage of the population that is currently infected with COVID-19.
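
The fallback logic described in this section might look roughly like the sketch below. The field names (`active`, `recovered`, `confirmed`, `population`), the function names, and the simple "any recoveries reported" reliability check are assumptions; the page does not say what counts as a significant number of recoveries.

```javascript
// Hypothetical sketch of the regional-risk fallback described above.
// Field names and the reliability rule are assumptions for illustration.
const PERIOD_DAYS = 21; // three-week period, per the text
const EPOCH = new Date(Date.UTC(2020, 0, 1)); // January 1st, 2020
const MS_PER_DAY = 86400000;

function estimateIncidence(region, today = new Date()) {
  const { active, recovered, confirmed, population } = region;
  if (!population) return null; // no usable data at this level

  // Trust the active count only when recoveries are also being reported.
  // (Assumed threshold: any recoveries at all.)
  if (active > 0 && recovered > 0) {
    return active / population;
  }

  // Otherwise spread all confirmed cases evenly over three-week periods
  // since January 1st, 2020, and use the per-period average.
  const daysElapsed = Math.max(PERIOD_DAYS, (today - EPOCH) / MS_PER_DAY);
  const periods = daysElapsed / PERIOD_DAYS;
  return confirmed / periods / population;
}

// Walk from the most granular region to the least until one yields an estimate.
function regionalRisk(regionsMostToLeastGranular, today = new Date()) {
  for (const region of regionsMostToLeastGranular) {
    if (!region) continue; // user did not supply this level
    const incidence = estimateIncidence(region, today);
    if (incidence !== null && Number.isFinite(incidence)) return incidence;
  }
  return null;
}
```

For example, a county with 500 active cases, reported recoveries, and a population of 100,000 would yield an estimate of 0.005 under these assumptions.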

Indoors vs. Outdoors

This one is comparatively easy, since we have some high-quality studies giving us numbers for this. In short, transmission is 18.7 times more likely indoors than outdoors. We therefore score indoor events as 1.87 and outdoor events as 0.1.
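
A small sketch of how these two scores could be expressed in code. The multiplier values come from the text; the constant and function names are hypothetical.

```javascript
// Setting multipliers taken from the text: indoor 1.87, outdoor 0.1.
const SETTING_MULTIPLIER = { indoor: 1.87, outdoor: 0.1 };

// Hypothetical helper name; returns the multiplier for a setting.
function settingMultiplier(isIndoors) {
  return isIndoors ? SETTING_MULTIPLIER.indoor : SETTING_MULTIPLIER.outdoor;
}

// The ratio of the two multipliers recovers the 18.7x figure.
console.log((settingMultiplier(true) / settingMultiplier(false)).toFixed(1)); // 18.7
```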

Distancing

Given the size of the space and the number of attendees, we calculate the maximum possible distancing.

The effect of distancing on transmission has been shown to follow a roughly inverse exponential path: as distance shrinks, risk increases exponentially. Also, recent evidence suggests 6 feet is not nearly enough, especially indoors.

We therefore peg a neutral multiplier of 1 to 10 feet (or roughly 3 meters) for indoor events and 6 feet for outdoor events. Exposure gently slopes off at distances greater than – and shoots up exponentially at distances shorter than – the neutral distance.
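
One way to picture this curve is the sketch below. Only the neutral distances (10 feet indoors, 6 feet outdoors) come from the text. The even-spacing model, the curve shapes, the constants `STEEPNESS` and `GENTLE_SLOPE`, and the function names are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch of a distancing multiplier. Only the neutral distances
// come from the text; everything else is an assumption.
const NEUTRAL_FEET = { indoor: 10, outdoor: 6 };
const STEEPNESS = 0.3;     // assumed growth rate per foot below the neutral distance
const GENTLE_SLOPE = 0.05; // assumed decay rate per foot above the neutral distance

// Assume attendees spread out evenly, each occupying an equal square of floor.
function maxDistancingFeet(areaSqFt, attendees) {
  return Math.sqrt(areaSqFt / attendees);
}

function distancingMultiplier(distanceFeet, isIndoors) {
  const neutral = isIndoors ? NEUTRAL_FEET.indoor : NEUTRAL_FEET.outdoor;
  if (distanceFeet < neutral) {
    // Risk rises exponentially as people get closer than the neutral distance.
    return Math.exp(STEEPNESS * (neutral - distanceFeet));
  }
  // Risk declines gently beyond the neutral distance.
  return 1 / (1 + GENTLE_SLOPE * (distanceFeet - neutral));
}

const spacing = maxDistancingFeet(2500, 25); // 10 ft per person
console.log(distancingMultiplier(spacing, true)); // 1
```

Both branches equal 1 at the neutral distance, so the curve is continuous there while still being steep on the close side and shallow on the far side.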

Mask Wearing

A recent survey of masking studies created a model for decrease in risk based on the percentage of people in a situation wearing a mask and the effectiveness of the mask's construction.

Since we can't necessarily know beforehand how effective everyone's masks will be, we opted for a middle-of-the-road 50% effectiveness and then used the study's model based on the percentage of wearers.
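
The study's exact model is not reproduced on this page, so the following is only a sketch under an assumed form: risk scales by (1 - e * p)^2, where e is mask effectiveness and p is the fraction of people wearing masks. The squared term reflects the idea that masks reduce both what a wearer emits and what a wearer inhales. The function name and the clamping behavior are hypothetical.

```javascript
// Hypothetical sketch: the study's exact model is not reproduced on this page.
// Assumed form: risk scales by (1 - e * p)^2, where e is mask effectiveness
// and p is the fraction of attendees wearing masks.
const ASSUMED_EFFECTIVENESS = 0.5; // middle-of-the-road value from the text

function maskMultiplier(fractionWearing, effectiveness = ASSUMED_EFFECTIVENESS) {
  const p = Math.min(1, Math.max(0, fractionWearing)); // clamp to [0, 1]
  return (1 - effectiveness * p) ** 2;
}

console.log(maskMultiplier(0)); // 1 (nobody masked: neutral)
console.log(maskMultiplier(1)); // 0.25 (everyone masked at 50% effectiveness)
```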

Duration

We have yet to find good studies that directly model risk as a function of duration, but broadly speaking, epidemiologists agree that "Everyone has a little bit of risk per minute, and it's a cumulative thing".

Given that general principle, combined with what we know about indoor vs. outdoor transmission, we chose to represent indoor duration as exponential and outdoor as linear. We assume a neutral amount of exposure ends at 15 minutes (score = 1).
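
A minimal sketch of the two duration curves. The 15-minute neutral point comes from the text; the indoor doubling time, the exact curve shapes, and the function name are assumptions made for illustration.

```javascript
// Hypothetical sketch of duration scoring: exponential indoors, linear outdoors.
const NEUTRAL_MINUTES = 15;         // from the text: score = 1 at 15 minutes
const INDOOR_DOUBLING_MINUTES = 15; // assumed: indoor score doubles every extra 15 minutes

function durationMultiplier(minutes, isIndoors) {
  if (isIndoors) {
    // Exponential growth, pinned to 1 at the neutral duration.
    return 2 ** ((minutes - NEUTRAL_MINUTES) / INDOOR_DOUBLING_MINUTES);
  }
  // Linear growth, pinned to 1 at the neutral duration.
  return minutes / NEUTRAL_MINUTES;
}

console.log(durationMultiplier(15, true));  // 1
console.log(durationMultiplier(15, false)); // 1
console.log(durationMultiplier(75, true));  // 16
console.log(durationMultiplier(75, false)); // 5
```

With these assumed constants the two curves agree at the neutral point and then diverge, with long indoor events accumulating score much faster than long outdoor ones.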

Public Transportation

There's a tentative consensus emerging that public transportation is not especially risky. For example, studies in France and Japan have shown that none of those countries' COVID-19 clusters so far can be traced to public transit.

That said, these are enclosed spaces with varying levels of ventilation, mask adherence is variable, as is whether people are talking, etc. We therefore give taking public transportation a score of 1.2 in our formula – it is an additional risk, but not an overwhelming one.

Public/Shared Restrooms

Similar to public transportation, public/shared restrooms have not proven to be large risk multipliers for this disease. Like transit, however, they are an additional enclosed environment with varying levels of ventilation, mask adherence, etc. We give using public/shared restrooms a score of 1.1: only a slight additional risk, because the time spent there tends to be extremely short.

Alcohol Consumption

Consumption of alcohol lowers perception of risk, which may decrease distancing and mask-wearing. Consuming alcohol also impairs hearing, which increases speech volume, which in turn increases the emission of respiratory droplets.

For those reasons, we give alcohol consumption a score of 1.3 – higher than both restrooms and public transportation, but not overwhelming.
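
The three yes/no factors (public transportation, public/shared restrooms, and alcohol) can be sketched as simple multipliers. The values 1.2, 1.1, and 1.3 come from the text; the object keys and the function name are hypothetical.

```javascript
// Multipliers from the text; an activity that does not apply contributes a neutral 1.
const ACTIVITY_MULTIPLIERS = {
  publicTransit: 1.2,
  sharedRestroom: 1.1,
  alcohol: 1.3,
};

// Hypothetical helper: multiply together the multipliers for activities that apply.
function activityMultiplier(activities) {
  return Object.entries(ACTIVITY_MULTIPLIERS).reduce(
    (product, [name, multiplier]) => (activities[name] ? product * multiplier : product),
    1
  );
}

console.log(activityMultiplier({ publicTransit: true, alcohol: true }).toFixed(2)); // 1.56
console.log(activityMultiplier({}).toFixed(2)); // 1.00
```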

Final Formula

As it appears in the JavaScript:


Scenario Generator

We encourage the curious to download and analyze a CSV of 1,000 random, unique scenarios:

Generate Scenarios

Each CSV will be different, so definitely make more than one. We find it fascinating!

Changelog

Formula Overhaul
8/17/2020 9:45 am ET

In light of requests for global data, new studies released about indoor vs. outdoor risk as well as effect of mask wearing, and feedback on some outlier scenarios, we have completely overhauled how we are scoring situations. The page above explains the new formula.

New Risk Level Multipliers
7/5/2020 9:45 am ET

In light of feedback about low risk, outdoor, masked events in high-risk areas, we raised the cutoff from 28 to 35 and changed the multiplying values for Risk Levels:

Risk Level    Old Value    New Value
Low           1            0.9
Medium        2            1.2
High          3            1.5
Critical      4            2
