A new analysis of dive profiles – but what can the data truly tell us?
A new paper on DCS risk assessment is currently sparking discussions. It was published in early 2026 by a group of researchers, almost all affiliated with DAN Europe. The data comes from the DAN DSL Database, analyzing 127,957 dives from 5,907 divers. In addition to dive profiles, the data includes information about the divers: gender, height, weight, BMI, cold exposure, fatigue, and more – an excellent opportunity to gain insights into the many open questions regarding risk factors.
On social media, some of the study’s “findings” are already being actively shared and discussed: Do women really have a 3-4 times higher risk of getting DCS? And is it plausible that lean, rather underweight divers have a higher risk than overweight ones?
The sometimes surprising results could be due to some structural weaknesses in the study.
"Identification of DCS risk factors in recreational diving..."
It starts with the question of how many dive profiles were actually examined. Perhaps it was 136,793: some dives were sorted out due to data ambiguities but reappear later in the analysis. Or perhaps 127,197, the figure that appears where DCS frequency is discussed. Unfortunately, the total number of dives is handled no more consistently than other figures in the paper, and this careless handling of numbers raises suspicion.
At first glance, however, this still sounds impressive: many dives, real profiles, a large database. Many divers expect relevant insights from the analysis of such data, which is why the study is mentioned more frequently and perceived by more people than would be normal for a niche topic like decompression. And because it comes from DAN, the study immediately receives a vote of confidence – but does it truly deserve it?
The study is long and raises many questions – too many to discuss in one blog post. However, three problems stand out particularly in the paper, and we want to take a closer look at them in three separate posts.
- Where does the data come from? The entire data collection is not explained, and it appears that voluntarily reported profiles and DCS cases reported to DAN are simply combined. We discuss why this is a problem here: Statistics is hard to get
- Dives as independent events – the almost 130,000 dives are analyzed as if they were independent events, yet they come from only about 6,000 divers who contributed very different amounts. We discuss why this is a problem in this article.
- Confusion with the “DSSG”: The authors invent a new measure for supersaturation that they do not explain – but it is clearly NOT the normal GF upon reaching the surface that divers are familiar with. This will be covered in another blog post.
Some Divers Count More Than Others
In this post, we want to take a closer look at an aspect that can lead to distortions in the results: What happens when parts of the data are not independent of each other?
The study itself states that individual divers contributed very different numbers of profiles. Some only a single dive, others very many. The median is three, the maximum is 1,432 dives per diver.
What do median and average mean here? The median is the value for which half of the data lies at or below it and half lies at or above it. So of the almost 6,000 divers, about 3,000 contributed at most 3 dives each, making at most about 9,000 dives, probably fewer. The other 3,000 each contributed three or more and together account for at least about 119,000 of the profiles.
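The bound can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures quoted from the study (5,907 divers, 127,957 dives, median 3); the variable names are our own.

```python
divers = 5_907
dives = 127_957
median_dives = 3

# With a median of 3, at least half of the divers contributed 3 dives or fewer.
lower_half = (divers + 1) // 2
max_dives_lower_half = lower_half * median_dives

# Everything the lower half did not contribute must come from the other divers.
min_dives_upper_half = dives - max_dives_lower_half
min_share_upper_half = min_dives_upper_half / dives

print(f"lower half: at most {max_dives_lower_half:,} dives")
print(f"upper half: at least {min_dives_upper_half:,} dives "
      f"({min_share_upper_half:.0%} of all profiles)")
```

In other words, at least roughly 93% of all profiles come from the more active half of the divers, and the real share is probably higher.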
A person who contributes an extreme number of dives has many characteristics, such as BMI and gender, that remain constant across all of those dives. Their data therefore counts more in the analysis than the data of those who contributed only a few dives, and this can distort the picture. Moreover, a person’s diving behavior is usually relatively consistent, so even variables that could in principle vary from dive to dive are shaped more strongly by this one person than by others.
This is a relevant methodological problem. To understand it more precisely, let’s start with a small model.
When one diver contributes a lot of data
To see why it matters when individual divers are weighted differently in the analysis, let’s take a small model. Right upfront: with larger datasets, the effect is far less pronounced. We want to show where the methodological problem lies, and therefore use an example where it becomes clearly visible.
Let’s imagine 100 divers who share a total of 2,000 dives. One diver contributes 1,000 dives, while the other 1,000 dives are distributed among the remaining 99 divers.
The problem is immediately apparent here. A single diver contributes 50% of the data. Their person-specific characteristics, like gender and BMI, therefore make up 50% of the dive-level data, even though they are only one diver out of a hundred.
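The imbalance can be made explicit with a short sketch. The way the remaining 1,000 dives are spread over the 99 other divers is an assumption of ours (uniformly at random, at least one dive each); only the totals come from the model above.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Diver 0 contributes 1,000 dives. The other 1,000 dives are spread at random
# over the remaining 99 divers, each of whom has at least one dive.
counts = [1000] + [1] * 99
for _ in range(1000 - 99):
    counts[random.randrange(1, 100)] += 1

total = sum(counts)
share_by_diver = 1 / len(counts)   # weight if every diver counted equally
share_by_dive = counts[0] / total  # weight when every dive counts equally

print(f"dives in total: {total}")
print(f"weight of diver 0 per diver: {share_by_diver:.0%}")
print(f"weight of diver 0 per dive:  {share_by_dive:.0%}")
```

Counting per dive gives this one person fifty times the weight they would have if every diver counted once.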
Next, we’ll look at what this means in the analysis when 10 DCS cases are randomly distributed among the shared profiles.
Two people swap places
From these profiles, we will now construct a small thought experiment. For this, the characteristics of the 100 divers are randomly determined once:
- 25 women, 75 men
- BMI roughly distributed as in a European adult population
- Number of shared dives randomly distributed
- 10 DCS cases are distributed among the dives, with a maximum of one per person
With this dataset, we analyze how gender and BMI are distributed across all dives, and how they are distributed across the dives that ended with DCS. From this, we create an evaluation and a graph that looks very serious: What is the DCS risk by gender and by BMI? The overall risk is fixed at 0.5% (10 cases in 2,000 dives), but the risk per subgroup can differ considerably from that.
We do this twice, changing only one thing in the data: In the first sorting, the diver with 1,000 dives is an overweight man without DCS. Among those with only one reported dive is an underweight woman with DCS. In the second analysis, we swap these two individuals. What do the results look like?
Small Change, Big Impact
Here we can clearly see what it means when an analysis simply ignores that the “dive profile” data is not independent of the “diver” data. The one underweight person with DCS made “underweight” look like a massive risk factor. However, it’s just that the three underweight divers contributed only 5 dives, while the one overweight diver contributed 1000 – and this massively shifts the result. And the moment the woman contributes 1000 dives, it suddenly appears that men have more DCS – even though she was the one who had it.
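The swap can be reproduced with a deliberately simple, deterministic cohort. This is only a sketch: apart from the numbers named in the text (100 divers, 2,000 dives, 25 women, 10 DCS cases, three underweight divers with 5 dives), the layout of the cohort, the function names `build_cohort` and `risk_per_dive`, and the placement of the other nine cases are our own assumptions and not the simulation behind the graphs.

```python
from collections import defaultdict

def build_cohort(heavy_is_underweight_woman):
    """Return 100 divers as (sex, bmi, n_dives, had_dcs); 2,000 dives in total."""
    man_dives, woman_dives = (1, 1000) if heavy_is_underweight_woman else (1000, 1)
    cohort = [
        ("m", "overweight", man_dives, False),    # overweight man, no DCS
        ("f", "underweight", woman_dives, True),  # underweight woman, one DCS dive
        ("f", "underweight", 2, False),
        ("m", "underweight", 2, False),
    ]
    # 96 further divers share the remaining 995 dives (35 x 11 + 61 x 10).
    for i in range(96):
        sex = "f" if i < 23 else "m"              # 25 women and 75 men overall
        bmi = "normal" if i % 2 == 0 else "overweight"
        n_dives = 11 if i < 35 else 10
        had_dcs = i % 10 == 0 and i < 90          # 9 further cases, one per diver
        cohort.append((sex, bmi, n_dives, had_dcs))
    return cohort

def risk_per_dive(cohort, column):
    """Naive per-dive risk by group; column 0 is sex, column 1 is BMI."""
    dives, cases = defaultdict(int), defaultdict(int)
    for row in cohort:
        dives[row[column]] += row[2]
        cases[row[column]] += row[3]
    return {group: cases[group] / dives[group] for group in dives}

for swapped in (False, True):
    cohort = build_cohort(swapped)
    print("heavy diver is the underweight woman:", swapped)
    for column, label in ((0, "sex"), (1, "BMI")):
        risks = risk_per_dive(cohort, column)
        print(" ", label, {g: f"{r:.2%}" for g, r in risks.items()})
```

In this toy cohort, the per-dive risk for “underweight” drops from 20% to about 0.1% after the swap, and the ordering of women and men flips, although not a single DCS case has changed.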
This effect must be checked before the actual analysis, for example by using smaller sample datasets, or by including only one dive per person and comparing the outcome with the analysis of all dives. Methods for this exist, as the problem is well known.
The larger the amount of data, the less relevant the number of datasets per person can become. If the number of reported dives per person is distributed as it typically is among divers, and if personal characteristics are distributed roughly equally among frequent divers and those with very few profiles, then this problem eventually becomes irrelevant. However, it must be validated in advance whether the dataset is structured in such a way that this question can be ignored.
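One way such a check could look is sketched below: the naive per-dive risk is compared with the risk obtained when every diver contributes exactly one randomly chosen dive, averaged over many random draws. The toy data and the function names `pooled_risk` and `one_dive_per_diver_risk` are hypothetical and only illustrate the idea; this is not a method taken from the study.

```python
import random
from collections import defaultdict

def pooled_risk(dives):
    """Per-dive risk by group, treating every dive as an independent event."""
    n, k = defaultdict(int), defaultdict(int)
    for _diver, group, dcs in dives:
        n[group] += 1
        k[group] += dcs
    return {g: k[g] / n[g] for g in n}

def one_dive_per_diver_risk(dives, repeats=2000, seed=0):
    """Average risk by group when each diver contributes one random dive."""
    rng = random.Random(seed)
    by_diver = defaultdict(list)
    for diver, group, dcs in dives:
        by_diver[diver].append((group, dcs))
    totals = defaultdict(float)
    for _ in range(repeats):
        n, k = defaultdict(int), defaultdict(int)
        for own_dives in by_diver.values():
            group, dcs = rng.choice(own_dives)
            n[group] += 1
            k[group] += dcs
        for g in n:
            totals[g] += k[g] / n[g]
    return {g: totals[g] / repeats for g in totals}

# Hypothetical toy data: (diver id, BMI group, 1 if the dive ended with DCS).
dives = [(0, "overweight", 0)] * 1000            # one very active diver
dives += [(1, "underweight", 1)]                 # a single dive, with DCS
dives += [(2, "underweight", 0)] * 2 + [(3, "underweight", 0)] * 2
dives += [(4, "overweight", 1)] + [(4, "overweight", 0)] * 9
for diver in range(5, 10):
    dives += [(diver, "overweight", 0)] * 10

print("all dives pooled:  ", pooled_risk(dives))
print("one dive per diver:", one_dive_per_diver_risk(dives))
```

If the two estimates differ strongly, as they do in this toy example, the number of dives per person clearly influences the result and has to be dealt with before any risk factor is interpreted.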
What does this mean for the paper?
We have already clarified that the simulation is purely didactic and has nothing to do with the specific figures from the study. If you have an enormous amount of data, the effect eventually becomes irrelevant.
However: The data volumes in the study are not large in all areas. One of the “results” that causes some astonishment is the high DCS risk in underweight individuals – a result no one would have expected, as overweight is generally considered a risk factor, not underweight.
But from what amount of data does the P(DCS), the probability of 1.6% for a dive by a moderately underweight person, come? This number comes from exactly one (!) DCS case in this BMI class. One more case, and the probability would be about 3.2%; one fewer, and it would be 0. Here, one can hardly speak of much more than statistical noise.
But let’s take a closer look. The study defines BMI classes at the beginning and also states how many dive profiles are available in the database for each BMI class. The authors arrive at 101,865 profiles for which BMI was known, but the individual numbers per class add up to 102,474. We will continue to work with the latter number. For about 20% of the profiles, the BMI is unknown. How the BMI classes are distributed among the divers is also not mentioned.
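How little a single case in 63 profiles pins down can be illustrated with a confidence interval. The sketch below uses the Wilson score interval, which is our choice for illustration and not something the study reports. It also treats the 63 profiles as independent, which is exactly the assumption questioned in this post, so the true uncertainty is probably even larger.

```python
from math import sqrt

def wilson_interval(cases, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

profiles = 63  # profiles in the moderately underweight BMI class
for cases in (0, 1, 2):
    low, high = wilson_interval(cases, profiles)
    print(f"{cases} case(s): P(DCS) = {cases / profiles:.2%}, "
          f"95% interval {low:.2%} to {high:.2%}")
```

For one case in 63 profiles, the interval reaches from well below 1% to more than 8%, so the point estimate of 1.6% is compatible with almost any plausible risk.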
The next piece of information about BMI and DCS is in a graph, Figure 6. It shows the P(DCS) per BMI class, i.e., the probability in percent that a profile of a person in that BMI class ends with DCS. The actual distribution of DCS cases across BMI classes is not shown, and the graph can only be read as an estimate – but that is enough to reconstruct the distribution of the 628 DCS cases.
Reconstructed BMI-DCS Table
Working reconstruction based on published profile numbers and P(DCS) values plausibly read from Figure 6. The specific raw data per BMI class were not provided as a table in the study.
| BMI Class | BMI Range | Profiles | P(DCS) from Graph | Case Count from Graph | Reconstructed Cases | P(DCS) Reconstructed |
|---|---|---|---|---|---|---|
| -3 | <16.0 | 12 | 0.00% | 0 | 0 | 0.00% |
| -2 | 16.0–16.9 | 63 | 1.60% | 1.01 | 1 | 1.59% |
| -1 | 17.0–18.4 | 877 | 1.05% | 9.21 | 9 | 1.03% |
| 0 | 18.5–24.9 | 41,429 | 0.85% | 352.15 | 343 | 0.83% |
| 1 | 25.0–29.9 | 44,888 | 0.50% | 224.44 | 223 | 0.50% |
| 2 | 30.0–34.9 | 13,588 | 0.35% | 47.56 | 46 | 0.34% |
| 3 | 35.0–39.9 | 1,125 | 0.30% | 3.38 | 3 | 0.27% |
| 4 | ≥40.0 | 492 | 0.65% | 3.20 | 3 | 0.61% |
| Sum of this reconstruction | | 102,474 | | | 628 | |
The actual data may, of course, deviate from this reconstruction. This cannot be avoided, as the authors chose not to include these relevant numbers in the publication. However, the deviation is small enough that it cannot fundamentally alter the overall picture.
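The arithmetic behind the table can be retraced in a few lines. The profile counts are the published ones and the P(DCS) values are our readings from Figure 6, so everything derived from them inherits the reading error of the graph; the variable names are our own.

```python
# (BMI class, profiles, P(DCS) in % as read from Figure 6, reconstructed cases)
rows = [
    (-3, 12, 0.00, 0),
    (-2, 63, 1.60, 1),
    (-1, 877, 1.05, 9),
    (0, 41_429, 0.85, 343),
    (1, 44_888, 0.50, 223),
    (2, 13_588, 0.35, 46),
    (3, 1_125, 0.30, 3),
    (4, 492, 0.65, 3),
]

implied_total = 0.0
for bmi_class, profiles, p_graph, cases in rows:
    implied = profiles * p_graph / 100  # case count implied by the graph reading
    implied_total += implied
    print(f"{bmi_class:>3} {profiles:>7,} {implied:>8.2f} "
          f"{cases:>4} {cases / profiles:>7.2%}")

total_profiles = sum(row[1] for row in rows)
total_cases = sum(row[3] for row in rows)
print(f"profiles: {total_profiles:,}, reconstructed cases: {total_cases}, "
      f"cases implied by graph: {implied_total:.1f}")
```

The graph readings alone imply about 641 cases, slightly more than the 628 reported. This is why the reconstructed counts in the well-populated classes sit a little below the values read from the graph.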
Now let’s do an experiment: A single dataset is added to the entire data. One dive, one person, one DCS case. Just one.
How does the analysis change depending on the BMI of this one new case?
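As a minimal sketch, this experiment can be run directly on the reconstructed table: one profile with one DCS case is added to each BMI class in turn. The counts are those of our reconstruction, not raw data from the study.

```python
# BMI class -> (profiles, reconstructed DCS cases)
table = {
    -3: (12, 0), -2: (63, 1), -1: (877, 9), 0: (41_429, 343),
    1: (44_888, 223), 2: (13_588, 46), 3: (1_125, 3), 4: (492, 3),
}

def add_one_case(profiles, cases):
    """P(DCS) after adding a single new dive that ended with DCS."""
    return (cases + 1) / (profiles + 1)

shift = {}
for bmi_class, (profiles, cases) in table.items():
    before = cases / profiles
    after = add_one_case(profiles, cases)
    shift[bmi_class] = after - before
    print(f"class {bmi_class:>2}: {before:.2%} -> {after:.2%}")
```

In the sparsely populated classes at both ends, a single additional case moves P(DCS) by whole percentage points, while in the middle classes the change is barely visible.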
And then we do something else: We assume that in BMI class 0, there was a diver who shared 1,000 dives. What would the result look like if this one diver had a different BMI class?
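The second experiment can be sketched in the same way. We assume here that the hypothetical diver with 1,000 dives had no DCS case, so only profile counts move between classes; the function name `move_diver` is our own.

```python
# BMI class -> (profiles, reconstructed DCS cases)
table = {
    -3: (12, 0), -2: (63, 1), -1: (877, 9), 0: (41_429, 343),
    1: (44_888, 223), 2: (13_588, 46), 3: (1_125, 3), 4: (492, 3),
}

def move_diver(table, target, moved=1000):
    """P(DCS) per class after moving `moved` case-free dives from class 0 to `target`."""
    new = dict(table)
    profiles0, cases0 = new[0]
    new[0] = (profiles0 - moved, cases0)
    profiles, cases = new[target]
    new[target] = (profiles + moved, cases)
    return {cls: k / n for cls, (n, k) in new.items()}

for target in table:
    if target == 0:
        continue
    risks = move_diver(table, target)
    before = table[target][1] / table[target][0]
    print(f"diver moved to class {target:>2}: "
          f"class {target:>2} {before:.2%} -> {risks[target]:.2%}, "
          f"class 0 -> {risks[0]:.2%}")
```

Class 0 hardly notices the loss of 1,000 profiles, whereas the moderately underweight class would drop from about 1.6% to roughly 0.1% if this one diver belonged to it.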
What we can see very clearly here is that the analysis in the middle BMI classes, where many profiles and a relevant number of DCS cases are available, can indeed be interesting. Further data analysis could start here – but it doesn’t end here. And for the very high and very low BMI classes, one simply has to acknowledge that the available data is insufficient to draw conclusions. This is not a problem – DCS is so rare that this issue frequently arises in research.
The problem of result distortion due to the highly varied participation of individual divers certainly also manifests in other aspects of the analysis. BMI is just the one for which most figures are visible, and whose “astonishing” result has already been discussed on social media, making it suitable for clarification. The same caution is, of course, required for the rest of the analysis.
However, the criticism of this analysis does not mean that the DAN data is worthless. On the contrary: large, real dive databases are valuable. They show genuine profiles, real repetitive dives, real behavioral patterns. We need precisely such data, and a good analysis of it would have real potential to provide relevant insights into open questions regarding risk factors.
Unfortunately, the necessary calibration of the method on smaller datasets was simply omitted here. The question of how the varying numbers of datasets per person influence the analysis was not even raised. And if insufficient data was available for certain statements, this was also not clearly stated.
The data itself is gold, and a proper analysis of it would be a great gain. But it should then be analyzed as two datasets, profiles and DCS cases, with the various biases, methodological pitfalls, and ambiguities eliminated or clearly identified. Then, looking at what actually distinguishes DCS data from other data could be truly exciting.
This post is part of a series for which Dominik Elsässer, Robert Helling with his highly recommended blog The Theoretical Diver, and I collaborated.








