Shedding light on pulse oximeters
A case study on decision-making bias in medicine
Learning objectives
- Define sensitivity, specificity, and accuracy, and interpret these metrics in the context of hypoxemia detection using pulse oximeters.
- Analyze how differences in skin tone can lead to systematic differences in pulse oximetry measurements.
- Evaluate the ethical and practical trade-offs of adjusting diagnostic thresholds based on patient characteristics to mitigate disparities in care.
What are pulse oximeters?
Pulse oximeters are small devices that measure the oxygen levels in our blood. Clinicians use them to check for hypoxemia, which is a dangerous drop in our body's oxygen levels. Most healthy patients have an oxygen level of 95 to 100%.
It's important to recognize that pulse oximeters only estimate our oxygen levels. The gold standard is an arterial blood gas (ABG) measurement, which requires drawing blood from an artery with a needle.
Are pulse oximeters biased?
Research has shown that, compared to ABG measurements, pulse oximeters can be less accurate for patients with darker skin tones.1 Specifically, they can overestimate oxygen levels for Black patients†, leading to systematic delays in giving treatments like supplemental oxygen. This happens because pulse oximeters shine light through the finger to estimate how much oxygen the blood is carrying, and skin pigmentation can affect how that light is absorbed and reflected.
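To make "overestimate" concrete, here is a minimal sketch of how paired readings could be compared. The numbers are invented for illustration and are not from any real dataset, and the function names `mean_bias` and `missed_cases` are hypothetical. The cutoff of 88% is assumed to apply to both the pulse oximeter reading (SpO2) and the blood gas value (SaO2).

```python
from statistics import mean

# Toy paired readings: (pulse oximeter SpO2, arterial blood gas SaO2), in percent.
# These values are made up for illustration only.
paired_readings = [
    (92, 90),
    (89, 86),
    (95, 95),
    (91, 87),
]

def mean_bias(pairs):
    """Average of (SpO2 - SaO2). A positive value means the device reads high."""
    return mean(spo2 - sao2 for spo2, sao2 in pairs)

def missed_cases(pairs, cutoff=88):
    """Pairs where the blood gas is below the cutoff but the oximeter reading is not."""
    return [(spo2, sao2) for spo2, sao2 in pairs if sao2 < cutoff and spo2 >= cutoff]

print(mean_bias(paired_readings))     # average overestimate in percentage points
print(missed_cases(paired_readings))  # patients whose hypoxemia the oximeter would miss
```

Even a small upward bias matters most for patients whose true oxygen level sits just below the cutoff, because the reading can land on the wrong side of it.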
In a few minutes, we'll take a look at a dataset of real patients3 from a research team at UCSF4. The dataset consists of unique patients, a subset of whom are hypoxemic. There's a relatively even split in terms of self-reported race between patients who identify as White and patients who identify as Black.
†For accessibility, we use "White" and "Black" as shorthand for individuals with lighter and darker skin tones, respectively. However, race is a social construct2; while there may be a correlation between self-reported race and skin tone, it's important to recognize that these are distinct concepts. Skin tone varies widely within any racial group, and the terms should not always be used interchangeably.
How to measure bias
We'll primarily look at 3 different metrics:
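- Sensitivity: If the patient actually has low blood oxygen, did the pulse oximeter detect it? Sensitivity is the fraction of truly hypoxemic patients that the test flags. A high sensitivity means we catch most of the true cases of hypoxemia, and 100% means we catch all of them.
- Specificity: If the patient does not have low blood oxygen, did we correctly identify them as not hypoxemic? Specificity is the fraction of non-hypoxemic patients that the test clears. A high specificity means we rarely raise false alarms.
- Accuracy: Overall, how often does the pulse oximeter give us the right answer? Accuracy is the fraction of all patients, hypoxemic or not, that the test classifies correctly.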
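The three metrics above can be computed from four counts: true positives, false negatives, false positives, and true negatives. The sketch below is one way to do this. The function name `diagnostic_metrics` and the toy inputs are hypothetical, and the code assumes each group contains at least one hypoxemic and one non-hypoxemic patient.

```python
def diagnostic_metrics(truth, flagged):
    """Compute sensitivity, specificity, and accuracy.

    truth[i]   -- True if patient i is truly hypoxemic (e.g., by ABG)
    flagged[i] -- True if the pulse oximeter reading was below the cutoff
    """
    tp = sum(1 for t, f in zip(truth, flagged) if t and f)          # caught cases
    fn = sum(1 for t, f in zip(truth, flagged) if t and not f)      # missed cases
    fp = sum(1 for t, f in zip(truth, flagged) if not t and f)      # false alarms
    tn = sum(1 for t, f in zip(truth, flagged) if not t and not f)  # correctly cleared

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Toy example (made-up values): 3 hypoxemic patients, 5 non-hypoxemic patients.
truth   = [True, True, True,  False, False, False, False, False]
flagged = [True, True, False, True,  False, False, False, False]
print(diagnostic_metrics(truth, flagged))
```

Note that accuracy alone can hide a problem: if hypoxemia is rare, a test that flags nobody still scores a high accuracy while its sensitivity is zero.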
In most cases, it's hard to have a test that is both perfectly sensitive and perfectly specific: changing the cutoff to improve one usually costs some of the other. When diagnosing and treating hypoxemia, is it more important to have a high sensitivity, or a high specificity? Why?
Start treating patients!
As a doctor, your goal is to identify which patients have true hypoxemia and which do not. If a patient has a pulse oximetry reading below a certain cutoff, you will start them on supplemental oxygen. In the Threshold Strategy section on the left, use your clinical skills to adjust the threshold at which you start oxygen therapy. You will choose between 2 strategies:
- Race Unaware: use the same threshold for all patients regardless of race.
- Race Aware: use different thresholds for patients depending on their race.
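The two strategies can be written as one decision rule that looks up a cutoff per group. This is only a sketch: the function name `start_oxygen` is hypothetical, and the race-aware cutoff of 91 is an arbitrary illustrative value, not a clinical recommendation.

```python
def start_oxygen(spo2, race, thresholds):
    """Return True if the reading is below the cutoff for this patient's group."""
    return spo2 < thresholds[race]

# Race Unaware: one cutoff for everyone.
race_unaware = {"White": 88, "Black": 88}

# Race Aware: a separate cutoff per group (91 is a hypothetical value).
race_aware = {"White": 88, "Black": 91}

reading = 89
print(start_oxygen(reading, "Black", race_unaware))  # same rule for all patients
print(start_oxygen(reading, "Black", race_aware))    # higher cutoff for this group
```

Writing the race-unaware strategy as a dictionary with equal values makes it clear that it is a special case of the race-aware one.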
What would happen if you set your threshold to 100% on the left (i.e., any patient with a pulse oximetry measurement less than 100% oxygen saturation is given supplemental oxygen)? What happens when you set it to 90%? To 80%?
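If you want to reason about this question away from the interactive plots, a sketch like the following sweeps the three cutoffs over a toy set of patients. The readings and labels are invented for illustration, and `sens_spec` is a hypothetical helper name.

```python
def sens_spec(patients, cutoff):
    """Return (sensitivity, specificity) when readings below `cutoff` are flagged."""
    tp = fn = fp = tn = 0
    for spo2, hypoxemic in patients:
        flagged = spo2 < cutoff
        if hypoxemic and flagged:
            tp += 1  # true hypoxemia, treated
        elif hypoxemic:
            fn += 1  # true hypoxemia, missed
        elif flagged:
            fp += 1  # not hypoxemic, treated anyway
        else:
            tn += 1  # not hypoxemic, correctly left alone
    return tp / (tp + fn), tn / (tn + fp)

# Toy patients (made-up values): (pulse oximeter reading, truly hypoxemic?)
toy_patients = [
    (99, False), (97, False), (95, False), (93, False), (89, False),
    (91, True), (87, True), (85, True), (79, True),
]

for cutoff in (100, 90, 80):
    sensitivity, specificity = sens_spec(toy_patients, cutoff)
    print(cutoff, sensitivity, specificity)
```

In this toy data, lowering the cutoff trades sensitivity for specificity: fewer healthy patients are treated, but more hypoxemic patients are missed.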
Each dot is a patient — blue dots correspond to patients you have accurately diagnosed, and gray dots correspond to incorrect diagnoses.
[Interactive plots: one panel for White patients and one panel for Black patients]
- A common threshold for hypoxemia used in medicine is a pulse oximetry reading less than 88%. Using this threshold, what is our diagnostic accuracy for White patients? How about for Black patients?
- What is the lowest threshold that maximizes the sensitivity for White patients? What is the sensitivity for Black patients using the same threshold?
- What is the highest threshold that maximizes the specificity for Black patients? What is the corresponding specificity for White patients using the same threshold?
- Can you achieve 100% sensitivity for both White and Black patients? How?
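One way to approach these questions programmatically is to sweep candidate cutoffs for each group separately. The sketch below does this for sensitivity only. The readings are invented for illustration, the helper names are hypothetical, and the search range of integer cutoffs from 70 to 100 is an assumption.

```python
def sensitivity(hypoxemic_readings, cutoff):
    """Fraction of truly hypoxemic patients whose reading falls below the cutoff."""
    flagged = sum(1 for reading in hypoxemic_readings if reading < cutoff)
    return flagged / len(hypoxemic_readings)

def lowest_cutoff_with_max_sensitivity(hypoxemic_readings, cutoffs=range(70, 101)):
    """Smallest cutoff in `cutoffs` that reaches the best achievable sensitivity."""
    best = max(sensitivity(hypoxemic_readings, c) for c in cutoffs)
    return min(c for c in cutoffs if sensitivity(hypoxemic_readings, c) == best)

# Toy pulse oximeter readings for truly hypoxemic patients (made-up values).
hypoxemic_by_group = {
    "White": [84, 86, 87],
    "Black": [86, 89, 91],
}

for group, readings in hypoxemic_by_group.items():
    print(group, lowest_cutoff_with_max_sensitivity(readings))

# Sensitivity for the second group when it is held to the first group's cutoff.
white_cutoff = lowest_cutoff_with_max_sensitivity(hypoxemic_by_group["White"])
print(sensitivity(hypoxemic_by_group["Black"], white_cutoff))
```

In this toy data, a single cutoff reaches 100% sensitivity for both groups only if it is set to the higher of the two group-specific cutoffs, which means treating more non-hypoxemic patients in the other group.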
Let's discuss
- Is using different thresholds for patients of different skin tones "racist", even if it leads to better outcomes? Why or why not?
- Thanks to recent research, we know the mechanism behind how skin color affects pulse oximetry accuracy. Would your answer to the previous question change if we did not know the mechanism, and could only rely on observational data?
- Thankfully, the cost of giving supplemental oxygen is relatively small, even when the patient turns out to be healthy. But this isn't always true! For example, suppose you have a test for a very serious cancer that requires expensive and intensive medications to treat. How might your decision-making change?
This tutorial was largely inspired by an excellent prior piece from Google.5 Many thanks to their team for making such resources available to draw upon.