Stanford Medicine News Center - May 24th, 2017 - by Jennie Dusheck
Millions of people wear some kind of wristband activity tracker and use the device to monitor their own exercise and health, often sharing the data with their physician. But is the data accurate?
Such people can take heart in knowing that if the device is measuring heart rate, it’s probably doing a good job, a team of researchers at the Stanford University School of Medicine reports. But if it’s measuring energy expenditure, it’s probably off by a significant amount.
An evaluation of seven devices in a diverse group of 60 volunteers showed that six of the devices measured heart rate with an error rate of less than 5 percent. The team evaluated the Apple Watch, Basis Peak, Fitbit Surge, Microsoft Band, Mio Alpha 2, PulseOn and the Samsung Gear S2. Some devices were more accurate than others, and factors such as skin color and body mass index affected the measurements.
In contrast, none of the seven devices measured energy expenditure accurately, the study found. Even the most accurate device was off by an average of 27 percent. And the least accurate was off by 93 percent.
“People are basing life decisions on the data provided by these devices,” said Euan Ashley, DPhil, FRCP, professor of cardiovascular medicine, of genetics and of biomedical data science at Stanford. But consumer devices aren’t held to the same standards as medical-grade devices, and it’s hard for doctors to know what to make of heart-rate data and other data from a patient’s wearable device, he said.
A paper reporting the researchers’ findings was published online May 24 in the Journal of Personalized Medicine. Ashley is the senior author. Lead authorship is shared by graduate student Anna Shcherbina, visiting assistant professor Mikael Mattsson, PhD, and senior research scientist Daryl Waggott.
Hard for consumers to know device accuracy
Manufacturers may test the accuracy of activity devices extensively, said Ashley, but it’s hard for consumers to know how accurate such information is or what process the manufacturers used to test the devices. So Ashley and his colleagues set out to independently evaluate activity trackers that met criteria such as measuring both heart rate and energy expenditure and being commercially available.
“For a lay user, in a non-medical setting, we want to keep that error under 10 percent,” Shcherbina said.
Sixty volunteers, including 31 women and 29 men, wore the seven devices while walking or running on treadmills or using stationary bicycles. Each volunteer’s heart rate was measured with a medical-grade electrocardiograph. Metabolic rate was estimated with an instrument that measures the oxygen and carbon dioxide in breath, a good proxy for metabolism and energy expenditure. Results from the wearable devices were then compared to the measurements from the two “gold standard” instruments.
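For readers curious about what such a comparison looks like in practice, here is a minimal sketch in Python of one common way to score a device against a gold-standard instrument: the percent error of each reading, averaged across readings. It is an illustration only, not the researchers’ analysis code, and the article does not say exactly which statistic they used. The function names and every number in the example are hypothetical.

```python
from statistics import mean


def percent_error(device: float, reference: float) -> float:
    """Absolute error of one device reading relative to the reference, in percent."""
    if reference == 0:
        raise ValueError("reference reading must be nonzero")
    return abs(device - reference) * 100.0 / abs(reference)


def mean_percent_error(device_readings, reference_readings) -> float:
    """Average percent error over paired device and reference readings."""
    if len(device_readings) != len(reference_readings):
        raise ValueError("readings must be paired one-to-one")
    if not device_readings:
        raise ValueError("at least one pair of readings is required")
    errors = [percent_error(d, r) for d, r in zip(device_readings, reference_readings)]
    return mean(errors)


def within_threshold(error_percent: float, threshold_percent: float) -> bool:
    """True if the error is strictly below the chosen acceptability threshold."""
    return error_percent < threshold_percent


# Hypothetical readings, invented purely for illustration (not study data).
ecg_heart_rate = [100, 120, 150]      # beats per minute, reference instrument
device_heart_rate = [98, 123, 150]    # beats per minute, wrist device

reference_energy = [200, 300]         # kilocalories, reference instrument
device_energy = [250, 240]            # kilocalories, wrist device

hr_error = mean_percent_error(device_heart_rate, ecg_heart_rate)
ee_error = mean_percent_error(device_energy, reference_energy)

# The 10 percent cutoff is the figure Shcherbina cites for lay users.
print(f"heart rate error: {hr_error:.1f}% (acceptable: {within_threshold(hr_error, 10.0)})")
print(f"energy error: {ee_error:.1f}% (acceptable: {within_threshold(ee_error, 10.0)})")
```

With these made-up numbers the heart-rate readings fall well inside the 10 percent cutoff and the energy readings fall outside it, mirroring the direction of the study’s findings but not their actual values.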
“The heart rate measurements performed far better than we expected,” said Ashley, “but the energy expenditure measures were way off the mark. The magnitude of just how bad they were surprised me.”
Heart-rate data reliable
The take-home message, Ashley said, is that a user can pretty much rely on a fitness tracker’s heart rate measurements. But basing the number of doughnuts you eat on how many calories your device says you burned is a really bad idea, he said.
Neither Ashley nor Shcherbina could be sure why energy-expenditure measures were so far off. Each device uses its own proprietary algorithm for calculating energy expenditure, they said. It’s likely the algorithms are making assumptions that don’t fit individuals very well, said Shcherbina. “All we can do is see how the devices perform against the gold-standard clinical measures,” she said. “My take on this is that it’s very hard to train an algorithm that would be accurate across a wide variety of people because energy expenditure is variable based on someone’s fitness level, height and weight, etc.” Heart rate, she said, is measured directly, whereas energy expenditure must be measured indirectly through proxy calculations.
Ashley’s team saw a need to make their evaluations of wearable devices open to the research community, so they created a website that shows their own data. They welcome others to upload data related to device performance.
The team is already working on the next iteration of their study, in which they are evaluating the devices while volunteers wear them as they go about a normal day, including exercising in the open, instead of walking or running on a laboratory treadmill. “In phase two,” said Shcherbina, “we actually want a fully portable study. So volunteers’ ECG will be portable and their energy calculation will also be done with a portable machine.”
The work is an example of Stanford Medicine’s focus on precision health, the goal of which is to anticipate and prevent disease in the healthy and precisely diagnose and treat disease in the ill.
Other Stanford co-authors are clinical nurse specialist Heidi Salisbury, RN, MSN; clinical exercise physiologist Jeffrey Christle, PhD; Trevor Hastie, PhD, professor of statistics and of biomedical data science; and Matthew Wheeler, MD, PhD, clinical assistant professor of cardiovascular medicine. Ashley is also a member of the Stanford Cardiovascular Institute, the Stanford Child Health Research Institute and Stanford Bio-X. Hastie is a member of CHRI, Bio-X, the Stanford Cancer Institute and the Stanford Neurosciences Institute.
Stanford’s departments of Medicine, of Genetics and of Biomedical Data Science supported the work.