Estimating physical activity from self-reported behaviours in large-scale population studies using network harmonisation: findings from UK Biobank and associations with disease outcomes.
The international journal of behavioral nutrition and physical activity 2020 ; 17: 40.
PubMed ID : 32178703
PMCID : PMC7074990
UK Biobank is a large prospective cohort study containing accelerometer-based physical activity data with strong validity collected from 100,000 participants approximately 5 years after baseline. In contrast, the main cohort has multiple self-reported physical behaviours from > 500,000 participants with longer follow-up time, offering several epidemiological advantages. However, questionnaire methods typically suffer from greater measurement error, and at present there is no tested method for combining these diverse self-reported data to more comprehensively assess the overall dose of physical activity. This study aimed to use the accelerometry sub-cohort to calibrate the self-reported behavioural variables to produce a harmonised estimate of physical activity energy expenditure, and subsequently examine its reliability, validity, and associations with disease outcomes.
We calibrated 14 self-reported behavioural variables from the UK Biobank main cohort using the wrist accelerometry sub-cohort (n = 93,425), and used published equations to estimate physical activity energy expenditure (PAEE) from self-report. For comparison, we estimated physical activity based on the scoring criteria of the International Physical Activity Questionnaire, and by summing variables for occupational and leisure-time physical activity with no calibration. Test-retest reliability was assessed using data from the UK Biobank repeat assessment (n = 18,905) collected a mean of 4.3 years after baseline. Validity was assessed in an independent validation study (n = 98) against criterion PAEE estimates based on doubly labelled water. In the main UK Biobank cohort (n = 374,352), Cox regression was used to estimate associations between self-report-derived PAEE and fatal and non-fatal outcomes including all-cause mortality, cardiovascular diseases, respiratory diseases, and cancers.
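The calibration step can be pictured as a regression fitted in the sub-cohort that has both kinds of measurement and then applied to everyone with self-report data. The sketch below uses ordinary least squares on synthetic data as a stand-in; it is not the network harmonisation procedure itself, and the function names, sample sizes, and coefficients are all illustrative assumptions.

```python
import numpy as np


def fit_calibration(self_report, reference):
    """Fit OLS coefficients (intercept first) mapping self-report to a reference measure."""
    design = np.column_stack([np.ones(len(self_report)), self_report])
    coef, *_ = np.linalg.lstsq(design, reference, rcond=None)
    return coef


def apply_calibration(coef, self_report):
    """Predict the reference measure from self-report using fitted coefficients."""
    design = np.column_stack([np.ones(len(self_report)), self_report])
    return design @ coef


# Synthetic stand-in data (assumed, not UK Biobank data).
rng = np.random.default_rng(0)
n_main, n_sub, n_vars = 5000, 1000, 14
self_report_main = rng.normal(size=(n_main, n_vars))
true_weights = rng.uniform(0.5, 2.0, size=n_vars)

# Pretend the first n_sub participants also wore an accelerometer.
self_report_sub = self_report_main[:n_sub]
accel_sub = 40.0 + self_report_sub @ true_weights + rng.normal(scale=8.0, size=n_sub)

# Fit in the sub-cohort, then predict for the whole cohort.
coef = fit_calibration(self_report_sub, accel_sub)
calibrated_main = apply_calibration(coef, self_report_main)
```

Fitting in the sub-cohort and predicting in the main cohort is what lets the larger sample inherit the scale of the objective measure, at the cost of inheriting any error in the fitted relationship.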
The self-report-derived PAEE explained 27% of the variance in gold-standard PAEE estimates from doubly labelled water, with no mean bias. However, error was strongly correlated with the criterion PAEE (r = -.98; p < 0.001), and the self-report-derived PAEE had a narrower range than the criterion. Test-retest reliability (Λ = .67) and relative validity (Spearman = .52) of the self-report-derived PAEE outperformed two common approaches for processing self-report data with no calibration. Predictive validity was demonstrated by associations with morbidity and mortality, e.g. 14% (95% CI: 11-17%) lower mortality for individuals meeting lower physical activity guidelines.
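No mean bias, a narrower range, and error that is negatively correlated with the criterion tend to occur together when a regression-calibrated estimate explains only part of the criterion variance. The simulation below is a sketch with assumed numbers (a criterion with mean 50 and SD 20, and an estimate built to explain about 27% of its variance); it uses no study data and does not attempt to reproduce the reported r = -.98.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
r_squared = 0.27   # assumed share of criterion variance the estimate explains
criterion_sd = 20.0  # assumed, arbitrary units

# Split the criterion into a part the estimate captures and a part it misses.
explained = rng.normal(scale=criterion_sd * np.sqrt(r_squared), size=n)
unexplained = rng.normal(scale=criterion_sd * np.sqrt(1.0 - r_squared), size=n)

criterion = 50.0 + explained + unexplained
estimate = 50.0 + explained  # an idealised calibrated prediction

error = estimate - criterion
mean_bias = error.mean()
corr_error_criterion = np.corrcoef(error, criterion)[0, 1]

print(f"mean bias: {mean_bias:.3f}")
print(f"SD estimate vs criterion: {estimate.std():.2f} vs {criterion.std():.2f}")
print(f"corr(error, criterion): {corr_error_criterion:.3f}")
```

Under these assumptions the mean bias is near zero, the estimate has the smaller spread, and the error-criterion correlation is close to -sqrt(1 - 0.27), roughly -0.85: high values are underestimated and low values overestimated, even though ranking is partly preserved.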
The PAEE variable has good reliability and validity for ranking individuals, with no mean bias but correlated error at the individual level. PAEE outperformed uncalibrated estimates and showed stronger inverse associations with disease outcomes.