CONSORT 2025 statement: updated guideline for reporting randomised trials

Sally Hopewell; An-Wen Chan; Gary S Collins; Asbjørn Hróbjartsson; David Moher; Kenneth F Schulz; Ruth Tunn; Rakesh Aggarwal; Michael Berkwits; Jesse A Berlin; Nita Bhandari; Nancy J Butcher; Marion K Campbell; Runcie C W Chidebe; Diana Elbourne; Andrew Farmer; Dean A Fergusson; Robert M Golub; Steven N Goodman; Tammy C Hoffmann; John P A Ioannidis; Brennan C Kahan; Rachel L Knowles; Sarah E Lamb; Steff Lewis; Elizabeth Loder; Martin Offringa; Philippe Ravaud; Dawn P Richards; Frank W Rockhold; David L Schriger; Nandi L Siegfried; Sophie Staniszewska; Rod S Taylor; Lehana Thabane; David Torgerson; Sunita Vohra; Ian R White; Isabelle Boutron

doi:10.1136/bmj-2024-081123

What to write

How missing data were handled in the analysis

Examples

“Regarding the multiple imputation procedure, briefly, for each outcome, the analysis model used was a linear regression with treatment arm, baseline outcome, and ethnicity (randomization stratifier) as explanatory variables. The imputation models contained all the variables of the analysis model(s) as well as factors associated with missingness: age (identified empirically to predict missingness, P = .03) and adherence (number of doses taken of either vitamin D or placebo, P < .001).”¹

“To consider the potential impact of missing data on trial conclusions, we used multiple imputation (data missing at random) and sensitivity analysis (data not missing at random). Multiple imputation by chained equations was performed using the “mi impute chained” command in Stata. We used a linear regression model to impute missing outcomes for the HOS ADL [activities of daily living subscale of the hip outcome score] at eight months post-randomisation. Variables in the imputation model included all covariates in the analysis model (baseline HOS ADL (continuous), age (continuous), and sex). In addition, we included other variables that were thought to be predictive of the outcome (lateral centre-edge angle, maximum α angle, Kellgren-Lawrence grade, and baseline HADS score). Imputations were run separately by treatment arm and based on a predictive mean matching approach, choosing at random one of the five HOS ADL values with the closest predicted scores. Missing data in the covariates that were included in the multiple imputation model were imputed simultaneously (multiple imputation by chained equation approach). Sensitivity analysis was performed using the “rctmiss” command in Stata, and we considered scenarios where participants with missing data in each arm were assumed to have outcomes that were up to 9 points worse than when data were missing at random.”²

“Analyses for the 2 primary outcomes compared each treatment with usual care using multiple imputation to handle missing data and a Bonferroni-corrected 2-tailed type I error of .025. We performed 20 imputations with a fully conditional specification using Proc MI in SAS. Imputation was performed with the following prespecified variables: age, study group, study site, clinic, sex, race and ethnicity, body mass index, exercise frequency at baseline, education, employment status, smoking status, other medical conditions at baseline, number of medications used for spine pain at baseline, duration of pain at baseline, number of previous pain episodes, STarT Back score, baseline ODI, baseline self-efficacy, baseline EQ-5D-5L, and scores for patient-reported outcomes at every follow-up point (ODI [Oswestry Disability Index], cost, Lorig et al self-efficacy scale, and EQ-5D-5L [EuroQol 5-dimensional 5-level questionnaire]). Each imputed data set was analyzed separately using Proc GENMOD in SAS (with an identity link and normally distributed errors for ODI and a log link and Poisson-distributed errors for spine-related spending).”³

“Missing peak V̇o2 [oxygen consumption] data at week 20, regardless of the type of intercurrent event, was imputed using multiple imputation methodology under the missing at random assumption for the primary analysis. Sensitivity analyses were performed by exploring a missing not at random assumption in the imputation of peak V̇o2. The imputation model used a regression multiple imputation, which includes treatment group, baseline respiratory exchange ratio, persistent atrial fibrillation (yes or no), age, sex, baseline peak V̇o2, baseline hemoglobin level, baseline estimated glomerular filtration rate, baseline body weight, baseline KCCQ [Kansas City cardiomyopathy questionnaire] total symptom score, baseline NYHA [New York Heart Association] class, and baseline average daily activity units (refers to 10 hours of wearing during the awake time for ≥7 days unless otherwise specified). Treatment group, persistent atrial fibrillation (yes or no), baseline NYHA class, and sex were treated as categorical variables. Fifty imputed data sets were generated. Each of the imputed data sets was analyzed using the analysis of covariance model of the primary analysis. Least square mean (LSM) treatment difference and the standard error were combined using Rubin’s rules to produce an LSM estimate of the treatment difference, its 95% CI [confidence interval], and P value for the test of null hypothesis of no treatment effect.”⁴

“Multiple imputation was preplanned for the primary outcome measure in the case of missing data; however, because there were no missing data relating to ventilator-free days, imputation was not required.”⁵

Explanation

Missing data are common when conducting medical research. Collecting data on all study participants can be challenging even in a trial that has mechanisms to maximise data capture. Missing values can occur in either the outcome or in one or more covariates, or usually both. There are many reasons why missing values occur in the outcome. Patients may stop participating in the trial, withdraw consent for further data collection, or fail to attend follow-up visits; all of which could be related to the treatment allocation, specific (prognostic) factors, or experiencing a specific health outcome.⁶ Missing values could also occur in baseline variables, such that all the necessary data needed to conduct the trial have been only partially recorded. Despite the ubiquity of missing data in medical research, the reporting of missing data and how they are handled in the analyses is poor.⁷⁻¹⁵

Many trialists exclude patients without an observed outcome. Once any randomised participants are excluded, the analysis is not strictly an intention-to-treat analysis. Most randomised trials have some missing observations. Trialists effectively must choose between omitting the participants without final outcome data, imputing their missing outcome data, or using model based approaches such as fitting a linear mixed model to repeated measures data.¹⁶ A complete case (or available case) analysis includes only those participants whose outcome is known. While a few missing outcomes will not cause a problem, many trials have more than 10% of randomised patients with missing outcomes.⁷⁻¹⁴ This common situation will result in loss of power by reducing the sample size, and bias may well be introduced if being lost to follow-up is related to a participant’s response to treatment. There should be concern when the frequency or the causes of dropping out differ between the intervention groups.

Participants with missing outcomes can be included in the analysis if their outcomes are imputed (ie, their outcomes are estimated from other information that was collected) or if using a model based approach. Imputing the values of missing data allows the analysis to potentially conform to intention-to-treat analysis but requires strong assumptions, which may be hard to justify. Simple imputation methods are appealing, but their use may be inadvisable as they fail to account for uncertainty introduced by missing data and may lead to invalid inferences (eg, estimated standard errors for the treatment effect will be too small).¹⁷ For randomised trials with missing data within repeated measures data, model based approaches such as fitting a linear mixed model can be used to estimate the treatment effect at the final time point which is valid under a missing-at-random assumption. A model is fit at a (limited) number of time points following randomisation, by including fixed effects for time and randomised group and their interaction.¹⁶
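The warning about simple imputation can be illustrated with a small numerical sketch. It uses hypothetical outcome scores (not data from any trial cited here) and, for brevity, the standard error of a single-arm mean rather than of a treatment effect. Filling every missing value with the observed mean adds no information, yet the enlarged sample makes the standard error look smaller.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical outcome scores; None marks a missing outcome.
outcomes = [10.0, 12.0, 14.0, 16.0, 18.0, None, None, None, None, None]

observed = [y for y in outcomes if y is not None]
observed_mean = mean(observed)

# Single mean imputation: each missing value is replaced by the observed mean.
mean_imputed = [observed_mean if y is None else y for y in outcomes]

se_observed = standard_error(observed)      # based on the 5 observed values
se_imputed = standard_error(mean_imputed)   # treats 10 values as if all were observed

print(f"SE, observed values only:    {se_observed:.3f}")
print(f"SE, after mean imputation:   {se_imputed:.3f}")
```

The point estimate is unchanged, but the second standard error is smaller only because the imputed values are treated as real observations.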

Another approach that is sometimes used is known as “last observation carried forward,” in which missing final values of the outcome variable are replaced by the last known value before the participant was lost to follow-up. Although this method might appear appealing through its simplicity, the underlying assumption will rarely be valid, so the method may introduce bias, and makes no allowance for the uncertainty of imputation. The approach of last observation carried forward has been severely criticised.¹⁸⁻²⁰ Sensitivity analyses should be reported to understand the extent to which the results of the trial depend on the missing data assumptions and subsequent analysis (item 21d).²¹ When the findings from the sensitivity analyses are consistent with the results from the primary analysis (eg, complete case for the primary analysis and multiple imputation for a sensitivity analysis), trialists can be reassured that the missing data assumptions and associated methods had little impact on the trial results.²²
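To make the mechanics of last observation carried forward concrete, here is a minimal sketch with a hypothetical participant record (the function name and the scores are illustrative assumptions). It is shown to clarify what the method assumes, not to recommend it.

```python
def last_observation_carried_forward(visits):
    """Replace each missing visit (None) with the most recent observed value.

    Missing values before the first observation stay None because there is
    nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in visits:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical participant whose score was rising at each visit before dropout.
scores = [50.0, 55.0, 60.0, None, None]
completed = last_observation_carried_forward(scores)
print(completed)
```

The filled-in record freezes the participant at the last observed score, which is the assumption the text describes as rarely valid, and the filled values are then analysed as if they had been measured.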

Regardless of what data are missing, how such data are to be analysed and reported needs to be carefully planned. Authors should describe how missing data were handled in sufficient detail to allow the analysis to be reproduced (in principle; see the box below).

Guidance for reporting analytical approaches to handle missing data (adapted from Hussain et al²³)

Methods

Report any strategies used to reduce missing data throughout the trial process.
Report if and/or how the original sample size calculation accounted for missing data (item 16a) and the justification for these decisions. Report if and/or how the sample size was reassessed during the course of the trial (item 16b).
Report the assumption about the missing data mechanism for the primary analysis and the justification for this choice, for all outcomes. For multiple imputation methods, report²⁴:
- What variables were included in the imputation procedure?
- How were non-normally distributed and binary/categorical variables dealt with?
- If statistical interactions were included in the final analyses (item 21a), were they also included in imputation models?
- Was imputation done separately or by randomised group?
- How many imputed datasets were created?
- How were results from different imputed datasets combined?
Report the method used to handle missing data for the primary analysis (eg, complete case, multiple imputation) and the justification for the methods chosen, for all outcomes. Include whether or which auxiliary variables were collected and used.
Report the assumptions about the missing data mechanism (eg, missing at random) and methods used to conduct the missing data sensitivity analyses for all outcomes, and the justification for the assumptions and methods chosen.
Report how data that were truncated due to death or other causes were handled with a justification for the method(s) (if relevant).
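For the question of how results from different imputed datasets were combined, the usual answer is Rubin’s rules, as in the fourth example above. The sketch below shows the core calculation under stated assumptions: the function name and the input numbers are hypothetical, each imputed dataset is assumed to yield one estimate and one standard error, and the degrees of freedom needed for a confidence interval are omitted.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Combine per-imputation estimates and standard errors with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need at least two imputations with matching inputs")

    pooled_estimate = mean(estimates)

    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in standard_errors])

    # Between-imputation variance: sample variance of the estimates.
    between = variance(estimates)

    # Total variance adds the between-imputation part, inflated by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical treatment differences from three imputed datasets.
estimates = [1.0, 2.0, 3.0]
standard_errors = [0.5, 0.5, 0.5]

pooled_estimate, pooled_se = pool_rubin(estimates, standard_errors)
print(f"pooled estimate = {pooled_estimate:.2f}, pooled SE = {pooled_se:.2f}")
```

The pooled standard error is larger than any single-dataset standard error because the between-imputation variance carries the uncertainty due to the missing data.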

Results

Report the numbers and proportions of missing data in each trial arm.
Report the reasons for missing data in each trial arm.
Report a comparison of the characteristics of those with observed and missing data.
Report the primary analysis based on the primary assumption about the missing data mechanism, for all outcomes.
Report results of the missing data sensitivity analyses for all outcomes. As a minimum, a summary of the missing data sensitivity analyses should be reported in the main paper with the full results in the supplementary material.
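As one way to produce the numbers and proportions of missing data in each trial arm, the sketch below tallies a list of hypothetical (arm, outcome) records. The record layout, arm labels, and function name are assumptions made for illustration.

```python
from collections import defaultdict


def summarise_missing(records):
    """Count randomised participants and missing outcomes per trial arm.

    `records` is an iterable of (arm, outcome) pairs; an outcome of None
    means the outcome was not observed.
    """
    counts = defaultdict(lambda: {"randomised": 0, "missing": 0})
    for arm, outcome in records:
        counts[arm]["randomised"] += 1
        if outcome is None:
            counts[arm]["missing"] += 1
    return {
        arm: {**c, "proportion_missing": c["missing"] / c["randomised"]}
        for arm, c in counts.items()
    }


# Hypothetical trial records.
records = [
    ("intervention", 12.0), ("intervention", None),
    ("intervention", 9.5), ("intervention", 11.0),
    ("control", 8.0), ("control", None),
    ("control", None), ("control", 10.5),
]

summary = summarise_missing(records)
for arm, row in summary.items():
    print(f"{arm}: {row['missing']}/{row['randomised']} missing "
          f"({row['proportion_missing']:.0%})")
```

Reporting the denominator (number randomised) alongside each count lets readers see whether the extent of missing data differs between arms.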

Discussion

Discuss the impact of missing data on the interpretation of findings, considering both internal and external validity. For multiple imputation, include whether the variables included in the imputation model make the missing-at-random assumption plausible.

Training

The UK EQUATOR Centre runs training on how to write using reporting guidelines.

Discuss this item

Visit this item’s discussion page to ask questions and give feedback.

References

1.

Gaughran F, Stringer D, Wojewodka G, et al. Effect of vitamin D supplementation on outcomes in people with early psychosis: The DFEND randomized clinical trial. JAMA Network Open. 2021;4(12):e2140858. doi:10.1001/jamanetworkopen.2021.40858

2.

Palmer AJR, Ayyar Gupta V, Fernquest S, et al. Arthroscopic hip surgery compared with physiotherapy and activity modification for the treatment of symptomatic femoroacetabular impingement: Multicentre randomised controlled trial. BMJ. Published online February 2019:l185. doi:10.1136/bmj.l185

3.

Choudhry NK, Fifer S, Fontanet CP, et al. Effect of a biopsychosocial intervention or postural therapy on disability and health care spending among patients with acute and subacute spine pain: The SPINE CARE randomized clinical trial. JAMA. 2022;328(23):2334. doi:10.1001/jama.2022.22625

4.

Lewis GD, Voors AA, Cohen-Solal A, et al. Effect of omecamtiv mecarbil on exercise capacity in chronic heart failure with reduced ejection fraction: The METEORIC-HF randomized clinical trial. JAMA. 2022;328(3):259. doi:10.1001/jama.2022.11016

5.

Schlapbach LJ, Gibbons KS, Horton SB, et al. Effect of nitric oxide via cardiopulmonary bypass on ventilator-free days in young children undergoing congenital heart disease surgery: The NITRIC randomized clinical trial. JAMA. 2022;328(1):38. doi:10.1001/jama.2022.9376

6.

Akl EA, Shawwa K, Kahale LA, et al. Reporting missing participant data in randomised trials: Systematic survey of the methodological literature and a proposed guide. BMJ Open. 2015;5(12):e008431. doi:10.1136/bmjopen-2015-008431

7.

Wood AM, White IR, Thompson SG. Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clinical Trials. 2004;1(4):368-376. doi:10.1191/1740774504cn032oa

8.

Bell ML, Fiero M, Horton NJ, Hsu CH. Handling missing data in RCTs; a review of the top medical journals. BMC Medical Research Methodology. 2014;14(1). doi:10.1186/1471-2288-14-118

9.

Ibrahim F, Tom BDM, Scott DL, Prevost AT. A systematic review of randomised controlled trials in rheumatoid arthritis: The reporting and handling of missing data in composite outcomes. Trials. 2016;17(1). doi:10.1186/s13063-016-1402-5

10.

Joseph R, Sim J, Ogollah R, Lewis M. A systematic review finds variable use of the intention-to-treat principle in musculoskeletal randomized controlled trials with missing data. Journal of Clinical Epidemiology. 2015;68(1):15-24. doi:10.1016/j.jclinepi.2014.09.002

11.

Kahale LA, Diab B, Khamis AM, et al. Potentially missing data are considerably more frequent than definitely missing data: A methodological survey of 638 randomized controlled trials. Journal of Clinical Epidemiology. 2019;106:18-31. doi:10.1016/j.jclinepi.2018.10.001

12.

Kearney A, Rosala-Hallas A, Rainford N, et al. Increased transparency was required when reporting imputation of primary outcome data in clinical trials. Journal of Clinical Epidemiology. 2022;146:60-67. doi:10.1016/j.jclinepi.2022.02.008

13.

Khan NA, Torralba KD, Aslam F. Missing data in randomised controlled trials of rheumatoid arthritis drug therapy are substantial and handled inappropriately. RMD Open. 2021;7(2):e001708. doi:10.1136/rmdopen-2021-001708

14.

Tan PT, Cro S, Van Vogt E, Szigeti M, Cornelius VR. A review of the use of controlled multiple imputation in randomised controlled trials with missing outcome data. BMC Medical Research Methodology. 2021;21(1). doi:10.1186/s12874-021-01261-6

15.

Zhang Y, Flórez ID, Colunga Lozano LE, et al. A systematic survey on reporting and methods for handling missing participant data for continuous outcomes in randomized controlled trials. Journal of Clinical Epidemiology. 2017;88:57-66. doi:10.1016/j.jclinepi.2017.05.017

16.

Sullivan TR, Morris TP, Kahan BC, Cuthbert AR, Yelland LN. Categorisation of continuous covariates for stratified randomisation: How should we adjust? Statistics in Medicine. 2024;43(11):2083-2095. doi:10.1002/sim.10060

17.

Schafer JL. Multiple imputation: A primer. Statistical Methods in Medical Research. 1999;8(1):3-15. doi:10.1177/096228029900800102

18.

Lachin JM. Fallacies of last observation carried forward analyses. Clinical Trials. 2015;13(2):161-168. doi:10.1177/1740774515602688

19.

Molnar. 2009;3.

20.

Kenward MG, Molenberghs G. Last observation carried forward: A crystal ball? Journal of Biopharmaceutical Statistics. 2009;19(5):872-888. doi:10.1080/10543400903105406

21.

Morris TP, Kahan BC, White IR. Choosing sensitivity analyses for randomised trials: principles. BMC Medical Research Methodology. 2014;14(1). doi:10.1186/1471-2288-14-11

22.

Food and Drug Administration. E9 (R1) statistical principles for clinical trials: Addendum: Estimands and sensitivity analysis in clinical trials. Guidance for industry. May 2021.

23.

Hussain JA, White IR, Johnson MJ, et al. Development of guidelines to reduce, handle and report missing data in palliative care trials: A multi-stakeholder modified nominal group technique. Palliative Medicine. 2022;36(1):59-70. doi:10.1177/02692163211065597

24.

Sterne JAC, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ. 2009;338(jun29 1):b2393-b2393. doi:10.1136/bmj.b2393

Reuse

Most of the reporting guidelines and checklists on this website were originally published under permissive licenses that allowed their reuse. Some were published with proprietary licenses, where copyright is held by the publisher and/or original authors. The original content of the reporting checklists and explanation pages on this website was drawn from these publications with knowledge and permission from the reporting guideline authors, and subsequently revised in response to feedback and evidence from research as part of an ongoing scholarly dialogue about how best to disseminate reporting guidance. The UK EQUATOR Centre makes no copyright claims over reporting guideline content. Our use of copyrighted content on this website falls under fair use guidelines.

Citation

For attribution, please cite this work as:

Hopewell S, Chan AW, Collins GS, et al. CONSORT 2025 statement: updated guideline for reporting randomised trials. BMJ. 2025;389:e081123. doi:10.1136/bmj-2024-081123

21c. Missing Data

