FDA statistician outlines adaptive trial designs to address rare-disease evidence gaps

FDA Center for Drug Evaluation and Research — Office of Biostatistics Grand Rounds · February 5, 2026


AI-Generated Content: All content on this page was generated by AI to highlight key points from the meeting. For complete details and context, we recommend watching the full video.

Summary

At FDA Grand Rounds, Natalie Morris of the agency's Office of Biostatistics reviewed adaptive clinical-trial designs for rare diseases, presenting simulations that show an adaptive treatment-duration approach can preserve type I error and offer similar power to a fixed two-year trial in some settings.

Natalie Morris, a statistician in the Food and Drug Administration’s (FDA) Office of Biostatistics, presented a technical review of adaptive clinical-trial designs for rare-disease drug development at FDA Grand Rounds. She emphasized that many very rare conditions have tiny potential trial populations and uncertain natural histories, so trial designs must be carefully chosen and justified to generate reliable evidence.

Morris outlined three recurring challenges in rare-disease trials: severely limited sample size, substantial heterogeneity of disease presentation, and sparse natural-history data. "Statistical expertise, working in collaboration with clinicians and other disciplines, can really help ensure an appropriate study design and analysis choices are made to provide reliable and timely results from these trials," she said, adding the customary disclaimer that her talk reflected her views and should not be construed as FDA policy.

She reviewed design alternatives that can increase efficiency when patient numbers are small. Crossover trials let each participant contribute data on both study treatments, which increases within-patient information but raises concerns about correlated outcomes and washout periods. Platform or master-protocol trials can share a control arm across multiple products, improving efficiency when multiple candidates are evaluated concurrently.

To handle heterogeneous presentations and multiple potentially relevant outcomes, Morris described global hypothesis tests (for example, O’Brien-style rank-sum approaches) that combine information across endpoints and can be powerful when an effect on at least one endpoint is persuasive. She illustrated the approach with prior trials in MPS II (Hunter syndrome), including the randomized study of Elaprase, which combined a 6-minute-walk measure and lung-function metrics in a global test.
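The rank-sum idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in any trial mentioned here: each endpoint is ranked across all patients, the ranks are summed per patient, and the two arms are compared on those sums. The comparison below uses a permutation test, a stand-in chosen to keep the sketch dependency-free (O'Brien's procedure is usually described with a t-test or similar on the rank sums). The function names and toy data are hypothetical.

```python
import random
from statistics import mean

def midranks(values):
    """Rank values 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def obrien_scores(endpoints):
    """endpoints: one list per endpoint (higher = better), one value per patient.
    Returns each patient's sum of ranks across endpoints."""
    per_endpoint = [midranks(e) for e in endpoints]
    return [sum(r) for r in zip(*per_endpoint)]

def permutation_p_value(scores, treated, n_perm=5000, seed=0):
    """One-sided permutation p-value for a higher mean score in the treated arm."""
    rng = random.Random(seed)

    def diff(labels):
        t = [s for s, is_t in zip(scores, labels) if is_t]
        c = [s for s, is_t in zip(scores, labels) if not is_t]
        return mean(t) - mean(c)

    observed = diff(treated)
    labels = list(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign arm labels at random
        if diff(labels) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): two endpoints for eight patients, last four treated.
walk = [300, 310, 295, 305, 340, 350, 345, 360]
lung = [55, 60, 58, 57, 66, 64, 70, 68]
arm = [False] * 4 + [True] * 4
p = permutation_p_value(obrien_scores([walk, lung]), arm)
```

Because the test works on ranks, endpoints measured on different scales (meters walked, percent predicted lung function) can be combined without choosing weights or units.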

The core of Morris’s talk focused on adaptive designs and, in particular, an "adaptive treatment-duration" approach. Per the draft ICH guidance on adaptive design (E20), adaptive designs plan prospectively for modifications based on interim analyses. Adaptive-duration designs begin with a maximum planned duration (for example, two years) but include a prespecified interim analysis (for example, at one year) at which the trial can stop early if the evidence is persuasive; otherwise, follow-up continues to the planned duration.

Using simulations tailored to a representative rare-disease setting (roughly 40 randomized patients, 1:1 allocation, and a treatment effect that grows linearly up to two years and then plateaus), Morris reported that a fixed one-year trial could be substantially underpowered (about 33% in her example) while a two-year fixed trial reached roughly 88% power under the same assumptions. An adaptive-duration design with a one-year interim, adjusted using group-sequential-style alpha-spending rules, produced operating characteristics similar to a fixed two-year trial in that scenario.
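The fixed-duration comparison can be illustrated with a small simulation. The sketch below is not Morris's simulation code; it assumes 20 patients per arm, a normally distributed outcome with known standard deviation, a two-sided z-test, and a made-up standardized effect of 1.0 at two years that grows linearly from zero. All names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

def simulate_fixed_power(effect_at_2y, years, n_per_arm=20, sd=1.0,
                         alpha=0.05, n_sims=5000, seed=1):
    """Estimate the power of a fixed-duration trial analysed at `years`."""
    rng = random.Random(seed)
    # Assumption: the effect grows linearly up to two years, then plateaus.
    effect = effect_at_2y * min(years, 2.0) / 2.0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * (2.0 / n_per_arm) ** 0.5  # known-variance z-test, for simplicity
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims

power_1y = simulate_fixed_power(effect_at_2y=1.0, years=1.0)
power_2y = simulate_fixed_power(effect_at_2y=1.0, years=2.0)
type1_2y = simulate_fixed_power(effect_at_2y=0.0, years=2.0)  # null scenario
```

Under these inputs the one-year design is analysed when only half the eventual effect has accrued, so it rejects far less often than the two-year design. A real planning exercise would replace the z-test and the effect trajectory with the trial's actual endpoint model.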

Morris compared two common alpha-adjustment rules: O’Brien–Fleming, which spends little alpha at interim and preserves most for the final analysis, and Pocock, which spends alpha more evenly and thus has a higher chance of stopping early but typically reduces overall power. In her simulations the Pocock-like thresholds yielded a higher probability of stopping early (about 25% at a mid-level effect) versus about 9% for an O’Brien–Fleming threshold, illustrating the trade-off between power and early stopping.
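The trade-off between the two rules can be seen in a simulation at the level of the test statistics. The sketch below makes several assumptions that do not come from the talk: it uses the textbook critical values for a two-look, equal-information, two-sided 5% group-sequential design; it treats the interim and final z-statistics as bivariate normal with correlation `rho`; and it uses made-up expected z-values. In an adaptive-duration trial the correlation would depend on how strongly each patient's one-year and two-year outcomes are related, so `rho` should be treated as a tuning assumption.

```python
import random

# Textbook two-look critical values (interim, final), two-sided alpha = 0.05.
# Assumption: applying them to an adaptive-duration design is illustrative only.
THRESHOLDS = {
    "obrien_fleming": (2.797, 1.977),
    "pocock": (2.178, 2.178),
}

def simulate_two_look(theta_interim, theta_final, rule, rho=0.5 ** 0.5,
                      n_sims=20000, seed=2):
    """Return (P(stop early), P(reject at either look)) for one threshold rule.

    theta_* are the expected z-statistics at each look; rho is the assumed
    correlation between the interim and final z-statistics.
    """
    c1, c2 = THRESHOLDS[rule]
    rng = random.Random(seed)
    early = rejected = 0
    for _ in range(n_sims):
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        z1 = theta_interim + e1
        z2 = theta_final + rho * e1 + (1 - rho ** 2) ** 0.5 * e2
        if abs(z1) > c1:      # persuasive at the interim: stop early
            early += 1
            rejected += 1
        elif abs(z2) > c2:    # otherwise continue to the final analysis
            rejected += 1
    return early / n_sims, rejected / n_sims

# Hypothetical alternative: expected z of 1.58 at one year and 3.16 at two years.
obf_early, obf_power = simulate_two_look(1.58, 3.16, "obrien_fleming")
poc_early, poc_power = simulate_two_look(1.58, 3.16, "pocock")
# Null scenario: the rejection rate estimates the overall type I error.
_, obf_type1 = simulate_two_look(0.0, 0.0, "obrien_fleming")
_, poc_type1 = simulate_two_look(0.0, 0.0, "pocock")
```

With these inputs the Pocock thresholds stop at the interim more often, while the O'Brien–Fleming thresholds retain more overall power. Changing `rho` away from its default changes the null rejection rate, which is one reason type I error has to be checked by simulation for the specific trial.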

She cautioned that these were simulation results for a specific example and that type I error control and power should be verified by simulations tailored to the trial’s endpoints, correlation structure, enrollment rates and other context-specific features. Morris said the team found group-sequential and adaptive-duration approaches to have similar power in many settings, with group-sequential sometimes slightly better when it is feasible to implement.

Morris also stressed practical safeguards. Adaptations must be fully prespecified, independent data-monitoring committees should recommend adaptations, and interim analyses need sufficient safety and benefit–risk information to support any early decisions. She warned that adaptations increase complexity and can jeopardize trial integrity if knowledge of an adaptation leaks or if interim estimates are based on too little information.

"Because there's often many information gaps at the design phase in rare disease, it's best to choose just one—have one type of adaptation—to reduce that complexity of the trial," Morris said. She recommended consulting the ICH E20 draft guidance and FDA guidances on master protocols and adaptive design, and she urged sponsors to run context-specific simulations before implementing nonstandard designs.

The presentation closed with references to public guidances and to several colleagues who assisted with the research. Morris acknowledged that simulation results are context dependent and emphasized that statistical planning and prespecification are essential when adaptive methods are used in confirmatory rare-disease trials.