1 Introduction

1.1 Estimands and Missing Data

Missing data in clinical studies is a major concern because it can compromise the validity, reliability, and generalizability of the study’s findings. In clinical trials, data is collected at various points to monitor the effects of interventions, track patient progress, and assess outcomes. However, missing data can occur due to various reasons, and if not handled properly, it can lead to biased results and flawed conclusions.

Missing data may result from an intercurrent event (ICE) or simply from a missed assessment. An intercurrent event is an event that happens after treatment has started and affects either the collection of data or the interpretation of the treatment’s effect. For example, a patient might stop participating in the study due to adverse effects or an unrelated medical condition that prevents further follow-up. These events can complicate the interpretation of the data because they reflect disruptions that were not anticipated at the start of the trial. In contrast, a missed assessment refers to situations where data collection fails at a particular visit or time point. This could be due to logistical issues, scheduling conflicts, or patient non-compliance, and while it may not directly affect how the treatment’s effect is interpreted, it still results in incomplete data.

Baseline data in clinical trials is usually complete. This is the data collected at the start of the study before any intervention begins. Baseline measurements are critical because they provide the reference point for understanding how patients change over time. Since baseline data is essential for trial validity, efforts are made to ensure that this data is thoroughly collected from all participants. However, as the study progresses, subsequent visits may have various missed assessments. These missing data points are more likely to occur during follow-up visits or data collection stages later in the study, particularly in long-term trials or studies involving frequent visits or tests.

The potential impact of missing data on the interpretation of study findings is significant and needs to be carefully considered when designing the study protocol. If not managed properly, missing data can introduce bias into the study. This occurs when the missing data is not random but instead is related to specific factors that could affect the outcome. For example, if participants who drop out of the study tend to have worse health outcomes, then the study results may overestimate the treatment’s effectiveness. Another issue caused by missing data is the loss of statistical power, which happens when fewer data points are available for analysis. This makes it more difficult to detect real differences between treatment groups, increasing the likelihood of false conclusions.

At the time of protocol development, researchers need to plan how to handle missing data and potential missing outcomes. Several strategies can be applied, including efforts to prevent missing data by ensuring robust follow-up processes or implementing strategies like reminder systems for patients. Additionally, statistical techniques such as imputation can be used to estimate missing values, or more advanced models that account for missing data can be employed. Sensitivity analysis, which tests how different assumptions about the missing data might affect the study’s results, is another important consideration. This ensures that the final conclusions are robust to various scenarios of missing data.

One of the first steps in managing missing data is to identify the cause of the missing data. This can involve understanding whether the missing data is Missing Completely at Random (MCAR), Missing at Random (MAR), or Missing Not at Random (MNAR). If data is MCAR, it means that the missing data is unrelated to any of the variables in the study, making the issue less concerning because it doesn’t introduce bias. MAR means that the missing data is related to observed variables, such as younger patients being more likely to miss follow-up visits. MNAR means that the missing data is related to unobserved data, such as patients with worse outcomes dropping out of the trial.
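
As a minimal illustration (all variable names, rates, and coefficients below are hypothetical), the following Python sketch simulates the three mechanisms and shows how the mean of the observed outcomes drifts away from the true mean under MAR and MNAR but not under MCAR:

```python
# Hypothetical simulation of MCAR, MAR, and MNAR missingness mechanisms.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

age = rng.normal(60, 10, n)                # observed covariate
outcome = 0.5 * age + rng.normal(0, 5, n)  # follow-up outcome (higher = worse)

# MCAR: missingness is unrelated to any variable (flat 20% rate).
mcar = rng.random(n) < 0.20

# MAR: younger patients are more likely to miss the visit,
# so missingness depends only on the observed covariate (age).
mar = rng.random(n) < 1 / (1 + np.exp((age - 55) / 5))

# MNAR: patients with worse outcomes drop out,
# so missingness depends on the unobserved value itself.
mnar = rng.random(n) < 1 / (1 + np.exp(-(outcome - outcome.mean()) / 5))

print(f"true mean outcome: {outcome.mean():.2f}")
for label, mask in [("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(f"{label}: mean of observed outcomes = {outcome[~mask].mean():.2f}")
```

Only the MCAR mean stays close to the truth; under MAR the bias can be removed by conditioning on the observed covariate (age), while under MNAR it cannot be removed from the observed data alone.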

When designing a study, it is important to clearly define the research questions of interest and how missing data may affect these questions. In clinical studies, estimands are used to define the quantity being estimated and answer the research questions. Estimands provide a framework that helps define the effect of the treatment or intervention under study. Handling missing data will depend on how the estimands are defined. For example, if the goal is to estimate the effect of the treatment assuming full adherence, missing data from participants who discontinue treatment could pose a problem and would need to be appropriately accounted for. Alternatively, if the estimand reflects real-world use of the treatment, data from participants who drop out might be considered part of the natural course of the study and handled differently.

In conclusion, missing data is a complex issue that requires careful planning and consideration in clinical studies. The source of missing data, whether due to intercurrent events or missed assessments, needs to be clearly understood, and strategies for handling missing data should be built into the study design from the beginning. Identifying the causes of missing data, understanding the impact on study results, and aligning with the research objectives and estimands are essential for ensuring valid and interpretable findings in clinical research.

1.2 Estimand, Estimators and Estimation

1. Trial Objective to Estimand Flow
  • Trial Objective: This is the primary goal or question that the trial seeks to answer. It is the starting point for defining what the trial will focus on.
  • Estimand: A precise definition of what is to be estimated in the trial. It translates the trial objective into a statistically quantifiable entity. It is essentially what the trial aims to measure or the specific effect that the trial intends to estimate.
  • Main Estimator: The statistical methodology or approach used to estimate the estimand. This could be a specific statistical model or analysis technique.
  • Main Estimate: The actual numerical estimate obtained from applying the main estimator to the trial data. This is the result that addresses the trial objective.
  • Sensitivity Estimators and Estimates: Additional analyses conducted to assess the robustness of the main estimate against different assumptions or conditions. These help validate the reliability of the findings under various scenarios.

2. Description of an Estimand: Attributes
  • Treatment: Identification of both the treatment of interest and the comparator (control treatment). It specifies what is being tested and against what it is being compared.
  • Variable: The primary measure or outcome variable that the study aims to evaluate. This could be a clinical metric, patient-reported outcome, or a biomarker.
  • Population: The specific group of patients targeted by the scientific question. This defines who the trial results will apply to.
  • Population-Level Summary: How the results are summarized across the entire study population, such as differences between means, ratios of proportions, or hazard ratios. This summarizes the treatment effect at the group level.

3. Intercurrent Events and Strategies to Address Them
  • Intercurrent events are occurrences during the trial that could affect the interpretation of the treatment effect, such as participants taking additional medication. Five strategies address them:
    • Treatment Policy: Considers the effect of the treatment as it is used in practice, including any additional medications taken.
    • Composite: Defines the endpoint to include whether or not intercurrent events occur, considering events like additional medication intake as part of the treatment assessment.
    • Hypothetical: Estimates what the treatment effect would have been if the intercurrent event had not occurred.
    • Principal Stratum: Focuses on a subset of participants who would not experience the intercurrent event, assessing the effect in this more controlled scenario.
    • While on Treatment: Looks at the treatment effect only before any intercurrent event occurs, isolating the effect of the treatment itself.

1.3 Causal Estimands

Causal estimands are a critical concept in the field of statistics, particularly when it comes to understanding the effect of interventions in clinical trials and observational studies. They are designed to estimate the impact of a treatment by considering what the outcome would have been under different treatment conditions.

  1. Concept of Causal Estimands

Causal estimands are aimed at answering “what if” questions in a formal, quantitative way. They focus on understanding the effect of a specific treatment by asking how the outcomes would differ if the treatment were applied versus not applied, or if an alternative treatment were used. This approach aligns with causal inference, which seeks to infer the cause-and-effect relationship from data.

  2. Framework of Potential Outcomes

The potential outcomes framework is fundamental to causal inference and was originally formalized by Donald Rubin. It considers every subject in a study to have potential outcomes under each treatment condition. For example:
  • \(Y(1)\): Outcome if the subject receives the treatment.
  • \(Y(0)\): Outcome if the subject does not receive the treatment.

These potential outcomes help to define the causal effect of the treatment, which cannot be observed directly since we can only observe one of these outcomes for each individual — the one corresponding to the treatment they actually received.

  3. Causal Estimand Formula

The basic causal estimand in a randomized controlled trial (RCT) can be expressed as

\(E(Y(1)) - E(Y(0))\)

This represents the expected difference in outcomes between subjects assigned to the treatment versus those assigned to the control. This difference is what statisticians aim to estimate through the trial.
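
To see why randomization makes this estimand estimable, consider the following simulated sketch (all numbers are hypothetical): both potential outcomes are generated for every subject, the true estimand is computed directly, and the difference in observed group means recovers it even though each subject reveals only one outcome.

```python
# Hypothetical potential-outcomes simulation for an RCT.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Each subject has two potential outcomes; only one is ever observed.
y0 = rng.normal(8.0, 0.8, n)            # e.g., HbA1c without treatment
y1 = y0 - 0.6 + rng.normal(0, 0.3, n)   # treatment lowers HbA1c by 0.6 on average

true_estimand = (y1 - y0).mean()        # E(Y(1)) - E(Y(0))

# Randomization decides which potential outcome is revealed.
z = rng.integers(0, 2, n)
y_obs = np.where(z == 1, y1, y0)

# Difference in observed group means estimates the causal estimand.
estimate = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(f"true estimand = {true_estimand:.3f}, estimate = {estimate:.3f}")
```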

  4. Challenges in Observational Studies

In observational studies, where treatments are not randomly assigned, estimating causal effects becomes more complex due to potential confounding factors. Here, additional models and assumptions about how treatments are assigned to patients (assignment models) and how outcomes are generated (outcome models) are necessary. These models help to adjust for factors that may influence both the treatment assignment and the outcomes.

  5. International Council for Harmonization (ICH) and Causal Estimands

The ICH guidelines emphasize the importance of causal estimands in clinical trials, suggesting that the trials should be designed to answer specific causal questions. Even though the term “causal” is not explicitly used, the guidelines align with causal reasoning principles to ensure that the results of clinical trials are robust and interpretable in terms of causal effects.

  6. Statistical Inference for Causal Estimands

Statistical methods are employed to estimate causal estimands from the observed data. In RCTs, this often involves comparing the observed outcomes between the treatment and control groups, leveraging the randomization to argue that these groups are comparable. In non-randomized studies, more sophisticated statistical techniques, such as instrumental variable analysis, propensity score matching, or structural models, are required.
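
As a minimal sketch of one such technique (inverse probability weighting with a propensity score model; the data-generating process and all coefficients are hypothetical), compare a naive difference in means with a weighted estimate when a confounder drives both treatment assignment and outcome:

```python
# Hypothetical observational data: confounded assignment, IPW correction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 50_000

x = rng.normal(0, 1, n)                          # confounder, e.g., disease severity
z = rng.binomial(1, 1 / (1 + np.exp(-1.5 * x)))  # sicker patients treated more often
y = 1.0 * z + 2.0 * x + rng.normal(0, 1, n)      # true treatment effect is 1.0

naive = y[z == 1].mean() - y[z == 0].mean()      # confounded comparison

# Assignment model: propensity scores estimated from the observed confounder.
ps = LogisticRegression().fit(x.reshape(-1, 1), z).predict_proba(x.reshape(-1, 1))[:, 1]

# Inverse probability weighting estimate of E(Y(1)) - E(Y(0)).
ipw = np.mean(z * y / ps) - np.mean((1 - z) * y / (1 - ps))
print(f"naive = {naive:.2f}, IPW = {ipw:.2f} (truth = 1.0)")
```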

  7. Importance for Regulatory Authorities

Regulatory authorities, such as the FDA or EMA, are particularly interested in causal estimands because they provide a clear basis for regulatory decisions regarding drug approvals. By focusing on causal estimands, regulators can better understand the true effect of a drug, independent of other confounding treatments or patient characteristics.

1.4 Missing Data versus Intercurrent Events

Intercurrent events are incidents that occur after the initiation of treatment and may impact the interpretation of the trial outcomes or affect the continuity of the measurements related to the clinical question of interest. These can include events such as patients starting additional therapies, experiencing side effects leading to treatment discontinuation, or any other circumstances that alter the course of standard treatment administration.

Missing data refers to information that was intended to be collected but was not, due to various reasons such as patients dropping out of the study, missing visits, or failure to record certain outcomes. It’s important to distinguish between data that is missing because it was not collected (but could have been under different circumstances) and data that is considered not meaningful due to an intercurrent event.

Handling Strategies for Intercurrent Events and Missing Data

1. Treatment Policy Strategy:
  • Approach: Uses all data as collected, up to and beyond the intercurrent event, considering the measurements of interest regardless of subsequent therapies or changes.
  • Missing Data Issue: A missing data problem arises when values must be assumed for unobserved outcomes based on the observed data, for example when patients are lost to follow-up and are assumed to continue on their projected path without treatment changes.

2. Hypothetical Strategy:
  • Approach: Assumes a scenario in which the intercurrent event, such as treatment discontinuation, does not occur.
  • Missing Data Issue: The data of interest are partly unobserved by construction. For instance, the analysis must consider what the follow-up measurements would have been had the patient not been lost to follow-up, imagining the patient remained in the trial under the initial treatment conditions.

3. Composite Strategy:
  • Approach: Combines multiple elements or outcomes into a single variable that incorporates the intercurrent event as part of the variable of interest.
  • Missing Data Issue: Typically, there is no missing data concern under this strategy, as the intercurrent events are accounted for within the composite outcome measure.

4. While-on-Treatment Strategy:
  • Approach: Analyzes the response to treatment only up to the point of an intercurrent event.
  • Missing Data Issue: There is generally no missing data, because the analysis only includes data collected while the patients were on treatment and before any intercurrent event.

5. Principal Stratum Strategy:
  • Approach: Focuses on specific subgroups (strata) that are not affected by the intercurrent events, based on their potential outcomes under different treatment scenarios.
  • Missing Data Issue: This strategy avoids missing data issues by defining the population such that the intercurrent event is not considered relevant for the stratum of interest. It inherently excludes patients from the analysis if they are outside the target strata.

1.5 Sensitivity versus Supplementary Analysis

Sensitivity Analysis

  • Purpose: Sensitivity analysis is performed to assess the robustness of the conclusions derived from the main analysis of a clinical trial. It involves testing how the main findings hold up under various assumptions or variations in the analysis model.
  • Process: This type of analysis typically involves using multiple sensitivity estimators that deviate in specific ways from the main estimator’s assumptions. For instance, sensitivity analyses may involve changing assumptions about the distribution of the data, the model used, or the handling of missing data (see the delta-adjustment sketch after this list).
  • Objective: The goal is to explore the extent to which the main findings are dependent on the assumptions made in the statistical modeling. This is crucial for verifying that the conclusions are not unduly influenced by these assumptions and therefore can be considered reliable under a variety of scenarios.
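
One widely used family of sensitivity analyses for missing data is delta adjustment (the “tipping point” approach): impute the missing outcomes under the main assumption, then shift the imputed values by increasingly unfavorable penalties delta until the conclusion would change. The sketch below is a deliberately simplified stand-in, using single mean imputation instead of full multiple imputation; all data and parameter values are hypothetical.

```python
# Hypothetical delta-adjustment (tipping point) sensitivity sketch.
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Change from baseline (lower is better); 25% of treated outcomes are missing.
change = np.r_[rng.normal(-0.8, 1.0, n), rng.normal(-0.2, 1.0, n)]
arm = np.r_[np.ones(n), np.zeros(n)]                 # 1 = treated, 0 = control
missing = (arm == 1) & (rng.random(2 * n) < 0.25)

for delta in [0.0, 0.2, 0.4, 0.6]:
    y = change.copy()
    # Impute missing treated values from observed treated patients (MAR-like),
    # then add delta to assume the missing patients did worse than observed.
    y[missing] = y[(arm == 1) & ~missing].mean() + delta
    effect = y[arm == 1].mean() - y[arm == 0].mean()
    print(f"delta = {delta:.1f}: estimated effect = {effect:.2f}")
```

The delta at which the estimated effect stops being clinically or statistically meaningful is the tipping point; the further it lies beyond plausible values, the more robust the main conclusion.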

Supplementary Analysis

  • Purpose: Supplementary analysis goes beyond sensitivity analysis to further explore and understand the treatment effects. These analyses are usually more exploratory in nature and are often conducted to address additional research questions or hypotheses that were not the primary focus of the main analysis.
  • Process: This could include additional analyses as requested by regulatory authorities, or analyses planned after reviewing initial findings to probe deeper into specific areas of interest. Supplementary analyses may investigate different subgroups of patients, additional endpoints, or longer-term outcomes that were not part of the original estimand.
  • Objective: The main aim is to provide a broader understanding of the data and treatment effects. This might involve confirming the findings from the main analysis, exploring areas where the main analysis was inconclusive, or generating new insights that could lead to further research questions.

Key Differences and Interplay

  • Scope: Sensitivity analysis is more focused on testing the stability and reliability of the results under different assumptions directly linked to the main outcome of interest defined by the estimand. In contrast, supplementary analysis often has a broader scope, potentially addressing new or secondary questions that extend beyond the original estimand.
  • Outcome Dependency: Sensitivity analyses are inherently tied to the outcomes of the main estimator and focus on the dependencies and variabilities around these outcomes. Supplementary analyses, however, might explore entirely new outcomes or expand on the findings of the main analysis in ways that provide additional context or insights.
  • Regulatory Impact: Sensitivity analyses are critical for regulatory review, providing evidence that the study findings are robust and not unduly influenced by specific assumptions. Supplementary analyses, while informative, may not always be crucial for regulatory approval but can be important for labeling, post-marketing commitments, or future clinical development strategies.

1.6 Disease-Specific Guidelines

The European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA) include estimands in their disease-specific guidance documents.

  1. EMA Guidance:
    • Diseases Covered: The EMA has incorporated the use of estimands in the guidelines for several conditions and areas, including:
      • Diabetes
      • Alzheimer’s disease
      • Acute kidney injury
      • Chronic non-infectious liver diseases
      • Epileptic disorders
      • Medicinal products with genetically modified cells
      • Registry-based studies
  2. FDA Guidance:
    • Diseases Covered: The FDA also specifies the use of estimands in their guidance for diseases such as:
      • Eosinophilic esophagitis
      • Acute myeloid leukemia
      • Chronic rhinosinusitis with nasal polyps

2 Five Strategies

2.1 Treatment Policy Strategy Explained

Introduction

Treatment Policy Strategy Context
  • Basic Concept: The treatment policy strategy incorporates all aspects of a treatment regimen, including the main drug (Drug X in this scenario) and any additional rescue medications taken as required. This approach assesses the overall effectiveness of the treatment strategy, rather than just the isolated effect of the drug.
  • Inclusion of Rescue Medication: In clinical trials, rescue medication is often allowed to manage symptoms or adverse effects that are not adequately controlled by the study drug alone. Including rescue medication in the ‘Treatment’ attribute acknowledges its role as part of the therapeutic regimen, providing a more comprehensive evaluation of the treatment efficacy.

Protect the randomization arm

The treatment policy strategy is a framework used in clinical trials and medical research to handle intercurrent events. These events are occurrences that happen during a study that may affect the outcome or treatment of participants but are not part of the planned treatment protocol. A treatment policy strategy takes a pragmatic approach to dealing with these intercurrent events.

Variable Value Usage Regardless of Intercurrent Events

In a treatment policy strategy:
  • The value of the variable of interest (such as a clinical outcome like blood sugar level) is used regardless of whether or not the intercurrent event occurs.
  • This approach considers the occurrence of unexpected events, such as the need for additional medication during the study, as part of the overall treatment strategy. The effect of the treatment is measured as if these events are integral to the treatment regimen.

Intercurrent Event as Part of the Treatment Strategy
  • Intercurrent events are treated as part of the overall treatment plan. This approach contrasts with other strategies that might censor or exclude data if an intercurrent event occurs.
  • By incorporating the intercurrent event into the treatment strategy, the analysis aims to reflect real-world treatment conditions where patients might experience various changes in their treatment paths.

Example: Type 2 Diabetes Study

Treatments
  • Test Treatment with Rescue Medication: Patients receive a test medication for diabetes. If their blood sugar levels rise too high during the trial, they are given additional rescue medication to manage these levels.
  • Placebo with Rescue Medication: Similarly, the control group receives a placebo but can also use rescue medication as needed.

Clinical Question
  • The research question could be: What is the effect of the test treatment (with rescue medication as needed) compared to placebo (with rescue medication as needed) on HbA1c change from baseline to Week 26?
  • HbA1c is a marker that reflects a person’s average blood glucose levels over the past 2-3 months. The study aims to measure how this marker changes after 26 weeks of treatment.

Key Aspects of the Strategy in This Example
  • Rescue medication is considered an intercurrent event. Patients receive this medication if their blood sugar rises significantly during the study.
  • Under the treatment policy strategy, the impact of rescue medication is neither excluded nor ignored. Instead, it is integrated into the treatment strategy because:
    • In real-world scenarios, patients often need additional medication to control their blood sugar.
    • The clinical question seeks to measure the overall effectiveness of the test treatment, including how often patients require rescue medication and its combined effect on their HbA1c levels.

This strategy assesses how the test treatment (with the option for rescue medication) compares to placebo (with the option for rescue medication) in a real-world setting. It acknowledges that not every patient responds the same way to treatment and that intercurrent events like needing additional medication can occur.

Summary of the Treatment Policy Strategy
  • Intercurrent events, such as the use of rescue medication, are not disregarded but are incorporated into the analysis.
  • This strategy mirrors real-world treatment scenarios, aiming to address practical clinical questions regarding the overall effectiveness of the treatment, including its performance amidst unplanned events like the need for additional medications.

This comprehensive approach provides a holistic view of the treatment’s impact over time, considering the effect of the treatment alongside any additional interventions.
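
A minimal analysis sketch may help fix ideas. Assuming simulated data in the spirit of the example above (all parameter values are hypothetical), a treatment policy analysis simply compares the arms as randomized, with post-rescue HbA1c values kept in the dataset:

```python
# Hypothetical treatment policy analysis: rescue use stays in the data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 300  # patients per arm

df = pd.DataFrame({
    "arm": np.r_[np.ones(n), np.zeros(n)],    # 1 = test treatment, 0 = placebo
    "baseline": rng.normal(8.5, 0.8, 2 * n),  # baseline HbA1c
})
# Rescue medication is more common on placebo, and it also lowers HbA1c.
rescue = rng.binomial(1, np.where(df["arm"] == 1, 0.15, 0.35))
df["change"] = (-0.6 * df["arm"] - 0.4 * rescue
                + 0.1 * (df["baseline"] - 8.5) + rng.normal(0, 0.7, 2 * n))

# Analyze Week 26 change as randomized, regardless of rescue medication use.
fit = smf.ols("change ~ arm + baseline", data=df).fit()
print(f"treatment policy estimate: {fit.params['arm']:.2f}")
```

Because rescue use is part of the compared strategies, no observation is removed or altered; the estimate reflects test treatment plus rescue as needed versus placebo plus rescue as needed.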

Remark

Importance of Complete Follow-Up
  • Data Collection: It is crucial to follow all participants until the end of the trial and systematically collect data to ensure that the treatment effects are accurately captured. This comprehensive data collection supports robust statistical analysis and helps in addressing any potential biases caused by dropouts or missing data.

Alignment with Intention-to-Treat Principle
  • Definition: The intention-to-treat (ITT) principle involves analyzing data from all randomized participants based on the treatment they were initially assigned, regardless of whether they completed the treatment as planned.
  • Relevance: The treatment policy strategy aligns with the ITT principle by including all participants’ data in the analysis, regardless of deviations from the protocol such as the use of rescue medication. This method provides a realistic picture of the drug’s effectiveness in practical settings.

Regulatory Considerations
  • Acceptance: Treatment policy strategies are widely accepted in regulatory decision-making because they reflect the treatment’s effectiveness under typical clinical conditions. However, the appropriateness of this strategy can vary depending on the specific regulatory environment and study design.
  • Impact on Drug Labeling: When a treatment policy strategy is used, it may influence how the treatment is described in drug labels. The labels might need to indicate that the effectiveness data includes the use of rescue medications, which could affect the drug’s perceived efficacy.

Critical Questions for Implementation
  1. Influence of Rescue Medications: It is essential to consider whether and how rescue medications might alter the perceived effectiveness of the primary treatment. Their use could mask or enhance the effects of the study drug, complicating the interpretation of results.
  2. Relevance in Placebo-Controlled Studies: In placebo-controlled studies, where the control group receives a placebo instead of an active treatment, assessing the impact of rescue medication becomes even more critical. It is important to determine whether the use of rescue medication is balanced across study arms and how it might affect the comparability of the treatment effects.

Presentation and Discussion of Limitations
  • Transparency: Results derived from a treatment policy strategy should be presented with a thorough discussion of any limitations related to trial design, data collection, or statistical analysis. This includes acknowledging the potential confounding effects of rescue medications and other protocol deviations.

2.2 Composite Strategy Explained

Introduction

Integration into Trial Endpoint:
  • Endpoint Definition: In a scenario where rescue intake is deemed undesirable, the trial’s endpoint might include not only the primary measure (e.g., HbA1c level) but also whether rescue medication was used. For example, a responder endpoint might be defined as a patient achieving an HbA1c level ≤ 7% at 24 weeks without the use of any rescue medication.
  • Outcome Assessment: This approach allows for a clearer assessment of the drug’s efficacy in controlling the condition without supplementary intervention. It distinguishes between those who meet the target solely through the use of the study drug and those who require additional medication, categorizing the latter group as non-responders.

Example Scenario with Patient Data:
  • Patients’ Outcomes: In the hypothetical patient data, patients are evaluated based on their HbA1c change from baseline at Week 24, and their responder status is determined by whether they achieved the target without rescue medication.
    • Patient 1 and Patient 6 are considered responders, as they achieve HbA1c ≤ 7% without rescue medication.
    • Patient 2, despite achieving an HbA1c of 6.9%, is not considered a responder because rescue medication was used.
    • Patient 3, Patient 4, and Patient 5 are non-responders, either due to higher HbA1c values or the intake of rescue medication.

In the composite strategy, the occurrence of an intercurrent event (such as the need for additional medication or other unexpected outcomes) is seen as informative about the patient’s overall outcome. This strategy incorporates the intercurrent event directly into the clinical outcome or variable being studied, effectively altering the way the variable is analyzed.

Intercurrent Event as Informative Unlike the treatment policy strategy where the intercurrent event is treated as part of the overall treatment process but not directly tied to the outcome variable, the composite strategy views the intercurrent event as providing critical information about the patient’s outcome. If a patient experiences an intercurrent event, such as needing rescue medication, it affects how the treatment outcome is interpreted.

Altering the Outcome Variable The key difference in this strategy is that the outcome variable is modified based on the occurrence of the intercurrent event. The variable’s definition is changed to reflect both the intended treatment outcome and the intercurrent event. For instance, instead of just looking at whether the patient’s blood sugar improved, the outcome might also account for whether the patient needed rescue medication, combining these into a single variable that captures more information about the treatment’s overall success.

Example: Type 2 Diabetes Study

In this case, the intercurrent event is the use of rescue medication, and it provides significant information about the patient’s response to the treatment. The strategy will incorporate this into the analysis of the treatment effect.

Defining a Composite Outcome

One way to apply a composite strategy is by dichotomizing (splitting the outcome into two categories) the variable based on both HbA1c improvement and whether or not rescue medication was needed:

  • Responder: A patient who:
    • Achieves an improvement of x% in HbA1c from baseline to Week 26 AND
    • Does not need to take rescue medication at any point during the study.
  • Non-Responder: A patient who:
    • Fails to achieve the target HbA1c improvement OR
    • Needs rescue medication at any time during the study.

By classifying patients as “responders” or “non-responders” based on these two criteria, the intercurrent event (needing rescue medication) is considered to be part of the outcome rather than just something that happened along the way.
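
Deriving such a composite endpoint from patient-level data is mechanically simple, as the following sketch shows for six hypothetical patients; the 1.0-point improvement threshold stands in for the unspecified x%:

```python
# Hypothetical derivation of a composite responder endpoint.
import pandas as pd

df = pd.DataFrame({
    "patient":      [1, 2, 3, 4, 5, 6],
    "hba1c_change": [-1.4, -1.2, -0.3, -1.1, -0.2, -1.6],  # baseline to Week 26
    "used_rescue":  [False, True, False, True, True, False],
})

threshold = -1.0  # required HbA1c improvement (stand-in for x%)

# Responder = sufficient HbA1c improvement AND no rescue medication use.
df["responder"] = (df["hba1c_change"] <= threshold) & ~df["used_rescue"]
print(df)
print(f"responder rate: {df['responder'].mean():.2f}")
```

Note that there is no missing data step here: a patient who takes rescue medication is simply a non-responder, which is precisely why the composite strategy sidesteps the missing data problem.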

Clinical Question

In this example, the clinical question might be framed as:

What is the effect of the test treatment vs placebo on achieving an improvement of x% in HbA1c from baseline to Week 26, without the need for rescue medication at any point during the study?

This question looks at two components:
  • Improvement in HbA1c: How effective is the treatment at lowering blood sugar levels over time?
  • Use of Rescue Medication: How effective is the treatment in maintaining blood sugar control without requiring additional medications?

How This Composite Strategy Works
  • Combination of Variables: The outcome variable now combines two aspects:
    1. Biochemical improvement (HbA1c reduction),
    2. Clinical management success (no need for additional medications).

By creating a composite outcome, this strategy provides a more holistic measure of the treatment’s effectiveness. It answers the clinical question of whether the treatment not only improves blood sugar levels but also maintains stability without needing extra intervention.

Key Points of the Composite Strategy
  • Informative Event: The intercurrent event (such as using rescue medication) tells us something critical about the patient’s health and treatment success.
  • Modified Variable: The variable used to measure success is changed to incorporate the intercurrent event, providing a richer picture of the treatment’s overall effect.
  • Dichotomization: The composite variable can be dichotomized (e.g., responder vs non-responder) based on criteria that include the intercurrent event.

Summary of the Composite Strategy
  • In the composite strategy, the intercurrent event (like needing rescue medication) is considered to provide meaningful information about the outcome and is integrated into the analysis.
  • The clinical outcome variable is modified to reflect both the primary outcome (e.g., HbA1c improvement) and the occurrence of the intercurrent event, giving a more comprehensive assessment of the treatment’s effectiveness.
  • In the Type 2 diabetes example, the outcome could be dichotomized into responders (those who both improve and do not need rescue medication) and non-responders (those who either fail to improve or need rescue medication).

This approach helps to evaluate the treatment’s success from a broader, more inclusive perspective.

Remark

  • Implications of the Strategy:
    • Outcome Categorization: Using rescue medication reclassifies a patient’s response status, directly impacting the overall assessment of treatment efficacy.
    • Variable and Estimand Definition: This method necessitates clear definitions in the trial protocol and analyses, affecting how outcomes are reported and interpreted.
    • Drug Labeling: Implementing this strategy can influence drug labeling, as the labels must clearly state that efficacy assessments exclude cases where rescue medication was used.
  • Broad Application: While the example focuses on a dichotomous outcome, composite strategies can also be applied to continuous outcomes by defining specific thresholds or criteria that categorize continuous scores into binary outcomes, thus creating different estimands based on how the data are dichotomized.

2.3 Hypothetical Strategies Explained

Introduction

Hypothetical Strategy Overview:
  • Scenario Definition: The hypothetical strategy involves defining a scenario in which no rescue medication is used throughout the trial. It aims to isolate the pure effect of the treatment by assuming an idealized course of treatment.
  • Data Handling: In this strategy, any data collected after the administration of rescue medication is disregarded, as the analysis focuses solely on data unaffected by the rescue intervention.
  • Modeling Approach: To estimate what the outcomes would have been if rescue medications had not been administered, sophisticated statistical modeling techniques are employed. These models extrapolate the response trajectory of patients as if they had continued without any additional intervention.

A hypothetical strategy involves imagining a scenario where certain events, such as intercurrent events, did not occur and estimating what the treatment effect would have been in that scenario. This strategy is frequently used in clinical trials to explore “what if” questions. It can be particularly useful for understanding what the treatment’s impact might have been under different conditions, but it also comes with challenges related to feasibility, clinical plausibility, and interpretation.

What is a Hypothetical Strategy?

A hypothetical strategy asks the question: What would the treatment effect be in a scenario where some intercurrent event (such as taking rescue medication) did not happen? This creates an alternative reality where an event like rescue medication is imagined to have not occurred or where patients behave in a particular way (e.g., adhering strictly to treatment). The goal is to estimate the treatment effect under this hypothetical condition, which often requires assumptions about what would have happened. For instance, if analyzing the effect of a diabetes drug on blood sugar levels (HbA1c) and some patients took rescue medication during the study, a hypothetical strategy might ask: What would the treatment effect have been if rescue medication was never made available?

Key Questions When Considering Hypothetical Scenarios

Hypothetical strategies come with various considerations:
  • What would the treatment effect be if patients had not taken rescue medication? This is a common hypothetical scenario but requires careful thought. Is it plausible to imagine patients not having access to rescue medication in real-world settings?
  • What would the treatment effect be if rescue medication had not been made available? This could be more plausible in placebo-controlled trials, where we want to know the true effect of the test drug without rescue medication being used.
  • What would the treatment effect be if patients had not needed rescue medication due to lack of efficacy? This scenario is typically less relevant because it assumes that patients simply would not need rescue medication, which is not something that can be controlled or predicted in practice.

The Challenge of Precision in Defining Hypothetical Scenarios

A major challenge with hypothetical strategies is that they can introduce ambiguity. There can be a broad range of hypothetical scenarios, each with different clinical relevance, so it is essential to define them precisely. Speaking of “THE hypothetical strategy” leaves too much room for vagueness. Hypothetical strategies need to be carefully formulated to ensure they make sense in a clinical context. The strategy must define a clear hypothetical condition and outline exactly how the intercurrent event is imagined not to occur.

Examples of Hypothetical Scenarios:
  • Relevant Hypotheticals: In placebo-controlled trials, it might be reasonable to ask what the effect of the treatment would be if rescue medication was not made available. This scenario is plausible because it reflects a condition where additional intervention is restricted.
  • Irrelevant Hypotheticals: A scenario where patients fully adhere to treatment despite serious adverse events is unlikely to be useful. It is not realistic to assume that patients would always stick to treatment in real-world clinical settings, so such a scenario lacks clinical relevance.

When is a Hypothetical Scenario Relevant?

There are certain considerations that help determine whether a hypothetical strategy is relevant:
  • Can You Intervene on the Intercurrent Event? If an intervention can be made on the intercurrent event (such as limiting the use of rescue medication), then a hypothetical scenario can be useful. If an event is uncontrollable (such as how patients react to treatment), then a hypothetical strategy may not be of interest because it is too far removed from reality.
  • Clinical Plausibility: The clinical plausibility of the hypothetical scenario is key. If the scenario cannot plausibly happen in clinical practice (like patients always adhering to treatment despite severe side effects), then it may not offer any meaningful insights. Relevant scenarios are usually those that change the study design or treatment options, while changing patient behaviors tends to lead to irrelevant scenarios.

Case Study: Type 2 Diabetes Example

In the context of a Type 2 diabetes trial, a hypothetical scenario might be: What if rescue medication had not been made available?

Clinical Question: What is the effect of test treatment vs placebo on HbA1c change from baseline to Week 26 as if rescue medication had not been made available?

In this case, the hypothetical scenario assumes that no rescue medication was provided, meaning the analysis would estimate the treatment’s effect on blood sugar control (HbA1c) without any additional medications to influence the results.

Regulatory Guidance: The European Medicines Agency’s (EMA) 2024 guidance suggests using a hypothetical strategy for rescue medication in type 2 diabetes studies. Specifically, it recommends analyzing data under the assumption that rescue medication or other medications influencing HbA1c values were not introduced.

Estimation Challenges

Estimating estimands (the treatment effect or outcome of interest) with a hypothetical strategy often requires missing data or causal inference methods. Since the scenario is hypothetical, and patients did receive rescue medication, there is a need for advanced statistical techniques to estimate what would have happened if they had not. Methods such as causal inference and handling missing data (from patients who took rescue medication) play a role in these estimations.
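
As a deliberately oversimplified illustration of this estimation problem (a real analysis would use multiple imputation or formal causal models; all data here are hypothetical), the sketch below discards Week 26 values affected by rescue medication and replaces them with a within-arm mean of the patients who never needed rescue:

```python
# Hypothetical (and oversimplified) estimation under a hypothetical strategy.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 200  # patients per arm

df = pd.DataFrame({
    "arm": np.r_[np.ones(n), np.zeros(n)],  # 1 = test treatment, 0 = placebo
    "rescue": np.r_[rng.binomial(1, 0.15, n), rng.binomial(1, 0.35, n)],
})
# Observed Week 26 change; rescue medication itself also lowers HbA1c.
df["change"] = -0.6 * df["arm"] - 0.5 * df["rescue"] + rng.normal(0, 0.7, 2 * n)

# Step 1: discard values affected by rescue medication.
df.loc[df["rescue"] == 1, "change"] = np.nan

# Step 2: impute what those values might have been without rescue
# (here a crude within-arm mean of never-rescued patients).
for a in (0, 1):
    in_arm = df["arm"] == a
    df.loc[in_arm & df["change"].isna(), "change"] = df.loc[
        in_arm & df["change"].notna(), "change"
    ].mean()

effect = df.loc[df["arm"] == 1, "change"].mean() - df.loc[df["arm"] == 0, "change"].mean()
print(f"hypothetical-strategy estimate: {effect:.2f}")
```

The key assumption, hidden in Step 2, is that never-rescued patients are representative of what rescued patients would have looked like without rescue; this is exactly the kind of assumption a sensitivity analysis should probe.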

Summary of Hypothetical Strategies

Hypothetical strategies explore what if scenarios, imagining treatment effects in situations where intercurrent events (like taking rescue medication) did not occur. These strategies are relevant when the hypothetical scenario is clinically plausible, such as removing the option for rescue medication in a trial, but they lose relevance when the scenario assumes unrealistic patient behaviors. In the context of clinical trials, defining the hypothetical scenario clearly is crucial to avoid ambiguity and ensure that the strategy is meaningful for clinical practice. Estimation of hypothetical estimands typically requires advanced statistical methodologies to deal with missing data and causal inference challenges.

Remark

Clinical Question and Relevance:
  • Clinical Relevance: The fundamental clinical question is whether Drug X can effectively manage the condition without the need for additional interventions (rescue medications). This question is crucial from a regulatory and clinical perspective as it tests the drug’s adequacy in controlling the condition independently.
  • Regulatory Interest: Such a scenario is particularly relevant for decision-making by regulatory bodies that may prefer data demonstrating a drug’s efficacy in the absence of confounding variables like rescue medications.

Considerations for Implementation:
  • Clinically Relevant Questions: It is essential to assess whether this hypothetical scenario addresses a clinically relevant question. If a significant proportion of the trial population requires rescue medications, the relevance of the hypothetical scenario might be questioned, as it could diverge significantly from real-world conditions.
  • Modeling Assumptions: The assumptions underpinning the modeling must be carefully considered and justified. These include assumptions about the progression of the disease and the drug’s effectiveness over time without supplemental treatment.
  • Proportion of Rescue Medication Intake: An assessment of how many patients take rescue medication, and at what point in their treatment journey they do so, can influence the applicability of the hypothetical model’s results.

Language Precision and Estimand Definition:
  • Clear Language: Precise language is necessary to define what the hypothetical scenario entails and how it is realized in the context of the study. Vague terms can lead to misinterpretations of the study’s objectives and findings.
  • Formulating Hypothetical Estimands: It is possible to formulate various hypothetical estimands, and selecting the most clinically plausible and relevant ones is critical. This selection should be guided by the specific needs of the study and its stakeholders.

2.4 While-on-Treatment Strategy Explained

Introduction

Strategy Overview:
  • Endpoint Restriction: The primary endpoint of the trial is defined to include only data collected from the start of the treatment until the occurrence of the intercurrent event, which in this case is the intake of rescue medication. This approach effectively restricts the observation period to the time when patients were receiving their assigned treatment without any supplementary intervention.
  • Data Exclusion: All data points collected after the initiation of rescue medication are discarded or ignored in the analysis. This ensures that the treatment effect being measured is attributed solely to the study drug, without the confounding effects of additional treatments.

The while-on-treatment strategy focuses on the response to treatment that occurs prior to the occurrence of an intercurrent event (such as discontinuation of treatment, use of rescue medication, or death). This strategy is designed to estimate the treatment effect only while the patient remains on the assigned treatment and before any major event occurs that would disrupt the planned treatment regimen.

Response to Treatment Prior to the Intercurrent Event

The while-on-treatment strategy focuses on analyzing how patients respond to the treatment up until the point that an intercurrent event occurs. Once the event happens, such as the need for rescue medication or death, the data after that point is no longer used to measure treatment efficacy.

For example:
  • In a diabetes trial, the focus would be on how much the test treatment affects HbA1c (a measure of blood sugar control) before the patient requires rescue medication.
  • In trials where death is the intercurrent event, the strategy might be called the while-alive strategy, where the treatment’s impact is measured while the patient is still alive.

Challenges of Interpretation

One of the main challenges with the while-on-treatment strategy is that it can be difficult to interpret the results, particularly when the duration of treatment differs significantly between treatment arms.

  • Unequal treatment durations: If patients in one treatment group remain on the treatment longer than those in the other group, comparisons between the two groups can become difficult. This is because the treatment effects might appear stronger in one group simply due to the fact that patients remained on treatment longer, rather than because the treatment is more effective.

In such cases, the treatment effect estimates may be biased or not fully comparable across groups.

Changes in the Variable and Summary Measure

This strategy changes the variable attribute (the way the outcome is measured) because the outcome is only measured up until the intercurrent event occurs. After the event, the data is not considered. Additionally, the summary measure (such as the average HbA1c reduction over time) may also change because the analysis only includes the time when patients are still on treatment.

In other words, the effectiveness of the treatment is evaluated based on the period when the patient is actively receiving it and before any other interventions (like rescue medication) occur.

Type 2 Diabetes Example

Let’s consider a Type 2 diabetes example:

Variable of Interest:
  • The variable here could be HbA1c change from baseline to Week 26, or the time of rescue medication usage, whichever occurs earlier.

This means that for patients who need rescue medication during the trial, their HbA1c change is measured only until the point when they require rescue medication. After that, the data is no longer used, as the intercurrent event (rescue medication) alters the outcome being measured.

Clinical Question:
  • The clinical question might be: What is the effect of test treatment vs placebo on HbA1c change from baseline to Week 26, or time of rescue medication usage, whichever occurs earlier?

This analysis would estimate the treatment effect on HbA1c before the intercurrent event (i.e., rescue medication) occurs. This means that the focus is on how effective the treatment is at controlling blood sugar up until the point when the patient needs additional intervention.
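
In terms of data handling, this variable definition amounts to censoring each patient’s assessments at the time of rescue and taking the last remaining one, as in this sketch with hypothetical visit records:

```python
# Hypothetical while-on-treatment endpoint derivation.
import numpy as np
import pandas as pd

visits = pd.DataFrame({
    "patient":      [1, 1, 1, 2, 2, 2],
    "week":         [8, 16, 26, 8, 16, 26],
    "hba1c_change": [-0.5, -0.9, -1.2, -0.3, -0.4, -1.5],
    "rescue_week":  [np.nan, np.nan, np.nan, 16, 16, 16],  # patient 2 rescued at Week 16
})

# Keep only assessments taken before rescue medication (all visits if none).
on_treatment = visits[
    visits["rescue_week"].isna() | (visits["week"] < visits["rescue_week"])
]

# Endpoint: each patient's last available on-treatment assessment.
endpoint = on_treatment.sort_values("week").groupby("patient").last()
print(endpoint[["week", "hba1c_change"]])
```

Patient 1 contributes the Week 26 value while patient 2 contributes only the Week 8 value, which is precisely why unequal on-treatment durations complicate between-arm comparisons.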

Key Points of the While-on-Treatment Strategy
  • Response before the event: The primary focus is on analyzing the treatment’s effect before the intercurrent event, which means that the data used is limited to the time before rescue medication (or other events like death) is introduced.

  • Variable and summary measure change: The way the outcome is measured changes because data collection stops once the intercurrent event occurs. This may also alter the summary measures used to describe the treatment effect (e.g., average HbA1c reduction over time).

  • Interpretation challenges: If the treatment duration differs between the two arms (e.g., one group receives treatment for a longer period before needing rescue medication), comparing the results between groups may become difficult and could introduce bias.

Summary of the While-on-Treatment Strategy

  • The while-on-treatment strategy focuses on estimating the treatment effect while patients remain on the assigned treatment and before any intercurrent event (like rescue medication or death) occurs.
  • This approach is useful when the goal is to understand the effect of a treatment up until the point where other interventions are introduced or patients stop treatment.
  • The strategy can introduce challenges in interpreting results, especially when treatment durations differ between groups, which may lead to biased or unclear comparisons.
  • In the Type 2 diabetes example, this strategy would analyze HbA1c changes from baseline to Week 26 or until the patient needed rescue medication, providing insight into the treatment’s impact before additional interventions are required.

This strategy is particularly relevant when it is important to measure the treatment effect in isolation before other factors come into play, but it requires careful consideration of potential biases introduced by differing treatment durations across groups.

Remark

Clinical and Regulatory Considerations:
  • Assessment of Initial Efficacy: This strategy is particularly useful for assessing the initial efficacy of a treatment. It answers the question of how well the treatment performs under optimal conditions, before any need for additional intervention arises.
  • Regulatory Perspective: Regulatory authorities often require evidence of a drug’s effectiveness in a controlled environment, and this strategy provides such evidence. However, they may also be interested in understanding the drug’s performance in a real-world setting, where rescue medications might commonly be used.
  • Relevance to Clinical Practice: The results from this strategy can inform clinicians about the initial effectiveness of a treatment, which is crucial for understanding how quickly a treatment can act and the potential need for additional medications in practice.

2.5 Principal Stratum Strategy Explained

Introduction

Understanding Principal Stratification

Principal stratification defines subgroups (strata) based on hypothetical scenarios concerning potential outcomes with each treatment. For example, a principal stratum could include patients who would not require rescue medication under either the experimental or control treatment. This subgroup represents patients for whom the treatments are effective without additional interventions.

The principal stratum strategy is used to address intercurrent events in clinical trials by focusing on specific subgroups (or strata) of patients based on how they would respond to treatment or placebo in relation to the occurrence of an intercurrent event. This strategy seeks to isolate the treatment effect within well-defined subgroups of the population, often aiming to clarify the effect in people with particular characteristics related to the event.

Principal Stratification: Splitting the Population

The principal stratum strategy involves dividing the study population into four distinct strata based on their expected response to treatment or placebo in terms of whether or not they experience the intercurrent event.

The Four Strata:
  1. Stratum 1: Patients who would not experience the intercurrent event regardless of whether they are assigned to the test treatment or placebo. These are people for whom the intercurrent event would not happen, regardless of treatment.
  2. Stratum 2: Patients who would experience the intercurrent event regardless of whether they are assigned to the test treatment or placebo. For these people, the intercurrent event will occur no matter what treatment they receive.
  3. Stratum 3: Patients who would not experience the intercurrent event if assigned to the test treatment, but would experience it if assigned to the placebo. These patients benefit from the test treatment in terms of avoiding the intercurrent event.
  4. Stratum 4: Patients who would not experience the intercurrent event if assigned to the placebo, but would experience it if assigned to the test treatment. These are patients for whom the test treatment could increase the risk of the intercurrent event compared to the placebo. (A simulation of these four strata is sketched after this list.)
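
The following sketch makes the strata concrete by simulating both potential rescue outcomes for every patient (with hypothetical probabilities); in a real trial only one of the two columns would ever be observed per patient, which is why strata membership cannot simply be read off the data.

```python
# Hypothetical simulation of the four principal strata.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 100_000

# Potential need for rescue medication under each assignment
# (independent here purely for illustration).
rescue_if_test = rng.binomial(1, 0.15, n)
rescue_if_placebo = rng.binomial(1, 0.35, n)

strata = np.select(
    [
        (rescue_if_test == 0) & (rescue_if_placebo == 0),
        (rescue_if_test == 1) & (rescue_if_placebo == 1),
        (rescue_if_test == 0) & (rescue_if_placebo == 1),
        (rescue_if_test == 1) & (rescue_if_placebo == 0),
    ],
    [
        "1: never needs rescue",
        "2: always needs rescue",
        "3: rescue only on placebo",
        "4: rescue only on test",
    ],
)
print(pd.Series(strata).value_counts(normalize=True))
```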

Clinical Question Within a Principal Stratum

Once these strata are defined, the principal stratum strategy seeks to answer clinical questions within a specific stratum. For instance, instead of looking at the overall population, the study focuses on one subgroup (or stratum), such as patients who would not need rescue medication if they received the test treatment but would need it if they were on the placebo.

This changes the population attribute of the study because the analysis is restricted to just one stratum rather than the entire population. The objective is to estimate the treatment effect specifically for this subgroup, making the results more targeted.

Principal Strata Are Not Identifiable with Certainty

One of the main challenges with the principal stratum strategy is that the strata are not always identifiable with certainty. In other words, it can be difficult or impossible to know for sure which patients fall into which stratum based on observed data. This is because we cannot observe both potential outcomes (e.g., what happens under test treatment and what happens under placebo) for the same individual at the same time.

To overcome this, statistical methods are used to estimate the membership in each stratum, but these estimates can come with some level of uncertainty.

Response to Treatment Before the Intercurrent Event

In some cases, the response to treatment prior to the occurrence of the intercurrent event is of interest. This is particularly important when considering events like death, where the outcome after the event can no longer be measured. In such cases, the while-alive strategy or while-on-treatment strategy may be applied.

  • While-alive Strategy: If the intercurrent event is death, the focus is on the treatment effect while the patient is still alive. After death, the outcome can no longer be measured, so the analysis is constrained to the time before the event.

  • While-on-treatment Strategy: This strategy focuses on analyzing the data while the patient is still on the assigned treatment. If patients stop treatment early, their outcomes are considered only until the point of discontinuation. This can be difficult to interpret, especially if the treatment durations vary significantly between groups.

Type 2 Diabetes Example

Let’s apply this to a Type 2 diabetes example:

Variable of Interest

In a study where the intercurrent event is the use of rescue medication, the variable might be: HbA1c change from baseline to Week 26, or the time of rescue medication usage, whichever occurs earlier.

This variable focuses on measuring HbA1c (a marker of long-term blood sugar control) until the point when patients require rescue medication. Once rescue medication is introduced, the measurement stops, as the use of rescue medication significantly affects HbA1c values.

Clinical Question

The clinical question here might be: What is the effect of test treatment vs placebo on HbA1c change from baseline to Week 26 or the time of rescue medication usage, whichever occurs earlier?

This analysis would estimate the effect of the test treatment only up to the point when the intercurrent event (rescue medication) occurs. By doing so, the study focuses on the time when patients are still on the assigned treatment before the additional intervention (rescue medication) changes the dynamics of the outcome.

Key Points of Principal Stratum Strategy
  • Focus on Subgroups: The strategy splits the population into different strata based on their likelihood of experiencing an intercurrent event, allowing for a more focused analysis of specific subgroups.
  • Changing Population Attribute: The analysis is not on the entire population but rather on a specific subgroup, so the population attribute is changed.
  • Uncertainty in Strata Membership: Identifying the exact members of each stratum can be difficult, and statistical methods must be used to estimate this with some level of uncertainty.
  • Treatment Effect Prior to Intercurrent Event: The strategy often looks at the effect of treatment before the intercurrent event occurs, such as analyzing HbA1c levels before patients need rescue medication.

Summary of the Principal Stratum Strategy
  • The principal stratum strategy focuses on estimating the treatment effect within a particular subgroup (or stratum) of patients based on how they would respond to treatment or placebo in relation to an intercurrent event.
  • This strategy changes the population being studied, as the analysis is limited to specific subgroups rather than the entire population.
  • Strata are based on the likelihood of experiencing an intercurrent event, but these strata are not identifiable with certainty, requiring advanced statistical estimation.
  • In the Type 2 diabetes example, the analysis might focus on HbA1c changes until the point of rescue medication use, providing insight into the treatment effect before the intercurrent event alters the outcome.

This approach allows researchers to gain insights into how specific groups of patients respond to treatment, offering more targeted information than a population-wide analysis.

Remark

Challenges in Identifying Strata: Identifying which patients belong to a particular stratum presents several challenges:
  • Predictive Limitations: It is generally impossible to predict in advance which patients will experience an intercurrent event such as needing rescue medication, as these outcomes depend on numerous factors that remain unknown until treatment is administered.
  • Observational Limitations: In a typical randomized controlled trial, patients receive only one of the treatment options, making it impossible to directly observe how they would respond to the alternative treatment. This complicates accurate stratification based on potential outcomes.

Focus of Analysis: The analysis within principal stratification concentrates on evaluating the treatment effect specifically within these strata. It aims to determine the efficacy of treatments for those who hypothetically would not need any rescue medication, using statistical models to estimate these unobserved outcomes.

Distinction from Other Analytical Approaches:

  • Subgroup Analyses: Unlike standard subgroup analyses that categorize patients based on observable characteristics or responses, principal stratification uses potential outcomes to define groups, providing a more theoretical perspective on treatment effects.
  • Per-Protocol Analyses: This strategy differs from per-protocol analyses, which include only participants adhering fully to the study protocol. Principal stratification explores hypothetical efficacy under ideal conditions, considering what would happen if no protocol deviations occurred.

3 Defining Estimands

3.1 Scientific Question of Interest

Strategies for Addressing Intercurrent Events:

  • Treatment Policy Strategy: This approach uses the value for a variable regardless of whether or not the intercurrent event occurs. It considers the intercurrent event as part of the treatment strategies being compared. This means that the analysis accepts the intercurrent event as a natural part of the treatment process and incorporates its occurrence into the overall treatment evaluation.

  • Hypothetical Strategies: These involve imagining a scenario in which the intercurrent event does not occur. This strategy is used to assess what the outcome would have been if the intercurrent event had been entirely absent, providing a clearer picture of the treatment’s effect without the confounding impact of the event.

  • Composite Variable Strategies: This approach integrates the intercurrent event into the outcome variable itself, recognizing that the event provides meaningful information about the patient’s outcome. By incorporating the intercurrent event into the analysis variable, this strategy allows for a comprehensive evaluation of the treatment’s effect, including how it relates to the occurrence of specific events.

  • While on Treatment Strategies: This strategy analyzes data only from the period during which patients are actively receiving treatment, disregarding data collected after treatment discontinuation or other deviations.

  • Principal Stratum Strategies: This approach focuses on a subgroup of participants who are unaffected by the intercurrent event. It aims to isolate the effect of the treatment in a “pure” form by evaluating only those who would adhere to the treatment regimen regardless of potential intercurrent events.

3.2 Thinking Process

  1. Therapeutic Setting and Intent of Treatment, Determining the Trial Objective:
    • This initial step involves defining the therapeutic context and the specific goals of the treatment under investigation. It sets the foundation for the trial by clarifying its primary objectives based on the medical need and intended therapeutic benefits.
  2. Identify Intercurrent Events:
    • Intercurrent events are occurrences that could potentially affect the interpretation of the trial’s outcome. This step involves identifying all possible events such as additional treatments, protocol deviations, or loss of follow-up that may interfere with the trial results.
  3. Discuss Strategies to Address Intercurrent Events:
    • Once intercurrent events are identified, this step focuses on developing strategies to manage them. These strategies ensure that the trial can proceed as smoothly as possible and that the data remains reliable despite these events.
  4. Agree on the Estimand(s):
    • An estimand is a precise description of the effect to be estimated by the trial. This step involves reaching consensus on what exactly the trial aims to estimate regarding the treatment effect, taking into account the strategies for handling intercurrent events.
  5. Align Choices on Trial Design, Data Collection, and Method of Estimation:
    • This step is about making informed decisions on the trial design, how data will be collected, and the methods used for estimating the treatment effect. These choices are crucial for ensuring that the trial will effectively address its objectives.
  6. Identify Assumptions for the Main Analysis and Suitable Sensitivity Analyses to Investigate These Assumptions:
    • Assumptions related to the trial’s main analysis are identified here. Sensitivity analyses are planned to test these assumptions, helping to understand how robust the findings are to changes in the assumptions.
  7. Document the Chosen Estimands:
    • The final step involves formally documenting the agreed-upon estimands. This documentation is vital for clarity and ensures that all stakeholders have a clear understanding of what the trial aims to estimate and how it will be done.

3.3 Case Study 1: Treatment efficacy in patients with chronic inflammatory conditions

3.3.1 Background

  1. Study Purpose
    • The primary objective of the study is to demonstrate the superiority of a novel biologic treatment over placebo in managing patients with a chronic inflammatory condition.
  2. Novel Treatment
    • The treatment is a biologic administered once per month, aimed at controlling the inflammatory condition.
  3. Variable
    • The primary outcome or endpoint of the study is the clinical response at 12 months, which is measured as a binary variable (yes/no response).
  4. Study Design
    • The trial is set up as a double-blind, placebo-controlled, randomized, parallel-group design. This ensures that neither the participants nor the researchers know which treatment the participants are receiving, to prevent bias.

Study Design and Assumptions

  1. Study Timeline and Patient Flow
    • The study timeline is illustrated for six patients, showing their progress from randomization through to 12 months. For ethical reasons, patients in both the novel treatment and placebo groups are allowed to switch to an open-label novel treatment after the first 4 months.
  2. Open Label Treatment
    • Indicates that after 4 months, regardless of their initial grouping (novel treatment or placebo), participants have the option to receive the novel treatment openly. This transition is visualized with red arrows indicating the switch.
  3. Planning Assumptions
    • It’s estimated that approximately 40% of patients in the novel treatment arm and 70-80% in the placebo arm will switch to the open-label novel treatment after 4 months.
    • Historical studies in similar conditions reported no deaths, which might influence the planning regarding safety monitoring and data analysis expectations.

3.3.2 Estimands Proposed in the Study

  1. Estimand 1 (Hypothetical): The treatment difference in proportion of clinical responders that would be observed if patients could not switch to open-label treatment.

    • Description: This estimand would evaluate the treatment difference in the proportion of clinical responders as if no patients were allowed to switch to open-label treatment. It aims to estimate the pure effect of the initial treatment regimens without the confounding effect of switching.
    • Health Authority Feedback: The HA expressed concerns about the feasibility of estimating this effect due to the assumptions required:
      • Assumption of Comparability: Assuming that patients who did not switch are comparable to those who would have switched.
      • Identification of Representative Subset: The ability to identify and appropriately use data from a representative subset of non-switching patients to predict outcomes for those who might have switched.
      • Unverifiability of Assumptions: Noting that these assumptions are largely theoretical and cannot be empirically verified, which undermines the reliability of the estimand.

HA: “Even if the estimand is considered clinically relevant (in this setting of a treatment targeting the symptoms of a chronic disease), we continue to have concerns about whether it can be estimated with minimal and plausible assumptions. First, we would have to assume that some of the patients who do not initiate biologic escape are similar enough (to the patients who escape) that their outcomes at the end of the study are representative of the hypothetical outcomes in those patients that initiated escape. Second, we would have to be able to identify that subset of representative patients and effectively use their collected data in the statistical model to predict the hypothetical outcomes in patients who escaped. It is not at all clear whether your proposed statistical model has identified such representative patients. Finally, any such assumptions are unverifiable, as you note.”

  2. Estimand 2 (Treatment Policy): The treatment difference in proportion of clinical responders regardless of whether patients switched to open-label treatment.

    • Description: This estimand considers the treatment difference in the proportion of clinical responders regardless of any switching to open-label treatment. It reflects a real-world scenario where treatment effects are evaluated inclusive of management changes like switching.
  3. Estimand 3 (Composite): The treatment difference in proportion of clinical responders where switching to open-label treatment is counted as no clinical response.

    • Description: This approach treats any switching to open-label treatment as equivalent to a non-response. It simplifies the analysis by directly associating the switching action with treatment failure or inadequacy.
    • Selected by Study Team: Following the HA’s critique, the study team adopted this approach due to its straightforward interpretation and alignment with clinical trial objectives.

3.3.3 Chosen Estimand Attributes (Composite Approach)

  • Population: Patients with a chronic inflammatory condition.
  • Variable: Clinical response at 12 months, where switching is equated to non-response.
  • Treatments: Novel biologic treatment administered monthly versus placebo.
  • Population Summary: The difference in the proportion of patients achieving a clinical response between the novel treatment and placebo groups.

Clinical Question of Interest:

The key question is: “What is the difference in the proportion of patients achieving clinical response at 12 months for patients with a chronic inflammatory condition treated with a novel biologic treatment versus placebo, where the need for rescue (switch) would count as non-response?”

Significance of the Chosen Estimand:

The composite estimand was chosen because it provides a clear and straightforward method for dealing with the confounding factor of treatment switching. By considering switches as non-responses, it directly reflects the efficacy of the initial treatment in controlling the disease without the influence of additional interventions. This approach not only aligns with the trial’s regulatory expectations but also addresses the HA’s concerns about the practical challenges and verifiability associated with the hypothetical estimand.

3.4 Case Study 2: Effectiveness of an oral treatment in chronic dermatological condition

3.4.1 Background

  1. Study Purpose:
    • The aim is to establish the superiority of a new oral treatment over placebo in patients with a chronic dermatological condition that does not respond well to standard therapies.
  2. Novel Treatment:
    • This involves a new oral medication administered daily, targeting the specified dermatological condition.
  3. Variable:
    • The primary endpoint is assessed using the Weekly Activity Score (WAS), which evaluates symptoms and their intensity over a week, scaled from 0 (no symptoms) to 50 (numerous severe symptoms). This score is assessed at baseline and then every week up to week 12.
  4. Study Design:
    • The trial is structured as a double-blind, placebo-controlled, randomized, parallel-group design, maintaining the standard for clinical research to ensure objectivity and reliability of the results.

Study Design and Assumptions

  1. Primary Endpoints:
    • Two potential primary endpoints are proposed:
      • Change from baseline in WAS at Week 12: This continuous measure is deemed clinically relevant and previously accepted by health authorities.
      • WAS ≤ 10 at week 12 (binary): This endpoint, representing a specific treatment goal (mild or no symptoms), is favored by experts but has not been previously used in registration studies.
  2. Medication Guidelines:
    • Background Medication: Participants will receive either the novel treatment or placebo along with a second-generation antihistamine. The dose and type of this background medication are fixed throughout the study.
    • Rescue Medication: An alternative second-generation antihistamine can be used daily if symptoms are unbearable.
    • Prohibited Medications: Any corticosteroid or other treatments for the skin condition are forbidden during the trial to avoid confounding the study results.
  3. Assumptions:
    • Rescue Medication: It’s anticipated that more than 75% of participants may need rescue medication. This is expected in both treatment arms and is not thought to significantly affect the WAS at week 12.
    • Prohibited Medications: Less than 10% of participants are expected to use prohibited medications. However, if used, corticosteroids could significantly alter the WAS scores.

Implications for Analysis

  • Rescue and Background Medications: The consistent use of background medication and the availability of rescue medication mimic typical clinical practices, potentially increasing the generalizability of the study results.
  • Assessment of Primary Endpoints: The binary endpoint of WAS ≤ 10 adds a clear, practical measure of success, while the continuous change from baseline provides a detailed quantification of treatment effect.
  • Handling of Prohibited Medications: The strict prohibition of certain medications ensures the integrity of the trial results but requires diligent monitoring and adherence from participants.

3.4.2 Initial Proposed Estimand

In Case Study 2, the study team initially proposed a hypothetical estimand for a clinical trial aimed at evaluating a new treatment for a chronic dermatological condition. This proposal and the subsequent feedback from the health authority led to a revision of the estimand approach.

Hypothetical Estimand:

  • Objective: To measure the treatment difference in change from baseline in the Weekly Activity Score (WAS) at 12 weeks assuming no patient took corticosteroids.
  • Health Authority Feedback: The authority criticized this approach because it does not reflect real-world clinical practice where patients might need prohibited medications. The hypothetical scenario was deemed inappropriate as it might not provide an accurate or feasible assessment of the treatment’s effectiveness due to its detachment from typical clinical scenarios.

Health Authority’s Rationale

  • Concerns: The hypothetical estimand could not reliably estimate the treatment effect due to the unrealistic assumption that no patients would use prohibited medications. Furthermore, any necessity for such medications would suggest treatment inadequacy, thus affecting the reliability of the outcome.

3.4.3 Chosen Estimand Attributes (Composite Approach)

Composite Estimand:

  • Population: Patients with a chronic dermatological condition unresponsive to standard therapies.
  • Variable: Change from baseline in WAS at 12 weeks, with the assignment of the worst possible value (50) to patients who take prohibited medication (see the sketch at the end of this subsection).
  • Treatments: The novel oral treatment given daily, compared against placebo, with both groups allowed to take rescue antihistamines as needed.
  • Population Summary: The mean difference in change from baseline to week 12 in the WAS score.
  • Clinical Question of Interest: The question focuses on the difference in the mean change from baseline in WAS at week 12, considering the use of prohibited medications as a treatment failure.

  1. Reflection of Real-World Scenarios: This approach acknowledges that patients may require additional medications (deemed as treatment failures for the purpose of the study), which aligns with real-world treatment scenarios and regulatory expectations.

  2. Treatment Effectiveness: By assigning the worst score to those who need prohibited medication, the study directly addresses the question of whether the new treatment can adequately manage the condition without additional interventions.
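To make the composite variable concrete, here is a minimal sketch in R of how it could be derived. The data frame cs2 and the column names (was_bl, was_wk12, prohibited) are hypothetical illustrations, not names from the study.

```r
library(dplyr)

# Hypothetical one-row-per-patient data frame `cs2` with baseline WAS,
# week-12 WAS, and an indicator of prohibited medication use
cs2_composite <- cs2 %>%
  mutate(
    # Composite rule: prohibited medication use counts as treatment failure,
    # so affected patients are assigned the worst possible WAS value (50)
    was_wk12_comp = ifelse(prohibited == 1, 50, was_wk12),
    # Analysis variable: change from baseline in WAS at week 12
    wasChg = was_wk12_comp - was_bl
  )
```

The treatment difference in the mean of wasChg between arms can then be estimated with standard methods (e.g., ANCOVA adjusting for baseline WAS).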

4 Analysis of Treatment Policy Estimands

4.1 Case Study: type 2 diabetes

Study Design

  • Type: Parallel, randomized, placebo-controlled, blinded trial
  • Size: 400 patients, randomized in a 1:1 ratio between the test treatment and placebo

Population

  • Participants: Patients with type 2 diabetes managed solely by diet and exercise

Treatments

  • Comparison: Test treatment versus placebo

Key Variable

  • Primary Endpoint: Change in Hemoglobin A1c (HbA1c) levels from baseline to week 26. HbA1c is a marker of average blood glucose levels over the previous two to three months, with a decrease indicating improvement in diabetes management.

Summary Measure

  • Assessment: Expected change from baseline to week 26 in HbA1c, with a between-group comparison focusing on the difference in changes between the test treatment and placebo groups.

Intercurrent Event

  • Event Description: Discontinuation of the randomized treatment and switch to unblinded use of the test treatment. This event is considered under a single category termed ‘treatment non-adherence,’ which includes:
    • For patients initially receiving the test treatment, this would involve continuing the test treatment but in an unblinded manner.
    • For patients initially receiving placebo, this would involve switching to the test treatment, also unblinded.

Visit Schedule

  • Visits: One baseline visit (V0) and five post-baseline visits (V1-V5), with V5 at week 26 marking the end of the study period.

Implications of the Design and Intercurrent Event

  1. Unblinding Risks: The possibility of patients switching from placebo to the test treatment and from blinded to unblinded test treatment use could potentially introduce biases or affect the trial’s integrity by revealing treatment assignments. This needs careful handling to maintain the validity of the study outcomes.

  2. Handling of Non-adherence: The study’s approach to treatment discontinuation (categorized as non-adherence) could impact the interpretation of the efficacy data. How these data are analyzed is crucial, as non-adherence might affect the comparability between groups if not properly accounted for.

  3. Efficacy Measurement: The primary focus on the change in HbA1c allows for a direct assessment of the treatment’s impact on glucose regulation over a substantial period, aligning well with clinical objectives in diabetes care. The measure is objective and quantifiable, providing a clear metric for evaluating the effectiveness of the treatment.

  4. Ethical Considerations: Allowing patients on placebo to switch to the test treatment (unblinded) after discontinuation could be seen as enhancing the ethical conduct of the trial by potentially providing a beneficial treatment to those not initially receiving it. However, this must be balanced against the risk of bias introduced by such switches.

4.2 Treatment Policy Estimand of Interest

Here’s a breakdown of the key components:

Population:

  • Patients with type 2 diabetes who are managing their condition solely through diet and exercise.

Treatments:

  • The trial compares a test treatment with a placebo. The crucial aspect of this estimand is that it considers the effect of the treatments regardless of the patients’ adherence to the assigned treatment regimen.

Variable:

  • The primary endpoint is the change in Hemoglobin A1c (HbA1c) from baseline to week 26. HbA1c is a key indicator that reflects the average blood glucose concentration over the previous three months.

Summary Measure:

  • The expected change from baseline in HbA1c at week 26, with the analysis focusing on the difference between the two groups. This measure will help determine if the test treatment is more effective than the placebo in lowering blood glucose levels over the trial period.

Data Collection Approach:

  • Data will continue to be collected up to the primary endpoint for all patients, including those who do not adhere to the treatment to which they were initially randomized. This approach supports the treatment policy estimand by capturing the full scope of treatment effects, inclusive of all deviations from the protocol that might occur during the trial.

Significance of This Estimand:

  • This estimand is significant because it aims to capture the ‘real-world’ effectiveness of the test treatment. By evaluating the impact of the treatment irrespective of adherence, the estimand provides a more comprehensive understanding of how effective the treatment could be in typical clinical practice, where patients may not always follow prescribed treatments strictly.

The treatment policy estimand approach allows the study results to be more generalizable and reflective of practical clinical outcomes, acknowledging that non-adherence is a common occurrence in real-world settings. This makes the findings relevant for healthcare providers and policymakers when considering the potential benefits and limitations of new treatments for type 2 diabetes.

4.3 Missing data under treatment policy strategy

Missing data imputation is a critical process in clinical trials, particularly when ensuring the integrity and robustness of the study’s results in the face of missing data due to non-adherence, dropouts, or other reasons. Aligning missing data imputation strategies with the targeted estimand and considering clinically and statistically sound assumptions are essential for maintaining the validity of the trial’s conclusions.

Principles for Missing Data Imputation:

  • Alignment with Estimand: The method of imputation should reflect the nature of the estimand. For a treatment policy estimand, imputation should follow the intention-to-treat principle: missing outcomes are predicted as they would have been observed after the intercurrent event, regardless of adherence to the randomized treatment, rather than as if the protocol had been followed.
  • Clinically Plausible Assumptions: These depend on the therapeutic context, disease characteristics, and the treatment mechanism. Assumptions must consider factors like whether the drug is disease-modifying or merely symptomatic and its pharmacokinetics such as half-life.
  • Adequate Modelling Assumptions: The statistical model used for imputation should be robust, minimizing bias and providing a reliable approximation of missing values based on available data.

Common Imputation Methods:

These methods are often used in scenarios where treatment discontinuation leads to missing data, and the aim is to estimate the trajectory of patients’ outcomes as if they had continued on the assigned treatment or shifted to a control or placebo condition.

When choosing an imputation method, it is critical to consider the nature of the disease and treatment. For instance, in chronic conditions where effects are prolonged and discontinuations common, more nuanced approaches like CIR might be more appropriate than J2R, which could be more suitable for acute settings or where the drug effect is expected to cease immediately upon discontinuation.

Comparison of Methods

  • Reference-Based Methods (J2R, CIR, CR):
    • Jump to Reference (J2R / JR): Assumes all drug effects cease immediately upon discontinuation, with future outcomes following the placebo trajectory.
    • Copy Increment in Reference (CIR): Assumes the rate of change (increment) in the patient’s outcomes will start to mimic those observed in the placebo group post-discontinuation.
    • Copy Reference (CR): The patient’s outcomes are assumed to follow the distribution of the placebo group over the whole study period, with imputed post-discontinuation values conditioned on the patient’s observed data.
  • Missing at Random (MAR):
    • Suitable when the reasons for missing data are related to observed factors rather than the missing data itself, assuming a similarity in behavior between those with complete and incomplete data.
  • Retrieved Dropout (RDO) Imputation:
    • Useful in scenarios where it’s possible to track outcomes of patients post-discontinuation, providing a more direct observation of potential outcomes for dropouts. This method is particularly valuable when analyzing long-term effects and adherence issues in clinical trials.

1. Jump to Reference (J2R / JR) Imputation

  • Overview: This method assumes that any drug effect disappears immediately upon discontinuation, and the patient’s condition reverts to what it would have been under the placebo.
    • Description: This method assumes that any effect from the active drug ceases immediately upon discontinuation. Patients are then assumed to “jump” to the trajectory typical of the reference group, usually the placebo arm.
    • Use Case: Appropriate when the drug effect dissipates quickly after discontinuation.
  • Visualization Details:
    • Similar to CR, the blue line represents the drug arm and the black line represents the control (placebo) arm.
    • Patients who discontinue the drug are assumed to revert to a condition similar to the placebo group immediately.
    • Imputed future values are typically worse than if the patient had continued on the drug, being aligned with the placebo-group trajectory (conditional on the patient’s observed data).

2. Copy Reference (CR) Imputation

  • Overview: In the CR method, it is assumed that once a patient discontinues the drug, their future values will mimic the trajectory of the placebo arm, regardless of any benefit they might have initially experienced from the drug.
    • Description: The patient’s outcomes are assumed to follow the distribution of the reference (placebo) group over the whole study period; imputed post-discontinuation values are drawn conditional on the patient’s observed data.
    • Use Case: Appropriate when, after discontinuation, patients are expected to behave as if they had been on placebo throughout, with any initially accrued benefit washing out.
  • Visualization Details:
    • The blue line represents the drug arm, and the black line represents the placebo arm.
    • Patients who drop out are assumed to follow the trajectory of the placebo arm exactly from the point of dropout.
    • The imputed values for future observations are aligned with the mean trajectory of the placebo group.

3. Copy Increments in Reference (CIR) Imputation

  • Overview: This method assumes that patients who drop out from the drug arm do not revert entirely to the placebo condition but instead begin to follow the incremental changes observed in the placebo arm. This acknowledges some residual effect of the drug that was taken before dropout.
    • Description: After discontinuation, the increments (visit-to-visit changes) in the patient’s measurements are assumed to mimic those observed in the reference group, so the benefit accrued up to the last observed visit is retained.
    • Use Case: Useful when the effect attained before discontinuation is expected to be maintained, with further progression following the placebo trajectory rather than an abrupt loss of benefit.
  • Visualization Details:
    • The drug’s impact is considered to taper off, not abruptly stop, as patients begin to mimic the incremental progress (or lack thereof) of the placebo arm from their last observed value.
    • This creates a more gradual transition in the dataset, potentially reflecting a more realistic scenario of drug discontinuation effects.

4. Missing at Random (MAR)

  • Implementation: This approach assumes that missing data can be modeled based on similar subjects within the same treatment arm, considering that missingness is not related to unobserved variables.
  • Visualization and Usage: The graph illustrates how variability in outcomes increases over time, which is a typical scenario in long-term studies. The imputed values (squares) are based on the conditional mean given the observed values, which helps maintain the internal consistency of the dataset.

5. Retrieved Dropout (RDO)

  • Concept: The RDO method focuses on utilizing data from patients who discontinued treatment but whose outcomes continue to be tracked. This approach helps model what could happen to patients who drop out, by using data from those who have similar profiles but remain under observation.
  • Implementation and Challenges: For patients with missing data at a specific visit, information is borrowed from similar patients in the same treatment arm who have available dropout data. This method requires a sufficient amount of RDO data for reliable imputation and can lead to variance inflation if the data is not sufficient, impacting the bias-variance trade-off.
  • Visualization: The diagram shows various points where patients either continue, drop out, or are followed after discontinuation, with imputed values being informed by the retrieved dropout data.

4.4 Multiple Imputation

Step 1: Parameter Estimation (Imputation Model)

  • Objective: Fit a multivariate normal distribution for each treatment arm using data observed prior to any intercurrent event (ICE), such as dropout or switching treatments.
  • Components:
    • \(\mu_a, \Sigma_a\): Mean and covariance matrix for the active treatment arm.
    • \(\mu_r, \Sigma_r\): Mean and covariance matrix for the reference (placebo) arm.
    • Uninformative priors are used for both the mean and the covariance matrices, with the covariance matrix typically employing an Inverse Wishart distribution. This choice helps in avoiding bias from overly prescriptive assumptions about the data structure.

Step 2: Imputation

  • Objective: Generate multiple complete datasets by imputing missing values based on the distributions estimated in Step 1.
  • Process:
    • Draw from the posterior distribution of parameters \(\mu_r, \Sigma_r, \mu_a, \Sigma_a\) established in Step 1.

    • Construct a joint distribution of observed and missing data to facilitate imputation.

    • Impute missing data from the conditional distribution of \(Y_{miss} | Y_{obs}\), where \(Y_{miss}\) is the missing data and \(Y_{obs}\) is the observed data, based on the relationships established in the model.

    • The imputation is repeated multiple times (commonly denoted as \(M\) times), creating multiple complete datasets.

    • Different imputation means are calculated depending on the method: J2R, CIR, CR. Each method adjusts the imputation based on the reference trajectory, whether it’s a direct copy, an increment adjustment, or a complete jump to the reference values at the time of dropout (see the sketch after this list).

      • J2R Mean (\(\tilde{\mu}\)):
        • For Jump to Reference, the imputed values for post-ICE data are a direct continuation of the reference group mean from the latest observed time point (\(t_i\)), effectively assuming that the treatment effect disappears and the patient follows the placebo trajectory.
      • CIR Mean (\(\tilde{\mu}\)):
        • For Copy Increment in Reference, the imputed values are calculated as a blend of the active arm’s trajectory up to the last observed time point and then shifting towards the change observed in the reference group. This reflects a gradual decline or alteration in the treatment effect rather than an abrupt stop.
      • CR Mean (\(\tilde{\mu}\)):
        • For Copy Reference, the imputed values are straightforwardly set to follow the reference arm’s mean (\(\mu_r\)), assuming that post-dropout, the patient’s outcomes align exactly with those typically seen in the placebo group.
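One common way to write these means explicitly (our notation, not from the original text): let \(T\) be the number of post-baseline visits, \(t_i\) the last visit before the ICE for patient \(i\), and \(\mu_{a,t}\), \(\mu_{r,t}\) the visit-\(t\) means of the active and reference arms. Then the marginal means used for imputation are:

\[ \tilde{\mu}_t^{J2R} = \begin{cases} \mu_{a,t} & t \le t_i \\ \mu_{r,t} & t > t_i \end{cases} \qquad \tilde{\mu}_t^{CIR} = \begin{cases} \mu_{a,t} & t \le t_i \\ \mu_{a,t_i} + (\mu_{r,t} - \mu_{r,t_i}) & t > t_i \end{cases} \qquad \tilde{\mu}_t^{CR} = \mu_{r,t} \ \text{for all } t \]

Under CR the marginal means are those of the reference arm at every visit; the patient’s observed pre-ICE data still influence the imputed values through the conditional distribution \(Y_{miss} | Y_{obs}\).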

Step 3: Analysis

  • Objective: Analyze each dataset independently to compute the summary measures of interest, which might include means, variances, or other statistical tests.
  • Importance: This step allows for the assessment of variability and the robustness of the study results across different imputed datasets.

Step 4: Pooling

  • Objective: Combine results from multiple imputed datasets.
  • Methodology: Use Rubin’s rules to pool the results. Rubin’s rules provide a way to combine estimates from multiple imputed datasets to obtain overall estimates and their variances, accounting for both within-imputation and between-imputation variability. This approach helps in deriving more accurate confidence intervals and p-values.
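Concretely, if \(\hat{\theta}_m\) is the estimate from the \(m\)-th imputed dataset and \(\widehat{V}_m\) its estimated variance, Rubin’s rules give:

\[ \bar{\theta} = \frac{1}{M}\sum_{m=1}^{M}\hat{\theta}_m, \qquad W = \frac{1}{M}\sum_{m=1}^{M}\widehat{V}_m, \qquad B = \frac{1}{M-1}\sum_{m=1}^{M}(\hat{\theta}_m - \bar{\theta})^2, \qquad T = W + \left(1 + \frac{1}{M}\right)B \]

where \(W\) is the within-imputation variance, \(B\) the between-imputation variance, and \(T\) the total variance used for confidence intervals and p-values.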

4.5 Analysis of Treatment Policy Estimands

This section presents an analysis of the example randomized controlled trial in patients with type 2 diabetes. The analyses in this worksheet target a treatment policy estimand, i.e., we are interested in the comparison between the treatment and placebo groups irrespective of whether or not patients experienced the intercurrent event of treatment discontinuation.

4.5.1 Review Data

Although we are interested in the treatment effect irrespective of whether or not a patient discontinued randomized treatment, it is still generally of interest to understand the proportion of patients who adhered or discontinued treatment.

Let us create a table to summarize the number and proportion of patients who discontinued treatment by group and visit.
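A minimal sketch of how such a table could be built, assuming a long-format data frame data with one row per patient-visit and columns group, visitn, and ontrt (1 = on randomized treatment, 0 = discontinued); these names are assumptions.

```r
library(dplyr)
library(tidyr)

data %>%
  count(group, visitn, ontrt) %>%                         # patients per status
  group_by(group, visitn) %>%
  mutate(pct = sprintf("(%.1f%%)", 100 * n / sum(n))) %>% # proportion per visit
  ungroup() %>%
  unite("cell", n, pct, sep = " ") %>%
  pivot_wider(names_from = visitn, values_from = cell)
```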

## $ctl
##  ontrt            1            2            3            4            5
##      0   8   (4.0%)  18   (9.0%)  31  (15.5%)  43  (21.5%)  50  (25.0%)
##      1 192  (96.0%) 182  (91.0%) 169  (84.5%) 157  (78.5%) 150  (75.0%)
##  Total 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%)
## 
## $trt
##  ontrt            1            2            3            4            5
##      0   5   (2.5%)  12   (6.0%)  20  (10.0%)  27  (13.5%)  29  (14.5%)
##      1 195  (97.5%) 188  (94.0%) 180  (90.0%) 173  (86.5%) 171  (85.5%)
##  Total 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%)

Now let us create a figure to show the mean change in HbA1c from baseline by visit, treatment group, and whether patients have experienced the intercurrent event of treatment discontinuation.
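A sketch of how such a figure could be produced with ggplot2, under the same column-name assumptions as above plus hba1cChg for the change from baseline:

```r
library(dplyr)
library(ggplot2)

data %>%
  group_by(group, visitn, ontrt) %>%
  summarise(meanChg = mean(hba1cChg, na.rm = TRUE), .groups = "drop") %>%
  ggplot(aes(x = visitn, y = meanChg,
             colour = group, linetype = factor(ontrt))) +
  geom_line() +
  geom_point() +
  labs(x = "Visit", y = "Mean change in HbA1c from baseline",
       colour = "Treatment group", linetype = "On treatment")
```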

4.5.2 Analysis of Data (ANCOVA)

As we have no missing data, we can perform our analysis using ANCOVA alone. Our primary estimand concerns the change in HbA1c from baseline to week 26 (visit 5), so we restrict the analysis to visitn == 5.
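The output below is consistent with a call of the following form (a sketch: the model formula matches the Call line in the output, while the dataset name and filtering step are assumptions):

```r
library(dplyr)

data %>%
  filter(visitn == 5) %>%                 # week 26 is visit 5
  lm(formula = hba1cChg ~ group + hba1cBl, data = .) %>%
  summary()
```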

## 
## Call:
## lm(formula = hba1cChg ~ group + hba1cBl, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1306 -0.7002 -0.0591  0.7915  3.2440 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.94031    0.63404   9.369  < 2e-16 ***
## grouptrt    -0.68208    0.10915  -6.249 1.07e-09 ***
## hba1cBl     -0.76454    0.07976  -9.585  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.092 on 397 degrees of freedom
## Multiple R-squared:  0.2482, Adjusted R-squared:  0.2444 
## F-statistic: 65.52 on 2 and 397 DF,  p-value: < 2.2e-16

4.5.3 Trial with Missing Data

## $ctl
##                    dispo            1            2            3            4            5
##                Off-study   2   (1.0%)   7   (3.5%)  14   (7.0%)  24  (12.0%)  32  (16.0%)
##  Off-treatment, on-study   6   (3.0%)  11   (5.5%)  17   (8.5%)  19   (9.5%)  18   (9.0%)
##             On-treatment 192  (96.0%) 182  (91.0%) 169  (84.5%) 157  (78.5%) 150  (75.0%)
##                    Total 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%)
## 
## $trt
##                    dispo            1            2            3            4            5
##                Off-study   1   (0.5%)   4   (2.0%)   9   (4.5%)  14   (7.0%)  16   (8.0%)
##  Off-treatment, on-study   4   (2.0%)   8   (4.0%)  11   (5.5%)  13   (6.5%)  13   (6.5%)
##             On-treatment 195  (97.5%) 188  (94.0%) 180  (90.0%) 173  (86.5%) 171  (85.5%)
##                    Total 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%) 200 (100.0%)

4.5.4 Complete-Case Analysis

## 
## Call:
## lm(formula = hba1cChg ~ group + hba1cBl, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1634 -0.6974 -0.0336  0.7934  3.1896 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.86939    0.67544   8.690  < 2e-16 ***
## grouptrt    -0.74453    0.11422  -6.518 2.49e-10 ***
## hba1cBl     -0.74898    0.08605  -8.704  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.07 on 349 degrees of freedom
##   (48 observations deleted due to missingness)
## Multiple R-squared:  0.2602, Adjusted R-squared:  0.256 
## F-statistic: 61.38 on 2 and 349 DF,  p-value: < 2.2e-16

4.5.5 Multiple imputation analysis (JR - Jump to Reference)

Before fitting the imputation model, we first need to create:

  • A dataset showing the visit when the intercurrent event occurred for each patient.
  • A list of the key variables to be used in the imputation step.

Note that in reality we never observe the missing data. Therefore, we would not be able to decide on the imputation strategy based on the data. Instead this should be based on clinically plausible assumptions depending on the therapeutic setting, disease characteristics and treatment mechanism.

Try running some different analyses on data_missing using different assumptions for the missing data. The strategies available in rbmi are:

  • MAR - Missing At Random
  • JR - Jump to Reference
  • CR - Copy Reference
  • CIR - Copy Increments in Reference
  • LMCF - Last Mean Carried Forward

Step 1: Fit the imputation model to the observed data

## 
## SAMPLING FOR MODEL 'MMRM' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.001639 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 16.39 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 1700 [  0%]  (Warmup)
## Chain 1: Iteration:  170 / 1700 [ 10%]  (Warmup)
## Chain 1: Iteration:  201 / 1700 [ 11%]  (Sampling)
## Chain 1: Iteration:  370 / 1700 [ 21%]  (Sampling)
## Chain 1: Iteration:  540 / 1700 [ 31%]  (Sampling)
## Chain 1: Iteration:  710 / 1700 [ 41%]  (Sampling)
## Chain 1: Iteration:  880 / 1700 [ 51%]  (Sampling)
## Chain 1: Iteration: 1050 / 1700 [ 61%]  (Sampling)
## Chain 1: Iteration: 1220 / 1700 [ 71%]  (Sampling)
## Chain 1: Iteration: 1390 / 1700 [ 81%]  (Sampling)
## Chain 1: Iteration: 1560 / 1700 [ 91%]  (Sampling)
## Chain 1: Iteration: 1700 / 1700 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 3.975 seconds (Warm-up)
## Chain 1:                17.813 seconds (Sampling)
## Chain 1:                21.788 seconds (Total)
## Chain 1:

Step 2: Impute the missing data using the imputation model multiple times

Step 3: Analyse each complete dataset

Step 4: Combine the results to obtain the overall point estimate and its variance
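Putting the four steps together, here is a sketch of the rbmi workflow that would produce a pool object like the one below. The dataset names (data_missing, data_ice) and column names are assumptions; method_bayes(n_samples = 250) matches the 250 combined results shown.

```r
library(rbmi)

# Variable roles for the imputation and analysis models (names assumed)
vars <- set_vars(
  subjid     = "id",
  visit      = "visitn",
  outcome    = "hba1cChg",
  group      = "group",
  covariates = c("hba1cBl")
)

# Step 1: fit the Bayesian imputation model to the observed data; data_ice
# gives, per patient, the first visit affected by the ICE and the imputation
# strategy (here "JR" for jump to reference)
drawsObj <- draws(
  data     = data_missing,
  data_ice = data_ice,
  vars     = vars,
  method   = method_bayes(n_samples = 250)
)

# Step 2: impute the missing data, using the control arm as the reference
imputeObj <- impute(drawsObj, references = c(ctl = "ctl", trt = "ctl"))

# Step 3: analyse each completed dataset with ANCOVA
anaObj <- analyse(imputeObj, fun = ancova, vars = vars)

# Step 4: pool the results with Rubin's rules
pool(anaObj)
```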

## 
## Pool Object
## -----------
## Number of Results Combined: 250
## Method: rubin
## Confidence Level: 0.95
## Alternative: two.sided
## 
## Results:
## 
##   ==================================================
##    parameter   est     se     lci     uci     pval  
##   --------------------------------------------------
##      trt_1    -0.287  0.053  -0.392  -0.183  <0.001 
##    lsm_ref_1  -0.073  0.038  -0.147  0.001   0.055  
##    lsm_alt_1  -0.36   0.038  -0.434  -0.286  <0.001 
##      trt_2    -0.691  0.074  -0.836  -0.547  <0.001 
##    lsm_ref_2  -0.052  0.052  -0.155   0.05   0.318  
##    lsm_alt_2  -0.744  0.052  -0.846  -0.641  <0.001 
##      trt_3    -0.703  0.091  -0.882  -0.524  <0.001 
##    lsm_ref_3  -0.08   0.065  -0.207  0.048    0.22  
##    lsm_alt_3  -0.783  0.064  -0.909  -0.657  <0.001 
##      trt_4    -0.703  0.104  -0.908  -0.499  <0.001 
##    lsm_ref_4  -0.055  0.075  -0.202  0.091   0.458  
##    lsm_alt_4  -0.759  0.074  -0.903  -0.614  <0.001 
##      trt_5    -0.716  0.116  -0.943  -0.488  <0.001 
##    lsm_ref_5  0.009   0.083  -0.155  0.173   0.916  
##    lsm_alt_5  -0.707  0.081  -0.866  -0.548  <0.001 
##   --------------------------------------------------

4.5.6 Retrieved-dropout models

The practical above has only covered reference-based imputation methods. An alternative approach is to use retrieved-dropout models. The link below contains a vignette showing how the rbmi package can also be used to implement this approach.

https://insightsengineering.github.io/rbmi/main/articles/retrieved_dropout.html

5 Analysis of Hypothetical Estimands

5.1 Estimation for hypothetical estimands

5.2 Prediction of Hypothetical Trajectories

  • Explicit or implicit predictions of hypothetical trajectories: This refers to making predictions about what might happen under various hypothetical scenarios, using either explicitly stated models or assumptions.
  • Assumptions for the predictions: Assumptions must align with the hypothetical scenarios, influencing the model’s design and expected outcomes.

Notation and Study Design

  • Randomized Treatment (Z): Indicates whether a participant received the treatment (1) or was in the control group (0).
  • Intercurrent Event Indicator (Eᵢ): Shows whether an intercurrent event (ICE) has occurred by each visit (1 if occurred, 0 if not); once an intercurrent event occurs, it is assumed to persist for the remainder of the study.
  • Outcome Variable (Y): The observed change in HbA1c from baseline at week 26.
  • Potential Outcome (Yᵢ): Hypothetical outcome without any intercurrent event or under different treatment conditions.
  • Estimand of interest: The difference in the outcome between treatment and control groups, assuming no intercurrent events.
  • Covariates (X₀, Xᵢ): Baseline and subsequent covariates that might affect the outcome, measured throughout the study.

  • Z influences E₁: Treatment can affect the likelihood of an intercurrent event.
  • Eᵢ influences Eᵢ₊₁: Indicates a cascade effect where an intercurrent event at one point increases the likelihood of another in the future.
  • X₀ → X₁ → X₂ → … → X₄: Represents changes or measurements of covariates (like HbA1c) over time.
  • Arrows into Y: Shows that all these factors, including treatment, intercurrent events, and covariates, influence the final outcome of HbA1c levels.

Treatment Policy

  • Green Pathways (Estimated Treatment Effect): These represent the direct and indirect effects of the treatment (Z) on the outcome (Y). The treatment affects each point in time where HbA1c is measured (X1 through X4) and can influence intercurrent events (E1 through E5), which in turn can affect subsequent measurements and the final outcome.
  • This diagram shows all possible impacts of the treatment throughout the course of the study, including its potential to affect the occurrence of intercurrent events, which are particularly critical in clinical trials.

Hypothetical Estimand

  • Green Pathways (Estimated Treatment Effect): These lines show the direct influence of treatment on the outcome (Y) and intermediate HbA1c measurements (X1 to X4), assuming no intercurrent events (E1 to E5) affected the outcomes. This hypothetical estimand aims to estimate what the treatment effect would be in an ideal scenario where no intercurrent events alter the course of treatment.
  • Red Pathways (Biasing path): These paths highlight potential sources of bias if the intercurrent events were ignored in the analysis. They show how each intercurrent event (E1 to E5) could influence the HbA1c measurements (X2 to X4), potentially confounding the true treatment effect.

Key Points

  • Treatment Policy Estimand: This considers all real-world effects, including intercurrent events, providing a comprehensive view of treatment effectiveness.
  • Hypothetical Estimand: By ignoring intercurrent events, this focuses on the direct effect of the treatment under idealized conditions, useful for understanding the intrinsic efficacy of the treatment.

Concept of Time-dependent Confounding:

  • Time-dependent Confounders (X₁-X₄): These are variables that:
    1. Are affected by previous treatment.
    2. Influence the probability of future treatment.
    3. Impact the outcome of interest (Y).

In this study, the levels of HbA1c measured at different times (X₁ through X₄) are time-dependent confounders because each measurement can influence and be influenced by treatment decisions and outcomes.

The challenge lies in adjusting for these confounders without inadvertently blocking the pathway through which the treatment effect is transmitted. This is a key concern in causal inference:

  • Blocking the Effect Pathway: Adjusting for X₁-X₄ directly could block part of the treatment effect, since these confounders are also intermediaries of the treatment effect.

Directed Acyclic Graphs (DAGs) Analysis - DAG (a):

  • Green Paths: Show the estimated treatment effect paths which demonstrate how treatment (Z) potentially affects the final HbA1c measurement (Y) through intermediate measurements (X₁-X₄) and intercurrent events (E₁-E₅).
  • Red Paths: Illustrate the biasing paths where confounders (X₁-X₄) and intercurrent events (E₁-E₅) may misrepresent the true treatment effect if not properly accounted for.

Directed Acyclic Graphs (DAGs) Analysis - DAG (b):

  • Simplified Representation: Focuses on paths that are purely related to treatment effects, ignoring paths through time-dependent confounders to avoid the bias introduced by adjusting for these confounders.

Desired Comparison:

  • Y(Z = 1, E₁-₄ = 0) vs Y(Z = 0, E₁-₄ = 0): This comparison aims to measure the treatment effect assuming no intercurrent events have occurred to purely see the effect of the treatment.

5.3 Methods for Estimating Hypothetical Estimands

5.3.1 Multiple Imputation (MI)

Multiple Imputation for Hypothetical Estimands:

  • Purpose: In clinical trials, especially those with longitudinal measurements, data after an ICE are often not considered relevant for the hypothetical estimand (the outcome that would have been observed had the ICE not occurred). This can be thought of as a missing data problem.
  • Method: MI treats all post-ICE data as missing. For example, if an intercurrent event \(E_1\) occurs, then subsequent measurements \(X_1, X_2, X_3, X_4\) and the final outcome \(Y\) are treated as missing. The imputation model estimates these missing values from other available data, under the assumption that the missingness is related to observed data but not to the missing data itself.
  • Assumptions: A key assumption in this context might be Missing At Random (MAR), where the likelihood of missing data depends only on observed data.

Use of MAR in Predicting Hypothetical Trajectories:

  • Definition: Under MAR, the missingness of data is related only to the observed data, not to any unobserved data.
  • Application: In hypothetical estimands, assuming MAR suggests that the data missing due to ICEs can be imputed based on the observed characteristics and responses of similar patients who did not experience the ICEs.
  • Feasibility: Whether MAR is a sensible assumption depends on the specific characteristics of the intercurrent event and the study design. If the ICEs are believed to be random with respect to future outcomes after controlling for observed data, MAR can be a reasonable assumption. However, if ICEs are related to unobserved future outcomes or unmeasured confounders, then MAR would not be appropriate, and a more sophisticated method like MNAR (Missing Not At Random) might be needed.

5.3.2 Inverse Probability Weighting (IPW)

Inverse Probability Weighting for Hypothetical Estimands:

  • Purpose: IPW is used to adjust for the non-random occurrence of intercurrent events by modeling the process leading to these events.
  • Method: Each participant’s data is weighted by the inverse of the probability of their observed treatment path, given their covariates and previous treatment history. This approach aims to create a pseudo-population where the occurrence of intercurrent events is independent of treatment, mimicking the condition of no ICEs.

Problem Setup:

  • Confounder (\(X_i\)): This is a variable that influences both the outcome (\(Y\)) and the likelihood of an intercurrent event (\(E_{i+1}\)).
  • Intercurrent Event (\(E_i\)): Events that can affect the continuation or outcome of the treatment.

IPW Idea:

  • Upweighting: In the presence of intercurrent events that might skew the observed outcome, IPW adjusts the influence of each individual’s data based on their probability of not experiencing the intercurrent event, given their confounders. This adjustment helps in maintaining a balanced representation of all groups within the study.
  • Creating a Pseudo-Population: IPW adjusts the dataset to create a “pseudo-population” in which the distribution of individuals who did and did not experience intercurrent events is balanced as if these events were independent of the measured confounders. This adjustment helps to mitigate the effect of confounders that are linked to the likelihood of experiencing an intercurrent event.

Imagine a study with two baseline groups differentiated by color (blue/green), which represent different levels or types of a confounder \(X\). Suppose that the probability of not having an intercurrent event \(E=0\) given the confounder blue is \(P(E=0|blue) = 0.5\) (i.e., 50%). If in the actual study, fewer blue individuals did not experience the event compared to green, then each blue individual who did not experience the event might be weighted more heavily (e.g., a weight of 2) to represent not only themselves but also those blue individuals who did have the event, thus simulating a scenario where the intercurrent event is independent of being blue or green.

  • Weights (\(w_k\)): Each subject \(k\) receives a weight calculated as the inverse of the probability of being free from the intercurrent event, conditioned on their treatment status \(Z\) and confounders \(X\). Mathematically, this is expressed as: \[ w_k = \frac{1}{P(E_k = 0|Z_k, X_k)} \] This weight is used to adjust their contribution to the analysis, effectively increasing the influence of underrepresented scenarios within the observed data.

IPW for Hypothetical Estimand:

In estimating a hypothetical estimand (where we hypothesize the outcome had the ICEs not occurred), IPW helps to simulate a dataset where:

  • ICEs are absent: It weights the data so that the analysis can proceed as if the ICEs did not occur.
  • Independence from confounders \(X\): It also ensures that this simulated dataset is independent of the distribution of the confounder \(X\), making the analysis robust against confounding bias due to \(X\).

This method is crucial for ensuring that estimates of treatment effects or exposures are unbiased by confounders or selection mechanisms related to intercurrent events, providing a clearer picture of the causal effects of interest.

5.3.3 G-Computation

  • Purpose: G-computation is a statistical technique used to estimate the effect of a treatment or exposure in the presence of confounders.
  • Method: It involves modeling the outcome as a function of treatment and confounders. In the context of hypothetical estimands, it can sometimes be equivalent to multiple imputation, depending on how the outcome model is specified and used.

5.3.4 Advanced Methods

  • Augmented IPW and Targeted Maximum Likelihood Estimation (TMLE): These methods combine the strengths of IPW and outcome modeling to produce more efficient and less biased estimates.
  • G-Estimation: This method is specifically designed for estimating the effects of time-varying treatments in the presence of time-varying confounders that are also affected by past treatment.

5.4 Time-dependent Intercurrent Event Occurrence

Objective: We aim to estimate the probability that a patient does not experience the intercurrent event at any time during the study. This is crucial for properly weighting each subject in the analysis to account for these events.

Calculation of Probabilities

Formula for Probability: The probability that individual \(k\) remains free of the intercurrent event through visit 5 is the product of the conditional probabilities of not experiencing the event at each visit:

\[ \prod_{i=1}^{5} P(E_{i,k} = 0 \mid Z = z_k, E_{1,k} = \cdots = E_{i-1,k} = 0, X_{1,k} = x_{1,k}, \ldots, X_{i-1,k} = x_{i-1,k}) \]

Here:

  • \(E_{i,k}\): intercurrent event indicator at visit \(i\) for individual \(k\).
  • \(Z = z_k\): treatment assignment for individual \(k\).
  • \(E_{1,k} = \cdots = E_{i-1,k} = 0\): no intercurrent event has occurred up to the previous visit.
  • \(X_{1,k} = x_{1,k}, \ldots, X_{i-1,k} = x_{i-1,k}\): covariate history through the previous visit for individual \(k\).

Intuition: Each factor in the product adjusts for the history of treatment and intercurrent events, along with the observed covariates, making the probability specific to the pathway that individual \(k\) has actually followed.

Subject Weights Calculation

Weight Formula: Each subject \(k\) receives a weight calculated as:

\[ w_k = \frac{1}{\prod_{i=1}^{5} P(E_{i,k} = 0 \mid Z = z_k, E_{1,k} = \cdots = E_{i-1,k} = 0, X_{1,k} = x_{1,k}, \ldots, X_{i-1,k} = x_{i-1,k})} \]

Purpose of Weights: These weights are used to create a weighted sample (pseudo-population) in which the occurrence of intercurrent events is statistically independent of the observed covariates and treatment assignment. This adjustment is necessary to estimate the effect of the treatment under the hypothetical scenario where no intercurrent events occur.

This approach is particularly important in longitudinal studies where events occurring after baseline can affect the treatment and subsequent outcomes. By adjusting the contribution of each participant’s data based on the likelihood of remaining event-free, IPW helps to reduce bias in estimating treatment effects, providing a clearer picture of the treatment’s potential impact under ideal conditions.

5.5 Estimation of weight and treatment effect

5.5.1 Estimation of Weight

Methods:

  1. Nonparametric Methods:
    • Example: Calculate the sample proportion within each stratum of \(X\). This approach does not assume any specific form of the relationship between the covariates and the probability of the intercurrent event \(E\). It’s straightforward but may not be practical with continuous or high-dimensional covariates due to the “curse of dimensionality.”
  2. Parametric Methods:
    • Example: Use logistic regression to model the probability of \(E\) given covariates \(X\) and treatment \(Z\). This approach allows for more efficient estimation in the presence of multiple or continuous covariates and can provide better insights into how specific variables influence the probability of experiencing an intercurrent event.
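A sketch of the parametric approach in R, assuming a long-format data frame data_long with one row per patient-visit and columns id, visitn, group, x_prev (covariate value at the previous visit), and ice (1 once the intercurrent event has occurred); all names are assumptions.

```r
library(dplyr)

# Keep only patient-visits still at risk (no ICE at any earlier visit)
at_risk <- data_long %>%
  group_by(id) %>%
  arrange(visitn, .by_group = TRUE) %>%
  mutate(prior_ice = lag(cummax(ice), default = 0)) %>%
  ungroup() %>%
  filter(prior_ice == 0)

# Visit-specific logistic model for remaining event-free:
# P(E_i = 0 | Z, X_{i-1}), with visit entering as a factor
fit <- glm(I(ice == 0) ~ factor(visitn) * group + x_prev,
           family = binomial(), data = at_risk)

# Per-patient weight: inverse of the product of the visit-specific
# probabilities of remaining event-free (used in the analysis for
# patients who in fact remain event-free throughout)
weights_df <- at_risk %>%
  mutate(p_free = predict(fit, type = "response")) %>%
  group_by(id) %>%
  summarise(w = 1 / prod(p_free), .groups = "drop")
```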

5.5.2 Estimation of Treatment Effect

Weighted Sample Mean: Uses the weights derived above (potentially from one of the methods in Section 5.5.1) to calculate a mean that reflects a population where the treatment assignment \(Z\) is independent of the potential outcomes under no intercurrent events.

Hájek Estimator: A specific estimator of the mean that normalizes the weighted sums by the sum of the weights, helping to stabilize estimates, especially in smaller samples or unbalanced designs:

\[ \frac{\sum_{k=1}^n (1-E_k)Z_kY_kw_k}{\sum_{k=1}^n (1-E_k)Z_kw_k} - \frac{\sum_{k=1}^n (1-E_k)(1-Z_k)Y_kw_k}{\sum_{k=1}^n (1-E_k)(1-Z_k)w_k} \]

This formula calculates the weighted average of the outcome \(Y\) separately for the treated and control groups among subjects without an intercurrent event, then takes their difference.
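As a minimal sketch in R (assuming a data frame dat with outcome y, treatment indicator z coded 0/1, event indicator e, and weights w):

```r
# Hájek estimate of the treatment difference among ICE-free patients (e == 0)
hajek <- with(subset(dat, e == 0), {
  mu1 <- sum(z * y * w) / sum(z * w)              # weighted mean, treated
  mu0 <- sum((1 - z) * y * w) / sum((1 - z) * w)  # weighted mean, control
  mu1 - mu0
})
```

The same point estimate is obtained from a weighted least squares regression of y on z restricted to e = 0 (e.g., lm(y ~ z, data = subset(dat, e == 0), weights = w)), which is the marginal structural model formulation described next.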

Weighted Least Squares Regression: Uses the weights in a regression model of \(Y\) on \(Z\) for the cases with \(E = 0\), treating it as a marginal structural model. This provides a way to estimate causal effects while accounting for time-dependent confounders through the weights.

Inference and Confidence Intervals (CIs):

  • Robust Sandwich Variance Estimator: Used to calculate confidence intervals that are robust to misspecification of the model. While valid, these CIs might be conservative (i.e., wider than necessary), potentially overestimating the uncertainty.
  • Non-parametric Bootstrap: Often used to validate CIs by resampling the data with replacement and recalculating the estimator many times. This can provide a more accurate representation of the uncertainty if the original CI assumptions are violated or the sample size is small.

5.6 Issue of Large Weights in IPW

  1. Positivity Assumption: This is crucial for IPW. If the propensity score (probability of receiving treatment given covariates) for any individual is exactly 0 or 1, it leads to infinite weights, which are problematic.
  2. Consequence of Extreme Propensity Scores: Propensity scores close to 0 or 1 result in very large weights for those individuals, making the estimates less precise and potentially unstable. This usually occurs when an individual’s characteristics (covariates) make them highly unlikely or likely to receive treatment compared to the rest of the sample.

How to Handle Large Weights

  1. Investigate the Cause:
    • Identify Outliers: Determine who these individuals with large weights are and investigate whether they represent a combination of covariate values that are rare within the dataset.
    • Understand Their Impact: Analyze whether these individuals have extreme values for some confounders which might be influencing the propensity score significantly.
  2. Use Stabilized Weights:
    • Method: Replace the original weights by the ratio of the marginal probability of receiving treatment to the conditional probability given the covariates (equivalently, multiply each weight by the marginal probability). This typically reduces the variance of the weights without introducing bias (see the sketch after this list).
    • Reference: Cole & Hernán (2008) provide a detailed discussion of this technique.
  3. Trim Weights:
    • Approach: Remove or cap individuals with weights beyond a certain threshold (e.g., greater than \(w_0\)).
    • Effect: This method changes the target population of the inference because it effectively excludes or down-weights the most extreme cases.
  4. Truncate Weights:
    • Method: Set weights that are smaller than the \(p\%\) quantile to the \(p\%\) quantile and similarly for weights larger than the \((1-p)\%\) quantile.
    • Trade-off: Truncating weights introduces some bias (because it alters the actual contribution of each observation based on its probability of treatment) but reduces variance, which can lead to more stable estimates. This is a classic bias-variance trade-off scenario.
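To make the stabilization and truncation remedies concrete, here is a minimal sketch reusing the hypothetical objects (dat, p_event) from the weight-estimation sketch in Section 5.5.1, with the probability of remaining event-free playing the role of the propensity score. The 1%/99% quantile cutoffs are illustrative.

```r
# Stabilized weights: marginal probability of remaining event-free divided by
# the modelled conditional probability (Cole & Hernán, 2008)
p_marg <- mean(dat$E == 0)
dat$sw <- p_marg / (1 - p_event)

# Truncate at the 1% and 99% quantiles: a bias-variance trade-off
q       <- quantile(dat$sw, probs = c(0.01, 0.99))
dat$swt <- pmin(pmax(dat$sw, q[1]), q[2])
```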

5.7 Analysis of Hypothetical Estimands

In the scenario below, complete-case analysis yields a smaller treatment effect estimate, highlighting that biases due to confounding can vary in direction. Although Multiple Imputation (MI) and Inverse Probability Weighting (IPW) show similar standard errors (SEs) in this specific instance with only one post-baseline covariate, MI typically produces smaller SEs than IPW. However, the reduced SE with MI comes at the expense of more extensive parametric assumptions, indicating a trade-off between statistical precision and reliance on model-based assumptions.

5.7.1 Review Data

Load the dataset DiabetesExampleData_wide_noICE.rds, which contains the ideal trial in which no intercurrent events were observed. The dataset is contextually the same as those used previously in the Treatment Policy Estimands Practical, except that for the analyses in this worksheet we will primarily work with data in wide, rather than long, format.

5.7.2 Analysis of Data (ANCOVA)
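A plausible reconstruction of the analysis code that produces the output below (the Call line, with its data = ., implies a dplyr pipe from the loaded dataset):

```r
library(dplyr)

# Load the no-ICE dataset and fit the ANCOVA of change in HbA1c at visit 5
# on treatment group, adjusting for baseline HbA1c
readRDS("DiabetesExampleData_wide_noICE.rds") %>%
  lm(formula = hba1cChg_5 ~ group + hba1cBl, data = .) %>%
  summary()
```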

## 
## Call:
## lm(formula = hba1cChg_5 ~ group + hba1cBl, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.2518 -0.6455 -0.0299  0.7744  3.1562 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.28553    0.63645   8.305 1.59e-15 ***
## grouptrt    -0.80323    0.10957  -7.331 1.29e-12 ***
## hba1cBl     -0.66254    0.08007  -8.275 1.97e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.096 on 397 degrees of freedom
## Multiple R-squared:  0.2356, Adjusted R-squared:  0.2317 
## F-statistic: 61.17 on 2 and 397 DF,  p-value: < 2.2e-16

5.7.3 Trial where some participants did not adhere to randomized treatment

Now we will move onto the more realistic example, where some patients did not adhere to their randomized treatment. We are interested in the hypothetical estimand as if all patients adhered to their randomized treatment.

             ctl (N=200)    trt (N=200)    Overall (N=400)
ontrt_1
  0          8 (4.0%)       5 (2.5%)       13 (3.3%)
  1          192 (96.0%)    195 (97.5%)    387 (96.8%)
ontrt_2
  0          18 (9.0%)      12 (6.0%)      30 (7.5%)
  1          182 (91.0%)    188 (94.0%)    370 (92.5%)
ontrt_3
  0          31 (15.5%)     20 (10.0%)     51 (12.8%)
  1          169 (84.5%)    180 (90.0%)    349 (87.3%)
ontrt_4
  0          43 (21.5%)     27 (13.5%)     70 (17.5%)
  1          157 (78.5%)    173 (86.5%)    330 (82.5%)
ontrt_5
  0          50 (25.0%)     29 (14.5%)     79 (19.8%)
  1          150 (75.0%)    171 (85.5%)    321 (80.3%)

5.7.4 Analysis of Patients without Intercurrent Event

Let us first perform the analysis dropping any patients who did not adhere to treatment, i.e., only using the observed outcome data in the patients who did adhere to their randomized treatment until the end of follow-up. This analysis will only provide an unbiased estimate of the treatment effect under the assumption that treatment non-adherence occurred completely at random in both treatment arms.
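A sketch of this complete-case analysis, assuming the dataset with intercurrent events is loaded as dat_ice (a hypothetical name used throughout the sketches below):

```r
library(dplyr)

# In dat_ice, HbA1c is missing after treatment non-adherence; lm() drops
# patients with a missing outcome, which yields exactly the complete-case
# (adherers-only) analysis reported below
dat_ice %>%
  lm(formula = hba1cChg_5 ~ group + hba1cBl, data = .) %>%
  summary()
```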

## 
## Call:
## lm(formula = hba1cChg_5 ~ group + hba1cBl, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1646 -0.7189 -0.0195  0.8072  3.1928 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.99485    0.72999   8.212 5.51e-15 ***
## grouptrt    -0.73757    0.12147  -6.072 3.60e-09 ***
## hba1cBl     -0.76496    0.09408  -8.131 9.60e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.081 on 318 degrees of freedom
##   (79 observations deleted due to missingness)
## Multiple R-squared:  0.2629, Adjusted R-squared:  0.2582 
## F-statistic:  56.7 on 2 and 318 DF,  p-value: < 2.2e-16

5.7.5 Analysis using Multiple imputation

Now, let us move onto more principled analyses, starting with a multiple imputation approach. Here, for the patients who experience the intercurrent event of treatment non-adherence, we aim to impute what their values of HbA1c would have been if they had not experienced the intercurrent event by modelling the hypothetical future trajectory based on their past trajectory and the trajectories of similar patients.

As soon as a patient experiences the intercurrent event of treatment non-adherence, all future values of HbA1c will be missing. Therefore, we have a monotone missingness pattern, i.e., if a patient discontinues treatment at visit 1, then HbA1c will be missing from visit 1 to visit 5, whereas if a patient discontinues treatment at visit 4, then HbA1c will only be missing for visits 4 and 5.

The code below performs a sequential imputation, starting with HbA1c at visit 1 and proceeding visit by visit until HbA1c at visit 5 is imputed. Each imputation uses Bayesian linear regression (method = "norm") including treatment and all previous values of HbA1c.

We then fit our ANCOVA model to each imputed dataset, and finally pool the results of each imputation using Rubin’s rules to obtain a point estimate and its standard error.
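A hedged sketch of this workflow with the mice package; the visit-level variable names are assumptions about the wide dataset, and the number of imputations is illustrative:

```r
library(mice)

# Sequential (monotone) imputation: visit 1 is imputed first, then visit 2,
# and so on; method = "norm" is Bayesian linear regression. The default
# predictor matrix may need restricting so that each visit is imputed from
# treatment, baseline, and earlier visits only.
imp <- mice(dat_ice,
            method        = "norm",
            visitSequence = "monotone",
            m             = 50,       # illustrative number of imputations
            seed          = 1234)

# Fit the ANCOVA model to each imputed dataset (assuming the change score
# hba1cChg_5 is available in the imputed data) and pool with Rubin's rules
fits <- with(imp, lm(hba1cChg_5 ~ group + hba1cBl))
summary(pool(fits))
```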

5.7.6 Analysis using Inverse Probability Weighting (IPW)

Another way to perform this analysis is to use inverse probability weighting. In this approach, we only use the outcome data from the patients who did not experience the intercurrent event, but we up-weight these patients so that they also represent what we estimate would have been observed in similar patients who did experience the intercurrent event.

The weights are given by the inverse of the propensity score; here this is the probability of remaining free of the intercurrent event at each visit. We start by calculating the weights to account for patients who experienced the intercurrent event at visit 1.
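A sketch of the visit-1 weight calculation; the model covariates (treatment group and baseline HbA1c) are assumptions:

```r
# Probability of remaining on treatment at visit 1, by logistic regression
ps1 <- glm(ontrt_1 ~ group + hba1cBl, family = binomial, data = dat_ice)

# Weight = 1 / P(ontrt_1 = 1 | covariates) for patients still on treatment;
# patients who have already experienced the ICE receive no weight
dat_ice$w <- ifelse(dat_ice$ontrt_1 == 1,
                    1 / predict(ps1, type = "response"),
                    NA)
```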

                      ctl (N=192)        trt (N=195)        Overall (N=387)
weights
  Mean (SD)           1.04 (0.0615)      1.03 (0.0284)      1.03 (0.0484)
  Median [Min, Max]   1.02 [1.00, 1.51]  1.02 [1.00, 1.16]  1.02 [1.00, 1.51]
  Sum                 200                200                400

Our original sample size was 400 patients (200 in each arm). At visit 1, 8 patients had experienced the intercurrent event in the control arm, and 5 patients in the treatment arm. This reduced our sample size to 192 and 195 respectively, but through up-weighting patients who were similar to those who experienced the intercurrent event, we have maintained an effective sample size of 200 per arm.

Next we continue this process sequentially from visit 2 up to the end of follow-up.
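Continuing the same logic, a sketch of the sequential update through visit 5; each model is fitted among patients still on treatment at the previous visit, conditioning on the (assumed) post-baseline HbA1c variable names hba1c_1, ..., hba1c_4:

```r
for (v in 2:5) {
  at_risk <- dat_ice[[paste0("ontrt_", v - 1)]] == 1

  # P(still on treatment at visit v | treatment, baseline, HbA1c history)
  form <- reformulate(c("group", "hba1cBl", paste0("hba1c_", 1:(v - 1))),
                      response = paste0("ontrt_", v))
  psv  <- glm(form, family = binomial, data = dat_ice[at_risk, ])

  # Multiply in the inverse of the visit-v probability for those still at risk
  p <- predict(psv, newdata = dat_ice[at_risk, ], type = "response")
  dat_ice$w[at_risk] <- dat_ice$w[at_risk] / p
}

# Only patients who remained on treatment through visit 5 keep a weight
dat_ice$w[dat_ice$ontrt_5 == 0] <- NA
```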

                      ctl (N=150)        trt (N=171)        Overall (N=321)
weights
  Mean (SD)           1.36 (1.64)        1.18 (0.243)       1.26 (1.14)
  Median [Min, Max]   1.11 [1.00, 20.9]  1.11 [1.01, 3.41]  1.11 [1.00, 20.9]
  Sum                 203                202                405

It is always a good idea to check the weights, because extreme weights can highlight potential violations of the positivity and/or modelling assumptions.

In the example above, the largest weight is almost 21. This means that the data of this one patient is now being used to represent around twenty similar patients who did not adhere to the randomized treatment. We could read this as suggesting that patients with these characteristics have only a 1/21 ≈ 5% probability of adhering to their randomized treatment. This is perhaps not so extreme, and suggests that the positivity assumption holds in this case. However, it is not uncommon to see examples with weights >100, meaning the patient had less than a 1% probability of not experiencing the intercurrent event. Although this would still strictly meet the positivity assumption that the probability is >0 and <1, the variance of the estimates will increase substantially as the weights become more extreme.

The sum of the weights should equal the original sample size in each treatment arm. Here we see the sum of the weights in each arm is not quite equal to 200, but it is close. There is no strict limit on what can be considered ‘close enough’, but if the sum of the weights differs significantly from the original sample size it can suggest issues with the modelling of the propensity score.

Finally, we can check the results by fitting a weighted linear regression. The variance is estimated using a “robust” (Huber-White) sandwich estimator to account for the weighting.
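A sketch of the assumed weighted regression with the sandwich variance, using the lmtest and sandwich packages:

```r
library(lmtest)
library(sandwich)

# Weighted ANCOVA among patients who remained on treatment through visit 5
fit_ipw <- lm(hba1cChg_5 ~ group + hba1cBl,
              data    = subset(dat_ice, ontrt_5 == 1),
              weights = w)

# Huber-White sandwich variance to account for the weighting
coeftest(fit_ipw, vcov = vcovHC(fit_ipw, type = "HC0"))
```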

## 
## t test of coefficients:
## 
##              Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)  5.201939   0.625226  8.3201 2.619e-15 ***
## grouptrt    -0.787932   0.126312 -6.2380 1.412e-09 ***
## hba1cBl     -0.650241   0.078175 -8.3178 2.661e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.7.7 IPW (bootstrap SE)

Note that the robust variance estimation used above treats the propensity score as known rather than estimated. This leads to a conservative (overestimated) variance. More accurate variance estimates can instead be obtained via a non-parametric bootstrap procedure.
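A hedged sketch of the bootstrap; ipw_estimate() is a hypothetical wrapper that re-estimates the weights and refits the weighted ANCOVA on a resampled dataset, returning the treatment coefficient:

```r
set.seed(1234)

boot_est <- replicate(1000, {
  idx <- sample(nrow(dat_ice), replace = TRUE)  # resample patients with replacement
  ipw_estimate(dat_ice[idx, ])                  # hypothetical wrapper (see lead-in)
})

sd(boot_est)  # bootstrap standard error of the IPW treatment effect
```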

## [1] 0.1227755

6 Case Study #1 Continuous Endpoint

6.1 Study Introduction

Case Study Overview:

This case study provides guidance on the definition and analysis of primary and key secondary estimands, choice of sensitivity analyses, and handling of missing data. It focuses on a Phase 3 randomized, placebo-controlled study in patients with an endocrine condition. The study spans a 24-week period and includes two continuous endpoints.

Study Design:

  • Study Type: Phase 3 randomized, placebo-controlled study
  • Population: Patients with an endocrine condition
  • Endpoints: Two continuous endpoints measured over a 24-week period
    • Co-primary endpoint 1: Blood pressure control (change in average systolic blood pressure, SBP)
    • Co-primary endpoint 2: Glycemic control (change in HbA1c)

Note: Estimands are only defined for primary and key secondary endpoints.


Primary Objectives: The primary objective of the study is to evaluate the effect of the test drug (Drug X) compared to placebo in terms of:

  1. Blood Pressure Control: Patients with uncontrolled hypertension
  2. Glycemic Control: Patients with uncontrolled blood sugar


Co-primary Endpoints:

  1. Estimand 1: Blood Pressure Control
    • Treatment: Drug X vs. Placebo
    • Target Population: Patients with the endocrine disorder who have uncontrolled hypertension.
    • Variable: Change in average systolic blood pressure (SBP) based on 24-hour ABPM from Baseline to Week 24.
    Intercurrent Events and Strategy:
    • Treatment discontinuation due to lack of efficacy: Handled using the treatment policy strategy.
    • Treatment discontinuation due to safety concerns: Handled using the treatment policy strategy.
    • Treatment discontinuation due to other reasons: Handled using the treatment policy strategy.
    • Use of rescue medication for blood pressure control: Handled using the treatment policy strategy.
    • No dose of study treatment received: Handled using the treatment policy strategy.
    • Death: Handled using the “while alive” strategy.
    Population Level Summary:
    • Analysis Method: Difference in Least Square Means (LSM) of change from Baseline to Week 24 in average SBP based on 24-hour ABPM between Drug X and Placebo.
    Text Summary: The study will evaluate the effect of Drug X on blood pressure control compared to Placebo. This evaluation will consider participants while they are alive, regardless of treatment discontinuation due to lack of efficacy or safety, failure to receive a dose of study treatment, or the use of rescue medication for blood pressure control.

  2. Estimand 2: Glycemic Control
    • Treatment: Drug X vs. Placebo (as above)
    • Target Population: Patients with the endocrine disorder who have uncontrolled blood sugar.
    • Variable: Change in HbA1c from Baseline to Week 24.
    Intercurrent Events and Strategy:
    • Treatment discontinuation due to lack of efficacy: Handled using the treatment policy strategy.
    • Treatment discontinuation due to safety concerns: Handled using the treatment policy strategy.
    • Treatment discontinuation due to other reasons: Handled using the treatment policy strategy.
    • Use of rescue medication for glycemic control: Handled using the treatment policy strategy.
    • No dose of study treatment received: Handled using the treatment policy strategy.
    • Death: Handled using the “while alive” strategy.
    Population Level Summary:
    • Analysis Method: Difference in Least Square Means (LSM) of change from Baseline to Week 24 in HbA1c between Drug X and Placebo.
    Text Summary: The study will evaluate the effect of Drug X on glycemic control compared to Placebo. This evaluation will consider participants while they are alive, regardless of treatment discontinuation due to lack of efficacy or safety, failure to receive a dose of study treatment, or the use of rescue medication for glycemic control.

6.2 Other Points Considered

Handling Participants Who Were Randomized but Did Not Meet Eligibility Criteria:

Regulators typically expect that participants who were randomized, even if they did not meet all eligibility criteria, be included in the analysis population. This inclusion is particularly important when the study uses a treatment policy strategy, which implies that the effects of the treatment should be evaluated across all patients randomized, regardless of deviations from protocol or treatment adherence. This strategy assumes that all randomized participants at least met the basic condition requirements, such as having the endocrine disorder and either uncontrolled hypertension or uncontrolled blood sugar.

In certain circumstances, exclusion of ineligible participants from the analysis population may be warranted based on specific scientific questions. However, such exclusions must be clearly justified, and regulatory authorities should be consulted for agreement on the approach. For this specific case study, there is an additional risk of randomization errors since the trial involves two patient groups that may overlap. This overlap should be considered during the protocol’s risk assessment, and appropriate mitigation steps should be planned to handle potential issues.

Handling Participants Who Did Not Receive Study Treatment:

Similarly, regulators generally expect that participants who were randomized but did not receive study treatment are still included in the analysis population, particularly when the treatment policy strategy is applied. The treatment policy strategy considers the treatment effect for all randomized participants, regardless of whether they completed the intervention according to protocol. In cases where excluding such participants is scientifically necessary, the justification should be clearly stated, and agreement sought with the regulators.

In summary, the target population is critical for defining the scope and relevance of a clinical study. It is important to align the analysis population with the estimand definition based on the study’s goals and regulatory expectations. Both the inclusion of ineligible or untreated participants in the analysis population and any deviations from this general rule need to be justified within the study’s framework and properly discussed with regulatory authorities.

Data Collection to Align with Intercurrent Event (ICE) Strategy

In clinical trials, it is crucial to collect adequate and accurate data regarding Intercurrent Events (ICEs) to properly implement the predefined strategies for handling such events. These events, which occur after treatment initiation and may influence the interpretation of treatment effects, need to be well-documented so that they can be incorporated into the analysis according to the chosen strategy (e.g., treatment policy strategy). Below are the key aspects to consider:

What data do we need to collect regarding the ICE?

To align data collection with the ICE strategy, the following information is essential:

  1. Identification of ICE:
    • Clearly document whether an ICE has occurred, based on the protocol definition. ICEs may include treatment discontinuation due to lack of efficacy, adverse events, or the use of rescue medication.
    • Investigators should have a comprehensive list of potential ICEs that could occur during the trial and precise criteria for identifying these events.
  2. Detailed Documentation:
    • Record the specific type of ICE (e.g., treatment discontinuation due to lack of efficacy, safety concerns, or the use of rescue medication).
    • Collect the date and time of the ICE occurrence to determine when the event took place in relation to the study timeline.
    • Document the reason for the ICE, including whether it was due to clinical judgment, participant choice, or another cause.
  3. Contextual Information:
    • Gather information about the participant’s status at the time of the ICE (e.g., clinical signs, symptoms, or any ongoing adverse events).
    • If relevant, capture information on any actions taken as a result of the ICE (e.g., whether rescue medication was administered or alternative treatments were introduced).

Can Lack of Efficacy Be Determined?

To determine lack of efficacy, investigators must collect data that reflects treatment outcomes in relation to the primary endpoints (e.g., blood pressure or HbA1c levels). The key elements to capture include:

  • Treatment Response Data: Track whether the patient is responding to the treatment as expected over time. This includes regular measurements of the primary endpoints (e.g., blood pressure and glycemic control).
  • Follow-up Assessments: Continue collecting endpoint data even after an ICE, especially if the treatment policy strategy is in place, to assess whether lack of efficacy might be contributing to the event.
  • Clinical Judgment: Ensure that investigators record any clinical decisions or observations that indicate lack of efficacy (e.g., worsening of the condition, or failure to achieve desired treatment thresholds).

Operational and Data Collection Issues for the Treatment Policy Strategy

When applying the treatment policy strategy, there are several operational and data collection challenges to consider:

  1. Continued Data Collection After ICE:
    • Protocol Clarity: It should be explicitly stated in the protocol that data collection must continue even after an ICE occurs. This is crucial for trials using the treatment policy strategy, which evaluates treatment effects regardless of intercurrent events.
    • Investigator Awareness: Investigators need to understand that post-ICE data collection is necessary and that stopping data collection after an ICE would violate the treatment policy strategy’s intent.
  2. Data Collection Burden:
    • Ensure that post-ICE follow-up assessments are practical for both investigators and participants. There may be logistical issues if participants discontinue the trial drug or shift to other treatments.
    • Systems should be in place to ensure that participants remain engaged and continue to provide outcome data, even if they are no longer receiving the study treatment.
  3. Maintaining Data Integrity:
    • Consistency: Data collection tools should be designed to ensure consistency in how ICEs and subsequent events are recorded.
    • Data Completeness: Investigators should be encouraged to collect complete data sets following an ICE, including all relevant outcomes, adverse events, and clinical interventions.
  4. Adjusting Analysis Plans:
    • ICEs introduce complexities in data analysis, particularly in treatment policy strategies. Proper documentation will allow the correct handling of such events during statistical analysis, ensuring that the estimands reflect the treatment effect despite the ICE.

ICE of Death

When considering death as an Intercurrent Event (ICE), even in studies where the number of deaths is anticipated to be small, it is essential to determine how these cases will be handled both in the study protocol and in the statistical analysis. Death, in many studies, impacts outcome availability and thus must often be treated as an ICE. However, when deaths are expected to be very rare or unimportant to the primary objectives of the study, there can be a decision not to treat death as an ICE, but this requires careful planning.

Is it Necessary to Consider Death as an ICE?

While death is always a potential occurrence in any study, whether or not to explicitly include it as an ICE depends on the study’s context and design. For indications where only a small number of deaths are anticipated, sponsors may opt not to treat death as an ICE, especially if the focus of the trial is on treatment efficacy rather than survival or safety. However, in these cases, it’s important to establish clear guidelines for how such participants are handled in the analysis. Simply ignoring deaths can lead to biased results if not addressed appropriately.

If death is not considered an ICE, it still impacts data availability (since no further outcome data can be collected for the deceased participants). Therefore, even in situations where death is infrequent, strategies for handling deaths in the analysis must be well thought out.

What Strategies Can Be Considered for Death as an ICE?

  1. Treatment Policy Strategy (Not Applicable): The treatment policy strategy generally assumes that data is collected and considered regardless of intercurrent events. However, in the case of death, this strategy becomes impractical because data cannot be collected post-mortem. As a result, applying the treatment policy strategy is not feasible for handling death as an ICE.

  2. Hypothetical Strategy (Not Recommended): The hypothetical strategy involves asking “what would have happened if the ICE had not occurred?” While this approach can be useful for some types of intercurrent events, it is generally considered irrelevant and not useful for death. In most cases, simulating a hypothetical scenario in which a deceased participant survives does not provide meaningful or practical insights for a clinical study, especially when the death is related to factors beyond the study’s primary objectives.

  3. Composite Endpoint Strategy (Problematic for Continuous Endpoints): In some cases, death is combined with other outcomes in a composite endpoint, where death and other serious events are considered together in one combined metric. However, this approach is problematic for continuous endpoints, such as blood pressure or glycemic control, because it is not feasible to incorporate death into the measurement of these types of outcomes. Continuous endpoints measure a change over time, which cannot be extended beyond the point of death. Defining a composite endpoint in this case is not practical and may lead to statistical complications.

  4. While Alive Strategy (Suggested Approach): The most appropriate strategy when dealing with death as an ICE for continuous endpoints is the while alive strategy. This approach excludes participants from the analysis set following their death, as it is no longer possible to measure their outcomes beyond that point.

    The “while alive” strategy essentially censors data after death, ensuring that only data collected while the participant was alive is considered. This strategy avoids the pitfalls of trying to analyze data beyond the point of death and allows for a realistic assessment of the treatment’s effect up to that point.

    • Key Considerations for the While Alive Strategy:
      • Exclude participants from analysis following their death while keeping their data prior to death.
      • Ensure proper documentation and explanation in the statistical analysis plan to avoid bias and misinterpretation of the results.
      • Provide transparent reporting on the number of deaths and how they were handled in the analysis to maintain the study’s credibility.

Study Withdrawal

In clinical trials, study withdrawal refers to situations where a participant withdraws from the study entirely, either due to personal choice, administrative reasons, or other non-treatment-related circumstances. According to the ICH E9 addendum, study withdrawal is not considered an Intercurrent Event (ICE). The addendum makes a clear distinction between intercurrent events, which affect the interpretation of the treatment effect, and other events, such as study withdrawal, which result in missing data.

Study Withdrawal vs. Intercurrent Events (ICE)

The ICH E9 addendum emphasizes that intercurrent events are to be handled by defining the estimand in a way that reflects the precise trial objective. These events are closely related to the treatment or study context, such as discontinuation due to lack of efficacy, adverse events, or the use of rescue medication. In contrast, study withdrawal results in missing data but does not directly impact the interpretation of the treatment’s effect because it is not linked to the treatment assigned.

When a participant withdraws from the study:

  • The outcome of interest remains relevant, but it becomes unobserved because the participant is no longer part of the study.
  • The withdrawal is not related to the efficacy or safety of the treatment, so it does not affect the estimand.
  • Instead, study withdrawal introduces missing data, which should be managed during the statistical analysis, typically using methods for handling missing data such as imputation, last observation carried forward (LOCF), or other appropriate techniques based on the nature of the missing data.

Handling Study Withdrawal in the Analysis

Since study withdrawal leads to missing data, it should be addressed using appropriate statistical techniques. The ICH E9 addendum suggests that while a participant may withdraw from the study, the outcome of interest still exists in principle, even though it is not observed. Thus, the trial should handle this missing data in a way that preserves the integrity of the analysis without introducing bias. Strategies for addressing missing data include:

  • Multiple Imputation: Estimating the missing outcomes based on observed data to create a complete dataset for analysis.
  • Mixed Models: Incorporating available data up to the point of withdrawal and accounting for the fact that some data is missing.
  • Sensitivity Analyses: Testing how different assumptions about the missing data (e.g., assuming the data is missing at random vs. not at random) may influence the study’s conclusions.

Keeping Participants in the Study Despite Treatment Discontinuation

The ICH E9 addendum also stresses that discontinuing study treatment should not result in study withdrawal. Participants who stop the treatment due to personal reasons, adverse effects, or lack of efficacy should continue to be followed and remain in the study. Their data should still be collected and used in the analysis. This is critical for maintaining the integrity of the study and ensuring that the treatment’s effect is accurately assessed, even in those who discontinue treatment.

In cases where study withdrawal is unavoidable, the focus shifts to managing the resulting missing data rather than interpreting it as an ICE that alters the treatment effect.

6.3 Supplementary Estimands

  • The primary estimand takes a broad perspective by considering the effect of Drug X on blood pressure control across a variety of real-world scenarios (e.g., discontinuation, use of rescue medication). It provides an overall view of how the drug performs in practice, which is highly relevant to regulators and HTA.
  • Supplementary Estimand 1 narrows the focus by exploring the effect of Drug X without the availability of rescue medications, making it more relevant for healthcare systems with limited resources.
  • Supplementary Estimand 2 focuses on the ideal use of the treatment (while participants are on the study treatment and without the need for rescue medication), which is of interest to patients and clinicians who want to understand how the drug performs when taken as planned.
  • Supplementary Estimand 3 evaluates the practical use of the treatment, considering real-world factors such as patient-initiated treatment discontinuation, and is relevant to regulators and clinicians seeking to understand how the drug works in real-world scenarios.

6.3.1 Primary Estimand (Effect of Drug X on Blood Pressure Control)

  • Research Question and Relevance: The primary estimand focuses on the overall effect of Drug X on blood pressure control in comparison to a placebo. It considers participants while they are alive, regardless of whether they discontinued treatment due to lack of efficacy or safety, failed to receive the study treatment, or required rescue medication for blood pressure control.

    Relevance: This approach is particularly relevant to regulators and Health Technology Assessments (HTA), as it provides a broad assessment of how Drug X impacts blood pressure control under real-world conditions, where patients may discontinue the treatment or use rescue medication. It enables the assessment of Drug X’s practical effectiveness in a wide range of circumstances.


Supplementary Estimand 1 (Hypothetical Approach to Rescue Medication Unavailability)

  • Research Question and Relevance: The supplementary estimand examines the effect of Drug X on blood pressure control under the hypothetical scenario where rescue medication is not available. This differs from the primary estimand, where the availability and use of rescue medication is incorporated into the treatment policy strategy.

    Relevance: This approach is relevant to healthcare systems where rescue medication is not available as part of the potential treatment plan. It allows for an analysis of the drug’s effect in settings with limited resources or where the use of additional rescue medications is not part of standard care. By isolating the effect of Drug X without the intervention of rescue medications, this estimand addresses concerns specific to resource-limited environments, which may be of particular interest to healthcare planners and policymakers.


Supplementary Estimand 2 (Effect While on Study Treatment)

  • Research Question and Relevance: The second supplementary estimand focuses on the effect of Drug X on blood pressure control while the participant is still on the study treatment and without the use of rescue medication for blood pressure control. This estimand excludes data after treatment discontinuation, as it specifically looks at what happens while the treatment is taken as planned.

    Relevance: This estimand is particularly relevant when understanding the effect of treatment when taken as prescribed is of interest, which is a key question for both patients and clinicians. By only considering the data from the period when patients were still on the treatment, it gives insight into the efficacy of Drug X when patients adhere to the prescribed regimen. This helps answer questions about the effectiveness of the drug in ideal conditions, which can influence patient adherence strategies and clinical recommendations.


Supplementary Estimand 3 (Practical Use of Drug X in Real-World Settings)

  • Research Question and Relevance: The third supplementary estimand investigates the effect of Drug X on blood pressure control as compared with placebo, considering treatment discontinuation and rescue medication use. However, it is focused on the assessment of treatment in practical settings, where patients start the treatment voluntarily and make real-world decisions about their treatment adherence.

    Relevance: This estimand is relevant for evaluating the treatment effect in practice, particularly in real-world clinical settings where patients may start or stop treatment based on their own decisions or physician recommendations. This perspective is of interest to regulators, HTA, and clinicians because it evaluates how the drug performs when it is initiated and followed in practical scenarios. It aligns with the need to understand how treatments are used in less controlled environments, thus providing valuable data on the practical implementation of the treatment.

6.4 Statistical Analysis

6.4.1 Analysis Populations

The analysis populations for the co-primary endpoints are crucial to ensure that the study’s findings are both valid and relevant to the research objectives. For this study, two co-primary endpoints are being assessed: change in average systolic blood pressure (SBP) and change in HbA1c. The analysis populations for these endpoints are defined as follows:

  1. Change from Baseline in Average SBP:
    • The Intent to Treat SBP (ITTSBP) analysis set will be used. This analysis set includes all randomized participants who had uncontrolled blood pressure at the time of study screening, ensuring that all relevant subjects are included in the analysis.
    • The analysis will be conducted based on the randomized treatment arm, meaning participants will be analyzed according to the treatment group they were assigned to at the start of the study, regardless of any subsequent events (such as discontinuation or treatment changes).
    • This analysis population will be used for both the primary estimand and the supplementary estimands, ensuring consistency across the different strategies being applied to handle intercurrent events (ICEs).
  2. Change in HbA1c:
    • Similarly, the analysis population for the co-primary endpoint of change in HbA1c includes all randomized patients with poor glycemic control at baseline. The definition of the analysis population aligns with the endpoint being studied, ensuring that the analysis focuses on participants with the relevant health condition (poor glycemic control).
    • Like the SBP analysis, the randomized treatment arm will be the basis for the analysis, maintaining consistency with the intent-to-treat principle.

Supplementary Estimands 2 and 3

For supplementary estimands 2 and 3, it is acknowledged that data from some subjects may be excluded from the analysis. This is primarily due to the different ICE handling strategies applied in these estimands. For example:

  • In Supplementary Estimand 2, the focus is on participants while they are on the study treatment. This means that once a participant discontinues the treatment, their data may no longer be included in the analysis beyond that point.
  • In Supplementary Estimand 3, the aim is to evaluate the practical use of the treatment, which may exclude certain participants based on their adherence to the treatment regimen.

However, since the ICE strategies for these supplementary estimands are clearly defined (e.g., “while on study treatment” or “while treated”), there is no need to define additional analysis sets. The analysis populations remain consistent, and the exclusion of some data based on these strategies is a natural consequence of applying the specified handling methods for intercurrent events.

6.4.2 Exploring Missing Data

In any clinical trial, missing data can present a challenge in the interpretation of the results. While it’s not always possible to directly test for the underlying mechanisms causing missing data, it is essential to explore missing data patterns as part of the statistical analysis. Understanding these patterns helps to interpret the results more accurately and determine the robustness of the conclusions drawn from the study.

For this study, the exploration of missing data should be approached systematically, and at a minimum, the following steps should be taken:

  1. Exploration of the Missing Data Mechanism Using Descriptive Approaches

The first step in understanding missing data is to explore why data may be missing. Although we can’t definitively test for the mechanism behind the missingness (e.g., Missing Completely at Random (MCAR), Missing at Random (MAR), or Missing Not at Random (MNAR)), we can use descriptive statistics to get an idea of potential patterns. This could include:

  • Summarizing the amount of missing data for each variable, such as the co-primary endpoints (e.g., SBP or HbA1c).
  • Looking at the distribution of missing data across treatment groups to determine if one group has more missing data than the other, which could signal a pattern related to the treatment.
  • Descriptive statistics such as the percentage of missing data at each visit or time point to identify trends (e.g., more missing data at later visits).

These exploratory analyses help to hypothesize whether the data might be missing due to participant dropout, protocol deviations, or other reasons that could be linked to the treatment or study procedures.

  2. Tables of Missing Data for Primary and Key Secondary Endpoints by Visit

Creating tables of missing data for the primary and key secondary endpoints, broken down by visit, provides a clear visualization of where and when the missing data occurred. These tables should:

  • List the number and percentage of missing data points for each visit.
  • Break down the missing data by treatment group, as differences between groups may point to treatment-related causes of missingness.
  • Show the timing of missing data to see if it is concentrated around certain visits (e.g., near the end of the study) or is more random throughout the study.

These tables are a straightforward way to assess whether the pattern of missing data is consistent with expectations or if it might be biased toward certain visits or treatment arms.

  3. Lists of Intercurrent Events by Visit

Since intercurrent events (ICEs) are a key factor in this study, it is important to generate lists of ICEs by visit, especially for cases where ICEs might lead to missing data. These lists will:

  • Show the types and frequencies of ICEs (e.g., treatment discontinuation, use of rescue medication, death) at each visit.
  • Highlight any links between ICEs and missing data, for example, if participants who experience an ICE are more likely to miss subsequent data collection points.
  • Differentiate between ICEs handled according to the treatment policy strategy and those handled by other strategies (e.g., hypothetical, while alive), as different strategies might affect how missing data is treated in the analysis.

By reviewing these intercurrent events by visit, the analysis can identify where the study’s outcomes may be influenced by external events, helping to interpret the overall results more effectively.

6.4.3 Primary Efficacy Analysis

The primary efficacy analysis for the co-primary endpoint of change in average systolic blood pressure (SBP) from Baseline to Week 24 will be performed using an analysis of covariance (ANCOVA). This statistical method is suitable for comparing the treatment effects on blood pressure changes while controlling for baseline differences and other covariates.

Handling of Missing Data

For missing data related to the primary endpoint caused by intercurrent events (ICEs) such as:

  • treatment discontinuation due to lack of efficacy or safety,
  • use of rescue medication, or
  • not receiving a dose of study treatment,

the analysis will use multiple imputation with a retrieved dropout approach, as described by Wang and Hu (2022). This approach aims to account for the missing data while maintaining the validity of the statistical inference about the treatment effects.

Wang, S., Hu, H. Impute the missing data using retrieved dropouts. BMC Med Res Methodol 2022, 22: 82. https://doi.org/10.1186/s12874-022-01509-9

Retrieved Dropout Subgroups

Two retrieved dropout subgroups will be defined to handle missing data:

  1. Discontinued Treatment Retrieved Dropout Subgroup:
    • This subgroup will include all participants who have discontinued treatment but still have observations at the time of the endpoint (Week 24). These participants’ data will be used to estimate the missing data following treatment discontinuation.
  2. Rescue Medication Retrieved Dropout Subgroup:
    • This subgroup includes all participants who have received rescue medication and have observations at the time of the endpoint. The data from these participants will be used to impute missing values following the use of rescue medication.

Imputation Procedure

  • Imputation Frequency: Missing values will be imputed 1,000 times using a multiple imputation procedure. This creates multiple complete datasets that account for the uncertainty associated with the missing data.

  • Separate Models for Each Treatment Group: The imputation process will be performed separately for each treatment group to reflect potential differences in the missing data mechanism between the treatment arms.

  • Linear Model for Imputation:

    • The imputation model will use a linear regression approach.
    • Covariates for the imputation model will include:
      • Baseline mean SBP, which controls for participants’ initial blood pressure values,
      • Last on-treatment visit mean SBP, which represents the most recent observed blood pressure before the treatment discontinuation or rescue medication was administered.

This imputation strategy helps to preserve the study’s statistical power by utilizing the available data from participants who either discontinued treatment or used rescue medication, while appropriately accounting for the missing data in a way that minimizes bias.
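A hedged sketch of a retrieved-dropout imputation for a single treatment arm and subgroup; all object and variable names (arm_data, had_ice, sbp_wk24, sbpBl, sbp_last) are hypothetical, and mice stands in for whatever imputation software the SAP specifies:

```r
library(mice)

# Retrieved dropouts: participants who experienced the ICE but were still
# observed at Week 24; their observed outcomes inform the imputation model
# for participants whose Week 24 value is missing
sub <- arm_data[arm_data$had_ice, c("sbp_wk24", "sbpBl", "sbp_last")]

# Bayesian linear regression imputation drawn 1,000 times; only rows with a
# missing Week 24 value are imputed, using the retrieved dropouts as donors
imp <- mice(sub, method = "norm", m = 1000, seed = 1234)
```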

6.4.4 Missing Data Not Due to Intercurrent Events (ICE)

For missing data that is not due to an Intercurrent Event (ICE), it is assumed to be Missing at Random (MAR). This means that the probability of the data being missing is related to observed data but not the unobserved values themselves.

To handle this type of missing data, the study will use Markov Chain Monte Carlo (MCMC) methods to impute both:

  • Non-monotone missing data: Data where missingness occurs at random points, without a clear pattern.
  • Monotone missing data: Data where missingness occurs in a sequential manner (e.g., if a participant misses multiple follow-up visits after initially completing a few visits).

The MCMC approach is well-suited for multiple imputation in these scenarios, as it models the joint distribution of the data and generates imputations based on this model, filling in the missing values while maintaining the relationships in the observed data.

6.4.5 Sensitivity Analyses

To assess the robustness of the analysis to different assumptions about the missing data, several sensitivity analyses will be conducted. These analyses provide insights into how deviations from the missing at random assumption might affect the study results.

  1. Imputation Based on the MAR Assumption (Using MCMC)

As part of the sensitivity analyses, the study will use multiple imputation methods based on the Missing at Random (MAR) assumption, with MCMC to handle the imputation of missing data. This analysis will evaluate the treatment effect under the assumption that the missing data is related to the observed data, but not to unobserved outcomes. This approach helps assess how much the results depend on the MAR assumption.

  2. Tipping Point Analysis Using the Delta Adjustment Approach

The tipping point analysis will be used to evaluate how sensitive the results are to deviations from the MAR assumption. The delta adjustment approach will be employed for this purpose (a short sketch follows below):

  • The delta adjustment introduces a systematic shift (or “delta”) to the imputed values to simulate Missing Not at Random (MNAR) scenarios. This allows for the exploration of how much deviation from the MAR assumption would be needed to change the overall conclusions of the analysis.
  • By adjusting the imputed data in this way, the tipping point analysis provides a range of potential outcomes under different missing data scenarios. It identifies whether there is a “tipping point” at which the study’s conclusions change as the assumptions about the missing data deviate from MAR.
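A hedged sketch of the delta-adjustment loop; impute_mar() and pooled_pvalue() are hypothetical wrappers around the MAR imputation and Rubin's-rules pooling steps, and the delta grid is illustrative:

```r
# Shift the imputed (not the observed) values in the intervention arm by an
# increasing delta until the study conclusion changes: the tipping point
for (delta in seq(0, 2, by = 0.25)) {
  imps <- impute_mar(dat)  # hypothetical: list of completed datasets under MAR
  imps <- lapply(imps, function(d) {
    shift      <- d$imputed & d$arm == "trt"
    d$y[shift] <- d$y[shift] + delta
    d
  })
  cat("delta =", delta, " p-value =", pooled_pvalue(imps), "\n")
}
```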

This combination of multiple imputation methods and sensitivity analyses ensures that the treatment effect estimates are robust and account for potential biases introduced by missing data. If the results remain consistent across different imputation strategies, confidence in the study’s findings is reinforced.

6.4.6 Supplementary Analyses

Supplementary analyses are typically not detailed in the Clinical Study Protocol (CSP); they are instead defined in the Statistical Analysis Plan (SAP), which offers more in-depth guidance on how these analyses will be conducted. The supplementary estimands differ from the primary estimand in their handling of intercurrent events (ICEs), reflecting different assumptions about how ICEs, such as rescue medication use and treatment discontinuation, influence the analysis.

Each supplementary estimand adjusts the analysis based on different assumptions regarding the treatment effect and how intercurrent events (particularly rescue medication use and not receiving a dose of study treatment) should be handled. These adjustments allow for a more nuanced understanding of how Drug X performs under various conditions:

  1. Supplementary Estimand 1 uses a hypothetical strategy to estimate the treatment effect assuming no rescue medication is available, and imputes missing data based on different approaches for the intervention and control groups.

  2. Supplementary Estimand 2 applies a while on study treatment strategy, where participants are excluded from the analysis after using rescue medication or not receiving a dose of study treatment.

  3. Supplementary Estimand 3 focuses specifically on excluding participants from the analysis after they fail to receive a dose of the study treatment, using a similar while on study treatment approach.

These supplementary analyses provide complementary perspectives to the primary analysis, exploring the robustness of the treatment effect under different real-world scenarios.

Supplementary Estimand 1

This estimand differs from the primary estimand in the ICE strategy for rescue medication. The key difference lies in the adoption of a hypothetical approach:

  • Hypothetical Strategy for Rescue Medication: This approach assumes that rescue medication is not available for the intervention group. As a result, the treatment effect is estimated under the assumption that participants do not have access to rescue medication.
  • Handling of Missing Data: All data following the use of rescue medication will be considered missing.
    • For the intervention group, a reference-based multiple imputation method will be used, which assumes that after the ICE, the trajectory of the participant’s data will reflect that of a reference group (e.g., the control group).
    • In the control group, missing data will be imputed under the MAR (Missing at Random) assumption. This reflects the assumption that the missing data is related to the observed data but not to the unobserved outcomes themselves.

Supplementary Estimand 2

This estimand differs from the primary estimand by the use of a “while on study treatment” strategy for two ICEs: rescue medication use and the participant not receiving a dose of the study treatment.

  • While on Study Treatment for Rescue Medication and No Dose Received: Participants who either received rescue medication or never received a dose of the study treatment will be excluded from the analysis after these events occur.
  • Specifically, the data from timepoints following the ICE will not be included in the analysis for those participants, as their experiences are considered irrelevant to the effect of the treatment under study in this particular estimand.

Supplementary Estimand 3

This estimand differs from the primary estimand by using the “while on study treatment” approach for the ICE of not receiving a dose of the study treatment.

  • While on Study Treatment for No Dose of Study Treatment: Participants who did not receive a dose of the study treatment will be excluded from the analysis for any timepoints following the ICE.
  • The rationale is to assess the treatment effect only during the period in which participants are actually receiving the treatment, excluding any data that comes after the treatment regimen was interrupted or never initiated.

6.5 ANCOVA vs. MMRM

The shift in regulatory preference from Mixed Model for Repeated Measures (MMRM) to Analysis of Covariance (ANCOVA) for handling missing data, particularly in studies with intercurrent events (ICEs), represents an important change in how missing data is managed in clinical trials.

While MMRM was previously the preferred model for longitudinal data analysis, regulators now favor ANCOVA for handling missing data and ICEs, particularly in studies where participants may be excluded following an ICE. ANCOVA allows for more flexibility in missing data handling, especially when using whilst on treatment or whilst alive strategies, and ensures consistency across sensitivity analyses. Nonetheless, MMRM remains a viable option when there are no ICEs and the study aims to analyze the trend over time with proper covariance structures in place.

6.5.1 Mixed Model for Repeated Measures (MMRM)

Previously, the MMRM approach was commonly used for the primary analysis of longitudinal studies like this one, as it handles repeated measures and accounts for correlations between time points within individuals. However, regulators have raised concerns about its use, particularly in the context of intercurrent events (ICEs) and missing data assumptions. Specifically:

  • MMRM under MAR (Missing at Random) assumes that, conditional on the observed data, missingness is unrelated to the unobserved outcomes, unless this is explicitly handled through imputation prior to analysis. In the presence of ICEs, this assumption can be problematic because it implies that participants who discontinue treatment or experience an ICE would have continued as expected, which is not always realistic.
  • If a whilst on treatment or whilst alive strategy is used to handle ICEs, MMRM implicitly imputes missing values after the ICE. This means that participants who should be excluded following an ICE are instead included in the analysis beyond that point. This can lead to biased estimates of the treatment effect because the model assumes that all participants, including those experiencing an ICE, would continue similarly to those who did not experience an ICE.
  • MMRM with MAR estimates the mean treatment effect as if participants would have continued treatment just like others in the same arm, effectively creating a hypothetical scenario. This approach may not align with real-world treatment discontinuation patterns, especially when other ICE-handling strategies (e.g., whilst on treatment or whilst alive) are being used.

6.5.2 ANCOVA Model for Each Timepoint

Given these limitations, the ANCOVA model has become the preferred approach by both the European Medicines Agency (EMA) and increasingly the FDA. This model offers several advantages:

  1. Separate Timepoint Analysis: The ANCOVA model can analyze data separately at each timepoint, allowing more flexibility in handling missing data and excluding participants after an ICE without implicitly imputing values, which can happen with MMRM.

  2. Handling Missing Data After ICEs: ANCOVA allows for participants to be excluded from the analysis after an ICE (such as treatment discontinuation or death), aligning with strategies like whilst on treatment or whilst alive. This ensures that post-ICE data is not included for individuals who should no longer be contributing to the treatment effect estimation.

  3. Consistency with Sensitivity Analyses: Using ANCOVA for the primary analysis also allows for sensitivity analyses to be conducted with the same statistical model. This consistency is beneficial when evaluating different assumptions about missing data (e.g., Missing at Random (MAR) versus Missing Not at Random (MNAR)).

6.5.3 When MMRM Can Still Be Used

Limitation: Loss of Correlation Between Timepoints

  • One limitation of the ANCOVA approach is that it does not account for the correlation between time points within individuals. This is important because the repeated measures design of the study means that an individual’s measurements at different time points are likely to be correlated.
  • In contrast, MMRM uses an unstructured covariance matrix to account for within-subject correlations, as long as the model converges. However, if the goal of the study is not to estimate a trend over time, but rather the treatment effect at specific timepoints (such as at the end of the study), ANCOVA remains a suitable and simpler alternative.

MMRM is still considered a valid approach under certain conditions:

  • No ICEs: If the study does not include any ICEs that are being handled with a whilst on treatment or whilst alive strategy, MMRM may still be appropriate. For example, when there are no treatment discontinuations or other ICEs of concern, MMRM can be used effectively, provided the missing data can plausibly be assumed to be missing at random (MAR).
  • Convergence: If the model can converge with the appropriate unstructured covariance matrix, MMRM can still provide reliable estimates of within-subject correlations, particularly when the treatment effect over time is of interest.

6.5.4 Primary Efficacy Analysis and Missing Data Handling

The primary efficacy analysis should explicitly detail the approach for handling missing data due to ICEs as well as other types of missing data. Key considerations include:

  1. ICE Handling: Clear strategies should be specified for how to handle data following key ICEs (such as treatment discontinuation or the use of rescue medication), particularly when using a whilst on treatment or whilst alive strategy.

  2. Imputation of Missing Data: Missing data due to reasons unrelated to ICEs should be handled using appropriate multiple imputation techniques, such as MCMC, under the assumption that data is MAR or MNAR, depending on the scenario.

  3. Consistency Across Sensitivity Analyses: Sensitivity analyses should be performed to assess the robustness of the primary analysis, testing different assumptions about the missing data mechanism (e.g., MAR vs. MNAR). Using the same ANCOVA model for both the primary and sensitivity analyses ensures that results are consistent and interpretable across different analyses.

7 COVID-19 pandemic Estimand (Cytel Webinar Series)

7.1 Complications due to COVID-19

Administrative and Operational Challenges:

  • Treatment Discontinuation Due to Drug Supply Issues: The pandemic disrupted global supply chains, leading to shortages of medications and other essential supplies. In clinical trials, this resulted in unexpected discontinuation of treatment for some patients, not due to health reasons but because the necessary drugs were simply unavailable.
  • Missed Visits Due to Lockdown: Lockdown measures and restrictions on movement meant that many trial participants were unable to attend scheduled visits at clinical sites. This led to gaps in data collection and loss of follow-up data, complicating the analysis and interpretation of trial results.

Health-Related Complications:

  • Impact of COVID-19 on Health Status: Participants contracting COVID-19 during the trial could experience a range of symptoms affecting their health status independently of the trial’s intended measures. This includes severe cases leading to death, which not only affects the human aspect of the trial but also significantly impacts the statistical power and outcomes of the study.
  • Intake of Additional Medications Due to COVID-19 Symptoms: Patients in the trial might have taken medications to manage COVID-19 symptoms or related complications, which could interact with the trial medications or confound the treatment effects being studied.

Handling Intercurrent Events:

  • The complications introduced by COVID-19 represent intercurrent events: occurrences that affect the trial’s endpoints outside the planned experimental conditions. Handling these requires careful consideration and adjustment in the trial’s statistical analysis to accurately reflect the treatment effects under these disrupted conditions.
  • One common approach to managing these intercurrent events is to adjust the analysis plan to account for these deviations. This might involve using statistical methods that handle missing data, adjusting the analysis population to exclude affected subjects, or incorporating additional analytical models to estimate the impact of these events on the primary outcomes.

Examples:

  • Patient 1 and Patient 2: Both experienced treatment discontinuation due to drug supply issues. This event must be documented and factored into the analysis, as it impacts treatment continuity and the efficacy assessment.
  • Patient 3: This patient’s intake of additional medications to treat COVID-19 symptoms introduces additional variables into their treatment regimen, possibly requiring methods such as stratification or sensitivity analyses to separate the effects of the study drug from those of the other medications.
  • Patient 4: The death of a patient due to COVID-19 is a significant event that needs to be addressed in the trial’s safety and efficacy evaluations. It may require re-assessment of risk-benefit ratios and could lead to potential amendments of study protocols or informed consent documents.

7.2 Treatment discontinuation due to drug supply issues

1. Treatment Policy Strategy: Including Intercurrent Events in the Treatment Effect

  • Integration of COVID-19 Infections: Under this approach, COVID-19 infections are considered part of the overall treatment experience of the patient population, so the analysis of the treatment’s effectiveness includes outcomes for patients who may have been infected during the trial. This strategy acknowledges that these infections are part of the trial conditions and does not attempt to adjust for or exclude these data.
  • No Adaptation of the Estimand: By not modifying the original estimand, the treatment policy strategy accepts that the trial conditions include the impact of COVID-19. It does not separate the drug’s effect from the pandemic’s impact, reflecting a real-world scenario in which the treatment might be used under similar conditions.

2. Hypothetical Strategy: Excluding Specific Intercurrent Events

  • Exclusion of Administrative and Operational Challenges: This scenario analyzes what the treatment effect would have been if the administrative and operational challenges caused by the pandemic (such as lockdowns and supply chain disruptions) had not occurred. It requires a hypothetical adjustment to the data, estimating outcomes as if these disruptions were absent.
  • Precision in Language: When defining multiple hypothetical scenarios, it is crucial to use precise language to clearly delineate each scenario. This helps avoid ambiguity about what each hypothetical estimand is intended to measure.

Scenarios for Analysis:

  1. Treatment Effect During the Pandemic:
    • This scenario analyzes the treatment effect across the overall patient population recruited before, during, and after the pandemic. It incorporates real infections as part of the treatment policy strategy but excludes the impact of administrative and operational challenges using a hypothetical strategy.
  2. Treatment Effect in a Post-Pandemic World:
    • This analysis assumes a world where society and healthcare systems have adapted to COVID-19, meaning the disruptions experienced during the height of the pandemic are no longer present. It uses the same hypothetical setting as the first scenario, assuming normal operation without the initial pandemic-related disruptions.
  3. Treatment Effect in the Absence of COVID-19:
    • This hypothetical scenario examines the treatment’s effectiveness in a setting where COVID-19 does not exist at all. It is aligned with the original trial objective, assuming no impact from the disease: neither direct health effects on patients nor indirect effects via administrative challenges.

7.3 Handling administrative complications

Handling administrative complications, especially during events like the COVID-19 pandemic, presents unique challenges for clinical trial management and data analysis. The focus often shifts to understanding what the outcomes would have been in the absence of such disruptions.

Understanding the Impact of Administrative Complications:

  • Assumption of Independence: A key assumption in handling administrative complications such as treatment discontinuation due to drug supply issues is that these events are independent of the randomized treatment. The likelihood of experiencing these complications is then similar across all treatment groups, which simplifies the handling of such events in the analysis.
  • Continued Existence of COVID-19: Even in a post-pandemic world, COVID-19 may still be present, but the administrative and operational challenges it initially caused (e.g., lockdowns, hospital overloads) might have been mitigated. The scenario becomes one of managing a known infectious disease without the extra layer of crisis-induced administrative hurdles.

Hypothetical Estimand for Administrative Complications:

  • Defining the Hypothetical Scenario: A hypothetical estimand could be formulated to explore outcomes if there had been no treatment discontinuation due to drug supply issues. This approach asks what the trial results might have shown if all participants could have continued their prescribed treatments uninterrupted by such supply chain disruptions.
  • Estimation Challenges: Estimating this hypothetical scenario requires robust statistical methods to model what the trial outcomes might have been under uninterrupted treatment conditions. This involves techniques that can handle missing data, simulate the missing intervention periods, or otherwise predict outcomes under these hypothetical conditions; a small data-preparation sketch follows.
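
The first step of such an estimation is often purely mechanical: assessments affected by the ICE are set to missing, so that a downstream MAR-based imputation model (for example, the MI sketch in section 6.5.4) reconstructs outcomes “as if” treatment had continued. A minimal sketch follows; the data and column names (e.g., ice_visit) are illustrative assumptions, not from the source.

```python
# Data preparation for a hypothetical estimand: blank out post-ICE assessments.
import numpy as np
import pandas as pd

long = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2],
    "visit_no":  [1, 2, 3, 1, 2, 3],
    "change":    [-0.5, -0.9, -1.2, -0.4, -0.6, -0.2],
    # Visit at which the drug-supply ICE occurred (NaN = no ICE, subject 1).
    "ice_visit": [np.nan, np.nan, np.nan, 2, 2, 2],
})

# Assessments on or after the ICE reflect off-treatment behaviour that the
# hypothetical estimand deliberately excludes, so they are set to missing.
post_ice = long["visit_no"] >= long["ice_visit"]
long.loc[post_ice, "change"] = np.nan
print(long)
```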

Example of Handling Unplanned Disruptions:

  • Literature and Resources: A useful resource in this context is the report from the NISS Ingram Olkin Forum Series on ‘Unplanned Clinical Trial Disruptions’, which discusses various approaches to handling and estimating the effects of unplanned disruptions in clinical trials. The report is available as arXiv:2202.03531 and provides detailed insights into the statistical methodologies that can be employed to address such issues.

Steps to Robust Estimation:

  1. Modeling Missing Data: Use statistical models that can effectively handle missing data due to administrative disruptions. Techniques such as multiple imputation or mixed models that account for random effects might be appropriate.
  2. Adjusting the Analysis: Adjust the analysis to account for the fact that some data points may be missing not at random. Sensitivity analyses are particularly useful here to test how sensitive the results are to the assumptions made about the missing data; a tipping-point sketch follows this list.
  3. Simulating Potential Outcomes: Employ simulation techniques to estimate what the data might have looked like had the disruptions not occurred. This often requires assumptions about the progression of disease or response to treatment, which should be clearly justified and tested for their impact on the study conclusions.
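
To make step 2 concrete, here is a hedged sketch of a delta-adjustment (tipping-point) sensitivity analysis: values imputed under MAR in the active arm are shifted by increasingly pessimistic deltas and the treatment effect is re-estimated, showing at which delta the conclusion would tip. The data, the single-imputation shortcut, and all names are illustrative assumptions; a real analysis would use proper multiple imputation pooled with Rubin’s rules.

```python
# Delta-adjustment (tipping-point) sensitivity sketch on hypothetical data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 200
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "baseline":  rng.normal(8.0, 0.5, n),
})
df["change"] = (-0.3 - 0.8 * df["treatment"]
                + 0.4 * (df["baseline"] - 8.0)
                + rng.normal(0, 0.5, n))
missing = rng.random(n) < 0.2              # ~20% of outcomes missing
df.loc[missing, "change"] = np.nan

# Crude single MAR imputation from the complete cases (illustration only).
mar_fit = smf.ols("change ~ treatment + baseline", data=df.dropna()).fit()
df["imputed"] = df["change"].fillna(mar_fit.predict(df))

# Penalize imputed active-arm values by delta and re-estimate the effect;
# the delta at which significance is lost is the tipping point.
for delta in [0.0, 0.2, 0.4, 0.6]:
    y = df["imputed"] + delta * missing * df["treatment"]
    fit = smf.ols("y ~ treatment + baseline", data=df.assign(y=y)).fit()
    print(f"delta={delta:.1f}  effect={fit.params['treatment']:+.3f}"
          f"  p={fit.pvalues['treatment']:.4f}")
```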

8 References

  1. Bornkamp B, et al. Principal stratum strategy: Potential role in drug development[J]. Pharmaceutical Statistics, 2021.

  2. EMA. Guideline on clinical investigation of medicinal products in the treatment or prevention of diabetes mellitus[S]. CPMP/EWP/1080/00 Rev.2, 2024.

  3. Olarte Parra C, Daniel RM, Bartlett JW. Hypothetical estimands in clinical trials: a unification of causal inference and missing data methods[J]. Statistics in Biopharmaceutical Research, 2022, 15(2): 421–432.

  4. Rubin DB. Multiple Imputation for Nonresponse in Surveys[M]. New York: Wiley, 1987.

  5. Bartlett JW. Reference-Based Multiple Imputation—What is the Right Variance and How to Estimate It[J]. Statistics in Biopharmaceutical Research, 2021, 15(1): 178–186.

  6. Carpenter JR, Roger JH, Kenward MG. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation[J]. Journal of Biopharmaceutical Statistics, 2013, 23(6): 1352–1371.

  7. Cro S, Morris TP, Kenward MG, Carpenter JR. Sensitivity analysis for clinical trials with missing continuous outcome data using controlled multiple imputation: a practical guide[J]. Statistics in Medicine, 2020, 39(21): 2815–2842.

  8. Polverejan E, Dragalin V. Aligning Treatment Policy Estimands and Estimators—A Simulation Study in Alzheimer’s Disease[J]. Statistics in Biopharmaceutical Research, 2020, 12(2): 142–154.

  9. White I, Joseph R, Best N. A causal modeling framework for reference-based imputation and tipping point analysis in clinical trials with quantitative outcome[J]. Journal of Biopharmaceutical Statistics, 2020, 30(2): 334–350.

  10. Wolbers M, Noci A, Delmar P, Gower-Page C, Yiu S, Bartlett JW. Reference-based imputation methods based on conditional mean imputation[J].

  11. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models[J]. American Journal of Epidemiology, 2008, 168(6): 656–664.

  12. Olarte Parra C, Daniel RM, Wright D, Bartlett JW. Estimating hypothetical estimands with causal inference and missing data estimators in a diabetes trial[E]. arXiv e-prints, 2023, arXiv-2308.

  13. Hernán MA, Robins JM. Causal inference: What if[M]. Boca Raton: Chapman & Hall/CRC, 2020.