A decomposition of Fisher’s information to inform sample size for developing or updating fair and precise clinical prediction models — part 2: time-to-event outcomes

Riley, RD; Collins, GS; Archer, L; Whittle, R; Legha, A; Kirton, L; Dhiman, P; Sadatsafavi, M; Adderley, NJ; Alderman, J; Martin, GP; Ensor, J

Journal article

Abstract:

Background: When developing a clinical prediction model using time-to-event data (i.e. with censoring and different lengths of follow-up), previous research has focused on the sample size needed to minimise overfitting and to estimate the overall risk precisely. However, instability of individual-level risk estimates may still be large.

Methods: We propose using a decomposition of Fisher’s information matrix to help examine and calculate the sample size required for developing a model that aims for precise and fair risk estimates. We propose a six-step process which can be used either before data collection or when an existing dataset is available. Steps 1 to 5 require researchers to specify the overall risk in the target population at a key time-point of interest, an assumed pragmatic ‘core model’ in the form of an exponential regression model, the (anticipated) joint distribution of core predictors included in that model, and the distribution of censoring times. The ‘core model’ can be specified directly or based on a specified C-index and relative effects of (standardised) predictors. The joint distribution of predictors may be available directly in an existing dataset, in a pilot study or in a synthetic dataset provided by other researchers.

Results: We derive closed-form solutions that decompose the variance of an individual’s estimated event rate into Fisher’s unit information matrix, predictor values and total sample size; this allows researchers to calculate and examine uncertainty distributions around individual risk estimates and misclassification probabilities for specified sample sizes. We provide an illustrative example in breast cancer, emphasise the importance of clinical context, including any risk thresholds for decision-making, and examine fairness concerns for pre- and postmenopausal women. Lastly, in two empirical evaluations, we provide reassurance that uncertainty interval widths based on our exponential approach are close to those obtained using more flexible parametric models.

Conclusions: Our approach allows users to identify the (target) sample size required to develop a prediction model for time-to-event outcomes, via the pmstabilityss module. It aims to facilitate models with improved trust, reliability and fairness in individual-level predictions.
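The variance decomposition described in the Results can be illustrated with a minimal Python sketch. It is not the pmstabilityss implementation and does not reproduce the paper's six steps. It relies on the standard result that, for an exponential regression model with log-rate x'β, the expected information per participant is E[x x' P(event observed | x)], so the variance of an individual's estimated log-rate is approximately x' I⁻¹ x / n. All function names, the simulated predictor and censoring distributions, the coefficient values and the risk threshold below are illustrative assumptions, as is the use of a Wald interval on the log-rate scale.

```python
import math
import numpy as np


def unit_information(X, beta, censor_times):
    """Monte Carlo estimate of the unit (per-participant) Fisher information.

    Assumes an exponential model with rate exp(X @ beta) and a known censoring
    time per row that is independent of the event time, so that
    P(event observed | x, c) = 1 - exp(-rate * c).
    """
    rate = np.exp(X @ beta)
    p_event = 1.0 - np.exp(-rate * censor_times)
    return (X * p_event[:, None]).T @ X / X.shape[0]


def log_rate_se(x_new, unit_info, n):
    """Approximate standard error of an individual's log-rate at sample size n."""
    return math.sqrt(float(x_new @ np.linalg.solve(unit_info, x_new)) / n)


def risk_at(log_rate, t):
    """Event risk by time t under a constant (exponential) hazard."""
    return 1.0 - math.exp(-math.exp(log_rate) * t)


def risk_interval(x_new, beta, unit_info, n, t, z=1.96):
    """Wald interval on the log-rate scale, transformed to the risk scale."""
    lp = float(x_new @ beta)
    se = log_rate_se(x_new, unit_info, n)
    return risk_at(lp - z * se, t), risk_at(lp, t), risk_at(lp + z * se, t)


def misclassification_probability(x_new, beta, unit_info, n, t, threshold):
    """Probability that the estimated risk falls on the wrong side of a threshold."""
    lp = float(x_new @ beta)
    se = log_rate_se(x_new, unit_info, n)
    cut = math.log(-math.log(1.0 - threshold) / t)  # threshold on log-rate scale
    return 0.5 * math.erfc(abs(lp - cut) / se / math.sqrt(2.0))


# Illustrative (assumed) inputs: intercept plus one standardised predictor.
rng = np.random.default_rng(2025)
m = 20000
X = np.column_stack([np.ones(m), rng.standard_normal(m)])
beta = np.array([-3.0, 0.5])              # assumed core model coefficients
censor_times = rng.uniform(2.0, 5.0, m)   # assumed censoring distribution
info = unit_information(X, beta, censor_times)

x_new = np.array([1.0, 1.0])              # individual of interest
for n in (500, 5000):
    print(n, risk_interval(x_new, beta, info, n, t=5.0),
          misclassification_probability(x_new, beta, info, n, 5.0, 0.3))
```

Running the loop for several candidate values of n shows how the interval around an individual's risk narrows, and the misclassification probability at the chosen threshold falls, as the sample size grows. Repeating it for individuals from different subgroups gives a rough view of whether precision is comparable across them.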

Publication status: Published

Peer review status: Peer reviewed

Cite this record

APA Style

Riley, R. D., Collins, G. S., Archer, L., Whittle, R., Legha, A., Kirton, L., Dhiman, P., Sadatsafavi, M., Adderley, N. J., Alderman, J., Martin, G. P., & Ensor, J. (2025). A decomposition of Fisher’s information to inform sample size for developing or updating fair and precise clinical prediction models — part 2: time-to-event outcomes. Diagnostic and Prognostic Research, 9(1).

MLA Style

Riley, RD, et al. “A Decomposition of Fisher’s Information to Inform Sample Size for Developing or Updating Fair and Precise Clinical Prediction Models — Part 2: Time-to-Event Outcomes.” Diagnostic and Prognostic Research, vol. 9, no. 1, 2025.

Chicago Style

Riley, RD, GS Collins, L Archer, et al. 2025. “A Decomposition of Fisher’s Information to Inform Sample Size for Developing or Updating Fair and Precise Clinical Prediction Models — Part 2: Time-to-Event Outcomes.” Diagnostic and Prognostic Research 9 (1).