Influence diagnostics in regression analysis allow analysts to identify observations that strongly influence a model's fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s distance for linear regression, are based on a deletion approach in which the results of a model fitted with and without the observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEEs) for correlated, or clustered, nominal multinomial responses. The proposed influence diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas are provided for both observation- and cluster-deletion diagnostics; these are multivariate extensions of the one-step approximations currently used for GEEs with univariate marginal responses. Simulation studies were conducted to evaluate the accuracy of the one-step diagnostics in multinomial GEEs as well as in other commonly used categorical response models. Applications are presented on 2017–2018 English Premier League shot-outcome data and on a cohort study of histologic subtype distributions in small renal masses.

Degree Date

Spring 2023

Document Type


Degree Name



Statistical Science


Dr. Ian Harris

Subject Area


Number of Pages