Subject Area

Statistics

Abstract

Influence diagnostics in regression analysis allow analysts to identify observations that strongly influence a model's fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s Distance for linear regression, are based on a deletion approach in which the results of a model fitted with and without the observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEE) for correlated, or clustered, nominal multinomial responses. The proposed influence diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas are provided for both observation- and cluster-deletion diagnostics; these are multivariate extensions of the one-step approximation approaches currently used for GEEs with univariate marginal responses. Simulation studies were conducted to evaluate the accuracy of the one-step diagnostics in multinomial GEEs as well as in other commonly used categorical response models. Applications are presented using 2017-2018 English Premier League shot-outcome data and a cohort study of small renal mass histologic subtype distributions.

Degree Date

Spring 2023

Document Type

Dissertation

Degree Name

Ph.D.

Department

Statistical Science

Advisor

Dr. Ian Harris

Number of Pages

184

Format

.pdf
