Influence Diagnostics for Generalized Estimating Equations Applied to Correlated Categorical Data
Influence diagnostics in regression analysis allow analysts to identify observations that strongly influence a model’s fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s distance for linear regression, take a deletion approach: the results of a model fitted with and without the observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEE) for correlated, or clustered, nominal multinomial responses. The proposed diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas are provided for both observation-deletion and cluster-deletion diagnostics; these are multivariate extensions of the one-step approximation approaches currently used for GEEs with univariate marginal responses. Simulation studies were conducted to evaluate the accuracy of the one-step diagnostics in multinomial GEE as well as in other commonly used categorical response models. Applications are presented using 2017-2018 English Premier League shot-outcome data and a cohort study of small renal mass histologic subtype distributions.
Dr. Ian Harris
Vazquez, Louis, "Influence Diagnostics for Generalized Estimating Equations Applied to Correlated Categorical Data" (2023). Statistical Science Theses and Dissertations. 36.
Categorical Data Analysis Commons, Longitudinal Data Analysis and Time Series Commons, Statistical Methodology Commons