Subject Area

Statistics

Abstract

Impact evaluations of regional development programs often require estimating counterfactual outcomes for a small number of treated regions using survey-based areal data. In practice, evaluators typically rely on two-group quasi-experimental methods such as propensity score matching (PSM) and Difference-in-Differences (DiD). These approaches perform poorly when only a few regions receive treatment and when the set of observed covariates is limited or only partially relevant. Moreover, they typically do not explicitly exploit the spatial and temporal dependence present in survey-based areal data such as the American Community Survey (ACS). This dissertation develops a family of Bayesian spatial predictive models for directly forecasting counterfactual outcomes without constructing matched control groups. The core methodological contribution is the ICAR_Corr framework, which replaces traditional binary adjacency weights in intrinsic conditional autoregressive (ICAR) priors with correlation-based weights derived from historical outcome similarity between neighboring regions. This framework is first evaluated as a spatial predictive model applied to all tracts jointly and compared to non-spatial, purely temporal, and standard ICAR models, as well as PSM, under a simulation design calibrated to Dallas County Census tracts and in an application to tract-level log median house prices from ACS 5-year summary files (2013-2019). Across both simulated and observed data, the ICAR_Corr model yields substantially lower mean squared error and higher coverage probabilities than PSM and alternative predictive models, particularly when only a small number of tracts are treated and a small number of covariates are available. To address spatial heterogeneity, the ICAR_Corr framework is extended in two directions.
First, a "first-cluster-then-model" strategy uses the SKATER algorithm to partition Dallas County into spatially contiguous, socioeconomically homogeneous clusters, within which ICAR_Corr is applied. Second, a spatially varying coefficient model embeds ICAR_Corr priors on both spatial random effects and regression coefficients, using historical covariate correlations to smooth covariate effects across neighboring tracts. In the Dallas County application, these extensions further improve counterfactual forecasts. Together, these results demonstrate that correlation-based spatial priors, especially when combined with localized modeling and spatially varying coefficients, can significantly enhance counterfactual prediction for regional development impact evaluations in complex urban environments.
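
As a rough illustration of the idea behind correlation-based weights, the sketch below swaps the binary entries of an adjacency matrix for the correlation between neighboring regions' historical outcome series, then forms an ICAR-style precision matrix (diagonal of row sums minus the weight matrix). The abstract does not specify how the dissertation constructs its weights, so the use of Pearson correlation, the flooring of negative correlations at zero, and all function and variable names here are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_weights(history, adjacency, floor=0.0):
    """Replace each binary adjacency entry with a correlation-based weight.

    history[i] is region i's historical outcome series; adjacency is a
    symmetric 0/1 matrix. Flooring at `floor` is an assumption of this
    sketch, used so that no neighbor pair receives a negative weight.
    """
    n = len(history)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adjacency[i][j]:
                w = max(pearson(history[i], history[j]), floor)
                weights[i][j] = weights[j][i] = w
    return weights


def icar_precision(weights):
    """ICAR-style precision matrix: row sums on the diagonal, -w_ij elsewhere."""
    n = len(weights)
    return [
        [sum(weights[i]) if i == j else -weights[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): three regions on a line, 0 - 1 - 2.
history = [
    [1.0, 2.0, 3.0, 4.0],  # region 0: rising
    [2.0, 4.1, 5.9, 8.0],  # region 1: rising, closely tracks region 0
    [4.0, 3.0, 2.5, 1.0],  # region 2: falling
]
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

W = correlation_weights(history, adjacency)
Q = icar_precision(W)
```

In this toy example, regions 0 and 1 share a strong weight because their histories move together, while the weight between regions 1 and 2 is floored to zero because their histories move in opposite directions. A floor of exactly zero can leave a region with no effective neighbors, so a small positive floor may be preferable in practice; that choice is likewise an assumption of this sketch.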

Degree Date

Spring 5-16-2026

Document Type

Dissertation

Degree Name

Ph.D.

Department

Statistics and Data Science

Advisor

Jing Cao

Second Advisor

S. Lynne Stokes

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License