Subject Area
Statistics
Abstract
Impact evaluations of regional development programs often require estimating counterfactual outcomes for a small number of treated regions using survey-based areal data. In practice, evaluators typically rely on two-group quasi-experimental methods such as propensity score matching (PSM) and difference-in-differences (DiD). These approaches perform poorly when only a few regions receive treatment and when the set of observed covariates is limited or only partially relevant. Moreover, they typically do not explicitly exploit the spatial and temporal dependence present in survey-based areal data such as the American Community Survey (ACS). This dissertation develops a family of Bayesian spatial predictive models for directly forecasting counterfactual outcomes without constructing matched control groups. The core methodological contribution is the ICAR_Corr framework, which replaces traditional binary adjacency weights in intrinsic conditional autoregressive (ICAR) priors with correlation-based weights derived from historical outcome similarity between neighboring regions. This framework is first evaluated as a spatial predictive model applied to all tracts jointly and compared to non-spatial, purely temporal, and standard ICAR models, as well as PSM, under a simulation design calibrated to Dallas County Census tracts and in an application to tract-level log median house prices from ACS 5-year summary files (2013-2019). Across both simulated and observed data, the ICAR_Corr model yields substantially lower mean squared error and higher coverage probabilities than PSM and the alternative predictive models, particularly when only a small number of tracts are treated and a small number of covariates are available. To address spatial heterogeneity, the ICAR_Corr framework is extended in two directions.
First, a "first-cluster-then-model" strategy uses the SKATER algorithm to partition Dallas County into spatially contiguous, socioeconomically homogeneous clusters, within which ICAR_Corr is applied. Second, a spatially varying coefficient model embeds ICAR_Corr priors on both spatial random effects and regression coefficients, using historical covariate correlations to smooth covariate effects across neighboring tracts. In the Dallas County application, these extensions further improve counterfactual forecasts. Together, these results demonstrate that correlation-based spatial priors, especially when combined with localized modeling and spatially varying coefficients, can significantly enhance counterfactual prediction for regional development impact evaluations in complex urban environments.
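The idea of replacing binary adjacency weights with correlation-based weights can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the toy data, and the choice to clip negative correlations to zero (so that weights stay non-negative) are all assumptions made here for illustration. The sketch computes Pearson correlations between the historical outcome series of adjacent regions and uses them as weights in the ICAR conditional mean, that is, the weighted average of each region's neighbors.

```python
import numpy as np

def correlation_weights(history, adjacency):
    """Hypothetical helper: correlation-based spatial weights.

    history   : (n_regions, n_periods) array of past outcomes
    adjacency : (n_regions, n_regions) symmetric 0/1 neighbor matrix
    """
    corr = np.corrcoef(history)  # region-by-region Pearson correlation
    # Assumption: keep weights only between neighbors and clip negative
    # correlations to zero so every weight is non-negative.
    weights = np.where(adjacency > 0, np.clip(corr, 0.0, None), 0.0)
    np.fill_diagonal(weights, 0.0)
    return weights

def icar_conditional_mean(phi, weights):
    """Weighted neighbor average: E[phi_i | phi_-i] under an ICAR prior."""
    row_sums = weights.sum(axis=1)
    # Regions with zero total weight get a conditional mean of 0 here
    # (an illustrative convention, not taken from the source).
    safe = np.where(row_sums > 0, row_sums, 1.0)
    return (weights @ phi) / safe

# Toy example: four regions on a line (0-1-2-3), four past periods each.
history = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [5.0, 4.0, 2.0, 1.0],
])
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
weights = correlation_weights(history, adjacency)
phi = np.array([1.0, 3.0, 5.0, 7.0])
means = icar_conditional_mean(phi, weights)
```

In this toy example, regions 1 and 2 are adjacent but their histories move in opposite directions, so their weight is zero and neither borrows strength from the other, whereas under binary adjacency they would be smoothed toward each other.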
Degree Date
Spring 5-16-2026
Document Type
Dissertation
Degree Name
Ph.D.
Department
Statistics and Data Science
Advisor
Jing Cao
Second Advisor
S. Lynne Stokes
Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
Recommended Citation
Gonzalez, Duwani W., "Bayesian Spatiotemporal Model for Counterfactual Estimation in Socioeconomic Studies" (2026). Statistical Science Theses and Dissertations. 57.
https://scholar.smu.edu/hum_sci_statisticalscience_etds/57
Included in
Applied Statistics Commons, Data Science Commons, Other Economics Commons, Regional Economics Commons, Social Statistics Commons, Statistical Methodology Commons, Statistical Models Commons, Statistical Theory Commons
