Appendix C Multilevel Regression with Poststratification and Estimating State and District Ambient Temperature
Multilevel regression with poststratification (MRP) is a technique that uses multilevel modeling and Bayesian statistics to generate estimates that are a function of both demographic and geographic characteristics (Park, Gelman, and Bafumi, 2004; Lax and Phillips, 2009; Warshaw and Rodden, 2012). This method combines demographic and public opinion data to create predictions for small subsets of the population, which are then weighted by subgroup population within a geographic area and summed for all subgroups within that area (in this case, a congressional district). For data with an inherently hierarchical structure (as is the case for individuals within districts that are within states), multilevel models have an advantage over classical regression models. Classical regression models use either complete pooling of the data to generate effects (as when no district or state effects are taken into account) or no pooling (as when models include fixed effects for a respondent’s state or district). Multilevel regression models allow for data to be partially pooled to a degree dictated by the data, based upon group sample size and variation. These models thus allow for the effects of demographics to vary by geography, while also pulling the estimates for states or districts with limited numbers of observations or high variance toward the mean, and allowing estimates for states and districts with more robust samples and tighter variances to be more influenced by district-specific effects.
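To make the partial-pooling idea concrete, the following minimal sketch computes a precision-weighted compromise between a district's own mean and the overall mean, which is the usual approximation for a varying-intercept model with no predictors. The function name, the variance values, and the sample sizes are illustrative assumptions, not quantities taken from this appendix.

```python
def partial_pool(group_mean, n, grand_mean, sigma_y2, sigma_a2):
    """Approximate multilevel estimate of one group's mean.

    The group's own mean is weighted by its precision (n / sigma_y2) and the
    grand mean by the between-group precision (1 / sigma_a2). All numbers
    passed in below are hypothetical illustration values.
    """
    w_group = n / sigma_y2
    w_grand = 1.0 / sigma_a2
    return (w_group * group_mean + w_grand * grand_mean) / (w_group + w_grand)


grand_mean = 50.0   # assumed overall mean thermometer score
sigma_y2 = 400.0    # assumed within-district variance
sigma_a2 = 25.0     # assumed between-district variance

# Two districts with the same raw mean (70) but very different sample sizes.
small_district = partial_pool(70.0, 4, grand_mean, sigma_y2, sigma_a2)
large_district = partial_pool(70.0, 400, grand_mean, sigma_y2, sigma_a2)

print(round(small_district, 2), round(large_district, 2))
```

Under these assumed values, the district with four respondents is pulled most of the way toward the overall mean, while the district with four hundred respondents stays close to its own raw mean.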
MRP-generated estimates of public opinion outperform both disaggregated means and presidential vote share measures at the state, congressional district, and state senate district levels, producing estimates that are more correlated with population means, have smaller errors, and are more reliable (Lax and Phillips, 2009; Warshaw and Rodden, 2012). These differences are even more apparent with the smaller sample sizes (2,500 for congressional districts) common to most national surveys. MRP estimates are also far less subject to bias than disaggregated means. Disaggregating from nationally (rather than district or state) representative samples can result in biased predictions, because the respondents who happen to fall within a given district need not resemble that district’s population. MRP avoids this pitfall because all estimates are weighted according to the percentage of a state or district that any particular subgroup makes up. Additionally, nonresponse bias is less likely to influence within-group estimates for MRP relative to disaggregation because of the effects of partial pooling (Lax and Phillips, 2009).
Buttice and Highton (2013) find that MRP is most effective as an estimator when higher-level variables (in this case, state or district) are strongly predictive of the concept of interest, and when there is a high level of geographic variation in the quantity being estimated.Footnote 1 To ensure the greatest level of validity and reliability in my estimates, I include a number of state- and district-level predictors with a clear theoretical tie to expected levels of warmth or hostility toward the selected disadvantaged groups. I also expect that, because of geographically driven district heterogeneity and distinct state and district cultures, inter-district variability will be high.
Data
To model individual responses, I use the ANES aggregated time-series data from 1992 to 2016. This data set is intended to be nationally representative, and has a total of 24,122 observations. Given the sampling technique and the small sample size compared with the CCES or the NAES, MRP is the best estimator for generating unbiased and reliable measures of district opinion. To account for over-time changes in district lines and public opinion, I model each decade separately, with 9,085 observations for the 1990s; 5,006 observations for the 2000s; and 10,031 observations for the 2010s. Feeling thermometer estimates are generated for each group in each of the three decades.
In each of these models, the dependent variable is the group feeling thermometer score. The individual-level predictor variables in each of these models include a respondent’s gender (two categories: male, female),Footnote 2 race/ethnicity (four categories: white, Black, Hispanic, other), education (five categories: less than high school completion, completed high school, some college, college graduate, graduate school), state, and congressional district. Additionally, district-level predictors (average income, percent urban, percent military, same-sex couples, percent Hispanic, and percent African American) and state-level predictors (region, percent union, and percent Evangelical or Mormon) were obtained using decennial US Census data, as well as data from the US Religion Census. Survey year is also included to account for any variation in context or questions.
Model
I generate estimates of district hostility by modeling individual responses as a function of individual-level demographic characteristics as well as district- and state-level predictors. I model this as a multilevel linear regression, estimated with the lmer function from the lme4 package in R.Footnote 3 The structure of the model estimating individual feelings toward the poor is given by the following:
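The displayed equation does not appear in this copy. As a hedged sketch only, a standard MRP individual-level specification consistent with the predictors listed in the Data section would take the form below, where y_i is respondent i's feeling thermometer score and the bracketed subscripts index the race, gender, education, district, and survey-year category to which respondent i belongs. The notation, and in particular the way survey year enters the model, is an assumption rather than the author's exact equation.

```latex
y_i = \beta^{0}
    + \alpha^{\text{race}}_{j[i]}
    + \alpha^{\text{gender}}_{k[i]}
    + \alpha^{\text{edu}}_{l[i]}
    + \alpha^{\text{district}}_{d[i]}
    + \alpha^{\text{year}}_{t[i]}
    + \epsilon_i,
\qquad \epsilon_i \sim N(0, \sigma^2_y)
```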
Random effects are modeled for each level of these individual predictors (e.g., all five categories of education).Footnote 4 These effects are assumed to be normally distributed with a mean of 0 and a variance determined by the data. The district and state levels each include random effects for every district and state (respectively) in the data set, as well as fixed effects for the other relevant predictors, while random effects are modeled for each of the four region categories:Footnote 5
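The group-level equations are likewise missing from this copy. A hedged sketch consistent with the description above, in which districts are nested in states and states in regions, is given below. Here X_d stands for the district-level predictors and Z_s for the state-level predictors named in the Data section, and the coefficient vectors γ and δ are their fixed effects. The symbols are assumed notation, not the author's.

```latex
\begin{aligned}
\alpha^{\text{district}}_d &\sim N\!\left(\alpha^{\text{state}}_{s[d]} + \gamma X_d,\ \sigma^2_{\text{district}}\right), \\
\alpha^{\text{state}}_s &\sim N\!\left(\alpha^{\text{region}}_{r[s]} + \delta Z_s,\ \sigma^2_{\text{state}}\right), \\
\alpha^{\text{region}}_r &\sim N\!\left(0,\ \sigma^2_{\text{region}}\right)
\end{aligned}
```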
Poststratification
This model is then used to generate district hostility estimates for the average member of each of 17,400 subgroups. Each of these subgroups represents a unique combination of the categories by which the sample is weighted: race (4), gender (2),Footnote 6 education (5), and congressional district (435).Footnote 7 Once predictions for average feeling thermometer scores are generated for each of these subgroups (from white men with less than a high school education in the first district of Alabama to women of another race or ethnicity with a graduate education in the at-large district of Wyoming), these estimates are then weighted according to the proportion of a district that is composed of members of these subgroups, and summed within each district.
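The subgroup count can be checked by enumerating the cells directly. The sketch below uses the category labels from the Data section; the integer district index is a placeholder assumption standing in for real district identifiers.

```python
from itertools import product

RACES = ["white", "Black", "Hispanic", "other"]
GENDERS = ["male", "female"]
EDUCATION = [
    "less than high school",
    "completed high school",
    "some college",
    "college graduate",
    "graduate school",
]
N_DISTRICTS = 435  # districts are indexed 0..434 here purely for illustration

# Every race x gender x education combination within a single district.
demographic_cells = list(product(RACES, GENDERS, EDUCATION))

# Every demographic cell crossed with every congressional district.
all_cells = [(d, *cell) for d in range(N_DISTRICTS) for cell in demographic_cells]

print(len(demographic_cells), len(all_cells))
```

This yields 40 demographic cells per district and 17,400 cells overall, matching the counts in the text.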
Formally, weighted district opinion estimates are obtained using this method:
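The displayed equation is missing from this copy. Assuming the standard MRP poststratification estimator (Lax and Phillips, 2009), which uses the same symbols as the definitions that follow, it would read as below; the left-hand-side notation is an assumption.

```latex
\hat{y}^{\text{MRP}}_{d} = \frac{\sum_{c \in d} N_c \, \theta_c}{\sum_{c \in d} N_c}
```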
where c represents each of the forty demographic subcategories (combinations of race, gender, and education) within d, a given congressional district; θc is the prediction associated with each subcategory; and Nc is the frequency of individuals within a district who belong to that demographic subcategory. To weight my estimates, I use the calculated frequency proportions for each demographic category in each state or district. A summary of the estimates generated is given in Table 4.1, and graphical illustrations of each of the estimates produced are given in Figure 4.1.
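As a worked illustration of the weighting step, the sketch below poststratifies a single hypothetical district. It uses three cells rather than forty for brevity, and every prediction and population count is invented for the example; the function name is also an assumption.

```python
def poststratify(cells):
    """Return the population-weighted mean of cell predictions.

    `cells` is a list of (theta_c, n_c) pairs for one district, where theta_c
    is the model prediction for a cell and n_c is that cell's population count.
    """
    total = sum(n for _, n in cells)
    if total == 0:
        raise ValueError("district has no population in any cell")
    return sum(theta * n for theta, n in cells) / total


# Hypothetical district: (predicted thermometer score, population count).
district_cells = [(60.0, 500), (40.0, 300), (50.0, 200)]

estimate = poststratify(district_cells)
unweighted = sum(theta for theta, _ in district_cells) / len(district_cells)

print(estimate, unweighted)
```

With these invented numbers the weighted estimate is 52.0, while a simple average of the three cell predictions is 50.0. The gap arises because the largest cell also has the highest predicted score.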