
Hierarchically Regularized Entropy Balancing

Published online by Cambridge University Press:  20 June 2022

Yiqing Xu*
Affiliation:
Department of Political Science, Stanford University, Stanford, CA, USA. E-mail: [email protected]
Eddie Yang
Affiliation:
Department of Political Science, University of California San Diego, La Jolla, CA, USA. E-mail: [email protected]
*
Corresponding author Yiqing Xu

Abstract

We introduce hierarchically regularized entropy balancing as an extension to entropy balancing, a reweighting method that adjusts weights for control group units to achieve covariate balance in observational studies with binary treatments. Our proposed extension expands the feature space by including higher-order terms (such as squared and cubic terms and interactions) of covariates and then achieves approximate balance on the expanded features using ridge penalties with a hierarchical structure. Compared with entropy balancing, this extension relaxes model dependency and improves the robustness of causal estimates while avoiding optimization failure or highly concentrated weights. It prevents specification searches by minimizing user discretion in selecting features to balance on and is also computationally more efficient than kernel balancing, a kernel-based covariate balancing method. We demonstrate its performance through simulations and an empirical example. We develop an open-source R package, hbal, to facilitate implementation.

Type
Letter
Copyright
© The Author(s) 2022. Published by Cambridge University Press on behalf of the Society for Political Methodology

1 Introduction

Entropy balancing (ebal) is a popular reweighting method for estimating the average treatment effect on the treated (ATT) using nonexperimental data with binary treatments (Hainmueller 2012). It adjusts the weights of the control units to achieve exact covariate balance by solving the following constrained maximization problem:

$$ \begin{align*} \max_{w} H(w) &= -\sum_{i\in\mathcal{C}} w_{i}\log(w_{i}/q_{i}),\\[6pt] \text{s.t.} &\qquad \sum_{i\in\mathcal{C}}w_{i}G_{ij} =m_{j} \mbox{ for } j \in 1, \dots, J,\\[6pt] &\qquad \sum_{i\in\mathcal{C}}w_{i} =1 \text{ and } w_{i} \ge 0 \mbox{ for all } i\in\mathcal{C}, \end{align*} $$

in which $w = \{w_{i}\}_{i\in \mathcal {C}}$ is the set of solution weights for units in the control group (denoted $\mathcal {C}$ ); $q_{i}>0$ is the base weight for unit i (with $\sum _{i\in \mathcal {C}}q_{i}=1$ ); $H(\cdot )$ is the negative Kullback–Leibler divergence between the distributions of the solution weights and the base weights; and $\sum _{i\in \mathcal {C}}w_{i}G_{ij}=m_{j}$ specifies a set of J balance constraints, where $G_{i}\in {\mathbb {R}}^{J}$ collects the J pretreatment covariates for unit i and $m_{j}$ is the mean of the jth covariate in the treatment group.

Despite its appealing properties, such as exact balance, computational efficiency, and double robustness, ebal has two main drawbacks. First, it requires researchers to specify the moments of the covariates to be balanced on, which leaves room for specification searching and selective reporting. Second, when the number of control units is small relative to the number of available covariates, the algorithm either fails to converge or generates highly concentrated weights. As a result, researchers face a dilemma: balancing on too few terms leads to bias, while balancing on too many terms may be infeasible or induce high variance due to extreme weights. To address these problems, we propose hierarchically regularized entropy balancing (hbal) as an extension to ebal. hbal achieves approximate balance on reasonably flexible functions of the covariates through a ridge-regularized entropy balancing framework.

2 Approximate Balancing with Hierarchical Regularization

Hainmueller (2012) proves that the global solution for each unit i’s weight exists and is given by $w_{i}^{ebal}=\frac {q_{i}\exp {(-G_{i}^{\prime }Z)}}{\sum _{D_{i} = 0}q_{i}\exp {(-G_{i}^{\prime }Z)}}$ , where $Z = \{\lambda _{1}, \lambda _{2}, \dots , \lambda _{J}\}^{\prime }$ is a set of Lagrange multipliers for the balance and normalizing constraints and $D_{i} \in \{0, 1\}$ is the binary treatment indicator. Using the Lagrange multipliers and the solution weights $w_{i}^{ebal}$ , the constrained optimization problem can be rewritten as the following dual problem:

(1) $$ \begin{align} \min_{Z} L^{d} = \log \left(\sum_{i\in\mathcal{C}}q_{i}\exp{\left(-\sum_{j=1}^{J}\lambda_{j}G_{ij}\right)}\right) + \sum_{j=1}^{J}\lambda_{j}m_{j}. \end{align} $$

After obtaining $w_{i}^{ebal}$ , one can use a difference in means (DIM) approach to estimate the ATT:

$$ \begin{align*}\hat\tau_{ebal} = \frac{1}{n_{1}} \sum_{D_{i} = 1} Y_{i} - \sum_{D_{i} = 0} w_{i}^{ebal}Y_{i},\end{align*} $$

in which $n_{1}$ is the number of treated units. Zhao and Percival (2017) show that solving Problem (1) amounts to M-estimation of the propensity score with a logistic link using G as predictors, and that the solution weights $w_{i}^{ebal}$ belong to a class of inverse probability weights. They also show that under strong ignorability and positivity, $\hat \tau _{ebal}$ is a doubly robust estimator for the ATT: when either the untreated potential outcome $Y_{i}(0)$ or treatment assignment is linear in G, $\hat \tau _{ebal}$ is consistent (see the Supplementary Material (SM) for details).
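To make the primal–dual relationship concrete, here is a minimal numerical sketch (in Python with NumPy, not the authors' hbal implementation): it minimizes the dual objective in (1) by plain gradient descent, recovers the closed-form weights, and could then feed the DIM formula above. The data, balance targets, and step size are illustrative assumptions.

```python
import numpy as np

def ebal_weights(X_c, m, lr=0.5, iters=5000):
    """Solve the entropy balancing dual (1) by gradient descent.

    X_c : (n0, J) covariates for the control units
    m   : (J,) treated-group covariate means (balance targets)
    Returns solution weights w with sum(w) = 1 and w >= 0.
    """
    n0, J = X_c.shape
    q = np.full(n0, 1.0 / n0)       # uniform base weights
    Z = np.zeros(J)                 # Lagrange multipliers
    for _ in range(iters):
        w = q * np.exp(-X_c @ Z)
        w /= w.sum()                # w_i^ebal from the closed form
        grad = m - X_c.T @ w        # gradient of the dual objective
        Z -= lr * grad
    return w

# Toy example: reweight controls so their means match hypothetical treated means.
rng = np.random.default_rng(0)
X_c = rng.normal(size=(500, 2))
m = np.array([0.4, -0.2])
w = ebal_weights(X_c, m)
```

Because $\nabla L^{d} = m - \sum_{i} w_{i}G_{i}$, a vanishing gradient is exactly the balance condition; the returned weights are positive and sum to one by construction.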

In practice, the linearity assumption may be unrealistic. To make this assumption more plausible, we can conduct a series expansion of G, for example, by including higher-order terms and various kinds of interactions, obtaining $X\in \mathbb {R}^{T}$ , $(T \gg J)$ . However, exact balancing on high-dimensional X is often infeasible; even when it is, the large number of balancing constraints may cause the solution weights to be heavily concentrated on a few control units, resulting in high variance of the ATT estimates. hbal addresses this problem by modifying the objective function in (1), that is, adding an $\ell ^{2}$ penalty with a hierarchical structure to the Lagrangian multipliers Z:

(2) $$ \begin{align} \min_{Z^{+}} L^{d} =\log \left(\sum_{D_{i} = 0}q_{i}\exp{\left(-\sum_{t=1}^{T}\lambda_{t}X_{it}\right)}\right) + \sum_{t=1}^{T}\lambda_{t}m_{t} + \sum_{k=1}^{K} \alpha_{k}r_{k}, \end{align} $$

where $Z^{+} = \{\lambda _{1}, \dots , \lambda _{T}\}^{\prime }$ is a vector of Lagrange multipliers corresponding to the T moment conditions. $\sum _{k=1}^{K} \alpha _{k}r_{k}$ is the newly added penalty term, in which $\alpha _{k}$ is a scalar tuning parameter that adjusts the strength of the penalty for the kth group, for $k = 1, 2, \dots , K$ ; $r_{k} = \sum _{t \in P_{k}} \lambda _{t}^{2}$ is the squared $\ell ^{2}$ norm of the Lagrange multipliers ( $\lambda _{t}$ ) for the moment conditions in the kth group, where $P_{k}$ is the set of their indices. We choose the $\ell ^{2}$ penalty over the $\ell ^{1}$ penalty mainly because the former is twice differentiable, making computation much more efficient. This grouped structure allows differential strengths of regularization to be applied to different groups of balance constraints and prioritizes feature groups that have a heavy influence on the overall covariate balance between the treatment and control groups. For example, in a given application two-way interactions may matter more for the overall balance than the squared terms of the covariates (see the SM for a performance comparison of hierarchical and nonhierarchical regularization). This optimization problem gives $w_{i}^{hbal}=\frac {q_{i}\exp {(-X_{i}^{\prime }Z^{+})}}{\sum _{D_{i} = 0}q_{i}\exp {(-X_{i}^{\prime }Z^{+})}}$ ( $i\in \mathcal {C}$ ) and

$$ \begin{align*}\hat\tau_{hbal} = \frac{1}{n_{1}} \sum_{D_{i} = 1} Y_{i} - \sum_{D_{i} = 0} w_{i}^{hbal}Y_{i}.\end{align*} $$

2.1 Implementation Details

Implementing hbal involves several technical details, such as grouping moment conditions, selecting tuning parameters, and prescreening covariates. Due to space limitations, we only provide a sketch here and offer more details in the SM.

In terms of grouping, we put all the level terms of G in the first group ( $k=1$ ); two-way interactions ( $k=2$ ), squared terms ( $k=3$ ), three-way interactions ( $k=4$ ), interactions between squared and level terms ( $k=5$ ), and cubic terms ( $k=6$ ) each form a separate group. Because the Lagrange multipliers can be interpreted as covariate coefficients in a logistic regression for propensity scores, shrinking them differentially enables us to prioritize the groups of features in the expanded covariate space that are most predictive of the propensity scores. By default, we impose a hierarchical structure by setting $\alpha _{1} = 0$ ; that is, hbal seeks exact balance on the level terms just like ebal and regularizes only the higher-order moment constraints. When $\alpha _{2} = \alpha _{3} = \cdots = \alpha _{K} = 0$ , hbal reduces to ebal applied to the full series expansion of the covariates. To select the tuning parameters, we combine a trust-region optimization method (Powell 1994) with a V-fold cross-validation procedure that minimizes the mean absolute error of expanded covariate balance between the held-out subsample of control units and the treated units. This procedure encapsulates the intuition that the optimal Lagrange multipliers, based on which the solution weights are constructed, should generalize to randomly selected held-out data and result in approximate covariate balance. While the proposed approach is data-driven, we also allow researchers to incorporate prior knowledge when applying hbal, for example, by imposing custom covariate groupings and by specifying the parameter space of the tuning parameters.
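The default grouping can be mimicked by a small hypothetical helper that builds the full third-degree expansion and records each column's group (it ignores the prescreening of redundant terms, such as squares of binary covariates, that hbal performs):

```python
import numpy as np
from itertools import combinations

def expand_with_groups(G):
    """Third-degree series expansion of covariates G (n x J) with group
    labels per column: 1 levels, 2 two-way interactions, 3 squares,
    4 three-way interactions, 5 square-times-level terms, 6 cubes.
    (Hypothetical helper; no prescreening of redundant columns.)
    """
    n, J = G.shape
    cols, groups = [], []
    for j in range(J):                              # group 1: level terms
        cols.append(G[:, j]); groups.append(1)
    for a, b in combinations(range(J), 2):          # group 2: two-way interactions
        cols.append(G[:, a] * G[:, b]); groups.append(2)
    for j in range(J):                              # group 3: squared terms
        cols.append(G[:, j] ** 2); groups.append(3)
    for a, b, c in combinations(range(J), 3):       # group 4: three-way interactions
        cols.append(G[:, a] * G[:, b] * G[:, c]); groups.append(4)
    for a in range(J):                              # group 5: square-times-level
        for b in range(J):
            if a != b:
                cols.append(G[:, a] ** 2 * G[:, b]); groups.append(5)
    for j in range(J):                              # group 6: cubic terms
        cols.append(G[:, j] ** 3); groups.append(6)
    return np.column_stack(cols), np.array(groups)

G = np.random.default_rng(0).normal(size=(100, 6))
X, groups = expand_with_groups(G)   # 83 columns for six continuous covariates
```

For six continuous covariates this gives 6 + 15 + 6 + 20 + 30 + 6 = 83 terms; dropping the terms made redundant by two binary covariates (their squares, cubes, and square-times-level interactions) is consistent with the 69 features used in Section 3.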

2.2 Combining hbal with an Outcome Model

When the number of control units is small relative to the number of moment conditions, hbal achieves only approximate balance. To remove the bias caused by the remaining imbalance, we suggest combining hbal with an outcome model that includes the same set of covariates as the moment conditions, which we label hbal+. Because hbal gives higher weights to control units whose propensity scores are similar to those of the treated units, this strategy leads to an efficiency gain for a regression-based double selection approach. When combined with an outcome model, an ATT estimator is given by

$$ \begin{align*}\hat{\tau}_{hbal+} = \frac{1}{n_{1}}\sum_{D_{i}=1}(Y_{i}-\hat{g}_{0}(G_{i})) - \sum_{D_{i}=0}w_{i}^{hbal}(Y_{i}-\hat{g}_{0}(G_{i})),\end{align*} $$

where $\hat {g}_{0}(G_{i}) = X_{i}^{\prime }\hat \beta $ is based on a linear regression on the expanded features. Zhao and Percival (2017) show that $\hat {\tau }_{hbal+}$ is consistent for the ATT when either $g_{0}$ is linear in $X_{i}$ or the logit of the true propensity score is linear in $X_{i}$ . With nonzero tuning parameters, hbal achieves exact balance on G and only approximate balance on $X\backslash G$ . Hence, if the true propensity scores depend on $X\backslash G$ , the regularized weights $w_{i}^{hbal}$ need not recover the true propensity scores, in which case a correctly specified outcome model $g_{0}$ ensures the consistency of $\hat {\tau }_{hbal+}$ .
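A minimal sketch of $\hat{\tau}_{hbal+}$, assuming $\hat{g}_{0}$ is fit by OLS on the control units (the function and example data are illustrative, not the package's estimator):

```python
import numpy as np

def att_augmented(Y, D, X, w):
    """Augmented ATT estimator: balancing weights plus an outcome
    regression g0 fit by OLS on the control units (illustrative sketch).

    Y : (n,) outcomes, D : (n,) binary treatment indicator,
    X : (n, T) expanded features, w : weights for control units (sum to 1).
    """
    Xc, Yc = X[D == 0], Y[D == 0]
    Xd = np.column_stack([np.ones(len(Xc)), Xc])      # add an intercept
    beta, *_ = np.linalg.lstsq(Xd, Yc, rcond=None)    # fit g0 on controls
    g = np.column_stack([np.ones(len(X)), X]) @ beta  # g0(X_i) for all units
    resid = Y - g
    return resid[D == 1].mean() - w @ resid[D == 0]

# Sanity check: with a zero treatment effect and an outcome that is exactly
# linear in X, the estimate is ~0 for any valid weights.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
D = (rng.random(200) < 0.3).astype(int)
Y = 1.0 + X @ np.array([1.0, 2.0, -1.0])
n0 = int((D == 0).sum())
w = np.full(n0, 1.0 / n0)
tau = att_augmented(Y, D, X, w)
```

The sanity check illustrates the double-robustness logic: when the outcome model is correct, the residuals carry no confounding for the weights to remove, so even imperfect weights yield a consistent estimate.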

2.3 Related Work

hbal builds on a class of preprocessing methods that explicitly seek approximate covariate balance for causal inference (e.g., Athey, Imbens, and Wager 2018; Hazlett 2020; Imai and Ratkovic 2014; Ning, Sida, and Imai 2020; Zubizarreta 2015). These methods have been shown to estimate propensity scores with loss functions that target good covariate balance (Ben-Michael, Feller, and Rothstein 2021; Wang and Zubizarreta 2019; Zhao and Percival 2017). hbal extends this line of research in that it aims to achieve approximate balance in a large covariate space. Hence, hbal’s solution weights can be interpreted as penalized propensity scores with a special loss function. Moreover, the balancing approach is closely connected to the survey literature on calibrated weighting, or raking (e.g., Deming and Stephan 1940). The key component of hbal, hierarchical ridge regularization, shares similarities with recent research in survey methods that deals with the high dimensionality of crosstabs of respondent characteristics (Ben-Michael, Feller, and Hartman 2021; Caughey and Hartman 2017; Tan 2020).

3 Monte Carlo Evidence

To evaluate the performance of hbal, we conduct Monte Carlo simulations that compare hbal and hbal+ with five commonly used matching and weighting methods: inverse propensity score weighting (PSW; e.g., Hirano, Imbens, and Ridder 2003), covariate balancing propensity score (CBPS; Imai and Ratkovic 2014), coarsened exact matching (CEM; Iacus, King, and Porro 2012), ebal (Hainmueller 2012), and kernel balancing (kbal; Hazlett 2020). To illustrate the advantage of hierarchical regularization, we also report results from using ebal to balance on the serially expanded covariate set (ebal*). The naive DIM (Raw) estimator is included as a benchmark.

3.1 Design

We use six covariates $G = \{G_{1}, G_{2}, G_{3}, G_{4}, G_{5}, G_{6}\}$ , in which $G_{1}, \dots , G_{4}$ are drawn from a multivariate normal distribution with means $0$ , variances $1$ , and pairwise covariances of $0.2$ ; $G_{5}$ and $G_{6}$ are independently drawn from Bernoulli distributions with success probabilities $0.3$ and $0.2$ , respectively. To simulate a complex treatment assignment process, we let all level terms and a random subset of higher-order terms enter the treatment assignment and generate their individual effects from a normal distribution. Specifically, the treatment assignment indicator is given by $D=\mathbf {1}\{f(G)-2 +\epsilon> 0\}$ , where $f(G)$ is a linear combination of the subset of the serial expansion of G and $\epsilon \stackrel {i.i.d.}{\sim } N(0, \sqrt {8})$ (see footnote 1). To capture the empirical variability in the outcome-generating process, we consider three outcome designs with an increasing degree of nonlinearity: (1) linear: $Y = G_{1}+G_{2}+ G_{3}-G_{4} +G_{5} +G_{6} + u$ ; (2) nonlinear: $Y=G_{1} + G_{1}^{2} - G_{4}G_{6} +u$ ; and (3) trigonometric: $Y=2\times \cos (G_{1}) - \sin (\pi \times G_{2})+u$ , with $u \stackrel {i.i.d.}{\sim } N(0, 1)$ and the true treatment effect fixed at $0$ for all units. For PSW, CBPS, CEM, ebal, and kbal, we include only the original variables G. Note that kbal expands the covariate space through Gaussian kernels; for ebal*, hbal, and hbal+, we include all 69 covariates in the third-degree series expansion of G.
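The design can be mimicked schematically as follows; the particular higher-order terms in $f(G)$ and their coefficients are placeholders, since the paper draws them randomly, and $\sqrt{8}$ is taken here as the standard deviation of $\epsilon$:

```python
import numpy as np

def simulate(n=900, seed=0):
    """Schematic version of the simulation design (the higher-order terms
    in f(G) and their coefficients are hypothetical placeholders)."""
    rng = np.random.default_rng(seed)
    # Four correlated normals (means 0, variances 1, covariances 0.2),
    # plus two independent Bernoulli covariates.
    S = np.full((4, 4), 0.2)
    np.fill_diagonal(S, 1.0)
    Gn = rng.multivariate_normal(np.zeros(4), S, size=n)
    G = np.column_stack([Gn, rng.binomial(1, 0.3, n), rng.binomial(1, 0.2, n)])
    # f(G): all level terms plus a (placeholder) subset of higher-order terms.
    f = G.sum(axis=1) + 0.5 * G[:, 0] ** 2 + 0.5 * G[:, 1] * G[:, 2]
    eps = rng.normal(0.0, np.sqrt(8), size=n)   # sqrt(8) read as the s.d.
    D = (f - 2.0 + eps > 0).astype(int)
    u = rng.normal(size=n)
    Y1 = G[:, 0] + G[:, 1] + G[:, 2] - G[:, 3] + G[:, 4] + G[:, 5] + u  # linear
    Y2 = G[:, 0] + G[:, 0] ** 2 - G[:, 3] * G[:, 5] + u                 # nonlinear
    Y3 = 2 * np.cos(G[:, 0]) - np.sin(np.pi * G[:, 1]) + u              # trigonometric
    return G, D, (Y1, Y2, Y3)

G, D, outcomes = simulate()
```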

3.2 Results

Figure 1 presents the simulation results with sample size $N=900$ and a control-to-treatment ratio of 5:1. We report additional results with varying outcome designs, sample sizes, and control-to-treatment ratios in the SM. The comparative performance of hbal (and hbal+) is similar across different simulation setups.

Figure 1 Monte Carlo evidence. The top-left, top-right, and bottom-left panels correspond to results from outcome designs 1, 2, and 3, respectively. All three designs share the same treatment assignment mechanism, that is, $D=\mathbf {1}\{f(G)-2 +\epsilon> 0\}$ . The bottom-right panel compares the speed between kbal and hbal.

For the linear outcome design 1, most methods substantially reduce bias compared to the naive DIM estimator. One exception is ebal*, whose poor performance is caused by nonconvergence of the ebal algorithm with many moment constraints. In comparison, hbal’s ability to differentiate among covariate moments when balancing leads to superior performance in both bias and variance reduction. As the outcome-generating process becomes more complex, methods that rely only on G to estimate propensity scores or weights perform poorly. Compared to ebal, ebal*, and kbal, hbal and hbal+ yield estimates with substantially less bias and smaller variance in designs 2 and 3. These results demonstrate that, when the data-generating process is not too far from a polynomial expansion of the covariates, hbal’s hierarchical structure tailors the strength of regularization to the importance of the balance constraints, thus reducing bias while maintaining a relatively small variance of the estimates. In the SM, we provide additional evidence that hierarchical regularization leads to a higher correlation between the solution weights and the true propensity scores.

Figure 2 The effect of vacancy period on presidential ideological vote. The solid circle and triangle represent the ATT estimates from ebal+ using the original seven covariates and hbal+ using covariates after series expansion, respectively, with 95% confidence intervals. The gray bars represent the 95% confidence intervals of 500 ebal+ estimates using various combinations of the covariates and their higher-order terms.

Moreover, hbal is much more computationally efficient and scalable than kbal. The bottom-right panel of Figure 1 shows the average running time across $500$ samples for varying sample sizes. Across sample sizes, hbal finds the solution weights in a fraction of kbal’s time, and hbal’s scalability advantage becomes more evident as the sample size increases.

4 Promotion Prospect and Circuit Court Judge Behavior

To illustrate how hbal works in empirical settings, we replicate a recent article by Black and Owens (2016), who study the effect of the prospect of promotion to the Supreme Court on the behavior of circuit court judges. Using CEM, the authors show that judges who are on the president’s shortlist to fill Supreme Court vacancies are more likely to vote in line with the president’s position during the vacancy period than during the nonvacancy period; they find no such effect among noncontending judges, who stand little chance of being nominated to the Supreme Court. We focus on whether circuit court judges ruled in line with the president as the outcome of interest. The binary treatment variable is vacancy period (vs. nonvacancy period). To address potential confounding, the authors use CEM to match cases on seven covariates that might influence both a judge’s behavior and the treatment, such as the judge’s Judicial Common Space score, the judge’s ideological alignment with the president, and whether the case decision was reversed by the circuit court.

In Figure 2, we compare the results from mean balancing on the level terms of the covariates using ebal+ (shown as a solid circle) and from balancing on a set of serially expanded covariates using hbal+ (shown as a solid triangle). To assess whether the ebal+ estimate is sensitive to model specification, we also include an additional 500 models in which random higher moments of the covariates are included (shown in gray). For both methods, we use the solution weights to estimate a weighted linear regression and report 95% confidence intervals based on standard errors clustered at the individual judge level. For contending judges, estimates from both methods indicate that judges are more likely to rule in line with the president during vacancy periods than during nonvacancy periods, although the estimate from ebal+ using the level terms only is not statistically significant at the $5\%$ level. Because hbal+’s specification includes higher-order terms that can explain additional variation in the outcome and treatment assignment, we obtain a more precise and likely more reliable estimate than ebal+’s. For noncontending judges, ebal+’s estimate suggests that they tend to be more likely to rule in line with the president during a vacancy period, while hbal+’s estimate shows no significant difference between the vacancy and nonvacancy periods. In short, hbal+’s results are broadly in line with Black and Owens’ (2016) original findings, whereas the estimates from ebal+ for both contending and noncontending judges vary widely depending on the specification, ranging from negative to positive effects.

5 Conclusion

In this letter, we extend ebal to hbal by introducing hierarchical regularization on the Lagrange multipliers in the transformed objective function. hbal achieves approximate balance on a potentially large covariate space. Through simulations and an empirical study, we demonstrate hbal’s desirable properties in comparison to ebal and other commonly used preprocessing methods. We also show that hbal is computationally more efficient than kbal, another popular covariate balancing method. hbal thus can serve as a building block for methods that seek approximate covariate balance. To facilitate implementation, we develop an open-source R package, hbal.

In the SM, we provide more details about the identifying assumptions and theoretical guarantees of hbal, its algorithm, implementation procedure, and inferential method, as well as additional information on the simulation results and the empirical example. We also apply hbal to the well-known LaLonde data and find reassuring results, which are provided in the SM.

Acknowledgements

We thank Ryan Black, Avi Feller, Justin Grimmer, Jens Hainmueller, Chad Hazlett, Molly Offer-Westort, Xiang Zhou, seminar participants at Stanford and UCSD, the Editor of this article Jeff Gill, as well as four anonymous reviewers for extremely helpful comments.

Data Availability Agreement

Replication data and code for this article have been published at Harvard Dataverse at https://doi.org/10.7910/DVN/QI2WP9 (Xu and Yang 2022).

Supplementary Material

For supplementary material accompanying this paper, please visit https://doi.org/10.1017/pan.2022.12.

Footnotes

Edited by Jeff Gill

1 The selected covariates are $G_{1}$ , $G_{2}$ , $G_{3}$ , $G_{4}$ , $G_{5}$ , $G_{6}$ , $G_{1}G_{2}$ , $G_{1}G_{5}$ , $G_{2}G_{3}$ , $G_{2}G_{4}$ , $G_{2}G_{6}$ , $G_{3}G_{6}$ , $G_{1}^{2}$ , $G_{2}^{2}$ , $G_{3}^{2}$ , and $G_{4}^{2}$ . The higher-order terms are a random draw from the squared and two-way interaction terms of the covariates. We also include an intercept of $-2$ in $f(G)$ so that it is centered close to $0$ . Replication data and code are available at https://doi.org/10.7910/DVN/QI2WP9 (Xu and Yang 2022).

References

Athey, S., Imbens, G. W., and Wager, S. 2018. “Approximate Residual Balancing: Debiased Inference of Average Treatment Effects in High Dimensions.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80 (4): 597–623.
Ben-Michael, E., Feller, A., and Hartman, E. 2021. “Multilevel Calibration Weighting for Survey Data.” Preprint, arXiv:2102.09052.
Ben-Michael, E., Feller, A., and Rothstein, J. 2021. “The Augmented Synthetic Control Method.” Journal of the American Statistical Association 116 (536): 1789–1803.
Black, R. C., and Owens, R. J. 2016. “Courting the President: How Circuit Court Judges Alter Their Behavior for Promotion to the Supreme Court.” American Journal of Political Science 60 (1): 30–43.
Caughey, D., and Hartman, E. 2017. “Target Selection as Variable Selection: Using the Lasso to Select Auxiliary Vectors for the Construction of Survey Weights.” Available at SSRN 3494436.
Deming, W. E., and Stephan, F. F. 1940. “On a Least Squares Adjustment of a Sampled Frequency Table when the Expected Marginal Totals Are Known.” The Annals of Mathematical Statistics 11 (4): 427–444.
Hainmueller, J. 2012. “Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies.” Political Analysis 20 (1): 25–46.
Hazlett, C. 2020. “Kernel Balancing: A Flexible Non-Parametric Weighting Procedure for Estimating Causal Effects.” Statistica Sinica 30 (3): 1155–1189.
Hirano, K., Imbens, G. W., and Ridder, G. 2003. “Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score.” Econometrica 71 (4): 1161–1189.
Iacus, S. M., King, G., and Porro, G. 2012. “Causal Inference without Balance Checking: Coarsened Exact Matching.” Political Analysis 20 (1): 1–24.
Imai, K., and Ratkovic, M. 2014. “Covariate Balancing Propensity Score.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76 (1): 243–263.
Ning, Y., Sida, P., and Imai, K. 2020. “Robust Estimation of Causal Effects via a High-Dimensional Covariate Balancing Propensity Score.” Biometrika 107 (3): 533–554.
Powell, M. J. 1994. “A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation.” In Advances in Optimization and Numerical Analysis, 51–67. Dordrecht: Springer.
Tan, Z. 2020. “Regularized Calibrated Estimation of Propensity Scores with Model Misspecification and High-Dimensional Data.” Biometrika 107 (1): 137–158.
Wang, Y., and Zubizarreta, J. R. 2019. “Minimal Dispersion Approximately Balancing Weights: Asymptotic Properties and Practical Considerations.” Biometrika 107 (1): 93–105.
Xu, Y., and Yang, E. 2022. “Replication Data for: Hierarchically Regularized Entropy Balancing.” Harvard Dataverse, V1. https://doi.org/10.7910/DVN/QI2WP9.
Zhao, Q., and Percival, D. 2017. “Entropy Balancing Is Doubly Robust.” Journal of Causal Inference 5 (1): 20160010. https://doi.org/10.1515/jci-2016-0010.
Zubizarreta, J. R. 2015. “Stable Weights That Balance Covariates for Estimation with Incomplete Outcome Data.” Journal of the American Statistical Association 110 (511): 910–922.