Fightin' Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict
Published online by Cambridge University Press: 04 January 2017
Abstract
Entries in the burgeoning “text-as-data” movement are often accompanied by lists or visualizations of how word (or other lexical feature) usage differs across some pair or set of documents. These are intended either to establish some target semantic concept (like the content of partisan frames), to estimate word-specific measures that feed forward into another analysis (like locating parties in ideological space), or both. We discuss a variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words. We introduce and emphasize several new approaches based on Bayesian shrinkage and regularization. We illustrate the relative utility of these approaches with analyses of partisan, gender, and distributive speech in the U.S. Senate.
- Type
- Special Issue: The Statistical Analysis of Political Text
- Information
- Political Analysis, Volume 16, Issue 4: Special Issue: The Statistical Analysis of Political Text, Autumn 2008, pp. 372-403
- Copyright
- Copyright © The Author 2009. Published by Oxford University Press on behalf of the Society for Political Methodology
Footnotes
Author's note: We would like to thank Mike Crespin, Jim Dillard, Jeff Lewis, Will Lowe, Mike MacKuen, Andrew Martin, Prasenjit Mitra, Phil Schrodt, Corwin Smidt, Denise Solomon, Jim Stimson, Anton Westveld, Chris Zorn, and participants in seminars at the University of North Carolina, Washington University, and Pennsylvania State University for helpful comments on earlier and related efforts. Any opinions, findings, and conclusions or recommendations expressed in the paper are those of the authors and do not necessarily reflect the views of the National Science Foundation.