
A Timely Intervention: Tracking the Changing Meanings of Political Concepts with Word Vectors

Published online by Cambridge University Press: 25 July 2019

Emma Rodman*
Affiliation:
Department of Political Science, University of Washington, Seattle, WA 98195, USA. Email: [email protected]

Abstract

Word vectorization is an emerging text-as-data method that shows great promise for automating the analysis of semantics—here, the cultural meanings of words—in large volumes of text. Yet successes with this method have largely been confined to massive corpora where the meanings of words are presumed to be fixed. In political science applications, however, many corpora are comparatively small and many interesting questions hinge on the recognition that meaning changes over time. Together, these two facts raise vexing methodological challenges. Can word vectors trace the changing cultural meanings of words in typical small corpora use cases? I test four time-sensitive implementations of word vectors (word2vec) against a gold standard developed from a modest data set of 161 years of newspaper coverage. I find that one implementation method clearly outperforms the others in matching human assessments of how public dialogues around equality in America have changed over time. In addition, I suggest best practices for using word2vec to study small corpora for time series questions, including bootstrap resampling of documents and pretraining of vectors. I close by showing that word2vec allows granular analysis of the changing meaning of words, an advance over other common text-as-data methods for semantic research questions.
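
The bootstrap resampling of documents mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: simple co-occurrence counts stand in for trained word2vec vectors so that the example runs with the Python standard library alone, and the toy documents, function names, window size, and number of resamples are all assumptions made for illustration. In an actual analysis, a word2vec model would be trained on each resample in place of the counting step.

```python
import math
import random
from collections import Counter


def cooccurrence_vector(docs, word, window=2):
    """Count the words found within `window` tokens of `word`.

    Assumption: this count vector is only a stand-in for a trained word vector.
    """
    counts = Counter()
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def bootstrap_similarity(docs, target, other, n_boot=200, seed=0):
    """Resample documents with replacement and summarize target/other similarity."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_boot):
        # One bootstrap replicate: draw len(docs) documents with replacement.
        sample = [rng.choice(docs) for _ in docs]
        sims.append(
            cosine(
                cooccurrence_vector(sample, target),
                cooccurrence_vector(sample, other),
            )
        )
    sims.sort()
    mean = sum(sims) / len(sims)
    # Approximate 5th and 95th percentiles of the bootstrap distribution.
    lower = sims[int(0.05 * (len(sims) - 1))]
    upper = sims[int(0.95 * (len(sims) - 1))]
    return mean, lower, upper


# Hypothetical toy corpus; a real application would use one time slice of documents.
docs = [
    "equality before the law for every citizen".split(),
    "the treaty affirmed equality among nations".split(),
    "racial equality and civil rights for every citizen".split(),
    "civil rights marchers demanded racial justice".split(),
]

mean, lower, upper = bootstrap_similarity(docs, "equality", "rights")
print(f"mean={mean:.3f}, 90% interval=({lower:.3f}, {upper:.3f})")
```

Resampling whole documents, rather than individual words or sentences, keeps each document's internal context intact while still showing how much a similarity estimate depends on which documents happen to be in a small corpus.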

Type
Articles
Copyright
Copyright © The Author(s) 2019. Published by Cambridge University Press on behalf of the Society for Political Methodology. 

Footnotes

Author’s note: Replication materials for this paper are available (Rodman 2019). This work was supported by the Center for American Politics and Public Policy (CAPPP) at the University of Washington and by the National Science Foundation [#1243917]. I am grateful for the invaluable advice and feedback received at various stages of this project from Chris Adolph, Jeffrey Arnold, Andreu Casas, Ryan Eastridge, Aziz Khan, Brendan O’Connor, Brandon Stewart, Rebecca Thorpe, Nora Webb Williams, and John Wilkerson, as well as from participants at the Ninth Annual Conference on New Directions in Analyzing Text as Data (TADA 2018). The paper was also much improved by thoughtful editorial and reviewer feedback at PA. Allyson McKinney and Molly Quinton contributed cheerful and diligent research assistance. This project was also improved by statistical and computational consulting provided by the Center for Statistics and the Social Sciences as well as the Center for Social Science Computation and Research, both at the University of Washington.

Contributing Editor: Daniel Hopkins

References

Antoniak, M., and Mimno, D. 2018. “Evaluating the Stability of Embedding-based Word Similarities.” Transactions of the Association for Computational Linguistics 6:107–119.
Arnold, J. B., Erlich, A., Jung, D. F., and Long, J. D. 2018. “Covering the Campaign: News, Elections, and the Information Environment in Emerging Democracies.” http://osf.io/preprints/socarxiv/af9jq.
Bamler, R., and Mandt, S. 2017. “Dynamic Word Embeddings.” In Proceedings of the 34th International Conference on Machine Learning, 380–389.
Blaydes, L., Grimmer, J., and McQueen, A. 2018. “Mirrors for Princes and Sultans: Advice on the Art of Governance in the Medieval Christian and Islamic Worlds.” Journal of Politics 80:1150–1167.
Blei, D., Ng, A., and Jordan, M. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3:993–1022.
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., and Kalai, A. 2016. “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.” CoRR abs/1607.06520.
Box-Steffensmeier, J., Freeman, J., Hitt, M., and Pevehouse, J. 2014. Time Series Analysis for the Social Sciences. New York: Cambridge University Press.
Bruni, E., Boleda, G., Baroni, M., and Tran, N.-K. 2012. “Distributional Semantics in Technicolor.” In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 136–145.
Caliskan, A., Bryson, J. J., and Narayanan, A. 2017. “Semantics Derived Automatically from Language Corpora Contain Human-Like Biases.” Science 356:183–186.
de Bolla, P. 2013. The Architecture of Concepts: The Historical Formation of Human Rights. New York: Fordham University Press.
Firth, J. R. 1957. “A Synopsis of Linguistic Theory, 1930–1955.” In Studies in Linguistic Analysis, edited by Firth, J. R., 1–32. Oxford, UK: Basil Blackwell.
Foner, E. 1998. The Story of American Freedom. New York: W. W. Norton.
Gallie, W. B. 2013. “Essentially Contested Concepts.” Proceedings of the Aristotelian Society, New Series 56:167–198.
Garg, N., Schiebinger, L., Jurafsky, D., and Zou, J. 2018. “Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes.” Proceedings of the National Academy of Sciences 115(16):E3635–E3644, https://www.pnas.org/content/115/16/E3635.
Goldberg, Y., and Levy, O. 2014. “word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method.” CoRR abs/1402.3722.
Goldman, M., and Perry, E. 2002. Changing Meanings of Citizenship in Modern China. Cambridge, MA: Harvard University Press.
Grimmer, J., and Stewart, B. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3):267–297.
Gurciullo, S., and Mikhaylov, S. 2017. “Detecting Policy Preferences and Dynamics in the UN General Debate with Neural Word Embeddings.” In IEEE Proceedings of the 2017 International Conference on the Frontiers and Advances in Data Science, 74–79.
Hamilton, W., Leskovec, J., and Jurafsky, D. 2016a. “Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change.” In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2116–2121.
Hamilton, W., Leskovec, J., and Jurafsky, D. 2016b. “Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change.” In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 1489–1501.
Han, R., Gill, M., Spirling, A., and Cho, K. 2018. “Conditional Word Embedding and Hypothesis Testing via Bayes-by-Backprop.” In Conference on Empirical Methods in Natural Language Processing.
Harris, Z. 1954. “Distributional Structure.” Word 10:146–162.
Heuser, R. 2016. “Gensim word2vec Procrustes Alignment.” Github Repository at https://gist.github.com/quadrismegistus/09a93e219a6ffc4f216fb85235535faf.
Hopkins, D., and King, G. 2010. “Extracting Systematic Social Science Meaning from Text.” American Journal of Political Science 54(1):229–247.
Howard, J., and Ruder, S. 2018. “Universal Language Model Fine-tuning for Text Classification.” arXiv:1801.06146.
Huang, E. H., Socher, R., Manning, C. D., and Ng, A. Y. 2012. “Improving Word Representations via Global Context and Multiple Word Prototypes.” In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 873–882.
Iyyer, M., Enns, P., Boyd-Graber, J., and Resnik, P. 2014. “Political Ideology Detection Using Recursive Neural Networks.” In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 1113–1122.
Jockers, M. L. 2013. Macroanalysis: Digital Methods and Literary History. Champaign: University of Illinois Press.
Jurafsky, D., and Martin, J. 2009. Speech and Natural Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Prentice Hall.
Kim, Y. 2014. “Convolutional Neural Networks for Sentence Classification.” arXiv:1408.5882.
Kim, Y., Chiu, Y.-I., Hanaki, K., Hegde, D., and Petrov, S. 2014. “Temporal Analysis of Language through Neural Language Models.” In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, 61–65.
Kulkarni, V., Al-Rfou, R., Perozzi, B., and Skiena, S. 2015. “Statistically Significant Detection of Linguistic Change.” In Proceedings of the 24th International Conference on World Wide Web, 625–635.
Larson, J., Angwin, J., and Parris, T. 2016. Breaking the Black Box: How Machines Learn to be Racist, https://www.propublica.org/article/breaking-the-black-box-how-machines-learn-to-be-racist?word=Clinton.
Levy, O., Goldberg, Y., and Dagan, I. 2015. “Improving Distributional Similarity with Lessons Learned from Word Embeddings.” Transactions of the Association for Computational Linguistics 3:211–225.
Mikolov, T., Chen, K., Corrado, G., and Dean, J. 2013. “Efficient Estimation of Word Representations in Vector Space.” arXiv:1301.3781.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. 2013. “Distributed Representations of Words and Phrases and Their Compositionality.” In Proceedings of the 26th International Conference on Neural Information Processing Systems – Volume 2, 3111–3119.
Mikolov, T., Yih, W.-T., and Zweig, G. 2013. “Linguistic Regularities in Continuous Space Word Representations.” In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 746–751.
Mimno, D. 2012. “Computational Historiography: Data Mining in a Century of Classics Journals.” ACM Journal on Computing and Cultural Heritage 5(1):3:1–3:19.
Mosteller, F., and Wallace, D. L. 1964/2008. Inference and Disputed Authorship: The Federalist. Chicago, IL: University of Chicago Press.
Nay, J. 2017. “Predicting and understanding law-making with word vectors and an ensemble model.” PloS One 12(5):1–14.
Newman, D. J., and Block, S. 2006. “Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper.” Journal of the American Society for Information Science and Technology 57(6):753–767.
Pan, S. J., and Yang, Q. 2010. “A Survey on Transfer Learning.” IEEE Transactions on Knowledge and Data Engineering 22(10):1345–1359.
Pennington, J., Socher, R., and Manning, C. D. 2014. “GloVe: Global Vectors for Word Representation.” EMNLP 14:1532–1543.
Quinn, K., Monroe, B., Colaresi, M., Crespin, M., and Radev, D. 2010. “How to Analyze Political Attention with Minimal Assumptions and Costs.” American Journal of Political Science 54(1):209–228.
Rehurek, R., and Sojka, P. 2010. “Software Framework for Topic Modeling with Large Corpora.” In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, 45–50.
Reynolds, N. B., and Saxonhouse, A. 1995. Hobbes and the Horae Subsecivae. Chicago, IL: University of Chicago Press.
Rhody, L. 2012. “Topic Modeling and Figurative Language.” Journal of Digital Humanities 2(1):19–38.
Rodman, E. 2019. “Replication Data for: A Timely Intervention: Tracking the Changing Meanings of Political Concepts with Word Vectors.” https://doi.org/10.7910/DVN/CGNX3M, Harvard Dataverse, V1.
Rong, X. 2014. “word2vec Parameter Learning Explained.” arXiv:1411.2738.
Rudkowsky, E., Haselmayer, M., Wastian, M., Jenny, M., Emrich, S., and Sedlmair, M. 2018. “More than Bags of Words: Sentiment Analysis with Word Embeddings.” Communication Methods and Measures 12:140–157.
Saldana, J. 2009. The Coding Manual for Qualitative Researchers. Thousand Oaks, CA: Sage.
Schnabel, T., Labutov, I., Mimno, D., and Joachims, T. 2015. “Evaluation methods for unsupervised word embeddings.” In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 298–307.
Turney, P., and Pantel, P. 2010. “From Frequency to Meaning: Vector Space Models of Semantics.” Journal of Artificial Intelligence Research 37:141–188.
Washington, B. T. 1895/1974. “Atlanta Compromise Speech.” In The Booker T. Washington Papers, edited by Harlan, L. R., 583–587. Urbana: University of Illinois Press.
Wilkerson, J., and Casas, A. 2017. “Large-Scale Computerized Text Analysis in Political Science: Opportunities and Challenges.” Annual Review of Political Science 20(1):529–544.
Yang, T.-I., Torget, A. J., and Mihalcea, R. 2011. “Topic Modeling on Historical Newspapers.” In Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, 96–104.
Yao, Z., Sun, Y., Ding, W., Rao, N., and Xiong, H. 2018. “Dynamic Word Embeddings for Evolving Semantic Discovery.” In WSDM 2018: The Eleventh ACM International Conference on Web Search and Data Mining.
You, Q., Luo, J., Jin, H., and Yang, J. 2015. “Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks.” In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 381–388.
Zhang, Y., and Wallace, B. 2015. “A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification.” In Proceedings of the 8th International Joint Conference on Natural Language Processing, 253–263.