
22 - Construction Grammar and Language Models

from Part VI - Constructional Applications

Published online by Cambridge University Press: 30 January 2025

Mirjam Fried, Univerzita Karlova
Kiki Nikiforidou, University of Athens, Greece

Summary

Recent progress in deep learning and natural language processing has given rise to powerful models that are trained primarily on a cloze-like task, that is, predicting words masked out of running text, and that show some evidence of encoding substantial linguistic information, including some constructional knowledge. This discovery presents an exciting opportunity for a synergistic relationship between computational methods and Construction Grammar research. In this chapter, we explore three distinct approaches to this interplay: (i) computational methods for text analysis, (ii) computational Construction Grammar, and (iii) deep learning models, with a particular focus on language models. We touch on the first two approaches as a contextual foundation before providing an accessible yet comprehensive overview of deep learning models, one that also addresses reservations construction grammarians may have. We then examine experiments that explore the emergence of constructionally relevant information within these models, as well as the aspects of Construction Grammar that may pose challenges for such models. The chapter aims to foster collaboration between researchers in natural language processing and Construction Grammar and, in doing so, to pave the way for new insights and advances in both fields.
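
As a concrete illustration of what a cloze-like task means in practice (a minimal sketch added for exposition, not taken from the chapter itself): masked language models such as BERT are trained to predict words that have been blanked out of running text, and off-the-shelf tooling lets one query this objective directly. The example below assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint; neither is prescribed by the chapter.

    # A minimal cloze-style query against a masked language model.
    # Assumes: pip install torch transformers
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # A ditransitive frame: which verbs does the model rank highest
    # for the blanked-out slot?
    for prediction in fill_mask("She [MASK] him a letter yesterday."):
        print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")

Completions such as "wrote", "sent", or "gave" would suggest that the model has learned which verbs fit the ditransitive frame, which is the kind of constructional sensitivity the experiments discussed in the chapter investigate.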
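
Experiments on the emergence of constructionally relevant information are often implemented as probing studies, which can likewise be sketched in a few lines: freeze the model, extract sentence embeddings, and train a simple classifier to test whether two constructions are linearly separable in the representation space. The sentences, labels, and model choice below are illustrative assumptions, not the chapter's actual stimuli or setup.

    # A minimal probing sketch: can a linear classifier separate
    # ditransitive from to-dative sentences using frozen BERT embeddings?
    import torch
    from sklearn.linear_model import LogisticRegression
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    sentences = [
        "She gave him the book.",         # ditransitive
        "He sent her a long letter.",     # ditransitive
        "She gave the book to him.",      # to-dative
        "He sent a long letter to her.",  # to-dative
    ]
    labels = [0, 0, 1, 1]  # toy labels for the two constructions

    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, return_tensors="pt")
        hidden = model(**batch).last_hidden_state
        # Mean-pool token vectors into one vector per sentence,
        # ignoring padding positions via the attention mask.
        mask = batch["attention_mask"].unsqueeze(-1)
        embeddings = (hidden * mask).sum(1) / mask.sum(1)

    probe = LogisticRegression().fit(embeddings.numpy(), labels)
    print(probe.score(embeddings.numpy(), labels))  # training accuracy only

A real probing study would of course use many controlled sentence pairs and held-out data; four sentences serve only to make the pipeline concrete.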

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2025


