A structural approach to the automatic adjudication of word sense disagreements

Published online by Cambridge University Press:  01 October 2008

ROBERTO NAVIGLI*
Affiliation:
Dipartimento di Informatica, University of Rome “La Sapienza”, 00198 Rome, Italy

Abstract

The semantic annotation of texts with senses from a computational lexicon is a complex and often subjective task. The fine granularity of the WordNet sense inventory [Fellbaum, Christiane (ed.). 1998. WordNet: An Electronic Lexical Database. MIT Press], a de facto standard within the research community, is one of the main causes of both the low inter-tagger agreement, which ranges between 70% and 80%, and the disappointing performance of automated fine-grained disambiguation systems (around 65% for the state of the art in the Senseval-3 English all-words task). To improve the performance of both manual and automated sense taggers, we can either change the sense inventory (e.g. by adopting a new dictionary or clustering WordNet senses) or aim to resolve the disagreements between annotators by dealing with the fineness of sense distinctions. The former approach is not viable in the short term, as wide-coverage resources are not publicly available and no large-scale reliable clustering of WordNet senses has been released to date. The latter approach requires the ability to distinguish between subtle or misleading sense distinctions. In this paper, we propose the use of structural semantic interconnections, a specific kind of lexical chain, for the adjudication of disagreed sense assignments to words in context. The approach exploits the structure of the lexicon to smooth possible divergences between sense annotators and to foster coherent choices. We perform a twofold experimental evaluation of the approach, applied to manual annotations from the SemCor corpus and to automatic annotations from the Senseval-3 English all-words competition. Both sets of experiments and results are entirely novel: structural adjudication improves the state-of-the-art performance in all-words disambiguation by 3.3 points (achieving a 68.5% F1-score) and attains around 80% precision and 60% recall in the adjudication of disagreements between human annotators.

Type
Papers
Copyright
Copyright © Cambridge University Press 2008

References

Agirre, Eneko and de Lacalle, Oier López. 2003. Clustering WordNet word senses. In Proceedings of Conference on Recent Advances on Natural Language (RANLP), Borovets, Bulgaria, pp. 121–30.
Agirre, Eneko, Martínez, David, de Lacalle, Oier López, and Soroa, Aitor. 2006. Two graph-based algorithms for state-of-the-art WSD. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, pp. 585–93.
Barzilay, Regina and Elhadad, Michael. 1997. Using lexical chains for text summarization. In Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization, Madrid, Spain, pp. 10–17.
Bentivogli, Luisa, Forner, Pamela, and Pianta, Emanuele. 2004. Evaluating cross-language annotation transfer in the MultiSemCor corpus. In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, pp. 364–70.
Berners-Lee, Tim. 1999. Weaving the Web. Harper, San Francisco, CA, USA.
Brody, Samuel, Navigli, Roberto, and Lapata, Mirella. 2006. Ensemble methods for unsupervised WSD. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics joint with the 21st International Conference on Computational Linguistics (COLING-ACL 2006), Sydney, Australia, pp. 97–104.
Chklovski, Tim and Mihalcea, Rada. 2002. Building a sense tagged corpus with Open Mind Word Expert. In Proceedings of ACL 2002 Workshop on WSD: Recent Successes and Future Directions, Philadelphia, PA.
Chklovski, Tim and Mihalcea, Rada. 2003. Exploiting agreement and disagreement of human annotators for word sense disambiguation. In Proceedings of Recent Advances in NLP (RANLP 2003), Borovets, Bulgaria.
Cohen, Jacob A. 1960. A coefficient of agreement of nominal scales. Educational and Psychological Measurement 20 (1): 37–46.
Cuadros, Montse and Rigau, German. 2006. Quality assessment of large scale knowledge resources. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, pp. 534–41.
Decadt, Bart, Hoste, Véronique, Daelemans, Walter, and van den Bosch, Antal. 2004. GAMBL, genetic algorithm optimization of memory-based WSD. In Proceedings of ACL 2004 SENSEVAL-3 Workshop, Barcelona, Spain, pp. 108–12.
Dolan, William B. 1994. Word sense ambiguation: clustering related senses. In Proceedings of 15th Conference on Computational Linguistics (COLING), Kyoto, Japan, pp. 712–16.
Edmonds, Philip and Kilgarriff, Adam. 2002. Introduction to the special issue on evaluating word sense disambiguation systems. Journal of Natural Language Engineering 8 (4): 279–91.
Fellbaum, Christiane (ed.) 1998. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA, USA.
Fellbaum, Christiane, Grabowski, Joachim, and Landes, Shari. 1998. Performance and confidence in a semantic annotation task. In Fellbaum, Christiane (ed.) WordNet: An Electronic Lexical Database, pp. 217–37, MIT Press, Cambridge, MA, USA.
Florian, Radu, Cucerzan, Silviu, Schafer, Charles, and Yarowsky, David. 2002. Combining classifiers for word sense disambiguation. Journal of Natural Language Engineering 8 (4): 1–14.
Galley, Michel and McKeown, Kathleen. 2003. Improving word sense disambiguation in lexical chaining. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), Acapulco, Mexico, pp. 1486–8.
Hanks, Patrick. 2000. Do word meanings exist? Computers and the Humanities 34 (1–2): 205–15.
Harabagiu, Sanda, Miller, George, and Moldovan, Dan. 1999. WordNet 2 – a morphologically and semantically enhanced resource. In Proceedings of SIGLEX-99, University of Maryland, USA, pp. 1–8.
Hirst, Graeme and St-Onge, David. 1998. Lexical chains as representations of context for the detection and correction of malapropisms. In Fellbaum, Christiane (ed.) WordNet: An Electronic Lexical Database, pp. 305–32, MIT Press.
Hovy, Eduard H., Marcus, Mitchell P., Palmer, Martha, Ramshaw, Lance A., and Weischedel, Ralph M. 2006. OntoNotes: the 90% solution. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, New York, USA.
Jiang, Jay J. and Conrath, David W. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference Research on Computational Linguistics (ROCLING X), Taiwan, pp. 19–33.
Kilgarriff, Adam. 1997. I don't believe in word senses. Computers and the Humanities 31 (2): 91–113.
Klein, Dan, Toutanova, Kristina, Ilhan, H. Tolga, Kamvar, Sepandar D., and Manning, Christopher D. 2002. Combining heterogeneous classifiers for word-sense disambiguation. In Proceedings of the ACL-02 workshop on Word sense disambiguation: recent successes and future directions, Morristown, NJ, pp. 74–80.
Kohomban, Upali Sathyajith and Lee, Wee Sun. 2007. Optimizing classifier performance in word sense disambiguation by redefining sense classes. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, pp. 1635–40.
Lea, Diana (ed.) 2002. Oxford Collocations. Oxford University Press, USA.
Leacock, Claudia, Chodorow, Martin, and Miller, George. 1998. Using corpus statistics and WordNet relations for sense identification. Computational Linguistics 24 (1): 147–65.
Litkowski, Ken. 2004. Senseval-3 task: word-sense disambiguation of WordNet glosses. In Proceedings of ACL 2004 SENSEVAL-3 Workshop, Barcelona, Spain, pp. 13–16.
Longman (ed.) 2003. Longman Language Activator. Pearson Education, Harlow, Essex, UK.
Magnini, Bernardo and Cavaglià, Gabriela. 2000. Integrating subject field codes into WordNet. In Proceedings of the 2nd Conference on Language Resources and Evaluation (LREC), Athens, Greece, pp. 1413–18.
Mihalcea, Rada and Faruque, Ehsanul. 2004. SenseLearner: minimally supervised word sense disambiguation for all words in open text. In Proceedings of ACL 2004 SENSEVAL-3 Workshop, Barcelona, Spain, pp. 155–8.
Mihalcea, Rada, Tarau, Paul, and Figa, Elizabeth. 2004. PageRank on semantic networks, with application to word sense disambiguation. In Proceedings of the 20th COLING 2004, Geneva, Switzerland, pp. 1126–32.
Miller, George A., Leacock, Claudia, Tengi, Randee, and Bunker, Ross T. 1993. A semantic concordance. In Proceedings of the ARPA Workshop on Human Language Technology, Princeton, NJ, USA, pp. 303–8.
Miller, Irwin and Miller, Marylees (eds.) 2003. John E. Freund's Mathematical Statistics with Applications, 7th Edition. Prentice Hall, NJ, USA.
Morris, Jane and Hirst, Graeme. 1991. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17 (1): 21–43.
Navigli, Roberto. 2005. Semi-automatic extension of large-scale linguistic knowledge bases. In Proceedings of the 18th FLAIRS, Clearwater Beach, USA, pp. 548–53.
Navigli, Roberto. 2006a. Consistent validation of manual and automatic sense annotations with the aid of semantic graphs. Computational Linguistics 32 (2): 273–81.
Navigli, Roberto. 2006b. Experiments on the validation of sense annotations assisted by lexical chains. In Proceedings of the European Chapter of the Annual Meeting of the Association for Computational Linguistics (EACL), Trento, Italy, pp. 129–36.
Navigli, Roberto. 2006c. Meaningful clustering of senses helps boost word sense disambiguation performance. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics joint with the 21st International Conference on Computational Linguistics (COLING-ACL 2006), Sydney, Australia, pp. 105–12.
Navigli, Roberto and Velardi, Paola. 2005. Structural semantic interconnections: a knowledge-based approach to word sense disambiguation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 27 (7): 1075–88.
Navigli, Roberto, Litkowski, Kenneth C., and Hargraves, Orin. 2007. SemEval-2007 task 07: coarse-grained English all-words task. In Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, pp. 30–5. Association for Computational Linguistics.
Ng, Hwee T., Lim, Chung Y., and Foo, Shou K. 1999. A case study on the inter-annotator agreement for word sense disambiguation. In Proceedings of ACL Workshop: Standardizing Lexical Resources, College Park, MD, pp. 9–13.
Palmer, Martha. 2000. Consistent criteria for sense distinctions. Computers and the Humanities 34 (1–2): 217–22.
Palmer, Martha, Dang, Hoa, and Fellbaum, Christiane. 2007. Making fine-grained and coarse-grained sense distinctions, both manually and automatically. Journal of Natural Language Engineering 13 (2): 137–63.
Peters, Wim, Peters, Ivonne, and Vossen, Piek. 1998. Automatic sense clustering in EuroWordNet. In Proceedings of the 1st Conference on Language Resources and Evaluation (LREC), Granada, Spain.
Pianta, Emanuele, Bentivogli, Luisa, and Girardi, Christian. 2002. MultiWordNet: developing an aligned multilingual database. In Proceedings of the First International Conference on Global WordNet, Mysore, India, pp. 21–5.
Pustejovsky, James. 1995. The Generative Lexicon. MIT Press, Cambridge, MA, USA.
Rigau, German, Atserias, Jordi, and Agirre, Eneko. 1997. Combining unsupervised lexical knowledge methods for word sense disambiguation. In Proceedings of 35th Annual Meeting of the Association for Computational Linguistics joint with 8th Conference of the European Chapter of the Association for Computational Linguistics (ACL/EACL'97), Madrid, Spain, pp. 48–55.
Snyder, Benjamin and Palmer, Martha. 2004. The English all-words task. In Proceedings of ACL 2004 SENSEVAL-3 Workshop, Barcelona, Spain, pp. 41–3.
Soanes, Catherine and Stevenson, Angus (eds.) 2003. Oxford Dictionary of English. Oxford University Press.
Stevenson, Mark and Wilks, Yorick. 2001. The interaction of knowledge sources in word sense disambiguation. Computational Linguistics 27 (3): 321–49.
Véronis, Jean. 2001. Sense tagging: does it make sense? In Corpus Linguistics 2001 Conference, Lancaster, UK.
Véronis, Jean. 2004. HyperLex: lexical cartography for information retrieval. Computer Speech and Language 18 (3): 223–52.
Yuret, Deniz. 2004. Some experiments with a naive Bayes WSD system. In Proceedings of ACL 2004 SENSEVAL-3 Workshop, Barcelona, Spain, pp. 265–8.