
Cluster-based mention typing for named entity disambiguation

Published online by Cambridge University Press: 20 August 2020

Arda Çelebi*
Affiliation:
Department of Computer Engineering, Boğaziçi University, Bebek, 34342 İstanbul, Turkey
Arzucan Özgür*
Affiliation:
Department of Computer Engineering, Boğaziçi University, Bebek, 34342 İstanbul, Turkey
*Corresponding authors. E-mails: [email protected]; [email protected]

Abstract

An entity mention in text such as “Washington” may correspond to many different named entities such as the city “Washington D.C.” or the newspaper “Washington Post.” The goal of named entity disambiguation (NED) is to identify the mentioned named entity correctly among all possible candidates. If the type (e.g., location or person) of a mentioned entity can be correctly predicted from the context, it may increase the chance of selecting the right candidate by assigning low probability to the unlikely ones. This paper proposes cluster-based mention typing for NED. The aim of mention typing is to predict the type of a given mention based on its context. Generally, manually curated type taxonomies such as Wikipedia categories are used. We introduce cluster-based mention typing, where named entities are clustered based on their contextual similarities and the cluster ids are assigned as types. The hyperlinked mentions and their context in Wikipedia are used to obtain these cluster-based types. Then, mention typing models are trained on these mentions, which have been labeled with their cluster-based types through distant supervision. At the NED phase, the cluster-based types of a given mention are first predicted, and these types are then used as features in a ranking model to select the best entity among the candidates. We represent entities at multiple contextual levels and obtain different clusterings (and thus typing models) based on each level. As each clustering partitions the entity space differently, mention typing based on each clustering discriminates the mention differently. When predictions from all typing models are used together, our system achieves better or comparable results based on randomization tests with respect to the state-of-the-art levels on four de facto test sets.
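
The pipeline described in the abstract (cluster entities by context, use cluster ids as types, predict a mention's type, then rank candidates) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the three-dimensional context vectors, the prior probabilities, the k-means-style clustering with cosine similarity, the nearest-centroid stand-in for a trained typing model, and the product of prior and type probability used in place of a learned ranking model.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def nearest(vec, centroids):
    """Index of the centroid most similar to vec."""
    return max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))

def cluster_entities(entity_vecs, seeds, iters=10):
    """K-means-style clustering; the returned cluster ids play the role of types."""
    centroids = [list(entity_vecs[s]) for s in seeds]
    types = {}
    for _ in range(iters):
        types = {e: nearest(v, centroids) for e, v in entity_vecs.items()}
        for c in range(len(centroids)):
            members = [entity_vecs[e] for e, t in types.items() if t == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return types, centroids

def predict_type_probs(context_vec, centroids):
    """Stand-in typing model: normalized similarity of a mention context to each cluster."""
    sims = [max(cosine(context_vec, c), 0.0) for c in centroids]
    total = sum(sims)
    return [s / total for s in sims] if total else [1.0 / len(sims)] * len(sims)

def rank_candidates(candidates, type_probs, types):
    """Score = prior * P(candidate's cluster type | mention context), best first."""
    scored = [(prior * type_probs[types[e]], e) for e, prior in candidates.items()]
    return [e for _, e in sorted(scored, reverse=True)]

# Hypothetical context vectors over three made-up dimensions: [place, media, sport].
entity_context = {
    "Washington_D.C.": [0.9, 0.1, 0.0],
    "Seattle": [0.8, 0.0, 0.2],
    "The_Washington_Post": [0.2, 0.9, 0.0],
    "The_New_York_Times": [0.1, 0.95, 0.0],
}
types, centroids = cluster_entities(
    entity_context, seeds=["Washington_D.C.", "The_Washington_Post"]
)

# A mention "Washington" in a media-heavy context, with hypothetical candidate priors.
mention_context = [0.1, 0.8, 0.0]
type_probs = predict_type_probs(mention_context, centroids)
ranking = rank_candidates(
    {"Washington_D.C.": 0.7, "The_Washington_Post": 0.3}, type_probs, types
)
print(types)
print(ranking)
```

In this toy setting the predicted cluster type outweighs the higher prior of the city, which mirrors the abstract's point that a correctly predicted type assigns low probability to unlikely candidates. The paper additionally combines several clusterings built at different contextual levels, which this sketch does not attempt.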

Type
Article
Copyright
© The Author(s), 2020. Published by Cambridge University Press
