Instance-based natural language generation

S. VARGES; C. MELLISH

doi:10.1017/S1351324910000069


Published online by Cambridge University Press: 12 May 2010


S. VARGES: Affiliation:
Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38050 Povo (TN), Italy
C. MELLISH: Affiliation:
Department of Computing Science, University of Aberdeen, King's College, Aberdeen AB24 3UE, UK


Abstract

We investigate the use of instance-based ranking methods for surface realization in natural language generation. Our approach to instance-based natural language generation (IBNLG) employs two components: a rule system that ‘overgenerates’ a number of realization candidates from a meaning representation and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. We develop an efficient search technique for identifying the optimal candidate based on a novel extension of the A* algorithm. The rule system is produced automatically from a semantically annotated fragment of the Penn Treebank II containing management succession texts. We detail the annotation scheme and grammar induction algorithm and evaluate the efficiency and output of the generator. We also discuss issues such as input coverage (completeness) and fluency that are relevant to surface generation in general.
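The overgenerate-and-rank idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the abstract does not specify the similarity measure, so cosine similarity over word bigrams stands in for it; the candidate and instance sentences are invented; and candidates are ranked exhaustively rather than with the A*-based search the paper develops.

```python
from collections import Counter
from math import sqrt


def ngrams(tokens, n=2):
    """Count word n-grams, padding with sentence boundary markers."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(candidates, instances):
    """Order candidates by their similarity to the closest corpus instance.

    This is an exhaustive ranking (an assumption of this sketch), not the
    A*-based search described in the abstract.
    """
    instance_vectors = [ngrams(s.split()) for s in instances]

    def score(candidate):
        vec = ngrams(candidate.split())
        return max((cosine(vec, iv) for iv in instance_vectors), default=0.0)

    return sorted(candidates, key=score, reverse=True)


# Hypothetical training instances (invented, in the style of management succession texts).
instances = [
    "John Smith was named chief executive of Acme Corp .",
    "Mary Jones succeeds Bob Lee as chairman .",
]

# Hypothetical output of an overgenerating rule system for a single input.
candidates = [
    "Acme Corp named Ann Lee chief executive .",
    "Ann Lee was named chief executive of Acme Corp .",
    "chief executive Ann Lee of Acme Corp named was .",
]

best = rank(candidates, instances)[0]
print(best)
```

In this toy setting the ranker prefers the candidate whose word order most closely mirrors an attested instance, which is the intuition behind using corpus examples to select fluent realizations.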

Type: Papers
Information: Natural Language Engineering, Volume 16, Issue 3, July 2010, pp. 309–346

DOI: https://doi.org/10.1017/S1351324910000069
Copyright: Copyright © Cambridge University Press 2010


