
Best parse parsing with Earley's and Inside algorithms on probabilistic RTN

Published online by Cambridge University Press:  12 September 2008

Young S. Han
Affiliation:
Korea Advanced Institute of Science and Technology, Taejon, Korea
Key-Sun Choi
Affiliation:
Korea Advanced Institute of Science and Technology, Taejon, Korea

Abstract

Inside parsing is a best parse parsing method based on the Inside algorithm, which is often used to estimate the probabilistic parameters of stochastic context-free grammars. It gives a best parse in O(N^3 G^3) time, where N is the input size and G is the grammar size. The Earley algorithm can be made to return best parses with the same complexity in N.
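As a rough illustration of how an Inside-style chart can yield a best parse, the sketch below replaces the sum in the Inside recursion with a maximum and keeps backpointers. It is a minimal sketch under stated assumptions, not the authors' method: the paper works with probabilistic RTNs, whereas this example assumes a grammar in Chomsky normal form, and the toy grammar, the function name `best_parse`, and the tree encoding are all hypothetical. The three nested loops over span width, start position, and split point give the cubic dependence on input length; the innermost loop over rules is where grammar size enters, although the paper's own definition of G may differ.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustration only).
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
LEXICAL = {  # (parent, word) -> rule probability
    ("NP", "she"): 0.5,
    ("NP", "fish"): 0.5,
    ("V", "eats"): 1.0,
}


def best_parse(words, start="S"):
    """Return (probability, tree) of the most probable parse, or (0.0, None)."""
    n = len(words)
    best = defaultdict(float)  # (i, j, A) -> best probability of A over words[i:j]
    back = {}                  # (i, j, A) -> word, or (split, left, right)

    # Width-1 spans: lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in LEXICAL.items():
            if word == w and p > best[i, i + 1, a]:
                best[i, i + 1, a] = p
                back[i, i + 1, a] = w

    # Wider spans: maximise over split points and binary rules.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (a, b, c), p in BINARY.items():
                    score = p * best[i, k, b] * best[k, j, c]
                    if score > best[i, j, a]:
                        best[i, j, a] = score
                        back[i, j, a] = (k, b, c)

    def build(i, j, a):
        # Follow backpointers to rebuild the tree as nested tuples.
        bp = back[i, j, a]
        if isinstance(bp, str):
            return (a, bp)
        k, b, c = bp
        return (a, build(i, k, b), build(k, j, c))

    if best[0, n, start] == 0.0:
        return 0.0, None
    return best[0, n, start], build(0, n, start)


prob, tree = best_parse(["she", "eats", "fish"])
print(prob, tree)
```

Using a sum instead of the maximum in the inner update would recover the ordinary Inside probability of the sentence rather than the probability of its single best parse.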

Through experiments, we show that Inside parsing can be more efficient than Earley parsing when the grammar is sufficiently large and the input sentences are sufficiently short. For instance, Inside parsing is more efficient on sentences of 16 or fewer words for a grammar containing 429 states. In practice, parsing can be made efficient by employing the two methods selectively.

The redundancy of the Inside algorithm can be reduced by top-down filtering using the chart produced by the Earley algorithm, which is useful in training the probabilistic parameters of a grammar. Extensive experiments on the Penn Tree corpus show that the efficiency of the Inside computation can be improved by up to 55%.
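The filtering idea can be sketched as follows: a bottom-up Inside pass tries every rule on every span, while a set of (start, end, nonterminal) items licensed by a top-down chart lets it skip spans that can never take part in a full parse. This is only a sketch under assumptions. It uses a hypothetical Chomsky normal form toy grammar rather than the paper's probabilistic RTN, the function `inside` and its `allowed` parameter are illustrative names, and the set `earley_items` is written by hand as a stand-in for what an Earley chart would supply. The work counter is a crude proxy for cost and is not meant to reproduce the reported figures.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustration only).
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
LEXICAL = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}


def inside(words, allowed=None, start="S"):
    """Return (sentence probability, number of rule applications tried).

    If `allowed` is given, a binary rule with parent A is tried on span
    (i, j) only when (i, j, A) is in `allowed`.
    """
    n = len(words)
    beta = defaultdict(float)  # (i, j, A) -> inside probability
    work = 0

    # Width-1 spans: lexical rules (left unfiltered for simplicity).
    for i, w in enumerate(words):
        for (a, word), p in LEXICAL.items():
            if word == w:
                beta[i, i + 1, a] += p

    # Wider spans: sum over split points and binary rules.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for (a, b, c), p in BINARY.items():
                if allowed is not None and (i, j, a) not in allowed:
                    continue  # the top-down chart rules this item out
                for k in range(i + 1, j):
                    work += 1
                    beta[i, j, a] += p * beta[i, k, b] * beta[k, j, c]

    return beta[0, n, start], work


sentence = ["she", "eats", "fish"]
# Hand-written stand-in for the multi-word items an Earley chart would license.
earley_items = {(1, 3, "VP"), (0, 3, "S")}

full = inside(sentence)
filtered = inside(sentence, allowed=earley_items)
print(full, filtered)
```

Both calls return the same sentence probability; only the number of rule applications attempted differs, which is the kind of saving that matters when the Inside pass is repeated many times during parameter training.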

Type
Articles
Copyright
Copyright © Cambridge University Press 1995
