Hostname: page-component-586b7cd67f-l7hp2 Total loading time: 0 Render date: 2024-12-01T03:24:59.989Z Has data issue: false hasContentIssue false

Learning correction grammars

Published online by Cambridge University Press:  12 March 2014

Lorenzo Carlucci
Affiliation:
Università di Roma“La Sapienza”, Department of Computer Science, Via Salaria 113, 00198, Roma Scuola Normale Superiore di Pisa, Classe di Lettere, Piazza dei Cavalieri7, 56126 Pisa, E-mail: [email protected]
John Case
Affiliation:
University of Delaware, Department of Computer and Information Sciences, Newark, De 19716-2586, U.S.A., E-mail: [email protected]
Sanjay Jain
Affiliation:
National University of Singapore, Department of Computer Science, Singapore 117417. Republic of Singapore, E-mail: [email protected]

Abstract

We investigate a new paradigm in the context of learning in the limit, namely, learning correction grammars for classes of computably enumerable (c.e.) languages. Knowing a language may feature a representation of it in terms of two grammars. The second grammar is used to make corrections to the first grammar. Such a pair of grammars can be seen as a single description of (or grammar for) the language. We call such grammars correction grammars. Correction grammars capture the observable fact that people do correct their linguistic utterances during their usual linguistic activities.

We show that learning correction grammars for classes of c.e. languages in the TxtEx-mode (i.e., converging to a single correct correction grammar in the limit) is sometimes more powerful than learning ordinary grammars even in the TxtBc-model (where the learner is allowed to converge to infinitely many syntactically distinct but correct conjectures in the limit). For each n ≥ 0. there is a similar learning advantage, again in learning correction grammars for classes of c.e. languages, but where we compare learning correction grammars that make n + 1 corrections to those that make n corrections.

The concept of a correction grammar can be extended into the constructive transfinite, using the idea of counting-down from notations for transfinite constructive ordinals. This transfinite extension can also be conceptualized as being about learning Ershov-descriptions for c.e. languages. For u a notation in Kleene's general system (O, <o) of ordinal notations for constructive ordinals, we introduce the concept of an u-correction grammar, where u is used to bound the number of corrections that the grammar is allowed to make. We prove a general hierarchy result: if u and v are notations for constructive ordinals such that u <ov. then there are classes of c.e. languages that can be TxtEx-learned by conjecturing v-correction grammars but not by conjecturing u-correction grammars.

Surprisingly, we show that—above “ω-many” corrections—it is not possible to strengthen the hierarchy: TxtEx-learning u-correction grammars of classes of c.e. languages, where u is a notation in O for any ordinal, can be simulated by TxtBc-learning w-correction grammars, where w is any notation for the smallest infinite ordinal ω.

Type
Research Article
Copyright
Copyright © Association for Symbolic Logic 2009

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

REFERENCES

[1]Ambainis, A., Case, J., Jain, S., and Suraj, M., Parsimony hierarchies for inductive inference, this Journal, vol. 69 (2004), no. 1, pp. 287327.Google Scholar
[2]Angluin, D., Inductive inference of formal languages from positive data. Information and Control, vol. 45 (1980), pp. 117135.CrossRefGoogle Scholar
[3]Ash, J. and Knight, J. F., Recursive structures and Ershov's hierarchy, Mathematical Logic Quarterly, vol. 42 (1996), pp. 461468.CrossRefGoogle Scholar
[4]Baliga, G., Case, J., Jain, S., and Suraj, M., Machine learning of higher order programs, this Journal, vol. 59 (1994), no. 2, pp. 486500.Google Scholar
[5]Barzdiņš, J. M., Inductive inference of automata, functions and programs, Proceedings of the 20th International Congress of Mathematicians (James, R. D., editor), 1974, pp. 455460.Google Scholar
[6]Blum, L. and Blum, M., Toward a mathematical theory of inductive inference. Information and Control, vol. 28 (1975), pp. 125155.CrossRefGoogle Scholar
[7]Blum, M., A machine-independent theory of the complexity of recursive functions, Journal of the ACM, vol. 14 (1967), pp. 322336.CrossRefGoogle Scholar
[8]Buchholz, W., Proof-theoretic analysis of termination proofs, Annals of Pure and Applied Logic, vol. 75 (1995), pp. 5765.CrossRefGoogle Scholar
[9]Burgin, M., Grammars with prohibition and human-computer interaction. Proceedings of the Business and Industry Symposium and the Military, Government, and Aerospace Simulation Symposium (Hill, J., editor), 2005, pp. 143147.Google Scholar
[10]Carlucci, L., Case, J., and Jain, S., Learning correction grammars, Proceedings of the 20th Annual Conference on Learning Theory (Bshouty, N. and Gentile, C., editors), Lecture Notes in Artificial Intelligence, vol. 4539, Spring Verlag, 2007, pp. 203217.Google Scholar
[11]Case, J., The power of vacillation in language learning, SIAM Journal on Computing, vol. 28 (1999), no. 6. pp. 19411969.CrossRefGoogle Scholar
[12]Case, J., Jain, S., and Sharma, A., On learning limiting programs. International Journal of Foundations of Computer Science, vol. 3 (1992), no. 1, pp. 93115.CrossRefGoogle Scholar
[13]Case, J. and Lynes, C., Machine inductive inference and language identification, Proceedings of the 9th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 140, Springer Verlag, 1982, pp. 107115.CrossRefGoogle Scholar
[14]Case, J. and Royer, J., Program size complexity of correction grammars, Working paper, 2008.Google Scholar
[15]Case, J. and Smith, C., Comparison of identification criteria for machine inductive inference. Theoretical Computer Science, vol. 25 (1983), pp. 193220.CrossRefGoogle Scholar
[16]Chen, K., Tradeoffs in inductive inference of nearly minimal sized programs, Information and Control, vol. 52 (1982), pp. 6886.CrossRefGoogle Scholar
[17]Epstein, R. L., Haas, R., and Kramer, R., Hierarchies of sets and degrees below 0′, Logic Year 1979-1980 (Lerman, M., Schmerl, J. H., and Soare, R. I., editors), Lecture Notes in Mathematics, vol. 859, Springer Verlag, 1981, pp. 3248.CrossRefGoogle Scholar
[18]Ershov, Y. L., A hierarchy of sets I, Algebra and Logic, vol. 7 (1968), pp. 2343.Google Scholar
[19]Ershov, Y. L., A hierarchy of sets II, Algebra and Logic, vol. 7 (1968), pp. 212232.CrossRefGoogle Scholar
[20]Ershov, Y. L., A hierarchy of sets III, Algebra and Logic, vol. 9 (1970), pp. 2031.CrossRefGoogle Scholar
[21]Freivalds, R., Minimal Gödel numbers and their identification in the limit, Proceedings of the 4th Symposium on Mathematical Foundations of Computer Science (Becvar, Jiri, editor), Lecture Notes in Computer Science, vol. 32, 1975, pp. 219225.Google Scholar
[22]Freivalds, R., Inductive inference of minimal programs, Proceedings of the 3rd Annual Workshop on Computational Learning Theory (Fulk, M. and Case, J., editors), Morgan Kaufmann Publishers, 1990, pp. 320.Google Scholar
[23]Freivalds, R. and Smith, C., On the role of procrastination in machine learning, Information and Computation, vol. 107 (1993), no. 2, pp. 237271.CrossRefGoogle Scholar
[24]Gold, E. M., Language identification in the limit. Information and Control, vol. 10 (1967), pp. 447474.CrossRefGoogle Scholar
[25]Hopcroft, J. and Ullman, J., Introduction to Automata Theory, Languages and Computation, Addison-Wesley, 1979.Google Scholar
[26]Jain, S., Osherson, D.. Royer, J., and Sharma, A., Systems that Learn: An Introduction to Learning Theory, 2nd ed., MIT Press, 1999.CrossRefGoogle Scholar
[27]Jain, S. and Sharma, A.. Program size restrictions in computational learning, Theoretical Computer Science, vol. 127 (1994), pp. 351386.CrossRefGoogle Scholar
[28]Kinber, E. B., On the synthesis in the limit of almost minimal Godel numbers, Theory of Algorithms and Programs (Bārzdiñš, J. M., editor), vol. 1, Latvian State University, 1974, pp. 221223.Google Scholar
[29]Kleene, S. C., On notation for ordinal numbers, this Journal, vol. 3 (1938), pp. 150155.Google Scholar
[30]Kleene, S. C.. On the forms of predicates in the theory of constructive ordinals, American Journal of Mathematics, vol. 66 (1944). pp. 4158.CrossRefGoogle Scholar
[31]Kleene, S. C., On the forms of predicates in the theory of constructive ordinals (second paper), American Journal of Mathematics, vol. 77 (1955), pp. 405428.CrossRefGoogle Scholar
[32]Machtey, M. and Young, P., An Introduction to the General Theory of Algorithms, North Holland, New York, 1978.Google Scholar
[33]Osherson, D., Stob, M., and Weinstein, S., Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists, MIT Press, 1986.Google Scholar
[34]Osherson, D. and Weinstein, S., Criteria for language learning, Information and Control, vol. 52 (1982), pp. 123138.CrossRefGoogle Scholar
[35]Osherson, D. and Weinstein, S., A note on formal learning theory, Cognition, vol. 11 (1982), pp. 7788.CrossRefGoogle ScholarPubMed
[36]Osherson, D. and Weinstein, S., Learning theory and natural language, Cognition, vol. 17 (1984), pp. 128.CrossRefGoogle ScholarPubMed
[37]Peter, R., Recursive Functions, Academic Press, New York, 1967.Google Scholar
[38]Pinker, S., Formal models of language learning, Cognition, vol. 7 (1979), pp. 217283.CrossRefGoogle ScholarPubMed
[39]Rathjen, M., The realm of ordinal analysis, Sets and Proofs (Cooper, S. and Truss, J., editors), Cambridge University Press, 1999, pp. 219279.CrossRefGoogle Scholar
[40]Rogers, H., Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967, Reprinted by MIT Press in 1987.Google Scholar
[41]Royer, J. and Case, J., Subrecursive Programming Systems: Complexity & Succinctness, Birkhäuser, 1994.CrossRefGoogle Scholar
[42]Schaefer, M., A guided tour of minimal indices andsliortest descriptions, Archive for Mathematical Logic, vol. 18 (1998), pp. 521548.CrossRefGoogle Scholar
[43]Takeuti, G., Proof Theory, 2nd ed., North Holland, 1987.Google Scholar
[44]Weiermann, A., Proving termination for term rewriting systems, Proceedings of the 5th Workshop on Computer Science Logic (Börger, E., Jäger, G., Büning, H. K., and Richter, M. M., editors), Lecture Notes in Computer Science, vol. 626, 1992, pp. 419428.CrossRefGoogle Scholar
[45]Wexler, K., On extensional learnability, Cognition, vol. 11 (1982), pp. 8995.CrossRefGoogle ScholarPubMed
[46]Wexler, K. and Culicover, P., Formal Principles of Language Acquisition, MIT Press, 1980.Google Scholar