
Predicate logic as a modeling language: modeling and solving some machine learning and data mining problems with IDP3

Published online by Cambridge University Press:  14 May 2014

MAURICE BRUYNOOGHE
HENDRIK BLOCKEEL
BART BOGAERTS
BROES DE CAT
STEF DE POOTER
JOACHIM JANSEN
ANTHONY LABARRE
JAN RAMON
MARC DENECKER
SICCO VERWER
Affiliation:
Institute for Computing and Information Sciences, Radboud Universiteit Nijmegen, Toernooiveld, Nijmegen, the Netherlands (e-mail: [email protected])

Abstract


This paper provides a gentle introduction to problem-solving with the IDP3 system. The core of IDP3 is a finite model generator that supports first-order logic enriched with types, inductive definitions, aggregates and partial functions. It offers its users a modeling language that is a slight extension of predicate logic and allows them to solve a wide range of search problems. Apart from a small introductory example, applications are selected from problems that arose within machine learning and data mining research. These research areas have recently shown a strong interest in declarative modeling and constraint-solving as opposed to algorithmic approaches. The paper illustrates that the IDP3 system can be a valuable tool for researchers with such an interest. The first problem is in the domain of stemmatology, a branch of philology concerned with the relationships between surviving variant versions of a text. The second is a somewhat related problem in biology, where phylogenetic trees are used to represent the evolution of species. The third and final problem is the classical one of learning a minimal automaton consistent with a given set of strings. For this last problem, we show that the performance of our solution comes very close to that of the state-of-the-art solution. For each of these applications, we analyze the problem, illustrate the development of a logic-based model and explore how alternatives can affect the performance.
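To make the third problem concrete, the following is a minimal sketch in Python of what "a minimal automaton consistent with a given set of strings" means. It is not IDP3 code and not the encoding developed in the paper: it replaces declarative model generation with plain brute-force enumeration, and the names `final_state`, `accepts`, `minimal_consistent_dfa` and the `max_states` bound are assumptions introduced only for illustration. The search tries complete deterministic automata of increasing size and stops at the first one in which no positive and no negative string end in the same state.

```python
from itertools import product


def final_state(delta, word):
    """Run the automaton from state 0 and return the state reached after `word`."""
    state = 0
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def accepts(delta, accepting, word):
    """Report whether `word` ends in one of the accepting states."""
    return final_state(delta, word) in accepting


def minimal_consistent_dfa(positive, negative, max_states=4):
    """Brute-force search (illustrative only) for the smallest complete DFA that
    accepts every string in `positive` and rejects every string in `negative`.

    Returns (number_of_states, transition_table, accepting_states), or None
    if no automaton with at most `max_states` states is consistent.
    """
    alphabet = sorted({symbol for word in positive + negative for symbol in word})
    for n in range(1, max_states + 1):
        keys = [(state, symbol) for state in range(n) for symbol in alphabet]
        # Enumerate every possible transition table over n states.
        for targets in product(range(n), repeat=len(keys)):
            delta = dict(zip(keys, targets))
            pos_final = {final_state(delta, word) for word in positive}
            neg_final = {final_state(delta, word) for word in negative}
            # Consistent iff no state is reached by both a positive
            # and a negative sample; the positive end states accept.
            if pos_final.isdisjoint(neg_final):
                return n, delta, pos_final
    return None


if __name__ == "__main__":
    # Hypothetical sample: strings with an even number of 'a' are positive.
    positive = ["", "b", "aa", "aba", "baab"]
    negative = ["a", "ab", "ba", "aaa"]
    n, delta, accepting = minimal_consistent_dfa(positive, negative)
    print(n, delta, accepting)
```

The enumeration grows exponentially with the number of states and symbols, which is why declarative encodings handed to a model generator or SAT solver, as discussed in the paper, are of practical interest; the sketch only pins down the problem statement.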

Type
Regular Papers
Copyright
Copyright © Cambridge University Press 2014 
