
Structure learning of probabilistic logic programs by searching the clause space

Published online by Cambridge University Press: 15 January 2014

ELENA BELLODI
Affiliation:
Dipartimento di Ingegneria – University of Ferrara, Via Saragat 1, 44122, Ferrara, Italy (e-mail: [email protected])
FABRIZIO RIGUZZI
Affiliation:
Dipartimento di Matematica e Informatica – University of Ferrara, Via Saragat 1, 44122, Ferrara, Italy (e-mail: [email protected])

Abstract

Learning probabilistic logic programming languages is receiving increasing attention, and systems are available for learning the parameters (PRISM, LeProbLog, LFI-ProbLog and EMBLEM) or both the structure and the parameters (SEM-CP-logic and SLIPCASE) of these languages. In this paper we present the algorithm SLIPCOVER for “Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space.” It performs a beam search in the space of probabilistic clauses and a greedy search in the space of theories, using the log likelihood of the data as the guiding heuristic. To estimate the log likelihood, SLIPCOVER performs Expectation Maximization with EMBLEM. The algorithm has been tested on five real-world datasets and compared with SLIPCASE, SEM-CP-logic, Aleph and two algorithms for learning Markov Logic Networks (Learning using Structural Motifs (LSM) and ALEPH++ExactL1). SLIPCOVER achieves higher areas under the precision-recall and receiver operating characteristic curves in most cases.
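
The two-phase search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several simplifying assumptions: clauses are plain sets of propositional literals rather than first-order probabilistic clauses, the beam starts from the empty clause, and the log likelihood comes from a fixed-probability stand-in (`log_likelihood`, with a hypothetical parameter `p`) instead of Expectation Maximization with EMBLEM. All names and the toy data are hypothetical.

```python
import math
from typing import FrozenSet, List, Sequence, Set, Tuple

Clause = FrozenSet[str]                 # toy clause: a set of body literals
Example = Tuple[FrozenSet[str], bool]   # (true literals, class label)

# Hypothetical toy data, invented for illustration only.
EXAMPLES: List[Example] = [
    (frozenset({"a", "b"}), True),
    (frozenset({"a", "b", "c"}), True),
    (frozenset({"c", "d"}), True),
    (frozenset({"a"}), False),
    (frozenset({"d"}), False),
    (frozenset({"b"}), False),
]


def log_likelihood(theory: Sequence[Clause], examples: Sequence[Example],
                   p: float = 0.9) -> float:
    """Stand-in score (assumption): an example is predicted positive with
    probability p if any clause covers it, and with 1 - p otherwise."""
    correct = sum(
        1 for features, label in examples
        if any(clause <= features for clause in theory) == label
    )
    wrong = len(examples) - correct
    return correct * math.log(p) + wrong * math.log(1 - p)


def beam_search_clauses(literals: Sequence[str], examples: Sequence[Example],
                        beam_size: int = 2, max_steps: int = 2) -> List[Clause]:
    """Beam search in the clause space: refine each clause in the beam by
    adding one literal, score it alone, and keep the best beam_size."""
    scores = {}

    def rank(clause: Clause):
        # Higher score first; ties broken alphabetically for determinism.
        return (-scores[clause], tuple(sorted(clause)))

    beam: List[Clause] = [frozenset()]
    for _ in range(max_steps):
        refinements = []
        for clause in beam:
            for literal in literals:
                refined = clause | {literal}
                if refined == clause or refined in scores:
                    continue
                scores[refined] = log_likelihood([refined], examples)
                refinements.append(refined)
        if not refinements:
            break
        beam = sorted(refinements, key=rank)[:beam_size]
    return sorted(scores, key=rank)


def greedy_theory_search(candidates: Sequence[Clause],
                         examples: Sequence[Example]) -> Set[Clause]:
    """Greedy search in the theory space: add candidate clauses one at a
    time and keep each only if the log likelihood strictly improves."""
    theory: List[Clause] = []
    best = log_likelihood(theory, examples)
    for clause in candidates:
        score = log_likelihood(theory + [clause], examples)
        if score > best:
            theory.append(clause)
            best = score
    return set(theory)


candidates = beam_search_clauses(["a", "b", "c", "d"], EXAMPLES)
learned = greedy_theory_search(candidates, EXAMPLES)
print(sorted(sorted(clause) for clause in learned))
```

The sketch keeps the two searches separate on purpose: the beam search only ranks individual clauses, while the greedy pass decides which of them belong together in a theory. In a real system the scoring call would be replaced by parameter learning over the candidate theory.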

Type
Regular Papers
Copyright
Copyright © Cambridge University Press 2014 

