A Generalized Model of PAC Learning and its Applicability

Thomas Brodag; Steffen Herbold; Stephan Waack

doi:10.1051/ita/2014005

Published online by Cambridge University Press: 09 April 2014

Affiliation (all authors): Institut für Informatik, Georg-August-Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen, Germany.

Abstract

We combine a new data model, where the random classification is subjected to rather weak restrictions which in turn are based on the Mammen−Tsybakov [E. Mammen and A.B. Tsybakov, Ann. Statis. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statis. 32 (2004) 135–166] small margin conditions, and the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to the noise model orthogonal to a set of query functions. We show that every polynomial time PAC + SQ learning algorithm can be efficiently simulated provided that the random noise rate is orthogonal to the query functions used by the algorithm given the target concept. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, given that the partition is known to the learner. We show how to practically use PAC + SQ simulators in the noise model orthogonal to the query space by presenting two examples from bioinformatics and software engineering. In this way, we demonstrate that our new noise model is realistic.
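
As a point of reference for the simulation results summarized above, the following is a minimal sketch of the classical special case only: answering a statistical query from examples whose binary labels are flipped independently with a known constant rate η < 1/2. It is not the construction of the paper. The function names, the target concept, and the query are hypothetical and chosen for illustration; the correction used is the standard identity for constant-rate classification noise.

```python
import random

def simulate_sq(chi, noisy_sample, eta):
    """Estimate E[chi(x, f(x))] from label-noisy examples (known flip rate eta < 1/2).

    Under constant-rate noise,
        E_noisy[chi(x, y)] = (1 - 2*eta) * E[chi(x, f(x))] + eta * E[chi(x, 0) + chi(x, 1)],
    and the last expectation does not depend on the labels at all,
    so both terms on the right can be estimated from the noisy sample.
    """
    n = len(noisy_sample)
    noisy_mean = sum(chi(x, y) for x, y in noisy_sample) / n
    label_free = sum(chi(x, 0) + chi(x, 1) for x, _ in noisy_sample) / n
    return (noisy_mean - eta * label_free) / (1.0 - 2.0 * eta)

def draw_noisy_sample(target, n, eta, rng):
    """Draw x uniformly from [0, 1) and flip the true label with probability eta."""
    sample = []
    for _ in range(n):
        x = rng.random()
        y = target(x)
        if rng.random() < eta:
            y = 1 - y
        sample.append((x, y))
    return sample

# Hypothetical target concept and query, for illustration only.
target = lambda x: 1 if x > 0.3 else 0
chi = lambda x, y: 1 if (x > 0.5 and y == 1) else 0  # noise-free expectation is 0.5

rng = random.Random(0)
sample = draw_noisy_sample(target, 200_000, 0.2, rng)
estimate = simulate_sq(chi, sample, 0.2)
print(estimate)
```

The uncorrected noisy average would concentrate near 0.4 here, whereas the corrected estimate concentrates near the noise-free value 0.5. The sketch assumes the noise rate is known and constant; the models in the abstract relax exactly that assumption.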

Keywords

PAC learning with classification noise; Mammen−Tsybakov small margin conditions; statistical queries; noise model orthogonal to a set of query functions; bioinformatics; software engineering

Type: Research Article
Information: RAIRO - Theoretical Informatics and Applications, Volume 48, Issue 2, April 2014, pp. 209–245

DOI: https://doi.org/10.1051/ita/2014005
Copyright: © EDP Sciences 2014

References

D.W. Aha and D. Kibler, Instance-based learning algorithms. Machine Learn. (1991) 37–66.

D. Angluin and P. Laird, Learning from noisy examples. Machine Learn. 2 (1988) 343–370.

http://httpd.apache.org/ (2011).

J.A. Aslam, Noise Tolerant Algorithms for Learning and Searching, Ph.D. thesis. MIT (1995).

J.A. Aslam and S.E. Decatur, Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance. J. Comput. Syst. Sci. 56 (1998) 191–208.

P.L. Bartlett, S. Boucheron and G. Lugosi, Model selection and error estimation. Machine Learn. 48 (2002) 85–113.

P.L. Bartlett, M.I. Jordan and J.D. McAuliffe, Convexity, classification, and risk bounds. J. Amer. Stat. Assoc. 101 (2006) 138–156.

P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: Risk bounds and structural results, in 14th COLT and 5th EuroCOLT (2001) 224–240.

P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: Risk bounds and structural results. J. Mach. Learn. Res. (2002) 463–482.

A. Blumer, A. Ehrenfeucht, D. Haussler and M.K. Warmuth, Learnability and the Vapnik−Chervonenkis dimension. J. ACM 36 (1989) 929–969.

O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn. (2003) 169–207.

O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn., vol. 3176 of Lect. Notes in Artificial Intelligence. Springer, Heidelberg (2004) 169–207.

Th. Brodag, PAC-Lernen zur Insolvenzerkennung und Hotspot-Identifikation, Ph.D. thesis, Ph.D. Programme in Computer Science of the Georg-August University School of Science GAUSS (2008).

N. Cesa-Bianchi, S. Shalev-Shwartz and O. Shamir, Online learning of noisy data. IEEE Trans. Inform. Theory 57 (2011) 7907–7931.

S.E. Decatur, Learning in hybrid noise environments using statistical queries, in Fifth International Workshop on Artificial Intelligence and Statistics. Lect. Notes Statis. Springer (1993).

S.E. Decatur, Statistical Queries and Faulty PAC Oracles. COLT (1993) 262–268.

S.E. Decatur, Efficient Learning from Faulty Data, Ph.D. thesis. Harvard University (1995).

S.E. Decatur, PAC learning with constant-partition classification noise and applications to decision tree induction, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91.

S.E. Decatur and R. Gennaro, On learning from noisy and incomplete examples, in COLT (1995) 353–360.

L. Devroye, L. Györfi and G. Lugosi, A Probabilistic Theory of Pattern Recognition. Springer, New York (1997).

http://www.eclipse.org/jdt/ (2011).

http://www.eclipe.org/platform/ (2011).

N. Fenton and S.L. Pfleeger, Software metrics: a rigorous and practical approach. PWS Publishing Co. Boston, MA, USA (1997).

S.A. Goldman and R.H. Sloan, Can PAC learning algorithms tolerate random attribute noise? Algorithmica 14 (1995) 70–84.

I. Halperin, H. Wolfson and R. Nussinov, Protein-protein interactions: coupling of structurally conserved residues and of hot spots across interfaces. Implications for docking. Structure 12 (2004) 1027–1036.

D. Haussler, Quantifying inductive bias: AI learning algorithms and Valiant’s learning framework. Artificial Intelligence 36 (1988) 177–221.

D. Haussler, M.J. Kearns, N. Littlestone and M.K. Warmuth, Equivalence of models for polynomial learnability. Inform. Comput. 95 (1991) 129–161.

S. Herbold, J. Grabowski and S. Waack, Calculation and optimization of thresholds for sets of software metrics. Empirical Software Engrg. (2011) 1–30. doi:10.1007/s10664-011-9162-z.

International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, Switzerland. Software engineering – Product quality, Parts 1-4 (2001-2004).

G. John and P. Langley, Estimating continuous distributions in Bayesian classifiers, in Proc. of the Eleventh Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann (1995) 338–345.

M.J. Kearns, Efficient noise-tolerant learning from statistical queries. J. ACM 45 (1998) 983–1006.

M.J. Kearns and M. Li, Learning in the presence of malicious errors. SIAM J. Comput. 22 (1993) 807–837.

M.J. Kearns and R.E. Schapire, Efficient Distribution-Free Learning of Probabilistic Concepts. J. Comput. Syst. Sci. 48 (1994) 464–497.

V. Koltchinskii, Rademacher penalties and structural risk minimization. IEEE Trans. Inform. Theory 47 (2001) 1902–1914.

E. Mammen and A.B. Tsybakov, Smooth discrimination analysis. Ann. Statis. 27 (1999) 1808–1829.

P. Massart, Some applications of concentration inequalities to statistics. Annales de la Faculté des Sciences de Toulouse, volume spécial dédié à Michel Talagrand (2000) 245–303.

S. Mendelson, Rademacher averages and phase transitions in Glivenko-Cantelli classes. IEEE Trans. Inform. Theory 48 (2002) 1977–1991.

I.S. Moreira, P.A. Fernandes and M.J. Ramos, Hot spots – A review of the protein-protein interface determinant amino-acid residues. Proteins: Structure, Function, and Bioinformatics 68 (2007) 803–812.

D.F. Nettleton, A. Orriols-Puig and A. Fornells, A study of the effect of different types of noise on the precision of supervised learning techniques. Artif. Intell. Rev. 33 (2010) 275–306.

Y. Ofran and B. Rost, ISIS: interaction sites identified from sequence. Bioinform. 23 (2007) 13–16.

Y. Ofran and B. Rost, Protein-protein interaction hotspots carved into sequences. PLoS Comput. Biol. 3 (2007).

J.C. Platt, Fast training of support vector machines using sequential minimal optimization, in Advances in kernel methods. Edited by B. Schölkopf, Ch.J.C. Burges and A.J. Smola. MIT Press, Cambridge, MA, USA (1999) 185–208.

J. Ross Quinlan, C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1993).

L. Ralaivola, F. Denis and Ch.N. Magnan, CN = CPCN, in ICML ’06: Proc. of the 23rd Int. Conf. on Machine Learn. ACM New York, NY, USA (2006) 721–728.

B. Schölkopf and A.J. Smola, Learning with Kernels. MIT Press (2002).

K.S. Thorn and A.A. Bogan, ASEdb: a database of alanine mutations and their effects on the free energy of binding in protein interactions. Bioinformatics 17 (2001) 284–285.

A.B. Tsybakov, Optimal aggregation of classifiers in statistical learning. Ann. Statis. 32 (2004) 135–166.

L. Valiant, A theory of the learnable. Communic. ACM 27 (1984) 1134–1142.

L. Valiant, Learning disjunctions of conjunctions, in Proc. of 9th Int. Joint Conf. Artificial Int. (1985) 560–566.
