Search results for Pattern Recognition and Machine Learning

51 - Regularization
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 2221-2259
- Chapter
Summary

We discussed the least-squares problem in the previous chapter, which uses a collection of data points $\{x(n), y_{n}\}$ to determine an optimal parameter $w^{\star}$ by minimizing an empirical quadratic risk of the form:

Author Index
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 3149-3172
- Chapter

70 - Explainable Learning
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 3042-3064
- Chapter
Summary

Most learning algorithms, including deep neural networks with many layers and parameters, act as black-box procedures where feature vectors at the input layer are transformed into label predictions at the output layer through a succession of nonlinear transformations. Given how prevalent learning-based systems are becoming in modern practice, including their use in fields such as medical diagnosis, autonomous systems, and even legal proceedings, it is necessary to have confidence in their predictions in order to ensure reliable, fair, and nondiscriminatory conclusions. For this reason, one needs to understand how classification results are attained, and what attributes in the input data have influenced the decisions most heavily. Questions of this type are addressed under the topic of explainability in machine learning.

19 - Convergence Analysis I: Stochastic Gradient Algorithms
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 683-729
- Chapter

26 - Decentralized Optimization II: Primal-Dual Methods
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 969-1008
- Chapter

42 - Inference over Graphs
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 March 2023
Print publication: 22 December 2022, pp 1682-1739
- Chapter
Summary

In the previous chapter we clarified the representation power of Bayesian networks. In this chapter, we examine the solution to inference problems over these networks. In particular, given some observations, we would like to determine (a) the states of some nodes or (b) the most probable hypothesis or explanation corresponding to the observations. We present three inference methods known as (1) inference by enumeration, which is a brute force exact method, (2) inference by variable elimination, which is a more efficient procedure, and (3) belief propagation. The last method is described in the next chapter, where it is shown to be a special case of the sum-product message-passing algorithm. In general, solving inference problems over Bayesian networks is NP-complete (i.e., they cannot be solved in polynomial complexity in the number of nodes); for the benefit of the reader, we describe the various notions of NP-complexity in the concluding remarks of the chapter. We also describe in the chapter approaches to learn the underlying graph structure from observations including the Chow-Liu algorithm and graphical LASSO.

60 - Perceptron
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 2499-2529
- Chapter
Summary

In this and the next chapter we discuss two binary classification schemes known as perceptron and support vector machines. In contrast to logistic regression, these methods approximate neither the conditional pdf, $f_{\gamma | h}(\gamma | h)$, nor the joint pdf, $f_{\gamma, h}(\gamma, h)$.

27 - Mean-Square-Error Inference
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 March 2023
Print publication: 22 December 2022, pp 1053-1091
- Chapter
Summary

Inference deals with the estimation of hidden parameters or random variables from observations of other related variables. In this chapter, we study the basic, yet fundamental, problem of inferring an unknown random quantity from observations of another random quantity by using the mean-square-error (MSE) criterion. Several other design criteria can be used for inference purposes besides MSE, such as the mean-absolute error (MAE) and the maximum a-posteriori (MAP) criteria. We will encounter these possibilities in future chapters, starting with the next chapter. We initiate our discussions of inference problems though by focusing on the MSE criterion due to its mathematical tractability and because it sheds light on several important questions that arise in the study of inference problems in general.

9 - Convex Optimization
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 302-329
- Chapter

6 - Entropy and Divergence
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 196-239
- Chapter

15 - Proximal and Mirror-Descent Methods
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 507-546
- Chapter

Author Index
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 1009-1032
- Chapter

10 - Lipschitz Conditions
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 330-340
- Chapter

62 - Bagging and Boosting
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 2557-2586
- Chapter
Summary

In this chapter we describe two ensemble learning techniques, known as bagging and boosting, which aggregate the decisions of a mixture of learners to enable enhanced classification performance. In particular, they help transform a collection of “weak” learners into a more robust learning machine.

40 - Independent Component Analysis
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 March 2023
Print publication: 22 December 2022, pp 1609-1642
- Chapter
Summary

The expectation-maximization (EM) and Baum–Welch algorithms are particularly useful for the processing of data arising from mixture models. Both techniques enable us to identify the parameters of the underlying components, for both cases when the observations are independent of each other or follow a first-order Markovian process. In this chapter, we consider another important example of a mixture model consisting of a collection of independent sources, a mixture matrix, and the observations. The objective is to undo the mixing and recover the original sources. The resulting technique is known as independent component analysis (ICA).

3 - Random Variables
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 February 2023
Print publication: 22 December 2022, pp 68-131
- Chapter

34 - Expectation Propagation
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 March 2023
Print publication: 22 December 2022, pp 1352-1379
- Chapter
Summary

The Laplace method approximates the posterior distribution $f_{z | y} (z | y)$ through a Gaussian probability density function (pdf) that is not always accurate. The Markov chain Monte Carlo (MCMC) method, on the other hand, relies on sampling from auxiliary (proposal) distributions and provides a powerful way to approximate posterior distributions albeit through repeated simulations. In this chapter, we describe a third approach for approximating the posterior distribution, known as expectation propagation (EP). This method restricts the class of distributions from which the posterior is approximated to the Gaussian or exponential family and assumes a factored form for the posterior. The method can become analytically demanding, depending on the nature of the factors used for the posterior, because these factors can make the computation of certain moments unavailable in closed form. The EP method has been observed to lead to good performance in some applications such as the Bayesian logit classification problem, but this behavior is not universal and performance can degrade for other problems, especially when the posterior distribution admits a mixture model.

53 - Self-Organizing Maps
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 2290-2312
- Chapter
Summary

The $k$-nearest neighbor ($k$-NN) rule is appealing. However, each new feature $h \in \mathbb{R}^{M}$ requires searching over the entire training set of size $N$ to determine the neighborhood around $h$.

56 - Linear Discriminant Analysis
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 24 February 2023
Print publication: 22 December 2022, pp 2357-2382
- Chapter
Summary

In this chapter, we describe three other data-based generative methods that approximate the solution to the optimal Bayes classifier (52.8) in the absence of knowledge of the conditional probabilities $\mathbb{P}(\boldsymbol{r} = r | \boldsymbol{h} = h)$. The methods estimate the prior probabilities $\mathbb{P}(\boldsymbol{r} = r)$ for the classes and, in some cases, assume a Gaussian form for the reverse conditional distribution, $f_{h | r}(h | r)$. The training data is used to estimate the priors and the first- and second-order moments of $f_{h | r}(h | r)$.

43 - Undirected Graphs
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book: Inference and Learning from Data
Published online: 17 March 2023
Print publication: 22 December 2022, pp 1740-1806
- Chapter
Summary

The discussion in the last two chapters focused on directed graphical models or Bayesian networks, where a directed link from a variable $x_{1}$ toward another variable $x_{2}$ carries with it an implicit connotation of “causal effect” by $x_{1}$ on $x_{2}$. In many instances, this implication need not be appropriate or can even be limiting. For example, there are cases where conditional independence relations cannot be represented by a directed graph. One such example is provided in Prob. 43.1. In this chapter, we examine another form of graphical representation where the links are not required to be directed anymore, and the probability distributions are replaced by potential functions. These are strictly positive functions defined over sets of connected nodes; they broaden the level of representation by graphical models. The potential functions carry with them a connotation of “similarity” or “affinity” among the variables, but can also be rolled back to represent probability distributions. Over undirected graphs, edges linking nodes will continue to reflect pairwise relationships between the variables but will lead to a fundamental factorization result in terms of the product of clique potential functions. We will show that these functions play a prominent role in the development of message-passing algorithms for the solution of inference problems.

2190 results in Pattern Recognition and Machine Learning
