We use cookies to distinguish you from other users and to provide you with a better experience on our websites. Close this message to accept cookies or find out how to manage your cookie settings.
To save content items to your account,
please confirm that you agree to abide by our usage policies.
If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account.
Find out more about saving content to .
To save content items to your Kindle, first ensure [email protected]
is added to your Approved Personal Document E-mail List under your Personal Document Settings
on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part
of your Kindle email address below.
Find out more about saving to your Kindle.
Note you can select to save to either the @free.kindle.com or @kindle.com variations.
‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi.
‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.
The inference of a random variable from observations requires that we evaluate the posterior distribution as happens, for example, in inference formulations based on mean-square-error (MSE), maximum a-posteriori (MAP), or probability of error metrics. In previous chapters, we described several techniques to facilitate the computation or approximation of such posterior distributions using Monte Carlo or variational inference methods. We will encounter other types of approximations in later chapters. For example, in the context of naïve Bayes classifiers in Chapter 55, we will assume that, conditioned on the latent variable , the observations are independent of each other in order to write
We continue our treatment of Markov decision processes (MDPs) and focus in this chapter on methods for determining optimal actions or policies. We derive two popular methods known as value iteration and policy iteration, and establish their convergence properties. We also examine the Bellman optimality principle in the context of value and policy learning. In a later section, we extend the discussion to the more challenging case of partially observable MDPs (POMDPs), where the successive states of the MDP are unobservable to the agent, and the agent is only able to sense measurements emitted randomly by the MDP from the various states. We will define POMDPs and explain that they can be reformulated as belief‐MDPs with continuous (rather than discrete) states. This fact complicates the solution of the value iteration. Nevertheless, we will show that the successive value iterates share a useful property, namely, that they are piecewise linear and convex. This property can be exploited by computational methods to reduce the complexity of solving the value iteration for POMDPs.
In this chapter, we describe a popular discriminative approach for classification problems, known as logistic regression. Assuming binary classification with labels and features , we explained earlier in expression (28.85) that the optimal Bayes classifier for predicting is given by
We mentioned earlier in Section 52.3 that the nearest-neighbor (NN) rule for classification and clustering treats equally all attributes within each feature vector, . If, for example, some attributes are more relevant to the classification task than other attributes, then this aspect is ignored by the NN classifier because all entries of the feature vector will contribute similarly to the calculation of Euclidean distances and the determination of neighborhoods.
Convolutional neural networks (CNNs) are prevalent in computer vision, image, speech, and language processing applications, where they have been successfully applied to perform classification tasks at high accuracy rates. One of their main attractions is the ability to operate directly on raw input signals, such as images, and to extract salient features automatically from the raw data. The designer does not need to worry about which features to select to drive the classification process.
When the training data is linearly separable, there will exist many separating hyperplanes that can discriminate the data into two classes. Some of the techniques we described in the previous chapters, such as logistic regression and perceptron, are able to find such separating hyperplanes.
We develop a sequential version of the importance sampling technique from Chapter 33 in order to respond to streaming data, thus leading to a sequential Monte Carlo solution. The algorithm will lead to the important class of particle filters. This chapter presents the basic data model and the main construction that enables recursive inference. Many of the inference and learning methods in subsequent chapters will possess a recursive structure, which is a fundamental property to enable them to continually learn in response to the arrival of sequential data measurements. Particle filters are particularly well suited for scenarios involving nonlinear models and non-Gaussian signals, and they have found applications in a wide range of areas where these two features (nonlinearity and non-Gaussianity) are prevalent, including in guidance and control, robot localization, visual tracking of objects, and finance.
The optimal Bayes classifier (52.8) requires knowledge of the conditional probability distribution , which is generally unavailable. In this and the next few chapters, we describe data‐based generative methods that approximate the joint probability distribution , or its components and , directly from the data.