Search results for Pattern Recognition and Machine Learning

12 - Fundamental Limits in Model Selection for Modern Data Analysis
- By Jie Ding, Yuhong Yang, Vahid Tarokh
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp 359-382
- Chapter
Summary

With rapid developments in hardware storage, precision instrument manufacturing, economic globalization, and other areas, data in various forms have become ubiquitous in human life. This enormous amount of data can be a double-edged sword. While it provides the possibility of modeling the world with a higher fidelity and greater flexibility, improper modeling choices can lead to false discoveries, misleading conclusions, and poor predictions. Typical data-mining, machine-learning, and statistical-inference procedures learn from and make predictions on data by fitting parametric or non-parametric models. However, there exists no model that is universally suitable for all datasets and goals. Therefore, a crucial step in data analysis is to consider a set of postulated candidate models and learning methods (the model class) and select the most appropriate one. We provide integrated discussions on the fundamental limits of inference and prediction based on model-selection principles from modern data analysis. In particular, we introduce two recent advances of model-selection approaches, one concerning a new information criterion and the other concerning modeling procedure selection.

1 - Introduction to Information Theory and Data Science
- By Miguel R. D. Rodrigues, Stark C. Draper, Waheed U. Bajwa, Yonina C. Eldar
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp 1-43
- Chapter
Summary

The purpose of this chapter is to set the stage for the book and for the upcoming chapters. We first overview classical information-theoretic problems and solutions. We then discuss emerging applications of information-theoretic methods in various data-science problems and, where applicable, refer the reader to related chapters in the book. Throughout this chapter, we highlight the perspectives, tools, and methods that play important roles in classic information-theoretic paradigms and in emerging areas of data science. Table 1.1 provides a summary of the different topics covered in this chapter and highlights the different chapters that can be read as a follow-up to these topics.

Notation
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp xvii-xviii
- Chapter

4 - Information-Theoretic Bounds on Sketching
- By Mert Pilanci
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp 104-133
- Chapter
Summary

Approximate computation methods with provable performance guarantees are becoming important and relevant tools in practice. In this chapter we focus on sketching methods designed to reduce data dimensionality in computationally intensive tasks. Sketching can often provide better space, time, and communication complexity trade-offs by sacrificing minimal accuracy. This chapter discusses the role of information theory in sketching methods for solving large-scale statistical estimation and optimization problems. We investigate fundamental lower bounds on the performance of sketching. By exploring these lower bounds, we obtain interesting trade-offs in computation and accuracy. We employ Fano’s inequality and metric entropy to understand fundamental lower bounds on the accuracy of sketching, which is parallel to the information-theoretic techniques used in statistical minimax theory.

5 - Sample Complexity Bounds for Dictionary Learning from Vector- and Tensor-Valued Data
- By Zahra Shakeri, Anand D. Sarwate, Waheed U. Bajwa
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp 134-162
- Chapter
Summary

Dictionary learning has emerged as a powerful method for data-driven extraction of features from data. The initial focus was from an algorithmic perspective, but recently there has been increasing interest in the theoretical underpinnings. These rely on information-theoretic analytic tools and help us understand the fundamental limitations of dictionary-learning algorithms. We focus on theoretical aspects and summarize results on dictionary learning from vector- and tensor-valued data. Results are stated in terms of lower and upper bounds on sample complexity of dictionary learning, defined as the number of samples needed to identify or reconstruct the true dictionary underlying data from noiseless or noisy samples, respectively. Many analytic tools that help yield these results come from information theory, including restating the dictionary-learning problem as a channel-coding problem and connecting analysis of minimax risk in statistical estimation to Fano’s inequality. In addition to highlighting effects of parameters on the sample complexity of dictionary learning, we show the potential advantages of dictionary learning from tensor data and present unaddressed problems.

Contributors
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp xix-xxii
- Chapter

8 - Computing Choice: Learning Distributions over Permutations
- By Devavrat Shah
Edited by Miguel R. D. Rodrigues, University College London, Yonina C. Eldar, Weizmann Institute of Science, Israel
Book: Information-Theoretic Methods in Data Science
Published online: 22 March 2021
Print publication: 08 April 2021, pp 229-262
- Chapter
Summary

We discuss the question of learning distributions over permutations of a given set of choices, options, or items based on partial observations. This is central to capturing the so-called “choice” in a variety of contexts. The question of learning distributions over permutations also arises beyond capturing “choice”, e.g., in tracking a collection of objects using noisy cameras, or aggregating rankings of web pages using the outcomes of multiple search engines. Here we focus on learning distributions over permutations from marginal distributions of two types: first-order marginals and pair-wise comparisons. We emphasize the ability to identify the entire distribution over permutations as well as the “best ranking”.

Index
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 687-690
- Chapter

References
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 667-682
- Chapter

Contents
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp vii-x
- Chapter

5 - Linear Nonparametric Estimators
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 389-466
- Chapter

Preface
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp xi-xiv
- Chapter

7 - Likelihood-Based Procedures
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 541-606
- Chapter

Frontmatter
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp i-iv
- Chapter

8 - Adaptive Inference
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 607-666
- Chapter

Dedication
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp v-vi
- Chapter

2 - Gaussian Processes
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 15-108
- Chapter

Author Index
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 683-686
- Chapter

1 - Nonparametric Statistical Models
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 1-14
- Chapter

6 - The Minimax Paradigm
Evarist Giné, Richard Nickl, University of Cambridge
Book: Mathematical Foundations of Infinite-Dimensional Statistical Models
Published online: 05 March 2021
Print publication: 25 March 2021, pp 467-540
- Chapter

2190 results in Pattern Recognition and Machine Learning
