Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Introduction
- 2 The data and some applications
- Part I The Tools
- 3 Basic stringology
- 4 Representing languages
- 5 Representing distributions over strings with automata and grammars
- 6 About combinatorics
- Part II What Does Learning a Language Mean?
- Part III Learning Algorithms and Techniques
- References
- Index
3 - Basic stringology
from Part I - The Tools
Published online by Cambridge University Press: 05 July 2014
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Introduction
- 2 The data and some applications
- Part I The Tools
- 3 Basic stringology
- 4 Representing languages
- 5 Representing distributions over strings with automata and grammars
- 6 About combinatorics
- Part II What Does Learning a Language Mean?
- Part III Learning Algorithms and Techniques
- References
- Index
Summary
The biscuit tree. This remarkable vegetable production has never yet been described or delineated.
Edward Lear, Flora NonsensicaMan muss immer generalisieren.
Carl JacobiFormal language theory has been developed and studied consistently over the past 50 years. Because of their importance in so many fields, strings and tools to manipulate them have been studied with special care, leading to the specific topic of stringology. Usual definitions and results can therefore be found in several text books.
We re-visit these objects here from a pragmatic point of view: grammatical inference is about learning grammars and automata which are then supposed to be used by programs to deal with strings. Their advantage is that we can parse with them, compare them, compute distances… Therefore, we are primarily interested in studying how the strings are organised: knowing that a string is in a language (or perhaps more importantly out of the language) is not enough. We will also want to know how it belongs or why it doesn't belong. Other questions might be about finding close strings or building a kernel taking into account their properties. The goal is therefore going to be to organise the strings, to put some topology over them.
Notations
We start by introducing here some general notations used throughout the book.
In a definition ifdef is a definition ‘if’.
The main mathematical objects used in this book are letters and strings.
- Type
- Chapter
- Information
- Grammatical InferenceLearning Automata and Grammars, pp. 45 - 69Publisher: Cambridge University PressPrint publication year: 2010