Book contents
- Frontmatter
- Contents
- Preface
- Introduction
- Chapter 1 Collections
- Chapter 2 Arrays and ArrayLists
- Chapter 3 Basic Sorting Algorithms
- Chapter 4 Basic Searching Algorithms
- Chapter 5 Stacks and Queues
- Chapter 6 The BitArray Class
- Chapter 7 Strings, the String Class, and the StringBuilder Class
- Chapter 8 Pattern Matching and Text Processing
- Chapter 9 Building Dictionaries: The DictionaryBase Class and the SortedList Class
- Chapter 10 Hashing and the HashTable Class
- Chapter 11 Linked Lists
- Chapter 12 Binary Trees and Binary Search Trees
- Chapter 13 Sets
- Chapter 14 Advanced Sorting Algorithms
- Chapter 15 Advanced Data Structures and Algorithms for Searching
- Chapter 16 Graphs and Graph Algorithms
- Chapter 17 Advanced Algorithms
- References
- Index
Chapter 8 - Pattern Matching and Text Processing
Published online by Cambridge University Press: 11 August 2009
Summary
Whereas the String and StringBuilder classes provide a set of methods for processing string-based data, the Regex class and its supporting classes provide much more power for string-processing tasks. String processing mostly involves looking for patterns in strings (pattern matching), and those patterns are described in a special language called regular expressions. In this chapter we look at how to form regular expressions and how to use them to solve common text-processing tasks.
REGULAR EXPRESSIONS
A regular expression is a language that describes patterns of characters in strings, along with descriptors for repeating characters, alternatives, and groupings of characters. Regular expressions can be used both to search strings and to perform substitutions in them.
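As a minimal sketch of these two uses, the following program searches a string and then performs a substitution with the shared methods Regex.IsMatch and Regex.Replace from System.Text.RegularExpressions. The module name and the sample text are assumptions chosen only for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SearchAndReplaceSketch
    Sub Main()
        ' Hypothetical sample text, chosen only for illustration.
        Dim sample As String = "the quick brown fox"

        ' Search: does the pattern occur anywhere in the string?
        Dim found As Boolean = Regex.IsMatch(sample, "quick")
        Console.WriteLine(found)      ' prints True

        ' Substitute: replace every occurrence of the pattern.
        Dim changed As String = Regex.Replace(sample, "quick", "slow")
        Console.WriteLine(changed)    ' prints the slow brown fox
    End Sub
End Module
```

Regex.Replace returns a new string and leaves the original unchanged, so the result must be assigned to a variable to be used.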
A regular expression itself consists of just a string of characters that defines a pattern you want to search for in another string. Generally, the characters in a regular expression match themselves, so that the regular expression “the” matches that sequence of characters wherever it is found in a string.
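The literal pattern “the” can be tried out with a short sketch such as the one below. It uses Regex.Matches to list every place the three characters occur, including inside longer words. The module name and the sample string are hypothetical.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LiteralMatchSketch
    Sub Main()
        ' Hypothetical sample string: "the" appears as a word and inside two other words.
        Dim sample As String = "the other theme"

        ' Each character of the pattern matches itself, so every
        ' occurrence of t-h-e is reported, wherever it appears.
        For Each m As Match In Regex.Matches(sample, "the")
            Console.WriteLine("Found ""{0}"" at position {1}", m.Value, m.Index)
        Next
        ' Matches are reported at positions 0, 5, and 10.
    End Sub
End Module
```

Because a literal pattern has no notion of word boundaries, “other” and “theme” both produce matches.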
A regular expression can also include special characters called metacharacters. Metacharacters are used to signify repetition, alternation, or grouping. We will examine how these metacharacters are used later in the chapter.
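As a brief preview, and only as a sketch, the program below shows one metacharacter for each of the three roles using standard .NET regular expression syntax: + for repetition, | for alternation, and parentheses for grouping. The module name and sample strings are assumptions made for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module MetacharacterSketch
    Sub Main()
        ' Repetition: + means "one or more of the preceding item".
        Console.WriteLine(Regex.Match("baaad", "a+").Value)        ' prints aaa

        ' Alternation: | means "match either the left or the right side".
        Console.WriteLine(Regex.IsMatch("dog", "cat|dog"))         ' prints True

        ' Grouping: parentheses make + apply to "ha" as a unit.
        Console.WriteLine(Regex.Match("hahaha!", "(ha)+").Value)   ' prints hahaha
    End Sub
End Module
```

Without the parentheses, the pattern "ha+" would repeat only the letter a, so grouping changes what the repetition applies to.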
Most experienced computer users have used regular expressions in their work, even if they weren't aware they were doing so at the time. Whenever you type the following command at a command prompt:
C:\>dir myfile.exe
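you are using the regular expression “myfile.exe”, in which each character stands for itself. (Strictly speaking, the command prompt treats this argument as a wildcard file name pattern rather than a full regular expression, in which the period would match any character, but the idea of describing text with a pattern is the same.)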
- Type: Chapter
- Information: Data Structures and Algorithms Using Visual Basic.NET, pp. 181-199
- Publisher: Cambridge University Press
- Print publication year: 2005