Chapters 2–11 have described the fundamental components of a good compiler: a front end, which does lexical analysis, parsing, construction of abstract syntax, type-checking, and translation to intermediate code; and a back end, which does instruction selection, dataflow analysis, and register allocation.
What lessons have we learned? I hope that the reader has learned about the algorithms used in different components of a compiler and the interfaces used to connect the components. But the author has also learned quite a bit from the exercise.
My goal was to describe a good compiler that is, to use Einstein's phrase, “as simple as possible – but no simpler.” I will now discuss the thorny issues that arose in designing Tiger and its compiler.
Nested functions. Tiger has nested functions, requiring some mechanism (such as static links) for implementing access to nonlocal variables. But many programming languages in widespread use – C, C++, Java – do not have nested functions or static links. The Tiger compiler would become simpler without nested functions, for then variables would not escape, and the FindEscape phase would be unnecessary. But there are two reasons for explaining how to compile nonlocal variables. First, there are programming languages where nested functions are extremely useful – these are the functional languages described in Chapter 15.
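For illustration, here is a minimal sketch – written in ML, the implementation language used in this book, rather than in Tiger itself – of a nested function whose body mentions a variable of the enclosing function. It is exactly this kind of nonlocal access that makes a variable escape and forces it to be reachable through a static link or closure.

(* a minimal sketch: g is nested inside f and refers to f's parameter x,
   so x is a nonlocal (escaping) variable from g's point of view *)
fun f x =
  let
    fun g y = x + y        (* access to x must go through f's frame *)
  in
    g 1 + g 2
  end

val result = f 10          (* 23; both calls to g see the same x *)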
Over the past decade, there have been several shifts in the way compilers are built. New kinds of programming languages are being used: object-oriented languages with dynamic methods, functional languages with nested scope and first-class function closures; and many of these languages require garbage collection. New machines have large register sets and a high penalty for memory access, and can often run much faster with compiler assistance in scheduling instructions and managing instructions and data for cache locality.
This book is intended as a textbook for a one- or two-semester course in compilers. Students will see the theory behind different components of a compiler, the programming techniques used to put the theory into practice, and the interfaces used to modularize the compiler. To make the interfaces and programming examples clear and concrete, I have written them in the ML programming language. Other editions of this book are available that use the C and Java languages.
Implementation project. The “student project compiler” that I have outlined is reasonably simple, but is organized to demonstrate some important techniques that are now in common use: abstract syntax trees to avoid tangling syntax and semantics, separation of instruction selection from register allocation, copy propagation to give flexibility to earlier phases of the compiler, and containment of target-machine dependencies. Unlike many “student compilers” found in textbooks, this one has a simple but sophisticated back end, allowing good register allocation to be done after instruction selection.
A compiler was originally a program that “compiled” subroutines [a link-loader]. When in 1954 the combination “algebraic compiler” came into use, or rather into misuse, the meaning of the term had already shifted into the present one.
Bauer and Eickel [1975]
This book describes techniques, data structures, and algorithms for translating programming languages into executable code. A modern compiler is often organized into many phases, each operating on a different abstract “language.” The chapters of this book follow the organization of a compiler, each covering a successive phase.
To illustrate the issues in compiling real programming languages, I show how to compile Tiger, a simple but nontrivial language of the Algol family, with nested scope and heap-allocated records. Programming exercises in each chapter call for the implementation of the corresponding phase; a student who implements all the phases described in Part I of the book will have a working compiler. Tiger is easily modified to be functional or object-oriented (or both), and exercises in Part II show how to do this. Other chapters in Part II cover advanced techniques in program optimization. Appendix A describes the Tiger language.
The interfaces between modules of the compiler are almost as important as the algorithms inside the modules. To describe the interfaces concretely, it is useful to write them down in a real programming language. This book uses the ML programming language.
lex-i-cal: of or relating to words or the vocabulary of a language as distinguished from its grammar and construction
Webster's Dictionary
To translate a program from one language into another, a compiler must first pull it apart and understand its structure and meaning, then put it together in a different way. The front end of the compiler performs analysis; the back end does synthesis.
The analysis is usually broken up into
Lexical analysis: breaking the input into individual words or “tokens”;
Syntax analysis: parsing the phrase structure of the program; and
Semantic analysis: calculating the program's meaning.
The lexical analyzer takes a stream of characters and produces a stream of names, keywords, and punctuation marks; it discards white space and comments between the tokens. It would unduly complicate the parser to have to account for possible white space and comments at every possible point; this is the main reason for separating lexical analysis from parsing.
Lexical analysis is not very complicated, but we will attack it with high-powered formalisms and tools, because similar formalisms will be useful in the study of parsing and similar tools have many applications in areas other than compilation.
LEXICAL TOKENS
A lexical token is a sequence of characters that can be treated as a unit in the grammar of a programming language. A programming language classifies lexical tokens into a finite set of token types.
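For example, the fragment below sketches in ML a token type for a small Tiger-like language; the constructor names are illustrative, not the ones used by any particular lexer module.

(* a minimal sketch of a finite set of token types *)
datatype token =
    ID of string           (* identifiers such as x or count *)
  | NUM of int             (* integer literals *)
  | IF | THEN | ELSE       (* reserved words *)
  | PLUS | GT | ASSIGN
  | LPAREN | RPAREN | SEMI
  | EOF

(* the character stream   if x > 3 then y := y + 1
   would become the token stream
   IF, ID "x", GT, NUM 3, THEN, ID "y", ASSIGN, ID "y", PLUS, NUM 1 *)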
Heap-allocated records that are not reachable by any chain of pointers from program variables are garbage. The memory occupied by garbage should be reclaimed for use in allocating new records. This process is called garbage collection, and is performed not by the compiler but by the runtime system (the support programs linked with the compiled code).
Ideally, we would say that any record that is not dynamically live (will not be used in the future of the computation) is garbage. But, as Section 10.1 explains, it is not always possible to know whether a variable is live. So we will use a conservative approximation: we will require the compiler to guarantee that any live record is reachable; we will ask the compiler to minimize the number of reachable records that are not live; and we will preserve all reachable records, even if some of them might not be live.
Figure 13.1 shows a Tiger program ready to undergo garbage collection (at the point marked garbage-collect here). There are only three program variables in scope: p, q, and r.
MARK-AND-SWEEP COLLECTION
Program variables and heap-allocated records form a directed graph. The variables are roots of this graph. A node n is reachable if there is a path of directed edges r → … → n starting at some root r. A graph-search algorithm such as depth-first search (Algorithm 13.2) can mark all the reachable nodes.
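The sketch below gives the flavor of the marking phase in ML, over a toy representation in which heap records are array indices and fields lists each record's pointer fields; it is only an illustration of the idea, not Algorithm 13.2 itself, and a real collector traverses raw heap records.

(* minimal sketch of depth-first marking over an explicit object graph *)
fun mark (fields : int list array, marked : bool array, roots : int list) =
  let
    fun dfs n =
      if Array.sub (marked, n)
      then ()                                       (* already marked: stop *)
      else ( Array.update (marked, n, true)         (* mark node n *)
           ; List.app dfs (Array.sub (fields, n)) ) (* then mark its children *)
  in
    List.app dfs roots                              (* start from the program variables *)
  end

The sweep phase would then scan the whole heap and reclaim every record whose mark bit is still false.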
reg-is-ter: a device for storing small amounts of data
al-lo-cate: to apportion for a specific purpose
Webster's Dictionary
The Translate, Canon, and Codegen phases of the compiler assume that there are an infinite number of registers to hold temporary values and that move instructions cost nothing. The job of the register allocator is to assign the many temporaries to a small number of machine registers, and, where possible, to assign the source and destination of a move to the same register so that the move can be deleted.
From an examination of the control and dataflow graph, we derive an interference graph. Each node in the interference graph represents a temporary value; each edge (t1, t2) indicates a pair of temporaries that cannot be assigned to the same register. The most common reason for an interference edge is that t1 and t2 are live at the same time. Interference edges can also express other constraints; for example, if a certain instruction a ← b ⊕ c cannot produce results in register r12 on our machine, we can make a interfere with r12.
Next we color the interference graph. We want to use as few colors as possible, but no pair of nodes connected by an edge may be assigned the same color. Graph coloring problems derive from the old mapmakers' rule that adjacent countries on a map should be colored with different colors.
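The sketch below shows the idea with a naive greedy coloring in ML, taking the interference graph as an adjacency list and k available colors (registers); a real register allocator uses simplification, coalescing, and spilling rather than a fixed node order.

(* a minimal sketch of greedy coloring: nodes are integers, adj gives each
   node's interference edges, and k is the number of machine registers *)
fun greedyColor (adj : (int * int list) list, k : int) =
  let
    fun colorOf coloring n =
      case List.find (fn (m, _) => m = n) coloring of
          SOME (_, c) => SOME c
        | NONE        => NONE
    fun pick (node, neighbors) coloring =
      let
        val used = List.mapPartial (colorOf coloring) neighbors
        val free = List.filter
                     (fn c => not (List.exists (fn u => u = c) used))
                     (List.tabulate (k, fn i => i))
      in
        case free of
            c :: _ => (node, c) :: coloring   (* assign the lowest free color *)
          | []     => coloring                (* no color left: potential spill *)
      end
  in
    List.foldl (fn (entry, coloring) => pick entry coloring) [] adj
  end

The result is an association list mapping each temporary to a register number; any node missing from it received no color and would have to be spilled to memory.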
mem-o-ry: a device in which information can be inserted and stored and from which it may be extracted when wanted
hi-er-ar-chy: a graded or ranked series
Webster's Dictionary
An idealized random access memory (RAM) has N words indexed by integers such that any word can be fetched or stored – using its integer address – equally quickly. Hardware designers can make a big slow memory, or a small fast memory, but a big fast memory is prohibitively expensive. Also, one thing that speeds up access to memory is its nearness to the processor, and a big memory must have some parts far from the processor no matter how much money might be thrown at the problem.
Almost as good as a big fast memory is the combination of a small fast cache memory and a big slow main memory; the program keeps its frequently used data in cache and the rarely used data in main memory, and when it enters a phase in which datum x will be frequently used it may move x from the slow memory to the fast memory.
It's inconvenient for the programmer to manage multiple memories, so the hardware does it automatically. Whenever the processor wants the datum at address x, it looks first in the cache, and – we hope – usually finds it there. If there is a cache miss – x is not in the cache – then the processor fetches x from main memory and places a copy of x in the cache so that the next reference to x will be a cache hit.
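The sketch below models that lookup in ML for a direct-mapped cache; the line count and block size are made-up parameters for illustration, and in reality both the search and the fill are performed entirely by hardware.

val lines = 256                 (* illustrative number of cache lines *)
val wordsPerLine = 4            (* illustrative block size, in words *)

(* which memory block (if any) currently occupies each cache line *)
val resident = Array.array (lines, NONE : int option)

fun access addr =
  let
    val block = addr div wordsPerLine
    val line  = block mod lines
  in
    if Array.sub (resident, line) = SOME block
    then true                                        (* hit: the datum is in the cache *)
    else ( Array.update (resident, line, SOME block) (* miss: fetch the block from     *)
         ; false )                                   (* main memory and install it     *)
  end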
The front end of the compiler translates programs into an intermediate language with an unbounded number of temporaries. This program must run on a machine with a bounded number of registers. Two temporaries a and b can fit into the same register if a and b are never “in use” at the same time. Thus, many temporaries can fit in few registers; if they don't all fit, the excess temporaries can be kept in memory.
Therefore, the compiler needs to analyze the intermediate-representation program to determine which temporaries are in use at the same time. We say a variable is live if it holds a value that may be needed in the future, so this analysis is called liveness analysis.
To perform analyses on a program, it is often useful to make a control-flow graph. Each statement in the program is a node in the flow graph; if statement x can be followed by statement y, there is an edge from x to y. Graph 10.1 shows the flow graph for a simple loop.
Let us consider the liveness of each variable (Figure 10.2). A variable is live if its current value will be used in the future, so we analyze liveness by working from the future to the past. Variable b is used in statement 4, so b is live on the 3 → 4 edge.
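A compact way to see this “working from the future to the past” is the iterative calculation sketched below in ML; the succ, use, and def functions describing the flow graph are assumed inputs, sets are represented as plain lists of temporary names, and the sets grow until nothing changes.

(* minimal sketch of iterative liveness: n statements numbered 0..n-1;
   succ i = successors of i, use i = temporaries read by statement i,
   def i = temporaries written by statement i *)
fun liveness (n, succ, use, def) =
  let
    val liveIn  = Array.array (n, [] : string list)
    val liveOut = Array.array (n, [] : string list)
    fun member x ys = List.exists (fn y => y = x) ys
    fun union (xs, ys) = xs @ List.filter (fn y => not (member y xs)) ys
    fun minus (xs, ys) = List.filter (fn x => not (member x ys)) xs
    fun subset xs ys = List.all (fn x => member x ys) xs
    fun equalSets (xs, ys) = subset xs ys andalso subset ys xs
    fun round () =
      let
        val changed = ref false
        fun doNode i =
          let
            (* out[i] = union of in[s] over all successors s of i *)
            val out = List.foldl (fn (s, acc) => union (acc, Array.sub (liveIn, s)))
                                 [] (succ i)
            (* in[i] = use[i] together with (out[i] minus def[i]) *)
            val inn = union (use i, minus (out, def i))
          in
            if equalSets (inn, Array.sub (liveIn, i))
               andalso equalSets (out, Array.sub (liveOut, i))
            then ()
            else ( Array.update (liveIn, i, inn)
                 ; Array.update (liveOut, i, out)
                 ; changed := true )
          end
      in
        List.app doNode (List.tabulate (n, fn i => i));
        if !changed then round () else ()
      end
  in
    round ();
    (liveIn, liveOut)
  end

On the loop of Graph 10.1, this computation would discover, among other facts, that b is live on the 3 → 4 edge, just as the hand analysis above concluded.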