Tuesday, July 29, 2008

Translation Rules

Infix to Prefix
1. If E is a variable or constant then Prefix(E) = E
2. if E is E1 op E2 then Prefix(E) = Prefix(E1 op E2) = op Prefix(E1) Prefix(E2)
3. If E is (E1) then Prefix(E) = Prefix(E1)


Example:
Prefix( A * B + C / D )
= Prefix ( ( A * B ) + ( C / D ) )
= + Prefix( A * B) Prefix( C / D )
= + * Prefix( A ) Prefix( B ) / Prefix ( C ) Prefix( D )
= + * A B / C D

Postfix to Infix
1. If E is a variable or constant then Infix(E) = E
2. if E is E1 E2 op then Infix(E) =
Infix(E1 E2 op) = Infix(E1) op Infix(E2)
3. If E is (E1) then
Infix(E) = Infix(E1)

Example:
Infix( A B * C D / + )
= Infix( ( A B * ) ( C D / ) + )
= Infix( A B * ) + Infix( C D / )
= Infix( A ) * Infix( B ) + Infix( C ) / Infix( D )
= A * B + C / D

Postfix to Prefix
1. If E is a variable or constant then Prefix(E) = E
2. if E is E1 E2 op then
Prefix(E) = Prefix(E1 E2 op) = op Prefix(E1) Prefix(E2)
3. If E is (E1) then
Prefix(E) = Prefix(E1)

Example:
Prefix( A B * C D / + )
= Prefix ( ( A B * ) ( C D /) + )
= + Prefix( A B * ) Prefix( C D /)
= + * Prefix( A ) Prefix( B ) / Prefix( C ) Prefix( D )
= + * A B / C D

Tuesday, July 22, 2008

Exercises

2.1 Consider the context-free grammar: S -> S S + | S S * | a

a.) Show how the string aa+a* can be generated by this grammar.
S -> S S *
S -> S S + S *
S -> a S + S *
S -> a a + S *
S -> a a + a *

b.) Construct a parse tree for this string.


c.) What language is generated by this grammar? Justify your answer.
Supposing a is an identifier for a numeric value, then the grammar generates a language which is composed of all possible arithmetic combinations of a using the operations + and * and postfix notation.


2.2 What language is generated by the following grammars? In each case, justify your answer.

a.) S -> 0 S 1 | 0 1
A language that generates a series of equal numbers of 0's and 1's, where the sequence of 0's precedes the sequence of 1's.

b.) S -> + S S | - S S | a
A language consisting of all possible arithmetic combinations of a using only the operations + and * and prefix notation. This is actually the prefix analog to exercise 2.1

c.) S -> S ( S ) S |
A language that generates a string starting with S and ending with S, where there is an even number of S's and where the leftmost half (which starts with a S and ends with an opening parenthesis) of the string is composed of alternating S's and opening parentheses and the rightmost half (which starts with a closing parenthesis and ends with an S) of the string is composed of alternating closing parentheses and S's. In other words, this will simply generate arbitrary sequences of adjacent and nested, matched pairs of parentheses.

d.) S -> a S b S | b S a S |
A language that generates a string which is composed of equal numbers of a's and b's in any or no particular order.

e.) S -> a | S + S | S S | S * | ( S )
A language that generates a string
composed of any number of a's separated by mathematical operations(addition and multiplication) and/or parentheses in particular.

Tuesday, June 24, 2008

The Phases of a Compiler

Basically, a compiler is a program that reads a program written in a source language and translates it into an equivalent program in a target language. The compiler is a large and complex program so for simplicity purposes, it is being divided into two major parts and which are then further broken down into different phases on the base of their complexity. A typical decomposition of a compiler is shown in the figure below.

The two major parts of a compiler are the Analysis phase and the Synthesis phase. The analysis part breaks up the source program into constant piece and creates an intermediate representation of the source program. The Lexical Analyzer, Syntax Analyzer and Semantic Analyzer form the bulk of this part. On the other hand, the synthesis part constructs the desired target program from the intermediate representation. The Intermediate Code Generator, Code Generator, and Code Optimizer are the parts of this synthesis portion. Each phase transforms the source program from one representation into another representation. They all communicate with error handlers and so with the symbol table. Informally, we shall also consider the symbol table manager and the error handler “phases”.

In the first phase of a compiler, we define lexical rules by regular expression. Lexical Analysis is also called Scanning. The Lexical Analyzer reads the source program character by character and returns the tokens of the source program, which are pattern of characters having same meaning in the source program. In this phase, the source program is checked to have valid characters and words. After detailed analysis, input stream of tokens are stored in a buffer.

The second phase of a compiler is the Syntax Analysis. Generally, a Syntax Analyzer creates the syntactic structure (generally a parse tree) of the given program. In this phase, we define syntax rules by Context Free Grammar (CFG). Syntax Analysis is also called parser. During parsing, syntax rules are checked. Inputs of this phase are tokens and output is a parse tree or a syntax tree.

The third phase is called Semantic Analysis. Basically the word semantic means the “meaning”. A semantic analyzer is the one which checks the source program for semantic errors and collects the type information for the code generation. In this phase, we define semantic rules by attribute grammars. Semantic Analysis has two types: Declaration Checking and Type Checking. The input of this phase is a syntax tree and the output is an attribute grammar.

The last and final phase is the Target Code Generation. After analysis of source code is completed, then last step is converting it into a target language. However, there are still other optional phases; namely the Source Code Optimizer, Target Code Optimizer and Intermediate Code Generator. After the syntax and semantic analysis, some compilers may generate explicit intermediate codes representing the source program. So this is where these remaining phases take place. The code optimizer is the one responsible for optimizing the code produced by the intermediate code generator in the terms of time and space so as to produce a faster-running machine code. And lastly, the Code Generator produces the target language in a specific architecture in which the target program is normally a relocatable object file containing the machine codes.

References:

http://www.cs.bilkent.edu.tr/~ilyas/Courses/CS416/lec00-outline.PPT
http://www.personal.kent.edu/~rmuhamma/Compilers/compiler.html

Introduction

"This blog is dedicated to our ICS 40, a course that covers the design and implementation of high-level programming language translators and compilers."

Compiler Principles™ @ half-males.blospot.com is owned and maintained by the 4J Addicts
.

4J Addicts are made of:


Joshua Mario A. Bacal


J
ethro Neil S. Cutamora


J
effrey T. Sy


J
ames Andrew S. Young