COMPILER DESIGN

A compiler is a computer program that transforms source code written in a high-level language into low-level machine language. It translates code from one programming language into another without changing the meaning of the code. The compilation process includes basic translation mechanisms and error detection. It goes through lexical, syntax, and semantic analysis at the front end, and code generation and optimization at the back end.

INTRODUCTION OF LANGUAGE PROCESSING SYSTEM

A compiler is a program that converts a high-level language to assembly language. An assembler is a program that converts assembly language to machine-level language.

Preprocessor

The preprocessor is generally considered a part of the compiler. It is a tool that produces input for the compiler, and it deals with macro processing, augmentation, and language extension.
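As a rough illustration of macro processing, the sketch below expands simple object-like macros before the text would reach the compiler. The `#define NAME value` syntax and the function name `expand_macros` are assumptions made for this example, not something the text above prescribes.

```python
import re

def expand_macros(source: str) -> str:
    """Expand '#define NAME value' macros (a hypothetical mini-syntax)."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            # Remember the macro; directive lines are consumed, not emitted.
            macros[match.group(1)] = match.group(2).strip()
            continue
        # Replace whole-word occurrences of every macro defined so far.
        for name, value in macros.items():
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)
```

For example, `expand_macros("#define MAX 10\nint a[MAX];")` returns `"int a[10];"`. Real preprocessors also handle function-like macros and conditional directives, which this sketch leaves out.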

Interpreter

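An interpreter, like a compiler, translates high-level language into low-level machine language. The difference lies in the way they read the source code or input: a compiler reads the whole program at once and translates it before execution, whereas an interpreter reads, translates, and executes one statement at a time.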

Assembler

An assembler is a computer program that translates programs written in assembly language into machine language: code and instructions that can be executed by a computer.

An assembler enables software and application developers to access, operate and manage a computer’s hardware architecture and components.

Linker

The linker helps you to link and merge various object files to create an executable file.

The main task of a linker is to search for the modules called in a program and to determine the memory locations where they will be loaded.
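The idea of merging object files and locating called modules can be sketched as follows. The object-file layout used here (a dictionary with `size`, `defines`, and `uses`) is a simplification invented for this example; real object formats are far richer.

```python
def link(object_files):
    """Lay out object files back to back and resolve symbol addresses.

    Each object file is a dict (hypothetical format):
      {"name": str, "size": int, "defines": {symbol: offset}, "uses": [symbol]}
    """
    base = 0
    symbol_addresses = {}
    for obj in object_files:
        for symbol, offset in obj["defines"].items():
            if symbol in symbol_addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            # Final address = start of this file in the image + local offset.
            symbol_addresses[symbol] = base + offset
        base += obj["size"]
    # Every referenced symbol must be defined by some object file.
    for obj in object_files:
        for symbol in obj["uses"]:
            if symbol not in symbol_addresses:
                raise ValueError(f"undefined symbol: {symbol}")
    return symbol_addresses
```

With a 16-byte `main.o` defining `main` at offset 0 and an 8-byte `util.o` defining `helper` at offset 4, the function places `helper` at address 20.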

Loader

The loader is a part of the operating system and is responsible for loading executable files into memory and executing them. It calculates the size of a program (instructions and data) and creates memory space for it. It also initializes various registers to initiate execution.

The Phases of a Compiler

Lexical Analysis

Lexical analysis is the first phase of a compiler. The lexical analyzer breaks the source text into a series of tokens, removing any whitespace and comments along the way. If the lexical analyzer finds an invalid token, it generates an error. The lexical analyzer works closely with the syntax analyzer, supplying the next token whenever the parser asks for one.
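A minimal tokenizer along these lines might look like the sketch below. The token kinds (`NUMBER`, `IDENT`, `OP`) and the `//` comment syntax are assumptions chosen for illustration.

```python
import re

# Order matters: SKIP comes first so that "//" is treated as a comment,
# not as two division operators.
TOKEN_SPEC = [
    ("SKIP", r"\s+|//[^\n]*"),      # whitespace and line comments
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("MISMATCH", r"."),             # any other character is invalid
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (kind, text) tokens, dropping whitespace and comments."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"invalid character {match.group()!r} at position {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

Calling `tokenize("x = 3 + y // comment")` yields five tokens: an identifier, an operator, a number, an operator, and an identifier. An input such as `"x = 3 $ y"` raises `SyntaxError`.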

Token passing in compiler

Syntax Analysis

A syntax analyzer or parser takes the input from a lexical analyzer in the form of token streams. The parser analyzes the source code (token stream) against the production rules to detect any errors in the code. The output of this phase is a parse tree.

Syntax Analyzer

Semantic Analysis

Semantic analysis checks the semantic consistency of the code. It uses the syntax tree of the previous phase along with the symbol table to verify that the given source code is semantically consistent.

The semantic analyzer checks for type mismatches, incompatible operands, functions called with improper arguments, undeclared variables, and so on.
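Two of these checks, type mismatches and undeclared variables, can be sketched on a tiny expression tree. The tuple-based node format and the type names `int` and `string` are assumptions made for this example.

```python
def check_expr(node, symbols):
    """Return the type of an expression node, or raise on a semantic error.

    Nodes are tuples (hypothetical format):
      ("num", 3), ("str", "hi"), ("var", "x"), ("add", left, right)
    `symbols` maps declared variable names to their types.
    """
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "str":
        return "string"
    if kind == "var":
        if node[1] not in symbols:
            raise NameError(f"undeclared variable: {node[1]}")
        return symbols[node[1]]
    if kind == "add":
        left = check_expr(node[1], symbols)
        right = check_expr(node[2], symbols)
        if left != right:
            raise TypeError(f"type mismatch: {left} + {right}")
        return left
    raise ValueError(f"unknown node kind: {kind}")
```

With `symbols = {"x": "int"}`, the expression `("add", ("var", "x"), ("num", 1))` has type `"int"`, while adding a string literal to `x` raises `TypeError`.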

Intermediate Code Generation

After semantic analysis, the compiler generates intermediate code from the source code for the target machine. This code represents a program for some abstract machine and sits between the high-level language and the machine language.
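One widely used intermediate form is three-address code, in which each instruction has at most one operator. The sketch below flattens a nested expression into that form; the tuple-based expression format and the temporary names `t1`, `t2`, ... are assumptions for illustration.

```python
import itertools

def to_three_address(expr):
    """Flatten a nested expression into three-address instructions.

    `expr` is either a string leaf (a name or a number) or a tuple
    (op, left, right), e.g. ("+", "a", ("*", "b", "c")).
    Returns (list of instruction strings, name holding the result).
    """
    code = []
    counter = itertools.count(1)

    def emit(node):
        if isinstance(node, str):
            return node                      # leaves need no instruction
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(counter)}"           # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(expr)
    return code, result
```

For `a + b * c`, written as `("+", "a", ("*", "b", "c"))`, the instructions are `t1 = b * c` followed by `t2 = a + t1`.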

Intermediate Code

Code Optimization

The next phase is code optimization, which works on the intermediate code. This phase removes unnecessary lines of code and rearranges the sequence of statements to speed up execution of the program without wasting resources. The main goal of this phase is to improve the intermediate code so that the generated code runs faster and occupies less space.
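Constant folding is one simple example of such an improvement: operations whose operands are known at compile time are computed once by the compiler instead of at run time. The sketch below assumes straight-line code, integer arithmetic, and instructions written as `(dest, op, arg1, arg2)` tuples; all of these are choices made for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(instructions):
    """Fold and propagate integer constants in straight-line code.

    Arguments are ints or variable names (str).
    Returns (remaining instructions, constants computed at compile time).
    """
    known = {}        # variables whose value is a compile-time constant
    optimized = []
    for dest, op, a, b in instructions:
        # Substitute operands that are already known constants.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)      # computed now, nothing emitted
        else:
            known.pop(dest, None)            # dest is no longer a constant
            optimized.append((dest, op, a, b))
    return optimized, known
```

Given `t1 = 2 * 3` followed by `t2 = x + t1`, only `t2 = x + 6` remains, and `t1` is recorded as the constant 6.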

Code Generation

The code generator produces the target code for three-address statements. It uses registers to store the operands of each three-address statement. The objective of this phase is to allocate storage and generate relocatable machine code. This phase converts the optimized intermediate code into the target language.
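The mapping from three-address statements to register-based target code can be sketched very naively as below. The instruction set (`LOAD`, `ADD`, `SUB`, `MUL`, `STORE`) and the fixed use of registers `R0` and `R1` are hypothetical; a real code generator would allocate registers far more carefully.

```python
def generate(instructions):
    """Translate (dest, op, arg1, arg2) tuples into pseudo-assembly lines."""
    mnemonics = {"+": "ADD", "-": "SUB", "*": "MUL"}
    asm = []
    for dest, op, a, b in instructions:
        asm.append(f"LOAD R0, {a}")              # first operand into R0
        asm.append(f"LOAD R1, {b}")              # second operand into R1
        asm.append(f"{mnemonics[op]} R0, R1")    # R0 = R0 op R1
        asm.append(f"STORE {dest}, R0")          # write the result back
    return asm
```

For the statement `t1 = b * c`, this emits two loads, one `MUL`, and one `STORE`.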

Symbol Table

A symbol table contains a record for each identifier with fields for the attributes of the identifier. This component makes it easier for the compiler to search the identifier record and retrieve it quickly.
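A symbol table is often built on a hash map so that lookups stay fast. The sketch below adds nested scopes, which the description above does not mention; the scoping behavior, class name, and method names are assumptions made for this example.

```python
class SymbolTable:
    """Maps identifiers to attribute records, with nested scopes."""

    def __init__(self):
        self.scopes = [{}]                   # innermost scope is last

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        if name in self.scopes[-1]:
            raise KeyError(f"redeclaration of {name}")
        self.scopes[-1][name] = attributes

    def lookup(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

A declaration such as `table.declare("x", type="int")` stores the record `{"type": "int"}`, and `table.lookup("x")` retrieves it. A name declared again in an inner scope shadows the outer one until `exit_scope()` is called.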

Types of Parsing

Top-Down Parsing

In top-down parsing, the parse tree is built from the top down, starting at the start symbol (the root) and working toward the input symbols (the leaves). Top-down parsing is mainly concerned with discovering the production rules that derive the input.

Two types of Top-down parsing are:

  1. Predictive Parsing:

The predictive parser uses a look-ahead pointer, which points to the next input symbols. Backtracking is not an issue with this parsing technique.

  2. Recursive Descent Parsing:

This parsing technique recursively parses the input to build a parse tree. It consists of several small functions, one for each nonterminal in the grammar.
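Both ideas can be seen together in a small recursive descent parser that uses one token of look-ahead and never backtracks. The grammar below (arithmetic expressions over `expr`, `term`, and `factor`) and the tuple-based tree it builds are assumptions chosen for illustration; the tree is an abstract one that omits the intermediate nonterminal nodes.

```python
class Parser:
    """Grammar (one method per nonterminal):
         expr   -> term   (('+' | '-') term)*
         term   -> factor (('*' | '/') factor)*
         factor -> NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The look-ahead: the next token, without consuming it.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")
```

Parsing the tokens `["2", "+", "3", "*", "4"]` gives `("+", 2, ("*", 3, 4))`, so multiplication binds tighter than addition because `term` sits below `expr` in the grammar.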

Bottom-Up Parsing

  • Bottom-up parsing is used to construct a parse tree for an input string.
  • In bottom-up parsing, the parser starts with the input symbols and constructs the parse tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
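A simplified shift-reduce loop shows the idea. The grammar (`E -> E + T | T`, `T -> n`) and the greedy "reduce whenever the top of the stack matches a rule" strategy are assumptions that happen to work for this tiny grammar; practical bottom-up parsers use LR tables to decide when to shift and when to reduce.

```python
RULES = [           # (left-hand side, right-hand side), longest handle first
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["n"]),
]

def shift_reduce(tokens):
    """Return (accepted, reductions applied in order)."""
    stack, reductions = [], []
    remaining = list(tokens)
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                # Reduce: replace the handle on top of the stack by lhs.
                del stack[-len(rhs):]
                stack.append(lhs)
                reductions.append(f"{lhs} -> {' '.join(rhs)}")
                break
        else:
            if not remaining:
                break
            stack.append(remaining.pop(0))   # Shift the next input symbol.
    return stack == ["E"], reductions
```

For the input `n + n`, the reductions occur in the order `T -> n`, `E -> T`, `T -> n`, `E -> E + T`. Read backwards, that is the rightmost derivation `E => E + T => E + n => T + n => n + n`.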

Error-Recovery Methods

Common Errors that occur in Parsing in System Software

  • Lexical: an incorrectly typed identifier name
  • Syntactical: unbalanced parentheses or a missing semicolon
  • Semantical: incompatible value assignment
  • Logical: an infinite loop or unreachable code

A parser should be able to detect and report any error found in the program. When an error occurs, the parser should be able to handle it and carry on parsing the remaining input.
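One common way to carry on after an error is panic-mode recovery: the parser reports the error, discards tokens until it reaches a synchronizing token such as `;`, and resumes from there. The sketch below assumes a toy statement form `NAME = NUMBER ;` and is only meant to show the recovery step.

```python
def parse_statements(tokens):
    """Parse statements of the (hypothetical) form: NAME = NUMBER ;

    Returns (parsed statements, error messages).
    """
    statements, errors = [], []
    pos = 0
    while pos < len(tokens):
        chunk = tokens[pos:pos + 4]
        if (len(chunk) == 4 and chunk[0].isidentifier() and chunk[1] == "="
                and chunk[2].isdigit() and chunk[3] == ";"):
            statements.append((chunk[0], int(chunk[2])))
            pos += 4
            continue
        errors.append(f"syntax error near token {pos}: {tokens[pos]!r}")
        # Panic mode: discard tokens up to and including the next ';'.
        while pos < len(tokens) and tokens[pos] != ";":
            pos += 1
        pos += 1
    return statements, errors
```

For the token stream of `a = 1 ; b = = ; c = 3 ;`, the parser reports one error for the second statement and still recovers the first and third.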