Basic Blocks and Directed Acyclic Graphs
In compiler design, we focus on several key phases, including lexical analysis, syntax analysis, and intermediate code generation. Another crucial phase is optimization, which helps produce efficient machine code by improving execution speed and reducing memory usage.
One important optimization technique involves Basic Blocks and Directed Acyclic Graphs (DAGs). Basic blocks break code into independent units, while DAGs help detect common subexpressions and eliminate redundant computations, reducing the number of instructions.
In this chapter, we will explore basic blocks, understand how DAGs optimize code, and go through examples for better clarity.
What are Basic Blocks?
A basic block is a sequence of instructions or operations with a single entry point and a single exit point. This means execution enters at the beginning and leaves at the end, with no jumps or branches in between.
Characteristics of Basic Blocks
To work with basic blocks effectively, we must first understand their important characteristics −
- No Jump or Branch Atoms Inside − A basic block does not contain label (LBL), jump (JMP), or test (TST) atoms.
- Single Entry, Single Exit − Execution starts at the first instruction and continues sequentially through the last instruction.
- Self-contained − Basic blocks do not interfere with other code segments, which makes them easier to analyze and optimize.
Example: Identifying Basic Blocks
Consider the following atom sequence (three-address code) generated from a Java expression −
(ADD, b, c, T1) (LBL, L1) (ADD, b, c, T2) (MUL, T1, T2, T3) (TST, b, c,, 1, L3) (MOV, T3,, a)
We can divide it into three basic blocks. The LBL and TST atoms act as block boundaries and, by the definition above, belong to no block −

| Basic Block | Instructions |
| --- | --- |
| Block 1 | (ADD, b, c, T1) |
| Block 2 | (ADD, b, c, T2) (MUL, T1, T2, T3) |
| Block 3 | (MOV, T3,, a) |
Each block is independent and does not contain any control flow changes within it.
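To make the partitioning concrete, here is a minimal sketch, assuming a hypothetical Atom record that holds an operation name and its fields; it treats LBL, JMP, and TST atoms as block boundaries −

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical atom representation: an operation name plus its fields,
// e.g. new Atom("ADD", "b", "c", "T1").
record Atom(String op, String... fields) {
    @Override
    public String toString() {
        return "(" + op + ", " + String.join(", ", fields) + ")";
    }
}

public class BlockSplitter {
    // LBL, JMP, and TST atoms separate blocks and belong to none of them.
    static boolean isBoundary(Atom a) {
        return a.op().equals("LBL") || a.op().equals("JMP") || a.op().equals("TST");
    }

    static List<List<Atom>> splitIntoBlocks(List<Atom> atoms) {
        List<List<Atom>> blocks = new ArrayList<>();
        List<Atom> current = new ArrayList<>();
        for (Atom a : atoms) {
            if (isBoundary(a)) {
                // Close the current block. The boundary atom itself is dropped
                // here; a real compiler would keep it to reassemble the program.
                if (!current.isEmpty()) {
                    blocks.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(a);
            }
        }
        if (!current.isEmpty()) blocks.add(current);
        return blocks;
    }

    public static void main(String[] args) {
        List<Atom> atoms = List.of(
                new Atom("ADD", "b", "c", "T1"),
                new Atom("LBL", "L1"),
                new Atom("ADD", "b", "c", "T2"),
                new Atom("MUL", "T1", "T2", "T3"),
                new Atom("TST", "b", "c", "", "1", "L3"),
                new Atom("MOV", "T3", "", "a"));
        List<List<Atom>> blocks = splitIntoBlocks(atoms);
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println("Block " + (i + 1) + ": " + blocks.get(i));
        }
    }
}
```

Running it on the atom sequence above prints the same three blocks listed in the table.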
Optimizing Basic Blocks Using DAGs
A Directed Acyclic Graph (DAG) is a graph-based structure that helps eliminate redundant computations by detecting common subexpressions within a basic block. Before constructing one, let us see why DAGs are worth the effort.
Why Use DAGs?
We use DAGs for the following reasons −
- Reduces Redundant Computations − If an operation such as b + c appears multiple times, a DAG ensures it is computed only once.
- Optimizes Instruction Count − The DAG simplifies the atom sequence, generating a shorter, more efficient version.
- Minimizes Temporary Storage − Redundant temporary variables are removed, reducing register and memory usage.
Example: DAG for Expression Optimization
Consider the Java statement −
a = (b + c) * (b + c);
The parser produces the following unoptimized atom sequence −
(ADD, b, c, T1) (ADD, b, c, T2) (MUL, T1, T2, T3) (MOV, T3,, a)
This sequence computes b + c twice, wasting time and resources.
Using a DAG representation, we recognize that both occurrences of b + c denote the same value, so we compute it once and reuse the result −
(ADD, b, c, T1) (MUL, T1, T1, a)
Now there is only one ADD operation instead of two, the MOV atom is gone, and the MUL operation stores its result directly in a.
How to Construct a DAG for Optimization
Now that we know why DAGs are useful, let us see how to build one. To build a DAG from an atom sequence, we follow these steps (a runnable sketch follows the list) −
- Step 1: Read an Atom (Instruction) − If the operation and its operands already exist in the DAG, reuse the existing node. Otherwise, create a new node.
- Step 2: Connect Nodes − Draw directed arcs from the operation node to its operand nodes, and keep track of which variables are attached to each node.
- Step 3: Generate Optimized Code − Traverse the DAG and emit one atom per node, which yields a minimal atom sequence.
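The sketch below implements these three steps for a single basic block. It is illustrative only: the names Dag, visit, and generate are not a fixed compiler API, temporaries are assumed to be named T1, T2, … as in this chapter, and for brevity each operand is stored as the canonical label of its node rather than as an explicit arc. Applied to the atoms of a = (b + c) * (b + c), it prints the optimized sequence derived above −

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Dag {
    // One node per distinct (op, left, right) expression, plus the set of
    // variables and temporaries currently attached to it (Step 2).
    static class Node {
        final String op, left, right;
        final Set<String> labels = new LinkedHashSet<>();
        Node(String op, String left, String right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    private final Map<String, Node> nodes = new LinkedHashMap<>();       // "op(l,r)" -> node
    private final Map<String, Node> labelToNode = new LinkedHashMap<>(); // label -> its node

    // Step 1: reuse the node if the expression already exists, else create it.
    public void visit(String op, String left, String right, String result) {
        if (op.equals("MOV")) {
            // (MOV, src,, dest) attaches dest as another label of src's node.
            Node n = labelToNode.get(left);
            if (n != null) {
                n.labels.add(result);
                labelToNode.put(result, n);
            }
            return;
        }
        String l = resolve(left), r = resolve(right);
        Node n = nodes.computeIfAbsent(op + "(" + l + "," + r + ")",
                k -> new Node(op, l, r));
        n.labels.add(result);
        labelToNode.put(result, n);
    }

    // An operand that names an existing node is replaced by that node's first
    // label, so equal subexpressions always produce equal keys.
    private String resolve(String operand) {
        Node n = labelToNode.get(operand);
        return n == null ? operand : n.labels.iterator().next();
    }

    // Step 3: emit one atom per node, in creation order. A program variable is
    // preferred over a temporary as the destination (temporaries are assumed
    // to be named T1, T2, ... as in this chapter).
    public List<String> generate() {
        List<String> code = new ArrayList<>();
        for (Node n : nodes.values()) {
            String dest = n.labels.stream()
                    .filter(s -> !s.startsWith("T"))
                    .findFirst()
                    .orElse(n.labels.iterator().next());
            code.add("(" + n.op + ", " + n.left + ", " + n.right + ", " + dest + ")");
        }
        return code;
    }

    public static void main(String[] args) {
        Dag dag = new Dag();                // a = (b + c) * (b + c);
        dag.visit("ADD", "b", "c", "T1");
        dag.visit("ADD", "b", "c", "T2");   // same expression: node reused
        dag.visit("MUL", "T1", "T2", "T3");
        dag.visit("MOV", "T3", "", "a");    // attaches a to the MUL node
        dag.generate().forEach(System.out::println);
        // Prints: (ADD, b, c, T1) and (MUL, T1, T1, a)
    }
}
```

The MOV atom attaches the variable a to the MUL node, which is why the generated MUL can write its result directly into a.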
Example: Building a DAG for Optimization
Consider the Java expression −
a * b + a * b + a * b
The parser generates the following atoms −
(MUL, a, b, T1) (MUL, a, b, T2) (ADD, T1, T2, T3) (MUL, a, b, T4) (ADD, T3, T4, T5)
Using the DAG construction process −

- Create a single node for (MUL, a, b) instead of three.
- Reuse that node in both ADD operations instead of recomputing.

Optimized Atom Sequence (Generated from DAG)

(MUL, a, b, T1) (ADD, T1, T1, T3) (ADD, T3, T1, T5)
Here we have eliminated the redundant multiplications and reduced five instructions to just three.
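As a quick check, feeding these atoms to the Dag sketch from the previous section reproduces the same three-atom sequence −

```java
// Reuses the hypothetical Dag class sketched earlier.
Dag dag = new Dag();
dag.visit("MUL", "a", "b", "T1");
dag.visit("MUL", "a", "b", "T2");   // same expression: the MUL node is reused
dag.visit("ADD", "T1", "T2", "T3");
dag.visit("MUL", "a", "b", "T4");   // reused again
dag.visit("ADD", "T3", "T4", "T5");
dag.generate().forEach(System.out::println);
// (MUL, a, b, T1)
// (ADD, T1, T1, T3)
// (ADD, T3, T1, T5)
```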
Basic Blocks and DAG Optimization: Advantages and Disadvantages
The following table highlights the advantages and disadvantages of Basic Blocks and DAG Optimization −

| Advantages | Disadvantages |
| --- | --- |
| Faster Execution − Fewer instructions mean less CPU time, speeding up program execution. | Limited to Basic Blocks − DAGs work only within a single basic block and cannot optimize across blocks, due to jump and branch atoms. |
| Reduced Memory Usage − Fewer temporary variables reduce register and memory storage needs. | Cannot Handle Labels or Jumps − DAG-based optimization is not possible where label (LBL) or jump (JMP) atoms occur, as control flow alters the execution order. |
| Better Code Optimization − DAG-based optimizations help compilers generate compact, efficient code. | Overhead in DAG Construction − Building and maintaining a DAG adds complexity to the compiler, and large expressions may require more memory for DAG storage. |
Conclusion
In this chapter, we explained the concept of basic blocks and DAGs in compiler design and how they help in code optimization. We saw how basic blocks segment code into independent execution units and how DAGs eliminate redundant computations by detecting common subexpressions.
With DAG-based optimization, we can reduce instruction count, memory usage, and execution time in compiled programs. Basic block optimization is powerful, but it is limited to sections of code without jumps or labels.