2025-04-14

LLVM: Constraint Elimination Pass

Introduction

Constraint Elimination is a transformation pass that extracts facts from LLVM IR and tries to replace certain uses with known facts. One common replacement is to replace conditional comparisons with constants (true or false) when known. The following is a simple example of how Constraint Elimination works.

int foo(int x, int y) {
  if (x > y) {    // x > y is a known fact in this block
    if (x <= y) { // replace x <= y with false
      return 2;
    }
    return 1;
  }

  return 0;
}

// after constraint elimination
int foo(int x, int y) {
  if (x > y) {
    if (false) { // block can be removed
      return 2;
    }
    return 1;
  }

  return 0;
}

The logic behind Constraint Elimination relies on an analysis pass called the Constraint System. Constraint Elimination stores known facts inside two constraint systems (signed and unsigned) and asks the constraint system if it’s okay to replace uses.

Constraint System

The Constraint System stores known facts in the following form (c0 is constant, c1 to cn are coefficients, v1 to vn are variables):

c0 <= c1 * v1 + c2 * v2 … + cn * vn

For instance, constraint x >= 0 will be decomposed into 0 <= 1 * x and constraint y >= 0 will be decomposed into 0 <= 0 * x + 1 * y. These two constraints are true in the constraint system, which implies a logical AND between them, resulting in x >= 0 && y >= 0.

Constraint Elimination will ask the constraint system whether a constraint (A) is implied by adding a negation of the constraint (!A) into the system. Since constraints are combined with logical AND, the system becomes C1 && C2 && ... && Cn && !A. If this is satisfiable (sat), there’s a solution where A is false, so A isn’t implied by the existing constraints. If unsatisfiable (unsat), no solution exists where A is false, meaning A must be true whenever the existing constraints hold. For example, with constraints A && B, checking A by adding !A gives A && B && !A, which is unsat (a contradiction). Thus, A is implied and can be safely replaced with true in the ConstraintElimination pass.

Now that we understand how constraints are handled, let’s dive into the core of the Constraint Elimination pass.

Constraint Elimination

The Constraint Elimination pass works roughly as follows:

Perform a Depth-First search the dominator tree and add entries to a worklist for every basic block. An entry can be a Fact to be added or a Check that needs to check if it can be replaced.
Sort the worklist by dominance information from the dominator tree.
Process sorted worklist and replace implied conditions.
Remove instructions with no uses inside the function.

Constraint Elimination begins by traversing the entry block and iterates over every instruction. If there’s a comparison instruction icmp, it adds an entry UseCheck into the worklist to check if it can be replaced later. If the terminator is a conditional branch, it adds corresponding facts to the successor blocks.

define i1 @src(i4 %x, i4 %y) {
entry:
  %c.1 = icmp slt i4 %x, 0
  br i1 %c.1, label %t, label %f
t:
  ...x < 0
f:
  ...x >= 0
}

Branching to block t means that %c.1 is true, where it can safely add x < 0 constraint to block t. On the other hand, it will invert the predicate to x >= 0 and add this constraint to block f. It will traverse to the next basic block and do the same until it traverses every basic block in the dominator tree.

Consider another example: block t is reached if x < 0 || y < 0. However, as mentioned earlier that the current constraint system has an implicit logical AND for every fact, constraint elimination cannot add this fact x < 0 || y < 0 into the constraint system. As for block f, it can add the inverse of predicate x >= 0 && y >= 0 into the constraint system.

define i1 @src(i4 %x, i4 %y) {
entry:
  %c.1 = icmp slt i4 %x, 0
  %c.2 = icmp slt i4 %y, 0
  %or = or i1 %c.1, %c.2
  br i1 %or, label %t, label %f
t:
  ret i1 false
f:
  %cmp = icmp sge i4 %x, 0
  ret i1 %cmp
}

Below is the worklist after traversing all blocks in the function and sorting the worklist. NumIn and NumOut represent the sequence for the Depth-first search. The sorted entries follow these rules NumIn < ConditionFact < Check < InstrBefore, where blocks with lower NumIn values dominate those with higher ones. In the same basic block, condition facts are processed before UseCheck.

// entry NumIn=0, NumOut=5
// t     NumIn=1, Numout=2
// f     NumIn=3, Numout=4
UseCheck(%c.1, ICMP_SLT, %x, 0, NumIn=0, NumOut=5)
UseCheck(%c.2, ICMP_SLT, %y, 0, NumIn=0, NumOut=5)
ConditionFact(DT.getNode(f), ICMP_SGE, %x, 0, NumIn=3, NumOut=4)
ConditionFact(DT.getNode(f), ICMP_SGE, %y, 0, NumIn=3, NumOut=4)
UseCheck(%cmp, ICMP_SGE, %x, 0, NumIn=3, NumOut=4)

By processing the above worklist, constraint elimination learns that it can replace %cmp with true in block f (see Alive2 proof — a tool to verify LLVM IR optimizations), and safely remove %cmp = icmp sge i4 %x, 0 instruction.

Missing Cases

As of writing, I have submitted a PR to add optimization opportunities for Constraint Elimination. The pass adds additional facts if the condition being analyzed matches one of the following four patterns.

# bitwise OR
(LHS | RHS >= 0) => LHS >= 0 && RHS >= 0
(LHS | RHS > -1) => LHS >= 0 && RHS >= 0
# bitwise AND
(LHS & RHS < 0)   =>  LHS < 0 && RHS < 0
(LHS & RHS <= -1) =>  LHS < 0 && RHS < 0

This example demonstrates how my PR works. Before my PR, constraint elimination would only add one fact %and.1 < 0 to block then. With the added facts x < 0 && y < 0 in the then block, the pass can replace %t.1 and %t.2 with true.

// before
define void @src(i4 %x, i4 %y) {
entry:
  %and.1 = and i4 %y, %x
  %c.1= icmp slt i4 %and.1, 0
  br i1 %c.1, label %then, label %end

then:
  ; fact: %and.1 < 0
  %t.1 = icmp slt i4 %x, 0
  %t.2 = icmp slt i4 %y, 0
  call void @use(i1 %t.1)
  call void @use(i1 %t.2)
  ret void

end:
  ret void
}

// after
define void @tgt(i4 %x, i4 %y) {
entry:
  %and.1 = and i4 %y, %x
  %c.1= icmp slt i4 %and.1, 0
  br i1 %c.1, label %then, label %end

then:
  call void @use(i1 true)
  call void @use(i1 true)
  ret void

end:
  ret void
}

Measure Replacement Counts

To evaluate the impact of my patch, I measured the number of replacements triggered by the Constraint Elimination pass. My approach is to print log whenever additional facts are added and replacement uses are triggered. This experiment was conducted on two projects, LLVM and Rust.

Building LLVM with UBSan enabled

With UBSan enabled, building LLVM will go through a series of bound checks, firing more Constraint Elimination. The first build has my constraint elimination patch, and I commented out my patch in the second build.

# $CC_HOME points to my locally built LLVM
cmake -GNinja -DCMAKE_CXX_COMPILER=$CC_HOME/bin/clang++ \
  -DCMAKE_C_COMPILER=$CC_HOME/bin/clang \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="llvm;clang" \
  -DLLVM_CCACHE_BUILD=OFF -DLLVM_BUILD_EXAMPLES=ON \
  ../llvm -DLLVM_TARGETS_TO_BUILD=X86 \
  -DLLVM_USE_SANITIZER=Undefined

# without my patch
ninja -j 12 > without.txt 2>&1
# with my constraint elimination patch
ninja -j 12 > with.txt 2>&1

We observed that Constraint Elimination added several new facts while building LLVM.

# without my patch
$ cat ce2/without.txt | grep "additional" | wc -l
0
# with my constraint elimination patch
$ cat ce2/with.txt | grep "additional" | wc -l
78

We expected that the extra facts could trigger more replacements. However, the total number of replace count is the same.

# without my patch
$ ./perform.sh ce/without.txt
1: replace count
30428
2: replace count
517
3: replace count
0
4: replace count
0
5: replace count
0
6: replace count
7
7: replace count
7
total replace count
30959

# with my constraint elimination patch
$ ./perform.sh ce/with.txt
1: replace count
30428
2: replace count
517
3: replace count
0
4: replace count
0
5: replace count
0
6: replace count
7
7: replace count
7
total replace count
30959

Building Rust nightly

The result of the first experiment didn’t meet what we expected, so we decided to see the result when compiling Rust.

As of writing, Rust nightly supports LLVM 20.x, so I had to rebuild my patch on top of LLVM 20. Luckily, there were no conflicts when I applied my updates to LLVM 20.x branch.

# bootstrap.toml for building rustc
change-id = "ignore"

[target.x86_64-unknown-linux-gnu]
# set external llvm-config to get our locally built LLVM binaries
llvm-config = "/home/lee/dev/llvm-project/build/bin/llvm-config"

[llvm]
download-ci-llvm = false
link-shared = false

[build]
extended = true
# this needs to set to false, or rustc will try to build from in-tree LLVM
optimized-compiler-builtins = false

[rust]
channel = "nightly"

Build stage2 rustc and libstd.

# without my patch
./x.py build --stage 2 2>&1 | tee without.txt
# with my constraint elimination patch
./x.py build --stage 2 2>&1 | tee with.txt

Like we expected, there were additional facts added when compiling rustc.

$ cat rust-ce/without.txt | grep "additional" | wc -l
0
$ cat rust-ce/with.txt | grep "additional" | wc -l
304

Finally, the result is similar to building LLVM. There was no difference in replacement count.

# without my patch
$ ./perform.sh rust-ce/without.txt
1: replace count
594755
2: replace count
57
3: replace count
0
4: replace count
0
5: replace count
0
6: replace count
0
7: replace count
0
total replace count
594812

# with my constraint elimination patch
$ ./perform.sh rust-ce/with.txt
1: replace count
594755
2: replace count
57
3: replace count
0
4: replace count
0
5: replace count
0
6: replace count
0
7: replace count
0
total replace count
594812

Given that Rust performs many bounds checks, we can observe that it benefits significantly from the Constraint Elimination pass overall.

Despite successfully adding additional facts, the total number of replacements remained unchanged in both LLVM and Rust builds. It’s unclear why this happened. One possible explanation is that these new facts did not intersect with any existing icmp checks eligible for replacement. Further investigation is needed to understand this behavior.

Future Work

Constraint Elimination is a fun and powerful transformation pass. There are still many missing optimizations that we can contribute. As we mentioned earlier, the current constraint system supports only logical AND for its constraints. This limitation prevents the system from learning facts expressed as Fact1 || Fact2. Supporting logical OR opens many optimization opportunities, such as (LHS | RHS) < 0 => LHS < 0 || RHS < 0.

Currently, ne constraints are not supported in the constraint system. The reason is that a != b cannot be directly encoded as a single linear constraint — it represents a disjunction: a < b || a > b. However, the constraint system models all facts as conjunctions (logical ANDs) of linear inequalities. Without support for logical OR, it is not possible to store ne facts within the system. In contrast, eq facts like a == b can be represented as two inequalities: a <= b && a >= b, both of which are individually expressible.

Conclusion

I only worked on a certain part of the Constraint Elimination pass. There is still more to discover inside this pass, such as, adding facts for loop induction variables. Lastly, I want to express my gratitude to Dr. John Regehr for his valuable advice on this project. Thanks, John!