SE465 Test 1 Study Guide

Feb 10, 2025

Quick Links


Lecture 2 - Engineering Tests

Date: January 8, 2025

Why Tests?

  • Tests provide assurance that code works.
  • Automated tests allow for faster feedback compared to manual testing.
  • Not writing tests incurs technical debt, making future maintenance harder.
  • Code that lasts years without tests will accumulate bugs unnoticed.
  • Key takeaway: If it matters that the code works, write a test for it.

Exploratory Testing

  • Typically done by dedicated testers, but useful for developers as well.
  • Exploratory testing = Learning + Test Design + Execution.
  • Contrasted with scripted testing:
    • Scripted testing follows predefined steps.
    • Exploratory testing adapts based on findings.
  • Best for:
    • Rapid feedback on new features.
    • Finding the most critical bug quickly.
    • Evaluating risks before writing scripted tests.

Process:

  1. Define a charter (broad testing goal).
  2. Choose an area of the software to test.
  3. Design a test informally.
  4. Execute and log findings.
  5. Repeat as needed.

Regression Testing

Ensures that new changes don’t break existing functionality.

  • Automated: Manual regression testing is inefficient.
  • Optimally sized: Too few = missed bugs, too many = slow execution.
  • Up-to-date: Must match the current version of the software.

Automating Regression Tests

  • Input:
    • File-based inputs simplify regression testing.
    • UI-based tests often use event capture and replay.
  • Output:
    • Comparing screenshots (e.g., Mozilla's testing of Gecko engine) is problematic due to nondeterminism.
    • A more reliable approach is logging-based verification.

Unit Testing

  • Focuses on individual components (functions, classes, or modules).
  • Must be fast to execute.
  • Use mocks and stubs to isolate components.

Example Test Structure (NUnit - C#)

[Test]
public void GetMinimum_UnsortedArray_ReturnsSmallestValue() {
    var arr = new int[] {7, 4, 9, 2, 5};
    var minimum = Statistics.GetMinimum(arr);
    Assert.AreEqual(2, minimum);
}
  • Structure:
    1. Arrange - Set up test data.
    2. Act - Execute the function.
    3. Assert - Verify the output.

Flaky Tests

  • Flaky tests fail inconsistently.
  • Common causes:
    1. Improper waiting for async responses.
    2. Concurrency issues (race conditions, deadlocks).
    3. Test order dependencies (tests rely on prior tests).
  • Fixes: Use explicit waits instead of sleep(), proper locking mechanisms, and independent test cases.

Test Doubles

  • Used to isolate dependencies in unit tests.
  • Types:
    • Dummy objects: Only fill parameter lists, do nothing.
    • Fakes: Implement real behavior but unsuitable for production.
    • Stubs: Return predefined responses.
    • Mocks: Verify method calls and behavior.
    • Spies: Record interactions but wrap real objects.

Mocking Example (jMock - Java)

Mock warehouse = mock(Warehouse.class);
Mock mailer = mock(MailService.class);
order.setMailer((MailService) mailer.proxy());
mailer.expects(once()).method("send");

Key Takeaways

  • Testing is essential for maintainable software.
  • Automate everything: Regression tests should always be automated.
  • Unit tests should be small and isolated.
  • Flaky tests must be fixed to maintain reliability.
  • Mock dependencies for more effective unit testing.

Lecture 3 - Coverage and Fuzzing

Date: January 13, 2025

Control-Flow Graph (CFG, very brief refer to notes for examples)

A CFG is a graphical representation of the flow of control in a program.

  • Nodes: Represent zero or more statements.
  • Edges: Represent possible transitions between statements.

CFG Construction:

  1. Lexing: Converts characters into tokens.
  2. Parsing: Converts tokens into a concrete syntax tree.
  3. Abstract Syntax Tree (AST): Cleans up the concrete syntax tree.
  4. CFG Generation: Converts the AST into a control-flow graph.
  5. Optimizations: Apply optimizations to the CFG.
  6. Code Generation: Convert the CFG into bytecode or machine code.

Basic Blocks

A sequence of instructions with one entry point and one exit point.

  • Maximal Basic Blocks: Group statements that always execute together.

Example: In a loop, the body of the loop forms a basic block.

Statement & Branch Coverage

  • Statement Coverage: Fraction of statements (nodes) executed by the test suite.
  • Branch Coverage: Fraction of branches (edges) taken by the test suite.

Example: Consider the following Python code and test suite:

class Foo:
    def m(self, a, b):
        if a < 0 and b < 0:
            return 4
        elif a < 0 and b > 0:
            return 3
        elif a > 0 and b < 0:
            return 2
        elif a >= 0 and b >= 0:
            return a / b
        raise Exception("I didn't think things through")

And the test suite:

import unittest
from foo import Foo

class CoverageTests(unittest.TestCase):
    def test_one(self):
        f = Foo()
        f.m(1, 2)

    def test_two(self):
        f = Foo()
        f.m(1, -2)

    def test_three(self):
        f = Foo()
        f.m(-1, 2)
  • Coverage Report: The test suite achieves 100% statement coverage but only 75% branch coverage because it doesn't cover the case where both a and b are negative.

Fuzzing

Fuzzing is a technique for finding software bugs by feeding random or semi-random inputs to a program.

Types of Fuzzing:

  1. Mutation-Based Fuzzing: Randomly modifies existing inputs.
  2. Generation-Based Fuzzing: Generates inputs based on a grammar or model.

Fuzzing Workflow

  1. Generate random inputs.
  2. Feed inputs to the program.
  3. Observe crashes, assertion failures, or runtime errors.

Levels of Fuzzing (for C Programs)

  1. Sequence of ASCII characters;
  2. Sequence of words, separators, and white space (gets past the lexer);
  3. Syntactically correct C program (gets past the parser);
  4. Type-correct C program (gets past the type checker);
  5. Statically conforming C program (starts to exercise optimizations);
  6. Dynamically conforming C program;
  7. Model conforming C program.

Fuzzing runs automatically and really works, but without significant work, it won’t find sophisticated domain-specific issues.

Key Takeaways

  • Code Coverage: Use statement and branch coverage to ensure that your tests are comprehensive and target critical paths in your code.

  • Fuzzing: Fuzzing is an effective technique for uncovering bugs, by feeding the program random or semi-random input.

  • Control Flow Graphs: Represent the flow of control in a program, and are used to calculate coverage metrics.


Lecture 4 - Mutation Analysis

Date: January 15, 2025

Mutation analysis helps determine if a test suite is effective by checking whether it detects small changes (mutations) in the code. These changes, called mutants, simulate real-world programming mistakes.

Concept

  • A mutant is a modified version of the original program.
  • A test suite is considered effective if it can kill mutants, meaning the output of the test differs when run on the mutant versus the original.
  • Mutation analysis is essentially fuzzing the test suite to ensure it catches errors.

Example of Mutants

Original Code

int min(int a, int b) {
    int minVal = a;
    if (b < a) {
        minVal = b;
    }
    return minVal;
}

Mutant Variations

minVal = b;       // ∆ 1
if (b > a) {      // ∆ 2
if (b < minVal) { // ∆ 3
BOMB();           // ∆ 4
minVal = a;       // ∆ 5
minVal = failOnZero(b); // ∆ 6
  • Test Case Example: A test suite with (a=0, b=1, expected=0) might detect some mutants but miss others.
  • Equivalent Mutants: Some mutants don’t change behavior, making them undetectable. (∆ 3)

Mutation Analysis Workflow

  1. Generate Mutants - Apply mutation operators (small code changes) to create mutants.
  2. Execute Mutants - Run the test suite on each mutant.
  3. Classify Mutants:
    • Killed: The test suite detects the mutation.
    • Survived: The test suite fails to detect the change.
    • Equivalent: The mutant behaves identically to the original.

Generating Mutants

  • Ground String: A valid program belonging to the programming language grammar.
  • Mutation Operators: Rules that specify syntactic variations of strings generated from a grammar:
    • Arithmetic operator replacement (+-, */)
    • Logical connector replacement (&&||)
    • Relational operator replacement (>>=, <<=)
    • Unary operator deletion (!xx)
    • Statement removal
  • Mutant: The result of applying a mutation operator to a ground string.
  • Mutation Score: The percentage of killed mutants over total mutants.
  • Strong vs. Weak Mutation:
    • Strong: The test must propagate the error to the output.
    • Weak: The test must only cause an internal state change.

Let’s consider mutant ∆1 from above, i.e. we change minVal = a to minVal = b. In this case:

  • reachability: unavoidable;
  • infection: need b != a;
  • propagation: wrong minVal needs to return to the caller; that is, we can’t execute the body of the if statement, so we need b > a.

A test case for strong mutation is therefore a = 5, b = 7 and for weak mutation a = 7, b = 5.

Now consider mutant ∆3, which replaces b < a with b < minVal. This mutant is an equivalent mutant, since a = minVal. (The infection condition boils down to false .)

Key Takeaways

  • Mutation analysis evaluates the effectiveness of test suites.
  • Good test suites kill a high percentage of mutants.
  • Statement and branch coverage alone are weak indicators of test quality.
  • Mutation testing is computationally expensive but provides deep insights into test suite strength.

Lecture 5 - Syntax and Semantics

Date: January 20, 2025

Note: This lecture is a pain in the ass. Consider skipping.

The WHILE Language

A simplified imperative language used for reasoning about code.

Example Program

{ p := 0; x := 1; n := 2 };

while x <= n do {
    x := x + 1;
    p := p + m
};

print_state
  • Statements are separated by ; (not terminated like in C).
  • Variables are implicitly declared.
  • No side effects in expressions—effects occur only in statements.

Syntax Overview

Terminals (Leaf Nodes)

  • n Z: integers
  • true, false B: Boolean values
  • x, y L: variable locations

Non-Terminals (Internal Nodes)

  • a Aexp: arithmetic expressions
  • b Bexp: Boolean expressions
  • s Stmt: statements

Grammar Rules

Arithmetic Expressions

e ::= n | x | -e | e1 aop e2 | (e)

aop ::= + | - | *

Boolean Expressions

b ::= true | false | not b | e1 rop e2 | b1 bop b2 | (b)

rop ::= < | <= | = | >= | >
bop ::= and | or

Statements

s ::= skip | x:=e | if b then s [else s] | while b do s | { slist } | print_state | assert b | assume b | havoc v1, ..., vN

slist ::= s (;s)*

prog ::= slist

Operational Semantics: Big-Step

Defines how program execution transforms a state. A state q is a mapping from variables to values. Judgments are written as:

e,qn \langle e, q \rangle \Downarrow n

Expression e in state q evaluates to value n.

For statements, the notation is:

s,qq\langle s, q \rangle \Downarrow q'

statement s executed in state q results in state q′.

Rules for Arithmetic Expressions (Aexp)

The semantics for arithmetic expressions in WHILE are defined using inference rules. Here are the rules:

1. Constants

n,qn\langle n, q \rangle \Downarrow n
  • A constant nn evaluates to itself.

2. Variables

x,qq(x)\langle x, q \rangle \Downarrow q(x)
  • A variable xx evaluates to its value in the current state qq.

3. Addition

e1,qn1e2,qn2e1+e2,qn1+n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 + e_2, q \rangle \Downarrow n_1 + n_2
  • The sum of two expressions e1e_1 and e2e_2 evaluates to the sum of their values.

4. Subtraction

e1,qn1e2,qn2e1e2,qn1n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 - e_2, q \rangle \Downarrow n_1 - n_2
  • The difference of two expressions e1e_1 and e2e_2 evaluates to the difference of their values.

5. Multiplication

e1,qn1e2,qn2e1e2,qn1n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 * e_2, q \rangle \Downarrow n_1 * n_2
  • The product of two expressions e1e_1 and e2e_2 evaluates to the product of their values.

Rules for Boolean Expressions (Bexp)

Boolean expressions are built on top of arithmetic expressions. Here are the rules:

1. True and False

true,qtrue\langle \text{true}, q \rangle \Downarrow \text{true} false,qfalse\langle \text{false}, q \rangle \Downarrow \text{false}
  • The constants true and false evaluate to themselves.

2. Equality

e1,qn1e2,qn2e1=e2,qn1=n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 = e_2, q \rangle \Downarrow n_1 = n_2
  • The equality of two expressions e1e_1 and e2e_2 evaluates to whether their values are equal.

3. Less Than or Equal

e1,qn1e2,qn2e1e2,qn1n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 \le e_2, q \rangle \Downarrow n_1 \leq n_2
  • The less-than-or-equal comparison of two expressions e1e_1 and e2e_2 evaluates to whether n1n_1 is less than or equal to n2n_2.

4. Logical AND

e1,qn1e2,qn2e1 and e2,qn1n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 \text{ and } e_2, q \rangle \Downarrow n_1 \land n_2
  • The logical AND of two expressions e1e_1 and e2e_2 evaluates to the logical conjunction of their values.

5. Logical OR

e1,qn1e2,qn2e1 or e2,qn1n2\langle e_1, q \rangle \Downarrow n_1 \quad \langle e_2, q \rangle \Downarrow n_2 \over \langle e_1 \text{ or } e_2, q \rangle \Downarrow n_1 \lor n_2
  • The logical OR of two expressions e1e_1 and e2e_2 evaluates to the logical disjunction of their values.

Rules for Statements

In big-step semantics, statements transform the state of the program. The notation is:

s,qq\langle s, q \rangle \Downarrow q'
  • ss: The statement to be executed.
  • qq: The initial state.
  • qq': The final state after executing ss.

Notation

  • Empty state: [][]
  • Example state: [x:=10;y:=15;z:=5][x := 10; y := 15; z := 5]
  • Substitution: q[x:=10]q[x := 10] (yields a state like qq but with xx bound to 1010).

Rules for Evaluating Statements

Statements have side effects because they modify the state rather than directly returning a value. Here are the rules for evaluating statements in WHILE:

1. Skip Statement

skip,qq\langle \text{skip}, q \rangle \Downarrow q
  • The skip statement does nothing. It simply returns the same state qq.

2. Print State

printstate,qq\langle \text{printstate}, q \rangle \Downarrow q
  • The print_state statement does not modify the state but may have the side effect of printing the state.

3. Statement Composition

s1,qqs2,qqs1;s2,qq\langle s_1, q \rangle \Downarrow q'' \quad \langle s_2, q'' \rangle \Downarrow q' \over \langle s_1; s_2, q \rangle \Downarrow q'
  • The composition s1;s2s_1; s_2 executes s1s_1 first, resulting in state qq'', and then executes s2s_2 in state qq'', resulting in the final state qq'.

4. Assignment

e,qnx:=e,qq[x:=n]\langle e, q \rangle \Downarrow n \over \langle x := e, q \rangle \Downarrow q[x := n]
  • The assignment statement x:=ex := e evaluates the expression ee to nn and updates the state qq so that xx is bound to nn.

5. If-Then-Else Statement

b,qtrues1,qqif b then s1 else s2,qq\langle b, q \rangle \Downarrow \text{true} \quad \langle s_1, q \rangle \Downarrow q' \over \langle \text{if } b \text{ then } s_1 \text{ else } s_2, q \rangle \Downarrow q'
  • If the condition bb evaluates to true, the if statement executes s1s_1 and results in state qq'.
b,qfalses2,qqif b then s1 else s2,qq\frac{\langle b, q \rangle \Downarrow \text{false} \quad \langle s_2, q \rangle \Downarrow q'}{\langle \text{if } b \text{ then } s_1 \text{ else } s_2, q \rangle \Downarrow q'}
  • If the condition bb evaluates to false, the if statement executes s2s_2 and results in state qq'.

Example of Evaluating Statements

Consider the compound statement:

p:=0;x:=1;n:=2p := 0; x := 1; n := 2

We can derive its execution as follows:

  1. Evaluate p:=0p := 0 in the empty state [][]:

    0,[]0p:=0,[][p:=0]\langle 0, [] \rangle \Downarrow 0 \over \langle p := 0, [] \rangle \Downarrow [p := 0]
  2. Evaluate x:=1x := 1 in the state [p:=0][p := 0]:

    1,[p:=0]1x:=1,[p:=0][p:=0,x:=1]\langle 1, [p := 0] \rangle \Downarrow 1 \over \langle x := 1, [p := 0] \rangle \Downarrow [p := 0, x := 1]
  3. Evaluate n:=2n := 2 in the state [p:=0,x:=1][p := 0, x := 1]:

    2,[p:=0,x:=1]2n:=2,[p:=0,x:=1][p:=0,x:=1,n:=2]\langle 2, [p := 0, x := 1] \rangle \Downarrow 2 \over \langle n := 2, [p := 0, x := 1] \rangle \Downarrow [p := 0, x := 1, n := 2]
  4. The final state after executing the compound statement is:

    p:=0;x:=1;n:=2,[][p:=0,x:=1,n:=2]\langle p := 0; x := 1; n := 2, [] \rangle \Downarrow [p := 0, x := 1, n := 2]

Semantics of Loops

The semantics of the while statement are defined in two parts:

1. Loop Condition is False

b,qfalsewhile b do s,qq\frac{\langle b, q \rangle \Downarrow \text{false}}{\langle \text{while } b \text{ do } s, q \rangle \Downarrow q}
  • If the loop condition bb evaluates to false, the while statement is equivalent to skip and does not modify the state.

2. Loop Condition is True

b,qtrues;while b do s,qqwhile b do s,qq\frac{\langle b, q \rangle \Downarrow \text{true} \quad \langle s; \text{while } b \text{ do } s, q \rangle \Downarrow q'}{\langle \text{while } b \text{ do } s, q \rangle \Downarrow q'}
  • If the loop condition bb evaluates to true, the body ss is executed once, and then the while loop is executed again in the new state.

Non-Terminating Loops

For infinite loops, we introduce a special state \top (read as "top") to represent divergence:

while true do s,q\langle \text{while true do } s, q \rangle \Downarrow \top
  • An infinite loop enters the state \top, and any statement executed in this state remains in \top:

    s,\langle s, \top \rangle \Downarrow \top
  • But we don’t get any insight into what happens. Thats where small-step semantics come in to deal with programs that don’t terminate, but produce useful side effects while they are executing.

Deterministic Semantics

A semantics is deterministic if every program statement has at most one possible derivation in any state:

if s,qq1 and s,qq2 then q1=q2.\text{if } \langle s, q \rangle \Downarrow q_1 \text{ and } \langle s, q \rangle \Downarrow q_2 \text{ then } q_1 = q_2.

Two statements are semantically equivalent if they produce the same output state given the same input state:

s1 and s2 are semantically equivalent if q.s1,qq1 and s2,qq2 implies q1=q2.s_1 \text{ and } s_2 \text{ are semantically equivalent if } \forall q. \langle s_1, q \rangle \Downarrow q_1 \text{ and } \langle s_2, q \rangle \Downarrow q_2 \text{ implies } q_1 = q_2.

For example, unrolling a while loop yields a semantically equivalent statement:

while b do s is equivalent to if b then (s;while b do s) else skip.\text{while } b \text{ do } s \text{ is equivalent to } \text{if } b \text{ then } (s; \text{while } b \text{ do } s) \text{ else skip}.

Structural Induction

To prove a property PP on all possible derivation trees (e.g., for semantics), we use structural induction:

  • Base case: Prove PP for all axioms.
  • Inductive case: Assume PP holds for all smaller derivation trees and show that PP continues to hold when applying each possible rule once.

For example, we can prove that the semantics are deterministic using structural induction.

Operational Semantics: Small-Step

Describes intermediate execution steps instead of final results.

Judgments are written as:

s,qt,q \langle s, q \rangle \Rightarrow \langle t, q' \rangle
  • ss: The current program (or statement) to be executed.
  • qq: The current state (or store) of the program.
  • tt: The remaining program (or statement) after executing one step of ss.
  • qq': The new state after executing one step of ss.

If there is no more program left to execute after ss, the notation simplifies to:

s,qq\langle s, q \rangle \Rightarrow q'

Essentially peeling off the first statement from s, which we execute, and we have a remaining program t to execute

Rules in Small-Step Semantics

The rules provided in the section describe how different constructs in the WHILE language are executed step-by-step:

1. Skip Statement

skip,qq\langle \text{skip}, q \rangle \Rightarrow q
  • The skipskip statement does nothing. It simply transitions to the same state qq with no further program to execute.

2. Sequential Composition

s1,qqs1;s2,qs2,q\langle s_1, q \rangle \Rightarrow q' \over \langle s_1; s_2, q \rangle \Rightarrow \langle s_2, q' \rangle
  • If executing s1s_1 in state qq results in state qq', then the program s1;s2s_1; s_2 (which means "execute s1s_1 followed by s2s_2") reduces to s2s_2 in state qq'.

Alternatively, if s1s_1 reduces to another statement s3s_3:

s1,qs3,qs1;s2,qs3;s2,q\langle s_1, q \rangle \Rightarrow \langle s_3, q' \rangle \over \langle s_1; s_2, q \rangle \Rightarrow \langle s_3; s_2, q' \rangle
  • Here, s1s_1 is not fully executed yet, so the remaining program is s3;s2s_3; s_2.

3. If-Then-Else Statement

b,qtrue if b then s1 else s2,qs1,q\langle b, q \rangle \Downarrow \text{true} \over \langle \text{ if } b \text{ then } s_1 \text{ else } s_2, q \rangle \Rightarrow \langle s_1, q \rangle
  • If the condition bb) evaluates to true, then the if statement reduces to s1s_1 (the "then" branch).

Similarly, if bb evaluates to false:

b,qfalseif b then s1 else s2,qs2,q\frac{\langle b, q \rangle \Downarrow \text{false}}{\langle \text{if } b \text{ then } s_1 \text{ else } s_2, q \rangle \Rightarrow \langle s_2, q \rangle}
  • If the condition bb evaluates to false, the if statement reduces to s2s_2 (the "else" branch).

4. While Loop

while b do s,qif b then (s;while b do s) else skip,q\langle \text{while } b \text{ do } s, q \rangle \Rightarrow \langle \text{if } b \text{ then } (s; \text{while } b \text{ do } s) \text{ else skip}, q \rangle
  • The while loop is transformed into an if statement. If the condition bb is true, the loop body ss is executed, followed by the while loop again. If bb is false, the loop reduces to skip.

Example of Small-Step Execution

Consider the program:

while x>0 do x:=x1\text{while } x > 0 \text{ do } x := x - 1

In small-step semantics, the execution would proceed as follows:

  1. The while loop is transformed into an if statement:

    while x>0 do x:=x1,qif x>0 then (x:=x1;while x>0 do x:=x1) else skip,q\langle \text{while } x > 0 \text{ do } x := x - 1, q \rangle \Rightarrow \langle \text{if } x > 0 \text{ then } (x := x - 1; \text{while } x > 0 \text{ do } x := x - 1) \text{ else skip}, q \rangle
  2. If x>0x > 0 is true, the program reduces to:

    x:=x1;while x>0 do x:=x1,q\langle x := x - 1; \text{while } x > 0 \text{ do } x := x - 1, q \rangle
  3. The assignment x:=x1x := x - 1 is executed, updating the state qq, and the loop continues.

This step-by-step reduction continues until x>0x > 0 becomes false, at which point the loop reduces to skip.

Assertions, Assumptions, and Specification

  • assert: Ensures a condition holds at a program point.
  • assume: Restricts program execution to cases where a condition is true.
  • havoc: Assigns arbitrary values to variables, modeling unknown inputs.

Example (Symbolic Execution)

havoc(x);
assume(x > 10);
y = x + 1;
assert(y > x);
assert(y < 200);
  • Ensures y > x always holds but does not guarantee y < 200.

Key Takeaways

  • Big-Step Semantics: Describes how a program executes in a single step.
  • Small-Step Semantics: Defines transitions between intermediate execution states.
  • Assertions & Assumptions: Help model program correctness and symbolic execution.

Lecture 6 - Automated Test Case Generation

Date: No Clue

To generate interesting inputs:

  • Black box - truly random
  • Grey Box - guided by code coverage
  • White box - guided by code structure

Properties for generic automatic oracles:

  • Don't crash
  • Dont mis-use memory
  • User written assertions
  • Domain specific oracles

Valgrind

Inpsects CPU instructions and inserts low-level checks. Execution of every instruciton is interpreted in a sandbox and reported if bad behaviour detected.

  • Pros: Detailed analysis.
  • Cons: 10x slower in performance.

Address Sanitizer

A runtime libary supported by Clang and GCC that provides compile-time instrumentation. Often used in production code.

  • Major part of automated test-case generation validation.

Key Idea: Instrument all memory accesses.

  • Compiler insturments store/load instructions with a check if memory is accesible.
    • Must be super efficient!
    • Meta information about memory is stored.

Example:

OriginalInstrumented
*addr = e;
if (IsPoisoned(addr))
    ReportError(addr, sz, true);
*addr = e;

Memory Mapping

Virtual memory has 2 classes:

  • Mem is normal application memory.
  • Shadow is used to store metadata about main memory.

Poisoning a byte addr of Mem means writing a special value to corresponding place in Shadow

Mem and Shadow should be organized such that mapping is super fast.

// instrument access to addr:
shadow_addr = MemToShadow(addr);
if (ShadowIsPoisoned(shadow_addr)) {
ReportError(addr, sz, kIsWrite);
}

Memory Alignment

Process memory is divided into 8 byte words, called QWORDs.

  • Heap and Stack allocs are at a QWORD boundary.
  • This is called alignment (# of bytes differs on architecture).

Depending on the architecture unaligned memory accesses are expensive.

  • Compilers optimize the code so most accesses are aligned

Stae of an allocated QWORD

AddressSanitizer maps each QWORD of Mem into one byte of Shadow.

Each QWORD can be in one of 9 states:

  • All 8 bytes are accessible (not poisoned). Shadow value is 0.
  • All 8 bytes are inaccessible (poisoned). Shadow value is negative (< 0).
  • First k bytes are accessible, the rest 8-k bytes are not, 0 < k < 8. Shadow is k.

Instumentation Example

byte *shadow_addr = addr >> 3 + 0x20000000;
byte shadow_value = *shadow_addr;

if (shadow_value < 0) ReportError(addr, sz, kIsWrite);
else if (shadow_value) {
  if (SlowPathCheck(shadow_value, addr, sz)) {
    ReportError(addr, sz, kIsWrite);
  }
}

bool SlowPathCheck(shadow_value, addr, sz) {
  last_accessed_byte = ((addr & 7) + (sz  1)) & 7;
  // returns if the acessed byte is poisoned
  return (last_accessed_byte >= shadow_value);
}

Marking Allocation Boundaries with Redzones

Change heap allocator to mark boundaries of allocated segments

  • The markers are called redzones
  • All calls to malloc() are replaced with calls to __asan_malloc()
void *__asan_malloc(size_t sz) {
    // Allocate & poison left redzone
    void *rz = malloc(RED_SZ);
    Poison(rz, RED_SZ);
    
    //Allocate usable memory
    void *addr = malloc(sz);
    UnPoison(addr, sz);
    
    // Allocate & poison right redzone
    rz = malloc(RED_SZ);
    Poison(rz, RED_SZ);
    // Return usable memory
    return addr;
}

Key Takeaways

  • Address Sanitizer instruments memory accesses to detect errors.
  • Memory Mapping and Memory Alignment are key concepts in Address Sanitizer.
  • Redzones are used to mark allocation boundaries.

Lecture 7 - Symbolic Execution

Date: January 27, 2025

Symbolic execution is a technique used to automatically generate test suites that:

  • Achieve full branch coverage (or even full path coverage).
  • Identify dead code.
  • Discover potential division by zero errors.

Steps in Symbolic Execution:

  1. Transform the program to add explicit tests for division by zero (oracles).
  2. Traverse the program to compute each program path.
  3. Solve constraints for each path using a constraint solver (e.g., z3).
  4. Run the program on tests generated by the previous step.

Example Program:

def Foo(x, y):
    """ 
    requires: x and y are int
    ensures: returns floor(max(x, y) / min(x, y))
    """
    if x > y:
        return x / y
    else:
        return y / x

Transformed Program:

def Foo(x, y):
    """
    requires: x and y are int
    ensures: returns floor(max(x, y) / min(x, y))
    """
    if x > y:
        assert y != 0
        return x / y
    else:
        assert x != 0
        return y / x

Main Components of Symbolic Execution

Traversing: automatically explores program paths by executing the program on symbolic input values.

  • Forks execution at each branch and records branching conditions.

Solving Constraints: decides path feasibility, and generating test cases to get paths and to find bugs.

Example Paths:

  1. x>y,y==0x > y, y == 0: Assertion fails.
  2. x>y,y!=0x > y, y != 0: Returns x / y.
  3. x<=y,x==0x <= y, x == 0: Assertion fails.
  4. x<=y,x!=0x <= y, x != 0: Returns y / x.

Symbolic Execution Example

See L07 pdf.

Defining Symbolic Execution

  • Symbolic values: Stand in for input variables instead of using specific values.
  • Path conditions: Constraints that must hold for a given execution path to be feasible.
  • Symbolic state: Pair S=(Env,pc)S = (Env, pc), where:
    • Env maps program variables to symbolic expressions.
    • pc is a first-order logic formula.

Checking Symbolic State Satisfiability

A concrete state MM satisfies symbolic state SS if:

M(Env,pc)    vL,M(v)=Env(v) and pc is SATM \models (Env, pc) \iff \forall v \in L, M(v) = Env(v) \text{ and } pc \text{ is SAT}

Example:

Env={xX,yY}pc=(X>5)(Y<3)\text{Env} = \{ x \mapsto X, y \mapsto Y \} \text{pc} = (X > 5) \wedge (Y < 3)

For concrete state [x 10, y 1], we check:

(10>5)(1<3)Satisfies S(10 > 5) \wedge (1 < 3) \Rightarrow \text{Satisfies S}

For [x 1, y 10], we get:

(1>5)(10<3)Does not satisfy S(1 > 5) \wedge (10 < 3) \Rightarrow \text{Does not satisfy S}

Another Symbolic State Example

Env={xX+Y,yYX}pc=2XY>0\text{Env} = \{ x \mapsto X + Y, y \mapsto Y - X \} \text{pc} = 2X - Y > 0

Checking for [x 10, y 2]:

10=X+Y,2=YXX=4,Y=6(2(4)6>0)Satisfies S10 = X + Y, 2 = Y - X \Rightarrow X = 4, Y = 6 (2(4) - 6 > 0) \Rightarrow \text{Satisfies S}

For [x 2, y 10], solving gives X = -8, Y = 6:

(2(8)6>0)Does not satisfy S(2(-8) - 6 > 0) \Rightarrow \text{Does not satisfy S}

Finding Bugs Using Symbolic Execution

Finding a bug requires finding the conditions that trigger it. Bugs include:

  • assertion failures
  • buffer overflows
  • division by zero

Assertions are rewritten as conditionals to create explicit error paths. For example:

assert(x != NULL);

becomes:

if (x == NULL) error();

This ensures that symbolic execution explores error paths and determines whether an error is reachable.

Instrumenting Code for Bug Detection

Symbolic execution can inject runtime checks, similar to sanitizers. Examples include:

y = 100 / x;
// becomes 
assert(x != 0);
y = 100 / x;
a[x] = 10;
// becomes
assert(x >= 0 && x < len(a));
a[x] = 10;

Problems of Classical Symbolic Execution

Symbolic execution is powerful but has limitations:

  • Constraint Complexity: Some generated constraints may be too complex for SMT solvers to handle efficiently.

  • Path Explosion: The number of paths grows exponentially with program size, making full analysis infeasible.

  • Real-World Code Challenges: Programs interact with:

    • Pointers and data structures
    • Files, databases, and networks
    • Threads and concurrency

To address these challenges, dynamic symbolic execution will be introduced in the next lecture.


Faseeh Irfan