Chapter 8: Introduction to Model Checking
8.1 The Dream of Automated Verification: Letting Computers Prove Correctness
Human Limits and Machine Capability
As software systems become more complex, it becomes increasingly difficult to guarantee correctness through human intuition and experience alone. Developers write code, reviewers inspect it, and testers execute it, but subtle bugs and unexpected interactions still slip through.
Model checking complements these cognitive limits with computation. It systematically explores all possible execution paths and automatically determines whether a specified property holds. In other words, it is a verification method whose exhaustiveness is beyond human capacity.
The key innovation of model checking is the automation of proof. In traditional settings, proving system correctness was the work of specialists with strong mathematical intuition. In model checking, once the system model and the properties of interest are described, the computer carries out the proof attempt automatically and either confirms the property or produces a counterexample.
The Power of Exhaustive Exploration
The basic idea behind model checking is brute-force exploration of the system’s state space. At first glance this may appear inefficient, but in practice it is extremely powerful.
Chess and shogi programs defeat human champions for essentially the same reason. Humans rely on intuition and experience, whereas computers examine huge numbers of possible moves and select the best outcome. In software verification, the same difference appears: where a person says “this is probably correct,” a model checker can say either “this is correct within the model” or “here is a counterexample.”
Automating Formal Verification
Before model checking became practical, formal verification relied primarily on manual theorem proving. Experts advanced one proof obligation at a time through mathematical reasoning. The method was theoretically strong, but it had clear practical limits:
- Demand for expertise: advanced mathematical background was required
- Time and cost: large proofs often took a long time
- Human fallibility: hand-written proofs can themselves contain mistakes
- Scalability limits: large systems were difficult to handle
Model checking changed the situation fundamentally. A team no longer needed to master every proof detail. If the system model and target properties could be described precisely, verification became possible.
The Educational Value of Counterexamples
One of the most valuable features of model checking is the counterexample it provides when a property does not hold. Instead of returning only an abstract “incorrect,” it explains under what circumstances the problem occurs.
For system designers, counterexamples are highly actionable:
- Concrete problem framing: an abstract issue becomes a concrete scenario
- Cause identification: the root cause can be traced
- Repair guidance: the failure point becomes visible
- Deeper understanding: system behavior becomes easier to reason about
Improving Quality Through Early Detection
Model checking can discover problems early in development. By checking designs or specifications before implementation, teams can reduce the number of defects found only after code is written.
The cost of fixing a bug generally rises the later it is discovered. Common industry estimates say that if a defect found in requirements costs 1 unit to fix, the same defect may cost 5 units in design, 10 in implementation, 20 in testing, and over 100 in production. The actual multiplier depends on the organization, domain, and process, but the direction is consistent.
Early defect detection through model checking is therefore not only a technical advantage but also a cost-control mechanism.
Understanding the Limits Correctly
Model checking is powerful, but it is not universal. Using it effectively requires understanding its limits:
- The state-explosion problem: the number of states can grow exponentially, making verification difficult
- The need for abstraction: real systems must be abstracted into finite models
- The difficulty of property specification: teams need skill in writing formal properties
- Possible false positives and false negatives: abstraction can introduce behaviors that do not exist or hide ones that do
Practical model checking depends on choosing an appropriate abstraction level and a verification strategy that matches the problem.
Industrial Success Stories
Model checking is no longer only an academic technique. It has a substantial record in industry:
- Intel uses formal verification in processor design and has found bugs beyond the reach of conventional simulation
- Microsoft has applied verification to Windows drivers and reduced serious failures such as blue-screen crashes
- Amazon is well known for using TLA+ and model-checking-oriented thinking when designing distributed systems
- Airbus uses formal techniques to support assurance arguments for safety-critical control systems
These cases show that model checking has matured from a research method into a practical engineering tool.
Connecting to the Alloy Models in Chapter 4
The Alloy modeling you studied in Chapter 4 is an excellent introduction to
model checking. Alloy’s check command is, in essence, a small-scale form of
model checking:
Pseudo notation (the Chapter 4 model definitions are omitted, so this fragment cannot be executed as-is):
// Revisiting the email-system example from Chapter 4
assert NoUnauthorizedAccess {
  all u: User, e: Email |
    (some e.confidential and
     u != e.sender and u not in e.recipients and
     AdminAccess not in u.roles.permissions)
    implies
      not canReadEmail[u, e]
}
check NoUnauthorizedAccess for 5 User, 5 Email, 3 Role
In this check operation, Alloy explores all possible states within the given
scope and searches for a counterexample. That is the same core idea you study
in this chapter.
Democratizing Formal Verification
One of model checking’s most important contributions is the democratization of formal verification. What used to be accessible only to a small number of specialists has become a comparatively learnable tool for a broader range of engineers.
That shift changes formal methods from a niche academic technique into a practical development tool. Over time, model checking is likely to become a standard part of software quality assurance.
8.2 State Spaces and How to Explore Them
State Space as a Conceptual Framework
A state space is the set of all states a system can possibly take. This mathematical abstraction lets us understand even complex software as a finite or countably infinite collection of points.
Consider a running program. The program counter, variable values, memory contents, input/output status, and other runtime information together form the current state. Execution can then be understood as a trajectory through that state space.
This idea is analogous to phase spaces in physics and other mathematical representations of dynamical systems. By recasting dynamic behavior as a static structure, we gain something we can analyze.
Structure of a State-Transition Graph
The structure of a state space is usually visualized as a state-transition graph. In such a graph, nodes represent states and edges represent transitions.
Pseudo notation:
Initial state -> State 1 -> State 2 -> State 3
      |            |          |          |
   State A   -> State B  -> State C  -> State D
      |            |          |          |
   State X   -> State Y  -> State Z  -> Final state
This structure lets us reason about important system properties:
- Reachability: can one state reach another?
- Cyclicity: does the system contain persistent loops?
- Deadlock: is there a state from which no progress is possible?
- Liveness: is some kind of forward movement always possible?
Exploding State Counts
In real software systems, the number of states grows explosively. This is the best-known challenge in model checking: the state-explosion problem.
Consider a simple example with three Boolean variables:
a, b, c ∈ {true, false}
- Number of possible states = 2^3 = 8
If the system has ten Boolean variables:
- Number of possible states = 2^10 = 1,024
With 32 Boolean variables:
- Number of possible states = 2^32 ≈ 4.3 × 10^9
Real programs also contain integers, arrays, pointers, buffers, counters, and
various control states. A single 64-bit integer variable already allows
2^64 ≈ 1.8 × 10^19 values.
Reachability Analysis
Not every state is reachable during actual execution. Model checking therefore focuses on reachable states, that is, states reachable from the initial state. This reduces the search space to some extent.
Pseudo notation:
Reachable <- {InitialStates}
Frontier <- {InitialStates}
while Frontier != {} do
  NewStates <- {}
  for each state s in Frontier do
    for each successor t of s do
      if t not in Reachable then
        NewStates <- NewStates union {t}
        Reachable <- Reachable union {t}
  Frontier <- NewStates
This is breadth-first exploration of the reachable state space.
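The pseudo notation above translates directly into a small runnable sketch. Python is used here purely for illustration, and the toy transition function (a counter modulo 8) is a hypothetical example, not one of the chapter's models.

```python
from collections import deque

def reachable_states(initial_states, successors):
    """Breadth-first exploration of the reachable state space.

    initial_states: iterable of hashable states
    successors: function mapping a state to its successor states
    """
    reachable = set(initial_states)
    frontier = deque(reachable)
    while frontier:
        s = frontier.popleft()
        for t in successors(s):
            if t not in reachable:
                reachable.add(t)    # mark t as visited
                frontier.append(t)  # and schedule it for expansion
    return reachable

# Toy system: a counter that increments modulo 8, starting from 0.
print(sorted(reachable_states([0], lambda s: [(s + 1) % 8])))
# [0, 1, 2, 3, 4, 5, 6, 7]
```

Because the `reachable` set also serves as the visited-state store, each state is expanded exactly once even though the transition relation is cyclic.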
Depth-First Versus Breadth-First Search
State-space exploration typically uses one of two basic strategies.
Depth-first search (DFS):
- Advantage: low memory usage
- Disadvantage: may fail to find the shortest counterexample
- Typical use: cycle detection and deadlock detection
Breadth-first search (BFS):
- Advantage: finds the shortest counterexample
- Disadvantage: high memory usage
- Typical use: minimum-step reachability analysis
Practical tools often switch strategies or combine them depending on the verification objective.
Efficient State Representation
Representing and managing huge numbers of states efficiently is essential for practical model checking.
Bit-vector representation:
Pseudo notation:
State = [a:1, b:0, c:1, counter:5]
-> Bit vector = 101 00101
Hash tables: Visited states are typically stored in hash tables to support fast lookup.
Compression: Similar states or infrequent patterns can sometimes be compressed to save memory.
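A minimal sketch of the bit-vector idea, assuming a state made of three flags and a small counter. The bit layout chosen below is an illustrative choice, not a standard encoding, and it does not reproduce the exact vector from the pseudo notation.

```python
def encode_state(a, b, c, counter):
    """Pack a state into one integer bit vector.

    Illustrative layout: bit 0 = a, bit 1 = b, bit 2 = c,
    bits 3..6 = counter (0..15).
    """
    assert 0 <= counter < 16
    return (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | (counter << 3)

def decode_state(bits):
    """Inverse of encode_state."""
    return bits & 1, (bits >> 1) & 1, (bits >> 2) & 1, bits >> 3

# Visited states live in a Python set, which is a hash table:
visited = set()
visited.add(encode_state(1, 0, 1, 5))
print(encode_state(1, 0, 1, 5) in visited)  # True
```

Packing the whole state into one machine word makes both the hash-table lookup and the memory footprint per state cheap, which is exactly what large explorations need.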
Partial Order Reduction
In concurrent systems, different execution orders can lead to the same result. Partial order reduction exploits this redundancy.
Suppose two operations A and B are independent:
- execution order A -> B
- execution order B -> A
If both produce the same outcome, it is unnecessary to explore both. By keeping only a representative ordering, the tool reduces the state count.
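Independence can be checked mechanically: two operations commute at a state when applying them in either order yields the same resulting state. The operations below are hypothetical examples, not part of any tool API.

```python
def op_a(state):
    """Increment x; touches only x."""
    s = dict(state); s["x"] += 1; return s

def op_b(state):
    """Increment y; touches only y."""
    s = dict(state); s["y"] += 1; return s

def op_c(state):
    """Double x; conflicts with op_a because both write x."""
    s = dict(state); s["x"] *= 2; return s

def independent(f, g, state):
    """f and g are independent at this state iff their order does not matter."""
    return f(g(state)) == g(f(state))

print(independent(op_a, op_b, {"x": 0, "y": 0}))  # True: one order suffices
print(independent(op_a, op_c, {"x": 1, "y": 0}))  # False: both orders matter
```

Real partial-order-reduction algorithms establish independence statically (from read/write sets) rather than by executing both orders, but the commutation condition they certify is the one shown here.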
Symbolic Model Checking
Traditional explicit-state model checking stores states one by one. Symbolic model checking represents sets of states with logical formulas, often using BDD (Binary Decision Diagram) structures.
Pseudo notation:
State set {(a=1,b=0), (a=1,b=1), (a=0,b=1)}
-> BDD form: (a and not b) or (a and b) or (not a and b)
-> Reduced: a or (not a and b) = a or b
When the structure is favorable, symbolic methods can represent extremely large state sets compactly.
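The core idea can be sketched without a real BDD library: represent a state set as a Boolean predicate instead of an explicit enumeration, and turn set operations into logic operations. This is a conceptual illustration only, not an efficient BDD implementation.

```python
from itertools import product

# Explicit representation: enumerate the (a, b) pairs in the set.
explicit = {(1, 0), (1, 1), (0, 1)}

# Symbolic representation: the same set as the formula "a or b".
symbolic = lambda a, b: bool(a or b)

# Both representations agree on every variable assignment:
for a, b in product((0, 1), repeat=2):
    assert ((a, b) in explicit) == symbolic(a, b)

# Set intersection becomes conjunction of the formulas:
other = lambda a, b: bool(a and not b)                # the set {(1, 0)}
intersection = lambda a, b: symbolic(a, b) and other(a, b)
print([p for p in product((0, 1), repeat=2) if intersection(*p)])  # [(1, 0)]
```

The formula `a or b` costs the same to store whether it describes 3 states over 2 variables or an astronomically large set over 100 variables; that independence from set size is the source of symbolic model checking's power.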
Iterative Deepening
To balance completeness and memory constraints, tools sometimes use iterative deepening.
Pseudo notation:
for depth = 1 to MaxDepth do
  result = DepthLimitedSearch(depth)
  if result = "property violation" then
    return counterexample
  if result = "complete" then
    return "property verified"
This expands the search horizon gradually while keeping memory under control.
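The loop above can be sketched as runnable code. For brevity this sketch omits the "complete" case from the pseudo notation, so it can only report a counterexample or exhaust the maximum depth; the toy system (a counter modulo 8) is a hypothetical example.

```python
def depth_limited_violation(state, successors, bad, depth, path=None):
    """Return a path to a bad state within `depth` steps, else None."""
    path = (path or []) + [state]
    if bad(state):
        return path
    if depth == 0:
        return None
    for t in successors(state):
        found = depth_limited_violation(t, successors, bad, depth - 1, path)
        if found:
            return found
    return None

def iterative_deepening(initial, successors, bad, max_depth):
    """Expand the search horizon one step at a time."""
    for depth in range(max_depth + 1):
        trace = depth_limited_violation(initial, successors, bad, depth)
        if trace:
            return trace  # a shortest-possible counterexample at this depth
    return None  # no violation found within max_depth

# Toy system: counter modulo 8; "bad" means reaching the value 5.
print(iterative_deepening(0, lambda s: [(s + 1) % 8], lambda s: s == 5, 10))
# [0, 1, 2, 3, 4, 5]
```

Each round re-explores shallow states, but because state counts typically grow geometrically with depth, the repeated work stays a small fraction of the total, while memory stays proportional to the current depth.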
Parallel and Distributed Exploration
Modern multi-core processors and distributed environments make it possible to parallelize state-space exploration.
Work-stealing:
Pseudo notation:
Each thread keeps its own stack.
When one thread runs out of work,
it steals work from another thread.
Distributed hash tables: States are partitioned across multiple machines and explored cooperatively over the network.
These techniques make it possible to verify systems that were previously too large to handle.
8.3 Describing Properties with Temporal Logic
The Challenge of Expressing Dynamic Properties
The correctness of a software system cannot be judged from a single snapshot. What matters is how the system behaves over time. Properties such as “every request is eventually answered,” “deadlock never occurs,” and “resources are distributed fairly” all concern dynamic behavior.
Temporal logic is the language used to describe such behavior precisely. Ordinary logic talks about the current state. Temporal logic talks about how states evolve over time.
CTL: The World of Computation Trees
CTL (Computation Tree Logic) is one of the most widely used logics in model checking. It treats execution as a computation tree: each node is a state, each edge is a transition, and the branching structure represents alternative futures.
Pseudo notation:
              Initial state
             /      |      \
       State 1   State 2   State 3
       /  \      /  \      /  \
     ...  ...  ...  ...  ...  ...
CTL formulas combine path quantifiers with temporal operators.
Path quantifiers:
- A (All paths): for all execution paths
- E (Exists path): for at least one execution path
Temporal operators:
- X (neXt): in the next state
- F (Future): at some point in the future
- G (Globally): always
- U (Until): until
Practical CTL Examples and Their Intuition
Safety:
Pseudo notation:
AG(!error)
“Along all execution paths, an error never occurs.”
Liveness:
Pseudo notation:
AG(request -> AF(response))
“Along all execution paths, whenever a request exists, a response eventually occurs.”
Reachability:
Pseudo notation:
EF(goal_achieved)
“There exists an execution path on which the goal is eventually reached.”
Fairness-oriented intuition:
Pseudo notation:
AG(EF(process1_scheduled) & EF(process2_scheduled))
“At every point, there remains a future in which both process 1 and process 2 can eventually be scheduled.”
LTL: The View of Linear Time
LTL (Linear Temporal Logic) takes a different perspective. It treats execution as a single line of time rather than a branching tree.
Basic LTL operators:
- ○ (Next): at the next moment
- ◇ (Eventually): eventually
- □ (Always): always
- U (Until): until
Examples of LTL formulas:
Pseudo notation:
□(request -> ◇response)
"Always, if there is a request, there is eventually a response."
□◇(process_scheduled)
"The process is scheduled infinitely often."
◇□(stable)
"Eventually the system becomes stable and stays stable thereafter."
Differences in Expressive Power Between CTL and LTL
CTL and LTL are suited to different classes of properties.
A property that CTL expresses well:
Pseudo notation:
AG(EF(reset_possible))
“It is always the case that there exists a path to a state in which reset is possible.”
This cannot be expressed directly in LTL.
A property that LTL expresses well:
Pseudo notation:
□(grant1 -> (!grant2 U release1))
“Once process 1 is granted access, process 2 is not granted access until process 1 releases it.”
This kind of ordering-oriented mutual exclusion property is much more natural in LTL than in CTL.
Practical Property Patterns
Several property patterns appear again and again in real verification tasks.
Mutual exclusion:
Pseudo notation:
AG(!(critical1 & critical2))
“The two processes are never in their critical sections at the same time.”
Deadlock freedom:
Pseudo notation:
AG(EX(true))
“From every reachable state, there is at least one next state.”
Starvation freedom:
Pseudo notation:
□(waiting -> ◇served)
“A waiting process is eventually served.”
Ordering constraints:
Pseudo notation:
□(event_a -> ○(!event_b U event_c))
“After event a, event b must not occur until event c occurs.”
Hierarchical Structure of Properties
Complex systems benefit from organizing properties hierarchically.
Level 1: Basic state validity:
Pseudo notation:
AG(valid_state)
“The system is always in a valid state.”
Level 2: Protocol correctness:
Pseudo notation:
AG(message_sent -> AF(ack_received))
“Every sent message is eventually acknowledged.”
Level 3: Performance-oriented properties:
Pseudo notation:
AG(request -> AF<=n(response))
“A response arrives within at most n time units.”
Negated Properties and Counterexamples
Counterexamples are central to model checking.
If AG(safe) does not hold:
- the counterexample is a path from an initial state to an unsafe state
- equivalently, some execution satisfies EF(!safe)
If AF(goal) does not hold:
- the counterexample is an infinite path that never reaches the goal
- equivalently, some execution satisfies EG(!goal)
Why Fairness Constraints Matter
Realistic verification often requires fairness assumptions. Without them, tools may spend effort on executions that are theoretically possible but irrelevant to the system’s intended environment.
Weak fairness (concept): exclude executions in which an action remains continuously enabled from some point onward, yet is ignored forever.
Strong fairness (concept): exclude executions in which an action becomes enabled infinitely often, yet is avoided every time.
In practical tools, fairness usually appears as a constraint on the set of
paths to be considered. In this chapter, we use the SMV-style forms
FAIRNESS (justice) and COMPASSION (p, q) as conceptual notation. The
surrounding module and variable declarations required by real tools are omitted
unless stated otherwise.
Example (NuSMV/nuXmv fairness constraints):
Pseudo notation:
FAIRNESS running_i
COMPASSION (enabled_a, executed_a)
Summary of the meaning:
- FAIRNESS p: only evaluate paths on which p is true infinitely often
- COMPASSION (p, q): only evaluate paths on which infinite occurrences of p imply infinite occurrences of q
For the temporal-logic meaning of fairness and the precise TLA+ treatment, see also Chapter 7, “Temporal Expressions of Fairness.”
Minimal NuSMV/nuXmv example with FAIRNESS and COMPASSION:
【Tool-compliant (runs as-is)】
MODULE main
VAR
  enabled_a  : boolean;
  executed_a : boolean;
ASSIGN
  init(enabled_a)  := FALSE;
  init(executed_a) := FALSE;
  next(enabled_a)  := !enabled_a;
  next(executed_a) := enabled_a;
FAIRNESS enabled_a
COMPASSION (enabled_a, executed_a)
For a self-contained file, see examples/ch08/nusmv/fairness.smv.
Notes:
- FAIRNESS enabled_a limits attention to paths on which enabled_a is true infinitely often.
- COMPASSION (enabled_a, executed_a) limits attention to paths on which infinitely many enabled_a states imply infinitely many executed_a states.
- In this minimal model, enabled_a alternates between TRUE and FALSE, and executed_a follows it with a one-step lag.
- In general, fairness constraints narrow the set of explored paths. They do not, by themselves, state a liveness property.
Practical Guidelines for Writing Properties
Effective property descriptions usually follow a few principles:
- Be concrete and checkable: avoid ambiguity and refer to explicit states and events
- Refine incrementally: start with simple properties and grow toward more complex ones
- Think about counterexamples: decide whether a violating trace would truly indicate a defect
- Use realistic assumptions: avoid idealized properties that the system cannot actually satisfy
Tool-Specific Extensions
Real model-checking tools often provide practical extensions on top of standard temporal logic.
Example in SPIN / Promela:
【Tool-compliant (runs as-is)】
bool critical_section = false;
bool cs1 = false;
bool cs2 = false;
active proctype Worker() {
  do
  :: critical_section = true;
     critical_section = false
  od
}

active proctype P1() {
  do
  :: atomic { !cs2 -> cs1 = true; cs1 = false }
  od
}

active proctype P2() {
  do
  :: atomic { !cs1 -> cs2 = true; cs2 = false }
  od
}
ltl fairness { []<>(critical_section) }
ltl mutual_exclusion { [](!(cs1 && cs2)) }
For a self-contained file, see examples/ch08/spin/ltl-properties.pml.
Example in NuSMV / SMV:
Pseudo notation:
LTLSPEC G(request -> F(response))
CTLSPEC AG(critical1 -> !critical2)
These extensions turn theoretical temporal logic into something directly useful for engineering verification.
8.4 State Explosion: Compromising with Reality
The Mathematical Reality of Combinatorial Explosion
The state-explosion problem is the most fundamental challenge in model checking. Its severity is often underestimated until one looks at concrete numbers.
Suppose there are n processes and each process has k local states:
- total number of states = k^n
With 10 processes and 3 states each:
- total number of states = 3^10 = 59,049
With 20 processes and 3 states each:
- total number of states = 3^20 ≈ 3.5 × 10^9
If we also add time, message queues, counters, buffers, and local variables, the state count becomes astronomical. Even today’s fastest machines cannot fully enumerate such spaces in many cases.
The Physical Limit of Memory
Model checking usually requires storing visited states. If each state takes 100 bytes:
- 10^6 states = 100 MB
- 10^9 states = 100 GB
- 10^12 states = 100 TB
Handling 10^12 states would require around one hundred terabytes of memory,
which is not a realistic target for ordinary engineering workflows.
Root Causes of State Explosion
Understanding why state explosion happens helps us choose the right mitigation techniques.
Explosion caused by concurrency:
Pseudo notation:
2 processes, each with 10 states -> 100 states
3 processes, each with 10 states -> 1,000 states
10 processes, each with 10 states -> 10^10 states
Explosion caused by data values:
Pseudo notation:
1 variable of type 32-bit integer -> 2^32 ≈ 4.3 billion possibilities
1 variable of type 64-bit integer -> 2^64 ≈ 1.8 x 10^19 possibilities
Explosion caused by dynamic data structures: Lists, trees, graphs, and other dynamic structures combine size and shape, so the number of states grows extremely quickly.
Abstraction: Strategically Omitting Detail
The most basic response to state explosion is abstraction. The idea is to omit details that are irrelevant to the property being checked and keep only the information that matters.
Data abstraction:
Pseudo notation:
Concrete value: balance = 1247
-> Abstract value: balance in {zero, positive, negative}
Concrete value: array[100]
-> Abstract value: array_size in {empty, small, large}
Control abstraction:
Pseudo notation:
Concrete control flow: a complex algorithm
-> Abstract control flow: success / failure
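An abstraction is simply a mapping from concrete states to abstract ones. Below is a minimal sketch of the two data abstractions above; the small/large threshold is an arbitrary illustrative choice, and in practice it would be derived from the property being checked.

```python
def abstract_balance(balance):
    """Map a concrete integer balance onto {zero, positive, negative}."""
    if balance == 0:
        return "zero"
    return "positive" if balance > 0 else "negative"

def abstract_size(n, small_limit=10):
    """Map an array size onto {empty, small, large}.

    The threshold `small_limit` is an illustrative choice only.
    """
    if n == 0:
        return "empty"
    return "small" if n <= small_limit else "large"

print(abstract_balance(1247))  # positive
print(abstract_size(100))      # large
```

The payoff is immediate: every one of the 2^32 possible balances collapses into one of three abstract values, so a model that tracked the balance exactly shrinks by a factor of roughly a billion in that dimension.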
Bounded Model Checking
Bounded model checking controls explosion by limiting the search depth.
Basic idea:
- explore only states reachable within k steps
- if a violation is found within k steps, the answer is negative
- if no violation is found, the result is only “no bug was found within this bound”
Advantages:
- memory consumption is easier to control
- SAT solvers and related engines can be used effectively
- discovered counterexamples are often short and easy to understand
Limitations:
- completeness is not guaranteed
- choosing a good bound k is difficult
The Theory and Practice of Partial Order Reduction
In concurrent systems, many action interleavings do not affect the final result. Partial order reduction exploits this redundancy.
Pseudo notation:
Process 1: x := x + 1
Process 2: y := y + 1
These operations are independent. The orders (1 -> 2) and (2 -> 1) lead to
the same result, so only one representative ordering needs to be explored.
When there are n independent actions:
- naive exploration can consider n! orders
- partial order reduction may keep only one representative order
Symbolic Model-Checking Techniques
Symbolic model checking uses formulas to represent state sets and performs set operations through logic operations.
Pseudo notation:
State set S = {(x=0,y=0), (x=0,y=1), (x=1,y=1)}
-> BDD form: (!x and !y) or (!x and y) or (x and y)
-> Reduced: !x or (x and y) = !x or y
When BDDs are effective, exponentially large state sets can sometimes be handled with polynomial-size structures.
Modern verification also relies heavily on SAT and SMT solvers. By encoding transitions as logical constraints and handing them to the solver, tools can explore large bounded spaces efficiently.
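The unrolling idea behind bounded model checking can be made concrete. Real tools encode the k-step unrolling as a single SAT/SMT formula; the brute-force enumeration below realizes the same bounded semantics on a hypothetical toy system, trading efficiency for readability.

```python
def bounded_check(initial, successors, bad, k):
    """Look for a bad state reachable in at most k steps.

    Returns a violating trace (a counterexample), or None, which means
    only "no bug was found within this bound".
    """
    for s in initial:
        if bad(s):
            return [s]
    traces = [[s] for s in initial]
    for _ in range(k):
        # Unroll the transition relation one more step.
        next_traces = [t + [u] for t in traces for u in successors(t[-1])]
        for t in next_traces:
            if bad(t[-1]):
                return t
        traces = next_traces
    return None

# Toy system: x starts at 0 and may grow by 1 or 2 each step.
print(bounded_check([0], lambda x: [x + 1, x + 2], lambda x: x == 3, 3))
# [0, 1, 3]
```

Note the asymmetry the text describes: a returned trace is a genuine counterexample, while `None` says nothing about behaviors beyond the bound.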
Compositional Verification
Large systems can often be decomposed into smaller parts, verified separately, and then reasoned about compositionally.
Pseudo notation:
Component A: guarantees QA under assumption PA
Component B: guarantees QB under assumption PB
Whole system: (PA and PB) -> (QA and QB)
This keeps each verification unit smaller while still supporting claims about the overall system.
Iterative Refinement
If an abstraction is too coarse, it may produce a spurious counterexample. One practical response is iterative refinement.
Pseudo notation:
1. Verify with a coarse abstraction
2. If a counterexample appears, check whether it is real
3. If it is spurious, refine the abstraction
4. Return to step 1
This process gradually converges toward the minimum level of detail needed for the target property.
Parallel and Distributed Verification
Modern verification engines exploit multi-core processors and clusters.
Parallel search strategies:
- work stealing: each thread explores locally and load is redistributed dynamically
- hash-based partitioning: states are assigned to workers according to a hash
Distributed verification:
- one verification run spans multiple machines
- machines exchange state information over the network
- a distributed hash table tracks visited states
Approximate and Probabilistic Methods
When complete verification is too expensive, approximate methods become useful.
Random sampling:
- choose execution paths randomly from the state space
- estimate confidence statistically
Monte Carlo model checking:
- search probabilistically for counterexamples
- full completeness is lost, but practical runtime often improves
Choosing a Practical Strategy
In real projects, combinations of techniques work better than any one method.
A staged approach:
- use bounded model checking to find easy bugs quickly
- use abstraction to verify the most important properties
- add symbolic or parallel techniques when scale requires them
Cost-benefit analysis:
- verification cost versus the value of the defects found
- completeness versus acceptable runtime
- automation versus the need for human intervention
State explosion cannot be eliminated completely. The engineering task is to understand the system’s structure and choose techniques that make verification practical.
8.5 Interpreting and Using Counterexamples
Counterexamples as a Gift
In model checking, a counterexample is more than an error report. It is a gift that reveals a hidden defect in a concrete and inspectable form. Instead of saying only “the specification is violated,” the tool shows how the problem arises and in what sequence of states.
Skilled designers learn from counterexamples at multiple levels. They do not stop at “fix this bug.” They also ask why the problem was possible, which assumption failed, and whether similar defects exist elsewhere.
Anatomy of a Counterexample
A typical counterexample contains the following elements:
- Execution trace: a sequence of states from the initial state to the violating state
- State information: the values of relevant variables in each state
- Transition information: the action that caused each step
- Logical time: the position of each step in the trace
Example: a mutual-exclusion violation
Pseudo notation:
State 0: (Initial)
  process1_state = "thinking"
  process2_state = "thinking"
  flag1 = false, flag2 = false
  turn = 1
State 1: (Process 1 wants to enter)
  process1_state = "trying"
  process2_state = "thinking"
  flag1 = true, flag2 = false
  turn = 1
State 2: (Process 2 wants to enter)
  process1_state = "trying"
  process2_state = "trying"
  flag1 = true, flag2 = true
  turn = 1
State 3: (Both enter the critical section - VIOLATION!)
  process1_state = "critical"
  process2_state = "critical"
  flag1 = true, flag2 = true
  turn = 1
Root-Cause Analysis
Finding the root cause of a counterexample requires systematic analysis.
1. Identify the symptom:
- what differs from the intended behavior?
- at what point does the defect become visible?
2. Check the preconditions:
- what conditions were necessary for the failure?
- why did those conditions hold?
3. Analyze timing:
- did the failure require a specific interleaving?
- would other timings cause the same issue?
4. Revisit design decisions:
- was the underlying design assumption valid?
- could another design avoid the problem entirely?
Analyzing Deadlock Counterexamples
Deadlock counterexamples are especially educational.
Pseudo notation:
State N: (Deadlock detected)
  Process A: waiting for Resource 2, holding Resource 1
  Process B: waiting for Resource 1, holding Resource 2
  Available resources: none
  Possible transitions: none
Points to analyze:
- Formation of circular waiting: in what order were resources acquired?
- Avoidability: at what point could a different choice have prevented the deadlock?
- Design improvement: would resource ordering or timeouts eliminate it?
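Deadlock detection itself is a reachability question: search for a reachable state that has no outgoing transition. The sketch below runs over a hypothetical hand-written transition graph that reproduces the circular wait above; real tools derive the transitions from the model instead.

```python
from collections import deque

def find_deadlocks(initial, successors):
    """Return all reachable states with no outgoing transition."""
    seen, frontier, deadlocks = {initial}, deque([initial]), []
    while frontier:
        s = frontier.popleft()
        succs = list(successors(s))
        if not succs:
            deadlocks.append(s)  # no progress possible from s
        for t in succs:
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return deadlocks

# Hypothetical graph: once each process holds one resource and waits
# for the other's, no transition remains -- the circular wait above.
graph = {
    "start":         ["A_holds_1", "B_holds_2"],
    "A_holds_1":     ["both_hold_one"],
    "B_holds_2":     ["both_hold_one"],
    "both_hold_one": [],
}
print(find_deadlocks("start", lambda s: graph[s]))  # ['both_hold_one']
```

This is also why the CTL pattern AG(EX(true)) from Section 8.3 captures deadlock freedom: it asserts exactly that no reachable state has an empty successor list.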
Analyzing Temporal-Property Violations
Violations of temporal properties often require deeper analysis.
Example of a liveness violation: “every request is eventually answered” fails.
Pseudo notation:
Counterexample: an infinite loop that never generates a response
State 0 -> State 1 -> State 2 -> State 1 -> State 2 -> ...
Useful analysis steps:
- find the strongly connected component that forms the loop
- check fairness assumptions to see whether an enabled response action is being postponed forever
- inspect external conditions that may be preventing progress
Handling Spurious Counterexamples
Abstraction can produce counterexamples that cannot occur in the real system. These are called spurious counterexamples.
Typical signs:
- they depend on information lost by abstraction
- they include transitions that are impossible at implementation level
- they assume unrealistic timing or impossible combinations of conditions
Responses:
- refine the abstraction to preserve more information
- add constraints to remove unrealistic transitions
- replay the trace concretely in the original system or a more detailed model
Counterexample Minimization
Long counterexamples are hard to understand, so minimization matters.
Common techniques:
- binary search over the trace to remove unnecessary segments
- causal analysis to retain only steps directly related to the failure
- state projection to focus on the relevant variables only
Pseudo notation:
Original counterexample: 50 steps
Minimized counterexample: 8 steps (only essential transitions)
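One simple minimization strategy can be sketched directly: repeatedly try to splice out an interior state, and keep the shortened trace only if every remaining step is still a legal transition and it still ends in a violation. The integer states and the add-1-or-2 transition rule are hypothetical stand-ins for real model states.

```python
def minimize_trace(trace, is_transition, is_violation):
    """Greedily remove interior states while the trace still
    reproduces the violation through legal transitions."""
    def valid(t):
        return (is_violation(t[-1])
                and all(is_transition(a, b) for a, b in zip(t, t[1:])))
    changed = True
    while changed:
        changed = False
        for i in range(1, len(trace) - 1):
            candidate = trace[:i] + trace[i + 1:]
            if valid(candidate):
                trace = candidate
                changed = True
                break  # restart the scan on the shorter trace
    return trace

# Toy system: states are integers; a legal transition adds 1 or 2.
step = lambda a, b: b - a in (1, 2)
bad = lambda s: s == 6
print(minimize_trace([0, 1, 2, 3, 4, 5, 6], step, bad))  # [0, 2, 4, 6]
```

Industrial tools use more sophisticated variants (binary search over segments, causal slicing), but the invariant they preserve is the same: every shortening must still be a legal, violating execution.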
Visualizing Counterexamples
Complex counterexamples become easier to interpret when visualized well.
State-transition view:
Pseudo notation:
[State0] --action1--> [State1] --action2--> [State2]
    |                     |                     |
 var1=0                var1=1                var1=2
 var2=false            var2=false            var2=true
Timeline view:
Pseudo notation:
Process A: |----thinking----|--trying--|--critical--|
Process B: |----thinking----|----trying----|--critical--|
Time:      0                5          10            15
Message-sequence view:
Pseudo notation:
Client           Server           Database
  |                 |                 |
  |------req------->|                 |
  |                 |-----query------>|
  |                 |<----resp--------|
  |<-----err--------|                 |
Learning and Improving from Counterexamples
The point of counterexample analysis is not only to fix one bug but to improve the system as a whole.
Pattern recognition:
- are similar problems present elsewhere?
- did the same design choice contribute to multiple issues?
- is a broader design correction needed?
Preventive improvement:
- classify the defect pattern
- update design guidelines and review checklists
- add new automated checks to catch similar issues earlier
Tool Support for Counterexample Analysis
Modern model-checking tools provide dedicated support for analyzing counterexamples.
NuSMV trace facilities:
【Context-dependent snippet】
show_traces -v
write_traces -o counterexample.xml
These commands display a counterexample trace and write it to a file.
SPIN trace facilities:
【Tool-compliant (runs as-is)】
spin -t -p examples/ch08/spin/producer-consumer.pml
spin -t examples/ch08/spin/producer-consumer.pml
Both commands replay the recorded error trail as a simulation; the -p option in the first additionally prints each statement as it executes.
TLC trace facilities:
- step-by-step display of states
- inspection of variable changes
- detailed presentation of the error state
A Counterexample-Driven Development Process
Counterexamples can be integrated into the development process as recurring engineering artifacts.
Counterexample database:
- systematic storage of discovered counterexamples
- classification and searchability
- traceability of repairs and their effects
Regression checks:
- confirm that the specific counterexample is gone
- check that no new defect was introduced
- evaluate performance impact of the repair
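The first regression check above, confirming that a stored counterexample is gone, can be automated by replaying the trace against the repaired model. The following Python sketch uses a made-up lock example; the model, state names, and trace are illustrative, not from any particular tool.

```python
# Sketch: replay a stored counterexample trace against a model to confirm
# that the defect it witnessed is gone. Model and trace are illustrative.

def replay(transitions, init, trace):
    """Follow `trace` (a list of action names) from `init`.
    Returns the final state, or None if some action is disabled."""
    state = init
    for action in trace:
        if (state, action) not in transitions:
            return None  # the trace is no longer feasible
        state = transitions[(state, action)]
    return state

# Buggy model: 'enter' is possible even while the resource is locked.
buggy = {
    ("idle", "lock"): "locked",
    ("idle", "enter"): "critical",
    ("locked", "enter"): "critical",   # defect: no guard on the lock
}
# Repaired model: 'enter' is only enabled from 'idle'.
fixed = {
    ("idle", "lock"): "locked",
    ("idle", "enter"): "critical",
}

counterexample = ["lock", "enter"]  # stored trace that reached the bad state

assert replay(buggy, "idle", counterexample) == "critical"  # defect reproduced
assert replay(fixed, "idle", counterexample) is None        # trace infeasible
```

Storing traces in this replayable form is what makes a counterexample database useful for regression checking rather than mere documentation.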
Education and knowledge sharing:
- share recurring defect patterns
- use representative traces for onboarding and training
- reuse them during design reviews
Counterexamples are among the most valuable outputs of model checking. Used well, they improve both system quality and the engineering maturity of the team.
8.6 Comparing Major Tools and Choosing Among Them
The Ecosystem of Model-Checking Tools
As model checking has matured, a wide range of tools has emerged. Each tool is specialized for certain problem classes or verification styles. Choosing the right one has a substantial effect on the success of a verification effort.
SPIN: A Pioneer in Concurrent-System Verification
SPIN, developed by Gerard Holzmann, remains one of the representative tools for verifying concurrent systems.
Main characteristics:
- Promela language specialized for describing concurrent processes
- partial order reduction for controlling state explosion caused by interleavings
- memory efficiency for explicit-state exploration
- random simulation for lightweight pre-checks before full verification
Example domain:
【Tool-compliant (runs as-is)】
mtype = { msg };
chan buffer = [1] of { mtype };

active proctype Producer() {
    do
    :: buffer!msg -> printf("Produced\n")
    od
}

active proctype Consumer() {
    do
    :: buffer?msg -> printf("Consumed\n")
    od
}
For a self-contained file, see examples/ch08/spin/producer-consumer.pml.
Advantages:
- natural description of concurrency
- highly optimized verification engine
- long industrial track record
Constraints:
- restricted data types compared with general-purpose languages
- nontrivial learning curve
- limited graphical tooling
NuSMV: A Standard Tool for Symbolic Verification
NuSMV is a classic tool for symbolic model checking.
Main characteristics:
- BDD-based representation for large state sets
- CTL and LTL support for a wide range of temporal properties
- hierarchical modules for structured system descriptions
- counterexample generation for detailed diagnosis
Specification example:
【Tool-compliant (runs as-is)】
MODULE main
VAR
    state : {idle, req, crit, exit};
    input_req : boolean;
    other_flag : boolean;
ASSIGN
    init(state) := idle;
    init(input_req) := FALSE;
    init(other_flag) := FALSE;
    next(input_req) := {TRUE, FALSE};
    next(other_flag) := {TRUE, FALSE};
    next(state) := case
        state = idle & input_req : req;
        state = req & !other_flag : crit;
        state = crit : exit;
        state = exit : idle;
        TRUE : state;
    esac;

LTLSPEC G(state = crit -> F(state = exit))
CTLSPEC AG(state = crit -> !other_flag)
For a self-contained file, see examples/ch08/nusmv/request-response.smv.
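For intuition about what NuSMV checks here, the same transition relation can be rendered explicitly. This Python sketch (illustrative, not part of any NuSMV workflow) enumerates successors under all input combinations and spot-checks that crit always steps to exit, which is why the response property G(state = crit -> F(state = exit)) holds.

```python
# Explicit-state rendering of the SMV transition relation above.
# The nondeterminism lives in the two boolean inputs, so we enumerate them.

def next_state(state, input_req, other_flag):
    if state == "idle" and input_req:
        return "req"
    if state == "req" and not other_flag:
        return "crit"
    if state == "crit":
        return "exit"
    if state == "exit":
        return "idle"
    return state  # the TRUE : state default branch

states = {"idle", "req", "crit", "exit"}
succ = {
    s: {next_state(s, r, o) for r in (True, False) for o in (True, False)}
    for s in states
}

assert succ["crit"] == {"exit"}   # crit steps to exit unconditionally
assert succ["exit"] == {"idle"}   # exit always returns to idle
```

Because every successor of crit is exit, the liveness property holds trivially here; NuSMV performs the same kind of check symbolically over BDDs rather than by enumeration.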
Advantages:
- declarative specification style
- strong symbolic engine
- broad academic and industrial use
Constraints:
- can be memory-intensive
- BDD construction may fail on some structures
- limited support for highly dynamic data structures
TLC: The Execution Engine for TLA+
TLC is the model checker dedicated to TLA+ specifications.
Main characteristics:
- high-level specification language close to mathematics
- rich data modeling with sets, functions, and sequences
- distributed verification across multiple machines
- detailed configuration control for the search
TLA+ specification sketch:
Pseudo notation:
Init ≜ balance = [a ∈ Accounts ↦ 0]

Transfer(from, to, amount) ≜
    ∧ balance[from] ≥ amount
    ∧ balance' = [balance EXCEPT
                    ![from] = @ - amount,
                    ![to]   = @ + amount]

Next ≜ ∃ from, to ∈ Accounts, amount ∈ Nat :
    Transfer(from, to, amount)
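The guarded-action style of the sketch can be mirrored operationally. The following Python rendering (account names and amounts are made up for illustration) applies Transfer only when its guard holds and checks the conservation invariant, total balance constant, that one would typically ask TLC to verify.

```python
# Illustrative rendering of the Transfer action: apply it only when the
# guard balance[from] >= amount holds, and check that the total balance
# is preserved. State and amounts are invented for the example.

def transfer(balance, src, dst, amount):
    """Return the successor state, or None if the action is not enabled."""
    if balance[src] < amount:   # guard fails: action disabled
        return None
    new = dict(balance)
    new[src] -= amount
    new[dst] += amount
    return new

state = {"a": 10, "b": 0}
total = sum(state.values())

state = transfer(state, "a", "b", 4)
assert state == {"a": 6, "b": 4}
assert sum(state.values()) == total           # conservation invariant holds
assert transfer(state, "b", "a", 99) is None  # guard blocks an overdraft
```

In TLA+ the invariant would be stated declaratively (for example, that the sum over Accounts is unchanged) and TLC would check it in every reachable state.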
Advantages:
- expressive specification language
- strong support for abstract modeling
- especially good for distributed protocols
Constraints:
- steep learning curve
- some distance from implementation code
- narrower toolchain than mainstream programming languages
CBMC: A Practical Tool for Software Verification
CBMC is a bounded model checker for C and C++ programs.
Main characteristics:
- direct verification of real code
- SAT-based reasoning over bounded executions
- concrete counterexample traces
- industrial use in domains such as automotive and aerospace
Verification example:
【Tool-compliant (runs as-is)】
#include <assert.h>

extern int nondet_int(void);

int main(void) {
    int array[10];
    int index = nondet_int();
    __CPROVER_assume(index >= 0 && index < 10);
    array[index] = 42;

    int x = nondet_int();
    int y = nondet_int();
    __CPROVER_assume(x >= 0 && x <= 1000);
    __CPROVER_assume(y >= 0 && y <= 1000);
    int sum = x + y;
    __CPROVER_assert(sum >= x && sum >= y,
                     "sum grows for bounded non-negative inputs");
    return 0;
}
For a self-contained file, see examples/ch08/cbmc/array-bounds.c.
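CBMC discharges the assertion symbolically, covering every value the assumptions permit without enumerating them. For intuition only, the same bounded claim about sum can be brute-forced in Python over the assumed ranges:

```python
# For intuition only: brute-force the bounded claim that CBMC checks
# symbolically. We enumerate exactly the ranges the assumptions allow.
violations = [
    (x, y)
    for x in range(0, 1001)
    for y in range(0, 1001)
    if not (x + y >= x and x + y >= y)
]
assert violations == []  # the property holds for all assumed inputs
```

The contrast is the point: enumeration scales with the size of the input ranges, while SAT-based reasoning handles the full 32-bit ranges of the C program at once, provided the assumptions rule out overflow as they do here.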
Advantages:
- works at implementation level
- reuses existing code assets
- produces concrete, actionable bug traces
Constraints:
- bounded verification only
- abstraction is still difficult for large programs
- performance can degrade on very large codebases
UPPAAL: Specialized for Real-Time Systems
UPPAAL is a tool specialized for real-time verification.
Main characteristics:
- timed automata as a natural model of timing constraints
- graphical UI for visual state-machine modeling
- precise clock constraints
- optimization support for fastest or lowest-cost paths
Timed-automaton sketch:
Pseudo notation:
process Task {
    clock t;
    state Idle {
        when(trigger && t >= period) -> Running { t := 0; }
    }
    state Running {
        when(t <= deadline) -> Idle;
        when(t > deadline)  -> Error;  // deadline violation
    }
}
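UPPAAL reasons over dense time using clock zones; purely for intuition, the deadline check in the sketch can be imitated with an integer-tick simulation. The parameters period, deadline, and exec_time below are illustrative.

```python
# Coarse, discrete-time imitation of the timed automaton above. This only
# conveys the deadline check; UPPAAL itself analyzes dense time symbolically.

def run_task(period, deadline, exec_time, ticks):
    """Advance one tick at a time; return the state reached."""
    state, t = "Idle", 0
    for _ in range(ticks):
        t += 1
        if state == "Idle" and t >= period:
            state, t = "Running", 0          # trigger fires, clock resets
        elif state == "Running" and t >= exec_time:
            # the task finishes; classify by whether the deadline was met
            state = "Idle" if t <= deadline else "Error"
    return state

assert run_task(period=5, deadline=3, exec_time=2, ticks=8) == "Idle"
assert run_task(period=5, deadline=3, exec_time=4, ticks=10) == "Error"
```

A model checker would explore all timings rather than one fixed schedule, which is exactly why it can find a deadline violation that a handful of simulations would miss.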
Advantages:
- precise reasoning about real-time behavior
- intuitive graphical descriptions
- useful for optimization questions as well as safety questions
Constraints:
- narrower applicability than general-purpose tools
- meaningful learning cost
- scale limits for very large systems
Factors That Determine Tool Choice
Several practical criteria drive the choice of a model-checking tool.
1. Nature of the target system:
- concurrent systems -> SPIN, TLC
- hardware / embedded control -> NuSMV, UPPAAL
- software code -> CBMC, Java PathFinder
- real-time systems -> UPPAAL, TIMES
2. Required depth of verification:
- full verification where possible -> symbolic methods such as NuSMV
- bug finding -> explicit-state methods such as SPIN and TLC
- bounded verification -> SAT-based tools such as CBMC
3. Technical background of the team:
- strong mathematical background -> TLA+ / TLC
- programming-oriented background -> CBMC, Java PathFinder
- modeling experience -> NuSMV, UPPAAL
4. Project constraints:
- learning time -> shorter with CBMC, longer with TLA+
- existing toolchain -> integration matters
- performance expectations -> depend on target-system scale and complexity
An Integrated Tool-Use Strategy
In practice, combinations of tools are often most effective.
A staged verification strategy:
- design stage: use TLA+ to validate the high-level algorithm
- detailed design: use NuSMV for control logic
- implementation stage: use CBMC for real code
- integration stage: use SPIN for concurrent behavior
Complementary verification:
- by property type: safety with CBMC, liveness with SPIN
- by abstraction level: high-level with TLA+, low-level with CBMC
- by reasoning style: symbolic with NuSMV, explicit-state with SPIN
Emerging Tools and Trends
As verification technology evolves, new tools and frameworks continue to appear:
- CATAPULT for high-level synthesis verification
- SEAHORN as an LLVM-based verification framework
- SMACK for translating LLVM into Boogie-based workflows
- KLEE as a symbolic-execution engine
The border between model checking, symbolic execution, theorem proving, and static analysis is increasingly fluid.
Practical Adoption Guidelines
Evaluation phase:
- try the tool on a small example
- evaluate learning cost and expected benefit
- check compatibility with the existing development process
Introduction phase:
- run a pilot project
- share knowledge within the team
- expand the scope gradually
Institutionalization phase:
- update tools regularly
- accumulate reusable verification patterns
- transfer the lessons to other projects
The right tool choice has decisive influence on the success of model checking. Technical features matter, but so do team capability, project constraints, and long-term strategy.
End-of-Chapter Exercises
If you use AI while working through these exercises
- Treat AI output as a proposal; use verifiers to make the final judgment.
- Keep a record of the prompts you used, the generated specifications and invariants, the verification commands and logs (including seed, depth, and scope), and the revision history when counterexamples were found.
- For detailed templates, see Appendix D and Appendix F.
Basic Exercise 1: Practicing Temporal Logic
Describe properties of the following system in both CTL and LTL.
Elevator control system:
- states: {idle, moving_up, moving_down, door_open}
- events: {call_button, arrive_floor, open_door, close_door}
Properties to write:
- Safety: the elevator never moves while the door is open
- Liveness: if a call arrives, the elevator eventually arrives
- Fairness: every floor is served fairly
- Efficiency: no unnecessary movement occurs
For each property:
- write a CTL formula
- write an LTL formula
- explain which logic expresses it more naturally
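As a worked starting point, here is one way the safety property might be phrased (a sketch, not the only correct answer; the other three properties are left to the exercise):

```latex
% LTL: globally, an open door excludes movement
\mathbf{G}\,\bigl(\mathit{door\_open} \rightarrow
    \neg(\mathit{moving\_up} \lor \mathit{moving\_down})\bigr)

% CTL: the same condition required along all paths
\mathbf{AG}\,\bigl(\mathit{door\_open} \rightarrow
    \neg(\mathit{moving\_up} \lor \mathit{moving\_down})\bigr)
```

For a pure invariant like this, the two logics are equally natural; the differences become visible with the liveness and fairness properties.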
Basic Exercise 2: State-Space Analysis
Analyze the state space of the following concurrent system.
Shared-counter system with two processes:
Pseudo notation:
Process 1:
    loop: read counter -> increment -> write counter
Process 2:
    loop: read counter -> decrement -> write counter
Items to analyze:
- calculate the number of possible states when the counter ranges from 0 to 10
- draw the state-transition graph
- identify race conditions
- determine whether deadlock states exist
- discuss whether partial order reduction applies
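One way to start the analysis is to enumerate the interleaved state space explicitly, as an explicit-state model checker would. In the sketch below (state encoding and initial counter value are our own choices), each process has a program counter (0 = read, 1 = modify, 2 = write) and a local register, and writes are guarded to keep the counter in 0..10.

```python
# Explicit-state exploration of the two-process shared-counter system.
# A state is (counter, pc1, local1, pc2, local2); writes outside 0..10
# are disabled, which keeps the reachable set finite.
from collections import deque

def successors(s):
    counter, pc1, l1, pc2, l2 = s
    out = []
    # Process 1: read -> increment -> write
    if pc1 == 0:
        out.append((counter, 1, counter, pc2, l2))
    elif pc1 == 1:
        out.append((counter, 2, l1 + 1, pc2, l2))
    elif pc1 == 2 and 0 <= l1 <= 10:
        out.append((l1, 0, l1, pc2, l2))
    # Process 2: read -> decrement -> write
    if pc2 == 0:
        out.append((counter, pc1, l1, 1, counter))
    elif pc2 == 1:
        out.append((counter, pc1, l1, 2, l2 - 1))
    elif pc2 == 2 and 0 <= l2 <= 10:
        out.append((l2, pc1, l1, 0, l2))
    return out

init = (5, 0, 0, 0, 0)  # counter starts at 5; both processes about to read
seen, frontier = {init}, deque([init])
while frontier:
    for nxt in successors(frontier.popleft()):
        if nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)

# Race witnesses: both read 5, then the two writes overwrite each other,
# so the counter ends at 4 (increment lost) or 6 (decrement lost).
assert (4, 0, 6, 0, 4) in seen
assert (6, 0, 6, 0, 4) in seen
print(len(seen), "reachable states")
```

Drawing the transition graph, counting states, and judging whether partial order reduction applies (here the read/modify/write steps of the two processes do interfere through the shared counter, so not all interleavings can be collapsed) then proceed from this enumeration.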
Practical Exercise 1: Verification with SPIN
Describe the readers-writers problem in Promela and verify it with SPIN.
Requirements:
- multiple readers may read simultaneously
- a writer must write exclusively
- readers and writers must not access the resource at the same time
- starvation must be avoided
Implementation elements:
- process definitions in Promela
- description of shared data structures
- implementation of synchronization primitives
- description of safety and liveness properties
- verification with SPIN and analysis of the results
Practical Exercise 2: Verification with NuSMV
Describe a traffic-light control system in NuSMV and verify it.
System specification:
- north-south and east-west signals
- three states per direction: red, yellow, and green
- vehicle-detection sensors
- pedestrian push buttons
Verification items:
- Safety: conflicting directions never become green at the same time
- Liveness: every direction eventually gets green
- Responsiveness: pedestrian requests are handled appropriately
- Efficiency: unnecessary signal changes are avoided
Advanced Exercise: Staged Verification with Multiple Tools
Design a consensus algorithm for a distributed database and verify it progressively with multiple tools.
System requirements:
- three database nodes
- a majority-based agreement mechanism
- tolerance of node failures
- guaranteed data consistency
Staged verification process:
Phase 1: High-level design in TLA+
- describe the overall algorithm in TLA+
- verify the basic safety and liveness properties
- perform an initial check with TLC
Phase 2: Refinement in Promela
- refine the message-exchange protocol
- describe concurrency and communication in more detail
- verify the model with SPIN
Phase 3: Implementation-level verification
- implement the actual C or Java code
- verify it with CBMC or Java PathFinder
- integrate the results with unit tests
Integrated analysis:
- classify the problems found at each stage
- compare the consistency of results across tools
- analyze the relationship between abstraction level and defect discovery
- evaluate the overall verification strategy
Evaluation points:
- understanding the features and suitable use cases of each tool
- understanding the quality impact of staged verification
- understanding the trade-off between learning cost and verification benefit
- evaluating applicability to real projects
Through these exercises, you should acquire both a theoretical understanding of model checking and the practical ability to apply it. The advanced exercise is especially important because it exposes you to the complexity of real projects and to the need for a systematic verification strategy.