Introduction to SystemVerilog Clocking Blocks
Clocking blocks are one of the most important — and most misunderstood — features of SystemVerilog. They solve a fundamental problem in testbench design: how do you make a testbench communicate synchronously with a DUT without creating race conditions?
This post answers the most common questions engineers ask when using clocking blocks in VCS for the first time.
What is a Clocking Block?
A clocking block is a SystemVerilog construct, typically placed inside an interface, that defines:
- Which clock edge synchronizes the signals
- Which signals are inputs to the testbench (sampled from the DUT)
- Which signals are outputs from the testbench (driven to the DUT)
- The input/output skews (setup and hold timing offsets)
Without a clocking block, driving and sampling signals in a testbench program block can cause race conditions with the DUT — especially when both execute in the same simulation time step.
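As a rough sketch of the four items listed above, here is a clocking block with its skews written out explicitly. The interface name skew_demo_if is hypothetical, and the skews shown are simply the language defaults (input #1step, output #0), so this should behave the same as a clocking block with no default line at all.

```systemverilog
// Hypothetical interface, for illustration only.
interface skew_demo_if (input bit clk);
  logic d, q;

  clocking cb @(posedge clk);        // clock edge that synchronizes the signals
    default input #1step output #0;  // explicit skews (these are the defaults)
    input  q;                        // sampled from the DUT
    output d;                        // driven to the DUT
  endclocking
endinterface
```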
A Minimal Working Example: D Flip-Flop
Let's build up from scratch using a simple synchronous D flip-flop with a single d input and q output.
Step 1 — Define the Interface with a Clocking Block
The interface connects the DUT and the testbench, with the clocking block inside it used exclusively by the TB:
interface dff_ifc (input bit clk);
  logic q, d;
  // Clocking block: syncs TB signals to posedge clk
  clocking cb @(posedge clk);
    input  q; // TB reads q from DUT (sampled)
    output d; // TB drives d to DUT (skewed)
  endclocking
  // DUT sees raw signals
  modport DUT (input clk,
               input d,
               output q);
  // TB uses the clocking block, not raw signals
  modport TB (clocking cb);
endinterface: dff_ifc
Key point: the TB modport only exposes clocking cb, not the raw d and q signals. This forces all testbench signal access to go through the synchronization mechanism.
Step 2 — The DUT Module
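module dff (dff_ifc.DUT dff_if, input bit clock);
  always @(posedge clock) begin
    dff_if.q <= dff_if.d; // non-blocking: standard for sequential logic
    // $strobe prints at the end of the time step, after q has updated
    $strobe("@%0d: DUT: q=%b", $time, dff_if.q);
  end
endmodule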
Step 3 — The Testbench Program
program automatic test (dff_ifc.TB dff_if);
  initial
    #10 repeat (100)
      #1 $display("@%0d: TB: q=%b", $time, dff_if.cb.q);
  initial begin
    ##2 dff_if.cb.d <= 0;
    $display("@%0d: TB: drive d=0", $time);
    ##2 dff_if.cb.d <= 1;
    $display("@%0d: TB: drive d=1", $time);
    ##2 dff_if.cb.d <= 0;
    $display("@%0d: TB: drive d=0", $time);
    $finish;
  end
endprogram
Step 4 — The Top-Level Testbench
// example.sv
`timescale 1ns/1ns
module top;
  bit clock = 1;
  always #5 clock = !clock; // 10ns period clock
  dff_ifc dff_if(clock);    // Instantiate interface
  dff a1 (dff_if, clock);   // Connect DUT
  test t1 (dff_if);         // Connect TB
  initial
    $monitor("@%0d: Top: d=%b, q=%b, clock=%b",
             $time, dff_if.d, dff_if.q, clock);
endmodule
Compile and run with:
vcs -sverilog -R example.sv
The Three Critical Questions
Q1 — How does the TB drive a DUT signal?
Use a non-blocking assignment (NBA) through the clocking block:
dff_if.cb.d <= 0; // NBA through clocking block
This is non-blocking for two reasons:
- The program block executes in a different simulation region than the design, so values don't propagate immediately.
- The .cb. accessor makes this a synchronous drive: the value is applied relative to the clocking event, after the DUT has already sampled its inputs. Even if this line runs at the exact moment the clock rises, the DUT will not see the new value until the next posedge.
This is by design: it eliminates the classic testbench race condition where a signal changes at the same time the DUT latches it.
You can also add a cycle delay before the drive:
##2 dff_if.cb.d <= 1; // Drive after 2 clock cycles
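The ##N syntax is a cycle delay: it waits N clocking events before executing the assignment. Note that ##N counts cycles of the default clocking, so the module, interface, or program that uses it needs a default clocking declaration in scope. Where none is available, repeat (N) @(dff_if.cb); waits on the clocking block directly.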
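The sketch below shows both ways of waiting before a drive. It is a self-contained toy, not the testbench from this post: the module name drive_demo and the inline flip-flop are hypothetical, and the clocking block is declared as the default clocking of the module so that ##N has a clock to count.

```systemverilog
// Hypothetical module, for illustration only.
module drive_demo;
  bit   clk;
  logic d, q;

  always #5 clk = ~clk;

  // "default clocking" is what gives ##N a clock to count in this scope.
  default clocking cb @(posedge clk);
    input  q;
    output d;
  endclocking

  // Stand-in DUT: a D flip-flop.
  always @(posedge clk) q <= d;

  initial begin
    ##2 cb.d <= 1'b0;   // wait 2 cycles of cb, then drive through the block
    repeat (2) @(cb);   // same wait, written without ##
    cb.d <= 1'b1;
    ##1 cb.d <= 1'b0;
    ##3 $finish;
  end
endmodule
```

Both forms wait on the same clocking event, so picking one is mostly a question of whether a default clocking is in scope.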
Q2 — How does the TB sample a DUT output?
val = dff_if.cb.q; // Reads the PREVIOUS clock cycle's value
When reading through a clocking block, SystemVerilog inserts an implicit input skew (default: 1 step before the clock edge). This means:
- If you read cb.q in the same time slot as the rising clock edge, you get the value from the previous cycle
- This prevents sampling a value that the DUT is still in the process of computing
Think of it as a built-in setup time enforcer for your testbench.
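One way to see the input skew at work is to print the sampled value next to the raw signal. This is a minimal sketch under assumed names (sample_demo and its inline flip-flop are hypothetical). The $strobe line reports the raw q once each time step has settled, while the initial block reports what the clocking block sampled.

```systemverilog
// Hypothetical module, for illustration only.
module sample_demo;
  bit   clk;
  logic d, q;

  always #5 clk = ~clk;

  clocking cb @(posedge clk);
    input  q;   // default input skew: #1step
    output d;
  endclocking

  // Stand-in DUT: a D flip-flop.
  always @(posedge clk) q <= d;

  // Raw value of q after the time step has settled.
  always @(posedge clk) $strobe("@%0t raw     q=%b", $time, q);

  initial begin
    @(cb);
    cb.d <= 1'b1;
    repeat (4) begin
      @(cb);
      // Value of q captured just before this clock edge.
      $display("@%0t sampled q=%b", $time, cb.q);
    end
    $finish;
  end
endmodule
```

Comparing the two sets of messages, the sampled value should trail the settled raw value by one clock.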
Q3 — What is the minimum round-trip latency?
With a clocking block, the minimum TB → DUT → TB latency is 2 clock cycles:
- Cycle 1: TB drives d → DUT latches it on the next posedge → q updates
- Cycle 2: TB samples q → reads the value from the previous cycle
This is the fundamental handshake cost of synchronous testbench communication and is a key timing consideration when writing self-checking tests.
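The two-cycle round trip can be checked with a small self-checking test. The sketch below uses hypothetical names (rt_if, rt_dff, rt_test, rt_top) and assumes the default skews. It drives d on one clock edge, waits two clocking events, and only then compares the sampled q.

```systemverilog
// Hypothetical names throughout, for illustration only.
interface rt_if (input bit clk);
  logic d, q;
  clocking cb @(posedge clk);
    input  q;
    output d;
  endclocking
  modport DUT (input d, output q);
  modport TB  (clocking cb);
endinterface

module rt_dff (rt_if.DUT bus, input bit clk);
  always @(posedge clk) bus.q <= bus.d;
endmodule

program automatic rt_test (rt_if.TB bus);
  initial begin
    @(bus.cb);             // align to a clock edge
    bus.cb.d <= 1'b1;      // drive on this edge
    repeat (2) @(bus.cb);  // one edge for the DUT to latch, one to sample
    if (bus.cb.q === 1'b1)
      $display("@%0t q followed d after 2 cycles", $time);
    else
      $display("@%0t unexpected q=%b", $time, bus.cb.q);
    $finish;
  end
endprogram

module rt_top;
  bit clk;
  always #5 clk = ~clk;
  rt_if   bus (clk);
  rt_dff  dut (bus, clk);
  rt_test t   (bus);
endmodule
```

Checking after only one clocking event would compare against the value q held before the DUT latched the new d, which is why self-checking tests need to budget the extra cycle.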
Why Use program automatic?
The automatic keyword on the program block makes automatic the default lifetime for its tasks, functions, and procedural blocks, so each task/function call gets its own copy of local variables, much like local variables in C. Without it, those variables default to static storage, which causes issues in re-entrant tasks and fork-join blocks.
Even if you don't need re-entrancy in your current testbench, automatic is strongly recommended as a defensive coding style.
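The difference is easiest to see with two concurrent calls to the same task. In this sketch (the program and task names are hypothetical), one task is forced to static lifetime and the other inherits automatic from the program.

```systemverilog
// Hypothetical program, for illustration only.
program automatic lifetime_demo;
  // Explicitly static: every call shares one copy of the argument n.
  task static shared_copy(int n);
    #n $display("@%0t shared_copy sees n=%0d", $time, n);
  endtask

  // No lifetime given: automatic by default because the program is automatic.
  task private_copy(int n);
    #n $display("@%0t private_copy sees n=%0d", $time, n);
  endtask

  initial begin
    fork
      shared_copy(1);
      shared_copy(2);   // can overwrite n while the other call is still waiting
    join
    fork
      private_copy(1);
      private_copy(2);  // each call keeps its own n
    join
  end
endprogram
```

With the static task, the two calls can end up printing the same value of n. The automatic task should report the value each call was given.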
Common Mistakes to Avoid
- Driving without the .cb. accessor: driving dff_if.d directly from a program block bypasses the clocking block's timing protection and can create races. Always use dff_if.cb.d.
- Using blocking assignments (=) to drive clocking block signals: always use NBA (<=) for clocking block outputs.
- Forgetting ##N delays: without a cycle delay, multiple drives in the same initial block happen back-to-back in simulation time, which may not reflect real protocol behavior.
- Sampling combinational outputs: clocking blocks are designed for registered (flip-flop) outputs. Sampling combinational logic through a clocking block can give stale values due to the input skew.
Quick Reference
| Operation | Syntax | Notes |
|---|---|---|
| Drive a signal | dff_if.cb.d <= value; | NBA, takes effect next clock edge |
| Drive after N cycles | ##N dff_if.cb.d <= value; | Waits N posedges first |
| Sample a signal | val = dff_if.cb.q; | Returns value from previous cycle |
| Wait N cycles | ##N; | Stalls execution for N clock edges |
| Wait 1 cycle | @(dff_if.cb); | Alternative to ##1 |
Clocking blocks are the foundation of race-free testbench design in SystemVerilog. Once you internalize the input/output skew model and the NBA requirement, writing synchronous stimulus becomes straightforward — and your simulations become dramatically easier to debug.
Next up: Clocking block skew settings (input #1step, output #1) and how they interact with UVM driver timing.