Introduction to SystemVerilog Clocking Blocks

Clocking blocks are one of the most important — and most misunderstood — features of SystemVerilog. They solve a fundamental problem in testbench design: how do you make a testbench communicate synchronously with a DUT without creating race conditions?

This post answers the most common questions engineers ask when using clocking blocks in VCS for the first time.

What is a Clocking Block?

A clocking block is a SystemVerilog construct, usually declared inside an interface, that defines:

  • Which clock edge synchronizes the signals
  • Which signals are inputs to the testbench (sampled from the DUT)
  • Which signals are outputs from the testbench (driven to the DUT)
  • The input/output skews (setup and hold timing offsets)

Without a clocking block, driving and sampling signals in a testbench program block can cause race conditions with the DUT — especially when both execute in the same simulation time step.
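As a rough illustration of the four items listed above, here is a minimal sketch. The interface name demo_if and the signals req and gnt are made up for this illustration, and the skews shown are simply the language defaults written out explicitly.

```systemverilog
// Hypothetical interface, for illustration only
interface demo_if (input bit clk);
  logic req, gnt;

  clocking cb @(posedge clk);          // 1. the synchronizing clock edge
    default input #1step output #0;    // 4. skews (these are the defaults)
    input  gnt;                        // 2. sampled by the testbench
    output req;                        // 3. driven by the testbench
  endclocking
endinterface
```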

A Minimal Working Example: D Flip-Flop

Let's build up from scratch using a simple synchronous D flip-flop with a single d input and q output.

Step 1 — Define the Interface with a Clocking Block

The interface connects the DUT and the testbench, with the clocking block inside it used exclusively by the TB:

interface dff_ifc (input bit clk);
  logic q, d;

  // Clocking block: syncs TB signals to posedge clk
  clocking cb @(posedge clk);
    input  q;   // TB reads q from DUT (sampled)
    output d;   // TB drives d to DUT (skewed)
  endclocking

  // DUT sees raw signals
  modport DUT (input  clk,
               input  d,
               output q);

  // TB uses the clocking block, not raw signals
  modport TB  (clocking cb);

endinterface: dff_ifc

Key point: the TB modport only exposes clocking cb, not the raw d and q signals. This forces all testbench signal access to go through the synchronization mechanism.

Step 2 — The DUT Module

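module dff (dff_ifc.DUT dff_if, input bit clock);
  always @(posedge clock) begin
    dff_if.q <= dff_if.d;   // non-blocking: the usual style for clocked logic
    // $strobe prints at the end of the time step, after q has updated
    $strobe("@%0d: DUT: q=%b", $time, dff_if.q);
  end
endmodule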

Step 3 — The Testbench Program

program automatic test (dff_ifc.TB dff_if);

  // Print the sampled q every 1 ns, starting at 11 ns
  initial
    #10 repeat (100)
      #1 $display("@%0d: TB:  q=%b", $time, dff_if.cb.q);

  // Stimulus: wait two clock cycles before each drive
  initial begin
    ##2 dff_if.cb.d <= 0;
    $display("@%0d: TB:  drive d=0", $time);
    ##2 dff_if.cb.d <= 1;
    $display("@%0d: TB:  drive d=1", $time);
    ##2 dff_if.cb.d <= 0;
    $display("@%0d: TB:  drive d=0", $time);
    ##2 $finish;   // let the last drive reach the DUT before ending
  end

endprogram

Step 4 — The Top-Level Testbench

// example.sv
`timescale 1ns/1ns

module top;
  bit clock = 1;
  always #5 clock = !clock;   // 10ns period clock

  dff_ifc dff_if(clock);      // Instantiate interface
  dff     a1 (dff_if, clock); // Connect DUT
  test    t1 (dff_if);        // Connect TB

  initial
    $monitor("@%0d: Top: d=%b, q=%b, clock=%b",
             $time, dff_if.d, dff_if.q, clock);
endmodule

Compile and run with:

vcs -sverilog -R example.sv

The Three Critical Questions

Q1 — How does the TB drive a DUT signal?

Use a non-blocking assignment (NBA) through the clocking block:

dff_if.cb.d <= 0;   // NBA through clocking block

This is non-blocking for two reasons:

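  • The program block executes in the Reactive region, after the design has finished evaluating for the current time step, so the drive cannot disturb a value the DUT is reading at that moment
  • The .cb. accessor makes the drive synchronous: the value is applied at the clocking event plus the output skew, after the DUT's flip-flops have already sampled their inputs. Even if this line runs at the exact moment the clock rises, the DUT will not see the new value until the next posedge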

This is by design: it eliminates the classic testbench race condition where a signal changes at the same time the DUT latches it.

You can also add a cycle delay before the drive:

##2 dff_if.cb.d <= 1;   // Drive after 2 clock cycles

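The ##N syntax is a cycle delay: it waits N clocking events (here, posedges of clk) before executing the assignment. Strictly, the IEEE 1800 standard counts these cycles against the default clocking of the enclosing program or module, so some simulators may require a default clocking declaration before they accept ##N. Writing repeat (N) @(dff_if.cb); is an equivalent wait that names the clocking block explicitly.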
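If a simulator complains that ##N has no default clocking to count against, one possible workaround is to wait on the clocking block event explicitly. The sketch below assumes the dff_ifc interface from Step 1; the program name test_wait and the task wait_cycles are hypothetical names introduced here.

```systemverilog
program automatic test_wait (dff_ifc.TB dff_if);

  // Hypothetical helper: block for n clocking events of dff_if.cb
  task wait_cycles(int unsigned n);
    repeat (n) @(dff_if.cb);
  endtask

  initial begin
    wait_cycles(2);          // same intent as ##2
    dff_if.cb.d <= 1;        // synchronous drive through the clocking block
    wait_cycles(2);          // give the DUT time to register the value
    $finish;
  end

endprogram
```

Naming the clocking block in the wait also keeps the intent clear when a testbench talks to more than one clock domain.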

Q2 — How does the TB sample a DUT output?

val = dff_if.cb.q;   // Reads the value q held just before the latest clock edge

When reading through a clocking block, SystemVerilog applies an input skew (default: #1step, which samples the value the signal held immediately before the clock edge). This means:

  • If you read cb.q in the same time slot as the rising clock edge, you get the value from the previous cycle
  • This prevents sampling a value that the DUT is still in the process of computing

Think of it as a built-in setup time enforcer for your testbench.
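A minimal sketch of the usual sampling pattern, assuming the dff_ifc interface from Step 1 (the program name test_sample and the variable val are illustrative): wait for the clocking event, then read the sampled value through cb.

```systemverilog
program automatic test_sample (dff_ifc.TB dff_if);

  initial begin
    logic val;
    repeat (3) begin
      @(dff_if.cb);          // wake up on the clocking event
      val = dff_if.cb.q;     // value q held just before this edge
      $display("@%0d: TB: sampled q=%b", $time, val);
    end
  end

endprogram
```

Because the read goes through cb, the result does not depend on whether the DUT's always block or the testbench happens to run first in that time step.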

Q3 — What is the minimum round-trip latency?

With a clocking block, the minimum TB → DUT → TB latency is 2 clock cycles:

  • Cycle 1: TB drives d → DUT latches it on the next posedge → q updates
  • Cycle 2: TB samples q → reads the value from the previous cycle

This is the fundamental handshake cost of synchronous testbench communication and is a key timing consideration when writing self-checking tests.
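The two-cycle cost can be sketched as a small self-checking loop. This is only an illustration under the assumptions of this post (the dff_ifc interface and the single-register DUT from Steps 1 and 2); the program name roundtrip_check and the variable expected are hypothetical.

```systemverilog
program automatic roundtrip_check (dff_ifc.TB dff_if);

  initial begin
    bit expected;
    @(dff_if.cb);                     // align to a clocking event first
    for (int i = 0; i < 4; i++) begin
      expected = i[0];                // alternate 0, 1, 0, 1
      dff_if.cb.d <= expected;        // drive: DUT registers it at the next posedge
      repeat (2) @(dff_if.cb);        // edge 1: DUT latches d; edge 2: TB samples new q
      if (dff_if.cb.q !== expected)
        $error("q=%b, expected %b", dff_if.cb.q, expected);
    end
    $finish;
  end

endprogram
```

Checking after only one clocking event would compare against the value q held before the DUT registered the new d, which is why the wait is two cycles.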

Why Use program automatic?

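The automatic keyword on the program block makes the tasks, functions, and procedural blocks declared inside it default to automatic lifetime, so each call or thread gets its own copy of its arguments and local variables. This is analogous to ordinary local (stack) variables in C. Without it, they default to static storage, where concurrent calls share a single copy, which causes problems in re-entrant tasks and in threads spawned with fork. Variables declared directly at program scope remain static either way.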

Even if you don't need re-entrancy in your current testbench, automatic is strongly recommended as a defensive coding style.
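To make the re-entrancy point concrete, here is a small standalone sketch (the program name reentrant_demo and the task print_after are made up for this illustration). Two threads call the same task concurrently, and because the program is automatic, each call keeps its own arguments.

```systemverilog
program automatic reentrant_demo;

  // Each call gets private copies of delay and name
  task print_after(int delay, string name);
    #delay $display("@%0d: %s finished after waiting %0d", $time, name, delay);
  endtask

  initial begin
    fork
      print_after(3, "A");   // both calls are in flight at the same time
      print_after(1, "B");
    join
  end

endprogram
```

In a program without the automatic keyword, the two calls would share one static copy of delay and name, so the second call could overwrite the first call's arguments while it is still waiting.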

Common Mistakes to Avoid

  • Driving without .cb. — driving dff_if.d directly from a program block bypasses the clocking block's timing protection and can create races. Always use dff_if.cb.d.
  • Using blocking assignments (=) to drive clocking block signals: clocking block outputs can only be written with the synchronous drive operator (<=), and a blocking assignment to one is rejected by the compiler.
  • Forgetting ##N delays — without a cycle delay, multiple drives in the same initial block happen back-to-back in simulation time, which may not reflect real protocol behavior.
  • Sampling combinational outputs — clocking blocks are designed for registered (flip-flop) outputs. Sampling combinational logic through a clocking block can give stale values due to the input skew.

Quick Reference

Operation             Syntax                       Notes
Drive a signal        dff_if.cb.d <= value;        NBA, takes effect next clock edge
Drive after N cycles  ##N dff_if.cb.d <= value;    Waits N posedges first
Sample a signal       val = dff_if.cb.q;           Returns value from previous cycle
Wait N cycles         ##N;                         Stalls execution for N clock edges
Wait 1 cycle          @(dff_if.cb);                Alternative to ##1

Clocking blocks are the foundation of race-free testbench design in SystemVerilog. Once you internalize the input/output skew model and the NBA requirement, writing synchronous stimulus becomes straightforward — and your simulations become dramatically easier to debug.

Next up: Clocking block skew settings (input #1step, output #1) and how they interact with UVM driver timing.
