There were a couple of bugs/race conditions in the original Verilog code. They are now fixed, but the schematic diagrams have not been updated.

Overview
This is a simple component that counts each pulse (rising edge) of a CLOCK signal and emits the accumulated value on COUNT_OUT. The DIRECTION bit specifies whether the pulses increment or decrement the counter. The counter keeps track of the maximum value reached on COUNT_MAX, and indicates when a new maximum value is being output by holding NEWMAX high.
COUNT_MAX and NEWMAX are asserted on the same CLOCK edge on which COUNT_OUT is incremented. In the RTL, care must therefore be taken with blocking assignment "=" and non-blocking assignment "<=", so that the newly incremented value is the one compared.
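The difference between the two assignment forms can be seen in a small simulation-only sketch. The module assign_demo and its registers are hypothetical names used for illustration and are not part of the counter design. In the non-blocking block, the second statement should see the value from before the clock edge, so max_nb trails a_nb by one count. In the blocking block, max_b picks up the freshly incremented a_b.

```verilog
// Hypothetical demo module, for simulation only; not part of counter.v.
module assign_demo;
    reg clk = 0;
    reg [4:0] a_nb = 0, max_nb = 0;  // updated with non-blocking "<="
    reg [4:0] a_b = 0, max_b = 0;    // updated with blocking "="

    // Non-blocking: every right-hand side is sampled before any register
    // is updated, so max_nb captures the value a_nb held before this edge.
    always @(posedge clk) begin
        a_nb   <= a_nb + 1;
        max_nb <= a_nb;
    end

    // Blocking: a_b is updated immediately, so max_b captures the newly
    // incremented value. (Shown for contrast only; blocking assignments
    // in clocked blocks are easy to get wrong across multiple blocks.)
    always @(posedge clk) begin
        a_b   = a_b + 1;
        max_b = a_b;
    end

    initial begin
        repeat (3) begin
            #5 clk = 1;
            #5 clk = 0;
            $display("non-blocking: a=%0d max=%0d | blocking: a=%0d max=%0d",
                     a_nb, max_nb, a_b, max_b);
        end
        $finish;
    end
endmodule
```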
counter.v
module counter(
    input CLOCK,
    input DIRECTION,
    output [4:0] COUNT_OUT,
    output [4:0] COUNT_MAX,
    output NEWMAX
);

    reg [4:0] count_int = 0;
    reg [4:0] count_max_int = 0;
    reg new_max = 0;

    always @(posedge CLOCK) begin
        // NEWMAX defaults to low; overridden below when a new maximum is found
        new_max <= 0;

        // Count up or down depending on DIRECTION
        if (DIRECTION)
            count_int <= count_int + 1;
        else
            count_int <= count_int - 1;

        // Track the maximum while counting up
        if (DIRECTION && count_int > count_max_int) begin
            // non-blocking, consistent with the other registers in this block
            count_max_int <= count_int;
            new_max <= 1;
        end
    end

    assign COUNT_OUT = count_int;
    assign COUNT_MAX = count_max_int;
    assign NEWMAX = new_max;

endmodule
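To watch COUNT_OUT, COUNT_MAX and NEWMAX evolve in simulation, a minimal testbench along the following lines may help. This is a sketch, not part of the original project: the name counter_tb, the 10 ns clock period and the stimulus pattern are all assumptions. It counts up for four edges, down for two, then up again past the previous maximum, printing the outputs 1 ns after each rising edge.

```verilog
`timescale 1ns / 1ps
// Hypothetical testbench (counter_tb is an assumed name).
// Compile together with counter.v.
module counter_tb;
    reg CLOCK = 0;
    reg DIRECTION = 1;
    wire [4:0] COUNT_OUT;
    wire [4:0] COUNT_MAX;
    wire NEWMAX;

    counter dut (
        .CLOCK(CLOCK),
        .DIRECTION(DIRECTION),
        .COUNT_OUT(COUNT_OUT),
        .COUNT_MAX(COUNT_MAX),
        .NEWMAX(NEWMAX)
    );

    // Assumed 10 ns clock period; rising edges at 5, 15, 25, ... ns
    always #5 CLOCK = ~CLOCK;

    // Print the outputs shortly after each rising edge
    always @(posedge CLOCK)
        #1 $display("t=%0t dir=%b count=%0d max=%0d newmax=%b",
                    $time, DIRECTION, COUNT_OUT, COUNT_MAX, NEWMAX);

    initial begin
        #40 DIRECTION = 0;  // after four rising edges, count down
        #20 DIRECTION = 1;  // after two more edges, count up again
        #40 $finish;        // four more edges, then stop
    end
endmodule
```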
The RTL schematic contains a comparator, a counter primitive, a 4-bit latch, a 1-bit latch and some combinatorial logic.
It comes as no surprise that the critical path is the post-incremented value leaving count_int1, passing through the comparator GREATER:1 and the and2 gate, and causing fde to latch.
Timing constraint: Default period analysis for Clock 'CLOCK'
  Clock period: 5.671ns (frequency: 176.336MHz)
  Total number of paths / destination ports: 107 / 16
-------------------------------------------------------------------------
Delay:               5.671ns (Levels of Logic = 3)
  Source:            count_int_2 (FF)
  Destination:       count_max_int_0 (FF)
  Source Clock:      CLOCK rising
  Destination Clock: CLOCK rising

  Data Path: count_int_2 to count_max_int_0
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
    FD:C->Q                8   0.591   0.761  count_int_2 (count_int_2)
    LUT4:I3->O             1   0.704   0.595  count_max_int_cmp_gt00001_SW0_SW1 (N8)
    LUT3_D:I0->O           1   0.704   0.424  count_max_int_cmp_gt00001 (count_max_int_cmp_gt00002)
    LUT4:I3->O             5   0.704   0.633  count_max_int_and00001 (count_max_int_and0000)
    FDE:CE                     0.555          count_max_int_0
    ----------------------------------------
    Total                      5.671ns (3.258ns logic, 2.413ns route)
                                       (57.5% logic, 42.5% route)
The device utilisation:
Cell Usage :
# BELS               : 14
#      LUT2          : 1
#      LUT3          : 1
#      LUT3_D        : 1
#      LUT4          : 8
#      LUT4_L        : 2
#      VCC           : 1
# FlipFlops/Latches  : 11
#      FD            : 4
#      FDE           : 5
#      FDR           : 2
# Clock Buffers      : 1
#      BUFGP         : 1
# IO Buffers         : 12
#      IBUF          : 1
#      OBUF          : 11
=========================================================================
Device utilization summary:
---------------------------
Selected Device : 3s250evq100-4
 Number of Slices:             9  out of  2448   0%
 Number of Slice Flip Flops:  11  out of  4896   0%
 Number of 4 input LUTs:      13  out of  4896   0%
 Number of IOs:               13
 Number of bonded IOBs:       13  out of    66  19%
 Number of GCLKs:              1  out of    24   4%
Improvement
The unsigned comparator GREATER:1 is expensive to implement relative to a 4-bit equality check (only two 4-input LUTs, comparing A0:1 B0:1 and A2:3 B2:3, AND'ed together). It should be possible to simplify the design by comparing COUNT_OUT and COUNT_MAX for equality post-increment, and latching "at_maximum" as additional state for the next cycle. On the next clock, if DIRECTION is increment (=1) and at_maximum is set, we can simply gate the newly produced maximum to COUNT_OUT and COUNT_MAX and set NEWMAX at the same time.
There is one case where this does not hold: incrementing from all ones to all zeros under overflow conditions (11111 to 00000 for the 5-bit counter used here). The comparator would not treat this as a new maximum, while the suggested design does. Here's an implementation:
counter2.v
module counter2(
    input CLOCK,
    input DIRECTION,
    input RESET,
    output [4:0] COUNT_OUT,
    output [4:0] COUNT_MAX,
    output NEWMAX
);

    reg [4:0] count_int = 0;
    reg [4:0] count_max_int = 0;
    reg new_max = 0;

    wire [4:0] count_up = count_int + 1;
    wire [4:0] count_down = count_int - 1;
    wire at_max = count_int == count_max_int;

    always @(posedge CLOCK) begin
        if (RESET) begin
            count_int <= 0;
            count_max_int <= 0;
            new_max <= 1;
        end else if (DIRECTION) begin
            count_int <= count_up;
            if (at_max) begin
                count_max_int <= count_up;
                new_max <= 1;
            end else begin
                new_max <= 0;
            end
        end else begin
            count_int <= count_down;
            new_max <= 0;
        end
    end

    assign COUNT_OUT = count_int;
    assign COUNT_MAX = count_max_int;
    assign NEWMAX = new_max;

endmodule
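The overflow case described above can be checked in simulation with a testbench roughly like the one below. It is a sketch under stated assumptions: counter2_tb, the 10 ns clock and the stimulus timing are illustrative choices, not part of the original project. After one reset cycle it counts up continuously for 33 edges, enough for the 5-bit counter to wrap from 31 to 0. Since at_max stays true throughout, COUNT_MAX should follow COUNT_OUT through the wrap with NEWMAX still high, which is the behaviour the text attributes to this design.

```verilog
`timescale 1ns / 1ps
// Hypothetical testbench (counter2_tb is an assumed name).
// Compile together with counter2.v.
module counter2_tb;
    reg CLOCK = 0;
    reg DIRECTION = 1;  // count up for the whole run
    reg RESET = 1;      // held high across the first rising edge
    wire [4:0] COUNT_OUT;
    wire [4:0] COUNT_MAX;
    wire NEWMAX;

    counter2 dut (
        .CLOCK(CLOCK),
        .DIRECTION(DIRECTION),
        .RESET(RESET),
        .COUNT_OUT(COUNT_OUT),
        .COUNT_MAX(COUNT_MAX),
        .NEWMAX(NEWMAX)
    );

    // Assumed 10 ns clock period; rising edges at 5, 15, 25, ... ns
    always #5 CLOCK = ~CLOCK;

    // Print the outputs shortly after each rising edge
    always @(posedge CLOCK)
        #1 $display("t=%0t reset=%b count=%0d max=%0d newmax=%b",
                    $time, RESET, COUNT_OUT, COUNT_MAX, NEWMAX);

    initial begin
        #12 RESET = 0;  // release reset after the first rising edge
        #330 $finish;   // 33 further rising edges: past the 31 -> 0 wrap
    end
endmodule
```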
The design changes use one fewer LUT, but sadly the number of slices remains the same. One benefit of this design is that the critical timing path is shortened by about 2.1 ns (5.671 ns to 3.555 ns), which raises the maximum clock frequency significantly, from 176 MHz to 281 MHz:
Timing constraint: Default period analysis for Clock 'CLOCK'
Clock period: 3.555ns (frequency: 281.294MHz)
Total number of paths / destination ports: 39 / 17
-------------------------------------------------------------------------