
Maximum Counter

There were a couple of bugs/race conditions in the original Verilog code. These are now fixed, but the schematic diagrams have not been updated.

Overview

This is a simple component that counts each pulse (rising edge) of a CLOCK signal and emits the accumulated value on COUNT_OUT. The DIRECTION bit specifies whether each pulse increments (1) or decrements (0) the counter. The counter keeps track of the maximum value reached on COUNT_MAX, and indicates when a new maximum value is being output by holding NEWMAX high.

COUNT_MAX and NEWMAX are asserted on the same CLOCK edge on which COUNT_OUT is incremented. In the RTL, care must therefore be taken with the blocking ("un-latched") assignment "=" and the non-blocking ("latched") assignment "<=" so that the newly incremented value is the one compared.
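
As a minimal sketch of the difference between the two assignment forms (the module and signal names below are hypothetical and are not part of counter.v): after a non-blocking "<=", a later statement in the same always block still reads the value the register held before the clock edge, whereas after a blocking "=" it reads the updated value.

assign_demo.v (hypothetical)
module assign_demo(
   input CLOCK,
   output [4:0] SEEN_NONBLOCKING,
   output [4:0] SEEN_BLOCKING
   );

   reg [4:0] a = 0;        // updated with "<="
   reg [4:0] b = 0;        // updated with "=", only used inside its own block
   reg [4:0] seen_a = 0;
   reg [4:0] seen_b = 0;

   always @(posedge CLOCK) begin
      a <= a + 1;
      seen_a <= a;         // reads a as it was before this edge
   end

   always @(posedge CLOCK) begin
      b = b + 1;
      seen_b <= b;         // reads b after the increment on this edge
   end

   assign SEEN_NONBLOCKING = seen_a;
   assign SEEN_BLOCKING = seen_b;
endmodule

After the first rising edge, seen_a holds 0 while seen_b holds 1, so the non-blocking version lags the blocking version by one count.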

counter.v
module counter(
   input CLOCK,
   input DIRECTION,
   output [4:0] COUNT_OUT,
   output [4:0] COUNT_MAX,
   output NEWMAX
   );

   reg [4:0] count_int = 0;
   reg [4:0] count_max_int = 0;
   reg new_max = 0;
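   always @(posedge CLOCK) begin
      // Default: NEWMAX is low unless a new maximum is found below
      new_max <= 0;
      // Count up or down according to DIRECTION
      if (DIRECTION) 
         count_int <= count_int + 1;
      else
         count_int <= count_int - 1;
      // count_int read here is still the value held before this edge,
      // because the "<=" updates above take effect at the end of the time step
      if (DIRECTION && count_int > count_max_int) begin
         count_max_int = count_int;
         new_max <= 1;
      end
   end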
   assign COUNT_OUT = count_int;
   assign COUNT_MAX = count_max_int;
   assign NEWMAX = new_max;
endmodule
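
To observe the behaviour described in the overview, a small simulation testbench can be useful. The following is only a sketch: the module name counter_tb, the 10 ns clock period and the stimulus sequence are assumptions chosen for illustration, not part of the original project. DIRECTION is changed on falling edges so that it is stable when counter samples it on the rising edge.

counter_tb.v (hypothetical)
`timescale 1ns / 1ps
module counter_tb;
   reg CLOCK = 0;
   reg DIRECTION = 1;
   wire [4:0] COUNT_OUT;
   wire [4:0] COUNT_MAX;
   wire NEWMAX;

   // Device under test: the counter module listed above
   counter dut(
      .CLOCK(CLOCK),
      .DIRECTION(DIRECTION),
      .COUNT_OUT(COUNT_OUT),
      .COUNT_MAX(COUNT_MAX),
      .NEWMAX(NEWMAX)
   );

   always #5 CLOCK = ~CLOCK;          // 10 ns period, chosen arbitrarily

   initial begin
      // Print a line whenever one of the observed signals changes
      $monitor("t=%0t dir=%b count=%d max=%d newmax=%b",
               $time, DIRECTION, COUNT_OUT, COUNT_MAX, NEWMAX);
      repeat (4) @(negedge CLOCK);    // count up
      DIRECTION = 0;
      repeat (2) @(negedge CLOCK);    // count down
      DIRECTION = 1;
      repeat (4) @(negedge CLOCK);    // count up past the old maximum
      $finish;
   end
endmodule

Comparing the printed COUNT_OUT and COUNT_MAX columns edge by edge shows on which clock edge a new maximum is registered relative to the count that caused it.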

The RTL schematic contains a comparator, a counter primitive, a 4-bit latch, a 1-bit latch and some combinatorial logic.


It comes as no surprise that the critical path is the post-incremented value leaving count_int1, passing through the comparator GREATER:1 and the gate and2, and causing fde to latch.


Timing constraint: Default period analysis for Clock 'CLOCK'
  Clock period: 5.671ns (frequency: 176.336MHz)
  Total number of paths / destination ports: 107 / 16
-------------------------------------------------------------------------
Delay:               5.671ns (Levels of Logic = 3)
  Source:            count_int_2 (FF)
  Destination:       count_max_int_0 (FF)
  Source Clock:      CLOCK rising
  Destination Clock: CLOCK rising

  Data Path: count_int_2 to count_max_int_0
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     FD:C->Q               8   0.591   0.761  count_int_2 (count_int_2)
     LUT4:I3->O            1   0.704   0.595  count_max_int_cmp_gt00001_SW0_SW1 (N8)
     LUT3_D:I0->O          1   0.704   0.424  count_max_int_cmp_gt00001 (count_max_int_cmp_gt00002)
     LUT4:I3->O            5   0.704   0.633  count_max_int_and00001 (count_max_int_and0000)
     FDE:CE                    0.555          count_max_int_0
    ----------------------------------------
    Total                      5.671ns (3.258ns logic, 2.413ns route)
                                       (57.5% logic, 42.5% route)

The device utilisation:

Cell Usage :
# BELS                             : 14
#      LUT2                        : 1
#      LUT3                        : 1
#      LUT3_D                      : 1
#      LUT4                        : 8
#      LUT4_L                      : 2
#      VCC                         : 1
# FlipFlops/Latches                : 11
#      FD                          : 4
#      FDE                         : 5
#      FDR                         : 2
# Clock Buffers                    : 1
#      BUFGP                       : 1
# IO Buffers                       : 12
#      IBUF                        : 1
#      OBUF                        : 11
=========================================================================

Device utilization summary:
---------------------------

Selected Device : 3s250evq100-4 

 Number of Slices:                        9  out of   2448     0%  
 Number of Slice Flip Flops:             11  out of   4896     0%  
 Number of 4 input LUTs:                 13  out of   4896     0%  
 Number of IOs:                          13
 Number of bonded IOBs:                  13  out of     66    19%  
 Number of GCLKs:                         1  out of     24     4%  

Improvement

The unsigned comparator GREATER:1 is expensive to implement relative to a 4-bit equality check (only two 4-input LUTs, comparing A0:1 with B0:1 and A2:3 with B2:3, ANDed together). It should be possible to simplify the design by comparing COUNT_OUT and COUNT_MAX for equality post-increment, and latching "at_maximum" as additional state for the next cycle. On the next clock, if DIRECTION is increment (=1) AND at_maximum is set, we can simply gate the newly produced maximum to COUNT_OUT and COUNT_MAX and set NEWMAX simultaneously.

There is one case where the two designs differ: incrementing from 1111 to 0000 under overflow conditions. The comparator would not treat this as a new maximum, while the suggested design does. Here's an implementation:

counter2.v
module counter2(
   input CLOCK,
   input DIRECTION,
   input RESET,
   output [4:0] COUNT_OUT,
   output [4:0] COUNT_MAX,
   output NEWMAX
   );

   reg [4:0] count_int = 0;
   reg [4:0] count_max_int = 0;
   reg new_max = 0;
      
   wire [4:0] count_up = count_int + 1;
   wire [4:0] count_down = count_int - 1;
   wire at_max = count_int == count_max_int;
   always @(posedge CLOCK) begin
      if (RESET) begin
         count_int <= 0;
         count_max_int <= 0;
         new_max <= 1;
      end else if (DIRECTION) begin
         count_int <= count_up;
         if (at_max) begin
            count_max_int <= count_up;
            new_max <= 1;
         end else begin
            new_max <= 0;
         end
      end else begin
         count_int <= count_down;
         new_max <= 0;
      end
   end
 
   assign COUNT_OUT = count_int;
   assign COUNT_MAX = count_max_int;
   assign NEWMAX = new_max;
endmodule
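
The overflow case mentioned above can be exercised in simulation. The testbench below is a sketch under stated assumptions: the name counter2_tb, the clock period and the cycle counts are hypothetical choices. Because the ports in the listing are declared [4:0] (5 bits), the wrap in this sketch occurs from 31 to 0, so 32 increments after reset are needed to reach it.

counter2_tb.v (hypothetical)
`timescale 1ns / 1ps
module counter2_tb;
   reg CLOCK = 0;
   reg DIRECTION = 1;
   reg RESET = 1;
   wire [4:0] COUNT_OUT;
   wire [4:0] COUNT_MAX;
   wire NEWMAX;

   // Device under test: the counter2 module listed above
   counter2 dut(
      .CLOCK(CLOCK),
      .DIRECTION(DIRECTION),
      .RESET(RESET),
      .COUNT_OUT(COUNT_OUT),
      .COUNT_MAX(COUNT_MAX),
      .NEWMAX(NEWMAX)
   );

   always #5 CLOCK = ~CLOCK;          // 10 ns period, chosen arbitrarily

   initial begin
      $monitor("t=%0t count=%d max=%d newmax=%b",
               $time, COUNT_OUT, COUNT_MAX, NEWMAX);
      @(negedge CLOCK);               // one rising edge with RESET high
      RESET = 0;
      repeat (33) @(negedge CLOCK);   // 32 increments wrap the 5-bit counter
      $finish;
   end
endmodule

With DIRECTION held at 1, at_max stays true on every cycle, so COUNT_MAX should follow COUNT_OUT through the wrap back to 0 with NEWMAX still high, which is the difference from the comparator-based design described above.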

The design changes use one fewer LUT, but sadly the number of slices remains the same. One benefit of this design is that the critical timing path is reduced by about 2.1 ns (from 5.671 ns to 3.555 ns), which raises the maximum clock frequency significantly, from 176 MHz to 281 MHz:

Timing constraint: Default period analysis for Clock 'CLOCK'
  Clock period: 3.555ns (frequency: 281.294MHz)
  Total number of paths / destination ports: 39 / 17
-------------------------------------------------------------------------