Parametric Optimization of Non Overlapped Clock Pulse Shift Register Design

Ajaz Ahmad*  
Nikhil Ranjan  
*Corresponding author  

Corresponding author email: elaisaira.ahmad@gmail.com  
Email: nikhilranjan101@gmail.com

Date of publication (dd/mm/yyyy): 19/05/2017

Abstract – This paper discusses the area and power optimization of a shift register. A shift register is conventionally designed using edge-triggered flip-flops. Using latches in a shift register creates timing problems, which can be avoided by using pulsed-clock latches as the basic cell. The timing problem is resolved by driving the latches with several non-overlapping clock pulses. This also reduces the number of cells, which in turn reduces the circuit area. The CMOS layout of the shift register cell is designed in 50 nm technology using the MICROWIND layout simulator tool.

Keywords – CMOS, Pulse Latch, Power Dissipation, Area Optimization.

I. INTRODUCTION

A shift register is an essential building block in digital integrated circuits. Shift registers are used in many applications, such as microprocessors, microcontrollers, memory design, communication receivers, and image processing ICs. The circuit of a shift register is fairly simple: a string of N flip-flops synchronized by a common clock. The speed of the flip-flop is less important than its area and power consumption because there is no combinational logic between the series-connected flip-flops in the shift register. The smallest flip-flop is therefore the appropriate choice for optimizing the area and power dissipation of the shift register. Recently, pulsed latches have been used instead of flip-flops in many applications because a pulsed latch requires fewer transistors than a flip-flop. However, pulsed latches cannot be used directly in a shift register because of the timing problem between pulsed latches. This problem is avoided by using non-overlapping clock pulses.

II. RELATED WORK

In [1], the area and power consumption of a shift register are optimized by using pulsed-clock latches instead of flip-flops. The shift register is built from groups of series-connected pulsed latches; several such sub shift registers are combined with a clock pulse generator. This circuit avoids the timing problems of the conventional shift register. A 256-bit shift register using pulsed latches was fabricated in a 0.18 μm CMOS process, with a power consumption of 1.2 mW at a 100 MHz clock frequency. The proposed shift register saves 37% area and 44% power compared to the conventional shift register with flip-flops [1].

In [2], a low-power clock-gated synchronous counter using a Logical Effort (LE) optimized Transmission-Gate Master-Slave Flip-Flop (TGMS FF) is proposed. Logical Effort is a transistor width optimization methodology that provides a trade-off between area, delay and power; it is a manual method for optimizing transistor widths. Of all logic circuits, the inverter is considered the best driver. Even a simple NAND gate is slower than an inverter because of its topology. Hence Logical Effort uses the inverter as a reference circuit and compares the driving capability of a given gate with it [2].
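The comparison against the inverter can be illustrated with the standard logical effort delay model, d = g·h + p, where g is the logical effort, h the electrical effort (load capacitance over input capacitance) and p the parasitic delay. The sketch below is only an illustration: the values for the inverter (g = 1, p = 1) and the 2-input NAND (g = 4/3, p = 2) are the usual textbook figures, and the names `GATES` and `stage_delay` are hypothetical; none of this is taken from [2].

```python
from fractions import Fraction

# Textbook logical effort (g) and parasitic delay (p) values.
# These are illustrative assumptions, not figures reported in [2].
GATES = {
    "inverter": (Fraction(1), Fraction(1)),
    "nand2": (Fraction(4, 3), Fraction(2)),
}


def stage_delay(gate, electrical_effort):
    """Normalized stage delay d = g*h + p, in units of the reference delay tau."""
    g, p = GATES[gate]
    return g * electrical_effort + p


if __name__ == "__main__":
    h = 4  # each gate drives a load four times its own input capacitance
    for name in GATES:
        print(name, float(stage_delay(name, h)))
```

For the same electrical effort, the NAND gate has the larger normalized delay, which is the sense in which the inverter serves as the reference driver.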

In [3], an adaptive coupled flip-flop circuit is used to optimize power dissipation. In [4], it is observed that flip-flop and latch designs include many timing elements (TEs) that are not on the critical path, and that this timing slack can be exploited by using slower, lower-energy TEs. Instead of simultaneously optimizing for delay and energy, critical TEs should be optimized to reduce delay and noncritical TEs should be optimized to reduce energy. The design results show an energy reduction of 63% with no loss in performance compared to a high-performance design with homogeneous flip-flop and latch structures. Compared to a design that uses transistor sizing alone to reduce energy, activity-sensitive selection results in a further total TE energy reduction of 46% [4].

III. ARCHITECTURE OF PULSE LATCH SHIFT REGISTER

The 256-bit shift register is divided into sub shift registers to reduce the number of delayed pulsed clock signals. Each 4-bit sub shift register is synchronized by clock pulses distributed in parallel from the clock generator: clock0 is connected to the first latch of each sub shift register, and clock1, clock2 and clock3 are similarly connected to the second, third and fourth latches. When the 256-bit shift register is divided into 4-bit sub shift registers, four pulsed clock signals are shared by all sub shift registers and the number of latches is 256. A sub shift register consisting of 4 latches requires one pulsed clock signal per latch. Each sub shift register also has a temporary storage latch [1].
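The clock distribution described above can be sketched as a simple index mapping. The following is a minimal illustration, assuming zero-based latch indices and ignoring the temporary storage latches; the function names are hypothetical and not part of the original design files.

```python
SUB_REGISTER_BITS = 4   # latches per sub shift register (from the text)
TOTAL_BITS = 256        # overall register length (from the text)


def clock_assignment(latch_index, group_size=SUB_REGISTER_BITS):
    """Return (sub_register, clock_name) for a zero-based latch index."""
    sub_register, position = divmod(latch_index, group_size)
    return sub_register, f"clock{position}"


def latches_per_clock(total_bits=TOTAL_BITS, group_size=SUB_REGISTER_BITS):
    """Count how many latches each pulsed clock signal drives."""
    counts = {}
    for i in range(total_bits):
        _, clk = clock_assignment(i, group_size)
        counts[clk] = counts.get(clk, 0) + 1
    return counts


if __name__ == "__main__":
    print(clock_assignment(0))     # first latch of the first sub shift register
    print(clock_assignment(255))   # last latch of the last sub shift register
    print(latches_per_clock())
```

Under these assumptions, each of the four clock signals drives one latch in every one of the 64 sub shift registers.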

IV. DESIGN RULES OF 50 NM TECHNOLOGY

Design rules are defined in terms of the parameter λ or expressed in absolute dimensions, i.e. in μm. The rules are applied in such a way that the design can be easily ported across a range of industrial processes, making the layout portable. The lambda parameter can be scaled easily by changing its value. Some of the rules used in the design process for the 50 nm technology are shown below.

<table>
<thead>
<tr>
<th>Layer</th>
<th>Width</th>
<th>Spacing</th>
<th>Sur cap</th>
<th>Lin cap</th>
<th>Ox cap</th>
<th>Freq</th>
<th>Incdc</th>
<th>Thick</th>
<th>Height</th>
<th>Permit</th>
</tr>
</thead>
<tbody>
<tr>
<td>metal1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>metal2</td>
<td>8</td>
<td>10</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Via5</td>
<td>5</td>
<td>5</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>metal3</td>
<td>0</td>
<td>10</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Via4</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>metal4</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>via2</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>metal5</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>via</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>metal</td>
<td>3</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>poly</td>
<td>2</td>
<td>3</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>poly2</td>
<td>2</td>
<td>2</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>contact</td>
<td>2</td>
<td>3</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>drain</td>
<td>4</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>trih</td>
<td>4</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>met1</td>
<td>10</td>
<td>11</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>oxyd</td>
<td>4</td>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

Fig. 1. CMOS Layout Design Rules for 50 nm Technology

V. EIGHT BIT PULSE LATCH SHIFT REGISTER

The eight-bit pulse latch shift register is designed with eight cascaded latches synchronized by four clock pulses generated by the pulse clock generator. The latches are arranged in two subgroups of four latches each. The first clock pulse of the clock generator is connected to the first latch of both subgroups. Similarly, the second, third and fourth clock pulses are connected to the second, third and fourth latches of both subgroups. The output of the last latch in the first subgroup is connected to the input terminal of the first latch of the second subgroup. Fig. 2 shows the schematic diagram of the eight-bit pulse latch shift register (PSR), including the connection of the clock pulses to the subgroup latches.

Fig. 2. Schematic Diagram for 8-bit Pulse Latch Based Shift Register

The input data is shifted from the first latch to the successive latches on the arrival of the non-overlapping clock pulses. The latches are initially reset to logic ‘0’ when the reset signal arrives. The output of each latch follows its input data during the positive level of its clock pulse [1,2,3].
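At the level of whole clock cycles, the behaviour described above (reset to logic 0, then a one-position shift per cycle) can be sketched as follows. This is a purely behavioural model that assumes the non-overlapping pulses work as intended, so each latch ends the cycle holding the value its predecessor held at the start; the function names are hypothetical.

```python
def shift_cycle(state, din):
    """One full clock cycle: each latch takes the value its predecessor held."""
    return [din] + state[:-1]


def run(bits, length=8):
    """Reset an N-latch register to logic 0, then shift in the given bits."""
    state = [0] * length          # reset drives every latch output to 0
    history = []
    for b in bits:
        state = shift_cycle(state, b)
        history.append(list(state))
    return history


if __name__ == "__main__":
    for cycle, state in enumerate(run([1, 0, 1]), start=1):
        print(cycle, state)
```

After the three input bits 1, 0, 1 have been applied, they occupy the first three latches in reverse order of arrival, with the most recent bit in the first latch.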

Two Hundred and Fifty Six Bit Pulse Latch Shift Register

The figure shows the pulsed-clock shift register for 256 data bits. The design consists of 64 subgroups of four latches each, connected in cascade to reduce the number of delayed pulsed clock signals. The architecture of the 256-bit register is the same as that of the 8-, 16- and 32-bit pulse shift registers. The 256-latch chain, comprising 64 subgroups of 4 latches each, is connected to four non-overlapping delayed clock pulses: clk0, clk1, clk2 and clk3. In the 4-bit sub shift register 1, four latches store 4-bit data (Q1-Q4), and the clock pulses are connected in the same way to the other 63 subgroups of latches.

This register requires 256 latches comprising 512 transmission gates and 768 NOT gates. The total number of transistors in the shift register chain is 2890. The operation of the other sub shift registers is the same as that of sub shift register 1, except that the first latch receives data from the temporary storage latch in the previous sub shift register [1,2].

VI. TIMING SIMULATION

Multiple non-overlapping delayed pulsed clock signals are used, as shown in Fig. 4. The delayed pulsed clock signals are generated when a pulsed clock signal passes through the pulse clock generator circuit. Each latch uses a pulsed clock signal that is delayed from the pulse clock generator output signal, so each latch updates its data after the next latch has updated its data. As a result, each latch has a constant input during its clock pulse and no timing problem occurs between latches. However, this solution also requires many delay circuits.
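The effect of the pulse ordering can be illustrated with a small pulse-level model of one 4-latch sub shift register. The sketch below assumes ideal, strictly non-overlapping pulses, each of which makes exactly one latch copy its current input; the function name and the index-based pulse orders are hypothetical. Pulsing the last latch first matches the ordering described above, while pulsing the first latch first shows the race that the ordering is meant to avoid.

```python
def apply_pulses(latches, din, pulse_order):
    """Apply one pulse per latch in the given order.

    A pulsed latch copies whatever its input holds at that moment:
    latch 0 reads din, latch i reads the output of latch i - 1.
    """
    state = list(latches)
    for i in pulse_order:
        state[i] = din if i == 0 else state[i - 1]
    return state


if __name__ == "__main__":
    start = [0, 0, 0, 0]
    # Last latch first: each latch updates after its next latch has updated.
    print(apply_pulses(start, 1, [3, 2, 1, 0]))
    # First latch first: the new bit races through the whole chain.
    print(apply_pulses(start, 1, [0, 1, 2, 3]))
```

With the last-latch-first order the input bit advances by exactly one position per cycle, whereas the reversed order lets it propagate through all four latches within a single cycle.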

Fig. 4 shows the timing simulation of the pulsed-clock shift register. The clock generator generates a clock pulse every 3.7 ns, which is connected to the clock input of each individual latch. These are the non-overlapping pulses. The input data at the Din terminal is shifted to each successive latch on every level-triggered clock pulse [1,4].

VII. CONCLUSION

A shift register is conventionally designed using edge-triggered flip-flops. Using latches in a shift register creates timing problems, which can be avoided by using pulsed-clock latches as the basic cell. This work discusses the schematic design and CMOS layout implementation of a pulse latch based shift register with optimized area and power. Only a small number of pulsed clock signals is needed, because the latches are grouped into several sub shift registers and additional pulse clock generator logic is used. The simulation analysis shows that the area of a latch in 50 nm technology is 2.8125 μm$^2$. The power dissipation of the 256-bit shift register is computed as 41.3 μW, with an area of 3162 μm$^2$. The use of transmission gates not only reduces the number of transistors but also reduces the power dissipation of the design. The timing problem of the pulse latch based shift register is resolved by using non-overlapping clock pulses generated by the clock pulse generator; the latches are synchronized through these pulses instead of a single common clock signal. A 256-bit pulsed-clock shift register requires 256 latches and a clock pulse generator producing four clock pulses.

REFERENCES


AUTHOR’S PROFILE

Ajaz Ahmad
Bachelor Of Engineering In Electronics and Communication (Honors Degree), R.G.T.U., BHOPAL
Quantum Physics, I.I.T., MADRAS
M.Tech (VLSI), R.K.D.F.I.S.T, BHOPAL