

# Low-power pipelined phase accumulator using CMOS-CML hybrid F/Fs for pre-skewing operation

Yun-Hwan Jung, Yong Sin Kim, Yohan Hong, Ju Eon Kim, and Kwang-Hyun Baek$^{a)}$

School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 156–756, South Korea

a) kbaek@cau.ac.kr

**Abstract:** In this paper, a low-power pipelined phase accumulator (PACC) for high-speed direct digital frequency synthesizers (DDFSs) is presented. In the proposed PACC structure, the accumulator core block and the post-skewing block are based on current mode logic (CML) for high-speed operation, whereas the pre-skewing block consists of static CMOS D-F/Fs and CMOS-CML hybrid F/Fs for low-power operation. The proposed CMOS-CML hybrid F/F provides fast level conversion (CMOS to CML) by using separate current sources, and it reduces power dissipation by activating those current sources sequentially. Simulation results show that the proposed 24-bit PACC reduces power consumption by 31% compared with a conventional pipelined architecture when the input data is updated every eight clock cycles. The operating speed of the proposed PACC is 45% faster than that of the conventional pipelined PACC at the same power dissipation.

**Keywords:** accumulator, Direct Digital Frequency Synthesizer (DDFS)

**Classification:** Integrated circuits

## References

- [1] H. C. Yeoh, J.-H. Jung, Y.-H. Jung and K.-H. Baek: IEEE J. Solid-State Circuits 45 [9] (2010) 1845.
- [2] C.-Y. Yang, J.-H. Weng and H.-Y. Chang: IEEE J. Solid-State Circuits 46 [9] (2011) 2064.
- [3] Y. S. Kim and S.-M. Kang: Dig. IEEE MTT-s Int. Symp. Microwave (2006) 502.
- [4] T. Yoo, S.-J. Cho, J. W. Lee and K.-H. Baek: IET Electronics Letters 48 [18] (2012) 1102.
- [5] Y.-H. Jung, T. Yoo, S.-J. Cho and K.-H. Baek: IET Electronics Letters 48 [17] (2012) 1044.
- [6] P. Heydari and R. Mohanavelu: IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 12 [10] (2004) 1081.





## 1 Introduction

Direct digital frequency synthesizers (DDFSs) play an important role in modern digital communications because they offer significant advantages over traditional phase-locked loops, such as fine frequency resolution, wide tuning bandwidth, and fast frequency switching [1, 2, 3]. However, high power consumption is one of the drawbacks of DDFSs for monolithic microwave integrated circuit (MMIC) applications (>5 GHz) [2]. A traditional DDFS comprises a phase accumulator (PACC), a phase-to-amplitude mapper (P2AM), and a digital-to-analog converter (DAC). Among these basic building blocks, the P2AM has been the most power-hungry because of its complex digital circuitry and large number of D flip-flops (D-F/Fs) [4, 5]. Recent DDFS developments, however, simplify the P2AM block and thereby reduce its power consumption, which makes the PACC's share of the total power relatively larger [1, 2, 3]. Therefore, reducing the power of the PACC block is a significant step toward a low-power DDFS [3, 4, 5]. In this paper, a pre-skewing block using both static CMOS F/Fs and CMOS-CML hybrid F/Fs is proposed to reduce the power dissipation while maintaining high-speed operation in the PACC.

## 2 Previous pipelined PACCs

In high-speed DDFSs, most PACCs adopt a pipelined structure to resolve the speed bottleneck caused by carry propagation [1]. Fig. 1 shows a conventional pipelined PACC with an N-bit frequency control word (FCW) input, a K-bit truncated phase output, and an M-bit accumulator for a single stage (where N=24, K=12, and M=4). The conventional pipelined PACC has three functional components: a pre-skewing block, an accumulator (ACC) core block, and a post-skewing block. The pre-skewing block synchronizes the FCW inputs with the carry input of each stage. The accumulator core block then calculates the phase data, and the post-skewing block synchronizes the truncated phase outputs. Many high-speed PACCs are based on the CML design topology because of its advantages over static CMOS logic in operating speed and supply noise [4, 5]. For slow FCW updates, however, a CML-based pre-skewing block causes unnecessary static power consumption, because its D-F/Fs draw current even during periods when the FCW is not updated. To reduce the power consumption of CML-based D-F/Fs, a tail current control scheme has been proposed that controls the static current of the CML circuits in the pre-skewing block [4]. However, this architecture inherently requires a large area. The PACC in [5] reduces power consumption by minimizing the number of D-F/Fs in the pre-skewing block, but it still dissipates static power because of its CML-based D-F/Fs. To overcome these problems, this paper proposes a low-power, high-speed PACC that is suitable for high-performance DDFSs.
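The role of the pre-skewing and post-skewing blocks can be illustrated with a small behavioral model. The sketch below is not the circuit of Fig. 1; it is a cycle-level Python model written under the assumption that stage i receives its M-bit FCW slice delayed by i clock cycles and that its output is delayed by (number of stages − 1 − i) cycles. The names `make_delay`, `pipelined_pacc`, and `flat_pacc` are hypothetical, and for simplicity the model post-skews all stages and then truncates to K bits.

```python
from collections import deque


def make_delay(depth):
    """Return a function that delays its input by `depth` clock cycles."""
    line = deque([0] * depth)

    def step(value):
        line.append(value)
        return line.popleft()

    return step


def pipelined_pacc(fcw_seq, n_bits=24, k_bits=12, m_bits=4):
    """Cycle-level model of a pipelined phase accumulator (assumed structure)."""
    stages = n_bits // m_bits
    mask = (1 << m_bits) - 1
    pre = [make_delay(i) for i in range(stages)]                 # pre-skewing
    post = [make_delay(stages - 1 - i) for i in range(stages)]   # post-skewing
    acc = [0] * stages     # M-bit sum register of each stage
    carry = [0] * stages   # registered carry-out of each stage
    phases = []
    for fcw in fcw_seq:
        next_acc, next_carry = [], []
        for i in range(stages):
            slice_i = pre[i]((fcw >> (i * m_bits)) & mask)
            # Carry-in is the carry registered by the previous stage one cycle ago.
            cin = carry[i - 1] if i > 0 else 0
            total = acc[i] + slice_i + cin
            next_acc.append(total & mask)
            next_carry.append(total >> m_bits)
        acc, carry = next_acc, next_carry
        word = 0
        for i in range(stages):
            word |= post[i](acc[i]) << (i * m_bits)
        phases.append(word >> (n_bits - k_bits))  # keep the K most significant bits
    return phases


def flat_pacc(fcw_seq, n_bits=24, k_bits=12):
    """Reference: a non-pipelined N-bit accumulator with K-bit truncated output."""
    acc, phases = 0, []
    for fcw in fcw_seq:
        acc = (acc + fcw) & ((1 << n_bits) - 1)
        phases.append(acc >> (n_bits - k_bits))
    return phases


if __name__ == "__main__":
    fcws = [0x710000] * 20
    latency = 24 // 4 - 1
    print(pipelined_pacc(fcws)[latency:] == flat_pacc(fcws)[:-latency])
```

In this model the pipelined output equals the flat accumulator output delayed by five clock cycles, even when the FCW changes every cycle. That latency is a property of the model's assumptions, not a figure reported in this paper.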







Fig. 1. Conventional pipelined phase accumulator based on CML with 24-bit Frequency Control Word (FCW) input, 12-bit truncated phase output, and 4-bit accumulator for a single stage (N=24, K=12, and M=4)

## 3 Proposed pipelined PACC

The proposed PACC adopts the sequential FCW loading scheme presented in [5] to reduce the number of pre-skewing F/Fs and thereby the chip area. However, unlike [5], which uses CML-based logic for all pre-skewing F/Fs, the pre-skewing block here combines static CMOS F/Fs with the CMOS-CML hybrid F/Fs proposed in this work to achieve both low-power and high-speed operation. Fig. 2(a) shows the structure of the proposed PACC, where N=24, K=12, and M=4. The LOAD signal is activated only when the FCW changes, which transfers the FCW inputs to the first column of 4-bit CMOS D-F/Fs in the pre-skewing block. Then, the loading signals ($LD_1$ to $LD_6$) and holding signals ($HD_1$ to $HD_6$) are sequentially activated by a register chain in the load signal generator to transfer the 4-bit FCW inputs to the corresponding 4-bit CMOS-CML hybrid F/Fs. The proposed CMOS-CML hybrid F/F is composed of a static CMOS latch as the master and a level converting latch as the slave, as shown in Fig. 2(b). The level converting latch has two separate current sources, $M_{S1}$ and $M_{S2}$, for the tracking and latching operations, respectively. The current sources in the level converting latch are turned on and off by the holding signal HD generated by the load signal generator. In the overall FCW loading operation, each HD signal is activated half a clock cycle earlier than the corresponding loading signal to provide enough time to establish the tail current for the tracking operation. Note that each sequentially activated HD signal lasts only one and a half clock cycles to save power during the latching operation. Fig. 2(b) also shows a simple timing diagram of the proposed hybrid F/F. The shaded period "A" is the pre-charging time for the tracking current source in the level converting latch. After the LD signal is activated, the FCW inputs are transferred to the accumulator core block at the next rising edge of $CK_p$. Therefore, an FCW loading event happens without the timing loss caused by the loading signals, unlike the PACC presented in [5]. Additionally, the tail current $I_{SS1}$ is set large enough to provide the transconductance required for high-speed tracking operation [6]. On the other hand, the tail current $I_{SS2}$ is set smaller than $I_{SS1}$ to decrease the power consumption in the latching operation.

Fig. 2. (a) Structure of the proposed PACC (N=24, K=12, and M=4) and the load signal generator (b) Schematic of the CMOS-CML hybrid F/F and timing diagram

Fig. 3 shows the structure of the 4-bit ACC and the schematics of the "carry" and "sum" outputs of a full adder based on Differential Cascode Voltage Switch Logic (DCVSL), which use 8 and 10 transistors for the current switches (excluding current sources), respectively. These circuits also achieve high-speed operation by reducing the number of transistors and the parasitic capacitance in the differential transistor pairs.
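As a logic-level illustration of the 4-bit ACC, the sketch below models a full adder by its standard Boolean "sum" and "carry" functions and chains four of them into one accumulator stage. It says nothing about the DCVSL transistor-level implementation (differential signaling, current switches, transistor counts); the function names and the LSB-first bit ordering are assumptions made for this example.

```python
def full_adder(a, b, cin):
    """Single-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def acc4_step(acc_bits, fcw_bits, cin):
    """One clock of a 4-bit accumulator stage.

    acc_bits and fcw_bits are lists of four bits, least significant bit first.
    Returns the new 4-bit sum (LSB first) and the carry-out to the next stage.
    """
    out = []
    carry = cin
    for a, b in zip(acc_bits, fcw_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry


def to_bits(value, width=4):
    """Integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list to integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    new_acc, cout = acc4_step(to_bits(0b1011), to_bits(0b0110), 1)
    print(from_bits(new_acc), cout)
```

Each pipeline stage registers `cout` and passes it to the next 4-bit stage on the following clock, which is why the carry never has to ripple through more than four bits in one cycle.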

## 4 Simulated results

Fig. 3. (a) Structure of 4-bit ACC and the schematic of (b) carry out and (c) sum for the full adder

The proposed pipelined PACC is simulated in a 55 nm standard CMOS technology. It is designed with a 24-bit FCW, a 12-bit phase output, and a 4-bit accumulator for each stage. The frequency resolution of the PACC is about 298 Hz at a 5 GHz clock ($=5 \text{ GHz} / 2^{24}$). The PACC consumes 35 mW when the FCW inputs are updated every eight clock cycles at a 5 GHz clock. Fig. 4 shows simulated time-domain output waveforms of the proposed PACC and DDFS when the FCW is set to $010000_{16}$ (19.4 MHz), $050000_{16}$ (96 MHz), and $710000_{16}$ (2.2 GHz) at intervals of 100 ns. Table I compares the proposed PACC architecture with previously reported CML-based architectures under the same condition (the FCW is updated every eight clock cycles at a 5 GHz clock). The proposed architecture reduces the power consumption by 31% and 13% compared with the conventional pipelined architecture in [1] and the FCW sequential loading scheme in [5], respectively. Note that the operating speed of the proposed PACC is 45% and 14% faster than those of the architectures in [1] and [5], respectively, when consuming the same power. It should also be noted that all transistors and resistors used in the full adders, pre-skewing block, accumulator cores, post-skewing block, and overhead circuits are taken into account in the "normalized area" in Table I. Fig. 5 shows the advantages of the proposed circuit over the other schemes in terms of power consumption versus operating speed.
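The frequency resolution and the output frequencies follow from the standard DDFS relation $f_{out} = \text{FCW} \times f_{clk} / 2^{N}$. The short sketch below evaluates it for the three FCW values above; the function name `ddfs_output_frequency` is hypothetical. The ideal values come out at roughly 19.5 MHz, 97.7 MHz, and 2.21 GHz, within about 2% of the rounded figures quoted in the text.

```python
def ddfs_output_frequency(fcw, f_clk_hz, n_bits=24):
    """Ideal DDFS output frequency in Hz for a given frequency control word."""
    return fcw * f_clk_hz / (1 << n_bits)


F_CLK_HZ = 5_000_000_000  # 5 GHz clock, as used in the simulation

# An FCW of 1 gives the frequency resolution (about 298 Hz).
resolution_hz = ddfs_output_frequency(1, F_CLK_HZ)

if __name__ == "__main__":
    print(f"resolution: {resolution_hz:.2f} Hz")
    for fcw in (0x010000, 0x050000, 0x710000):
        f_out = ddfs_output_frequency(fcw, F_CLK_HZ)
        print(f"FCW = {fcw:06X} (hex): {f_out / 1e6:.2f} MHz")
```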



Fig. 4. Time-domain waveforms of the proposed PACC and DDFS for output frequency switching when FCW is adjusted for 19.4 MHz, 96 MHz and 2.2 GHz





**Table I.** Comparisons for PACC architectures based on CML design topology with 24-bit FCW and 12-bit phase output under the same conditions (5 GHz operating frequency, FCW update rate of every eight clock cycles)

| Accumulator architecture | # of full adders | # of D-F/Fs (total) | D-F/Fs: pre-skewing | D-F/Fs: inside 4-bit ACCs | D-F/Fs: post-skewing | Additional circuits (overhead) | Normalized power consumption | Normalized operating speed | Normalized area** |
|---|---|---|---|---|---|---|---|---|---|
| Conventional pipelined [1] | 24 | 126 | 84 | 30 | 12 | None | 1.00 | 1.00 | 1.00 |
| Pre-skewing register reduction [3] | 24 | 102 | 60 | 30 | 12 | D-F/F: 3, AND: 1 | 0.96 | 1.16 | 0.97 |
| FCW sequential loading scheme [5] | 24 | 90 | 48 | 30 | 12 | D-F/F: 6 | 0.79 | 1.27 | 0.80 |
| Proposed | 24 | 90 | 48* | 30 | 12 | D-F/F: 6, OR: 6 | 0.69 | 1.45 | 0.78 |

\* 24 static CMOS D-F/F and 24 CMOS-CML hybrid F/F

\*\* Including resistors and transistors
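The percentages quoted in the text can be reproduced from the normalized values in Table I. The sketch below is only a consistency check; the dictionary keys and helper names are hypothetical, and the numbers are copied from the table.

```python
# Normalized power and speed from Table I (conventional pipelined [1] = 1.00).
TABLE_I = {
    "conventional [1]": {"power": 1.00, "speed": 1.00},
    "sequential loading [5]": {"power": 0.79, "speed": 1.27},
    "proposed": {"power": 0.69, "speed": 1.45},
}


def power_saving_pct(reference, candidate):
    """Percentage power reduction of `candidate` relative to `reference`."""
    ref, new = TABLE_I[reference]["power"], TABLE_I[candidate]["power"]
    return 100.0 * (1.0 - new / ref)


def speedup_pct(reference, candidate):
    """Percentage speed increase of `candidate` relative to `reference`."""
    ref, new = TABLE_I[reference]["speed"], TABLE_I[candidate]["speed"]
    return 100.0 * (new / ref - 1.0)


if __name__ == "__main__":
    for ref in ("conventional [1]", "sequential loading [5]"):
        print(ref,
              round(power_saving_pct(ref, "proposed")),
              round(speedup_pct(ref, "proposed")))
```

Rounded to the nearest percent, these ratios give the 31% and 13% power reductions and the 45% and 14% speed improvements stated for [1] and [5], respectively.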





## 5 Conclusion

This paper presents a low-power PACC for high-speed DDFSs that uses both static CMOS D-F/Fs and the proposed CMOS-CML hybrid F/Fs in the pre-skewing block. Compared with a conventional pipelined PACC, the proposed PACC reduces the power consumption by 31% at a 5 GHz clock when the FCW is updated every eight clock cycles. Equivalently, the proposed PACC operates 45% faster than the conventional pipelined PACC at the same power dissipation. Furthermore, the proposed CMOS-CML hybrid F/F quickly converts CMOS-level signals to CML-level signals without timing loss. Therefore, the proposed PACC can be employed as a low-power solution for highly pipelined, high-speed DDFS applications.

## Acknowledgments

This research was supported by the Chung-Ang University Excellent Student Scholarship Grants in 2010, and by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC program (NIPA-2013-H0301-13-1013) supervised by the NIPA (National IT Industry Promotion Agency).

