# High-Speed DACs for Millimeter-Wave Digital Arrays in FinFET CMOS

Boris Murmann, Pietro Caragiulo, Richelle Smith

Department of Electrical Engineering, Stanford University, Stanford, CA 94305

Email: murmann@stanford.edu

*Abstract*—This paper describes ongoing work toward a FinFET-friendly implementation of high-speed D/A converters for mm-wave arrays. The proposed architecture is based on massively parallel charge redistribution and separates level generation, pulse timing, and output power generation. This approach aims to maximally leverage the advantages of FinFET technology (transistor speed and density) while mitigating its main shortcomings (poor interconnect delay and current handling). As a proof of concept, we will design 6-8 bit prototypes operating at up to 120 GS/s in 16 nm FinFET CMOS.

#### Keywords—digital-to-analog converter; mm-wave; transmitter

## I. INTRODUCTION

DARPA's Millimeter-Wave Digital Arrays (MIDAS) program will push the frontiers of digital mm-wave transceiver technology by leveraging the integration density and speed of advanced CMOS process technology. Among the key challenges of this program are the data converters that bridge the digital signal processing modules with the mm-wave transmit/receive circuitry. The work described in this paper tackles the DAC problem, and specifically looks into circuit architectures that leverage the strengths of FinFET CMOS technology.

An important driving factor in our research is that DAC architecture design has not kept up with CMOS technology scaling. As illustrated in Fig. 1, the relative speed of wires and transistors has diverged by 3-6 orders of magnitude in the past two decades [1]. As a result, intrinsic device speed is no longer a significant limitation in FinFET technology at 16 nm and below. Instead, device interconnect upper-bounds the performance of most high-performance converters. While ADC designers have significantly adjusted their architectures to benefit from the speed and integration density of FinFET technology [2], high-speed DACs have essentially not changed from the days when they were mainly limited by transistors.



Fig. 1. Divergence of gate delay (transistor speed) and wire delay [1].

The current-steering architecture has been the dominant high-speed D/A conversion approach since the 1970s. From today's perspective, the main issue with current steering is that it lumps all critical operations into a single pair of nodes. These are: (1) level generation, (2) pulse timing, (3) output matching and output power generation. The current steering device array carries the full output current (often tens of mA) and must switch its hundreds of unit elements with sub-picosecond timing accuracy for linear wideband operation.
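
To give a rough sense of scale for the timing requirement, the sketch below uses the standard jitter-limited SNR relation, $\mathrm{SNR} = -20\log_{10}(2\pi f \sigma_t)$, to estimate the RMS timing error at which timing noise alone would equal the ideal quantization noise. The 60 GHz output frequency (Nyquist for a 120 GS/s converter) is an assumption made for illustration, and treating unit-element timing mismatch as random jitter is a simplification, not an analysis taken from this work.

```python
import math

def ideal_snr_db(bits):
    """Quantization-limited SNR (dB) of an ideal converter with a full-scale sine."""
    return 6.02 * bits + 1.76

def max_rms_timing_error(snr_db, f_out_hz):
    """RMS timing error (s) at which the jitter-limited SNR equals snr_db.

    Inverts the standard relation SNR = -20*log10(2*pi*f*sigma_t).
    """
    return 10 ** (-snr_db / 20) / (2 * math.pi * f_out_hz)

F_OUT = 60e9  # assumed output frequency in Hz (Nyquist at 120 GS/s)

for bits in (6, 7, 8):
    sigma_t = max_rms_timing_error(ideal_snr_db(bits), F_OUT)
    print(f"{bits} bit: sigma_t <= {sigma_t * 1e15:.1f} fs")
```

Under these assumptions the allowable timing error lands in the tens of femtoseconds or below, well inside the sub-picosecond regime.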

While digital timing accuracy has improved in modern processes, the need to switch large currents has detrimental effects in FinFET technologies due to their ultra-thin metal layers and large RC products [3]. Driving large currents dictates large transistor arrays with substantial area and long wires for clocking, data distribution, and current combining. Consequently, the DAC becomes a large, inefficient, and poorly timed wire harness in which the transistors play only a minor role and the strengths of modern CMOS technology (transistor speed and density) are not optimally exploited. Most significantly, the outstanding intrinsic timing accuracy of the FinFET transistor is destroyed by mismatches and unavoidable irregularities in the wire harness.

The remainder of this paper is structured as follows. Section II describes our technical approach and the proposed DAC architecture. Initial simulation results are presented in Section III and plans for a prototype IC are discussed in Section IV.

## II. TECHNICAL APPROACH

Our prior work has identified the above-described problem and proposed a new DAC architecture that separates level generation, timing and output power generation [4] (see Fig. 2). The DAC levels are generated using time-interleaved passive switched-capacitor (SC) circuits, which can be scaled down to the femtofarad level without affecting the linearity and noise performance at the 6-8 bit target performance. Thus, as already proven in high-speed SAR ADCs, very small amounts of power and area are needed for level generation.
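
As a first-order plausibility check of the noise claim, the sketch below compares the kT/C noise of a capacitor array with the quantization noise at 6 to 8 bits. The total capacitance uses the slice dimensions reported in Section III (256 unit capacitors of 0.5 fF each); the 0.8 V full-scale range and the 300 K temperature are assumptions chosen for illustration, and a real SC DAC may see a small multiple of kT/C depending on its switching scheme.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K

def ktc_noise_vrms(cap_farads, temp_kelvin=300.0):
    """RMS thermal noise voltage sampled onto a capacitor (first-order kT/C)."""
    return math.sqrt(K_BOLTZMANN * temp_kelvin / cap_farads)

def quantization_noise_vrms(full_scale_v, bits):
    """RMS quantization noise, LSB/sqrt(12), for a uniform quantizer."""
    lsb = full_scale_v / 2 ** bits
    return lsb / math.sqrt(12)

C_TOTAL = 256 * 0.5e-15   # slice capacitance from Section III: 128 fF
FULL_SCALE = 0.8          # assumed full-scale range in volts (hypothetical)

v_thermal = ktc_noise_vrms(C_TOTAL)
print(f"kT/C noise at {C_TOTAL * 1e15:.0f} fF: {v_thermal * 1e3:.3f} mV rms")
for bits in (6, 7, 8):
    v_q = quantization_noise_vrms(FULL_SCALE, bits)
    print(f"{bits} bit: quantization noise {v_q * 1e3:.3f} mV rms, "
          f"ratio to kT/C {v_q / v_thermal:.1f}x")
```

With these assumed values the thermal noise remains below the quantization noise even at 8 bits, which is consistent with the statement that femtofarad-level capacitors suffice at this resolution.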

To generate the final DAC output, the sub-DAC signals are combined and re-timed at nodes with low driving capability (small switches, compact layout). Power amplification and impedance reduction follow as the last step, after level generation and timing have been dealt with. Our proof-of-concept work [4] was realized in a trailing-edge 90 nm process (in 2009) and targeted higher resolution (12 bits) than required for mm-wave applications. Our work will thus focus on customizing this architecture for MIDAS and studying its fundamental speed and power limits in FinFET CMOS technology.

DISTRIBUTION STATEMENT A. Approved for public release: distribution is unlimited.



Fig. 2. Time-interleaved switched-capacitor DAC concept [4].

In summary, the envisioned DAC approach will combine the following design guidelines:

- 1) The DAC area must be aggressively minimized to enable short wiring distances. Similar to common practice for massively interleaved ADCs [2], a "wire-first design methodology" must be followed to construct the architecture.
- 2) The currents in level- and timing-critical transistors must be kept as small as possible to benefit from the fast switching speeds of FinFETs. Power gain should be applied after the levels and timing of the output signals have been formed.
- 3) Just as in state-of-the-art ADCs running at ~100 GS/s, massive time interleaving should be used to leverage the immense integration density available in FinFET technology. As we have learned from ADC design, aggregating throughput at less than 1/10th of the process limit is more efficient than having each sub-circuit operate near its absolute limit.

Recent work by IBM Research has already shown the promise of a more digital-friendly approach [5], and has led to an 8-bit, 56 GS/s, 14 nm wireline communication DAC that measures only about 200 $\mu$m on a side. However, this DAC does not meet the MIDAS power and linearity targets, and its scalability is limited by its bulky output summation node. Here, we take a holistic approach to fundamentally revisiting the entire problem stack, from FinFET process constraints to a system-optimized mm-wave DAC implementation.

## III. INITIAL SIMULATION RESULTS

Initial simulations were carried out to understand where the speed limitations in a FinFET process come from. First, we looked at the minimum delay achievable when switching a resistive load, and then we studied the speed/resolution limitations of a binary weighted capacitive DAC.

Fig. 3 shows a test bench consisting of a simple common-source stage driving a load resistance R<sub>LOAD</sub>. The MOSFET gate is driven by an ideal full-swing pulsed voltage source. The device is sized to maintain the same output swing for varying R<sub>LOAD</sub>; its size is adjusted by changing the number of fingers (nf) while keeping the number of fins (nfin) unchanged. To minimize the effect of extrinsic device capacitances seen for small values of nfin and to reduce the lower metal stack series resistance seen for larger nfin, the number of fins is set to 4. As the load resistance approaches 50 $\Omega$, the device size and consequently the output node capacitance must increase. The BSIM model predicts that the RC product at the output node, and hence the delay, remains constant as the device size is increased (blue line). However, post-layout simulations (red and purple lines) show a significant increase of the delay due to parasitic series resistance (see Fig. 4 for a resistance heat map generated by ParagonX [6]) and parasitic shunt capacitors. Using different device aspect ratios or higher metal layers can reduce the parasitic series resistance or the parasitic shunt capacitance but does not affect the overall delay trend. This simple experiment shows that switching a 50 $\Omega$ load in a FinFET technology has detrimental effects on circuit speed.
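
The qualitative trend can be reproduced with a first-order toy model, sketched below. It is not the simulation setup used here: the function `output_time_constant`, the 1 ps intrinsic RC product, and the fixed 20 $\Omega$ parasitic series resistance are all hypothetical placeholders. The model assumes that constant output swing makes the device width, and hence its capacitance, scale as 1/R<sub>LOAD</sub>, and that a series resistance which does not shrink with device size adds to the resistance seen by that capacitance.

```python
def output_time_constant(r_load, r_series=0.0, rc_intrinsic=1e-12):
    """First-order output-node time constant in seconds (toy model).

    r_load       load resistance in ohms
    r_series     assumed fixed parasitic series resistance in ohms (hypothetical)
    rc_intrinsic assumed constant R_LOAD*C product of the ideal device in seconds
    """
    # Constant swing: device width, and thus capacitance, scales as 1/r_load.
    c_out = rc_intrinsic / r_load
    # The capacitance discharges through the load plus the series parasitic.
    return (r_load + r_series) * c_out

for r_load in (800, 400, 200, 100, 50):
    ideal = output_time_constant(r_load)
    with_series = output_time_constant(r_load, r_series=20.0)
    print(f"{r_load:4d} ohm: ideal {ideal * 1e12:.2f} ps, "
          f"with series R {with_series * 1e12:.2f} ps")
```

In this model the ideal time constant stays flat, while the version with a fixed series resistance grows as the load approaches 50 $\Omega$, mirroring the divergence between the schematic-level and post-layout curves.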



Fig. 3. Device speed as a function of resistive load.



Fig. 4. Cumulative resistance heat map (drain net).

The layout of a slice of the time-interleaved switched capacitor core is shown in Fig. 5. The layout consists of 256 unit capacitors (0.5 fF each) and inverter drivers. All capacitors were implemented using MOM structures. The interconnect geometry is chosen to minimize the RC product of all nodes in the circuit. We evaluated the performance of this circuit at the schematic and post-layout level with a single-tone input at different sampling frequencies.

Simulation results for a sampling frequency of 14 GHz are shown in Figs. 6 and 7. The post-layout simulation shows a degradation of about 5 dB compared to the simulation at the schematic level. This is due to spurs in the spectrum caused by larger time-domain glitches in the post-layout waveform. These glitches are mainly due to timing mismatch between the assertion of different bits and differences in the charge packet source/sink speed. In a full implementation of the DAC, this issue is mitigated by subsequent re-timing, as indicated in Fig. 2.



Fig. 5. Layout of a single slice of the time-interleaved SC core.



Fig. 6. Schematic-level simulation results for one slice of the time-interleaved switched capacitor core for a single-tone input and a sampling frequency of 14 GHz.



Fig. 7. Post-layout simulation results for one slice of the time-interleaved switched capacitor core for a single-tone input and a sampling frequency of 14 GHz.

Fig. 8 shows the single-tone SNDR performance (within $0 \ldots f_s/2$) at different sampling frequencies obtained from simulations at the schematic and post-layout level. Post-layout simulations show that the average power consumption of the time-interleaved switched capacitor core at a sampling frequency of 14 GHz is 0.5 mW (schematic level average power consumption is 0.3 mW).
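
For reference, the reported power numbers can be converted into energy per output sample, and the roughly 5 dB post-layout SNDR degradation into an equivalent loss in effective number of bits (ENOB), using the standard relation ENOB = (SNDR − 1.76)/6.02. The short sketch below only performs this arithmetic on values quoted in the text; the helper names are illustrative.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

def energy_per_sample(power_w, sample_rate_hz):
    """Average energy spent per output sample, in joules."""
    return power_w / sample_rate_hz

FS = 14e9  # sampling frequency from the text, in Hz

e_post_layout = energy_per_sample(0.5e-3, FS)  # 0.5 mW post-layout
e_schematic = energy_per_sample(0.3e-3, FS)    # 0.3 mW schematic level

# A 5 dB SNDR drop maps to the same ENOB loss at any absolute SNDR.
enob_loss = 5.0 / 6.02

print(f"post-layout: {e_post_layout * 1e15:.1f} fJ per sample")
print(f"schematic:   {e_schematic * 1e15:.1f} fJ per sample")
print(f"ENOB loss for a 5 dB SNDR drop: {enob_loss:.2f} bit")
```

This corresponds to a few tens of femtojoules per sample for the SC core, with the post-layout degradation costing a little under one effective bit before re-timing.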



Fig. 8. SNDR performance across sampling frequency. The post-layout degradation is mainly due to glitches that are absorbed by the re-timing in the full circuit of Fig. 2.

## IV. PROTOTYPE ICS

We will design and experimentally validate multiple DAC prototypes (see Fig. 9) in 16 nm FinFET CMOS. The initial designs will operate from 7.5 to 30 GS/s for baseband and IF signal generation, while our final target will explore direct mm-wave synthesis at 120 GS/s. To achieve this update rate, we will employ a hold-interleaving scheme that enables output updates every 8.3 ps, which is not possible using conventional multiplexing methods. All DACs are always active and sum up to the desired output, and the signal is generated by a digitally driven imbalance between the positive and negative DAC sides. An inherent advantage of this scheme is that it is insensitive to inter-channel offset, which is among the most critical issues in spur management for interleaved wideband systems [7]. Fig. 10 shows preliminary simulation results that highlight this advantage compared to a standard multiplex-based interleaving scheme.
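
The offset argument can be illustrated with a minimal behavioral sketch. It is not the simulation behind Fig. 10: the interleaving factor of 16, the random seed, and the simplified model (only the offset contribution to the output is tracked) are assumptions made for illustration. In a multiplexed scheme, only one sub-DAC drives the output at a time, so the offset error is periodic with the interleaving factor and produces spurs. When all sub-DACs are summed at every update, their offsets add up to a constant that contains no spur energy.

```python
import random
import statistics

M = 16            # assumed number of interleaved sub-DACs (hypothetical)
N = 1024          # number of simulated output updates
SIGMA_LSB = 5.0   # offset standard deviation in LSB, as quoted for Fig. 10

rng = random.Random(0)  # fixed seed so the sketch is repeatable
offsets = [rng.gauss(0.0, SIGMA_LSB) for _ in range(M)]

def mux_offset_error(n):
    """Multiplexed interleaving: only sub-DAC n mod M drives the output."""
    return offsets[n % M]

def summed_offset_error(n):
    """Summing scheme: every sub-DAC contributes at every update."""
    return sum(offsets)

mux_err = [mux_offset_error(n) for n in range(N)]
sum_err = [summed_offset_error(n) for n in range(N)]

# The time-varying (AC) part of the error is what shows up as spurs.
mux_ac_rms = statistics.pstdev(mux_err)
sum_ac_rms = statistics.pstdev(sum_err)

update_period_ps = 1e12 / 120e9  # update interval at 120 GS/s

print(f"update period: {update_period_ps:.2f} ps")
print(f"multiplexed scheme, AC offset error: {mux_ac_rms:.2f} LSB rms")
print(f"summing scheme, AC offset error:     {sum_ac_rms:.2f} LSB rms")
```

In this simplified model the summing scheme leaves only a static DC shift, whereas the multiplexed scheme retains an offset pattern that repeats every `M` samples.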

The capability of direct mm-wave signal synthesis will open up new opportunities for future systems with multi-band transmission. The envisioned ultra-wideband DAC will therefore serve a variety of related applications within the DoD and commercial wireline communication systems.



Fig. 9. Overview of planned DAC prototype designs.



Fig. 10. Behavioral simulation showing the SNDR versus input frequency for a single-tone input and a sampling frequency of 120 GHz. Each DAC has a random offset with $\sigma$ equal to 5 LSB.

## ACKNOWLEDGMENTS

This work is supported by the Defense Advanced Research Projects Agency (DARPA) under Grant FA8650-18-1-7895. R.S. is supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 1656518. P.C. and R.S. are supported by the William R. Hewlett and Sang Samuel Wang Stanford Graduate Fellowship.

## REFERENCES

- [1] M. Badaroglu, "More Moore scaling: opportunities and inflection points," in *ITRS/ERD Workshop*, 2015.
- [2] B. Murmann, "The successive approximation register ADC: A versatile building block for ultra-low-power to ultra-high-speed applications," *IEEE Commun. Mag.*, vol. 54, no. 4, 2016.
- [3] C. Hou, "A smart design paradigm for smart chips," in *ISSCC Dig. Tech. Papers*, 2017, pp. 8–13.
- [4] C. Daigle, A. Dastgheib, and B. Murmann, "A 12-bit 800-MS/s switched-capacitor DAC with open-loop output driver and digital predistortion," in *Proc. ASSCC*, 2010, pp. 1–4.
- [5] C. Menolfi *et al.*, "A 112Gb/s 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS," in *ISSCC Dig. Tech. Papers*, 2018, pp. 104–105.
- [6] Diakopto Inc., "ParagonX Layout Parasitic Visualization Tool." https://diakopto.com/.
- [7] K. Doris, E. Janssen, C. Nani, A. Zanikopoulos, and G. van der Weide, "A 480 mW 2.6 GS/s 10b Time-Interleaved ADC With 48.5 dB SNDR up to Nyquist in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2821–2833, Dec. 2011.