# A High-Efficiency 142–182-GHz SiGe BiCMOS Power Amplifier With Broadband Slotline-Based Power Combining Technique

Xingcun Li<sup>®</sup>, Graduate Student Member, IEEE, Wenhua Chen<sup>®</sup>, Senior Member, IEEE, Shuyang Li<sup>®</sup>, Graduate Student Member, IEEE, Yunfan Wang<sup>®</sup>, Member, IEEE, Fei Huang, Student Member, IEEE, Xiang Yi<sup>®</sup>, Senior Member, IEEE, Ruonan Han<sup>®</sup>, Senior Member, IEEE, and Zhenghe Feng, Fellow, IEEE

Abstract-In this article, a high-efficiency broadband millimeter-wave (mm-Wave) integrated power amplifier (PA) with a low-loss slotline-based power combining technique is proposed. The proposed slotline-based power combiner consists of grounded coplanar waveguide (GCPW)-to-slotline transitions and folded slots to simultaneously achieve power combining and impedance matching. This technique provides a broadband parallel-series combining method to enhance the output power of PAs at mm-Wave frequencies while maintaining a compact area and high efficiency. As a proof of concept, a compact four-to-one hybrid power combiner is implemented in a 130-nm SiGe BiCMOS back-end-of-line (BEOL) process, which leads to a small die area of 126 $\mu$m $\times$ 240 $\mu$m and a low measured insertion loss of 0.5 dB. The 3-dB bandwidth is over 80 GHz, covering the whole G-band (140-220 GHz). Based on this structure, a high-efficiency mm-Wave PA has been fabricated in the 130-nm SiGe BiCMOS technology. The three-stage PA achieves a peak power gain of 30.7 dB, a 3-dB small-signal gain bandwidth of 40 GHz from 142 to 182 GHz, a measured maximum saturated output power of 18.1 dBm, and a peak power-added efficiency (PAE) of 12.4% at 161 GHz. The extremely compact power combining methodology leads to a small core area of 488 $\mu$m $\times$ 214 $\mu$m and an output power per unit die area of 662 mW/mm<sup>2</sup>.

*Index Terms*—Broadband power combiner, grounded coplanar waveguide (GCPW), high-efficiency, hybrid power combining, mm-Wave power amplifier (PA), SiGe, slotline, transformer.

Manuscript received March 20, 2021; revised June 19, 2021 and July 31, 2021; accepted August 16, 2021. Date of publication September 1, 2021; date of current version January 28, 2022. This article was approved by Associate Editor Payam Heydari. This work was supported in part by the National Key Research and Development Program of China under Grant 2019YFB2204701, in part by the National Natural Science Foundation of China under Grant 61941103, and in part by the National Key Research and Development Program of China under Grant 2020YFB1805004. (*Corresponding author: Wenhua Chen.*)

Xingcun Li, Wenhua Chen, Shuyang Li, Yunfan Wang, Fei Huang, and Zhenghe Feng are with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: chenwh@mail.tsinghua.edu.cn).

Xiang Yi is with the School of Microelectronics, South China University of Technology, Guangzhou 510641, China, and also with the Pazhou Laboratory, Guangzhou 510663, China.

Ruonan Han is with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2021.3107428.

Digital Object Identifier 10.1109/JSSC.2021.3107428

#### I. INTRODUCTION

With the rapid growth in demand for high-speed and high-resolution wireless systems, millimeter-wave (mm-Wave) technology is expected to offer great potential through abundant bandwidth and compact transceiver cells. Over the last decade, numerous highly integrated mm-Wave transceivers, especially those operating beyond 100 GHz, have been reported [1]-[10]. They are attractive for various applications, such as high-data-rate communication, imaging/radar, spectroscopy for material analysis, and biomedical detection. Power amplifiers (PAs) in the radio frequency (RF) front-end circuits need to produce larger output power to overcome the high propagation loss and satisfy the link budget. However, silicon-based mm-Wave PAs encounter several challenges, such as the limited $f_t/f_{max}$ of transistors, the low breakdown voltage of CMOS/SiGe transistors, and the considerable loss of the passive network. Thus, efficient and compact power combining is critical for silicon-based mm-Wave PAs, especially in phased-array transmitters.

In general, mm-Wave power combining techniques are categorized as: 1) stacking of multi-finger transistors; 2) spatial combining techniques; and 3) on-chip power combining techniques. The transistor stacking technique has been discussed in related literature over the past few decades [11]-[17]. Unfortunately, as the number of stacked transistors and the operating frequency increase, this method leads to increased complexity and limited power enhancement capability due to parasitics. The spatial combining technique has strong potential for high combined power and efficiency through co-design with antennas [18]-[21]. However, due to large substrate loss and antenna size, spatial combining distribution networks on silicon are usually less efficient and costly [22]. On-chip combining techniques based on transmission lines (T-lines) [23]-[36] and transformers [37]-[46] are commonly used to produce greater RF power in silicon-based processes. Conventional binary-combiner structures, such as the zero-degree combiner [28], coupler combiner [29], and Wilkinson combiner [30], suffer from efficiency degradation with increasing numbers of devices and occupy large areas. Three-conductor T-line-based combiners were recently introduced in multi-layer technology and became a popular technique in various mm-Wave designs [33]–[36]. With their effective absorption of the matching network, transformer-based combiners at mm-Wave frequencies occupy a smaller die area. However, they also result in lower inductor Q, larger interwinding parasitic capacitance, and worse port balance.

0018-9200 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

See https://www.ieee.org/publications/rights/index.html for more information.

Fig. 1. Two factors affect the power combining loss.

In this article, a novel on-chip power-combining technique based on a slotline power combiner is proposed, which consists of grounded coplanar waveguide (GCPW)-to-slotline transitions and folded slots to achieve power combining and impedance matching simultaneously. A four-to-one parallel-series power combiner is implemented in a 130-nm SiGe BiCMOS back-end-of-line (BEOL) process with a compact area of 126 $\mu$m $\times$ 240 $\mu$m and a low power combining loss of less than 1 dB. The 3-dB bandwidth is over 80 GHz, covering the whole G-band (140-220 GHz). Furthermore, compared with transformer-based combiners, this structure has an excellent amplitude/phase balance since the undesired common-mode output is suppressed over the entire operating bandwidth. Based on this combining technique, a compact mm-Wave PA prototype is proposed to achieve high output power and high efficiency. A 142-182-GHz PA for D- and G-band transceivers with a peak power gain of 30.7 dB has been realized in a 130-nm SiGe BiCMOS technology. The PA produces 18.1-dBm saturated output power $(P_{SAT})$ with a peak power-added efficiency (PAE) of 12.4% at 161 GHz.

This article is organized as follows. In Section II, transformer-based power combining techniques are reviewed. The principles of the proposed power-combining are described in Section III. The design details of the power combiner and high-efficiency PA are given in Section IV. Section V presents the circuit implementation and experimental results. Finally, a conclusion is provided in Section VI.

#### II. REVIEW OF THE TRANSFORMER-BASED POWER COMBINING TECHNIQUE

#### A. Power Combining Loss

When the required output power is higher than the power that a single PA can provide, a low-loss power combiner becomes essential. As shown in Fig. 1, two factors affect the overall power combining loss from the output of the individual PA cores to the load.

Fig. 2. Traditional transformer-based power combining topologies. (a) Parallel combining topology. (b) Series combining topology. (c) Conventional equivalent circuit of the transformer.

The first factor is the reflection at the input port of the power combiner. The reflection coefficient ($\Gamma$) at each input port is given by [47]

$$\Gamma_i = \frac{Z_{\text{IN},i} - Z_{\text{OPT}}}{Z_{\text{IN},i} + Z_{\text{OPT}}^*} \tag{1}$$

where $Z_{IN,i}$ is the input load impedance of the combiner and $Z_{OPT}$ is the optimal load impedance for the PA cores. When $Z_{IN,i}$ deviates from $Z_{OPT}$, the output power of the PA cores cannot be fully delivered to the combining network because of the reflection, which degrades the overall efficiency. For a non-isolated combiner in particular, $Z_{IN,i}$ is not constant but depends on the excitation signal. In other words, if the ports of a non-isolated combiner are imbalanced while the same PA core is used at each port, an additional variation in $Z_{IN,i}$ is introduced, which increases the reflection.

The second factor is the passive efficiency ($\eta_{PASS}$) of the impedance-transforming power combiner. The passive dissipation losses mainly include conductor loss, dielectric loss, and radiation loss. If the IC technology does not provide more than one thick metal layer, the conductor loss of a thick-conductor-based combiner (such as a transformer-based combiner) becomes extremely large due to the thin metallization. Moreover, in most nanoscale silicon processes, layouts have to follow strict pattern density rules to ensure that the density of layout patterns stays within a certain range for each metal layer, which also introduces significant additional loss.

The total loss of the power combining technique can be calculated by

$$\text{Loss}_{\text{TOTAL}} = 10 \times \log_{10} \left( \sum_{i=1}^{N} P_{\text{AVS},i} \right) - P_{\text{OUT}}(\text{dBm}). \tag{2}$$
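The two loss factors can be combined numerically. The following is a minimal sketch, assuming that the power delivered into each port is $P_{\text{AVS},i}(1 - |\Gamma_i|^2)$ and that a single passive efficiency applies to the whole combiner; the function names and all numeric values are hypothetical and are not taken from the article.

```python
import math
from typing import Sequence


def reflection_coefficient(z_in: complex, z_opt: complex) -> complex:
    """Eq. (1): reflection coefficient at one combiner input port."""
    return (z_in - z_opt) / (z_in + z_opt.conjugate())


def mismatch_loss_db(gamma: complex) -> float:
    """Loss due to reflection, assuming delivered power = P_AVS * (1 - |gamma|^2)."""
    return -10.0 * math.log10(1.0 - abs(gamma) ** 2)


def total_loss_db(p_avs_mw: Sequence[float], p_out_dbm: float) -> float:
    """Eq. (2): summed available power (expressed in dBm) minus output power (dBm)."""
    return 10.0 * math.log10(sum(p_avs_mw)) - p_out_dbm


# Hypothetical example values (illustration only, not from the article).
z_opt = complex(10.0, 5.0)   # optimal load impedance of each PA core, ohms
z_in = complex(12.0, 3.0)    # impedance presented by the combiner port, ohms
eta_pass = 0.89              # assumed passive efficiency of the combiner
p_avs_mw = [20.0] * 4        # available power of four identical PA cores, mW

gamma = reflection_coefficient(z_in, z_opt)
p_out_mw = eta_pass * (1.0 - abs(gamma) ** 2) * sum(p_avs_mw)
p_out_dbm = 10.0 * math.log10(p_out_mw)

print(f"|Gamma|       = {abs(gamma):.3f}")
print(f"mismatch loss = {mismatch_loss_db(gamma):.2f} dB")
print(f"passive loss  = {-10.0 * math.log10(eta_pass):.2f} dB")
print(f"total loss    = {total_loss_db(p_avs_mw, p_out_dbm):.2f} dB")
```

Under these assumptions the total loss from (2) is simply the sum, in decibels, of the mismatch loss and the passive loss.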

#### B. Power Combining Transformer

For multi-way PA combining, transformer-based power combining techniques have attracted much attention in the field. Fig. 2 shows simplified traditional transformer-based combining topologies: 1) parallel combining [37] and 2) series combining [39], [40], where the optimum load resistance $R_{OPT}$ for each PA transferred from the load impedance $R_L$ can be expressed as [37]

$$R_{\rm OPT}^{\rm parallel} = \frac{MR_L}{n^2} \tag{3}$$

$$R_{\rm OPT}^{\rm series} = \frac{R_L}{Mn^2} \tag{4}$$



Fig. 3. (a) Conventional implemented transformer physical layout and parasitics of primary and secondary windings. (b) Example of the common-mode suppression versus  $C_m$  based on the model in Fig. 2(c).

where M is the number of PAs and n is the turns ratio of the transformer. In practice, the integrated transformer is implemented using conductors with non-ideal magnetic coupling. The simplified equivalent circuit is shown in Fig. 2(c), where $k_m$ is the coefficient of magnetic coupling, and $L_P$ and $L_S$ are the self-inductances of the primary and secondary windings, respectively. If only current combining or voltage combining is used for multi-way PA combining, more complicated routing and a large impedance transformation ratio are required, which increases the combining loss [41]. Fig. 3(a) shows a conventional transformer physical layout and parasitic effects in silicon processes, where $C_m$ represents the interwinding capacitance, which causes the undesired common-mode signal and noise to transfer to the output. For example, Fig. 3(b) shows that the common-mode suppression (mixed-mode S-parameter $S_{\rm CC21}$) deteriorates as the parasitic capacitance $C_m$ increases. In the series combining topology, the increasing parasitic capacitance between the primary and secondary windings reduces the signal current along the path and causes the impedances reflected from the output load back to each port to differ [41]. In [39], the difference in input impedance seen from the series combiner can be up to 40%. The output power and PAE decrease due to the impedance mismatch caused by the imbalance. The effect is exacerbated when the physical layout is scaled to operate at mm-Wave frequencies. It is also explained in [41] that series combining presents large mismatches and a combining loss higher than 2 dB at mm-Wave frequencies. Furthermore, the imbalances change dramatically with frequency and, ultimately, result in narrowband matching.
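To make the impedance transformation ratios in (3) and (4) concrete, the sketch below evaluates both expressions for a 50-Ω load. The function names and the example values of M and n are hypothetical choices for illustration.

```python
def r_opt_parallel(r_load: float, m: int, n: float) -> float:
    """Eq. (3): load seen by each PA in an M-way parallel combiner."""
    return m * r_load / n ** 2


def r_opt_series(r_load: float, m: int, n: float) -> float:
    """Eq. (4): load seen by each PA in an M-way series combiner."""
    return r_load / (m * n ** 2)


r_load = 50.0  # ohms, assumed load
n = 1.0        # assumed turns ratio
for m in (2, 4):
    print(
        f"M = {m}: parallel {r_opt_parallel(r_load, m, n):.1f} ohm, "
        f"series {r_opt_series(r_load, m, n):.1f} ohm"
    )
```

With a fixed turns ratio, pure parallel combining pushes the per-PA load up by a factor of M while pure series combining pulls it down by the same factor, which illustrates why a large M forces a large transformation ratio in either pure topology.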

Hybrid combining topologies (i.e., parallel–series combining topologies) [41]–[46] have been developed to mitigate these issues. Compared with traditional combining topologies, the optimum load resistance can be well controlled with the turns ratio of the transformers and the number of PAs in series and parallel. In addition, perfect structural symmetry can be obtained, and this arrangement simplifies routing. For example, the four-way hybrid combining topology with a combination of two-way parallel and two-way series is shown in Fig. 4. Utilizing a differential output and a more symmetric layout could reduce the imbalance. However, long differential lines connected to the pads result in high transmission loss, and the common-mode impedance of differential T-lines is not high enough to filter out the common-mode signals caused by parasitic capacitance. Thus, there are still slight imbalances at different frequencies between $PA_1$ and $PA_2$, and $PA_3$ and $PA_4$ [42]. The effects of parasitic capacitance are not eliminated completely. The remaining capacitance also reduces the resonant frequency of the transformer, so the operating frequency and bandwidth are limited as well.

Fig. 4. Four-way hybrid combining topology based on the transformer to mitigate the effect of interwinding capacitance. (a) Combination of two-way parallel and two-way series. (b) Differential output and a more symmetrical structure.

In addition, reducing the coil overlap and adding a compensated terminal to the secondary winding can also reduce the capacitance between the windings, but this may cause other problems, such as a reduced coupling coefficient.

#### III. SLOTLINE-BASED POWER COMBINING TECHNIQUE

This section proposes a slotline-based combiner as an alternative to the transformer-based combiner at mm-Wave frequencies. A signal can propagate along a slotline [48] with the major electric field component across the slot. Both resonant and propagating slotlines have been used as radiating antenna elements, filters, and baluns in silicon processes [20], [21], [49]–[52]. In addition, combining slotlines with microstrip circuits makes the design of microwave circuits flexible and enables new circuits.

#### A. Modeling of Microstrip-to-Slotline Transitions

As described in the above review, designing mm-Wave PAs with conventional transformer-based power combining techniques is challenging because of the interwinding capacitance. Fig. 3 shows that, due to the electric field coupling between the metals, the common-mode signal is coupled to the output. Transitions between a slotline and a T-line are useful in mitigating this problem. Microstrip-to-slotline transitions are shown in Fig. 5. The slotline is crossed at a right angle by a microstrip conductor, and magnetic coupling occurs at the transition [53]. Two modes propagate at the transition: the microstrip mode and the slotline mode. The electric field distribution at the transition is shown in Fig. 5(a) and (b) when feeding the microstrip line with an even-mode signal and an odd-mode signal, respectively. For the even mode, the transition presents a very high impedance to the slotline mode since the TM wave at the slotline is suppressed. In contrast, the TE wave propagates along the slot under odd-mode excitation. This structure for even/odd-mode wave processing in ICs was first utilized in [49] to simultaneously achieve fundamental oscillation and harmonic-signal generation in integrated THz radiators. Moreover, when the odd mode is excited, the electric field $E_{\rm M}$ of the microstrip and the electric field $E_{\rm S}$ of the slotline are orthogonal. Hence, the parasitic capacitance is mitigated, and, in principle, perfect output balance is achieved over the entire operating bandwidth. In [50], the balun based on this transition achieves a phase imbalance of less than 0.25° and an amplitude imbalance of less than 0.05 dB from 100 to 180 GHz.

Fig. 5. Microstrip-to-slotline transition structure and electromagnetic field distribution at the transition when (a) even- and (b) odd-mode signals are used to excite the microstrip line.

Fig. 6. Two-port microstrip-to-slotline transition structure. (a) Physical model. (b) Equivalent circuit model using T-line and ideal transformer.

Fig. 7. Two types of simplified equivalent circuits of the transition in Fig. 6. (a) Equivalent impedance model with $jX_M$ and $jX_S$. (b) Transform $jX_S$ to the other side of the transformer.

The physical structure and the equivalent circuit of the two-port transition structure are shown in Fig. 6. The impedances $Z_{0S}$ and $Z_{0M}$ are the characteristic impedances of the slotline and microstrip, respectively. $\theta_S$ and $\theta_M$ represent the electrical lengths of the extensions of the slotline and the microstrip, respectively. Since magnetic coupling occurs at the cross-junction, the cross-junction is equivalent to a transformer. Meanwhile, the return current of the extended microstrip $(Z_{0M}, \theta_M)$ is cut by the slotline, which places the transformer in series in the return path of the microstrip. In addition, the extension of the return current path leads to a voltage difference across the slot at the cross-junction and excites the TE wave propagating along the slotline, so the slotline $(Z_{0S}, \theta_S)$ is connected in parallel with the transformer. The equivalent transformer turns ratio *n* describes the coupling between the microstrip and slotline. For further analysis, the equivalent circuit in Fig. 6(b) can be redrawn, as shown in Fig. 7(a). Here, the impedances can be expressed as

$$jX_M = Z_{0M} \frac{jX_B + jZ_{0M}\tan\theta_M}{Z_{0M} - X_B\tan\theta_M} \tag{5}$$

$$jX_S = Z_{0S} \frac{j\omega L_0 + jZ_{0S}\tan\theta_S}{Z_{0S} - \omega L_0\tan\theta_S} \tag{6}$$

where $X_B$ represents the terminal reactance at *Node B* of the microstrip. *Node B* can also be connected to a short circuit, an open circuit, or other lumped terminations to obtain the desired impedance. $L_0$ is the discontinuity inductance of the shorted end of the slotline.

After transforming $X_S$ to the other side of the equivalent transformer, the equivalent circuit is shown in Fig. 7(b). The impedance $X_1$ is given by $X_1 = n^2 \cdot X_S$. In this circuit, when the slotline is matched to the impedance $Z_{0S}$, the impedance $Z_{IN,M}$ seen from Port M is given by

$$Z_{\mathrm{IN},M} = R_{\mathrm{IN},M} + j X_{\mathrm{IN},M} \tag{7}$$

$$R_{\text{IN},M} = n^2 \frac{Z_{0S} X_S^2}{Z_{0S}^2 + X_S^2} \tag{8}$$

$$X_{\text{IN},M} = X_M + n^2 \frac{Z_{0S}^2 X_S}{Z_{0S}^2 + X_S^2}. \tag{9}$$

The transformation ratio *n* can be evaluated approximately by the ratio of the voltage at a certain distance away from the slot to the voltage directly across the slot [53]. From expressions (5)–(9), the bandwidth limitation of the transition mainly comes from the frequency dependence of $X_M$ and $X_1$, which depend on the stub electrical lengths and the turns ratio *n*. A short-circuit microstrip line with $\theta_M = 0$ or a quarter-wavelength open-circuit microstrip line will minimize $X_M$, and an open-end or a quarter-wavelength short-end slotline will maximize $X_1$.
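The transition model in (5)–(9) is easy to evaluate numerically. The sketch below is an illustration under assumed values: the characteristic impedances, electrical lengths, discontinuity inductance, and turns ratio are hypothetical, and an ideal open-circuit termination (infinite reactance) is not handled.

```python
import math


def stub_reactance(z0: float, theta: float, x_term: float) -> float:
    """Eqs. (5)/(6): reactance of a lossless stub of electrical length theta
    (radians) terminated in a reactance x_term (ohms)."""
    t = math.tan(theta)
    return z0 * (x_term + z0 * t) / (z0 - x_term * t)


def z_in_port_m(z0s: float, x_s: float, x_m: float, n: float) -> complex:
    """Eqs. (7)-(9): impedance seen from Port M when the slotline is matched."""
    den = z0s ** 2 + x_s ** 2
    r_in = n ** 2 * z0s * x_s ** 2 / den
    x_in = x_m + n ** 2 * z0s ** 2 * x_s / den
    return complex(r_in, x_in)


# Hypothetical values for illustration only.
freq = 160e9                 # Hz
omega = 2.0 * math.pi * freq
z0s, z0m = 70.0, 50.0        # slotline / microstrip characteristic impedance, ohms
l0 = 2e-12                   # discontinuity inductance of the shorted slot end, H
n = 0.9                      # equivalent turns ratio

x_s = stub_reactance(z0s, math.radians(80.0), omega * l0)  # nearly quarter-wave shorted slot
x_m = stub_reactance(z0m, 0.0, 0.0)                        # short-circuited, zero-length stub
z_in = z_in_port_m(z0s, x_s, x_m, n)

print(f"X_S = {x_s:.1f} ohm, X_M = {x_m:.1f} ohm")
print(f"Z_IN,M = {z_in.real:.1f} + j{z_in.imag:.1f} ohm")
```

As the slot stub approaches a quarter wavelength, $X_S$ grows and $Z_{\text{IN},M}$ tends toward the purely resistive value $n^2 Z_{0S}$, which matches the observation that maximizing $X_1$ and minimizing $X_M$ widens the usable bandwidth.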



Fig. 8. Proposed four-to-one parallel-series slotline-based combiner. (a) 3-D view of the physical structure where *Node B* is floating (open-circuit). (b) Top view and electrical-field distribution.

#### B. Proposed Hybrid Four-to-One Slotline-Based Power Combining

For multi-way power combining, series and parallel microstrip-to-slotline transitions can be combined to achieve hybrid power combining. A four-to-one parallel-series combiner based on GCPW-to-slotline transitions is proposed in Fig. 8(a), and the principle of power combining is shown in Fig. 8(b) (top view). The proposed combiner consists of three transition structures A–C, summing the output power from four single-ended PAs by means of a non-isolated combiner topology. T-lines based on the GCPW structure in multilayer silicon processes connect the PAs and the output, while slotlines are realized by narrow slots in stacked solid metal to minimize the conductor loss. Series power combining occurs in transitions A and B, and parallel combining occurs in transition C, with *Slot* 1 propagating the electromagnetic wave. In Fig. 8, *T-line 1* in part C has a floating node, *Node B*, on one side. It can also be connected to a short circuit or other lumped terminations to obtain the desired impedance. The distribution of the electric field and the current is also illustrated in Fig. 8(b). In addition, unlike the conventional transformer, the undesired $C_m$ is eliminated by the GCPW-to-slotline transition. The combiner filters out the undesired output common-mode signals to achieve better port balance over the entire operating band.

It is noteworthy that the slotlines in transitions A and B (i.e., *Slot* 2) are folded in the transverse direction so that the electric fields inside the slotlines are out of phase and non-radiative. Furthermore, the length of the straight *Slot* 1 is much smaller than $\lambda_0/4$. These measures are critical to eliminating the effects of radiation from the slotlines. There are at least two ways to further mitigate the radiation from the straight part of *Slot* 1. One is to use a multi-folded slot layout similar to *Slot* 2, and the other is to shorten the horizontal size of transition A/B, which reduces the length of *Slot* 1.

However, in an actual mm-Wave PA circuit, especially in a broadband PA, the influence of output capacitance cannot be ignored. The equivalent schematic of an actual four-to-one PA combining with the proposed combiner is shown in Fig. 9(a), and the impedance transfer functions are achieved through the structures A–C, as illustrated in Fig. 9(b), where the optimal resistance $R_{\text{OPT}}$ and parallel capacitance $C_{\text{PA}}$ are used to describe the characteristics of the PA cell. First, the load impedance $R_{\text{LOAD}}$ is converted to $Z_1$ through transition C, and the impedance connected to *Slot* 1 is $2 \cdot Z_1$ due to the parallel combining. In particular, if *T-line* 1 is connected directly to the slot with zero electrical length in its matched impedance, $Z_1$ simplifies to $Z_1 = n_C^2 \cdot Z_{\text{LOAD}}$, where $n_C \approx 1$. Second, $2 \cdot Z_1$ is transformed to $2 \cdot Z_{OPT}$ ($Z_{OPT}$ for each single PA) through *Slot* 1 and the transition structure A/B, where the folded slotline (*Slot* 2) in the structure A/B should be shorter than a quarter wavelength in order to provide an inductance $X_1$ that resonates with the output capacitance of the PA ($C_{PA}$). *T-line* 2 also contributes an inductance in $X_{M2}$ to resonate with the capacitance $C_{PA}$, and it is also used to connect to the transistor. Fig. 9(c) shows the impedance transformation trajectory of the output combiner on the Smith chart, but the trajectory is not unique.
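The *Slot* 1 step of this chain can be checked numerically with the lossless transmission-line admittance transformation, written here in the form that the article states as (10) in Section III-C. This is a sketch under assumed values: the impedances and electrical lengths are hypothetical, and the slot is treated as an ideal lossless line.

```python
import math


def slot1_input_admittance(y1: float, y0s1: float, theta_s1: float) -> complex:
    """Eq. (10): admittance Y_2 after Slot 1 (characteristic admittance y0s1,
    electrical length theta_s1 in radians) loaded by 2*Z_1, with Y_1 = 1/Z_1."""
    t = math.tan(theta_s1)
    return 2.0 * y0s1 * (y1 + 2j * y0s1 * t) / (2.0 * y0s1 + 1j * y1 * t)


# Hypothetical values for illustration only.
z1 = 50.0    # ohms, impedance after transition C
z0s1 = 30.0  # ohms, Slot 1 characteristic impedance, chosen well below 2*z1
y1, y0s1 = 1.0 / z1, 1.0 / z0s1

for deg in (5.0, 15.0, 30.0):
    y2 = slot1_input_admittance(y1, y0s1, math.radians(deg))
    print(
        f"theta_S1 = {deg:4.1f} deg: Re(Y2)/Y1 = {y2.real / y1:.3f}, "
        f"Im(Y2) = {y2.imag * 1e3:.2f} mS"
    )
```

For a short, low-impedance slot the real part stays close to $Y_1$ while the susceptance is positive, so the slot behaves approximately like a shunt capacitance in this regime.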

#### C. Transformer-Like Impedance Matching Function

From the above analysis in Sections III-A and III-B, the equivalent circuit of the proposed structure A/B is actually consistent with the traditional transformer-based equivalent circuit, as shown in Fig. 2(c). To gain insight, as shown in Fig. 10(a), lumped inductance elements are introduced to represent jX. When  $Z_1$  is a real impedance, the impedance transformation of *Slot* 1 in Fig. 9(b) can be expressed as

$$Y_2 = \frac{1}{Z_2} = 2Y_{0S1} \cdot \frac{Y_1 + j2Y_{0S1}\tan\theta_{S1}}{2Y_{0S1} + jY_1\tan\theta_{S1}} \tag{10}$$

$$\operatorname{Re}(Y_2) = Y_1 \cdot \frac{1 + \tan^2 \theta_{S1}}{1 + \frac{Y_1^2}{4Y_{0S1}^2} \tan^2 \theta_{S1}} = Y_1 + \Delta Y \tag{11}$$

$$\operatorname{Im}(Y_2) = 2Y_{0S1} \cdot \frac{4Y_{0S1}^2 \tan \theta_{S1} - Y_1^2 \tan \theta_{S1}}{4Y_{0S1}^2 + Y_1^2 \tan^2 \theta_{S1}} \tag{12}$$

where $Y_1$, $Y_2$, and $Y_{0S1}$ are the admittances corresponding to $Z_1$, $Z_2$, and $Z_{0S1}$, respectively. In the approximate analysis of Re($Y_2$) $\approx Y_1$, lumped capacitance elements $C_{\text{SLOT1}}$ are introduced to represent the function of *Slot* 1 when the characteristic impedance $Z_{0S1}$ is much smaller than $2 \cdot Z_1$ and the electrical length $\theta_{S1}$ is smaller than $90^\circ$.

Fig. 9. (a) Equivalent circuit model of the proposed power combiner. (b) Impedance transfer functions are achieved through structures A–C. (c) Impedance transformation trajectory through the proposed combiner.

Fig. 10. (a) Simplified equivalent circuit model of the structure A/B prototype with lumped elements. (b) Equivalent circuit based on non-ideal transformer model. (c) Frequency response of the magnetically coupled resonator. (d) Non-ideal transformer as guidance to design the EM structure.

The circuit model of the structure A/B prototype can be related to the non-ideal transformer model with parameters ($L_P$, $L_S$, and $k_m$) in Fig. 2(c), which are given by

$$k_m^2 = \frac{X_2}{X_{M2} + X_2} \tag{13}$$

$$L_P = \frac{X_{M2}}{2\pi f \left( 1 - k_m^2 \right)} \tag{14}$$

$$L_{S} = \frac{X_{M2}k_{m}^{2}}{2\pi f \left(1 - k_{m}^{2}\right)n_{A(B)}^{2}}. \tag{15}$$

Thus, the simplified equivalent circuit model of the structure A/B prototype is shown in Fig. 10(b). The combination of two resonators (*Resonators* 1 and 2), including the parasitic capacitances $C_{\text{PA}}$ and $C_{\text{SLOT1}}$, has the potential to achieve broadband mm-Wave PAs with multiple resonances, just like the transformer broadband matching technique [54], [55], as shown in Fig. 10(c). The two pole frequencies, $f_H$ and $f_L$, of the coupling network can be calculated from $(L_P, L_S, k_m, C_{\text{PA}}, \text{ and } C_{\text{SLOT1}})$.
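The mapping in (13)–(15) and the resulting pole frequencies can be sketched numerically. The article does not state a closed form for $f_L$ and $f_H$; the code below assumes the standard textbook result for two lossless, magnetically coupled LC resonators, $(1-k_m^2)\,\omega^4 - (\omega_1^2+\omega_2^2)\,\omega^2 + \omega_1^2\omega_2^2 = 0$ with $\omega_1^2 = 1/(L_P C_{\text{PA}})$ and $\omega_2^2 = 1/(L_S C_{\text{SLOT1}})$. All component values are hypothetical.

```python
import math


def transformer_from_reactances(x_m2: float, x_2: float, n: float, freq: float):
    """Eqs. (13)-(15): equivalent (k_m, L_P, L_S) from X_M2, X_2 and turns ratio n."""
    k2 = x_2 / (x_m2 + x_2)
    omega = 2.0 * math.pi * freq
    l_p = x_m2 / (omega * (1.0 - k2))
    l_s = x_m2 * k2 / (omega * (1.0 - k2) * n ** 2)
    return math.sqrt(k2), l_p, l_s


def pole_frequencies(l_p: float, l_s: float, k_m: float, c_p: float, c_s: float):
    """Lower/upper pole frequencies (Hz) of two lossless coupled LC resonators.
    Assumed textbook model, not a formula quoted from the article."""
    w1_sq = 1.0 / (l_p * c_p)
    w2_sq = 1.0 / (l_s * c_s)
    a = 1.0 - k_m ** 2
    b = w1_sq + w2_sq
    disc = math.sqrt(b * b - 4.0 * a * w1_sq * w2_sq)
    w_low = math.sqrt((b - disc) / (2.0 * a))
    w_high = math.sqrt((b + disc) / (2.0 * a))
    return w_low / (2.0 * math.pi), w_high / (2.0 * math.pi)


# Hypothetical values for illustration only.
k_m, l_p, l_s = transformer_from_reactances(x_m2=20.0, x_2=40.0, n=1.0, freq=160e9)
f_l, f_h = pole_frequencies(l_p, l_s, k_m, c_p=40e-15, c_s=30e-15)

print(f"k_m = {k_m:.3f}, L_P = {l_p * 1e12:.1f} pH, L_S = {l_s * 1e12:.1f} pH")
print(f"f_L = {f_l / 1e9:.1f} GHz, f_H = {f_h / 1e9:.1f} GHz")
```

In this lossless model, two identical resonators tuned to $f_0$ split into $f_0/\sqrt{1+k_m}$ and $f_0/\sqrt{1-k_m}$, so a larger coupling factor spreads the two poles further apart.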

For preliminary circuit design, the non-ideal transformer model, based on the primary inductance $L_P$, the secondary inductance $L_S$, and the coupling factor $k_m$, guides the impedance matching design. $L_P$ and $L_S$ are determined by $X_{M2}$ introduced by *T-line* 2 and $X_2$ introduced by *Slot* 2, respectively. The coupling factor $k_m$ can be changed by increasing or decreasing the GCPW-to-slot transitions in different positions. However, multiple cross-junctions in the structure A/B cause complex magnetic coupling between *T-line 2* and *Slot 2*, as shown in Fig. 10(d). Thus, the lumped parameter values $(L_P, L_S, \text{ and } k_m)$ should be extracted from the Z and Y parameters of the EM simulation results.

#### D. Design Methodology of the Combiner

Based on the proposed physical structure and circuit model, the design procedure is presented as follows.

Step 1 (Estimate Impedance Transformation Trajectory): This step aims to understand how to achieve impedance matching and how each part of the structure affects the impedance. For a given optimum load impedance $Z_{OPT}$ and $R_{LOAD}$, the impedance transformation trajectory of the combiner can be estimated utilizing the Smith chart in Fig. 9(c). *Node B* in transition C can be a short circuit, an open circuit, or other lumped terminations to obtain the desired impedance $2 \cdot Z_1$.

Step 2 (Pre-Simulate Using Lumped Non-Ideal Transformer Parameters): The key to the proposed combiner is the design of the structure A/B and *Slot* 1, which transform $2 \cdot Z_1$ to $Z_{OPT}$. Since the structure A/B is equivalent to a non-ideal transformer, the transformer-based design method is used to determine multiple sets of desired lumped parameters ($L_P$, $L_S$, $k_m$, and $C_{SLOT1}$) for pre-layout simulation. The equivalent capacitance of the necessary sub-quarter-wavelength *Slot* 1 connected to transition C is included in $C_{SLOT1}$.

Step 3 (Tradeoff Between the Loss and Bandwidth): Based on the design degrees of freedom $(L_P, L_S, k_m, \text{ and } C_{\text{SLOT1}})$, a set of parameters is selected to meet the design goals under the tradeoff between loss and bandwidth, considering non-ideal factors such as the finite $Q_{LP}$ and $Q_{LS}$. At mm-Wave frequencies, especially above 100 GHz, the quality factor of passive devices is always lower than 30. Lossy passive components affect the frequency response as well, so a compromise between loss and bandwidth is necessary. In the sub-THz range, loss deserves more attention than bandwidth because of the limited output power of the transistors. According to the requirements for the PA, two different design strategies can be adopted: 1) a low-loss matching strategy places the center frequency $f_C$ close to $f_L$ to avoid significant loss at the upper frequencies and 2) a broadband matching strategy keeps $f_C$ at the center between $f_L$ and $f_H$ and ensures a small ripple, such as a 1-dB ripple.

Step 4 (Implement the Physical Structure): This step synthesizes the physical structure A/B so that it realizes the lumped parameters determined in Step 3. The parameters $(L_P, Q_P, L_S, Q_S, \text{ and } k_m)$ can be extracted from the Z-parameters of the 3-D EM simulation results, as shown in Fig. 10(d). $L_P$ and $L_S$ can be tuned by changing *T-line* 2 and *Slot* 2. The coupling factor $k_m$ can be tuned by increasing or decreasing the GCPW-to-slot transitions in different positions. $(Z_{0S1}, \theta_{S1})$ of *Slot* 1 can be determined by (10)–(12). If a large $C_{\text{SLOT1}}$ is required, a lumped MOM capacitor can be inserted at Port S to maintain a compact area and avoid a long *Slot* 1, at the cost of additional loss.

Step 5 (Post-Simulate With PA Cores): The final step aims to complete the optimal design and adapt to additional parasitics that the model cannot capture.
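The parameter extraction mentioned in Step 4 can be sketched as follows. The article does not list its extraction formulas; this example assumes the commonly used two-port relations $L = \operatorname{Im}(Z_{ii})/\omega$, $Q = \operatorname{Im}(Z_{ii})/\operatorname{Re}(Z_{ii})$, and $k_m = \operatorname{Im}(Z_{21})/\sqrt{\operatorname{Im}(Z_{11})\operatorname{Im}(Z_{22})}$, which hold well below self-resonance. The Z-parameters are synthesized from hypothetical values rather than taken from an EM simulation.

```python
import math


def extract_transformer_params(z11: complex, z21: complex, z22: complex, freq: float) -> dict:
    """Extract (L_P, Q_P, L_S, Q_S, k_m) from two-port Z-parameters at one frequency.
    Assumed low-frequency relations; valid only while Im(Z11) and Im(Z22) are positive."""
    omega = 2.0 * math.pi * freq
    return {
        "L_P": z11.imag / omega,
        "Q_P": z11.imag / z11.real,
        "L_S": z22.imag / omega,
        "Q_S": z22.imag / z22.real,
        "k_m": z21.imag / math.sqrt(z11.imag * z22.imag),
    }


# Synthetic Z-parameters built from hypothetical values (not measured or simulated data).
freq = 160e9
omega = 2.0 * math.pi * freq
l_p_ref, l_s_ref, k_ref = 60e-12, 40e-12, 0.6   # H, H, unitless
r_p_ref, r_s_ref = 3.0, 2.5                     # series resistances, ohms
m_ref = k_ref * math.sqrt(l_p_ref * l_s_ref)    # mutual inductance, H

z11 = complex(r_p_ref, omega * l_p_ref)
z22 = complex(r_s_ref, omega * l_s_ref)
z21 = complex(0.0, omega * m_ref)

params = extract_transformer_params(z11, z21, z22, freq)
for name, value in params.items():
    print(f"{name} = {value:.4g}")
```

In practice the same function would be applied at each frequency point of the EM result, which shows how the extracted values drift as the structure approaches self-resonance.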

#### IV. PA DESIGN WITH PROPOSED POWER COMBINING

A high-efficiency 160-GHz PA with the proposed slotline-based power combining is designed in a 130-nm SiGe BiCMOS process with $f_t/f_{max} = 350/450$ GHz. Fig. 11(a) shows the schematic of the proposed PA prototype. The modified three-conductor input power splitter is used to equally split the input power and provide two pairs of balanced signals to four PA paths. Each PA makes use of an identical stacked HBT topology to achieve high output power. To increase the power gain, two stages of driving amplifiers (DAs) with transformer-based interstage matching are employed. The key implementation details are introduced in this section.

#### A. PA and DA Cell Design and Layout

Transistor stacking is another technique commonly used in Si-based technologies to improve power handling capability. The design methodology for stacking HBTs has been explicitly discussed in [56]. However, when three or more transistors are stacked, the proportion of leakage current at the base increases greatly, which reduces transistor reliability through significant self-heating, instability, degradation, and even device destruction, especially for HBTs. In addition, operation at mm-Wave frequencies further complicates the amplifier stability analysis because of effects such as the increased parasitics inside the stacked transistors. Thus, the design and layout of the amplifier cell are crucial to the overall PA performance.

In this design, a two-stacked topology with a series inductance between the common-emitter (CE) transistor Q1 and the common-base (CB) transistor Q2 and a parallel *RC* network at the base of Q2 is introduced as the unit amplifier core, as shown in Fig. 11(b). The inductance $L_{CE}$ moves the mismatched input impedance Z<sub>IN,CB</sub> of the CB stage to the optimal impedance Z<sub>OPT,CE</sub> of the CE stage to improve the output power and efficiency. The parallel $R_{BB}$-$C_{BB}$ network replaces the ideal ac ground at the base of the CB transistor to enhance stability. The parallel base resistance $R_{\rm BB}$ is added to mitigate the effect of the base reversal current, which would otherwise result in a negative real part of the input impedance [56]. Fig. 12(a) and (b) compares the simulated stability and the maximum available/stable gain (MAG/MSG) of the post-layout stacked topology with a finite base impedance and with an ideal ac ground. The frequency at which *K* equals 1 is pushed lower with the proposed topology. Reducing $C_{BB}$ from infinity (ideal ac ground) to a finite value decreases the gain of the CB stage and improves the stability. In practice, however, the parasitic interconnect inductance reduces the self-resonant frequency of $C_{BB}$. If the self-resonant frequency falls within the operating band, it may cause inductive positive feedback and, thus, degrade the stability. To gain intuitive insight into the stability, a small inductance (3 pH) is connected in series with a 200-fF $C_{BB}$ for a more realistic simulation. The series *LC* network resonates around 205 GHz.
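As a quick sanity check on the resonance quoted above, the following Python sketch evaluates the ideal series-*LC* resonance $f = 1/(2\pi\sqrt{LC})$ for the 3-pH interconnect inductance and the 200-fF $C_{BB}$. The function name `series_lc_resonance_hz` is an illustrative assumption, and the model ignores resistive loss and any other layout parasitics.

```python
import math


def series_lc_resonance_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal (lossless) series LC network."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Values quoted in the text: 3-pH series inductance and 200-fF C_BB.
f_res = series_lc_resonance_hz(3e-12, 200e-15)
print(f"Series LC resonance: {f_res / 1e9:.1f} GHz")
```

Under these idealized assumptions the result is close to 205 GHz, consistent with the value stated in the text and above the 140–180-GHz operating band.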



Fig. 11. (a) Schematic of the proposed PA prototype using a four-to-one slotline-based combiner. (b) Potential stability issue of the unit PA core and proposed layout of the unit PA core with each transistor size of  $10 \times 0.9 \times 0.07 \ \mu m^2$ .



Fig. 12. Simulated (a) stability and (b) MAG/MSG of the stacked topology with different base impedances. (c) High-pass *RC* network is connected in series to the input of the PA core to improve the stability at lower frequencies.

Fig. 12(a) shows that the stability decreases at higher frequencies, but *K* remains larger than 1. Thus, capacitors with a high self-resonant frequency are required to implement $C_{\rm BB}$. To minimize the interconnect parasitics, $C_{\rm BB}$ is implemented by paralleling two customized metal–oxide–metal (MOM) capacitors at symmetrical positions as close to the base as possible. Fig. 11(b) also shows a compact layout of the unit PA core with each transistor size of $10 \times 0.9 \times 0.07 \ \mu m^2$. A similar layout is used for the DA core with each transistor size of $4 \times 0.9 \times 0.07 \ \mu m^2$. Finally, a 200-fF $C_{\rm BB}$ with a 200-$\Omega$ $R_{\rm BB}$ and a 100-fF $C_{\rm BB}$ with a 200-$\Omega$ $R_{\rm BB}$ are chosen for the PA core and the DA core, respectively.

Furthermore, owing to the inherently low port-to-port isolation of the proposed combiner, unknown and potentially hazardous load modulation can occur outside the operating band. It is therefore necessary to guarantee that the PA cores are unconditionally stable over the full band. A high-pass *RC* network is connected in series with the input of the amplifier core in Fig. 11 to improve the stability at lower frequencies. Fig. 12(c) shows that the simulated low-frequency stability of the PA core is enhanced with the high-pass *RC* network (200 $\Omega$ // 100 fF).
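To illustrate why a parallel *RC* network in series with the input mainly affects low frequencies, the sketch below evaluates the impedance $Z = R/(1 + j\omega RC)$ of the 200-$\Omega$ // 100-fF network at a few frequencies. It is a minimal, idealized calculation without parasitics; the helper name `parallel_rc_impedance` and the chosen frequency points are assumptions for illustration only.

```python
import math


def parallel_rc_impedance(r_ohm: float, c_farad: float, freq_hz: float) -> complex:
    """Impedance of an ideal resistor in parallel with an ideal capacitor."""
    omega = 2.0 * math.pi * freq_hz
    return r_ohm / (1.0 + 1j * omega * r_ohm * c_farad)


R_OHM = 200.0      # value quoted in the text
C_FARAD = 100e-15  # value quoted in the text

# Frequency at which the capacitor reactance equals the resistance.
corner_hz = 1.0 / (2.0 * math.pi * R_OHM * C_FARAD)
print(f"Corner frequency: {corner_hz / 1e9:.2f} GHz")

# Illustrative frequency points (assumed, not taken from the paper).
for f_ghz in (1.0, 10.0, 160.0):
    z = parallel_rc_impedance(R_OHM, C_FARAD, f_ghz * 1e9)
    print(f"{f_ghz:6.1f} GHz: Z = {z.real:7.2f} {z.imag:+7.2f}j ohm")

z_low = parallel_rc_impedance(R_OHM, C_FARAD, 1e9)
z_band = parallel_rc_impedance(R_OHM, C_FARAD, 160e9)
```

In this idealized model the network adds close to 200 $\Omega$ of series resistance well below its corner frequency of about 8 GHz and less than 1 $\Omega$ of resistance near 160 GHz, so it damps the low-frequency gain while leaving the in-band signal path largely unaffected.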

#### B. Power Combiner Design

Fig. 13. (a) Simulated output power and PAE contours of the PA core at 160 GHz. (b) Optimal load impedance $R_{\text{OPT}}$ and equivalent output capacitance $C_{\text{PA}}$ of the PA core at different frequencies.

Fig. 14. 3-D layout view of (a) final proposed slotline-based combiner and (b) transformer-based combiner.

Fig. 15. Simulated (a) loss and (b) common-mode suppression of the final proposed slotline-based combiner and the transformer-based combiner.

As mentioned before, on-chip power combining should be exploited to enhance the maximum available output power per chip. To achieve a compact PA, the implementation of the combiner must be co-optimized with the characteristics of the PA. Fig. 13(a) plots the simulated power contours of the PA core at 160 GHz with a 4-V supply. An optimum load impedance $Z_{OPT} = 10 + j31 \Omega$ is chosen for the PA core, giving an output power of 14.3 dBm with 26% PAE. The equivalent output capacitance $C_{PA}$ of ~29 fF and the optimal load $R_{OPT}$ of ~106 $\Omega$ of the PA core are simulated from 140 to 180 GHz, as shown in Fig. 13(b). Based on the design methodology in Section III-D, a slotline-based power combiner, as shown in Fig. 14(a), sums the outputs of the four unit PA cores. *T-line* 1 with a 50-$\Omega$ characteristic impedance is connected directly to the slot with zero electrical length to implement transition *C* and obtain a lower parallel combining loss, where $n_C$ is approximately equal to 1 and $2 \cdot Z_1$ is equal to 100 $\Omega$. Transition A/B is implemented with a 4-$\mu$m-wide *T-line* 2 and an 8-$\mu$m-wide folded *Slot* 2, transforming ~100 $\Omega$ to 2·$Z_{OPT}$ together with *Slot* 1. The first tradeoff strategy mentioned in Section III-D (*Step 3*) is adopted in this design. The center frequency $f_C$ is set close to $f_L$ to obtain a lower loss, and then the parameters are optimized to achieve sufficient bandwidth to cover 140–180 GHz. The center tap of *T-line* 2 is used for the power supply. The transformation can be approximately modeled as a transformer with a primary inductance 2·$L_P$ of 71 pH, a secondary inductance 2·$L_S$ of 51 pH, and a coupling factor $k_m$ of 0.56. Realizing the final network also requires adjusting the component values to account for their finite quality factors and subsequent detailed EM simulations. *Slot* 1 is used to correct the impedance deviation in the actual layout. In this design, *Slot* 1 is implemented with an 8-$\mu$m width and a 45-$\mu$m length.
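The relation between the series load $Z_{OPT} = 10 + j31\ \Omega$ and the parallel quantities $R_{OPT}$ and $C_{PA}$ quoted for the PA core can be checked with a standard series-to-parallel conversion. The sketch below is a minimal illustration under the assumption that the PA output is modeled as $R_{OPT}$ in parallel with $C_{PA}$ and that the inductive part of the load resonates out $C_{PA}$ at 160 GHz; the function names are hypothetical.

```python
import math


def series_to_parallel(z_series: complex) -> tuple[float, float]:
    """Return (R_p, X_p) of the parallel network equivalent to R_s + jX_s."""
    r_s, x_s = z_series.real, z_series.imag
    mag_sq = r_s ** 2 + x_s ** 2
    return mag_sq / r_s, mag_sq / x_s


def resonating_capacitance(x_parallel_ohm: float, freq_hz: float) -> float:
    """Capacitance whose reactance magnitude equals x_parallel_ohm at freq_hz."""
    return 1.0 / (2.0 * math.pi * freq_hz * x_parallel_ohm)


z_opt = complex(10.0, 31.0)  # optimum load impedance quoted in the text
r_opt, x_par = series_to_parallel(z_opt)
c_pa = resonating_capacitance(x_par, 160e9)

print(f"R_OPT (parallel) = {r_opt:.1f} ohm")
print(f"C_PA resonated by the load = {c_pa * 1e15:.1f} fF")
```

Under these assumptions the conversion gives roughly 106 $\Omega$ and 29 fF, which agrees with the $R_{OPT}$ and $C_{PA}$ values stated for the PA core.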

For a fair comparison with the transformer-based combiner, the transformer is implemented with approximately the same $k_m$, $L_P$, and $L_S$ to achieve power combining and impedance transformation. Fig. 14 shows the 3-D layout view of the final proposed slotline-based combiner and the transformer-based combiner. Fig. 15(a) and (b) presents the simulated combining loss, obtained with the method described in Section II-A, and the common-mode suppression of the final proposed slotline-based combiner compared with the transformer-based combiner, respectively. The proposed combiner is more compact and has better simulated performance than the conventional transformer-based combiner. The simulated radiated energy of the power combiner is less than 3%, and the simulated combining loss of the proposed combiner is $\sim 1$ dB. This corresponds to a combining efficiency of roughly 80%. The low loss is mainly due to the sub-quarter-wavelength design and the solid metal plate of the slotline, which consists of all metal layers and vias. Moreover, the proposed combiner achieves excellent common-mode suppression over a broad bandwidth.
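The link between a combining loss in decibels and a combining efficiency is the usual power ratio $\eta = 10^{-L/10}$. The short sketch below applies it to the ~1-dB simulated combining loss; the function name is an assumption, and the 0.5-dB entry is included only as an additional illustrative point.

```python
def loss_db_to_efficiency(loss_db: float) -> float:
    """Fraction of power delivered through a network with the given loss in dB."""
    return 10.0 ** (-loss_db / 10.0)


# ~1 dB is the simulated combining loss quoted in the text;
# 0.5 dB is an extra illustrative point.
for loss_db in (1.0, 0.5):
    eta = loss_db_to_efficiency(loss_db)
    print(f"{loss_db:.1f} dB loss -> {100.0 * eta:.1f}% efficiency")

eta_1db = loss_db_to_efficiency(1.0)
```

A 1-dB loss maps to an efficiency slightly below 80%, in line with the rounded figure given above.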



Fig. 16. (a) Simplified circuit diagram to represent the three-conductor T-line splitter. (b) Cross section of the three-conductor T-line implemented in the proposed power splitter. (c) 3-D layout view of the final splitter.

In addition, the proposed slotline-based combiner can easily pass the design rule check (DRC) without any dummy fill patterns. Thus, the proposed combiner is expected to become an effective alternative to the transformer-based combiner at mm-Wave frequencies, maintaining a compact area and high efficiency.

#### C. Power Splitter Design

To improve the reverse isolation (the forward gain of the three-stage circuit exceeds 30 dB), the input power splitter is designed with three-conductor T-lines [33], [34] and co-optimized with the DA core characteristics. The electric field in the three-conductor T-line is orthogonal to the electric field in the slot, which yields better reverse isolation. The input impedance of a single DA core is about $12.5 - j8.5 \Omega$. A modified three-conductor T-line balun with an additional compensating T-line is proposed to generate two pairs of balanced signals. The simplified circuit of the proposed splitter is shown in Fig. 16(a). Adding the compensating T-line to the splitter brings several benefits, such as reducing the port imbalance, expanding the bandwidth, and providing more freedom in the impedance matching design, but it also introduces additional power loss in the lossy compensating T-line. The splitter and the dc-block capacitors are optimized together for impedance matching. The layout implementation of the splitter is shown in Fig. 16(b) and (c), where top metals TM1 and TM2 and bottom metals M1 and M2 are used for the three-conductor T-lines. The characteristic impedance and electrical length of each part are listed in Table I. Fig. 17 shows the simulated performance of the power splitter in a 130-nm SiGe BiCMOS BEOL process. In the range of 140–180 GHz, the amplitude imbalance is less than 0.1 dB, and the phase imbalance is less than 5° with a power loss of 1.2–2 dB. The ADS electromagnetic simulator in conjunction with HFSS is used to optimize the passive elements and T-lines.
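To give a feel for why the splitter and dc-block capacitors need to provide impedance matching, the sketch below estimates the mismatch that would result if the quoted DA input impedance of $12.5 - j8.5\ \Omega$ were driven directly from a 50-$\Omega$ source. This direct connection is a hypothetical scenario for illustration, not the configuration used in the design, and the function name is an assumption.

```python
import math


def reflection_coefficient(z_load: complex, z0: float = 50.0) -> complex:
    """Voltage reflection coefficient of z_load referenced to a real z0."""
    return (z_load - z0) / (z_load + z0)


z_da = complex(12.5, -8.5)  # DA core input impedance quoted in the text
gamma = reflection_coefficient(z_da)

# Return loss and mismatch loss for the hypothetical unmatched 50-ohm drive.
return_loss_db = -20.0 * math.log10(abs(gamma))
mismatch_loss_db = -10.0 * math.log10(1.0 - abs(gamma) ** 2)

print(f"|Gamma| = {abs(gamma):.2f}")
print(f"Return loss = {return_loss_db:.1f} dB")
print(f"Mismatch loss = {mismatch_loss_db:.1f} dB")
```

In this unmatched case about 2 dB of the available power would be reflected, which is comparable to the total splitter loss reported above and indicates why matching is folded into the splitter design.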

TABLE I PARAMETERS FOR THE PROPOSED SPLITTER

| Parameters       | Values | Parameters           | Values |
|------------------|--------|----------------------|--------|
| $Z_{01}, Z_{02}$ | 45 Ω   | $\theta_1, \theta_2$ | 32°    |
| $Z_{03}, Z_{04}$ | 28 Ω   | $\theta_3, \theta_4$ | 32°    |
| $Z_{05}$         | 66 Ω   | $\theta_5$           | 12°    |
| $C_1$            | 55 fF  |                      |        |



Fig. 17. (a) Simulated loss of the power splitter. (b) Simulated port imbalance of the power splitter.

#### V. EXPERIMENTAL RESULTS

As a proof of concept for the proposed broadband combining technique, the fabricated chip contains a test structure with two combiners connected back-to-back with 50-$\Omega$ port impedance, as shown in Fig. 18(a), to test the performance of the four-to-one broadband slotline-based power combiner (ports P1–P4 are also designed for 50 $\Omega$ to avoid impedance mismatch). The area of the structure is 540 $\mu$m $\times$ 320 $\mu$m, including pads, and the core area of the power combiner is only 126 $\mu$m $\times$ 240 $\mu$m (half the core area of the back-to-back test structure). The simulated port imbalance of the four-to-one combiner is shown in Fig. 19. The amplitude imbalance of each pair of ports (P1/P2 and P3/P4) is less than 0.04 dB, and the differential signal phase mismatch is less than 0.5°. This excellent amplitude/phase balance suppresses the undesired output common mode over the entire operating bandwidth. The RF performance is measured using waveguide Infinity GSG probes. The small-signal performance was characterized using VDI WR 5.1 (140–220 GHz) extenders connected to an Agilent PNA-X N5247A vector network analyzer. The simulated and measured *S*11 and *S*21 are compared in Fig. 20. The measured minimum insertion loss of the combiner structure is $\sim 0.5$ dB (half the measured insertion loss of the back-to-back test structure). The 3-dB bandwidth is over 80 GHz, covering the whole G-band (140–220 GHz). Thus, the structure is effective for broadband power combining.

The proposed high-efficiency PA with the broadband slotline-based power combining technique was designed and fabricated in a 130-nm SiGe BiCMOS process. The chip micrograph of the design is shown in Fig. 18(b). The area of the circuit is 750 $\mu$m × 560 $\mu$m, including the RF and dc pads, and the core area is only 488 $\mu$m × 214 $\mu$m. The setups for the PA measurement are shown in Fig. 21. The simulated and measured S-parameters are shown in Fig. 22. The peak gain *S*21 of the PA is 30.7 dB at 163 GHz, and the 3-dB bandwidth is from 142 to 182 GHz. The reverse isolation *S*12 is lower than −30 dB. The output stage consumes a dc current of $\sim 60$ mA from a 4-V supply, and the two driving stages consume a total dc current of $\sim 50$ mA from a 3.3-V supply.

Fig. 18. Chip micrograph of (a) back-to-back four-to-one power combiner test structure and (b) proposed mm-Wave PA.

Fig. 19. Simulated port imbalance of the four-to-one combiner in Fig. 18(a).

Fig. 20. Measured (solid) and simulated (dashed) (a) *S*11 and (b) *S*21 of the test structure shown in Fig. 18(a).

The large-signal performance is characterized using a 110–170-GHz source (R&S SMZ170), a 140–220-GHz source (VDI WR 5.1 extender), and a power meter (Erickson PM5B). Fig. 23 shows the measured output power $P_{OUT}$, power gain, output stage efficiency $\eta_{PA}$, and PAE of the whole circuit versus input power at 161 GHz, in comparison with the simulation results. The PA exhibits a 12.4% peak PAE at 18.1-dBm (64.6 mW) output power, and the maximum efficiency of the output stage $\eta_{PAmax}$ is up to 20.1%. The total loss of the output pad and the output 50-$\Omega$ T-line outside the core area is roughly 0.3 dB. Thus, the output power density, i.e., the output power per unit core die area, is 662 mW/mm<sup>2</sup>, while the output power per unit total die area is 153 mW/mm<sup>2</sup>. The measured maximum output power

TABLE II Performance Comparison With Other State-of-the-Art Silicon-Based PAs Around D-Band and G-Band

| Ref.         | Tech.        | Freq.<br>(GHz) | P <sub>sat</sub><br>(dBm) | PAE <sub>max</sub><br>(%) | Gain<br>(dB) | BW <sub>-3dB</sub><br>(GHz) | Area<br>(mm <sup>2</sup> ) | Power Density<br>(mW/mm <sup>2</sup> ) | FoM <sub>ITRS</sub> | FoM <sub>A</sub> | Combiner Type         |
|--------------|--------------|----------------|---------------------------|---------------------------|--------------|-----------------------------|----------------------------|----------------------------------------|---------------------|------------------|-----------------------|
| [23]         | 65-nm CMOS   | 150            | 12.2                      | 12.1*                     | 16           | 30                          | 0.38                       | 44                                     | 83                  | 16               | T-line-based          |
| [44]         | 40-nm CMOS   | 140            | 14.8                      | 8.9                       | 20.3         | 17                          | 0.34                       | 89                                     | 88                  | -                | Transformer-based     |
| [46]         | 16-nm FinFET | 135            | 15                        | 12.8                      | 20.5         | 22                          | $0.062^{***}$              | 510***                                 | 89                  | -                | Transformer-based     |
| [24]         | 90-nm SiGe   | 116            | 20.8                      | 7.6                       | 15           | 24                          | 4.95                       | 24                                     | 86                  | 17               | T-line-based          |
| [25]         | 130-nm SiGe  | 120            | 15.5                      | 6.4                       | 19           | 35                          | 0.53                       | 67                                     | 84                  | -                | T-line-based          |
| [26]         | 130-nm SiGe  | 170            | 18                        | 4                         | 30.2         | 25                          | 0.85                       | 74                                     | 99                  | 14               | Diff. T-line-based    |
| [27]         | 130-nm SiGe  | 170            | 18.7                      | 4.4                       | 23.6         | 27                          | 1.35                       | 55                                     | 93                  | 17               | T-line-based          |
| [29]         | 130-nm SiGe  | 120            | 16.5                      | 12.8                      | 26.5         | 28                          | 0.74                       | 60                                     | 96                  | 18               | Marchand balun-based  |
| [30]         | 130-nm SiGe  | 130            | $17^{**}$                 | 13**                      | 34**         | 30                          | 1.06                       | 47                                     | 104                 | -                | Wilkinson-based       |
| [31]         | 130-nm SiGe  | 180            | 15                        | 3.5                       | 19           | 80                          | 0.92                       | 34                                     | 85                  | 18               | T-line-based          |
| [34]         | 90-nm SiGe   | 120            | 22                        | 3.6                       | 7.7          | 30                          | 0.62                       | 256                                    | 77                  | 16               | Three-conductor-based |
| [35]         | 130-nm SiGe  | 160            | 18                        | 9.4                       | 24           | 20                          | 0.84                       | 75                                     | 96                  | 18               | Three-conductor-based |
| This<br>Work | 130-nm SiGe  | 161            | 18.1                      | 12.4                      | 30.7         | 40                          | 0.42                       | 153                                    | 104                 | 24               | Slotline-based        |

\*: Estimated from power gain of cascaded experimental results. \*\*: Balun loss de-embedded. \*\*\*: Die area not including DC pads. -: $f_t/f_{max}$ of the employed process not indicated.



Fig. 21. (a) Measurement setup diagram. (b) Measurement environment.

and maximum $\eta_{PAmax}$ of the amplifier are 14.5–18.1 dBm and 7.9%–20.1% over 140–180 GHz, respectively, as shown in Fig. 24.
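The two power-density figures quoted in this section can be reproduced with the short calculation below. It assumes, as an inference from the numbers rather than an explicit statement in the text, that the core-area figure uses the output power with the ~0.3-dB pad and T-line loss de-embedded, while the total-die-area figure uses the measured 18.1 dBm directly. The variable and function names are illustrative.

```python
def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


P_SAT_DBM = 18.1                 # measured saturated output power at 161 GHz
OUTPUT_LOSS_DB = 0.3             # quoted loss of the output pad and 50-ohm T-line
CORE_AREA_MM2 = 0.488 * 0.214    # 488 um x 214 um core area
TOTAL_AREA_MM2 = 0.750 * 0.560   # 750 um x 560 um including RF and dc pads

# Assumption: the core figure refers to power at the core output (loss de-embedded).
core_density = dbm_to_mw(P_SAT_DBM + OUTPUT_LOSS_DB) / CORE_AREA_MM2
# Assumption: the total-die figure uses the measured output power as is.
total_density = dbm_to_mw(P_SAT_DBM) / TOTAL_AREA_MM2

print(f"Core power density:  {core_density:.0f} mW/mm^2")
print(f"Total power density: {total_density:.0f} mW/mm^2")
```

With these assumptions the results land within about 1 mW/mm<sup>2</sup> of the 662 mW/mm<sup>2</sup> and 153 mW/mm<sup>2</sup> values reported above.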

Fig. 22. Measured (solid) and simulated (dashed) S-parameters of the proposed PA.

Fig. 23. Measured (solid) and simulated (dashed) $P_{\rm OUT}$, Gain, $\eta_{\rm PA}$, and PAE at 161 GHz.

To examine the performance of the PA comprehensively, the figure of merit (FoM<sub>ITRS</sub>) [46] of the amplifier is calculated as

$$FoM_{ITRS} = G(dB) + P_{SAT}(dBm) + 20 \log_{10}(f_0(GHz)) + 10 \log_{10}(PAE_{MAX}(\%)) \quad (16)$$

where $G$, $P_{\text{SAT}}$, $f_0$, and $PAE_{MAX}$ represent the power gain, the maximum output power, the operating frequency, and the peak PAE, respectively.
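As a worked instance of (16), the sketch below evaluates FoM<sub>ITRS</sub> for this work using the values reported in Table II (30.7 dB, 18.1 dBm, 161 GHz, 12.4%). The function name and argument names are assumptions for illustration.

```python
import math


def fom_itrs(gain_db: float, p_sat_dbm: float, f0_ghz: float,
             pae_max_percent: float) -> float:
    """FoM_ITRS as defined in (16): gain and power in dB units, f0 in GHz, PAE in %."""
    return (gain_db + p_sat_dbm
            + 20.0 * math.log10(f0_ghz)
            + 10.0 * math.log10(pae_max_percent))


# Values for this work taken from Table II.
fom_this_work = fom_itrs(gain_db=30.7, p_sat_dbm=18.1,
                         f0_ghz=161.0, pae_max_percent=12.4)
print(f"FoM_ITRS (this work) = {fom_this_work:.1f}")
```

The result rounds to 104, matching the FoM<sub>ITRS</sub> entry for this work in Table II.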

Fig. 24. Measured (solid) and simulated (dashed) $P_{\text{SAT}}$ and $\eta_{\text{PAmax}}$ from 140 to 180 GHz.

However, this figure of merit is limited because it accounts for neither the circuit bandwidth nor differences in the fabrication process, both of which are important characteristics of an amplifier. For a more comprehensive comparison, a new figure of merit (FoM<sub>A</sub>) for mm-Wave broadband PA performance, which considers the operating bandwidth and the fabrication process, is proposed as

$$FoM_{A} = G(dB)/N + P_{SAT}(dBm) + 10 \log_{10}(PAE(\%)) + 20 \log_{10}\left(\frac{f_{0}}{f_{max}}\right) + 10 \log_{10}\left(\frac{\Delta f_{-3 \ dB}}{f_{0}}\right) \quad (17)$$

where *N* and $\Delta f_{-3 \text{ dB}}$ represent the number of stages and the 3-dB bandwidth of the PA, respectively, and $f_{max}$ is the maximum oscillation frequency of the employed process. This FoM<sub>A</sub> indicates the overall performance of the PAs. Table II summarizes the performance and compares it with state-of-the-art CMOS/SiGe amplifiers operating in the D-band and the G-band.
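Equation (17) can be evaluated in the same way. The sketch below uses the values reported for this work: a three-stage PA with 30.7-dB gain, 18.1-dBm saturated power, 12.4% PAE, a 161-GHz operating frequency, a 40-GHz 3-dB bandwidth, and the 450-GHz $f_{max}$ quoted for the process in Section IV. The function and argument names are illustrative assumptions.

```python
import math


def fom_a(gain_db: float, n_stages: int, p_sat_dbm: float, pae_percent: float,
          f0_ghz: float, fmax_ghz: float, bw_3db_ghz: float) -> float:
    """FoM_A as defined in (17); all frequencies must share the same unit."""
    return (gain_db / n_stages
            + p_sat_dbm
            + 10.0 * math.log10(pae_percent)
            + 20.0 * math.log10(f0_ghz / fmax_ghz)
            + 10.0 * math.log10(bw_3db_ghz / f0_ghz))


# Values for this work taken from the text and Table II.
fom_a_this_work = fom_a(gain_db=30.7, n_stages=3, p_sat_dbm=18.1,
                        pae_percent=12.4, f0_ghz=161.0,
                        fmax_ghz=450.0, bw_3db_ghz=40.0)
print(f"FoM_A (this work) = {fom_a_this_work:.1f}")
```

The result rounds to 24, which agrees with the FoM<sub>A</sub> entry for this work in Table II. The per-stage gain term and the $f_0/f_{max}$ term both reduce the score relative to FoM<sub>ITRS</sub>, so the two figures of merit are not directly comparable in magnitude.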

#### VI. SUMMARY

In this article, a new power combining technique based on GCPW-to-slotline transitions is proposed to overcome the adverse effects of interwinding capacitance in traditional transformer-based power combining techniques at mm-Wave frequencies while maintaining a compact area and high efficiency. A slotline-based power combiner was implemented and characterized across 140–220 GHz. This structure has an excellent amplitude/phase balance, and the measured minimum insertion loss of the combiner structure is ~0.5 dB.

A high-efficiency mm-Wave PA prototype based on the proposed combining technique was designed in a 130-nm SiGe BiCMOS process and characterized. The proposed PA demonstrates a peak small-signal gain of 30.7 dB with a 40-GHz 3-dB bandwidth. The measured maximum output power and maximum $\eta_{PAmax}$ of the PA over 140–180 GHz are 14.5–18.1 dBm and 7.9%–20.1%, respectively. The maximum output power of 18.1 dBm and the maximum PAE of 12.4% were measured at 161 GHz. The extremely compact power combining methodology leads to a small core area of 488 $\mu$m × 214 $\mu$m and an output power per unit core die area of 662 mW/mm<sup>2</sup>. Owing to the proposed design techniques, this PA achieves a record FoM<sub>A</sub> of 24 among previously reported D- and G-band PAs.

#### REFERENCES

- [1] U. R. Pfeiffer, E. Ojefors, and Y. Zhao, "A SiGe quadrature transmitter and receiver chipset for emerging high-frequency applications at 160GHz," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2010, pp. 416–417.
- [2] M. Fujishima, M. Motoyoshi, K. Katayama, K. Takano, N. Ono, and R. Fujimoto, "98 mW 10 Gbps wireless transceiver chipset with D-band CMOS circuits," *IEEE J. Solid-State Circuits*, vol. 48, no. 10, pp. 2273–2284, Oct. 2013.
- [3] W. Shin, B.-H. Ku, O. Inac, Y.-C. Ou, and G. M. Rebeiz, "A 108–114 GHz 4 × 4 wafer-scale phased array transmitter with high-efficiency on-chip antennas," *IEEE J. Solid-State Circuits*, vol. 48, no. 9, pp. 2041–2055, Sep. 2013.

- [4] A. Townley *et al.*, "A 94GHz 4TX-4RX phased-array for FMCW radar with integrated LO and flip-chip antenna package," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, May 2016, pp. 294–297.
- [5] A. Visweswaran et al., "9.4 A 145GHz FMCW-radar transceiver in 28nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 168–170.
- [6] M. Kucharski, W. A. Ahmad, H. J. Ng, and D. Kissinger, "Monostatic and bistatic G-band BiCMOS radar transceivers with on-chip antennas and tunable TX-to-RX leakage cancellation," *IEEE J. Solid-State Circuits*, vol. 56, no. 3, pp. 899–913, Mar. 2021.
- [7] A. Mostajeran, A. Cathelin, and E. Afshari, "A 170-GHz fully integrated single-chip FMCW imaging radar with 3-D imaging capability," *IEEE J. Solid-State Circuits*, vol. 52, no. 10, pp. 2721–2734, Oct. 2017.
- [8] D. Nemchick *et al.*, "180 GHz pulsed CMOS transmitter for molecular sensing," *IEEE Trans. THz Sci. Technol.*, early access, May 31, 2021, doi: 10.1109/TTHZ.2021.3085138.
- [9] X. Ma, Y. Wang, X. You, J. Lin, and L. Li, "Respiratory pattern recognition of an adult bullfrog using a 100-GHz CW Doppler radar transceiver," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Nanjing, China, May 2019, pp. 1–3.
- [10] H. Afzal, R. Abedi, R. Kananizadeh, P. Heydari, and O. Momeni, "An mm-Wave scalable PLL-coupled array for phased-array applications in 65-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 69, no. 2, pp. 1439–1452, Feb. 2021.
- [11] T. Sowlati and D. M. W. Leenaerts, "A 2.4-GHz 0.18-μm CMOS self-biased cascode power amplifier," *IEEE J. Solid-State Circuits*, vol. 38, no. 8, pp. 1318–1324, Aug. 2003.
- [12] S. Pornpromlikit, J. Jeong, C. D. Presti, A. Scuderi, and P. M. Asbeck, "A watt-level stacked-FET linear power amplifier in silicon-on-insulator CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 1, pp. 57–64, Jan. 2010.
- [13] D. Fritsche, R. Wolf, and F. Ellinger, "Analysis and design of a stacked power amplifier with very high bandwidth," *IEEE Trans. Microw. Theory Techn.*, vol. 60, no. 10, pp. 3223–3231, Oct. 2012.
- [14] H.-T. Dabag, B. Hanafi, F. Golcuk, A. Agah, J. F. Buckwalter, and P. M. Asbeck, "Analysis and design of stacked-FET millimeter-wave power amplifiers," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 4, pp. 1543–1556, Apr. 2013.
- [15] J.-H. Chen, S. Helmi, R. Azadegan, F. Aryanfar, and S. Mohammadi, "A broadband stacked power amplifier in 45-nm CMOS SOI technology," *IEEE J. Solid-State Circuits*, vol. 48, no. 11, pp. 2775–2784, Nov. 2013.
- [16] Y. Kim and Y. Kwon, "Analysis and design of millimeter-wave power amplifier using stacked-FET structure," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 2, pp. 691–702, Feb. 2015.
- [17] S. R. Helmi, J.-H. Chen, and S. Mohammadi, "High-efficiency microwave and mm-wave stacked cell CMOS SOI power amplifiers," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 7, pp. 2025–2038, Jul. 2016.
- [18] Y. A. Atesal, B. Cetinoneri, M. Chang, R. Alhalabi, and G. M. Rebeiz, "Millimeter-wave wafer-scale silicon BiCMOS power amplifiers using free-space power combining," *IEEE Trans. Microw. Theory Techn.*, vol. 59, no. 4, pp. 954–965, Apr. 2011.
- [19] J. Jayamon *et al.*, "Spatially power-combined W-band power amplifier using stacked CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2014, pp. 151–154.
- [20] T. Chi, F. Wang, S. Li, M.-Y. Huang, J. S. Park, and H. Wang, "17.3 A 60GHz on-chip linear radiator with single-element 27.9dBm psat and 33.1dBm peak EIRP using multifeed antenna for direct on-antenna power combining," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 296–297.
- [21] B. Abiri and A. Hajimiri, "A 69-to-79GHz CMOS multiport PA/radiator with +35.7dBm CW EIRP and integrated PLL," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 404–406.
- [22] R. A. York, "Some considerations for optimal efficiency and low noise in large power combiners," *IEEE Trans. Microw. Theory Techn.*, vol. 49, no. 8, pp. 1477–1482, Aug. 2001.
- [23] Y.-H. Hsiao, Z.-M. Tsai, H.-C. Liao, J.-C. Kao, and H. Wang, "Millimeter-wave CMOS power amplifiers with high output power and wideband performances," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 12, pp. 4520–4533, Dec. 2013.
- [24] H.-C. Lin and G. M. Rebeiz, "A 110–134-GHz SiGe amplifier with peak output power of 100–120 mW," *IEEE Trans. Microw. Theory Techn.*, vol. 62, no. 12, pp. 2990–3000, Dec. 2014.

- [25] M. Bao, Z. S. He, and H. Zirath, "A 100–145 GHz area-efficient power amplifier in a 130 nm SiGe technology," in *Proc. 47th Eur. Microw. Conf. (EuMC)*, Oct. 2017, pp. 1017–1020.
- [26] M. Kucharski, H. J. Ng, and D. Kissinger, "An 18 dBm 155-180 GHz SiGe power amplifier using a 4-way T-junction combining network," in *Proc. Eur. Solid State Circuits Conf. (ESSCIRC)*, Cracow, Poland, Sep. 2019, pp. 333–336.
- [27] A. Ali, "168-195 GHz power amplifier with output power larger than 18 dBm in BiCMOS technology," *IEEE Access*, vol. 8, pp. 79299–79309, 2020.
- [28] M. H. Eissa, A. Malignaggi, and D. Kissinger, "A 13.5-dBm 200–255-GHz 4-way power amplifier and frequency source in 130-nm BiCMOS," *IEEE Solid-State Circuits Lett.*, vol. 2, no. 11, pp. 268–271, Nov. 2019.
- [29] M. Kucharski, J. Borngraber, D. Wang, D. Kissinger, and H. J. Ng, "A 109–137 GHz power amplifier in SiGe BiCMOS with 16.5 dBm output power and 12.8% PAE," in *Proc. 47th Eur. Microw. Conf. (EuMC)*, Oct. 2017, pp. 1021–1024.
- [30] A. Visweswaran, B. Vignon, X. Tang, S. Brebels, B. Debaillie, and P. Wambacq, "A 112-142GHz power amplifier with regenerative reactive feedback achieving 17dBm peak p<sub>sat</sub> at 13% PAE," in *Proc. IEEE 45th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2019, pp. 337–340.
- [31] P. Starke, C. Carta, and F. Ellinger, "High-linearity 19-dB power amplifier for 140–220 GHz, saturated at 15 dBm, in 130-nm SiGe," *IEEE Microw. Wireless Compon. Lett.*, vol. 30, no. 4, pp. 403–406, Apr. 2020.
- [32] J. Kim *et al.*, "A fully-integrated high-power linear CMOS power amplifier with a parallel-series combining transformer," *IEEE J. Solid-State Circuits*, vol. 47, no. 3, pp. 599–614, Mar. 2012.
- [33] H.-C. Park, S. Daneshgar, Z. Griffith, M. Urteaga, B.-S. Kim, and M. Rodwell, "Millimeter-wave series power combining using subquarter-wavelength baluns," *IEEE J. Solid-State Circuits*, vol. 49, no. 10, pp. 2089–2102, Oct. 2014.
- [34] S. Daneshgar and J. F. Buckwalter, "Compact series power combining using subquarter-wavelength baluns in silicon germanium at 120 GHz," *IEEE Trans. Microw. Theory Techn.*, vol. 66, no. 11, pp. 4844–4859, Nov. 2018.
- [35] X. Li, W. Chen, Y. Wang, and Z. Feng, "A 160 GHz high output power and high efficiency power amplifier in a 130-nm SiGe BiCMOS technology," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Aug. 2020, pp. 199–202.
- [36] Y. Gong and J. D. Cressler, "A 60-GHz SiGe power amplifier with three-conductor transmission-line-based Wilkinson baluns and asymmetric directional couplers," *IEEE Trans. Microw. Theory Techn.*, vol. 69, no. 1, pp. 709–722, Jan. 2021.
- [37] K. H. An *et al.*, "Power-combining transformer techniques for fully-integrated CMOS power amplifiers," *IEEE J. Solid-State Circuits*, vol. 43, no. 5, pp. 1064–1075, May 2008.
- [38] D. Chowdhury, P. Reynaert, and A. M. Niknejad, "Design considerations for 60 GHz transformer-coupled CMOS power amplifiers," *IEEE J. Solid-State Circuits*, vol. 44, no. 10, pp. 2733–2744, Oct. 2009.
- [39] U. R. Pfeiffer and D. Goren, "A 23-dBm 60-GHz distributed active transformer in a silicon process technology," *IEEE Trans. Microw. Theory Techn.*, vol. 55, no. 5, pp. 857–865, May 2007.
- [40] Y. Zhao and J. R. Long, "A wideband, dual-path, millimeter-wave power amplifier with 20 dBm output power and PAE above 15% in 130 nm SiGe-BiCMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 1981–1997, Sep. 2012.
- [41] Q. J. Gu, Z. Xu, and M.-C. F. Chang, "Two-way current-combining W-band power amplifier in 65-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 60, no. 5, pp. 1365–1374, May 2012.
- [42] J. Oh, B. Ku, and S. Hong, "A 77-GHz CMOS power amplifier with a parallel power combiner based on transmission-line transformer," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 7, pp. 2662–2669, Jul. 2013.
- [43] D. Zhao and P. Reynaert, "An E-band power amplifier with broadband parallel-series power combiner in 40-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 2, pp. 683–690, Feb. 2015.
- [44] D. Simic and P. Reynaert, "A 14.8 dBm 20.3 dB power amplifier for D-band applications in 40 nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2018, pp. 232–235.
- [45] C.-F. Chou, Y.-H. Hsiao, Y.-C. Wu, Y.-H. Lin, C.-W. Wu, and H. Wang, "Design of a V-band 20-dBm wideband power amplifier using transformer-based radial power combining in 90-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 12, pp. 4545–4560, Dec. 2016.

- [46] B. Philippe and P. Reynaert, "24.7 A 15dBm 12.8%-PAE compact D-band power amplifier with two-way power combining in 16nm FinFET CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 374–376.
- [47] J. Rahola, "Power waves and conjugate matching," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 55, no. 1, pp. 92–96, Jan. 2008.
- [48] E. A. Mariani, C. P. Heinzman, J. P. Agrios, and S. B. Cohn, "Slot line characteristics," *IEEE Trans. Microw. Theory Techn.*, vol. MTT-17, no. 12, pp. 1091–1096, Dec. 1969.
- [49] R. Han *et al.*, "A SiGe terahertz heterodyne imaging transmitter with 3.3 mW radiated power and fully-integrated phase-locked loop," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 2935–2947, Dec. 2015.
- [50] C. Wang and R. Han, "Dual-terahertz-comb spectrometer on CMOS for rapid, wide-range gas detection with absolute specificity," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3361–3372, Dec. 2017.
- [51] Z. Hu, M. Kaynak, and R. Han, "High-power radiation at 1 THz in silicon: A fully scalable array using a multi-functional radiating mesh structure," *IEEE J. Solid-State Circuits*, vol. 53, no. 5, pp. 1313–1327, May 2018.
- [52] H. Bameri and O. Momeni, "An embedded 200 GHz power amplifier with 9.4 dBm saturated power and 19.5 dB gain in 65 nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Aug. 2020, pp. 191–194.
- [53] R. Garg, I. Bahl, and M. Bozzi, *Microstrip Lines and Slotlines*, 3rd ed. Norwood, MA, USA: Artech House, 2013.
- [54] H. Jia, C. C. Prawoto, B. Chi, Z. Wang, and C. P. Yue, "A full Ka-band power amplifier with 32.9% PAE and 15.3-dBm power in 65-nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 9, pp. 2657–2668, Sep. 2018.
- [55] H. Wang, C. Sideris, and A. Hajimiri, "A CMOS broadband power amplifier with a transformer-based high-order output matching network," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2709–2722, Dec. 2010.
- [56] K. Datta and H. Hashemi, "Performance limits, design and implementation of mm-wave SiGe HBT class-E and stacked class-E power amplifiers," *IEEE J. Solid-State Circuits*, vol. 49, no. 10, pp. 2150–2171, Oct. 2014.



Xingcun Li (Graduate Student Member, IEEE) received the B.S. degree in electronic information science and technology from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2017. He is currently pursuing the Ph.D. degree with the Department of Electronic Engineering, Tsinghua University, Beijing, China.

In 2020, he was a Visiting Student with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA. His current research interests include silicon-based mm-Wave and THz integrated circuits for radar and wireless communication.



Wenhua Chen (Senior Member, IEEE) received the B.S. degree in microwave engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2001, and the Ph.D. degree in electronic engineering from Tsinghua University, Beijing, China, in 2006.

From 2010 to 2011, he was a Post-Doctoral Fellow with the iRadio Lab, University of Calgary, Calgary, AB, Canada. He is currently a Professor with the Department of Electronic Engineering, Tsinghua University. He has authored or coauthored over 200 journal articles and conference papers. His main research interests include energy-efficient power amplifiers (PAs) and linearization, and millimeter-wave integrated circuits.

Dr. Chen is also an Editorial Member of *Engineering*. He was a recipient of the 2015 Outstanding Youth Science Foundation of NSFC, the 2014 URSI Young Scientist Award, and the Student Paper Awards of several international conferences. He is also an Associate Editor of IEEE MICROWAVE AND WIRELESS COMPONENTS LETTERS and the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES.



**Shuyang Li** (Graduate Student Member, IEEE) received the B.S. degree in electronic information science and technology from the Department of Electronic Engineering, Tsinghua University, Beijing, China, in 2020, where he is currently pursuing the Ph.D. degree in electromagnetic field and microwave technique.

His current research interests include mm-wave and THz integrated circuits for radar and wireless communication.

Yunfan Wang (Member, IEEE) received the B.S. degree in physics and the M.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 2018 and 2021, respectively.

His research interests include magnetless nonreciprocal components and on-chip terahertz circuits.



**Ruonan Han** (Senior Member, IEEE) received the B.Sc. degree in microelectronics from Fudan University, Shanghai, China, in 2007, the M.Sc. degree in electrical engineering from the University of Florida, Gainesville, FL, USA, in 2009, and the Ph.D. degree in electrical and computer engineering from Cornell University, Ithaca, NY, USA, in 2014.

He is currently an Associate Professor with Tenure at the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA. His research interests include microelectronic circuits and systems operating at millimeter-wave and terahertz frequencies.

Dr. Han was a recipient of the Cornell ECE Director's Ph.D. Thesis Research Award, the Cornell ECE Innovation Award, the IEEE Microwave Theory and Techniques Society (MTT-S) Graduate Fellowship Award, the IEEE Solid-State Circuits Society (SSC-S) Predoctoral Achievement Award, the Intel Outstanding Researcher Award in 2019, and the U.S. National Science Foundation (NSF) CAREER Award in 2017. He and his students won three Best Student Paper Awards of the IEEE Radio-Frequency Integrated Circuits Symposium in 2012, 2017, and 2021. He has served as an Associate Editor for IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS (TVLSI) from 2019 to 2021 and a Guest Associate Editor for IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES in 2019. He has been serving as an Associate Editor for IEEE TRANSACTIONS ON QUANTUM ENGINEERING since 2020. He also serves/has served on the Technical Program Committees (TPCs) of the IEEE RFIC Symposium, the European Microwave Conference, and the Steering Committees of the IEEE International Microwave Symposium in 2019 and 2021. He is also an IEEE MTT-S Distinguished Lecturer for the term 2020-2022.



Fei Huang (Student Member, IEEE) was born in Anhui, China, in 1993. He received the B.S. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2015 and 2018, respectively, where he is currently pursuing the Ph.D. degree in electronic engineering.

His research interests include millimeter-wave transmitting systems and high-efficiency radio frequency (RF) power amplifier (PA) design.

Mr. Huang was a recipient of the IEEE International Conference on Microwave and Millimeter Wave Technology (ICMMT) Best Student Paper Award in 2020.



Xiang Yi (Senior Member, IEEE) received the B.E. degree from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2006, the M.S. degree from the South China University of Technology (SCUT), Guangzhou, China, in 2009, and the Ph.D. degree from Nanyang Technological University (NTU), Singapore, in 2014.

He is currently a Professor with SCUT and Pazhou Laboratory, Guangzhou. Prior to that, he was a Post-Doctoral Fellow with the Massachusetts Institute of Technology (MIT), Cambridge, MA, USA. His research interests include radio frequency (RF), millimetre-wave (mm-wave), and terahertz (THz) frequency synthesizers and transceiver systems.

Dr. Yi was a recipient of the IEEE ISSCC Silkroad Award and the SSCS Travel Grant Award in 2013. He is also a technical reviewer for several IEEE journals and conferences.



**Zhenghe Feng** (Fellow, IEEE) received the B.S. degree in radio and electronics from Tsinghua University, Beijing, China, in 1970.

Since 1970, he has been with Tsinghua University as an Assistant Professor, a Lecturer, an Associate Professor, and a Full Professor. His main research areas include numerical techniques and computational electromagnetics, radio frequency (RF) and microwave circuits and antennas, wireless communications, smart antennas, and spatial-temporal signal processing.