KATHOLIEKE UNIVERSITEIT LEUVEN FACULTEIT TOEGEPASTE WETENSCHAPPEN DEPARTEMENT ELEKTROTECHNIEK (ESAT) AFDELING ESAT-INSYS Kasteelpark Arenberg 10 – B-3001 Leuven (Belgium)
Matching properties of deep sub-micron MOS transistors
Supervisors: Prof. Dr. ir. H.E. Maes, Prof. Dr. ir. W. Sansen
Dissertation presented to obtain the degree of Doctor in Applied Sciences by ir. Jeroen A. Croon
June 2004
In collaboration with IMEC vzw, Interuniversitair Micro-Elektronica Centrum vzw, Kapeldreef 75 – B-3001 Leuven (Belgium)
Jury: Prof. Dr. ir. P. Verbaeten (chairman), Prof. Dr. ir. H.E. Maes (supervisor), Prof. Dr. ir. W. Sansen (supervisor), Prof. Dr. ir. H. De Man, Prof. Dr. ir. G. Declerck, Prof. Dr. ir. R. De Keersmaecker, Prof. Dr. ir. M. Steyaert, Prof. Dr. J. Schmitz (Universiteit Twente), Dr. ir. S. Decoutere (IMEC)
U.D.C. 621.3.049.77 Legal deposit: D/2004/7515/34 ISBN 90-5682-496-1 June 2004
© Katholieke Universiteit Leuven - Faculteit Toegepaste Wetenschappen, Arenbergkasteel, B-3001 Leuven (Belgium). All rights reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the publisher. D/2004/7515/34 ISBN 90-5682-496-1
Abstract
This thesis examines the matching properties of deep sub-micron MOS transistors. Microscopic fluctuations cause stochastic parameter fluctuations that affect the accuracy of the MOSFET. In analog circuits this limits the speed, power or accuracy at which a circuit can operate. For digital designs it starts to limit the speed, power and yield. Good insight into the magnitude of the fluctuations and their physical origins is therefore required. This work studies the matching properties of MOSFETs at several levels of abstraction. Firstly, a simple and physics-based model is presented that accurately describes the mismatch in the drain current for the full bias range above the threshold voltage. This facilitates accurate circuit design for deep sub-micron technologies. Secondly, the most commonly used methods to extract the matching properties of a technology are benchmarked with respect to model accuracy, measurement accuracy and speed, and physical content of the parameters. This creates insight into which method to use in which situation and into how to treat data presented in literature. As a third topic, the physical origins of microscopic fluctuations and how they affect MOSFET operation are investigated. This leads to a refinement of the generally applied $\sigma_{\Delta P} \propto 1/\sqrt{\mathrm{area}}$ law in both weak and strong inversion. In addition, the analysis of simple transistor models highlights the physical mechanisms that dominate the fluctuations in the drain current and transconductance. The fourth topic considers the impact of process parameters on the matching properties. In accordance with literature, it is found that the granular structure of the poly-silicon gate material can play an important role. Furthermore, it is identified that the gate does not act as an ideal mask for the halo implantation, which worsens the matching properties of a technology. Also, scaling issues are briefly addressed.
Finally, the impact of gate line-edge roughness is investigated, which is considered to be one of the roadblocks to the further down-scaling of the MOS transistor. The impact of line-edge roughness on parameter fluctuations, off-state current and yield has been modeled. The effect has also been experimentally studied by intentionally increasing the roughness and by studying transistors with sinusoidally shaped gate edges. A prediction is made about the technology node at which line-edge roughness will become an issue. Summarizing, regarding the matching properties of deep sub-micron MOS transistors, this thesis tries to present insight into the modeling aspects, characterization aspects, the physical origins, and technological aspects, while also extensively treating one of the main future issues. This work could therefore be useful for device physicists, characterization engineers, technology designers, circuit designers, or anybody else interested in the stochastic properties of the MOSFET.
Acknowledgements
Doing a Ph.D. is quite involving. It also involves working together with a lot of interesting and intelligent people. I would like to acknowledge the following persons for their contributions. First of all I’d like to thank my promotors prof. Herman Maes and prof. Willy Sansen. In the past few years, their thorough review of my work helped to significantly increase the quality. I’d also like to thank prof. Gilbert Declerck, prof. Hugo De Man and prof. Roger De Keersmaecker for the careful reading of this manuscript and their comments. My supervisor, Dr. ir. Stefaan Decoutere, is greatly acknowledged for his support, guidance, and many interesting discussions. Special thanks must go to Hans Tuinhout from Philips Research in Eindhoven. Not many students have the luck to regularly meet the expert in the field they are working in. A lot of the ideas presented in this thesis would not have been realized without him, especially those related to the chapter on parameter extraction. Besides Hans, I’d also like to acknowledge Régis Difrenza from ST Microelectronics in Crolles, and Johan Knol and Antoine Moonen from Philips Semiconductors in Nijmegen for their contributions to this part of the work and for fruitful discussions. I’d also like to express my gratitude to Maarten Rosmeulen for the help he gave me at the beginning of this Ph.D. I’m still using the environment for matching analysis that he created, and a lot of the ideas regarding the description of the mismatch in the drain current originated from him. The work on line-edge roughness would not have been possible without the help of Peter Leunissen. I greatly appreciate our discussions on the fundamental aspects of the topic, setting up the experiments and the time he spent on creating the required test structures. I’d also like to express my gratitude to all other people from the lithography department that have been involved in this work, including the industrial affiliates.
During the past few years I’ve really enjoyed working in the msti group. Special thanks go to my former analog colleague Martina Da Rold, and her replacement Abdelkarim Mercha, who took the time to fully read this manuscript and give valuable criticisms. Further thanks go to the other group members (Andreas, Arturo, Frank, Gillis, Gregor, Koen, Liesbeth, Li Jen, Mingwei, Nordin, Rafael, Snezana, and Stefaan) without whom life at IMEC would have been much less enjoyable. During my Ph.D. I had a lot of valuable discussions with Ewout Vandamme and Emmanuel Augendre who often acted as test vehicles for my ‘idea of the week’. Their patience and insight are deeply appreciated. Furthermore, I’d like to thank Rita Rooyackers and Malgorzata Jurczak for taking the time to explain the details of CMOS processing to me. I’d also like to thank Frans Widdershoven from Philips Research for our many five-minute discussions on the physics behind parameter fluctuations, which generally ended up lasting longer than one hour. Many thanks go to all members of the amsimec, dip, hirel, platform, and former tcad groups for creating a great working environment. Furthermore, the administrative work of Claire Declerck, Hilde Derdin and Veerle Vranckx is highly appreciated. I’d also like to express my gratitude to Eddy Simoen who, back in 1996, introduced me to IMEC and to the field of microelectronics in general. Life is more than work. The following persons contribute to making it enjoyable. I’d like to thank Hans van Meer for the many conversations during coffee breaks and weekly dinners, and for trying to drive on the left side of the road in Ireland. As part of the same team, and as good friends, I’d like to thank Eric ten Haaf, Koen van der Zanden, and Martin Vogels for making Leuven (and Brussels) an interesting place to discover. Another great person to know is Merlijn van Spengen (pc1952), who, besides his Ph.D. work, managed to become an ‘expert’ in issues ranging from spherical lightning to the world energy problem. Théodore Marescaux and Martin Palkovic are acknowledged for excellent dinners, even better discussions, and for being nice people to hang out with.
I’d further like to thank all the people I play squash with and the ‘Leuven united’ floorball team for keeping me in the right shape. Besides those in Belgium, there are also many people in the Netherlands whom I would like to thank for their friendship. Arjan van IJzerloo, Robert Hijman, Martijn Hoogendijk, Sander Weernink, Maurice de Cock and Michiel Dekker always make it worth spending a few hours on the train. Finally, I want to thank my parents, my brother Marcel, and my grandfather (†) and grandmother. They taught me to always remain sincere, to keep the human aspect in view, and to approach things from the positive side. They also gave me the motivation and the opportunity to complete this work, but more importantly, it is always a pleasure to come home.
Jeroen Croon
March 2004
List of publications
Journal paper:
1. J.A. Croon, M. Rosmeulen, S. Decoutere, W. Sansen and H.E. Maes, “An Easy-to-Use Mismatch Model for the MOS Transistor,” IEEE Journal of Solid-State Circuits, vol. 37, no. 8, pp. 1056–1064, 2002
Conference papers:
1. J.A. Croon, M. Rosmeulen, S. Van Huylenbroeck and S. Decoutere, “A General Model for MOS Transistor Matching,” in Proc. of the 29th European Solid-State Device Research Conference, pp. 464–467, 1999
2. J.A. Croon, M. Rosmeulen, S. Decoutere, W. Sansen and H.E. Maes, “A Simple and Accurate Deep Submicron Mismatch Model,” in Proc. of the 30th European Solid-State Device Research Conference, pp. 356–359, 2000
3. J.A. Croon, M. Rosmeulen, S. Decoutere, W. Sansen and H.E. Maes, “A simple characterization method for MOS transistor matching in deep submicron technologies,” in Proc. of the 2001 International Conference on Microelectronic Test Structures, pp. 213–218, 2001
4. J.A. Croon, H.P. Tuinhout, R. Difrenza, J. Knol, A.J. Moonen, S. Decoutere, H.E. Maes and W. Sansen, “A comparison of extraction techniques for threshold voltage mismatch,” in Proc. of the 2002 International Conference on Microelectronic Test Structures, pp. 235–240, 2002
5. J.A. Croon, E. Augendre, S. Decoutere, W. Sansen and H.E. Maes, “Influence of Doping Profile and Halo Implantation on the Threshold Voltage Mismatch of a 0.13 µm CMOS Technology,” in Proc. of the 32nd European Solid-State Device Research Conference, pp. 579–582, 2002
6. J.A. Croon, G. Storms, S. Winkelmeier, I. Pollentier, M. Ercken, S. Decoutere, W. Sansen and H.E. Maes, “Line Edge Roughness: Characterization, Modeling and Impact on Device Behavior,” International Electron Device Meeting 2002, pp. 307–310, 2002
7. J.A. Croon, L.H.A. Leunissen, M. Jurczak, M. Benndorf, R. Rooyackers, K. Ronse, S. Decoutere, W. Sansen and H.E. Maes, “Experimental investigation of the impact of line-edge roughness on MOSFET performance and yield,” in Proc. of the 33rd European Solid-State Device Research Conference, pp. 227–230, 2003
Unrelated publications:
1. J. Croon, S. Biesemans, S. Kubicek, E. Simoen, K. De Meyer and C. Claeys, “Freeze-out effects on the characteristics of deep submicron Si nMOSFETs in the 77 K to 300 K range,” in Proc. of the 4th Symposium on Low Temperature Electronics and High Temperature Superconductivity, pp. 187–198, 1997
2. J.A. Croon, H.M. Borsboom and A.F. Mehlkopf, “Optimization Of Low Frequency Litz-Wire RF Coils,” in Proc. of the 7th Scientific Meeting & Exhibition of the International Society for Magnetic Resonance in Medicine, p. 740, 1999
List of symbols
Symbol | Unit | Description
$A_{0,\Delta P}$ | | proportionality constant related to the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law
$A_{0,\Delta P_1,\Delta P_2}$, $A_{L,\Delta P_1,\Delta P_2}$, $A_{W,\Delta P_1,\Delta P_2}$, $A_{WL,\Delta P_1,\Delta P_2}$ | | parameters describing the width and length dependence of the correlation between $\Delta P_1$ and $\Delta P_2$
$A_{L,\Delta P}$ | | parameter used to describe the deviation from the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law for short devices
$a_{ph}$ | cm$^2$ s$^{-1}$ | parameter describing the gate-bias dependence of the mobility due to phonon scattering
$a_{sr}$ | cm$^2$ s$^{-1}$ | parameter describing the gate-bias dependence of the mobility due to surface-roughness scattering
$A_{W,\Delta P}$ | | parameter used to describe the deviation from the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law for narrow devices
$A_{WL,\Delta P}$ | | parameter used to describe the deviation from the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law for short and narrow devices
$B_{QM1}$ | V$^{1/3}$ cm$^{2/3}$ | parameter used in the calculation of quantummechanical effects
$B_{QM3}$ | cm$^{1/3}$ C$^{-1/3}$ | parameter used in the calculation of quantummechanical effects
$C_D$ | F cm$^{-2}$ | depletion-layer capacitance
$C_{GC}$ | F cm$^{-2}$ | gate-to-channel capacitance
$C_{ox}$ | F cm$^{-2}$ | oxide capacitance
$d$ | cm | distance along line
$D_{N_A}$ | cm | location of doping concentration peak
$d_{QM}$ | cm | distance used to describe the increase in $\sigma_{\Delta V_{GS}}$ due to quantummechanical effects
$E_{eff}$ | V cm$^{-1}$ | effective field
$E_F$ | eV | Fermi energy
$E_i$ | eV | intrinsic energy
$E_{N_A}(y)$ | V cm$^{-1}$ | electrical field caused by the charge sheet $qN_A(y)dy$
$E_{QM2}$ | V cm$^{-1}$ | parameter used in the calculation of quantummechanical effects
$E_s$ | V cm$^{-1}$ | electrical field at the oxide-silicon interface
$E_1$ | V cm$^{-1}$ | electrical field at pinch-off point
$f_{\delta P}$ | cm$^2$ | normalized power spectrum of $\delta P$
$G$ | cm$^{-2}$ / cm$^{-1}$ | geometry function
$g_m$ | A V$^{-1}$ | transconductance
$g_{m_{max}}$ | A V$^{-1}$ | maximum transconductance
$g_{out}$ | A V$^{-1}$ | output conductance
$I_0$ | A | drain current extrapolated from weak inversion to $V_{GS} = V_T$
$I_D$ | A | drain current
$I_{D,noLWR}$ | A | drain current of a device without line-width roughness
$I_{off}$ | A | off-state current (or leakage current)
$I_{off,local}$ | A cm$^{-1}$ | local off-state current at a certain position $z$ along the gate
$J$ | A cm$^{-2}$ | current density
$k$ | J K$^{-1}$ | Boltzmann’s constant ($= 1.38 \cdot 10^{-23}$)
$K_{sr}$ | V s$^{-1}$ | parameter used in the calculation of $\mu_{sr}$
$K_{V_T}$, $K_{\delta}$ | | parameters used in the calculation of the impact of $\delta V_T$ on the drain current
$L$ | cm | channel length
$L_{channel}$ | cm | metallurgical channel length
$L_{crit}$ | cm | critical gate length
$L_{eff}$ | cm | effective channel length
$L_{gate}$ | cm | gate length
$L_{local}$ | cm | local gate length
$L_{mask}$ | cm | gate length on mask
$L_{met}$ | cm | metallurgical channel length
$l_p$ | cm | position of pinch-off point with respect to drain
$L_{smooth}$ | cm | local gate length after applying a smoothing window
$L_{\Delta}$ | cm | correlation length of surface roughness
$l_{\delta P}$ | cm | correlation length of the stochastic process describing $\delta P$
$m_C$ | | parameter used in the calculation of $\mu_C$
$n$ | cm$^{-3}$ | electron concentration
$n$ | | $\equiv 1 + C_D/C_{ox}$
$n$ | | $\equiv \partial\ln(\mu)/\partial\ln(E_{eff})$
$n_0$ | cm$^{-3}$ | electron concentration if no microscopic fluctuations were present
$n_i$ | cm$^{-3}$ | intrinsic carrier concentration
$N_A$ | cm$^{-3}$ | doping concentration
$N_{A0}$ | cm$^{-3}$ | peak doping concentration
$N_{dev}$ | | number of device pairs
$N_{device}$ | | number of devices
$N_{dope}$ | cm$^{-3}$ | doping concentration
$N_f$ | cm$^{-2}$ | fixed-oxide-charge density
$N_p$ | cm$^{-3}$ | doping concentration in gate
$p$ | cm$^{-3}$ | hole concentration
$p$ | | parameter in the second-order autoregressive model
$P$ | | a parameter
$p_{device}$ | | probability that a device fails
$p_{local}$ | cm$^{-1}$ | probability that locally the gate length is shorter than $L_{crit}$
$p_{\mu}$ | | Fuchs scattering factor
$q$ | C | elementary charge ($= 1.6 \cdot 10^{-19}$)
$Q_{cs}$ | C cm$^{-2}$ | charge density in charge sheet
$Q_D$ | C cm$^{-2}$ | depletion charge
$Q_i$ | C cm$^{-2}$ | inversion-layer charge
$q_p$ | C | point charge
$Q_s$ | C cm$^{-2}$ | substrate charge
$r$ | cm | radial direction
$R$ | cm$^2$ | autocovariance function related to the surface roughness
$R_{contact}$ | Ω | contact resistance
$R_D$ | Ω cm$^{-1}$ | series resistance at the drain
$r'_{I_{off}}$ | | ratio of the off-state current in a realistic device incorporating microscopic fluctuations and an ideal device without these fluctuations
$R_{LWR}$ | cm$^2$ | autocovariance function related to the line-width roughness
$R_{LWR,smooth}$ | cm$^2$ | autocovariance function related to the smoothed out line-width roughness
$R_{r'_{I_{off}}}$ | | autocovariance function related to local variation of $r'_{I_{off}}$
$R_s$ | Ω cm$^{-1}$ | series resistance
$R_S$ | Ω cm$^{-1}$ | series resistance at the source
$SW$ | cm$^{-1}$ | smoothing window
$T$ | K | temperature
$t_{GD}$ | cm | thickness of depletion layer in gate
$t_{ox}$ | cm | oxide thickness
$t_{ox_{eff}}$ | cm | effective oxide thickness
$t_{QM}$ | cm | quantummechanical increase in oxide thickness
$t_{QM_{GD}}$ | cm | parameter used to describe the impact of fluctuations in $t_{QM}$ on gate depletion
$t_{QM_{QD}}$ | cm | parameter used to describe the impact of fluctuations in $t_{QM}$ on the depletion-layer charge
$t_{\mu_{GD}}$ | cm | parameter used to describe the impact of fluctuations in the effective field on gate depletion
$t_{\mu_{QD}}$ | cm | parameter used to describe the impact of fluctuations in the effective field on the depletion-layer charge
$V_{BS}$ | V | bulk-to-source voltage
$V_{CB}$ | V | channel-to-bulk voltage
$V_{CS}$ | V | channel-to-source voltage
$V_{DS}$ | V | drain-to-source voltage
$V_{DSsat}$ | V | saturation voltage
$V_{FB}$ | V | flat-band voltage
$V_{GC}$ | V | gate-to-channel voltage
$V_{GS}$ | V | gate-to-source voltage
$V_{q_p}$ | V | impact of a point charge on surface potential
$v_{sat}$ | cm s$^{-1}$ | saturation velocity
$V_T$ | V | threshold voltage
$V_{T0}$ | V | threshold voltage at $V_{BS}$ = 0 V
$V_{T0}$ | V | threshold voltage at $V_{DS}$ = 0 V
$V_{T,local}$ | V cm$^{-1}$ | local threshold voltage at a certain position $z$ along the gate
$V_{T_{lw}}$ | V | threshold voltage of a long and wide transistor
$V_{T_{narrow}}$ | V | threshold voltage of a narrow transistor
$W$ | cm | channel width
$W_c$ | cm | correlation width related to the line-width roughness
$W_{c,I_{off}}$ | cm | correlation width related to the microscopic fluctuations in $I_{off,local}$
$W_D$ | cm | depletion-layer width
$W_D^{QM}$ | cm | depletion-layer width, calculated including quantummechanical effects
$W_{middle}$ | cm | width of center transistor
$W_{N_A}$ | cm | width of doping concentration peak
$W_{narrow}$ | cm | width of parasitic edge transistor
$W_{smooth}$ | cm | half of the width of the smoothing window
$w_{\delta P}$ | | sensitivity of the drain current to $\delta P$
$x$ | cm | direction from source to drain
$y$ | cm | direction perpendicular to oxide-silicon interface
$y_{sc}$ | cm | depth of a charge sheet
$z$ | cm | width direction
$z_{\mu}$ | cm | inversion layer thickness
$\alpha$ | | fitting parameter, used to describe the $V_{BS}$ dependence of $\sigma_{\Delta V_T}$
$\alpha_1$ | cm$^{-1}$ | parameter in the first-order autoregressive model
$\alpha_2$ | cm$^{-1}$ | parameter in the second-order autoregressive model
$\beta$ | A V$^{-2}$ | current factor
$\beta_0$ | A V$^{-2}$ | current factor, without gate or drain bias dependent effects taken into account
$\gamma$ | V$^{1/2}$ | body-effect coefficient
$\gamma_{BH}$ | | Brooks-Herring screening parameter
$\delta$ | | $= C_D/C_{ox}$
$\Delta$ | cm | rms value of the surface roughness
$\delta f_n$ | | $\equiv e^{q\delta\psi_s/kT}$
$\delta P$ | | microscopic deviation of a parameter from its typical value
$\Delta P$ | | mismatch in a parameter ($= P_2 - P_1$)
$\overline{\Delta P}$ | | average of the mismatch in a parameter
$\Delta' P$ | | deviation of a parameter from the ideal case without microscopic fluctuations
$\Delta U_T$ | V | $\equiv -(V_{GS} - V_T)\Delta I_D/I_D$
$\Delta W_D^{QM}$ | cm | quantummechanical increase of the depletion layer width
$\Delta x_s$ | cm | position, with respect to the drain, of maximum sensitivity of the drain current to $\delta V_T$
$\Delta\psi_s$ | V | shift in surface potential due to the short-channel effect
$\Delta\psi_s^{QM}$ | V | shift in surface potential due to quantummechanical effects
$\epsilon_{ox}$ | F cm$^{-1}$ | permittivity of silicon dioxide ($= 3.45 \cdot 10^{-13}$)
$\epsilon_{si}$ | F cm$^{-1}$ | permittivity of silicon ($= 1.04 \cdot 10^{-12}$)
$\zeta_{sat}$ | A V$^{-1}$ | parameter describing the drain-bias dependence of the current factor
$\zeta_{sr}$ | A V$^{-1}$ | parameter describing the gate-bias dependence of the current factor
$\eta$ | | parameter used in the calculation of the effective field
$\theta$ | V$^{-1}$ | mobility reduction factor
$\theta_{sat}$ | V$^{-1}$ | parameter describing the drain-bias dependence of the current factor
$\theta_{sr}$ | V$^{-1}$ | parameter describing the gate-bias dependence of the current factor ($= \theta$)
$\theta_1$ | V$^{-1}$ | first-order mobility reduction factor
$\theta_2$ | V$^{-2}$ | second-order mobility reduction factor
$\kappa$ | | scaling coefficient
$\lambda$ | | parameter related to the range of the short-channel effect
$\mu$ | cm$^2$ V$^{-1}$ s$^{-1}$ | mobility
$\mu_B$ | cm$^2$ V$^{-1}$ s$^{-1}$ | bulk mobility
$\mu_C$ | cm$^2$ V$^{-1}$ s$^{-1}$ | mobility limited by Coulomb scattering
$\mu_{fc}$ | cm$^2$ V$^{-1}$ s$^{-1}$ | mobility limited by fixed-oxide-charge scattering
$\mu_{sat}$ | cm$^2$ V$^{-1}$ s$^{-1}$ | drain bias limited part of mobility
$\mu_{sr}$ | cm$^2$ V$^{-1}$ s$^{-1}$ | mobility limited by surface-roughness scattering
$\mu_{sr}$ | cm$^2$ V$^{-1}$ s$^{-1}$ | gate bias limited part of mobility
$\mu_{\Delta P}$ | | average of the mismatch in a parameter
$\xi$ | cm | parameter related to the effect of non-abrupt junctions on the short-channel effect
$\rho$ | | correlation factor
$\rho$ | Ω cm$^{-3}$ | resistivity
$\rho_{LWR}$ | | autocorrelation function of the line-width roughness
$\rho_{repeat}$ | | measurement repeatability
$\rho(\Delta P_1, \Delta P_2)$ | | correlation between the mismatch in $P_1$ and the mismatch in $P_2$
$\sigma$ | | standard deviation
$\sigma_{LER}$ | cm | standard deviation of the line-edge position
$\sigma_{ln}$ | | $\equiv \lvert d\ln(I_{off})/dL \rvert\,\sigma_{LWR}$
$\sigma_{LWR}$ | cm | standard deviation of the local line-width
$\sigma_{LWR,smooth}$ | cm | standard deviation of the smoothed out local line-width
$\sigma_{\Delta P}$ | | standard deviation of the mismatch in a parameter
$\sigma_{\sigma}$ | | standard deviation of extracted standard deviation
$\phi_B$ | V | surface potential in strong inversion
$\phi_F$ | V | Fermi potential
$\phi_{MS}$ | V | work function of gate
$\phi_t$ | V | thermal voltage ($= kT/q$)
$\psi$ | V | potential
$\psi_s$ | V | surface potential
$\psi_{s0}$ | V | long-channel surface potential
$\omega_r$ | cm$^{-1}$ | spatial frequency in the $1/r$ direction
$\omega_x$ | cm$^{-1}$ | spatial frequency in the $1/x$ direction
$\omega_z$ | cm$^{-1}$ | spatial frequency in the $1/z$ direction
$[f_1 * f_2](x)$ | | $\equiv \int_{-\infty}^{\infty} f_1(x') \cdot f_2(x - x')\,dx'$ (convolution integral)
List of acronyms
Acronym | Description
2D | Two Dimensional
3D | Three Dimensional
CMOS | Complementary Metal-Oxide Semiconductor
DIBL | Drain Induced Barrier Lowering
FD | Fully Depleted
HDD | Highly Doped Drain
ITRS | International Technology Roadmap for Semiconductors
LDD | Lowly Doped Drain
LER | Line-Edge Roughness
LWR | Line-Width Roughness
MOS | Metal-Oxide Semiconductor
MOSFET | Metal-Oxide Semiconductor Field-Effect Transistor
SCE | Short-Channel Effect
SEM | Scanning Electron Microscope
SIMS | Secondary Ion Mass Spectrometry
SMU | Source-Monitor Unit
SNM | Static Noise Margin
SOI | Silicon-On-Insulator
SRAM | Static Random Access Memory
STI | Shallow Trench Isolation
Contents
Abstract
Acknowledgements
List of publications
List of symbols
List of acronyms
1 Introduction
  1.1 Matching analysis
  1.2 Importance for circuit design
  1.3 State of the art
  1.4 Research objectives
  1.5 Outline of the thesis
2 Measurement and modeling of mismatch
  2.1 Measurement setup
    2.1.1 Measurement system
    2.1.2 Test structures
    2.1.3 Measurement algorithm
  2.2 Experimental setup
  2.3 Modeling of mismatch in the drain current
    2.3.1 Modeling approach
    2.3.2 Impact of threshold voltage mismatch
    2.3.3 Impact of current factor mismatch
    2.3.4 The complete model
    2.3.5 Parameter extraction
    2.3.6 Model accuracy
  2.4 Width and length dependence
    2.4.1 Width and length dependence of $\sigma^2_{\Delta P}$
    2.4.2 Width and length dependence of correlation factors
    2.4.3 Matching properties of a 0.18 µm CMOS process
  2.5 Conclusions
3 Parameter extraction
  3.1 Extraction methods
  3.2 Experimental setup
  3.3 Comparison of extraction methods
    3.3.1 Model accuracy
    3.3.2 Measurement accuracy and speed
    3.3.3 Physical meaningfulness of parameters
    3.3.4 Summary
  3.4 Future issues
  3.5 Conclusions
4 Physical origins of MOSFET mismatch
  4.1 Basic operation of the MOS transistor
    4.1.1 Regions of operation and current expressions
    4.1.2 Short- and narrow-channel effects
    4.1.3 Gate depletion
    4.1.4 Quantummechanical effects
    4.1.5 Low field mobility
  4.2 Mismatch in the drain current
    4.2.1 Solution of the current equation in weak inversion
    4.2.2 Solution of the current equation in strong inversion
    4.2.3 Short- and narrow-channel effects
    4.2.4 Comparison of mismatch in weak and strong inversion
    4.2.5 Asymmetry of MOSFET mismatch
  4.3 Physical origins of fluctuations
    4.3.1 Doping fluctuations
    4.3.2 Impact of fluctuations in channel doping on threshold voltage
    4.3.3 Gate depletion
    4.3.4 Quantummechanical effects
    4.3.5 Mobility fluctuations
    4.3.6 Combination of all effects and comparison with experiments
    4.3.7 Discussion
  4.4 Conclusions
5 Technological aspects
  5.1 Technology descriptions
  5.2 Impact of the gate
    5.2.1 Amorphous or poly-crystalline silicon as gate material?
    5.2.2 Impact of the gate doping
  5.3 Impact of the halo implantation
    5.3.1 Long- and wide-channel transistors
    5.3.2 Short- and narrow-channel effects
  5.4 Comparison of different CMOS technologies
  5.5 Alternative device concepts
  5.6 Conclusions
6 Impact of line-edge roughness
  6.1 Characterization of line-edge roughness
  6.2 Modeling the impact of line-width roughness
    6.2.1 Impact of line-width roughness on the threshold voltage
    6.2.2 Impact of line-width roughness on the off-state current
    6.2.3 Impact of line-width roughness on yield
  6.3 Experimental investigation of the impact of LWR
    6.3.1 Experimental setup
    6.3.2 Sinusoidally-shaped gate edges
    6.3.3 Extra rough gates
    6.3.4 Yield
  6.4 Prediction of the impact of LWR and guidelines
  6.5 Conclusions
7 Conclusions, future work and outlook
  7.1 Conclusions
  7.2 Future work
  7.3 Outlook
S Summary in Dutch
  S.1 Inleiding
  S.2 Meten en modelleren van mismatch in de drainstroom
    S.2.1 Het meten van mismatch in de drainstroom
    S.2.2 Het modelleren van mismatch in de drainstroom
  S.3 Parameterextractie
    S.3.1 Modelnauwkeurigheid
    S.3.2 Meetnauwkeurigheid en -tijd
    S.3.3 Fysische betekenis
    S.3.4 Samenvatting
  S.4 Fysische oorsprong van mismatch tussen MOSFETs
    S.4.1 Mismatch in de drainstroom
    S.4.2 Fysische oorsprong van de fluctuaties
  S.5 Technologische aspecten
    S.5.1 Amorf of poly-silicium gatemateriaal?
    S.5.2 Invloed van het implanteren van een halo
    S.5.3 Schaling van de matchingeigenschappen
  S.6 Invloed van ruwheid van lijnranden
    S.6.1 Karakterisatie van de randruwheid
    S.6.2 Modellering van de invloed van randruwheid
    S.6.3 Experimentele studie van de invloed van randruwheid
    S.6.4 Voorspelling van de invloed van randruwheid
  S.7 Conclusies
Bibliography
Chapter 1
Introduction No two transistors are the same. When closely examined, differences can be observed at several levels, that, in one way or the other, are related to distance. For instance, when two ’identical’ circuits are not fabricated in the same facility, they are produced by different people using different machines. This results in slightly nonidentical circuits and different circuit yields for the two different fabrics. In order to minimize differences, strategies like the ‘copy EXACTLY! technology transfer method’ of INTEL can be employed [1]. However, even within one production facility, differences between ’identical’ circuits are observed. Different lots are not always processed using the same machines, while a machine itself shows a slight drift in time, which causes differences between wafers. On a single wafer, differences between dies are observed, which are called inter-die variations. These could for example be due to the fact that during processing the temperature is slightly different at the edge of a wafer than it is in the center. The above effects are summarized in figure 1.1. The variation between circuits increases as their distance at process time increases. At the bottom of the upturned pyramid the intra-die fluctuations are present. Intra-die fluctuations are the differences between supposedly identical structures within one die. These differences can have a systematic nature when they are caused by asymmetries in layout. For instance, it was shown in [2] that the proximity of metal wiring lines can affect transistor operation. This e.g. reduces the mirror factor of a current mirror when one of the two transistors is more closely located to the metal line, which needs to be taken into account when the circuit is designed. Besides systematic mismatch, also a stochastic component is present that is caused by the fact that at the microscopic level1 transistors are not the same. 
One of the best-known examples of stochastic fluctuations in MOSFETs is the random nature of the number of dopant atoms and their positions [3]. Stochastic fluctuations are independent of the distance between the devices under study, and they thereby determine the maximum obtainable accuracy within a certain technology. In this work, we study the stochastic fluctuations of the MOSFET, which is the most important component of modern-day integrated circuits.

Footnote 1: Or at the nanoscale level for modern-day devices.

Figure 1.1: Variability at several levels (inter-fab, inter-lot, inter-wafer, inter-die and intra-die), plotted against distance.
1.1 Matching analysis
The overall variability of a component is the sum of the variabilities at all levels. When studying the stochastic component, we want to filter out all other possible causes of variation. This is achieved by matching analysis, which characterizes the difference between two devices. Consider figure 1.2, which shows two types of variation: 1) Microscopic fluctuations typically have a length scale that is shorter than the device dimensions, and can be considered as spatial noise. 2) The other types of variation have length scales that are longer.

Now look at the differences between the three devices that are depicted in figure 1.2. The difference between the first and third device is for the largest part due to a disturbance close to device 3, the impact of which lessens as distance increases. In other words, because the surroundings of device 1 and device 3 are nonidentical, their behavior is also nonidentical. This is often caused by asymmetry in the layout, which means that the difference is systematic and the same for all processed chips. The difference between device 1 and device 2 is only marginally affected by the disturbance close to device 3. Therefore, it is mainly caused by the stochastic variation. This means that the difference, or mismatch, between device 1 and device 2 differs from, and is uncorrelated with, the difference observed on another chip.

Figure 1.2: A certain device parameter as a function of the location on the chip. Devices are located at three positions.

All this results in distributions for the mismatch as displayed in figure 1.3. Examples are shown for the mismatch in threshold voltage (∆VT) and the mismatch in the logarithm of the off-state current (∆ln(Ioff)). When a quantity is determined by a summation of numerous independent variables, its distribution tends to be normal, as is observed for the mismatch in threshold voltage. The average value is determined by the systematic component of the mismatch (denoted by $\mu_{\Delta V_T}$ or $\overline{\Delta V_T}$), which is close to zero for a symmetric layout. The width of the distribution is caused by the stochastic component and it is represented by the standard deviation (σ∆VT) or by the variance, which is the square of the standard deviation.

Figure 1.3: Distribution of the mismatch in the threshold voltage (a) and of the mismatch in the off-state current (b). Both histograms show counts for W/L = 12.0/0.10 µm/µm, plotted against ∆VT (mV) in (a) and against ∆ln(Ioff) in (b).

Another distribution that will be encountered is the lognormal distribution, which arises when the exponential of a normally distributed parameter is taken. Lognormal distributions appear when numerous independent variables are multiplied. For the examples displayed in figure 1.3, it is observed that the mismatch in the off-state current can be approximated by such a distribution. In general, it will be found that the off-state current has a distribution in between normal and lognormal.

The difference between two devices is in most cases not represented by just one parameter. However, when more parameters are needed, these do not have to be independent of one another and correlations can exist. For instance, the off-state current is a function of the threshold voltage, and a correlation between the fluctuations in these parameters can be expected.

Summarizing, when studying the matching performance of a technology, one examines the means of, standard deviations of, and correlations between the mismatch of relevant device parameters. The mismatch between two transistors increases when the distance between them is increased.
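The three quantities named in this summary (mean, standard deviation and correlation of the mismatch) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and the sample values are hypothetical and are not taken from the measurements in this thesis.

```python
import math


def mean(values):
    """Arithmetic mean of a sample."""
    return sum(values) / len(values)


def std_dev(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mismatch_summary(param_dev1, param_dev2):
    """Return (mean, standard deviation) of the pairwise difference.

    The mean estimates the systematic component of the mismatch,
    the standard deviation estimates the stochastic component.
    """
    delta = [a - b for a, b in zip(param_dev1, param_dev2)]
    return mean(delta), std_dev(delta)


# Hypothetical threshold voltages (mV) of device 1 and device 2 on five chips.
vt1 = [300.0, 302.0, 298.0, 305.0, 301.0]
vt2 = [299.0, 304.0, 297.0, 301.0, 303.0]
mu_dvt, sigma_dvt = mismatch_summary(vt1, vt2)
```

In practice far more than five pairs are needed for a meaningful standard deviation; the short lists only keep the example readable.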
1.2 Importance for circuit design
In order to understand the impact of stochastic fluctuations, three circuit examples from literature are presented. These deal with the speed-accuracy-power trade-off in analog circuits, with analog-to-digital converters, and with the SRAM circuit.

In [4] the impact of threshold-voltage mismatch on the speed-accuracy-power trade-off of analog CMOS circuits is investigated. The current mirror is examined as the basic current-processing block. As the basic voltage-processing block, a one-transistor implementation of a voltage amplifier is taken. The standard deviation of the mismatch is inversely proportional to the square root of the area [5], i.e. $\sigma_{\Delta V_T} = A_{\Delta V_T}/\sqrt{\mathrm{area}}$, where the proportionality constant $A_{\Delta V_T}$ characterizes the matching performance of a technology. Using this law, it is seen that the accuracy of a MOSFET can be increased by increasing its width or length. However, an increase in the width of a MOSFET results in a larger current and thus a larger power dissipation. Increasing the length reduces the current, but it also reduces the speed. A similar reasoning can be applied to the impact of noise. However, for the basic building blocks it is found that the impact of the matching performance of a technology on the speed-accuracy-power trade-off is one to two orders of magnitude larger than that of noise.

As a second illustration, we take a look at the work presented in [6], in which the impact of stochastic variations on the yield of an analog-to-digital converter is investigated. The results of this work are reproduced in figure 1.4. It is indeed observed that a good matching performance is required to be able to make high-accuracy analog-to-digital converters with acceptable yield.

As a third example, consider the SRAM circuit, which is embedded in many digital designs. Figure 1.5 shows a six-transistor implementation of an SRAM cell and its transfer characteristic during read access. In [7, 8] the impact of stochastic variations in the threshold voltage on the SRAM is analyzed.
This variation translates into a variation in the static noise margin (SNM), as defined in figure 1.5b. When the variation is too large, the SNM of some cells disappears, as is shown in figure 1.5 with the dashed line. In this case it is not possible to change the state of the cell and therefore it fails. It was found in [8] that in order to obtain a 90 % yield on a 1 Mbit SRAM it is required that A∆VT < 6 mVµm for a 180 nm technology and A∆VT < 2.5 mVµm for a 100 nm technology. This last number is not easy to achieve and it explains the increasing interest in research regarding stochastic parameter fluctuations.

Summarizing, it can be stated that stochastic fluctuations limit the maximum obtainable accuracy, speed, size, yield and/or the minimum obtainable power dissipation in CMOS circuits.

Figure 1.4: Yield (%) of several analog-to-digital converters with different accuracies (7 bit to 10 bit) as a function of the standard deviation of the mismatch in threshold voltage, σ(∆VT) (mV). Results are taken from [6].

Figure 1.5: Schematic of an SRAM cell (a) and its transfer characteristic during read access (b). In (b), the full lines give the transfer characteristic in case of average transistor behavior. The dashed lines represent an extreme case, for which the static noise margin is reduced to zero due to the stochastic variation in the threshold voltage. The static noise margin (SNM) is equal to the length of the side of the minimum square in the 'eyes' of the transfer characteristic. These figures are based on [7].
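The area law quoted above from [5] can be made concrete with a short numerical sketch. The device dimensions below are hypothetical, the function name is ours, and the value of 6 mVµm is simply the 180 nm SRAM bound quoted from [8], used here as an illustrative proportionality constant.

```python
import math


def sigma_delta_vt(a_vt_mv_um, width_um, length_um):
    """Standard deviation of the threshold-voltage mismatch in mV,
    following sigma = A / sqrt(W * L) with A in mV*um and W, L in um."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)


# Hypothetical device with W/L = 1.0/0.18 um and A = 6 mV*um.
sigma_small = sigma_delta_vt(6.0, 1.0, 0.18)

# Doubling both width and length quadruples the area and halves sigma.
sigma_large = sigma_delta_vt(6.0, 2.0, 0.36)
```

The second call shows the cost of accuracy under this law: halving the mismatch requires four times the gate area, with the consequences for current, power and speed described in the text.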
1.3 State of the art
Looking at the references made throughout this thesis, it is observed that only about one third is from before 1998, which is when this work was started. This indicates the increasing interest in research regarding the matching properties of MOSFETs. Here a brief introduction is presented to the most significant papers in this field. More complete references to literature will be made at the relevant places in this thesis. Three kinds of topics are distinguished. The first investigates the physical origins of MOSFET mismatch, the second models the mismatch in the drain current in terms of the mismatch in other transistor parameters, and the third investigates technology related issues. Note that one publication can treat more than one of these topics. One of the first investigated effects of microscopic fluctuations on MOSFET operation was published in 1973 by Van Overstraeten, Declerck and Broux [9]. It shows that these fluctuations need to be taken into account for accurate modeling of the weak inversion current. The first paper that examines the impact of microscopic fluctuations on the stochastic properties of macroscopic MOSFET behavior was published in 1975 by Keyes [3]. It examines the impact of the discrete character of doping on the fluctuations in the threshold voltage. This is thought to determine the lower obtainable boundary of threshold voltage fluctuations and it is still one of the most studied effects. A popular analytical derivation based on a charge-sheet approach was presented in 1997 by Takeuchi [10] and in 1998 by Stolk [11]. The topic has also been extensively studied by device simulations (see for instance the papers of Asenov [12]). Experimental work regarding doping fluctuations was presented in the mid-nineties by Mizuno [13] and in 2000 by Tuinhout [14]. We note that, until now, calculations regarding the impact of doping fluctuations are only able to explain half of the experimentally observed fluctuations in the threshold voltage. 
This indicates the presence of other fluctuation mechanisms.

Another extensively studied field is how the mismatch in MOSFET parameters translates into a mismatch in the drain current. In general, this is achieved by first-order sensitivity analysis on a relatively simple model for the drain current. In most cases mismatch in the threshold voltage and mismatch in the current factor are taken into account (see for example the work of Vittoz [15] (1985), Lakshmikumar [16] (1986), Pelgrom [5] (1989), Bastos [17] (1995) and Serrano-Gotarredona [18] (2000)). Drennan [19] (1999) follows a slightly different approach by starting from a more complex compact model and by assuming prior knowledge of width and length dependencies to estimate model parameters. Perhaps the most cited paper in the matching literature is the one written in 1989 by Pelgrom [5]. This work examines the width and length dependence of the standard deviation of the mismatch at the fundamental level. This standard deviation is found to be inversely proportional to the square root of the device area, which is one of the best-known laws in the field of matching.
The impact of technology-related parameters is less well understood. However, some effects have been studied, such as the influence of metal coverage [20] (1996) and the impact of the granular structure of the gate material [21] (1997) by Tuinhout. The impact of the vertical doping profile was studied by Takeuchi [10] (1997), while Difrenza, for example, looked at the impact of halos [22] (2000). In 2001 Stolk [8] briefly outlined the steps required to optimize a technology with respect to its matching performance. However, note that technologies keep changing and that this work can never be considered complete.

Summarizing, we conclude that research into the stochastic properties of technologies is gaining interest. Knowledge has been built up regarding the impact of doping fluctuations on the threshold voltage and regarding how the mismatch in the drain current depends on transistor parameters. Technology-related issues have been investigated, but are not completely understood. Furthermore, with the down-scaling to deep submicron and sub-100 nm gate lengths, new technological and physical issues arise. In general it can be stated that full quantitative understanding of the matching properties of MOSFETs is still missing.

Finally, reference should be made to the theses of Bastos [23] (1998), Difrenza [24] (2002) and Tuinhout, which also deal with the topic of matching. Bastos mainly concentrated on the description of the mismatch in the drain current and on the impact of mismatch on a digital-to-analog converter. Difrenza focussed on the physical modeling and also discussed the impact of the gate material and the halo implantation. Based on numerous practical examples, Tuinhout extensively studied the measurement of mismatch and layout issues.
1.4 Research objectives
The main goal of this work is to understand, model and characterize the matching properties of deep submicron MOSFETs. This is further specified as:
• Develop a physics-based model that accurately describes the mismatch in the drain current over as large a bias range as possible.
• Benchmark different methods for mismatch characterization.
• Understand and provide models for the physical causes of MOSFET mismatch.
• Investigate the impact of process steps and technological parameters on the matching performance of deep-submicron technologies.
• Investigate the impact of line-edge roughness as one of the future causes for stochastic parameter fluctuations.

These objectives encompass all three matching research topics defined in the previous section. The work presented in this thesis is done on 180 nm and 130 nm CMOS technologies developed at IMEC. Experimentally investigated gate lengths range down to below 100 nm.
1.5 Outline of the thesis
This thesis consists of five technical chapters, followed by conclusions and suggestions for future work. The chapters are related to the above-mentioned research objectives and are presented in the same order. This also approximates the chronological order in which the work took place. Exceptions are chapter 4, for which the work was done last, and chapter 5, which shows results that were obtained during the full duration of this work.

We started our work by trying to describe the mismatch in the drain current as a function of other model parameters (chapter 2). We reasoned that taking a physical model as base would automatically lead to physical insight into the matching properties of the MOSFET. This turned out to be only partly true. By the time the work for chapter 2 was finalized, Philips Research and IMEC had started working together. Comparison of our extraction methodology with that of Philips uncovered large and unexpected differences. This resulted in a small collaboration between Philips Research (Eindhoven, the Netherlands), Philips Semiconductors (Nijmegen, the Netherlands), ST Microelectronics (Crolles, France) and IMEC. The same material was measured at each of these locations and the most common extraction methods were benchmarked. The results of this work are presented in chapter 3.

By then it had become apparent that deeper knowledge regarding the physical origins of MOSFET mismatch was required, and many of the ideas that ended up in chapter 4 were developed in this period. At the same time, in the lithography group of IMEC the question arose how to deal with line-edge roughness. Another small collaboration was started, and priority was given to this work. However, some of the ideas regarding the physical origins of mismatch could already be applied to the specific case of line-edge roughness.
We have therefore chosen to present the work regarding line-edge roughness in chapter 6, at the end of this thesis, as an illustration of the more general theories presented earlier. By that time our ideas regarding the physical origins of MOSFET mismatch had had time to mature. They are presented in chapter 4. In order to understand technological issues, relevant process splits were analyzed during the full duration of this work. Also, a dedicated experiment was set up, which mainly focussed on the impact of the halo implantation. The results of this work are presented in chapter 5. A more detailed overview of the contents of the chapters will now be given.

Chapter 2: Measurement and modeling of mismatch in the drain current. The main topic of this chapter is the modeling of the mismatch in the drain current as a function of mismatch in the threshold voltage and current factor. An accurate model is required in order to fully understand the impact of variability on the MOSFET and to evaluate the impact of mismatch on circuits. We distinguish ourselves from other work by our modeling approach: the impact of the mismatch in the threshold voltage and that of the mismatch in the current factor are treated separately. Assumptions that are required to model the impact of mismatch in the current factor are not required to model the impact of mismatch in the threshold voltage. This approach results in a continuous model that is valid in moderate and strong inversion. Most of the theories presented in this thesis are compared to experimental data. Therefore, chapter 2 starts by describing our measurement setup, test structures and measurement approach.

Chapter 3: Parameter extraction. Numerous methods exist that extract the variation in the threshold voltage and current factor. Quite often publications do not mention which method is used, but we will show that significant differences can occur. The most commonly applied methods are benchmarked with respect to model accuracy, physical meaningfulness, and measurement accuracy and speed. The following methods are examined: the maximum slope method, the three points method, the four points method, applying a current criterion, and current-mismatch fitting methods.

Chapter 4: Physical origins of MOSFET mismatch. This chapter looks at the origins of fluctuations at the microscopic level and at how they affect MOSFET behavior. In order to achieve this, it is necessary to delve deeper into MOSFET theory than before and, as an introduction to this chapter, the basic equations of MOSFET operation are derived. The chapter continues by deriving these equations again, but now in the presence of microscopic fluctuations. In agreement with other published work, we find that the $1/\sqrt{\mathrm{area}}$ law does not hold in weak inversion. Furthermore, we find in this regime of operation that edge effects, like halos or shallow trench isolation, can cause serious increases in the mismatch for long and wide transistors, which are not observed in strong inversion.
In parallel and in agreement with a recent publication [25], we also find a slight departure from the $1/\sqrt{\mathrm{area}}$ law in strong inversion for high enough values of the drain bias. Short- and narrow-channel effects are described using theories published in literature. The chapter ends by using the theory of MOSFET operation to calculate the impact of doping fluctuations in the channel region and gate, the impact of fluctuations in the oxide charge and the impact of fluctuations in surface roughness. As in literature, the charge-sheet approach is followed. The calculations include quantum mechanical effects, gate depletion and fluctuations in the mobility. We predict that Coulomb scattering gives a significant contribution to stochastic parameter fluctuations. We combine all models and fit the total model to the experimentally obtained curve of the mismatch in the drain current as a function of the gate bias. The physical content of the model is tested by predicting the mismatch in the transconductance, the mismatch at different bulk bias conditions, and the correlation of the mismatches at several bias conditions.

Chapter 5: Technological aspects. In this chapter examples are presented that demonstrate how certain process parameters can affect the matching properties of a technology. As in literature, we find that the grain structure of the gate material can have a large impact. Furthermore, the impact of the halo implantation is examined. We find that halos can seriously degrade the matching performance of a technology when they are unintentionally implanted through the gate. Also in this chapter, the scaling behavior of the matching performance is addressed.

Chapter 6: Impact of line-edge roughness on parameter fluctuations, off-state current and yield. For near-future gate lengths, line-edge roughness is expected to cause significant parameter fluctuations, increase the off-state current and decrease yield. Therefore, it has recently become a topic of interest. The chapter starts with the description of line-edge roughness itself. Based on this information, we calculate the impact of line-edge roughness. We test our models by intentionally increasing the roughness. We then use these models to predict the moment at which line-edge roughness will become an issue. These predictions are used to present guidelines for both device engineering and gate-patterning process development.

This thesis ends in chapter 7 with the major conclusions and suggestions for future work.
Chapter 2
Measurement and modeling of mismatch in the drain current

In order to be able to calculate or simulate the effects of MOS transistor mismatch, it is important to have a model that accurately describes the mismatch in the drain current. In the development of such a model, several aspects have to be taken into account. First, we would like the model to be valid over a large bias range, which would allow us to use it for a large set of applications. Second, we would like the model to be physics based. A physics-based model has the advantage over empirical models that its parameters can more easily be linked to the technology of which it describes the matching properties. Third, the model needs to be continuous between the different regions of operation of the MOS transistor. Continuity makes the model easier to implement in a circuit simulator. Furthermore, a method needs to be developed to extract the model parameters.

Existing physics-based mismatch models (see footnote 1) can be separated into two groups: those that take a complex description of the drain current as base [19, 28–30], and those based on a simple description [5, 15–18, 31–41]. The mismatch models based on complex drain-current models can provide very accurate results. They include many well-understood physical effects and therefore contain many parameters. However, the mechanisms that cause mismatch are usually only partly understood, and one cannot implicitly assume that a mismatch model automatically inherits the physics contained in the drain-current model on which it is based. For example, although a parameter may be independent of a certain bias voltage, this does not have to hold for the mismatch in that parameter. Because of the large number of parameters involved, complex models are very time-consuming to use.

In this chapter, we therefore choose to develop a mismatch model that yields sufficiently accurate results, but that is kept as simple as possible (section 2.3). Our newly developed modeling approach has the advantage over previously published work that it stretches the range of validity towards lower values of the gate bias, into the upper part of the moderate inversion region. The bias dependence of the mismatch parameters will also be investigated. A new scheme for parameter extraction will be introduced. The width and length dependence of the extracted parameters is described by the model published in [5], which is presented in section 2.4. The model derived in this chapter will be tested on a 0.18 µm CMOS technology, for which measurement data is required. We will therefore start by describing how to measure MOS transistor mismatch (section 2.1) and by providing the experimental background (section 2.2). Section 2.5 concludes the chapter.

Footnote 1: As opposed to statistical modeling [26, 27].

Figure 2.1: Measurement system (UNIX workstation, parameter analyzer, switching matrix, probe-card, wafer chuck and wafer-prober).
2.1 Measurement setup
In order to determine the mismatch between two transistors, we want to measure their drain currents as simultaneously as possible. This requires an appropriate measurement system, test structures and measurement algorithm. These issues will be discussed in the following three subsections, respectively.
2.1.1 Measurement system
The system used for the measurements is schematically presented in figure 2.1. This system is part of the semi-automatic HP4063 Semiconductor Parameter Analysis System, which is described fully in [42]. It consists of a wafer-prober, a chuck, a switching matrix, a parameter analyzer and a UNIX workstation. A probe-card is also needed.
These components will now briefly be described.

Wafer-prober. Accurate extraction of a standard deviation requires the measurement of a large number of transistor pairs. Measuring manually would therefore be a very time-consuming and tedious process. The wafer-prober automatically moves the chuck around so that all required device modules on the wafer are contacted.

Chuck. The wafer under measurement is located on a thermochuck. The operation of the thermochuck induces noise in the transistors under test, which can seriously degrade measurement accuracy. Because the two transistors of a pair are measured directly after each other, the temperature difference between two consecutive measurements is not significant. Therefore, the thermochuck is switched off.

Switching matrix and probe-card. One module, which consists of 2 × 12 bonding pads, contains several transistor pairs. All twenty-four bonding pads are contacted at once by the twenty-four pins of the probe-card. The type of probe-card used depends on the material of the bonding pads and is chosen in such a way as to minimize the contact resistance. For aluminum bonding pads a probe-card with tungsten needle tips is used, while for copper bonding pads the needle tips are made of a beryllium-copper alloy. Although the probe-card has twenty-four pins, the parameter analyzer only has four SMUs. The switching matrix takes care of connecting the correct pin to the correct SMU.

Parameter analyzer. Through its four SMUs, the HP4142B parameter analyzer supplies the bias voltages and measures the currents of the transistors under test. A force-and-sense technique is applied for the biasing. The sensing is done in between the probe-card and the switching matrix. For most of the measurements only the lowest voltage range (−2 V < V < 2 V) of the system is needed, which has a more than sufficient resolution of 100 µV. The specified worst-case accuracy is < 2.1 mV. The resolution at which currents are measured is 0.02 % at the bottom of a specific measurement range and 0.002 % at the top of the range. Changes in measurement range take place at current levels of approximately $10^n$ A, where n is an integer. The specified worst-case accuracy is ∼ 0.5 % of the measured current. When doing matching measurements, the non-specified short-term repeatability of the measurement system is of far more importance than the worst-case accuracy. This will be extensively analyzed in section 3.3.2, where we will find that this measurement repeatability is much better than the specified worst-case accuracy.

UNIX workstation. The UNIX workstation is used to communicate with the measurement equipment and to collect the measurement data.
2.1.2 Test structures
This section introduces the test structures that are needed to characterize MOS transistor mismatch. A good overview of test structures for matching studies was published in [43]. Figures 2.2 and 2.3a display the standard matched transistor pair. The two transistors have a common gate, common source and common bulk. Their drains are connected separately. With the standard test structure we only want to analyze random local fluctuations. Therefore the test structure is designed to be as symmetrical as possible, the transistors are located close to each other and their currents flow in the same direction.

Figure 2.2: Layout of standard transistor pair

Mismatch can also be due to systematic differences in layout or to longer-range gradients. To analyze this kind of mismatch, different test structures are required. We will discuss the most common ones, which are also presented in figure 2.3. Note that, according to need, numerous variations on these test structures are possible.

Rotated transistors. Differences in, for example, crystal orientation and stress can cause systematic mismatch between transistors with different orientation [44]. A test structure in which transistors are rotated with respect to each other allows for investigation of these effects.

Metal coverage. For easy routing of metal lines it would be favorable if they could be laid out over transistors. This can cause systematic deviations due to, for example, insufficient passivation of dangling bonds at the silicon/silicon-dioxide interface [20, 45, 46]. To investigate the influence of the proximity of a metal line, a transistor pair is designed in which one of the transistors is covered with metal.

Impact of gradients. Layer thicknesses and doping profiles can vary slightly over a chip or wafer. These gradients cause a systematic mismatch, which becomes more prominent when two transistors are located further apart. As test structure an array of transistors is used, which are spaced at a certain distance.

Quadrature layout. A way to circumvent the effects of gradients or other systematic mismatch-causing effects is to use a quadrature layout. Each transistor in the pair is split up into two transistors. The four resulting transistors are cross-coupled (see figure 2.3e). This structure is quite complex to lay out, which might actually result in extra mismatch contributions.

Dummy transistors. When several transistors are supposed to match, the transistors at the sides of an array have different surroundings than transistors in the center. This can cause systematic mismatch. Adding dummy transistors at the edge of the array reduces this mismatch. As test structure, several closely spaced transistors are placed next to each other. All terminals are common, except the drain connections.

Impact of a capacitor or resistor. The proximity of a large structure, like a capacitor [47] or resistor, can affect transistor behavior and cause mismatch. As test structure, matched pairs are placed at several distances from the large structure under investigation.

Figure 2.3: Schematic representation of the standard matched transistor pair (a) and test structures for evaluating the mismatch between transistors with different orientations (b), the impact of metal coverage (c), gradient mismatch (d), a quadrature layout (e), the influence of dummy transistors (f) and the impact of the proximity of a large structure like a resistor or capacitor (g).

We will now return to the standard matched transistor pair, used for extracting random mismatch. In section 2.4 it will be found that random mismatch is inversely proportional to the square root of the area. Deviations are expected for short or narrow transistors. To examine this width and length dependence a proper set of dimensions needs to be defined. As an example, figure 2.4 shows the chosen dimensions for a test chip of a 0.18 µm technology. In this figure, transistors on the diagonal going from lower-left to upper-right have constant W/L ratio, but different areas. These pairs are used to evaluate the area dependence. Transistors on the other diagonal have constant area, but different W/L ratios. They are used to evaluate the impact of short- and narrow-channel effects.

Figure 2.4: Transistor pair dimensions for the evaluation of random mismatch for a 0.18 µm CMOS process (axes: W (µm) and L (µm); series: constant W/L and constant area).

Having chosen the device dimensions, another issue that needs to be dealt with is the total required number of transistor pairs with the same dimensions ($N_{dev}$). Usually, the measured mismatch in the drain current can be assumed to be normally distributed. Then, from basic statistical theory, it follows that the standard deviation ($\sigma_\sigma$) of the extracted standard deviation ($\sigma$) is equal to:

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2 N_{dev}}} \qquad (2.1)$$
One 8” wafer contains approximately 40 to 200 test chips. One experiment of ∼20 wafers usually has 2 wafers per experimental split. With one transistor pair per dimension per test chip, Ndev then lies between 80 and 400, which gives relative accuracies σσ/σ ranging from 3.5 % to 8 %. This is sufficiently accurate for most experiments.
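As a quick numerical check of (2.1), the following minimal Python sketch evaluates the relative accuracy σσ/σ = 1/√(2Ndev) for the chip counts quoted above. The function name `relative_sigma_accuracy` is illustrative only and does not come from the original work.

```python
import math


def relative_sigma_accuracy(n_dev: int) -> float:
    """Relative standard deviation of an extracted sigma, following (2.1)."""
    if n_dev <= 0:
        raise ValueError("n_dev must be a positive number of pairs")
    return 1.0 / math.sqrt(2.0 * n_dev)


# Two wafers per experimental split, one pair per dimension per test chip.
WAFERS_PER_SPLIT = 2
for chips_per_wafer in (40, 200):
    n_dev = WAFERS_PER_SPLIT * chips_per_wafer
    accuracy = relative_sigma_accuracy(n_dev)
    print(f"N_dev = {n_dev:3d}: sigma_sigma / sigma = {100.0 * accuracy:.1f} %")
```

The same function can be used the other way around, to judge how many pairs are needed before a difference between two extracted standard deviations becomes meaningful.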
2.1.3 Measurement algorithm
This subsection describes the routine that is used to measure the drain currents (ID) of the two transistors of the pair under test. MOS transistor mismatch is usually evaluated as a function of the gate bias (VGS) at a certain drain bias (VDS) and bulk bias (VBS). The gate bias ranges from 0 V to the supply voltage (VDD). Steps of 50 mV are sufficiently small. Because conditions of the surroundings (e.g. temperature) can vary over time, the curves of the two transistors should be measured as shortly after each other as possible. This is done in the following way. First the voltages are supplied to the common source, common bulk, common gate and separate drains. Next the drain current of the first transistor is measured, then the drain current of the second transistor. The gate bias is increased (or decreased in the case of PMOSFETs) by 50 mV and again the two drain currents are measured directly after each other. This process is repeated until the full ID − VGS curves are measured. Note that our main interest lies in the difference between parameters. The absolute measurement conditions are therefore not of great importance, as long as these conditions are stable.

The drains of the two transistors are routed to two separate source-measure units (SMUs). This might give rise to a measurement-system-related offset. To circumvent this problem the measurement is repeated, but transistor one is now considered as the second and vice versa. Combining the two measurements cancels out the offset. As mentioned before, another source of error could be the variation in temperature during the measurement. However, in the next chapter a good measurement repeatability will be demonstrated, and it is thus concluded that temperature fluctuations do not play a significant role.

As mentioned in the previous subsection, accurate extraction of standard deviations requires a lot of measurements. This makes measurement time a serious constraint. The algorithm presented above needs approximately 150 ms to measure one bias condition. Repeating the measurement makes this 300 ms. As a practical example we will take the measurement of a 0.18 µm technology with a supply voltage of 1.8 V. When 15 pair dimensions are examined at two values of the drain bias, one wafer contains 40 chips and both NMOS and PMOS transistors are measured, the total measurement time for one wafer would be approximately 7½ hours.
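The measurement-time estimate can be reproduced with a few lines of Python. This is only a sketch: it assumes that the gate sweep includes both end points (0 V and VDD) and that probing and stepping overhead is negligible, neither of which is stated in the text. The function and parameter names are illustrative.

```python
def wafer_measurement_hours(
    vdd: float = 1.8,              # supply voltage (V)
    step: float = 0.05,            # gate bias step (V)
    time_per_bias: float = 0.300,  # seconds per bias point, including the repeat
    pair_dimensions: int = 15,
    drain_biases: int = 2,
    chips_per_wafer: int = 40,
    transistor_types: int = 2,     # NMOS and PMOS
) -> float:
    """Estimated measurement time for one wafer, in hours."""
    # Assumption: the sweep includes both end points, 0 V and VDD.
    bias_points = round(vdd / step) + 1
    curves = pair_dimensions * drain_biases * chips_per_wafer * transistor_types
    return bias_points * time_per_bias * curves / 3600.0


print(f"{wafer_measurement_hours():.1f} hours per wafer")
```

With the default values this gives 37 bias points per curve and 2400 curves per wafer, in line with the estimate of roughly 7½ hours.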
2.2 Experimental setup
In the next section measurements are performed to test the mismatch model under development. The experimental background for these experiments will now be provided. Choices need to be made concerning: used technology, type of transistors, geometries of examined device pairs, the number of measured pairs and what to measure.

Technology. The technology chosen for this experiment is the 0.18 µm CMOS technology published in [48], which has a physical oxide thickness of 2.8 nm and a supply voltage of VDD = 1.8 V. At the time of this research, to our knowledge, simple mismatch models had not been demonstrated on technologies with gate lengths below 0.7 µm.

Type of transistors. Both NMOS and PMOS transistors are examined. Since no significant differences were observed, most of the shown results are for NMOS transistors.

Device pair geometries. In the standard layout, device pairs with 25 different geometries are available on the test chip used for the experiment. The dimensions are shown in figure 2.4. To limit measurement time only the subset of 14 pair dimensions listed in table 2.1 is measured. This subset contains approximately square transistors with different areas (left column) and transistors with constant area, but different width-over-length ratios (right column). Quite often, only results for the four emphasized geometries are shown, in order to keep the number of presented figures under control. These geometries are representative for the whole set of measured pair dimensions.

Table 2.1: Measured pair dimensions. The main focus will lie on the highlighted geometries (marked with * here).

'square'                  constant area
W (µm)   L (µm)           W (µm)   L (µm)
 0.25     0.18  *          10.0     0.18  *
 0.25     0.25              7.1     0.25
 0.35     0.35              4.2     0.45
 1.0      1.0   *           0.55    3.3
 1.4      1.9               0.35    5.3
 1.9      2.7               0.25    7.2   *
 4.0      4.0
10.0      7.2

Number of measured pairs. The sample size for this experiment is 84 device pairs per pair geometry. From (2.1) it follows that this results in a relative accuracy of the extracted standard deviations of σσ∆P/σ∆P = 7.7 %.

What to measure. To test the model, eight ID − VGS curves per transistor in the pair are measured by the routine described in subsection 2.1.3. The bias conditions of the measurements are presented in the table below.

VDS (V)   VBS (V)
 0.05       0.0
 0.3        0.0
 0.9        0.0
 1.8        0.0
 0.05      -0.9
 1.8       -0.9
 0.05      -1.8
 1.8       -1.8
2.3 Modeling of mismatch in the drain current
In this section a model is developed that describes the relative mismatch in the drain current (∆ID/ID) as function of the bias voltages (VGS, VDS and VBS). As was mentioned in the introduction of this chapter we want the model to be physics based, valid over a large bias range, continuous between different regions of operation, and as simple as possible, while sufficiently accurate. The accuracy target is:

$$ \left| \frac{\left.\sigma_{\Delta I_D/I_D}\right|_{model}}{\left.\sigma_{\Delta I_D/I_D}\right|_{experimental}} - 1 \right| < 20\,\%. \tag{2.2} $$

This section is organized as follows. In the first subsection the applied modeling approach will be introduced. Subsections 2.3.2 and 2.3.3 calculate the impact on the drain current of a mismatch in threshold voltage and current factor, respectively. Subsection 2.3.4 combines both contributions into the complete model. The method for parameter extraction is developed in subsection 2.3.5. Finally, in subsection 2.3.6 the model accuracy is examined.
2.3.1 Modeling approach
When modeling mismatch it can safely be assumed that the mismatch in a certain parameter ∆P is much smaller than the value of the parameter P itself. In this case the impact of the mismatch in parameters Pi on the drain current ID can be calculated by a first order Taylor approximation:

$$ \frac{\Delta I_D}{I_D} \cong \frac{1}{I_D}\frac{\partial I_D}{\partial P_1}\Delta P_1 + \frac{1}{I_D}\frac{\partial I_D}{\partial P_2}\Delta P_2 + \ldots \tag{2.3} $$
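The mismatch ∆ID/ID in a transistor pair is just one realization of a distribution of possible ∆ID/ID's. This distribution can usually be assumed normal, in which case it is fully described by a mean (µ∆ID/ID) and a standard deviation (σ∆ID/ID). From (2.3) it directly follows that:

$$ \mu_{\Delta I_D/I_D} = \frac{1}{I_D}\frac{\partial I_D}{\partial P_1}\mu_{\Delta P_1} + \frac{1}{I_D}\frac{\partial I_D}{\partial P_2}\mu_{\Delta P_2} + \ldots \quad \text{and} \tag{2.4} $$

$$ \sigma^2_{\Delta I_D/I_D} = \left(\frac{1}{I_D}\frac{\partial I_D}{\partial P_1}\right)^2 \sigma^2_{\Delta P_1} + \left(\frac{1}{I_D}\frac{\partial I_D}{\partial P_2}\right)^2 \sigma^2_{\Delta P_2} + \frac{2}{I_D^2}\frac{\partial I_D}{\partial P_1}\frac{\partial I_D}{\partial P_2}\,\rho(\Delta P_1,\Delta P_2)\,\sigma_{\Delta P_1}\sigma_{\Delta P_2} + \ldots, \tag{2.5} $$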
where µ∆P is the mean of ∆P , σ∆P its standard deviation, and ρ(∆P1 , ∆P2 ) the correlation between the mismatches in parameters P1 and P2 . In accordance with previous work [5, 15–18, 28, 29, 32–40], the mismatch in the drain current is assumed to result from a mismatch in threshold voltage (∆VT ) and a
mismatch in the current factor (∆β/β). Using the equations above, their impacts will be calculated in the next two subsections. Although our model will also be based on assumptions concerning the drain current model, as opposed to other models, we will determine for each parameter separately which assumptions are required. In other words, in developing our mismatch model, we did not limit ourselves to just one description of the drain current. In this way we hope to keep the model as simple as possible. The drain current is only modeled to such an extent as is necessary to describe the mismatch, related to either the threshold voltage or the current factor.
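The variance propagation of (2.5) can be sketched in a few lines of Python for an arbitrary number of parameters. The function name and the numerical values in the example are hypothetical and serve only to illustrate how the correlation term enters; they are not measured data.

```python
import math
from typing import Optional, Sequence


def relative_current_mismatch_sigma(
    sensitivities: Sequence[float],
    sigmas: Sequence[float],
    rho: Optional[Sequence[Sequence[float]]] = None,
) -> float:
    """Standard deviation of dID/ID according to (2.5).

    sensitivities[i] is (1/ID) * dID/dPi, sigmas[i] is the standard
    deviation of the mismatch in Pi, and rho[i][j] is the correlation
    between the mismatches in Pi and Pj (identity matrix if omitted).
    """
    n = len(sensitivities)
    if len(sigmas) != n:
        raise ValueError("sensitivities and sigmas must have equal length")
    variance = 0.0
    for i in range(n):
        for j in range(n):
            if rho is None:
                r = 1.0 if i == j else 0.0
            else:
                r = rho[i][j]
            # The (i, j) and (j, i) terms together give the factor 2 in (2.5).
            variance += sensitivities[i] * sensitivities[j] * r * sigmas[i] * sigmas[j]
    return math.sqrt(max(variance, 0.0))


# Hypothetical example: P1 = VT (sensitivity in 1/V), P2 = relative current factor.
sens = [-5.0, -1.0]
sig = [0.004, 0.01]
sigma_uncorrelated = relative_current_mismatch_sigma(sens, sig)
sigma_correlated = relative_current_mismatch_sigma(sens, sig, [[1.0, 0.5], [0.5, 1.0]])
print(f"uncorrelated: {100 * sigma_uncorrelated:.2f} %, rho = 0.5: {100 * sigma_correlated:.2f} %")
```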
2.3.2 Impact of threshold voltage mismatch
In calculating the impact of threshold voltage mismatch on the drain current, it is assumed that the drain current is a function of the gate-overdrive voltage (VGS − VT), but not of VGS or VT separately:

$$ I_D = f(V_{GS} - V_T,\, V_{DS},\, V_{BS}). \tag{2.6} $$

Using (2.3), it follows that:

$$ \left.\frac{\Delta I_D}{I_D}\right|_{\Delta V_T} \cong \frac{1}{I_D}\frac{\partial I_D}{\partial V_T}\Delta V_T = -\frac{1}{I_D}\frac{dI_D}{dV_{GS}}\Delta V_T = -\frac{g_m}{I_D}\Delta V_T, \tag{2.7} $$
where gm is the transconductance. As opposed to other models we do not proceed with modeling gm/ID. Further working out of the term at this stage would require more assumptions and would make this part of the mismatch model unnecessarily complex. For practical applications a circuit designer can calculate gm/ID from any suitable drain current model. When extracting model parameters or evaluating model accuracy, gm/ID can be calculated directly from the measurement data. This is the approach followed in the remainder of this section.

Since (2.7) is only based on assumption (2.6) we expect this equation to be valid in the whole inversion region. The validity of this statement will now be examined. In strong inversion (VGS ≫ VT) it approximately holds that:

$$ g_m/I_D \propto 1/(V_{GS} - V_T). \tag{2.8} $$
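Because (2.7) relies on gm/ID obtained directly from measurement data, a minimal sketch of such a calculation is given here. It assumes that the ID − VGS curve is available as two equally long lists and uses central differences (one-sided at the end points); the thesis does not specify the numerical differentiation scheme, so this choice is an assumption. The synthetic curve uses a simple linear-regime expression with constant current factor purely for illustration.

```python
def gm_over_id(vgs, drain_current):
    """Estimate gm/ID at each bias point from a sampled ID-VGS curve."""
    n = len(vgs)
    if n < 2 or len(drain_current) != n:
        raise ValueError("need two equally long lists with at least two points")
    ratios = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        # Central difference in the interior, one-sided at the ends.
        gm = (drain_current[hi] - drain_current[lo]) / (vgs[hi] - vgs[lo])
        ratios.append(gm / drain_current[k])
    return ratios


# Synthetic illustration (not measured data): ID = beta * (VGS - VT - VDS/2) * VDS.
VT, BETA, VDS = 0.373, 279e-6, 0.05
vgs = [0.6 + 0.05 * k for k in range(25)]          # 0.6 V to 1.8 V in 50 mV steps
i_d = [BETA * (v - VT - VDS / 2.0) * VDS for v in vgs]
ratios = gm_over_id(vgs, i_d)
print(f"gm/ID at VGS = {vgs[0]:.2f} V: {ratios[0]:.2f} 1/V")
```

For this synthetic curve gm/ID equals 1/(VGS − VT − VDS/2), which illustrates the 1/(VGS − VT) trend of (2.8).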
In weak inversion (VGS ≪ VT) the drain current can be written as (footnote 2):

$$ I_D = \frac{W}{L}\, I_0\, e^{(V_{GS}-V_T)/n\phi_t}\left(1 - e^{-V_{DS}/\phi_t}\right), \tag{2.9} $$

where I0 is the normalized current extrapolated to VGS = VT, L and W are the transistor length and width, φt is the thermal voltage kT/q and nφt is the subthreshold slope. From (2.8) it follows that threshold voltage mismatch becomes the dominant mismatch causing effect at low gate biases in the strong inversion region. Since in weak inversion the drain current depends exponentially on the threshold voltage, its mismatch is also expected to be primarily determined by threshold voltage fluctuations. In this case, it follows from (2.7) that σ∆VGS = σ∆ID/gm = σ∆VT. The mismatch in the gate bias is evaluated at constant drain current. For several dimensions, figure 2.5a shows σ∆VGS as a function of the average gate bias at this current. At high gate biases σ∆VGS increases with VGS, which suggests the dominance of current-factor mismatch. At lower gate biases the curves are expected to level off at σ∆VT. However, in contrast to the observations reported in [49, 50], this behavior is not encountered. Figure 2.5b plots the correlation of ∆VGS at the minimum of the σ∆VGS − VGS curve and ∆VGS(VGS) as a function of the gate bias. At high gate bias the correlation decreases, because current-factor mismatch takes over from threshold-voltage mismatch. However, it is seen that the correlation also drops when going into the weak inversion region. The behavior observed in figure 2.5 might be due to a couple of reasons. For instance, the mismatch in threshold voltage itself can originate from different physical effects in weak and strong inversion. The difference between weak and strong inversion will be extensively studied in chapter 4, section 4.2. For the model developed in this section we conclude that (2.7) is valid for gate biases higher than the minimum of the σ∆VGS − VGS curve, which approximately lies at VGS = VT (∼ 0.35 V).

Footnote 2: All equations written down in this thesis are valid for NMOS transistors. The equations for PMOS transistors are easily found by introducing the appropriate minus signs.

[Figure 2.5: (a) σ(∆VGS) (mV) versus VGS (V); (b) ρ(∆VGS, ∆VGSmin) versus VGS (V). NMOS, VDS = 1.8 V, VBS = 0.0 V.]
Figure 2.5: a) Experimental σ∆VGS − VGS curves. b) Correlation of ∆VGS with the ∆VGS value at the minimum of the σ∆VGS − VGS curve. (•) W=0.25 µm, L=0.18 µm, (■) W=10.0 µm, L=0.18 µm, (♦) W=1.0 µm, L=1.0 µm, (▲) W=0.25 µm, L=7.2 µm
Note that this range of validity is significantly larger than that of other simple models in literature.

Now the dependence of threshold voltage mismatch on the drain and bulk bias will be investigated. We will start with the drain bias dependence. For short transistors it has been reported that threshold voltage mismatch can increase due to drain induced barrier lowering (footnote 3) [51]. Figure 2.6a shows σ∆VT as a function of the drain bias for several pair dimensions. Figure 2.6b shows the correlation of ∆VT(VDS) with ∆VT @ VDS = 50 mV. In these figures, threshold voltage mismatch is extracted by applying a current criterion (footnote 4). It is observed that σ∆VT does not vary significantly with the drain bias. The correlation drops slightly with increased drain bias, which is more prominent for short transistors, as expected. However, since these effects are not very strong, they will be neglected.

Footnote 3: For a physical explanation we again refer to chapter 4.
Footnote 4: An overview of extraction methods will be presented in chapter 3.

[Figure 2.6: (a) σ(∆VT) (mV) versus VDS (V); (b) ρ(∆VT, ∆VT@VDS = 50 mV) versus VDS (V). NMOS, VBS = 0.0 V.]
Figure 2.6: σ∆VT (a) and the correlation of ∆VT with ∆VT @ VDS = 50 mV (b) as a function of the drain bias. Error bars represent 99 % confidence intervals. (•) W=0.25 µm, L=0.18 µm, (■) W=10.0 µm, L=0.18 µm, (♦) W=1.0 µm, L=1.0 µm, (▲) W=0.25 µm, L=7.2 µm

We will continue with the modeling of the bulk bias dependence of threshold voltage mismatch. The threshold voltage can be written as:

$$ V_T = V_{T0} + \gamma\left(\sqrt{\phi_B - V_{BS}} - \sqrt{\phi_B}\right), \tag{2.10} $$

$$ V_{T0} = V_{FB} + \phi_B + \gamma\sqrt{\phi_B}. \tag{2.11} $$

VFB is the flat-band voltage, VT0 is the threshold voltage at zero bulk bias, γ is the body-effect coefficient and φB is the surface potential in strong inversion. In literature, threshold voltage mismatch is usually described by a mismatch in VT0 and a mismatch in γ. We will not follow this approach for the following reason. The body-effect coefficient is given by:

$$ \gamma = \frac{\sqrt{2 q \epsilon_{si} N_A}}{C_{ox}}, \tag{2.12} $$

where Cox is the oxide capacitance per unit area, NA is the doping concentration, q is the elementary charge and εsi is the permittivity of silicon. Fluctuations in the threshold voltage can be attributed to fluctuations in doping concentration and fluctuations in oxide capacitance. Now consider figure 2.7, which shows a schematic drawing of the MOS transistor. The width of the depletion region is drawn for three different values of the bulk bias and VBS1 > VBS2 > VBS3. When the bulk bias becomes more negative, the depletion width increases. The extra amount of dopants included in the depletion region (NA2) fluctuates independently from the original amount of dopants (NA1). In other words, the correlation between ∆NA1 and ∆NA2 is zero. The same holds for the extra included dopants when the bulk bias is decreased even further. It follows that, although for uniform doping profiles γ is independent from the bulk bias, ∆γ cannot be considered constant. To avoid this problem, we choose to model the bulk bias dependence of σ∆VT instead of the bulk bias dependence of ∆VT.

[Figure 2.7: schematic cross-section with depletion widths WD1 (VBS1), WD2 (VBS2), WD3 (VBS3) and the corresponding dopant fluctuations ∆NA1, ∆NA2, ∆NA3.]
Figure 2.7: Schematic representation of the MOS transistor. The width of the depletion region WD is drawn for three different values of the bulk bias.

In [11] (footnote 5) the impact of doping fluctuations on threshold voltage mismatch is calculated to be:

$$ \sigma^2_{\Delta V_T,doping} = \frac{t_{ox}^2\sqrt{8 q^3 \epsilon_{si} N_A (\phi_B - V_{BS})}}{3 W L\, \epsilon_{ox}^2}. \tag{2.13} $$

In case of fluctuations in the oxide capacitance (σ∆Cox), it follows from (2.3) and (2.10) to (2.12) that:

$$ \sigma_{\Delta V_T,C_{ox}} = \gamma\sqrt{\phi_B - V_{BS}} \cdot \frac{\sigma_{\Delta C_{ox}}}{C_{ox}}. \tag{2.14} $$

Based on these two equations, the following empirical model is proposed:

$$ \sigma_{\Delta V_T}(V_{BS}) = \left.\sigma_{\Delta V_T}\right|_{V_{BS}=0} \cdot \left(1 - \frac{V_{BS}}{\phi_B}\right)^{\alpha}, \tag{2.15} $$

where α is a fitting parameter. It follows that in case of dominating doping fluctuations, α = 1/4, since (2.13) gives a variance proportional to the square root of (φB − VBS). For dominant fluctuations in oxide capacitance, α = 1/2. For short transistors the threshold voltage becomes less sensitive to the bulk bias and α is expected to decrease. To first order, the width and length dependence of α is modeled by:

$$ \alpha(W, L) = A^2_{0,\alpha} + \frac{A_{L,\alpha}}{L} + \frac{A_{W,\alpha}}{W} + \frac{A_{WL,\alpha}}{WL}, \tag{2.16} $$

in which A²0,α, AL,α, AW,α and AWL,α are proportionality constants. This equation takes into account possible deviations for short-, narrow-, and short-and-narrow-channel transistors.

Footnote 5: Also, see chapter 4, subsection 4.3.2.
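The empirical bulk-bias model (2.15) is simple enough to sketch in Python. The fit of α shown here is a least squares fit in the logarithmic domain, which is an assumption made for this illustration; the text does not state how the fit is performed. The values of φB and σ∆VT at zero bulk bias are hypothetical placeholders, and the data are synthetic.

```python
import math


def sigma_dvt_bulk(vbs, sigma_dvt0, alpha, phi_b):
    """Threshold voltage mismatch versus bulk bias according to (2.15)."""
    if vbs >= phi_b:
        raise ValueError("model requires VBS < phi_B")
    return sigma_dvt0 * (1.0 - vbs / phi_b) ** alpha


def fit_alpha(vbs_values, sigma_values, sigma_dvt0, phi_b):
    """Least squares estimate of alpha in the log domain.

    Taking logarithms of (2.15) gives a straight line through the origin:
    ln(sigma / sigma0) = alpha * ln(1 - VBS / phi_B).
    """
    num = 0.0
    den = 0.0
    for vbs, sigma in zip(vbs_values, sigma_values):
        x = math.log(1.0 - vbs / phi_b)
        y = math.log(sigma / sigma_dvt0)
        num += x * y
        den += x * x
    if den == 0.0:
        raise ValueError("need at least one point with VBS != 0")
    return num / den


# Hypothetical values, for illustration only.
PHI_B = 0.8        # V
SIGMA0 = 5.0e-3    # V, sigma of dVT at VBS = 0
vbs_points = [0.0, -0.45, -0.9, -1.35, -1.8]
synthetic = [sigma_dvt_bulk(v, SIGMA0, 0.25, PHI_B) for v in vbs_points]
alpha_fit = fit_alpha(vbs_points, synthetic, SIGMA0, PHI_B)
print(f"fitted alpha = {alpha_fit:.3f}")
```

A fitted value close to 1/4 would point towards doping fluctuations and a value close to 1/2 towards oxide capacitance fluctuations, subject to the caveats discussed in the text.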
[Figure 2.8: (a) σ(∆VT) (mV) versus VBS (V); (b) ρ(∆VT, ∆VT@VBS = 0.0 V) versus VBS (V). NMOS, VDS = 1.8 V.]
Figure 2.8: σ∆VT (a) and the correlation of ∆VT with ∆VT @ VBS = 0 V (b) as a function of the bulk bias. The dashed lines show fits of (2.15) to the experimental data. Error bars represent 99 % confidence intervals. (•) W=0.25 µm, L=0.18 µm, α = 0.071, (■) W=10.0 µm, L=0.18 µm, α = 0.101, (♦) W=1.0 µm, L=1.0 µm, α = 0.331, (▲) W=0.25 µm, L=7.2 µm, α = 0.342
Figure 2.8a shows fits of (2.15) to experimentally obtained values of σ∆VT as a function of the bulk bias for several pair dimensions. Figure 2.8b shows the correlation of ∆VT (VBS ) with ∆VT @VBS = 0 V. It is seen that the bulk bias dependence of σ∆VT is well described. For long transistor pairs α ≈ 0.3. This suggests that threshold voltage mismatch is mainly caused by doping fluctuations. The correlation is seen to drop for decreasing bulk bias, which is expected considering the analysis related to figure 2.7. However, in chapter 4 it will be found that the complete picture is more complicated and that we cannot jump to the conclusion of dominating doping fluctuations this easily. It is mainly for this reason that, at this stage, the empirical model (2.15) is used. Physics- and technology-related details will be discussed later. For the short transistors it is observed that the bulk bias dependence of threshold voltage mismatch disappears, as expected.
2.3.3 Impact of current factor mismatch
To calculate the influence of a mismatch in the current factor on the drain current, a more detailed description of the drain current is needed. As was reasoned in the previous subsection, current factor mismatch is only expected to have an influence in the strong inversion regime. The following widely used strong inversion model for the drain current is chosen:

$$ I_D = \beta\,(V_{GS} - V_T - V_{DS}/2)\,V_{DS}. \tag{2.17} $$

The current factor itself is given by:

$$ \beta = \frac{W C_{ox}\,\mu(V_{GS}, V_{DS})}{L}. \tag{2.18} $$
Therefore, mismatch in the current factor can be attributed to mismatch in the transistor dimensions, mismatch in the oxide capacitance or mismatch in the mobility µ, which is the only bias dependent quantity in this equation. It will be seen that mismatch in series resistance is taken into account by an apparent mismatch in the mobility. Before applying (2.3) to (2.17), the bias dependence of µ will be examined. In strong inversion the mobility is determined by the bulk mobility (µB), phonon scattering (µph = aph/(VGS − VT − VDS/2)), surface roughness scattering (µsr = asr/(VGS − VT − VDS/2)) and velocity saturation (µsat = L vsat/VDS), where aph and asr are proportionality constants and vsat is the saturation velocity. The given bias dependencies should be considered as first order approximations, which are convenient since they will result in a simple mismatch model. A more accurate analysis of mobility determining effects can be found in e.g. [52–58]. The total mobility is calculated by applying Matthiessen's rule:

$$ \frac{1}{\mu} = \frac{1}{\mu_B} + \frac{1}{\mu_{ph}} + \frac{1}{\mu_{sr}} + \frac{1}{\mu_{sat}}. \tag{2.19} $$
Combining this with (2.18) yields:

$$ \frac{1}{\beta} = \frac{1}{\beta_0} + \frac{V_{GS} - V_T - V_{DS}/2}{\zeta_{sr}} + \frac{V_{DS}}{\zeta_{sat}}, \tag{2.20} $$
where β0 = W Cox µB/L, 1/ζsr = (L/(W Cox))((1/aph) + (1/asr)) and ζsat = W Cox vsat. Mobility depends only weakly on the bulk bias. This dependence will therefore be neglected. Mathematically, (2.20) is equivalent to the approach followed by [17, 18, 35, 37–40], in which the current factor is described by:

$$ \beta = \frac{\beta_0}{1 + \theta_{sr}(V_{GS} - V_T - V_{DS}/2) + \theta_{sat}V_{DS}}, \tag{2.21} $$
where the mobility reduction parameters θsr = β0/ζsr and θsat = β0/ζsat. The parameters θsr and θsat do not depend on the oxide capacitance, but they do depend on µB. In our formulation β0, ζsr and ζsat all depend on Cox, but the mobility determining effects are represented by separate parameters.

When series resistance at the source (RS) and drain (RD) plays a significant role, in (2.17) and (2.20) VGS needs to be replaced by VGS − ID RS and VDS by VDS − ID(RS + RD). Since the MOS transistor is symmetrical, RS = RD. When the impact of series resistance on the current factor, described by (2.20), is neglected, it easily follows that series resistance effects can be included by replacing the parameters ζsr and ζsat by:

$$ 1/\zeta_{sr} = \frac{L}{W C_{ox}}\left(\frac{1}{a_{ph}} + \frac{1}{a_{sr}}\right) + R_S + R_D, \tag{2.22} $$

$$ 1/\zeta_{sat} = \frac{1}{W C_{ox} v_{sat}} + \frac{R_S - R_D}{2}. \tag{2.23} $$
It is seen that the impacts of the source and drain resistance on ζsat cancel out. However, the mismatches in source and drain resistance (∆RS and ∆RD) are uncorrelated and do not have to be equal within one transistor. Therefore, the fluctuations in ζsat are affected by mismatch in series resistance.

All of the above presented equations are valid in the linear regime of the strong inversion region, which means that the drain bias is smaller than the saturation voltage (VDSsat). For larger drain bias it has to be replaced by the saturation voltage, which is calculated by putting dID/dVDS = 0. Applying this to (2.17) with (2.20) yields:

$$ V_{DSsat} = \frac{\sqrt{a^2 + 2ab(V_{GS} - V_T)} - a}{b}, \tag{2.24} $$

where a = (1/β0) + (VGS − VT)/ζsr and b = (1/ζsat) − 1/(2ζsr). When b → 0, this equation simplifies to:

$$ V_{DSsat} = V_{GS} - V_T. \tag{2.25} $$

Even when b → 0 it is preferred to use (2.24), because the use of (2.25) creates a discontinuity between the linear and the saturation regime.

We will now proceed by calculating the impact of a mismatch in the current factor on the drain current. The parameters determining current factor mismatch are the mismatch in β0, ζsr and ζsat. Applying (2.3) to (2.17) with (2.20) yields:

$$ \left.\frac{\Delta I_D}{I_D}\right|_{\Delta(1/\beta)} = -\beta\,\Delta\frac{1}{\beta_0} - \beta\,(V_{GS} - V_T - V_{DS}/2)\,\Delta\frac{1}{\zeta_{sr}} - \beta\,V_{DS}\,\Delta\frac{1}{\zeta_{sat}}. \tag{2.26} $$

This equation is valid in the linear regime. In saturation, again, VDS needs to be replaced by VDSsat. Fluctuations in VDSsat do not influence the drain current, since ∂ID/∂VDSsat = 0.
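The current factor expressions (2.20) and (2.21) and the saturation voltage (2.24) can be checked numerically with the short Python sketch below. The parameter values are those listed in table 2.2 for the W/L = 1.0/1.0 µm pair; the overdrive voltage of 1 V is an arbitrary choice. Note that `vds_sat` evaluates (2.24) in the algebraically equivalent form 2a(VGS − VT)/(√(a² + 2ab(VGS − VT)) + a), a rewriting introduced here so that the expression stays finite for b = 0; it is not taken from the thesis.

```python
import math


def beta_from_zeta(vx, vds, beta0, zeta_sr, zeta_sat):
    """Current factor from (2.20); vx = VGS - VT - VDS/2."""
    return 1.0 / (1.0 / beta0 + vx / zeta_sr + vds / zeta_sat)


def beta_from_theta(vx, vds, beta0, theta_sr, theta_sat):
    """Current factor from (2.21)."""
    return beta0 / (1.0 + theta_sr * vx + theta_sat * vds)


def drain_current(vov, vds, beta0, zeta_sr, zeta_sat):
    """Linear-regime drain current, (2.17) with (2.20); vov = VGS - VT."""
    vx = vov - vds / 2.0
    return beta_from_zeta(vx, vds, beta0, zeta_sr, zeta_sat) * vx * vds


def vds_sat(vov, beta0, zeta_sr, zeta_sat):
    """Saturation voltage (2.24), written so that b = 0 poses no problem."""
    a = 1.0 / beta0 + vov / zeta_sr
    b = 1.0 / zeta_sat - 1.0 / (2.0 * zeta_sr)
    root = math.sqrt(a * a + 2.0 * a * b * vov)
    # (root - a) / b multiplied by (root + a) / (root + a).
    return 2.0 * a * vov / (root + a)


# Table 2.2, W/L = 1.0/1.0 um: beta0 in A/V^2, zeta_sr and zeta_sat in A/V.
BETA0, ZETA_SR, ZETA_SAT = 279e-6, 3.17e-3, 2.75e-3
VOV = 1.0  # arbitrary gate overdrive (V)

vx_lin = VOV - 0.05 / 2.0
b_zeta = beta_from_zeta(vx_lin, 0.05, BETA0, ZETA_SR, ZETA_SAT)
b_theta = beta_from_theta(vx_lin, 0.05, BETA0, BETA0 / ZETA_SR, BETA0 / ZETA_SAT)

v_sat = vds_sat(VOV, BETA0, ZETA_SR, ZETA_SAT)
i_sat = drain_current(VOV, v_sat, BETA0, ZETA_SR, ZETA_SAT)
v_sat_b0 = vds_sat(VOV, BETA0, ZETA_SR, 2.0 * ZETA_SR)  # makes b exactly zero

print(f"beta (2.20) = {b_zeta:.4e}, beta (2.21) = {b_theta:.4e}")
print(f"VDSsat = {v_sat:.3f} V, VDSsat for b = 0: {v_sat_b0:.3f} V")
```

The drain current evaluated slightly below and slightly above `v_sat` is smaller than `i_sat`, which confirms that (2.24) marks the maximum of the linear-regime expression.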
2.3.4 The complete model
The total mismatch in the drain current is calculated by adding the contribution due to threshold-voltage mismatch (2.7) and the contribution due to current-factor mismatch (2.26):

$$ \frac{\Delta I_D}{I_D} = \left.\frac{\Delta I_D}{I_D}\right|_{\Delta V_T} + \left.\frac{\Delta I_D}{I_D}\right|_{\Delta(1/\beta)} \tag{2.27} $$

It was found earlier that the first term on the right hand side is approximately valid from VGS = VT to VGS = VDD, while the second term is only valid in strong inversion. However, as was reasoned before, at low gate bias ∆ID/ID|∆VT is much larger than ∆ID/ID|∆(1/β). It is therefore safe to use (2.26) for all gate biases greater than the threshold voltage without much loss of overall model accuracy. Extrapolating to even lower gate biases results in negative mobility terms, which might lead to singularities. To avoid this, when a mobility term in (2.20) turns negative, it is equated to zero, which is equivalent to removing it from the model.
To calculate µ∆ID/ID and σ∆ID/ID, (2.4) and (2.5) need to be applied to (2.27). For σ∆ID/ID this results in:

$$ \sigma^2_{\Delta I_D/I_D} = \left(\frac{g_m}{I_D}\right)^2 \sigma^2_{\Delta V_T} + \beta^2 \sigma^2_{\Delta(1/\beta_0)} + \beta^2 (V_{GS} - V_T - V_{DS}/2)^2 \sigma^2_{\Delta(1/\zeta_{sr})} + \beta^2 V_{DS}^2\, \sigma^2_{\Delta(1/\zeta_{sat})} + \text{correlation terms}. \tag{2.28} $$
The width and length dependence of the variances of the mismatch parameters and the correlation factors will be modeled in the next section. We will now proceed with the development of the parameter extraction routine.
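A minimal Python sketch of (2.28) without the correlation terms is given below, to show how the four contributions combine at one bias point. All numerical values in the example are hypothetical placeholders rather than extracted parameters, and the function name is illustrative. The clipping of the gate-overdrive term at zero follows the remark above about negative mobility terms.

```python
import math


def sigma_relative_drain_current(
    gm_over_id, beta, vgs, vt, vds,
    sigma_vt, sigma_inv_beta0, sigma_inv_zeta_sr, sigma_inv_zeta_sat,
):
    """Standard deviation of dID/ID from (2.28), correlation terms omitted.

    In saturation the caller should pass VDSsat instead of VDS.
    """
    # A negative mobility term is set to zero, i.e. removed from the model.
    vx = max(vgs - vt - vds / 2.0, 0.0)
    variance = (
        (gm_over_id * sigma_vt) ** 2
        + (beta * sigma_inv_beta0) ** 2
        + (beta * vx * sigma_inv_zeta_sr) ** 2
        + (beta * vds * sigma_inv_zeta_sat) ** 2
    )
    return math.sqrt(variance)


# Hypothetical bias point and mismatch parameters (illustration only).
sigma_total = sigma_relative_drain_current(
    gm_over_id=2.0, beta=250e-6, vgs=1.2, vt=0.373, vds=0.05,
    sigma_vt=3e-3, sigma_inv_beta0=20.0, sigma_inv_zeta_sr=5.0, sigma_inv_zeta_sat=5.0,
)
sigma_vt_only = sigma_relative_drain_current(
    gm_over_id=2.0, beta=250e-6, vgs=1.2, vt=0.373, vds=0.05,
    sigma_vt=3e-3, sigma_inv_beta0=0.0, sigma_inv_zeta_sr=0.0, sigma_inv_zeta_sat=0.0,
)
print(f"total: {100 * sigma_total:.2f} %, VT contribution only: {100 * sigma_vt_only:.2f} %")
```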
2.3.5 Parameter extraction
In the previous subsections two kinds of parameters were encountered: parameters related to the modeling of the drain current (VT, β0, ζsr and ζsat) and the parameters describing the mismatch (∆VT, ∆(1/β0), ∆(1/ζsr) and ∆(1/ζsat)). Firstly, the extraction of drain-current-model parameters will be outlined. Secondly, we will look into the extraction of the mismatch-model parameters.

Drain-current-model parameters. Of the parameters VT, β0, ζsr and ζsat, the first three are estimated from the ID − VGS curve in the linear regime at low drain bias (VDS = 50 mV), at which the ζsat term in (2.20) can be ignored. First ζsr is considered to be infinite and VT and β0 are determined by the maximum slope method (footnote 6):

$$ \beta_0 = \frac{1}{1 - I_D/(\zeta_{sr}V_{DS})} \cdot \frac{g_m}{V_{DS}} \quad @\; g_m = g_{m\,max} \tag{2.29} $$

$$ V_T = V_{GS} - V_{DS}/2 - \frac{I_D}{\beta_0 V_{DS}} \quad @\; g_m = g_{m\,max}, \tag{2.30} $$

where gmmax is the maximum transconductance. The first factor on the right hand side of (2.29) is a correction for finite values of ζsr, which lower the transconductance. To estimate ζsr, (2.17) with (2.20) is rewritten into:

$$ I_D\left(1 + (\beta_0/\zeta_{sr})(V_{GS} - V_T - V_{DS}/2)\right) = \beta_0 (V_{GS} - V_T - V_{DS}/2)\,V_{DS}, \tag{2.31} $$
from which 1/ζsr is estimated by a linear least squares fit. Since this model is only valid in the strong inversion region, the fit ranges from VGS @ gm = gmmax to VGS = VDD. This new value of ζsr is now introduced into (2.29), after which new values of β0 and VT are calculated. From these values a new value of ζsr can be calculated and so on. This process is iterated until no significant changes are observed.

Footnote 6: The maximum slope method will be illustrated in chapter 3, figure 3.1.

The parameter ζsat is extracted from the ID − VGS curve in saturation (VDS = VDD). The approach is the same as the one followed to extract ζsr. Multiplying the left and right hand sides of (2.17) by 1/β results in a function that depends on 1/ζsat, like (2.31) depends on 1/ζsr. From this, 1/ζsat is extracted by a least squares fit. Since the estimation takes place in saturation, in (2.17) and (2.20) VDS needs to be replaced by VDSsat, which is itself a function of ζsat. To avoid this problem the following estimate for the saturation voltage is used:

$$ V_{DSsat} \cong \sqrt{2 \cdot \left.\frac{(V_{GS} - V_T - V_{DS}/2)\,V_{DS}}{I_D}\right|_{V_{DS}=50\,\mathrm{mV}} \cdot \left.I_D\right|_{V_{DS}=V_{DD}}}. \tag{2.32} $$

The extracted parameters are summarized in table 2.2.

Table 2.2: Extracted values of VT, β0, ζsr and ζsat for several device dimensions. The oxide thickness is equal to 2.8 nm.

W (µm)   L (µm)   VT (V)   β0 (µA V⁻²)   ζsr (µA V⁻¹)   ζsat (µA V⁻¹)
 0.25     0.18    0.282    516           1.70 · 10³     996
10.0      0.18    0.329    17.8 · 10³    45.3 · 10³     36.0 · 10³
 1.0      1.0     0.373    279           3.17 · 10³     2.75 · 10³
 0.25     7.2     0.293    10.6          156            136

Mismatch-model parameters. We will proceed with the extraction of ∆VT, ∆(1/β0), ∆(1/ζsr) and ∆(1/ζsat). It follows from (2.7), (2.26) and (2.27) that ∆ID/ID depends linearly on these parameters. The most straightforward way to extract the parameters is to use a linear fit of these equations to the experimental ∆ID/ID − VGS curves. Both the curve at VDS = 50 mV and at VDS = VDD are included in the fit. The gate bias ranges from the minimum of the σ∆VGS − VGS curve in figure 2.5a to VGS = VDD. A disadvantage of this method is illustrated in figure 2.9, which shows the mismatch in one transistor pair as a function of the gate bias. It is observed that the mismatch at low gate bias is much higher than at high gate bias. Therefore, a small inaccuracy in the model at low gate bias could have a large impact on the obtained relative accuracy at higher gate biases. To avoid this problem, we choose not to minimize the sum of the squared differences, but to minimize the sum of the squared relative differences. This normalization is achieved by dividing the left- and right-hand sides of (2.27) by σ∆ID/ID before performing the fit. The result of this fit is also shown in figure 2.9.
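The weighted fit described above can be sketched as an ordinary linear least squares problem in which every bias point is divided by its σ∆ID/ID before fitting. The Python code below is one possible implementation under stated assumptions: it solves the normal equations with a small Gaussian elimination routine and normalises the columns for numerical robustness, neither of which is prescribed by the text. The bias points, weights and "true" parameters in the example are hypothetical placeholders, not measured values.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular system: parameters are not identifiable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def weighted_lstsq(rows, targets, weights):
    """Minimise sum(((targets[k] - rows[k] . p) / weights[k])**2) over p."""
    n = len(rows[0])
    scaled_rows = [[v / w for v in row] for row, w in zip(rows, weights)]
    scaled_targets = [t / w for t, w in zip(targets, weights)]
    # Normalise columns so that parameters with very different units
    # do not degrade the conditioning of the normal equations.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scaled_rows)) or 1.0 for j in range(n)]
    unit_rows = [[row[j] / norms[j] for j in range(n)] for row in scaled_rows]
    ata = [[sum(r[i] * r[j] for r in unit_rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * t for r, t in zip(unit_rows, scaled_targets)) for i in range(n)]
    return [q / norms[j] for j, q in enumerate(solve_linear(ata, aty))]


def mismatch_design_row(gm_over_id, beta, vx, vds):
    """Coefficients of [dVT, d(1/beta0), d(1/zeta_sr), d(1/zeta_sat)] in (2.27)."""
    return [-gm_over_id, -beta, -beta * vx, -beta * vds]


# Hypothetical bias points: (gm/ID, beta, VGS - VT - VDS/2, VDS or VDSsat).
points = [
    (8.0, 2.7e-4, 0.1, 0.05), (2.0, 2.6e-4, 0.5, 0.05), (0.8, 2.5e-4, 1.2, 0.05),
    (4.0, 2.5e-4, 0.3, 0.5), (1.5, 2.2e-4, 0.8, 1.0), (1.0, 2.0e-4, 1.0, 1.4),
]
sigma_weights = [0.05, 0.02, 0.01, 0.03, 0.015, 0.012]  # sigma of dID/ID per point
P_TRUE = [2e-3, 15.0, -4.0, 6.0]                        # hypothetical pair mismatch

rows = [mismatch_design_row(*p) for p in points]
targets = [sum(c * p for c, p in zip(row, P_TRUE)) for row in rows]
fitted = weighted_lstsq(rows, targets, sigma_weights)
print("fitted [dVT, d(1/beta0), d(1/zeta_sr), d(1/zeta_sat)]:", fitted)
```

With noise-free synthetic data the weights do not change the result; with measured data they keep the large mismatch at low gate bias from dominating the fit, which is the motivation given in the text.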
[Figure 2.9: ∆ID/ID (%) versus VGS (V) at VDS = 50 mV and VDS = 1.8 V. NMOS, W = 10.0 µm, L = 0.18 µm.]
Figure 2.9: The ∆ID/ID − VGS curves of a certain transistor pair at VDS = 50 mV and VDS = VDD. Also shown is a standard least squares fit to these curves (full lines) and a weighted least squares fit (dashed lines). The drain-current-model parameters are listed in table 2.2.

2.3.6 Model accuracy

This subsection examines the accuracy of the developed model by comparing experimental σ∆ID/ID − VGS curves to the modeled ones. Figure 2.10 shows the comparison at zero bulk bias with the drain bias as a parameter. For all shown dimensions it is seen that the curves at VDS = 50 mV and VDS = 1.8 V are well described by the model. The curves at VDS = 0.3 V and VDS = 0.9 V were not included in the fit and are seen to be well predicted. Figure 2.11 shows the model accuracy for all measured dimensions at zero bulk bias and |VDS| = 1.8 V. In the strong inversion region the model is seen to describe the measurements within the required 20 % accuracy. In weak inversion the accuracy decreases as expected (see subsection 2.3.2 and figure 2.5).

Figure 2.12 compares experimental and modeled σ∆ID/ID − VGS curves at VDS = 1.8 V with the bulk bias as a parameter. Again the curves are seen to be well described. Note that all of the parameters, except α, have been extracted at zero bulk bias. The values for α are taken from figure 2.8a. Figure 2.13 shows the model accuracy for all measured pair dimensions at |VDS| = |−VBS| = 1.8 V. Again, within the strong inversion region the model is seen to describe the measurements within the required 20 % accuracy range. In weak inversion the accuracy decreases. Note that in this figure the weak inversion region is larger than that in figure 2.11, since the threshold voltage increases with decreasing bulk bias.
2.4 Width and length dependence
In the previous section a model was developed to describe the mismatch in the drain currents of a transistor pair. In this section the width and length dependence will be modeled of the variance of the mismatch in a certain parameter (σ²∆P) (subsection 2.4.1) and of the correlation factors between the mismatches in parameters (subsection 2.4.2). The last subsection tests and demonstrates the derived model.
30
CHAPTER 2. MEASUREMENT AND MODELING OF MISMATCH
VDS = 0.05 V VDS = 0.3 V VDS = 0.9 V VDS = 1.8 V
10
s(DID/ID) (%)
s(DID/ID) (%)
100
10
NMOS W = 10.0 mm L = 0.18 mm
NMOS W = 0.25 mm L = 0.18 mm 0
VDS = 0.05 V VDS = 0.3 V VDS = 0.9 V VDS = 1.8 V
1
0.5
1
1.5
0
2
0.5
NMOS W = 1.0 mm L = 1.0 mm 1
0
1
2
NMOS W = 0.25 mm L = 7.2 mm
1
0.5
1.5
VDS = 0.05 V VDS = 0.3 V VDS = 0.9 V VDS = 1.8 V
10 s(DID/ID) (%)
s(DID/ID) (%)
VDS = 0.05 V VDS = 0.3 V VDS = 0.9 V VDS = 1.8 V
10
1 VGS (V)
VGS (V)
1.5
2
0
0.5
VGS (V)
1
1.5
2
VGS (V)
20
s(model) / s(exp.) - 100 (%)
s(model) / s(exp.) - 100 (%)
Figure 2.10: Experimental (symbols) and modeled (lines) σ∆ID /ID − VGS curves for several values of the drain bias and device dimensions. VBS = 0 V
0 -20 0.25/0.18 10.0/0.18 0.25/0.25 7.1/0.25 0.35/0.35 4.2/0.45 1.0/1.0
-40 -60 -80 -100
0
0.5
1 VGS (V)
(a) NMOS
1.4/1.9 1.9/2.7 0.55/3.3 4.0/4.0 0.35/5.3 0.25/7.2 10.0/7.2
1.5
2
20 0 -20 -40 -60 -80 -100
-2
0.25/0.18
1.4/1.9
10.0/0.18
1.9/2.7
0.25/0.25
0.55/3.3
7.1/0.25
4.0/4.0
0.35/0.35
0.35/5.3
4.2/0.45
0.25/7.2
1.0/1.0
10.0/7.2
-1.5
-1
-0.5
0
VGS (V)
(b) PMOS
Figure 2.11: Relative difference between the modeled and experimental σ∆ID /ID −VGS curves for all measured device dimensions. The W/L ratios are included in the plots. |VDS | = 1.8 V, VBS = 0 V
2.4. WIDTH AND LENGTH DEPENDENCE
31
VBS = 0.0
s(DID/ID) (%)
s(DID/ID) (%)
100
VBS = -0.9 VBS = -1.8
10
VBS = 0.0 VBS = -0.9 VBS = -1.8
10
NMOS W = 10.0 mm L = 0.18 mm
NMOS W = 0.25 mm L = 0.18 mm 1 0
0.5
1
1.5
0
2
0.5
1
VBS = -0.9 VBS = -1.8 10
VBS = -0.9
10
VBS = -1.8
NMOS W = 1.0 mm L = 1.0 mm 0
0.5
1
2
VBS = 0.0 s(DID/ID) (%)
s(DID/ID) (%)
VBS = 0.0
1
1.5
VGS (V)
VGS (V)
1.5
NMOS W = 0.25 mm L = 7.2 mm
1
2
0
0.5
VGS (V)
1
1.5
2
VGS (V)
20
s(model) / s(exp.) - 100 (%)
s(model) / s(exp.) - 100 (%)
Figure 2.12: Experimental (symbols) and modeled (lines) σ∆ID /ID − VGS curves for several values of the bulk bias and device dimensions. VDS = 1.8 V
Figure 2.13: Relative difference between the modeled and experimental σ∆ID/ID − VGS curves for all measured device dimensions. The W/L ratios are included in the plots. |VDS| = |VBS| = 1.8 V. [Panels: (a) NMOS, (b) PMOS; same W/L ratios as in figure 2.11; axes: σ(model)/σ(exp.) − 100 (%) versus VGS (V).]
2.4.1
Width and length dependence of σ²∆P
Several publications exist that deal with modeling σ²∆P(W, L). The most frequently cited is that of Pelgrom et al. [5]. This subsection starts with a summary of that work. It is assumed that the parameter P can locally be defined as P(x, z) = µP + δP(x, z) and that the overall transistor parameter P is given by averaging P(x, z) over the area of the transistor. Now assume that the first device is located between the coordinates {x1, z1} and {x1 + L, z1 + W} and that the second device is located between {x2, z2} and {x2 + L, z2 + W}. The mismatch between the two devices is then given by:

$$
\Delta P = \frac{1}{WL}\left( \iint_{\{x_2,z_2\}}^{\{x_2+L,\,z_2+W\}} \delta P(x',z')\,dx'\,dz' - \iint_{\{x_1,z_1\}}^{\{x_1+L,\,z_1+W\}} \delta P(x',z')\,dx'\,dz' \right). \tag{2.33}
$$

This equation can be interpreted as the convolution of a mismatch-causing disturbance function δP(x, z) and a geometry function G(x, z), which is given by:

$$
G(x,z) =
\begin{cases}
-\dfrac{1}{WL} & \{x,z\} \in \{\{x_1,z_1\},\{x_1+L,\,z_1+W\}\} \\[2mm]
\dfrac{1}{WL} & \{x,z\} \in \{\{x_2,z_2\},\{x_2+L,\,z_2+W\}\} \\[2mm]
0 & \{x,z\} \notin \{\{x_i,z_i\},\{x_i+L,\,z_i+W\}\}, \quad i = 1, 2.
\end{cases} \tag{2.34}
$$
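Convolution in the space domain is equivalent to multiplication in the spatial frequency domain:

$$
\Delta\mathcal{P}(\omega_x,\omega_z) = \mathcal{G}(\omega_x,\omega_z) \cdot \delta\mathcal{P}(\omega_x,\omega_z), \tag{2.35}
$$

where G(ωx, ωz) and δP(ωx, ωz) are the two-dimensional Fourier transforms of G(x, z) and δP(x, z), respectively. From (2.34) the first is calculated to be:

$$
\mathcal{G}(\omega_x,\omega_z) = \frac{\sin(L\omega_x/2)\,\sin(W\omega_z/2)}{(L\omega_x/2)(W\omega_z/2)} \left( e^{i(x_2+L/2)\omega_x + i(z_2+W/2)\omega_z} - e^{i(x_1+L/2)\omega_x + i(z_1+W/2)\omega_z} \right). \tag{2.36}
$$

From basic spectral theory it follows that σ²∆P is equal to:

$$
\sigma^2_{\Delta P} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |\Delta\mathcal{P}(\omega_x,\omega_z)|^2\,d\omega_x\,d\omega_z = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |\mathcal{G}(\omega_x,\omega_z)|^2 \cdot |\delta\mathcal{P}(\omega_x,\omega_z)|^2\,d\omega_x\,d\omega_z, \tag{2.37}
$$

where |δP(ωx, ωz)|² is the power spectrum of δP(x, z). When the lowest significant frequency of the mismatch-generating process is much larger than 1/W and 1/L, a mismatch-causing event in one device does not have an impact on the other device and (2.37) can be approximated by:

$$
\sigma^2_{\Delta P} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |\mathcal{G}(\omega_x,\omega_z)|^2 \cdot |\delta\mathcal{P}(0,0)|^2\,d\omega_x\,d\omega_z = \frac{8\pi^2\,|\delta\mathcal{P}(0,0)|^2}{WL} \equiv \frac{A^2_{0,\Delta P}}{WL}. \tag{2.38}
$$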
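As a check on the prefactor 8π² in (2.38), the integral of |G|² can be evaluated directly from (2.36). The following is a sketch under the assumption that the two devices do not overlap; the offsets D_x and D_z are notation introduced here only for this check.

```latex
\begin{align*}
|\mathcal{G}(\omega_x,\omega_z)|^2
  &= \frac{\sin^2(L\omega_x/2)\,\sin^2(W\omega_z/2)}{(L\omega_x/2)^2\,(W\omega_z/2)^2}
     \bigl[\,2 - 2\cos(D_x\omega_x + D_z\omega_z)\bigr],
  \qquad D_x = x_2 - x_1,\quad D_z = z_2 - z_1, \\[1mm]
\int_{-\infty}^{\infty} \frac{\sin^2(L\omega_x/2)}{(L\omega_x/2)^2}\,\cos(D_x\omega_x)\,d\omega_x
  &= \frac{2\pi}{L}\,\max\!\left(0,\ 1 - \frac{|D_x|}{L}\right)
  \qquad \text{(and likewise in } z \text{ with } W,\ D_z\text{)}, \\[1mm]
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |\mathcal{G}(\omega_x,\omega_z)|^2\,d\omega_x\,d\omega_z
  &= 2\cdot\frac{2\pi}{L}\cdot\frac{2\pi}{W}
     \left[1 - \max\!\left(0,\,1-\frac{|D_x|}{L}\right)\max\!\left(0,\,1-\frac{|D_z|}{W}\right)\right]
   = \frac{8\pi^2}{WL}
  \quad \text{for } |D_x| \ge L \ \text{or}\ |D_z| \ge W.
\end{align*}
```

The sine parts of the cross term vanish because their integrands are odd, so only the product of the two cosine integrals remains, and that product is zero as soon as the devices do not overlap in at least one direction.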
Summarizing, it is found that σ²∆P is proportional to 1/(WL), with proportionality constant A²0,∆P. Until now we have implicitly assumed that the parameter under consideration is itself not a function of the width and length. If this is not the case, the model needs to be adapted. We will assume that the width and length dependence of the parameter P(x, z) can be written as:

$$
P(W, L, x, z) = f(W, L) \cdot P'(x, z), \tag{2.39}
$$
where f(W, L) models the width and length dependence of P but is constant in space, and P′(x, z) is independent of the width and length but does vary with x and z. It easily follows that in this case:

$$
\sigma^2_{\Delta P} = f(W, L)^2\,\frac{A^2_{0,\Delta P'}}{WL}. \tag{2.40}
$$

The equations presented so far are valid for long and wide transistors. Corrections are required for short and narrow transistors [16, 17, 19, 22, 29–33, 35, 37, 38, 40, 51, 59–65]. The physical aspects of these short- and narrow-channel effects will be discussed in chapter 4. Here, we will limit ourselves to the following empirical description:
$$
\sigma^2_{\Delta P} = \frac{A^2_{0,\Delta P}}{WL} + \frac{A_{L,\Delta P}}{W L^2} + \frac{A_{W,\Delta P}}{W^2 L} + \frac{A_{WL,\Delta P}}{W^2 L^2}, \tag{2.41}
$$

where the terms containing AL,∆P, AW,∆P and AWL,∆P describe, to first order, the deviations for short, narrow, and short and narrow device pairs, respectively. When P is a function of the width or length, the right-hand side again needs to be multiplied by f(W, L)².
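To give a feeling for the size of the correction terms in (2.41), the sketch below evaluates the expression for a few geometries. The function name is hypothetical; the coefficients are the central values listed for the NMOS σ²∆VT row of table 2.3 (W and L in µm, σ∆VT in mV), and uncertainties are ignored.

```python
import math

def sigma2_delta_p(W, L, A0_sq, A_L=0.0, A_W=0.0, A_WL=0.0):
    """Evaluate the empirical description (2.41); W and L in micrometres."""
    return (A0_sq / (W * L)
            + A_L / (W * L**2)
            + A_W / (W**2 * L)
            + A_WL / (W**2 * L**2))

# Central values of the NMOS sigma^2(Delta V_T) row of table 2.3.
nmos_vt = dict(A0_sq=36.6, A_L=4.87, A_W=-4.0, A_WL=0.0)

for W, L in [(10.0, 0.18), (1.0, 1.0), (0.25, 7.2)]:
    full = math.sqrt(sigma2_delta_p(W, L, **nmos_vt))
    area_only = math.sqrt(sigma2_delta_p(W, L, nmos_vt["A0_sq"]))
    print(f"W/L = {W}/{L}: sigma = {full:.2f} mV "
          f"(area term only: {area_only:.2f} mV)")
```

Comparing the two printed values per geometry shows how far the short- and narrow-channel terms move the result away from the pure 1/(WL) law.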
2.4.2
Width and length dependence of correlation factors
When the mismatches in two parameters are correlated, this means that they are partially determined by the same mismatch-causing mechanism. For long and wide transistors the mechanisms determining the operation of the MOS transistor do not vary with width and length. Therefore, the correlation factor is expected to be constant. Again, for short or narrow transistors deviations can be expected, which are modeled as follows:

$$
\rho(\Delta P_1, \Delta P_2) = A^2_{0,\Delta P_1,\Delta P_2} + \frac{A_{L,\Delta P_1,\Delta P_2}}{L} + \frac{A_{W,\Delta P_1,\Delta P_2}}{W} + \frac{A_{WL,\Delta P_1,\Delta P_2}}{WL}. \tag{2.42}
$$

Note that the correlation factor, and therefore A²0,∆P1,∆P2, can be negative.
2.4.3
Matching properties of a 0.18 µm CMOS process
In this subsection the theory developed in the previous two subsections will be applied to the 0.18 µm CMOS technology that was briefly introduced in section 2.2. The variances of the mismatch-model parameters (∆VT, ∆(1/β0), ∆(1/ζsr) and ∆(1/ζsat)), the correlation between these parameters and the parameter α, related to the bulk bias dependence of σ∆VT, were already extracted in section 2.3. The parameters A²0, AL, AW and AWL related to the variances are extracted by a linear weighted least squares fit of (2.41) to the experimentally obtained values of σ²∆P for different widths and lengths. The weight attributed to each point is equal to 1/σ²(σ²∆P), which can be calculated using (2.1). Since the current factor is proportional to W/L, the parameters σ²∆(1/β0), σ²∆(1/ζsr) and σ²∆(1/ζsat) are multiplied by (W/L)² prior to the fit. The parameters related to the width and length dependence of the correlation factors and α are extracted by a normal linear least squares fit of (2.42) to the experimental data. Besides extracting A²0, AL, AW and AWL, standard regression analysis is applied to determine the standard deviations of these parameters. When this standard deviation is larger than the absolute value of the parameter itself, it is concluded that the obtained value is not significant. In such a case the parameter is removed from the model by equating it to zero, after which the fit and regression analysis are repeated. The results of this exercise are presented in table 2.3. Figure 2.14 compares the experimental and modeled width and length dependence of σ∆VT, σ∆(1/β0), σ∆(1/ζsr) and σ∆(1/ζsat), and figure 2.15 shows the comparison for the most significant correlation factors. It is observed that (2.41) and (2.42) provide a good description of the experimental data. It follows from table 2.3 that the main part of the mismatch in the drain current is caused by σ∆VT and σ∆(1/β0). We will briefly discuss the obtained results. A more thorough investigation of the physical and technological origins of MOSFET mismatch will be presented in chapters 4 and 5, respectively.
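The extraction procedure described above can be sketched as follows. This is a minimal illustration, not the code used in this work. The function names are hypothetical, the 1σ uncertainty of each measured variance is supplied by the caller (in the text it follows from (2.1), which is not reproduced here), the parameter covariance is taken as (XᵀWX)⁻¹, and when several parameters are insignificant the sketch removes the least significant one first. The text does not specify these last two details, so they are assumptions.

```python
import numpy as np

def design_matrix(W, L):
    """Columns multiply A0^2, A_L, A_W and A_WL in (2.41)."""
    W = np.asarray(W, dtype=float)
    L = np.asarray(L, dtype=float)
    return np.column_stack([1 / (W * L), 1 / (W * L**2),
                            1 / (W**2 * L), 1 / (W**2 * L**2)])

def fit_with_pruning(W, L, var_meas, var_err):
    """Weighted least squares fit of (2.41) with removal of insignificant terms.

    var_meas: measured variances per geometry.
    var_err:  1-sigma uncertainty of each measured variance (weight = 1/var_err^2).
    Returns (params, errors); removed parameters are reported as 0.
    """
    var_meas = np.asarray(var_meas, dtype=float)
    var_err = np.asarray(var_err, dtype=float)
    # Dividing each row by its uncertainty turns the weighted fit into an ordinary one.
    X = design_matrix(W, L) / var_err[:, None]
    y = var_meas / var_err
    params, errors = np.zeros(4), np.zeros(4)
    active = [0, 1, 2, 3]
    while active:
        Xa = X[:, active]
        coef, *_ = np.linalg.lstsq(Xa, y, rcond=None)
        std = np.sqrt(np.diag(np.linalg.inv(Xa.T @ Xa)))
        if np.all(std <= np.abs(coef)):        # every remaining term is significant
            params[active] = coef
            errors[active] = std
            break
        # Assumption: drop the least significant term, then repeat the fit.
        ratio = std / np.maximum(np.abs(coef), np.finfo(float).tiny)
        del active[int(np.argmax(ratio))]
    return params, errors
```

Removing one parameter per iteration avoids discarding a term that only looks insignificant because it is correlated with another weak term.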
Firstly note that the value of A0,∆VT for the NMOS transistors is roughly 50 % higher than those published in literature for 0.18 µm CMOS technologies [6]. This is mainly due to the choice of an amorphous silicon gate material instead of using a fine-grain poly-silicon gate [21, 66]. This will be experimentally verified in section 5.2. The reasonably low value of A0,∆VT for PMOS devices indicates that the boron gate doping at the gate-oxide interface is uniform and high enough for gate-depletion effects to be under control. The nitrided gate oxide effectively prevents boron penetration, which would seriously degrade the matching performance. The relatively large increase in mismatch for short-channel PMOS transistors is due to the absence of halos, which causes the effective channel length to be significantly smaller than the metallurgical channel length. Secondly, it is observed that the mismatch decreases for narrow NMOS transistors. This could be caused by a commonly observed lower boron doping concentration close to the shallow-trench isolation. Generally, this narrow-channel effect is less pronounced for PMOS transistors. Therefore, the decreases in mismatch are less significant or not significant at all. Thirdly, significant negative correlations are observed between ∆VT and ∆(1/ζsat ) and between ∆(1/β0 ) and ∆(1/ζsr ), which do not have a clear physical origin. We therefore conclude that they are mainly related to inaccuracies caused by the simplicity of the model. The correlation between ∆(1/β0 ) and ∆(1/ζsr ) is caused by the simplified expressions for the mobility. More complicated expressions will be presented in chapter 4. The correlation between ∆VT and ∆(1/ζsat ) is caused by neglecting the dependence of the local threshold voltage on its lateral position. Again, this will be
Table 2.3: Extracted parameters for describing the width and length dependence of the variances of the mismatch-model parameters and the correlation coefficients between these parameters. 1σ confidence intervals are included. Dimensions are such that the width and length are given in µm, σ∆VT in mV, σ∆(1/β0) in ΩV and σ∆(1/ζsr) and σ∆(1/ζsat) in Ω. Transistor-model parameters are listed in table 2.2.

NMOS
Quantity                 A²0                  AL                   AW                   AWL
σ²∆VT                    36.6 ± 2.8           4.87 ± 0.97          −4.0 ± 1.0           0
σ²∆(1/β0)                4.92e3 ± 0.75e3      0.93e3 ± 0.42e3      −0.47e3 ± 0.29e3     −0.29e3 ± 0.12e3
σ²∆(1/ζsr)               0.92e3 ± 0.11e3      61 ± 49              −141 ± 39            −19.1 ± 13.9
σ²∆(1/ζsat)              880 ± 97             0                    −136 ± 32            0
ρ∆VT,∆(1/β0)             0.045 ± 0.036        −0.016 ± 0.013       0                    0
ρ∆VT,∆(1/ζsr)            −0.186 ± 0.033       0.050 ± 0.015        0                    −0.0182 ± 0.0045
ρ∆VT,∆(1/ζsat)           −0.587 ± 0.046       0.042 ± 0.017        0.093 ± 0.023        −0.0099 ± 0.0067
ρ∆(1/β0),∆(1/ζsr)        −0.923 ± 0.023       0.0488 ± 0.0071      0.0169 ± 0.0092      0
ρ∆(1/β0),∆(1/ζsat)       0                    0                    0                    0
ρ∆(1/ζsr),∆(1/ζsat)      0.049 ± 0.036        −0.065 ± 0.017       0                    0.0098 ± 0.0050
α                        0.338 ± 0.041        −0.030 ± 0.015       0.054 ± 0.021        −0.0162 ± 0.0061

PMOS
Quantity                 A²0                  AL                   AW                   AWL
σ²∆VT                    11.8 ± 1.2           6.1 ± 1.3            0                    −0.89 ± 0.39
σ²∆(1/β0)                43.5e3 ± 4.0e3       −3.64e3 ± 0.87e3     −2.9e3 ± 1.1e3       0
σ²∆(1/ζsr)               8.7e3 ± 1.5e3        −0.63e3 ± 0.35e3     −1.09e3 ± 0.38       0
σ²∆(1/ζsat)              3.30e3 ± 0.55e3      0.85e3 ± 0.46e3      0                    −0.21e3 ± 0.12e3
ρ∆VT,∆(1/β0)             −0.161 ± 0.055       −0.077 ± 0.025       0                    0.0109 ± 0.0076
ρ∆VT,∆(1/ζsr)            0.191 ± 0.039        −0.172 ± 0.018       0                    0.0138 ± 0.0054
ρ∆VT,∆(1/ζsat)           −0.173 ± 0.040       0.155 ± 0.019        0                    −0.0121 ± 0.0056
ρ∆(1/β0),∆(1/ζsr)        −0.66 ± 0.11         0.281 ± 0.040        0                    0
ρ∆(1/β0),∆(1/ζsat)       −0.123 ± 0.098       −0.198 ± 0.035       0                    0
ρ∆(1/ζsr),∆(1/ζsat)      0                    −0.143 ± 0.036       0                    0
α                        0.294 ± 0.042        −0.046 ± 0.015       0                    0
Figure 2.14: Experimental (open symbols) and modeled (full symbols) values of σ∆VT, σ∆(1/β0), σ∆(1/ζsr) and σ∆(1/ζsat) as a function of 1/√(WL). In the plots the W/L ratios of the transistors are given in µm/µm. The full line represents the modeled result if only A0 is taken into account. Error bars represent 99 % confidence intervals. [Four NMOS panels; vertical axes: σ(∆VT) (mV), (W/L)·σ(∆(1/β0)) (ΩV), (W/L)·σ(∆(1/ζsr)) (Ω) and (W/L)·σ(∆(1/ζsat)) (Ω); horizontal axis: 1/√(WL).]
Figure 2.15: Experimental (open symbols) and modeled (full symbols) values of ρ(∆VT, ∆(1/ζsat)) and ρ(∆(1/β0), ∆(1/ζsr)) as a function of the length. In the plots the width of the transistors is given in µm. The full line represents the modeled result if only A0 is taken into account. [Two NMOS panels; vertical axes: ρ(∆VT, ∆(1/ζsat)) and ρ(∆(1/β0), ∆(1/ζsr)); horizontal axis: L (µm), logarithmic.]
further discussed in chapter 4.
2.5
Conclusions
This chapter dealt with two subjects: the measurement of mismatch in the drain current and the modeling of this mismatch. Firstly, our mismatch measurement setup was described and an overview was presented of commonly used test structures for qualifying the matching properties of a certain technology. Secondly, the drain-current-mismatch model was developed. The model was tested on a 0.18 µm CMOS technology. In the derivation of the model we strove for a physics-based one, valid over a large bias range, continuous between different regions of operation and as simple as possible. The relative difference between model and measurement data should be smaller than 20 %. The mismatch in the drain current was assumed to be split up into a contribution due to threshold-voltage mismatch and a contribution due to current-factor mismatch, which were dealt with separately. In calculating the impact of a mismatch in threshold voltage on the drain current, the only assumption made is that the drain current is a function of the gate-overdrive voltage, but not of the gate bias or the threshold voltage separately. The resulting model was found to be valid in strong inversion and in the upper part of the moderate inversion region. In weak inversion deviations were observed. Threshold-voltage mismatch was found to depend only weakly on the drain bias. Therefore, this bias dependency was not taken into account. The bulk bias dependence of the mismatch in the threshold voltage is not modeled directly, since this is considered to be nonphysical. Instead we modeled the bulk bias dependence of the standard deviation of the mismatch in the threshold voltage. This was done in a semi-empirical way. The impact on the drain current of a mismatch in the current factor was split up into three contributions, related to mobility limiting effects, i.e. bulk mobility, phonon scattering, surface roughness scattering and velocity saturation. 
Mismatch in series resistance is also accounted for by the parameters related to the current factor. The expression for the saturation voltage was properly derived, which resulted in continuity from the linear to the saturation regime. To extract the mismatch in the model parameters, a weighted least squares fit was introduced as opposed to a normal least squares fit. In this way small modeling errors at low gate biases were prevented from seriously degrading the accuracy at higher values of the gate bias. In strong inversion the model reached the accuracy requirement for all examined drain- and bulk-bias conditions. To obtain good accuracy in the weak inversion region, further modeling effort is still required. The width and length dependence of the extracted variances was described by the model of Pelgrom et al. The correlation factors were reasoned to be independent of width and length. The models were extended to take short- and narrow-channel effects into account, which leads to an accurate description of the experimental data.
Chapter 3
Parameter extraction

In the previous chapter a model was derived for describing mismatch in the drain current. The presented technique to extract the model parameters was developed in such a way as to give the most accurate description of the mismatch in the drain current. However, does this approach also yield the most meaningful values of the model parameters themselves? The question we have to ask ourselves is whether this is important. The answer depends on the person who is asking. For a circuit designer, the answer is no, since the main goal is model accuracy. However, when doing process development, one is more interested in the physical meaningfulness of extracted parameters, since it would help to better understand what is happening inside the devices. For process monitoring, the required measurement time plays an important role, which is related to measurement accuracy. These requirements (model accuracy, physical meaningfulness of parameters, and measurement accuracy and time) can, to a certain degree, be in conflict with each other. In literature, several techniques have been presented that extract mismatch model parameters [5, 17–19, 28–30, 32–40, 67–69]. They all claim to extract mismatch in threshold voltage and mismatch in current factor, which are considered to be, more or less, well-defined physical parameters of the MOS transistor. However, we will find that different methods can yield significantly different results, which leads to completely different conclusions for the technology under investigation. This chapter investigates and compares the most commonly used extraction methods. This is done in relation to the above-mentioned requirements. The first section of this chapter introduces the extraction methods under investigation. Section 3.2 explains the experimental setup and the applied criteria. The actual comparison between the methods is made in section 3.3. Finally, section 3.4 discusses issues that might start to affect mismatch parameter extraction for future technologies. Section 3.5 concludes this chapter.
3.1
Extraction methods
Methods to extract the mismatch in threshold voltage (∆VT), current factor (∆β/β) or any other transistor parameter (∆P) can be divided into two groups. The methods in the first group extract transistor parameters for each transistor separately and subtract the results (∆P = P2 − P1). Commonly used methods are:
• Maximum slope method (or steepest slope method)
• Three points method [70]
• Four points method [71]
• Applying a current criterion (only for threshold voltage)
The methods in the second group extract the mismatch ∆P directly by a fit to ∆ID/ID − VGS curves (e.g. see chapter 2). These methods are called current-mismatch-fitting methods. The examined methods will now briefly be described.

Maximum slope method. The drain current is measured as a function of the gate bias in the linear regime (low drain bias). The tangent is taken at the point where the steepest slope (gm,max) occurs. The current factor is equal to β = gm,max/VDS. The gate bias at which the tangent intercepts ID = 0 is equal to VT + VDS/2. The maximum slope method is illustrated in figure 3.1. Note that this method is purely intended for extracting parameters. It does not present a model for the drain current as a function of the bias conditions.

Figure 3.1: Illustration of the maximum slope method. [Plot: ID (µA) and gm (µA V⁻¹) versus VGS (V), with the intercept VT + VDS/2 marked; NMOS, W = 10.0 µm, L = 0.18 µm, VDS = 50 mV.]

Three points method. The drain current is measured at three gate bias points in strong inversion in the linear regime. The first point is roughly located at maximum transconductance, the second bias point 100-300 mV higher and the third bias point at ‘high’ gate bias (see figure 3.2). The threshold voltage, current factor and mobility reduction factor (θ) are extracted by solving the following set of equations, which can be done in an analytical way:

$$
I_{Di} = \frac{\beta\,(V_{GSi} - V_T - V_{DS}/2)\,V_{DS}}{1 + \theta\,(V_{GSi} - V_T - V_{DS}/2)}, \qquad i = 1, 2, 3. \tag{3.1}
$$

Figure 3.2: Illustration of the three points method. The solid symbols are measured, open symbols are added for clearness. (a) The three gate-bias conditions have fixed values. (b) The three gate-bias conditions have fixed overdrives with respect to an earlier determined threshold voltage. [Panels: (a) fixed bias points {VGS1, ID1}, {VGS2, ID2}, {VGS3, ID3}; (b) constant overdrives of 0.2, 0.4 and 0.8 V above VTinit; axes: ID (µA) and gm (µA V⁻¹) versus VGS (V); NMOS, W = 10.0 µm, L = 0.18 µm, VDS = 50 mV.]
A distinction is made with respect to the way the transistors are biased. 1) The three gate bias points have fixed values (e.g. {VGS1, VGS2, VGS3} = {0.7, 0.9, 1.3} V, see figure 3.2a). 2) The gate is biased with a fixed overdrive with respect to an initial threshold voltage (e.g. {VGS1, VGS2, VGS3} = VTinit + {0.2, 0.4, 0.8} V, see figure 3.2b). The initial threshold voltage can be determined with any suitable extraction method, including the three points method itself. A couple of iteration cycles can be used in which the latest obtained threshold voltage is taken as the initial threshold voltage for the next cycle. The method that uses fixed overdrive voltages has the advantage that it yields good results even when the threshold voltage is not well known a priori. A disadvantage is the increase in measurement time.

Four points method. This method is similar to the three points method. However, a more accurate drain current model is used that also takes second-order mobility reduction into account:

$$
I_{Di} = \frac{\beta\,(V_{GSi} - V_T - V_{DS}/2)\,V_{DS}}{1 + \theta_1 (V_{GSi} - V_T - V_{DS}/2) + \theta_2 (V_{GSi} - V_T - V_{DS}/2)^2}, \qquad i = 1, \cdots, 4. \tag{3.2}
$$

The extra parameter (θ2) requires one extra measurement point. This set of equations cannot be solved in an analytical way. Therefore, the use of a numerical optimization routine is necessary.
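One possible route to the analytical solution of the three points equations (3.1) is sketched below. It is an illustration rather than the procedure of [70]: multiplying (3.1) by its denominator gives a relation of the form ID = A·VGS + B + C·ID·VGS, which is linear in the auxiliary unknowns A, B and C (names introduced here), so three bias points give a 3×3 linear system. The function names and the numerical values in the example are hypothetical.

```python
import numpy as np

def drain_current(vgs, vt, beta, theta, vds):
    """Linear-regime drain current according to (3.1)."""
    x = np.asarray(vgs, dtype=float) - vt - vds / 2
    return beta * x * vds / (1 + theta * x)

def three_points(vgs, i_d, vds):
    """Solve (3.1) for (VT, beta, theta) from three (VGS, ID) points."""
    vgs = np.asarray(vgs, dtype=float)
    i_d = np.asarray(i_d, dtype=float)
    # ID = A*VGS + B + C*ID*VGS is linear in A, B and C.
    M = np.column_stack([vgs, np.ones(3), i_d * vgs])
    A, B, C = np.linalg.solve(M, i_d)
    v0 = -B / A                  # v0 = VT + VDS/2
    q = 1.0 / (1.0 - C * v0)     # q = 1 - theta*v0
    beta = A * q / vds
    theta = -C * q
    return v0 - vds / 2, beta, theta

# Hypothetical transistor pair, fixed bias points as in figure 3.2a.
vds = 0.05
vgs = [0.7, 0.9, 1.3]
vt1, beta1, _ = three_points(vgs, drain_current(vgs, 0.350, 2.00e-3, 0.5, vds), vds)
vt2, beta2, _ = three_points(vgs, drain_current(vgs, 0.354, 2.02e-3, 0.5, vds), vds)
delta_vt = vt2 - vt1                       # Delta P = P2 - P1
delta_beta_rel = (beta2 - beta1) / ((beta1 + beta2) / 2)
print(delta_vt, delta_beta_rel)
```

The four points model (3.2) does not reduce to a linear system in the same way, which is consistent with the need for a numerical routine stated above.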
Applying a current criterion. The threshold voltage is defined as the gate bias at a certain current level (IDcrit). Experimentally obtained current levels at which good values for the threshold voltage are found are (W/L) · 400 nA for NMOS and (W/L) · 100 nA for PMOS transistors. Several options are available for finding the correct gate bias. 1) The measurement equipment itself can search for the gate bias belonging to a particular drain current. Though accurate, this approach can be slow. 2) Another option is to measure the full ID − VGS curve and use an interpolation algorithm. This is illustrated in figure 3.3. 3) The fastest approach is to connect the gate to the drain and to force IDcrit into the drain (IG ≪ ID), which gives VGS = VDS = VT. However, this approach does not allow for evaluating the drain-bias dependence of threshold-voltage mismatch. In this work the interpolation approach is used, because it was the easiest to implement. As for the maximum slope method, this method is purely intended for extracting parameters. It does not present a model.

Figure 3.3: Illustration of applying a current criterion to obtain the threshold voltage. An interpolation algorithm is used to extract the correct gate bias. [Plot: ID (µA, logarithmic axis) versus VGS (V), with IDcrit and VT marked; NMOS, W = 10.0 µm, L = 0.18 µm, VDS = 50 mV.]

Current-mismatch-fitting methods. An explanation of current-mismatch-fitting methods was presented in section 2.3. The mismatch in the drain current (∆ID/ID) is a linear function of the mismatch in the threshold voltage (∆VT) and current factor (∆β/β). This function is obtained by using a first-order Taylor expansion. The mismatch parameters are extracted by means of a linear least squares fit to experimental ∆ID/ID − VGS curves. Two of these methods will be investigated. The first was published in [35], and will be referred to as fitting method A. The model that is fitted is very similar¹ to:

$$
\frac{\Delta I_D}{I_D} = \frac{\Delta\beta}{\beta} - \frac{g_m}{I_D}\,\Delta V_T. \tag{3.3}
$$

¹ As opposed to [35], we have taken the liberty of using (2.7) instead of modeling gm/ID. See subsection 2.3.2.
The fitting range starts at the gate bias where maximum transconductance occurs (in the linear region) and ends at VGS = VDD = 1.5 V. No weight is attributed to the measurement points. The second method under investigation is the one described in chapter 2, which will be referred to as fitting method B.
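The unweighted fit of (3.3) used in fitting method A amounts to an ordinary linear least squares problem with two unknowns. The sketch below illustrates this for one device pair. The function name and the synthetic gm/ID curve are hypothetical; in particular the gm/ID values are simply passed in, whereas in this work they follow from (2.7), which is not reproduced here.

```python
import numpy as np

def fit_method_a(vgs, rel_mismatch, gm_over_id, vgs_start, vgs_stop):
    """Unweighted least squares fit of (3.3) over [vgs_start, vgs_stop].

    rel_mismatch: measured Delta ID / ID per gate bias point.
    gm_over_id:   gm/ID per gate bias point (1/V).
    Returns (delta_vt, delta_beta_rel).
    """
    vgs = np.asarray(vgs, dtype=float)
    sel = (vgs >= vgs_start) & (vgs <= vgs_stop)
    # Model: Delta ID/ID = 1 * (Delta beta/beta) + (-gm/ID) * Delta VT
    X = np.column_stack([np.ones(sel.sum()),
                         -np.asarray(gm_over_id, dtype=float)[sel]])
    y = np.asarray(rel_mismatch, dtype=float)[sel]
    (delta_beta_rel, delta_vt), *_ = np.linalg.lstsq(X, y, rcond=None)
    return delta_vt, delta_beta_rel

# Synthetic example: a pair with Delta VT = 3 mV and Delta beta/beta = 1 %.
vgs = np.arange(0.0, 1.55, 0.05)
gm_over_id = 1.0 / np.maximum(vgs - 0.35, 0.05)   # hypothetical gm/ID curve
rel = 0.01 - gm_over_id * 0.003
print(fit_method_a(vgs, rel, gm_over_id, 0.6, 1.5))
```

Repeating the fit for every measured pair and taking the standard deviation of the results gives σ∆VT and σ∆β/β for the geometry in question.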
3.2
Experimental setup
In this section the experimental background is provided for comparing the examined extraction methods. Decisions are made concerning: the technology used, the type of measured devices, the geometries of the examined device pairs, the number of measured pairs, the test-structure layout, what to measure, and data filtering.

Technology. The technology chosen for this experiment is the 0.13 µm technology published in [72], which has a physical oxide thickness of 2.0 nm and a supply voltage of |VDD| = 1.5 V. At the time of this work, this experimental technology was stable enough to obtain consistent and relevant results, while for matching studies the technology was advanced. Issues that are not yet important for this technology, but that might appear for future technologies, are described in section 3.4.

Type of devices. In this chapter, only results for NMOS transistors are presented. Similar results were obtained for PMOS transistors.

Device pair geometries. In order to limit measurement time, only a selective but representative set of transistor pair dimensions was measured, namely:

W (µm)    L (µm)    Description
0.25      0.18      narrow and short
10.0      0.18      wide and short
1.0       1.0       wide and long
0.25      7.2       narrow and long
10.0      7.2       large area

Number of measured pairs. The sample size for this experiment is Ndev = 84 device pairs per geometry. This results in a relative accuracy of the extracted standard deviations of σ(σ∆P)/σ∆P = 1/√(2Ndev) = 1/√168 ≈ 7.7 %.

Test-structure layout. The most frequently used layout, with common source, gate and bulk and separate drains, was chosen as test structure (see figure 2.2 and figure 2.3a).

What to measure. For each transistor two ID − VGS curves are measured, one in the linear regime at VDS = 50 mV and one in saturation at VDS = VDD = 1.5 V. The gate bias ranges from 0.0 to 1.5 V in steps of 50 mV. All parameters are
extracted from the same measured curves. This avoids artifacts in the comparisons due to device or bonding pad degradation. It can occur that an extraction method requires measurements at bias conditions that are not measured, e.g. in the case of the fixed-overdrive three points method or when applying a current criterion. In these cases an interpolation algorithm is applied to the base curves in order to get the needed ‘measurement’ data.

Data filtering. Although the investigated technology was stable, at the time of this work it was also still in an experimental phase. Therefore, there might still be some yield issues that can cause extreme parameter shifts for a small number of the measured devices, which are not related to microscopic fluctuations. To filter out these outliers, a 3σ criterion is applied to the extracted parameters, which is repeated until no more outliers are observed. It might happen that for one extraction method a certain device pair falls just outside the 3σ interval, while for another method it would be just inside it. This could lead to erroneous conclusions when comparing methods. To avoid this problem, a device pair is removed from all data sets when it is considered an outlier in one of them.
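The data filtering step can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes that each data set (one array of extracted mismatch values per extraction method, with one entry per device pair) is first filtered on its own, after which a pair is kept only if it survives in every data set. Whether the 3σ test is repeated after this joint removal is not specified in the text and is not done here.

```python
import numpy as np

def iterative_3sigma_mask(values, n_sigma=3.0):
    """Boolean mask of device pairs that survive a repeated n-sigma criterion."""
    values = np.asarray(values, dtype=float)
    keep = np.ones(values.shape, dtype=bool)
    while True:
        mean = values[keep].mean()
        std = values[keep].std(ddof=1)
        new_keep = keep & (np.abs(values - mean) <= n_sigma * std)
        if new_keep.sum() == keep.sum():   # no new outliers: done
            return keep
        keep = new_keep

def common_mask(extractions):
    """Keep a pair only if it is not an outlier in any data set.

    extractions: dict mapping a method/parameter name to an array
                 with one value per device pair (same ordering everywhere).
    """
    masks = [iterative_3sigma_mask(v) for v in extractions.values()]
    return np.logical_and.reduce(masks)

# Synthetic example with 84 pairs and one gross outlier per data set.
a = np.linspace(-1.0, 1.0, 84)
a[5] = 50.0
b = np.linspace(-2.0, 2.0, 84)
b[10] = -80.0
mask = common_mask({"three_points_dVT": a, "method_B_dVT": b})
print(mask.sum())
```

Applying the same mask to all data sets guarantees that every method is compared on an identical population of device pairs.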
3.3
Comparison of extraction methods
This section compares the extraction methods that were presented in section 3.1. The methods will be compared with respect to model accuracy (subsection 3.3.1), measurement accuracy (subsection 3.3.2) and physical meaningfulness of parameters (subsection 3.3.3). The obtained results will be summarized in subsection 3.3.4.
3.3.1
Model accuracy
To examine the model accuracy, predicted σ∆ID/ID curves are compared with the experimental curves from which the parameters were extracted (see e.g. section 2.3.6). The results of this exercise are presented in figure 3.4. In discussing the results we will distinguish between the direct extraction methods and the current-mismatch-fitting methods.

Direct extraction methods. With respect to model accuracy the examined direct extraction methods are limited to the three and four points methods. The maximum slope method and current criterion method are not based on a complete description of the drain current and are therefore disregarded. For the direct methods, parameters are extracted in the linear region at low drain bias (VDS = 50 mV). It is seen that at this bias condition all methods yield satisfactory accuracy (|σmodel/σexperimental − 1| < 20 %). At higher gate biases the four points method gives a higher accuracy. This means that second-order mobility reduction (the θ2 term in (3.2)) is present. However, its impact is not big enough to necessitate four points extraction.
Transistors in analog circuits are mostly biased in the saturation regime rather than in the linear regime. The models of (3.1) and (3.2) can be extrapolated to the saturation region by replacing VDS by the saturation voltage, which is calculated by putting dID/dVDS = 0. It is seen, however, that this prediction of the mismatch in saturation does not yield accurate results. This is most apparent for short transistor pairs. It can therefore be concluded that effects like velocity saturation and drain-induced barrier lowering have to be taken into account. Finally, notice that the models of (3.1) and (3.2) are only valid in strong inversion. Therefore, the weak inversion region was disregarded in the analysis.

Current-mismatch-fitting methods. In figure 3.4 it is seen that both examined current-mismatch-fitting methods yield good accuracy in both the linear and the saturation region. Method A gives the highest accuracy. This method extracts a different set of parameters (∆VT and ∆β/β) for the linear region and for the saturation region. It is therefore not continuous between both regions. The four parameters of method B (∆VT, ∆(1/β0), ∆(1/ζsr) and ∆(1/ζsat)) are the same in both regions of operation. This results in continuity over the whole bias range, but at the cost of some accuracy. In weak inversion both models become inaccurate. This is due to the fact that the physical mechanisms that determine threshold-voltage mismatch differ in strong and weak inversion. This will be further looked into in subsection 3.3.3 and in chapter 4.
3.3.2
Measurement accuracy and speed
To determine transistor mismatch, the almost equal drain currents (or other transistor related quantities) of two transistors are subtracted. However, the noise related to the two observations does not cancel out, but is additive. Therefore, determining transistor mismatch requires a much higher measurement accuracy than other transistor measurements. This measurement accuracy is related to the required measurement time, since measurement noise can be averaged out by using longer integration times. This subsection mainly deals with measurement accuracy. Measurement speed will be addressed briefly at the end of the subsection. To examine the measurement accuracy, all measurements were repeated. The second measurement was done at a later time. This means that after the first measurement the wafer was removed from the system. Before the second measurement it had to be reinserted and realigned. Ideally the two measurements should yield exactly the same results. By comparing the extracted parameters of the first and second measurement, as in figure 3.5, the inaccuracy can be determined. As a figure of merit the correlation coefficient between the two measurements (ρ(∆P1, ∆P2)) is used, which is defined as:

$$
\rho(\Delta P_1, \Delta P_2) = \frac{\overline{(\Delta P_1 - \overline{\Delta P_1}) \cdot (\Delta P_2 - \overline{\Delta P_2})}}{\sigma_{\Delta P_1} \cdot \sigma_{\Delta P_2}}. \tag{3.4}
$$
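The figure of merit (3.4) is the ordinary sample correlation coefficient between two measurement runs and can be computed as sketched below. The function name and the synthetic data are hypothetical. The example assumes a stable mismatch part with unit standard deviation and independent noise with a standard deviation of 0.5 in each run, in which case the correlation is expected to approach 1/(1 + 0.25) = 0.8, the fraction of the total variance that is due to the stable part.

```python
import numpy as np

def repeatability(dp_run1, dp_run2):
    """Correlation coefficient (3.4) between two measurement runs."""
    return np.corrcoef(dp_run1, dp_run2)[0, 1]

# Synthetic illustration: same mismatch in both runs, independent noise per run.
rng = np.random.default_rng(0)
n_pairs = 100_000
mismatch = rng.normal(0.0, 1.0, n_pairs)
run1 = mismatch + rng.normal(0.0, 0.5, n_pairs)
run2 = mismatch + rng.normal(0.0, 0.5, n_pairs)

rho = repeatability(run1, run2)
expected = 1.0**2 / (1.0**2 + 0.5**2)
print(rho, expected)
```

With only 84 pairs per geometry, as in this experiment, the estimate of ρ itself carries a noticeable statistical uncertainty, which should be kept in mind when comparing values close to 1.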
[Figure 3.4 panels: NMOS device pairs with W/L = 0.25/0.18, 10.0/0.18, 1.0/1.0, 0.25/7.2 and 10.0/7.2 µm/µm, each at VDS = 50 mV and at VDS = 1.5 V. Horizontal axis: VGS (V). Vertical axis: σ(model) / σ(exp.) – 100 (%).]
Figure 3.4: Model accuracy for several device-pair dimensions in the linear region (VDS = 50 mV) and in the saturation region (VDS = 1.5 V). (•) three points method with fixed bias conditions, (■) three points method with fixed gate overdrive, (◆) four points method with fixed bias conditions, (▲) four points method with fixed gate overdrive, (▼) current-mismatch-fitting method A, (×) current-mismatch-fitting method B.
[Figure 3.5 panels: NMOS, W = 10.0 µm, L = 0.18 µm. (a) Threshold-voltage mismatch: ∆VT (mV) of measurement 2 against ∆VT (mV) of measurement 1, ρ = 0.997. (b) Current-factor mismatch: ∆β0/β0 (%) of measurement 2 against ∆β0/β0 (%) of measurement 1, ρ = 0.987.]
Figure 3.5: Two examples of measurement repeatability. The method used for extracting the parameters is the three points method with fixed gate overdrive.
Now assume that the measurement result (∆P) can be split up into the part that we want to measure (∆Pmismatch), which does not change significantly in time, and an unwanted part related to measurement inaccuracies (∆Pnoise), i.e. ∆P1 = ∆Pmismatch + ∆Pnoise1 and ∆P2 = ∆Pmismatch + ∆Pnoise2. When we further assume that ρ(∆Pmismatch, ∆Pnoise) = 0, ρ(∆Pnoise1, ∆Pnoise2) = 0 and that σ∆Pnoise is time invariant, it easily follows that (3.4) calculates the part of σ²∆P that is caused by the actual mismatch, i.e. ρ = σ²∆Pmismatch / σ²∆P. The other part is attributed to noise and fluctuations in the resistance between the probe tip
Table 3.1: Correlation coefficients between two measurements of ∆VT, evaluating measurement repeatability. The device width/length ratios at the top of the columns are given in µm/µm. The abbreviation f.b. stands for fixed bias conditions and f.o. stands for fixed gate bias overdrive.
model              VDS (V)  0.25/0.18  10.0/0.18  1.0/1.0  0.25/7.2  10.0/7.2
maximum slope      0.05     0.99916    0.99733    0.99765  0.99825   0.99465
3 points – f.b.    0.05     0.99903    0.99591    0.99648  0.99521   0.55807
3 points – f.o.    0.05     0.99928    0.99688    0.99715  0.99719   0.72763
4 points – f.b.    0.05     0.99678    0.98497    0.98824  0.98220   0.61809
4 points – f.o.    0.05     0.99823    0.99140    0.99061  0.98264   0.58814
current criterion  0.05     0.99989    0.99987    0.99979  0.99958   0.99985
current criterion  1.5      0.99981    0.99977    0.99972  0.99963   0.99942
fitting method A   0.05     0.99989    0.94600    0.99984  0.99976   0.99453
fitting method A   1.5      0.99985    0.99984    0.99969  0.99943   0.99960
fitting method B   both     0.99977    0.99888    0.99964  0.99925   0.99884
Table 3.2: The same as table 3.1, but now for ∆β/β.
model              VDS (V)  0.25/0.18  10.0/0.18  1.0/1.0  0.25/7.2  10.0/7.2
maximum slope      0.05     0.99973    0.98936    0.99870  0.99785   0.99348
3 points – f.b.    0.05     0.99961    0.99100    0.99751  0.99440   0.49438
3 points – f.o.    0.05     0.99951    0.98707    0.99702  0.99587   0.54931
4 points – f.b.    0.05     0.99721    0.94537    0.97527  0.96600   0.36262
4 points – f.o.    0.05     0.99748    0.94604    0.97758  0.96668   0.37339
fitting method A   0.05     0.99999    0.97170    0.99995  0.99996   0.99578
fitting method A   1.5      0.99998    0.99973    0.99995  0.99982   0.99962
fitting method B   both     0.99979    0.99743    0.99965  0.99804   0.99714
and the bonding pad, which will be called contact resistance fluctuations in the remainder of this chapter. Note that one extraction method can be more susceptible to measurement noise than the other. Tables 3.1 to 3.3 list the correlation coefficients for all examined methods and device dimensions. It is observed that the repeatability is almost 100 % in most cases. The 0.25 µm wide, 0.18 µm long device pairs show the best measurement repeatability, since their intrinsic mismatch is highest. The three and four points methods show poor repeatability on the 10.0 µm wide, 7.2 µm long device pairs. The intrinsic mismatch of these device pairs is low, since they have a large area. This means that they are most susceptible to measurement noise. It will now be shown that this noise is added by the measurement equipment. Figure 3.6 displays, on the right axis, the measurement repeatability (ρrepeat ) of threshold voltage mismatch as a function of the
Table 3.3: The same as table 3.1, but now, depending on the method, for ∆θ, ∆θ1, ∆θ2, ∆(1/ζsr) or ∆(1/ζsat).
model             parameter   0.25/0.18  10.0/0.18  1.0/1.0  0.25/7.2  10.0/7.2
3 points – f.b.   ∆θ          0.99849    0.94844    0.99551  0.99437   0.22099
3 points – f.o.   ∆θ          0.99774    0.93861    0.99480  0.99503   0.31691
4 points – f.b.   ∆θ1         0.98863    0.87339    0.96970  0.96490   0.21776
4 points – f.b.   ∆θ2         0.98464    0.82076    0.94668  0.93615   0.18784
4 points – f.o.   ∆θ1         0.98908    0.84834    0.96785  0.96549   0.16560
4 points – f.o.   ∆θ2         0.98502    0.76111    0.94308  0.93834   0.10136
fitting method B  ∆(1/ζsr)    0.99874    0.96694    0.99964  0.99665   0.99184
fitting method B  ∆(1/ζsat)   0.99947    0.96949    0.99379  0.98782   0.99677
placement of the three bias points.² Also shown, on the left axis, is the drain current. It is seen that the repeatability drops significantly when, at the second bias point (VGS2), the measurement system switches to a higher current-measurement range. At the low end of a measurement range the system noise is highest, which in this case demonstrates the impact of the system. To obtain more accurate measurements, longer integration times would be necessary.
[Figure 3.6 plot: NMOS, VDS = 50 mV, W = 10.0 µm, L = 7.2 µm. Horizontal axis: VGS2 (V). Left axis: ID. Right axis: ρrepeat (-). Annotations: change of current range; ID at VGS2 moves into different current range.]
Figure 3.6: Measurement repeatability of ∆VT (right axis) as a function of the placement of the second bias point for the three points method with fixed bias conditions. Also shown are the drain current (left axis) and the points where the measurement system switches current range (dashed lines). VGS1 = VGS2 − 0.2 V, VGS3 = VGS2 + 0.4 V.
² Figure 3.6 presents results for the three points method with fixed bias points. Similar figures could have been made for the other three and four points methods.
Note that the other methods do not suffer from the added noise. For the fitting methods, the noise is averaged out over the large number of measured bias points. The maximum slope method does not suffer in
[Figure 3.7 plot: NMOS, VDS = 50 mV, W = 10.0 µm, L = 0.18 µm. Horizontal axis: VGS (V). Vertical axis: non-repeatable part of σ(∆Rtotal) (mΩ).]
Figure 3.7: Non-repeatable part of the total measured resistance fluctuations of the 10.0 µm wide, 0.18 µm long device pairs at VDS = 50 mV.
this particular case, because the peak transconductance is still located in the high end of the lower current range. Had this not been the case, this method would also have shown poor measurement repeatability. Using a current criterion gives a very good measurement repeatability, since the intrinsic mismatch of a device pair is much higher around threshold than it is in strong inversion (see e.g. section 2.3 or figure 2.10).
Returning to table 3.3, it can be seen that the 10.0 µm wide, 0.18 µm long device pairs have a slightly worse repeatability for ∆θ (or other related parameters) than the other pair dimensions. This is believed to be due to contact resistance fluctuations. These device pairs are most susceptible to these fluctuations, since they are wide and short and therefore have a low resistance themselves. Figure 3.7 plots the non-repeatable part of the total measured resistance (Rtotal = VDS/ID) fluctuations in the linear region, which, as expected, is constant as a function of the gate bias. From this, the contact resistance fluctuation is calculated to be σRcontact ≈ 150 mΩ per bonding pad. This number can easily become larger when bonding pads are degraded by earlier measurements, when the probes themselves are worn out or when the pressure of the probes on the bonding pads is too low. In the next subsection it will be seen that contact resistance can affect current factor mismatch when series resistance is not included in the analysis. Contact resistance fluctuations are only an issue for measurements in the linear region. In the saturation region the dependence of the drain current on the drain bias is weak and the influence of series resistance at the drain diminishes. Series resistance at the source still impacts the drain current, but it is common to both transistors in the pair and therefore does not affect the mismatch.
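The split of a measured spread into a repeatable and a non-repeatable part, as used for figure 3.7, follows from (3.4) under the assumptions stated earlier in this subsection (uncorrelated, time-invariant noise). The sketch below only illustrates that arithmetic. The function name and the input numbers are hypothetical and are not measured values.

```python
import math

def split_spread(sigma_total, rho_repeat):
    """Split a measured standard deviation into repeatable and non-repeatable parts.

    Assumes rho_repeat = sigma_mismatch**2 / sigma_total**2, which holds when the
    noise of the two runs is uncorrelated and has the same standard deviation.
    """
    if not 0.0 <= rho_repeat <= 1.0:
        raise ValueError("rho_repeat must lie between 0 and 1")
    sigma_mismatch = math.sqrt(rho_repeat) * sigma_total
    sigma_noise = math.sqrt(1.0 - rho_repeat) * sigma_total
    return sigma_mismatch, sigma_noise

# Hypothetical example: total spread of 5.0 (arbitrary units), repeatability 0.64.
sigma_mismatch, sigma_noise = split_spread(5.0, 0.64)
print(sigma_mismatch, sigma_noise)
```

The two parts add quadratically to the total spread, so even a repeatability of 0.64 leaves a non-repeatable part that is 60 % of the measured standard deviation.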
Another source of inaccuracy, which was not observed in this particular experiment, could be the measurement resolution of the system. Currents are presented at a 5.5 digits resolution. Current factor mismatch is approximately equal to (1 %µm)/√(WL). This means that measurement resolution could start to play a role for transistors with an area larger than 100 µm².
A priori, it is difficult to put exact numbers to when matching measurements start to fail. The specified worst-case system accuracy for current measurements (see section 2.1.1) is about 0.5 %. This suggests that transistors with an area of 1 µm² or larger cannot accurately be measured. This statement is clearly contradicted by tables 3.1 to 3.3. Matching measurements are relative measurements by nature. Their accuracy is not determined by the long-term worst-case system accuracy, but by the short-term system repeatability, which is much better. Usually, this parameter is not specified by the equipment vendor. To further complicate issues, measurement repeatability can depend strongly on the device under test, as was observed in figure 3.6. In this example, for somewhat narrower devices the first two bias points would have fallen inside the more accurate lower current range, with almost 100 % repeatability for ∆VT measurements. It can be concluded that accurate matching measurements are possible far beyond the specified system accuracy. However, one always has to remain careful, especially for transistors with large area (≳ 10 µm², due to system noise) or for short, wide transistor pairs (σ∆R ≲ 2 Ω, due to problems with contacting). We will end this subsection with a discussion on measurement speed.
The time a measurement takes is roughly proportional to the number of bias conditions needed for the extraction.³ Listing the different methods roughly from the fastest to the slowest gives: 1) three points method with fixed bias conditions (∼0.9 s⁴), 2) four points method with fixed bias conditions (∼1.2 s), 3) three points method with fixed gate bias overdrive (∼1.8 s), 4) four points method with fixed gate bias overdrive (∼2.4 s), 5) applying a current criterion (∼2 s), 6) maximum slope method (∼3 s), 7) current-mismatch-fitting method A (∼7.5 s), and 8) current-mismatch-fitting method B (∼15 s). From the perspective of measurement time, using a three or four points method is preferred. However, as was concluded earlier, for large area transistors one has to be careful about measurement noise.
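The proportionality between measurement time and number of bias points can be made concrete with the approximate 300 ms per bias point quoted in footnote 3. The sketch below is illustrative only. The implied numbers of bias points are inferred here from the quoted times and are an assumption, not values stated in this work.

```python
SECONDS_PER_BIAS_POINT = 0.3  # approximate value quoted in footnote 3

def measurement_time(n_bias_points):
    """Estimated time (s) to measure one transistor pair."""
    return n_bias_points * SECONDS_PER_BIAS_POINT

def implied_bias_points(time_seconds):
    """Number of bias points implied by a quoted measurement time (inference)."""
    return round(time_seconds / SECONDS_PER_BIAS_POINT)

# Approximate times per transistor pair as quoted in the text above.
quoted_times = {
    "3 points f.b.": 0.9,
    "4 points f.b.": 1.2,
    "3 points f.o.": 1.8,
    "4 points f.o.": 2.4,
    "current criterion": 2.0,
    "maximum slope": 3.0,
    "fitting method A": 7.5,
    "fitting method B": 15.0,
}
implied = {name: implied_bias_points(t) for name, t in quoted_times.items()}
for name, count in implied.items():
    print(f"{name}: about {count} bias points")
```

Under this assumption the fixed-overdrive variants need about twice as many bias points as their fixed-bias counterparts.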
3.3.3
Physical meaningfulness of parameters
To investigate whether parameters that are extracted by two different methods have the same physical meaning, a device pair by device pair comparison is made. Two examples are shown in figure 3.8, which plots ∆VT extracted with one method against ∆VT extracted with a different method. Two kinds of differences can be observed. In the first case (figure 3.8a) the correlation is almost 1, but a difference in slope occurs, i.e. ∆VT,method1 = f · ∆VT,method2. This means that one (or both) of the methods systematically under- or overestimates the mismatch. However, because of the high correlation, it can be concluded that the mismatch causing effect is the same for both methods. This does not have to be the case, as is illustrated in figure 3.8b. Note that both methods have almost 100 % measurement repeatability and that the mismatch parameters are extracted from the same measurement curves. The poor correlation is therefore truly caused by a difference in physical content of the parameters. Using (3.4), it can easily be shown that the correlation between two parameters (ρ(∆P1, ∆P2)) cannot be purely explained by the measurement inaccuracy of both parameters (ρrepeat(∆P1) and ρrepeat(∆P2)) when:
$$ \rho(\Delta P_1, \Delta P_2) < \sqrt{\rho_{repeat}(\Delta P_1) \cdot \rho_{repeat}(\Delta P_2)}. \tag{3.5} $$
[Figure 3.8 panels: NMOS, VDS = 50 mV. (a) W = 10.0 µm, L = 0.18 µm: ∆VT (mV) from the three points method with fixed bias against ∆VT (mV) from the three points method with fixed overdrive, ρ = 0.998. (b) W = 0.25 µm, L = 7.2 µm: ∆VT (mV) from a current criterion against ∆VT (mV) from fitting method A, ρ = 0.154.]
Figure 3.8: Two examples in which threshold-voltage-mismatch-extraction methods are compared. In (a) the three points method with fixed gate overdrive is compared to the same method using fixed bias conditions. In (b) current-mismatch-fitting method A is compared to applying a current criterion.
³ In section 2.1.1 it was found that the measurement of one bias point requires approximately 300 ms.
⁴ This is approximately the time it takes to measure one transistor pair.
In summary, by comparing σ∆P values, it can be determined whether the absolute value of the extracted threshold voltage mismatch is correct. By correlating different methods with each other, the physical content of the parameters can be examined. Another way to examine whether an extracted parameter has the expected physical content is to introduce a known mismatch in the devices under test and to look at how well this is reproduced by the extraction method. For current factor mismatch this could be achieved by using a dedicated test structure, which had a designed systematic mismatch in the gate lengths of the two transistors in the pair [73].
Tables 3.4 to 3.6 show the results of the above mentioned tests. The standard deviations of threshold-voltage and current-factor mismatch for the examined extraction methods are listed in the top parts of tables 3.4 and 3.5, respectively. The bottom parts of these tables list the correlation of the examined methods with the three points method with fixed gate bias overdrive, which has been chosen as the method of reference.
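Criterion (3.5) can be checked numerically for the two examples of figure 3.8. The sketch below is only an illustration: the function name is hypothetical, the correlations are those quoted in figure 3.8, and the repeatability values are the VDS = 0.05 V entries of table 3.1 for the corresponding methods and dimensions.

```python
import math

def differs_in_physical_content(rho_between, rho_repeat_1, rho_repeat_2):
    """True when criterion (3.5) holds, i.e. the correlation between two methods
    is lower than measurement inaccuracy alone can explain."""
    return rho_between < math.sqrt(rho_repeat_1 * rho_repeat_2)

# Figure 3.8b: current criterion against fitting method A, W/L = 0.25/7.2.
case_b = differs_in_physical_content(0.154, 0.99958, 0.99976)

# Figure 3.8a: three points method, fixed bias against fixed overdrive, W/L = 10.0/0.18.
case_a = differs_in_physical_content(0.998, 0.99591, 0.99688)

print(case_a, case_b)
```

For figure 3.8b the criterion is met, in line with the conclusion that the two parameters differ in physical content. For figure 3.8a it is not met.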
Table 3.6 lists the medians of extracted current-factor mismatch for small introduced mismatch in the gate length (top part of the table) and a larger introduced mismatch (bottom part of the table) for several average gate lengths. As an estimate,
Table 3.4: Standard deviations of ∆VT in mV and the correlation of ∆VT with the ∆VT's extracted with the three points method with fixed gate overdrive. The device width/length ratios at the top of the columns are given in µm/µm. The abbreviation f.b. stands for fixed bias conditions and f.o. stands for fixed gate overdrive.
model              VDS (V)  0.25/0.18  10.0/0.18  1.0/1.0  0.25/7.2  10.0/7.2
σ∆VT (mV)
maximum slope      0.05     17.896     3.5518     4.9085   2.9830    0.59237
3 points – f.b.    0.05     13.167     2.5999     3.7867   2.1091    0.66291
3 points – f.o.    0.05     18.130     3.6504     4.9146   2.7862    0.78975
4 points – f.b.    0.05     13.707     2.8391     4.4826   3.1177    0.87543
4 points – f.o.    0.05     18.358     3.6700     5.1031   3.2245    0.85381
current criterion  0.05     21.116     3.6652     7.0361   5.9258    1.0062
current criterion  1.5      21.190     4.4757     8.8511   5.3200    1.1544
fitting method A   0.05     20.994     3.6004     6.1162   2.5110    0.50281
fitting method A   1.5      17.572     4.0010     4.7539   3.7193    0.74600
fitting method B   both     18.752     4.3143     5.6053   4.5970    0.99666
correlation with three points method with fixed gate bias overdrive (-)
maximum slope      0.05     0.97986    0.98987    0.98211  0.97021   0.81125
3 points – f.b.    0.05     0.99550    0.99771    0.99643  0.99737   0.97027
4 points – f.b.    0.05     0.91693    0.96990    0.94879  0.95381   0.85954
4 points – f.o.    0.05     0.95891    0.98302    0.96107  0.95738   0.54038
current criterion  0.05     0.6547     0.91925    0.63421  0.51844   0.70807
current criterion  1.5      0.57393    0.82957    0.49337  0.38342   0.44883
fitting method A   0.05     0.83422    0.97520    0.71726  0.57656   0.56195
fitting method A   1.5      0.80575    0.91610    0.70299  0.55112   0.61002
fitting method B   both     0.88373    0.92030    0.70736  0.63438   0.61672
the median was preferred over the mean, since the former is less sensitive to outliers. We will now discuss the results presented in these tables method by method.
Maximum slope method. Looking at tables 3.4 and 3.5 it is observed that the maximum slope method gives results that are close to the results of the three and four points methods with fixed gate bias overdrive. Significant differences are only observed for the 10.0 µm wide, 7.2 µm long device pairs; these are caused by poor measurement repeatability (see tables 3.1 and 3.2). Looking at the bottom part of table 3.6 it is seen that the maximum slope method slightly underestimates the current factor mismatch. This underestimation becomes more prominent for decreasing transistor lengths, which suggests that it is related to the influence of series resistance or other short channel phenomena. The decrease is in contradiction with what one would expect. The introduced mismatch is equal to ∆β/β = −∆L/Lmask. However, the physical mismatch is not proportional to the inverse of the mask length, but to the inverse of the smaller effective channel
Table 3.5: The same as table 3.4, but now for ∆β/β (standard deviations in %).
model              VDS (V)  0.25/0.18  10.0/0.18  1.0/1.0  0.25/7.2  10.0/7.2
σ∆β/β (%)
maximum slope      0.05     15.069     1.2235     2.6847   0.99026   0.23389
3 points – f.b.    0.05     14.444     1.4531     2.4897   0.94718   0.38996
3 points – f.o.    0.05     14.645     1.3744     2.5875   1.1001    0.41496
4 points – f.b.    0.05     13.879     1.5731     2.2272   1.3253    0.38996
4 points – f.o.    0.05     13.914     1.5022     2.3644   1.3635    0.51299
fitting method A   0.05     16.923     1.3085     3.1844   1.0431    0.23982
fitting method A   1.5      10.215     0.82103    2.8676   0.90693   0.21956
fitting method B   both     14.491     1.1971     3.0085   1.7888    0.39968
correlation with three points method with fixed gate bias overdrive (-)
maximum slope      0.05     0.96473    0.91485    0.94897  0.90832   0.70044
3 points – f.b.    0.05     0.97726    0.90439    0.97266  0.96199   0.95784
4 points – f.b.    0.05     0.89365    0.80661    0.90044  0.88517   0.77524
4 points – f.o.    0.05     0.93988    0.90167    0.90937  0.88678   0.23759
fitting method A   0.05     0.88765    0.76998    0.81090  0.50186   0.58842
fitting method A   1.5      0.80545    0.57800    0.65542  0.45919   0.47079
fitting method B   both     0.91647    0.90506    0.84393  0.69302   0.60929
length. Understanding the observed behavior is difficult. The position and height of the maximum transconductance peak originate from a mixture of physical effects, ranging from mobility reduction, series resistance effects and gate depletion [21] to the usually poorly described transitions from saturation to the linear region and from weak to strong inversion. Now consider the top part of table 3.6, which shows the results for the case of small introduced mismatch. For two of the examined pair dimensions it is observed that the maximum slope method seriously overestimates the systematic mismatch, which is caused by a bad contact of a probe needle at one of the drains. The overestimation does not occur for the extraction methods that take series resistance into account, because it is filtered out by the mobility reduction parameters. In conclusion, the maximum slope method extracts consistent values for the standard deviation of threshold-voltage mismatch and current-factor mismatch. However, it is sensitive to series resistance effects. Another drawback of the method is that the physical effects that form the maximum transconductance peak are not easily modeled.
Three and four points methods. We will start by comparing the three and four points methods with fixed gate-bias overdrive. It is observed that these methods yield approximately the same results (see tables 3.4 and 3.5). Although the differences are somewhat larger than those expected from the repeatability study in subsection 3.3.2, they are small enough to disregard. When comparing the methods with fixed gate-bias conditions (f.b.) with the methods with fixed gate-bias overdrive (f.o.) it is
Table 3.6: Medians of ∆β/β in % extracted with the examined methods for device pairs with small intentional mismatch in mask length (-5.71 %) and for device pairs with larger intentional mismatch (-28.57 %). The device width/(average length) ratios at the top of the columns are given in µm/µm. The abbreviation f.b. stands for fixed bias conditions and f.o. stands for fixed gate bias overdrive.
method            VDS (V)  1.5/0.175  4.5/0.525  13.5/1.575  40.5/4.725
∆β/β = 5.71 %
maximum slope     0.05     14.12      5.20       18.40       5.63
3 points – f.b.   0.05     4.74       5.31       4.10        5.42
3 points – f.o.   0.05     7.72       5.70       5.62        5.61
4 points – f.b.   0.05     4.43       5.32       4.24        5.59
4 points – f.o.   0.05     7.56       5.57       5.52        5.39
fitting method A  0.05     19.55      4.86       31.02       5.61
fitting method A  1.5      3.37       4.24       5.40        5.68
fitting method B  both     5.36       6.37       4.56        5.67
∆β/β = 28.57 %
maximum slope     0.05     24.17      25.08      26.07       27.78
3 points – f.b.   0.05     29.23      25.42      24.55       26.81
3 points – f.o.   0.05     31.89      27.33      26.33       27.87
4 points – f.b.   0.05     30.42      25.76      24.76       28.07
4 points – f.o.   0.05     33.03      27.60      26.17       26.68
fitting method A  0.05     16.93      23.49      25.81       27.33
fitting method A  1.5      14.61      20.30      25.35       27.16
fitting method B  both     34.18      32.11      28.24       28.63
observed that the extracted standard deviations differ, despite a high correlation. For example, in the case of threshold voltage mismatch, the extracted standard deviations are significantly lower in the case of fixed bias conditions. This stems from the fact that the models represented by (3.1) and (3.2) are not 100 % accurate. Because of this inaccuracy, the extracted parameters are a function of the placement of the bias points, as is illustrated in figure 3.9a for the threshold voltage. This behavior gives rise to a feedback mechanism that causes underestimation when fixed bias conditions are applied. Consider a transistor with a threshold voltage that is ∆VT higher than the average. Its gate overdrive at the fixed bias points is ∆VT lower, and according to figure 3.9a a lower overdrive results in a lower extracted threshold voltage. For not too large ∆VT, the mismatch is therefore underestimated by a factor 1/(1 − dVT/dVGS1). More generally it can be stated that:
$$ \Delta P_{f.o.} \cong \Delta P_{f.b.} + \frac{dP}{dV_{GS1}} \, \Delta V_{T,f.o.}, \tag{3.6} $$
which for the threshold voltage is illustrated in figure 3.9b. In this figure it is also observed that σ∆VT depends less strongly on the placement of the measurement points when it is extracted with fixed gate overdrive. Notice that when measurement time is a serious constraint, one could use fixed bias conditions in combination with (3.6), although a few extra measurements would be needed to accurately determine dP/dVGS1.
[Figure 3.9 panels: NMOS, VDS = 50 mV, W = 10.0 µm, L = 0.18 µm. (a) VT (V) against VGS1 (V). (b) σ(∆VT) (mV) against VGS1 (V).]
Figure 3.9: Extracted threshold voltage (a) and threshold-voltage mismatch (b) with the three points method as a function of the placement of the bias points. (•) fixed gate bias, (■) fixed gate bias overdrive. The threshold-voltage mismatch predicted by (3.6) is represented by the dashed line. VGS2 = VGS1 + 0.2 V, VGS3 = VGS1 + 0.6 V.
Now consider the results presented in table 3.6. As expected, it is observed that the intentional systematic mismatch is best reproduced using the methods with fixed gate bias overdrive. Looking at the shortest device pairs with small intentional mismatch, one can see that the methods with fixed gate overdrive extract the expected larger mismatch, while the methods with fixed bias conditions yield smaller results. Note that no impact of the bad contact is observed. In conclusion, when using the three or four points method, the transistors should be biased using fixed gate overdrive. No clear difference between the three and four points method was observed. However, note that the four points method is more difficult to implement, since it requires a numerical optimization algorithm.
Applying a current criterion. In table 3.4 it is observed that using a current criterion leads to larger values for the extracted standard deviation of threshold voltage mismatch than when another method is applied. Also, the correlation with the three points method is significantly smaller than 1. To understand this behavior, figure 3.10a plots the extracted standard deviation as a function of the applied current level. Figure 3.10b plots the correlation with the three points method. At large values of the current level the mismatch is dominated by current-factor mismatch (see section 2.3). With decreasing current level the standard deviation is expected to decrease and to level off at σ∆VT, as is depicted by the solid line. The correlation is expected to increase and to level off at 1. However, experimentally it is observed that when the current level is decreased into the weak inversion region, the standard deviation starts to increase again and the correlation drops. This shows that the physical mechanisms that cause threshold-voltage mismatch are not the same in the weak and
strong inversion regions.⁵ Applying a current criterion extracts the threshold voltage in the intermediate moderate-inversion region, which leads to results that are difficult to interpret. A reasonable estimate for σ∆VT in strong inversion can be obtained by equating it to the minimum of the σ∆VT − IDcrit curve. However, this requires higher current levels that are not known a priori, which leads to an increase in measurement time.
[Figure 3.10 panels: NMOS, VDS = 50 mV. (a) standard deviation: σ(∆VT) (mV) against IDcrit (A) on a logarithmic axis from 10^-10 to 10^-5 A. (b) correlation: ρ(∆VT1, ∆VT2) against IDcrit (A) on the same axis. Annotation in both panels: ID at VGS = VT.]
Figure 3.10: (a) Standard deviation of the threshold voltage mismatch extracted with a current criterion as a function of the applied current level. (b) Correlation of the threshold voltage mismatch extracted with the three points method with fixed gate overdrive and the mismatch extracted by a current criterion as a function of the applied current level. The solid lines represent the case of equal threshold voltage mismatch in the weak and strong inversion region. (•) W = 10.0 µm, L = 0.18 µm. (■) W = 1.0 µm, L = 1.0 µm.
Current-mismatch-fitting method A. Comparing fitting method A with the other methods, it is observed that the correlation is poor (see tables 3.4 and 3.5). This is due to the fact that physical effects such as mobility degradation, series resistance and velocity saturation (in the case of VDS = 1.5 V) are not included in the model, while the fit includes bias conditions for which these effects play a role. This conclusion can also be drawn when looking at the systematic current-factor mismatch caused by the intentional mismatch in device length (see table 3.6). The introduced mismatch is seen to be underestimated, which is most noticeable for the short transistor pairs. Note that the extraction at VDS = 50 mV suffered from the bad contact of one of the needles. This was already observed for the maximum slope method and can be solved by filtering out series resistance by including a mobility reduction parameter. In saturation (VDS = 1.5 V) the impact of bad contacting disappears, as was explained in subsection 3.3.2.
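For reference, the current-criterion extraction discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the measurements in this chapter: the threshold voltage is taken as the gate bias at which the drain current crosses a chosen level, found by interpolating log(ID) against VGS. The function names, the sign convention of ∆VT and the synthetic exponential curves are all hypothetical.

```python
import numpy as np

def vt_current_criterion(vgs, i_d, i_crit):
    """Gate bias at which the drain current crosses i_crit (interpolated in log ID)."""
    vgs = np.asarray(vgs, dtype=float)
    log_id = np.log(np.asarray(i_d, dtype=float))
    if np.any(np.diff(log_id) <= 0):
        raise ValueError("drain current must increase monotonically with VGS")
    # np.interp needs increasing x-coordinates, here log(ID).
    return float(np.interp(np.log(i_crit), log_id, vgs))

def delta_vt(vgs, id_a, id_b, i_crit):
    """Threshold-voltage mismatch of a pair (assumed convention: A minus B)."""
    return vt_current_criterion(vgs, id_a, i_crit) - vt_current_criterion(vgs, id_b, i_crit)

# Synthetic, purely exponential curves with thresholds of 0.40 V and 0.39 V.
vgs = np.linspace(0.0, 1.2, 25)
id_a = 1e-7 * np.exp((vgs - 0.40) / 0.1)
id_b = 1e-7 * np.exp((vgs - 0.39) / 0.1)
mismatch = delta_vt(vgs, id_a, id_b, i_crit=1e-7)
print(f"Delta VT = {mismatch * 1e3:.2f} mV")
```

Repeating the extraction over a range of current levels and taking the standard deviation over many pairs would give a curve of the kind shown in figure 3.10a.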
⁵ Chapter 4, section 4.2 will look into this difference in more detail.
Table 3.7: Qualitative comparison of the extraction methods with respect to model accuracy, measurement accuracy, sensitivity to contacting errors, measurement speed and physical meaningfulness of parameters. The abbreviation f.b. stands for fixed bias conditions and f.o. stands for fixed gate overdrive.
model maximum slope 3 points – f.b. 3 points – f.o. 4 points – f.b. 4 points – f.o. current criterion fitting method A fitting method B
model accuracy N.A. N.A. + +
measurement accuracy + + +
sensitivity to contact + + + + + +
measurement speed 0 + 0 + 0 0 -
physical content 0 0 + 0 + 0
Current-mismatch-fitting method B. Finally, looking at tables 3.4 and 3.5, it is observed that fitting method B extracts larger standard deviations of threshold-voltage mismatch and current-factor mismatch than the three or four points methods. The correlation between the methods ranges from 60 % to 92 %. Fitting method B fits a simple model over a large bias range. Modeling errors are averaged out over all parameters. Looking at table 3.6 it is observed that the method gives a reasonable estimate of the intentionally introduced mismatch in gate length.
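Returning to the feedback relation (3.6) that was discussed for the three and four points methods, the correction from fixed bias conditions to fixed gate overdrive can be sketched as below. For P = VT, (3.6) reduces to ∆VT,f.o. = ∆VT,f.b. / (1 − dVT/dVGS1). The function names and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def fixed_overdrive_vt_mismatch(delta_vt_fixed_bias, dvt_dvgs1):
    """Solve (3.6) with P = VT for the fixed-overdrive threshold-voltage mismatch."""
    if dvt_dvgs1 >= 1.0:
        raise ValueError("dVT/dVGS1 must be smaller than 1")
    return delta_vt_fixed_bias / (1.0 - dvt_dvgs1)

def fixed_overdrive_parameter(delta_p_fixed_bias, dp_dvgs1, delta_vt_fixed_overdrive):
    """Apply (3.6) to an arbitrary parameter mismatch Delta P."""
    return delta_p_fixed_bias + dp_dvgs1 * delta_vt_fixed_overdrive

# Hypothetical values: 3.0 mV extracted at fixed bias, slope dVT/dVGS1 = 0.25.
delta_vt_fo = fixed_overdrive_vt_mismatch(3.0, 0.25)
print(delta_vt_fo)
```

Both functions give the same result for the threshold voltage, since the first is simply (3.6) solved for ∆VT,f.o.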
3.3.4
Summary
To end this section, table 3.7 qualitatively compares the examined extraction methods with respect to model accuracy, measurement accuracy, sensitivity to contacting errors, measurement speed and physical meaningfulness of parameters.
3.4
Future issues
This section will briefly discuss three issues that might start to affect matching measurements for technologies beyond the 0.13 µm node. The first is related to the gate leakage current, the second to problems with contact resistance and the third to the bad description of the mismatch in between weak and strong inversion. Gate leakage current. As the gate oxide thickness scales down, the leakage current due to tunnelling increases. This current is proportional to the transistor length, while the drain current without the leakage contribution is inversely proportional to
[Figure 3.11 plot: PMOS, W = 20.0 µm, L = 10.0 µm. Horizontal axis: VGS (V), from 0 to -1.2 V. Vertical axis: ID. Annotation: gate leakage tunnel current.]
Figure 3.11: Drain current as a function of the gate bias for transistors with thin gate oxide. (•) equivalent tox = 1.35 nm, VDS = 40 mV. (◦) equivalent tox = 1.35 nm, VDS = 100 mV. (■) equivalent tox = 1.55 nm, VDS = 40 mV.
this length. Also, the leakage current increases exponentially with gate bias, while at low VDS , the drain current increases linearly. Figure 3.11 shows the total measured drain current for 20.0 µm wide, 10.0 µm long transistors for two values of the effective oxide thickness and two values of the drain bias. For the thin gate-oxide transistor it is observed that the leakage component to the drain current becomes dominant at high gate bias. None of the methods that were described in section 3.1 takes this leakage current into account. The methods are only valid for the limited bias range in which the tunnelling current is not significant. To allow parameter extraction outside this range, new models need to be developed, that take the matching properties of the gate tunnelling current into account. The introduction of high-k dielectrics would overcome this problem. Contact resistance. Effects of contact resistance, though small, were already observed in subsections 3.3.2 and 3.3.3. However, as gate lengths scale down, the resistance of the transistor pairs under test decreases and the measurement problems related to contact resistance could become disastrous. A possible solution would be to decrease the width of the measured transistors, but this would mean that transistors used in RF circuits, wide by nature, cannot be examined directly. A way to circumvent the problem would be to use a force and sense technique, where the sensing is done with a different needle and therefore beyond the contact. This would require two connections and bonding pads attached to each drain, one for forcing the drain voltage and the other one for sensing it. This, however, increases the size of the test structure significantly. Note that it is also possible to put two probe-needles on a somewhat larger designed bonding pad. A more elegant solution would be to change the layout of the test structure to a common source, common drain, separate gates configuration,
instead of using the standard common source, common gate, separate drains layout. The contact resistance is now common to both transistors and therefore has no significant impact on the mismatch. The contact resistance at the gate is negligible compared to the large resistance of the gate itself. This adapted layout would also allow investigation of the mismatch in gate current, which was mentioned earlier.
Moderate inversion. As supply voltages scale down, the range in which designers can bias transistors becomes smaller. This pushes the operating conditions of transistors more and more towards moderate inversion. As was pointed out in subsection 3.3.3, the moderate inversion region is difficult to describe, and from a matching point of view it is unknown territory. Although the formulation presented in section 2 pushed the model validity to lower values of the gate bias, figure 3.10 clearly shows that more modeling effort is required to better understand and describe this region.
3.5 Conclusions
In this chapter the most common techniques to extract the mismatch of a pair of MOSFETs have been compared, namely the maximum slope method, the three and four points methods, applying a current criterion, and current-mismatch-fitting methods. The comparison was made with respect to model accuracy, measurement accuracy and speed, and physical meaningfulness of the extracted parameters. Regarding model accuracy, it was found that in the saturation region current-mismatch-fitting methods yield the highest accuracy. The examined direct extraction methods only use the linear regime for parameter extraction. In this region the models provide accurate results. However, when extrapolated to saturation the direct methods do not yield accurate results, which shows that effects like velocity saturation and drain induced barrier lowering cannot be ignored. Of the examined current-mismatch-fitting methods, the method published in [35] gives the best accuracy. The method developed in chapter 2 has somewhat lower accuracy, but has the advantage that it is continuous from the linear regime to saturation. The measurement accuracy related to the extraction methods was examined by means of a repeatability study. Two sources of error have been recognized: errors caused by measurement system noise and errors caused by fluctuations in the resistance between the probe tips and the bonding pads. The three and four points methods, and presumably the maximum slope method, were found to be most sensitive to measurement system noise. Noise was found to only affect transistor pairs with large area, since their intrinsic mismatch is small. No clear limit could be determined, since the short term repeatability of measurement systems is not specified, while this quantity was also found to depend strongly on the measured current level.
When applying a current criterion, no significant impact of measurement noise is observed, because the mismatch in the drain current is large at the low current level at which the extraction takes place. In the case of current-mismatch-fitting methods the noise is averaged out over the large number of measured bias conditions. Although, with respect to
measurement noise, one has to be careful when using a three or four points method, they are preferred methods in industrial environments, because of the limited measurement time required. Contact resistance fluctuations were observed, but they were not large enough to be a limiting factor when extracting standard deviations. An impact of contact resistance was seen on the average current factor mismatch for the maximum slope method and for fitting method A in the linear regime. These methods do not take series resistance into account. For the other methods, series resistance effects are filtered out by the mobility reduction parameters. Contact resistance does not impact measurement accuracy in the saturation region. The physical meaningfulness of the extracted mismatch in threshold voltage and current factor has been investigated by comparing the extracted standard deviations and by correlating the results obtained by the different extraction methods. Current-factor mismatch was further investigated by using a dedicated test structure, which has intentional mismatch in the gate length. It was found that the physically most meaningful parameters were obtained by using the three points method or four points method. However, it was seen that the transistors have to be biased using a fixed gate overdrive. Using fixed bias conditions leads to a wrong estimate, caused by the inaccuracy of the models on which the methods are based. Using the maximum slope method also provided good results, but it was found to be sensitive to contact resistance. When investigating the application of a current criterion, it was found that threshold voltage mismatch in weak inversion is caused by different physical mechanisms than threshold voltage mismatch in strong inversion. Conventional current levels are located in between these two regions, which makes the result difficult to interpret.
One either has to use a lower current level to investigate the weak inversion region, or a higher current level for the strong inversion region. The latter can only be applied if current factor mismatch is still negligible at this higher current level. Current-mismatch-fitting methods were seen to provide physically less meaningful parameters. Model inaccuracies related to the simple models on which these methods are based are averaged out over all the extracted parameters. Finally, some issues have been recognized that might start to affect parameter extraction for upcoming technologies. Due to the down-scaling of the gate-oxide thickness, gate leakage starts to have a serious impact on the drain current of long transistors. This means that matching measurements will either be limited to devices with not too long gate lengths or that this leakage current needs to be taken into account. Because of the down-scaling of the gate length, problems with contacting are expected to increase, which will limit matching measurements to not too wide transistors. To circumvent this problem it has been proposed to change the test-structure layout for matching measurements to a common source, common drain, separate gates configuration. Finally, as supply voltages scale down, analogue operation of the MOSFET is pushed towards the moderate inversion region. A modeling effort is required to better understand this region of interest.
Chapter 4
Physical origins of MOSFET mismatch

In the previous two chapters it was examined how MOSFET mismatch can be described and how to extract model parameters. This work was done using physically based models for the drain current. However, the physics behind the variability itself was not investigated. In this chapter we will go one level of abstraction deeper and investigate the physical content of MOSFET mismatch. By physical content we mean the origin of the microscopic differences between two transistors and how these microscopic differences affect macroscopic transistor operation. Knowledge about the physical origins of MOSFET mismatch allows the refinement of models, provides information about the dominant mismatch causing mechanisms, and can ultimately lead to technologies with a better matching performance.

In literature, mainly the physical origins of threshold voltage mismatch are examined. They are found to be related to doping fluctuations, fluctuations in gate depletion and fluctuations in boron penetration [3, 21]. In order to calculate their impacts, generally, charge sheet modeling is applied [5, 10, 11, 16, 19, 30, 31, 37, 38, 51, 65, 74, 75] or 2D or 3D device simulations are performed [11, 12, 76–93]. For short-channel devices two-dimensional field effects influence device behavior, which to first order are modeled in [17, 22, 35, 37, 38, 51, 61, 74, 87, 91, 94]. Finally, by means of simulations the impact of quantum mechanics has been examined [78, 82, 83, 89]. Although much work is ongoing, a complete understanding of the physical origins of mismatch is still lacking. For instance, doping fluctuations only manage to explain about half of the experimentally observed mismatch in the threshold voltage. In this chapter theories will be presented that model the impact of doping fluctuations in the channel, doping fluctuations in the gate, fluctuations in oxide charge, and fluctuations in surface roughness.
As an introduction, we will start in section 4.1 by deriving the basic equations of MOSFET operation. The second section again derives these equations, but now in the presence of microscopic fluctuations. It will be found that the generally applied charge-sheet approach is only valid in strong inversion at low values of the drain bias. New models will be provided for the weak inversion regime, and for the strong inversion regime at higher values of the drain bias. Furthermore, short- and narrow-channel effects and the device symmetry are examined. The actual calculation of the impact of the above mentioned physical effects takes place in section 4.3. Quantum-mechanical and mobility effects will be taken into account. We combine all our models to compare them to experimental data. They are extensively tested by investigating the bias dependencies of the mismatch. With our theories we will manage to understand most of the experimental results. Section 4.4 concludes this chapter.

Figure 4.1: Schematic drawing of the MOSFET. [Labels in the drawing: poly-silicon gate, $N_D \sim 5\cdot10^{19}$ – $10^{21}$; oxide; extensions, $N_D \sim 5\cdot10^{19}$ – $10^{21}$; channel region, $N_A \sim 10^{17}$ – $5\cdot10^{18}$; coordinate axes x and y.]
4.1 Basic operation of the MOS transistor
This section gives a brief overview of the basic operation of the MOS transistor. For more extensive descriptions we refer to [95, 96]. This section is organized as follows: In the first subsection the regions of operation of the MOSFET are introduced, and current expressions are derived for long-channel transistors. The second subsection discusses short- and narrow-channel effects and the impact of halos. Subsections 4.1.3 and 4.1.4 describe the impact of gate depletion and quantum-mechanical effects, respectively. Subsection 4.1.5 looks deeper into mobility-determining effects.
4.1.1 Regions of operation and current expressions
Figure 4.1 shows the basic structure of the n-type MOSFET. Depending on the technology node, for modern-day devices, doping levels range from $10^{17}$ to $5\cdot10^{18}$ cm$^{-3}$ (footnote 1) in the channel region and from $5\cdot10^{19}$ to $10^{21}$ cm$^{-3}$ (footnote 2) in the extension regions and in the poly-silicon gate.

Footnote 1: The doping type is boron (or possibly indium) for nMOS transistors and arsenic for pMOS transistors.
Footnote 2: The doping type is arsenic for nMOS transistors and boron (or possibly indium) for pMOS transistors.

Figure 4.2: Schematic band diagrams for the four regions of MOSFET operation: (a) accumulation ($V_{GB} < \phi_{MS}$), (b) flat-band ($V_{GB} = \phi_{MS}$), (c) depletion ($V_{GB} > \phi_{MS}$, $\psi_s < \phi_F$), (d) inversion ($V_{GB} > \phi_{MS}$, $\psi_s > \phi_F$).

When the voltage applied between the gate and bulk ($V_{GB}$) is equal to their difference in work function ($\phi_{MS}$), the energy bands show no bending. This situation is called the flat-band condition and is displayed in figure 4.2b. When the gate-to-bulk bias is smaller, holes accumulate at the surface (figure 4.2a). For larger $V_{GB}$ holes are pushed away from the oxide-silicon interface and a depletion layer appears (figure 4.2c). When $V_{GB}$ is increased further, the intrinsic energy ($E_i$) at the surface will become smaller than the Fermi level ($E_F$), which causes the concentration of surface electrons ($n$) to become larger than the concentration of holes (figure 4.2d). This situation is called inversion and the layer of electrons at the interface is called the inversion layer. In most cases, the MOSFET operates in this regime. The difference between the intrinsic energy in the bulk and the Fermi energy is given by:

$$E_i - E_F = kT \ln\left(\frac{N_A}{n_i}\right), \tag{4.1}$$

where $k$ is Boltzmann's constant, $N_A$ the channel-doping concentration, $n_i$ the intrinsic carrier concentration and $T$ the temperature. In other words, the transistor operates in inversion when $V_{GB}$ causes $\psi_s > \phi_F$, where the Fermi potential $\phi_F = (E_i - E_F)/q$ and $\psi_s$ is the surface potential. When $V_{GB}$ is increased, at first, the depletion region under the gate will widen and the surface potential ($\psi_s$) will increase, but the electron concentration at the interface remains low. This is called weak inversion. However, when the surface potential reaches $\psi_s = 2\phi_F$, it becomes energetically more favorable to add electrons to the surface than to increase the width
of the depletion layer. Therefore, the electron concentration in the inversion layer becomes significant, while the depletion layer width is constant and the surface potential remains fixed. This is called strong inversion.

We will proceed by deriving the current expressions for the inversion regimes. In general the electron current density ($J_n$, see footnote 3) can be expressed as:

$$J_n = -q n \mu_n \frac{d\psi_s}{dx} + kT \mu_n \frac{dn}{dx}, \tag{4.2}$$

where $\mu_n$ is the electron mobility. The first term on the right-hand side represents the drift-current component and the second term represents the diffusion current. In order to derive current expressions it will first be assumed that the source and drain are at the same potential ($V_C$). We need to calculate: 1) how the electron concentration in the inversion layer depends on the surface potential and $V_C$, 2) how the surface potential depends on $V_{GC}$ and $V_{CB}$, and finally 3) how, for unequal source and drain potential, the surface potential and electron concentration vary laterally.

Footnote 3: In n-type transistors the hole current density is negligible (and vice versa).

From Boltzmann statistics it follows that:

$$n(x) = n_i\, e^{(\psi_s(x) - (\phi_F + V_{CB}))q/kT} = N_A\, e^{(\psi_s(x) - (2\phi_F + V_{CB}))q/kT}. \tag{4.3}$$

With the bulk taken as reference, $V_{CB}$ equals the increase in Fermi potential due to the contact with the source and drain. In order to find the potential and charge distribution, Poisson's equation needs to be solved in the direction perpendicular to the interface:

$$\frac{d^2\psi}{dy^2} = -\frac{q}{\epsilon_{si}}\left(p(y) - n(y) - N_A(y)\right). \tag{4.4}$$

For the total charge under the gate ($Q_s$) in inversion it can be found that these equations yield:

$$Q_s \cong Q_D + Q_i \cong -\sqrt{2\epsilon_{si} q N_A \left(\psi_s + (kT/q)\, e^{(\psi_s - (2\phi_F + V_{CB}))q/kT}\right)}, \tag{4.5}$$

where $Q_D$ is the depletion layer charge and $Q_i$ is the inversion layer charge. It follows that if $Q_D$ is known, $Q_i$ is also known. To calculate $Q_D$, the depletion approximation is used: in the depletion region ($0 < y < W_D$) the hole concentration $p = 0$, while outside this region $p = N_A$ ($y > W_D$). From (4.4) it follows that a charge sheet at position $y$ with charge $qN_A(y)dy$ gives rise to a field ($E_{N_A(y)}$) of:

$$E_{N_A(y)}(y') = \begin{cases} 0 & y' > y \\ qN_A(y)dy/\epsilon_{si} & 0 < y' < y \\ qN_A(y)dy/\epsilon_{ox} & -t_{ox} < y' < 0 \end{cases}. \tag{4.6}$$

This gives for the surface potential:

$$\psi_s = \frac{q}{\epsilon_{si}} \int_0^{W_D} y N_A(y)\, dy, \tag{4.7}$$
where $W_D$ is the thickness of the depletion layer. It is assumed that $Q_i$ is fully located at $y = 0$. For a uniform doping concentration ($N_A(y) = N_A$) it follows that:

$$W_D = \sqrt{\frac{2\epsilon_{si}\psi_s}{qN_A}}, \tag{4.8}$$

$$Q_D = -\sqrt{2\epsilon_{si} q N_A \psi_s}. \tag{4.9}$$

In weak inversion the exponential term in (4.5) is much smaller than $\psi_s$. Using a first order Taylor expansion and combining (4.5) and (4.9) gives for the inversion-layer charge:

$$Q_i \cong -\sqrt{\frac{\epsilon_{si} q N_A}{2\psi_s}}\, \frac{kT}{q}\, e^{(\psi_s - (2\phi_F + V_{CB}))q/kT}. \tag{4.10}$$
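As a numerical check of the first-order expansion, the sketch below compares the inversion-layer charge obtained by subtracting (4.9) from (4.5) with the approximation (4.10). It is only an illustration under stated assumptions: the function names, the temperature of 300 K, the value $n_i = 10^{10}$ cm$^{-3}$, and the chosen doping level and surface potential are not taken from the text, and charges are handled as magnitudes.

```python
import math

Q_E = 1.602176634e-19             # elementary charge (C)
K_B = 1.380649e-23                # Boltzmann constant (J/K)
EPS_SI = 11.7 * 8.8541878128e-14  # permittivity of silicon (F/cm)
N_I = 1.0e10                      # assumed intrinsic concentration near 300 K (cm^-3)


def thermal_voltage(temp=300.0):
    """kT/q in volts."""
    return K_B * temp / Q_E


def fermi_potential(n_a, temp=300.0):
    """phi_F = (kT/q) ln(N_A / n_i), following (4.1)."""
    return thermal_voltage(temp) * math.log(n_a / N_I)


def boltzmann_factor(psi_s, n_a, v_cb=0.0, temp=300.0):
    """exp((psi_s - (2 phi_F + V_CB)) q / kT), the exponential in (4.5)."""
    return math.exp((psi_s - (2.0 * fermi_potential(n_a, temp) + v_cb))
                    / thermal_voltage(temp))


def inversion_charge_exact(psi_s, n_a, v_cb=0.0, temp=300.0):
    """|Q_i| = |Q_s| - |Q_D| from (4.5) and (4.9), in C/cm^2."""
    v_th = thermal_voltage(temp)
    expo = boltzmann_factor(psi_s, n_a, v_cb, temp)
    q_s = math.sqrt(2.0 * EPS_SI * Q_E * n_a * (psi_s + v_th * expo))
    q_d = math.sqrt(2.0 * EPS_SI * Q_E * n_a * psi_s)
    return q_s - q_d


def inversion_charge_weak(psi_s, n_a, v_cb=0.0, temp=300.0):
    """|Q_i| from the weak-inversion approximation (4.10), in C/cm^2."""
    v_th = thermal_voltage(temp)
    expo = boltzmann_factor(psi_s, n_a, v_cb, temp)
    return math.sqrt(EPS_SI * Q_E * n_a / (2.0 * psi_s)) * v_th * expo


N_A = 1.0e18   # cm^-3, hypothetical channel doping
PSI_S = 0.80   # V, hypothetical surface potential between phi_F and 2*phi_F
exact = inversion_charge_exact(PSI_S, N_A)
approx = inversion_charge_weak(PSI_S, N_A)
rel_err = abs(exact - approx) / exact
print(f"exact = {exact:.3e} C/cm^2, approx = {approx:.3e} C/cm^2, "
      f"relative error = {rel_err:.1e}")
```

The relative error grows as $\psi_s$ approaches $2\phi_F + V_{CB}$, where the exponential term is no longer small compared to $\psi_s$.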
In order to relate $\psi_s$ to $V_{GC}$ and $V_{CB}$, the following potential balance is used:

$$V_{GC} + V_{CB} = \phi_{MS} + \psi_s + V_{ox}. \tag{4.11}$$

The potential over the oxide ($V_{ox}$) is equal to:

$$V_{ox} = \frac{q t_{ox}}{\epsilon_{ox}} \int_0^{W_D} N_A(y)\, dy, \tag{4.12}$$

which for a uniform doping concentration results in:

$$V_{GB} = \phi_{MS} + \psi_s + \frac{t_{ox}\sqrt{2\epsilon_{si} q N_A \psi_s}}{\epsilon_{ox}}. \tag{4.13}$$
In strong inversion the surface potential can be assumed fixed at $\psi_s = 2\phi_F + V_{CB} \equiv \phi_B + V_{CB}$, as was mentioned before. The lower $V_{GC}$ boundary of this regime is called the threshold voltage ($V_T$). From (4.11) it is seen to be equal to:

$$V_T = \phi_{MS} + \phi_B + \frac{t_{ox}\sqrt{2\epsilon_{si} q N_A (\phi_B + V_{CB})}}{\epsilon_{ox}}. \tag{4.14}$$

When $V_{GC} > V_T$, an increase in $V_{GC}$ mainly results in an increase of the inversion layer charge. In other words, the device behaves as a capacitor over which a voltage $V_{GC} - V_T$ is applied. For the inversion-layer charge this yields:

$$Q_i = -\frac{\epsilon_{ox}}{t_{ox}}(V_{GC} - V_T). \tag{4.15}$$
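The threshold voltage (4.14) and the strong-inversion charge (4.15) can be evaluated directly. The following is a minimal sketch, assuming uniform doping, $\phi_B = 2\phi_F$, a temperature of 300 K and $n_i = 10^{10}$ cm$^{-3}$. The function names and the work-function difference of $-1.0$ V are hypothetical and only serve the example.

```python
import math

Q_E = 1.602176634e-19             # elementary charge (C)
K_B = 1.380649e-23                # Boltzmann constant (J/K)
EPS_0 = 8.8541878128e-14          # vacuum permittivity (F/cm)
EPS_SI = 11.7 * EPS_0             # permittivity of silicon (F/cm)
EPS_OX = 3.9 * EPS_0              # permittivity of SiO2 (F/cm)
N_I = 1.0e10                      # assumed intrinsic concentration near 300 K (cm^-3)


def fermi_potential(n_a, temp=300.0):
    """phi_F = (kT/q) ln(N_A / n_i), following (4.1)."""
    return (K_B * temp / Q_E) * math.log(n_a / N_I)


def threshold_voltage(n_a, t_ox, phi_ms, v_cb=0.0, temp=300.0):
    """V_T from (4.14) with phi_B = 2 phi_F; t_ox in cm, result in V."""
    phi_b = 2.0 * fermi_potential(n_a, temp)
    q_d = math.sqrt(2.0 * EPS_SI * Q_E * n_a * (phi_b + v_cb))
    return phi_ms + phi_b + t_ox * q_d / EPS_OX


def inversion_charge(v_gc, v_t, t_ox):
    """Q_i from (4.15) in C/cm^2 (negative for electrons)."""
    return -(EPS_OX / t_ox) * (v_gc - v_t)


N_A = 6.0e17      # cm^-3, hypothetical channel doping
T_OX = 1.5e-7     # cm (1.5 nm)
PHI_MS = -1.0     # V, hypothetical work-function difference

vt_source = threshold_voltage(N_A, T_OX, PHI_MS, v_cb=0.0)
vt_biased = threshold_voltage(N_A, T_OX, PHI_MS, v_cb=0.5)
print(f"V_T(V_CB = 0)     = {vt_source:.3f} V")
print(f"V_T(V_CB = 0.5 V) = {vt_biased:.3f} V")
print(f"Q_i at V_GC = 1.2 V: {inversion_charge(1.2, vt_source, T_OX):.3e} C/cm^2")
```

The increase of $V_T$ with $V_{CB}$ in this sketch is the dependence that the parameter $\delta$ linearizes later in this subsection.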
The approximation of $\psi_s = 2\phi_F + V_{CB}$, independent of $V_{GC}$, is checked by introducing this equation in (4.5). Since now the exponential term is dominant it follows that:

$$\phi_B = 2\phi_F + \frac{2kT}{q} \ln\left(\frac{\epsilon_{ox}(V_{GC} - V_T)}{t_{ox}\sqrt{2kT\epsilon_{si}N_A}} + \sqrt{\frac{q(2\phi_F + V_{CB})}{kT}}\right) \cong 2\phi_F + \frac{5kT}{q}, \tag{4.16}$$
which is a bit more accurate than putting $\phi_B = 2\phi_F$.

We will proceed with calculating the current as a function of $V_{GS}$, $V_{DS}$ and $V_{BS}$. The gradual channel approximation is used: the lateral change in potential is small enough, so that locally the structure is described by the equations that were derived earlier in this subsection. However, $V_{GC}$ and $V_{CB}$ now are functions of the lateral position ($x$). At $x = 0$, $V_{GC} = V_{GS}$ and $V_{CB} = -V_{BS}$. At $x = L$, $V_{GC} = V_{GS} - V_{DS}$ and $V_{CB} = V_{DS} - V_{BS}$. In weak inversion the current is caused by diffusion. It follows from (4.2) and (4.10) that:

$$\begin{aligned} I_D &= kT\mu_n \frac{W}{L}\left(Q_i|_{x=0} - Q_i|_{x=L}\right) = \frac{W}{L} I_0\, e^{q(\psi_s - 2\phi_F + V_{BS})/kT}\left(1 - e^{-qV_{DS}/kT}\right), \\ I_0 &= \mu_n \sqrt{\frac{\epsilon_{si} q^3 N_A}{2\psi_s}} \left(\frac{kT}{q}\right)^2. \end{aligned} \tag{4.17}$$

The surface potential is calculated with (4.13). Note that this potential is independent of the lateral position, in accordance with neglecting the drift component. Often, (4.13) is approximated by:

$$\psi_s + V_{BS} \cong \frac{V_{GS} - V_T}{1 + \delta}, \tag{4.18}$$

where $1/(1 + \delta)$ to first order models the sensitivity of $V_{CS}$ to $V_{GS}$ and:

$$\delta = \frac{t_{ox}}{\epsilon_{ox}} \sqrt{\frac{\epsilon_{si} q N_A}{2(\phi_B - V_{BS})}} = \frac{\epsilon_{si} t_{ox}}{\epsilon_{ox} W_D} = n - 1. \tag{4.19}$$
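The two forms of $\delta$ in (4.19) can be checked against each other numerically, using (4.8) with $\psi_s = \phi_B - V_{BS}$ for the depletion-layer width. The sketch below does this and also evaluates the gate-voltage change per decade of weak-inversion current, $(1+\delta)(kT/q)\ln 10$, which follows from combining the exponent of (4.17) with (4.18) but is not written out in the text. Function names and the numerical inputs are hypothetical.

```python
import math

Q_E = 1.602176634e-19             # elementary charge (C)
K_B = 1.380649e-23                # Boltzmann constant (J/K)
EPS_0 = 8.8541878128e-14          # vacuum permittivity (F/cm)
EPS_SI = 11.7 * EPS_0             # permittivity of silicon (F/cm)
EPS_OX = 3.9 * EPS_0              # permittivity of SiO2 (F/cm)


def depletion_width(psi, n_a):
    """W_D from (4.8) for a potential drop psi (V); result in cm."""
    return math.sqrt(2.0 * EPS_SI * psi / (Q_E * n_a))


def body_factor(n_a, t_ox, phi_b, v_bs=0.0):
    """delta from the first form of (4.19); t_ox in cm."""
    return (t_ox / EPS_OX) * math.sqrt(EPS_SI * Q_E * n_a / (2.0 * (phi_b - v_bs)))


def swing_per_decade(delta, temp=300.0):
    """Gate-voltage change per decade of weak-inversion current (V)."""
    return (1.0 + delta) * (K_B * temp / Q_E) * math.log(10.0)


N_A = 6.0e17    # cm^-3, hypothetical channel doping
T_OX = 1.5e-7   # cm (1.5 nm)
PHI_B = 0.93    # V, hypothetical value of phi_B

delta = body_factor(N_A, T_OX, PHI_B)
delta_alt = EPS_SI * T_OX / (EPS_OX * depletion_width(PHI_B, N_A))
print(f"delta = {delta:.3f} (capacitance-ratio form: {delta_alt:.3f})")
print(f"swing = {1e3 * swing_per_decade(delta):.1f} mV/decade")
```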
In strong inversion, the drift component in (4.2) dominates. Combining this with (4.15) and realizing that $\psi_s = \phi_B + V_{CB}$ gives:

$$I_D = \frac{W\mu_n\epsilon_{ox}}{t_{ox}}\left(V_{GS} - V_T - V_{CS}\right)\frac{dV_{CS}}{dx} \cong \frac{W\mu_n\epsilon_{ox}}{t_{ox}}\left(V_{GS} - V_T|_{V_{CB}=-V_{BS}} - (1+\delta)V_{CS}\right)\frac{dV_{CS}}{dx}, \tag{4.20}$$

where $\delta \cdot V_{CS}$ models, to first order, the $V_{CS}$ dependence of the threshold voltage. From now on, unless explicitly mentioned, the symbol $V_T$ is used for the threshold voltage at the source side of the transistor. Solving (4.20) gives:

$$V_{CS} = \frac{V_{GS} - V_T - \sqrt{(V_{GS} - V_T)^2 - 2(1+\delta)(I_D/\beta)(x/L) + C}}{1+\delta}, \tag{4.21}$$

where the current factor $\beta = W\mu_n\epsilon_{ox}/(L t_{ox})$ and the constant $C = 0$, as follows from the boundary condition $V_{CS}(x = 0) = 0$. The drain current is solved from the other boundary condition $V_{CS}(x = L) = V_{DS}$, which gives:

$$I_D = \beta\left(V_{GS} - V_T - (1+\delta)V_{DS}/2\right)V_{DS}. \tag{4.22}$$
This equation is valid in the linear regime. At high drain bias the inversion layer is pinched off at the drain side. In (4.22) $V_{DS}$ then has to be replaced by the saturation voltage ($V_{DSsat}$), which is calculated by equating $dI_D/dV_{DS} = g_{out}$, which results in:

$$V_{DSsat} = \frac{V_{GS} - V_T - g_{out}/\beta}{1+\delta}. \tag{4.23}$$

Figure 4.3: Output conductance over the current factor ($g_{out}/\beta$, in mV) as a function of the gate bias ($V_{GS}$) for a 130 nm technology with a nominal gate length of 90 nm, $|V_{DD}| = 1.2$ V and $t_{ox} = 1.5$ nm. (a) NMOS, $V_{DS} = 1.2$ V. (b) PMOS, $V_{DS} = -1.2$ V. Curves are shown for W/L = 10.0/0.09, 10.0/0.20, 10.0/0.45, 1.0/1.0, 4.0/4.0 and 10.0/7.2 (in µm).
When $V_{DS} > V_{DSsat}$ the transistor is said to operate in saturation. Figure 4.3 plots the output conductance over the current factor ($g_{out}/\beta$) as a function of the gate bias for a 130 nm technology. It is seen that $g_{out}/\beta$ affects the saturation voltage only for short transistors. Note that (4.22) and (4.23) differ somewhat from (2.17), (2.24) and (3.1). In the previous chapters it was assumed that $\delta = 0$ and $g_{out} = 0$, while in calculating (4.23) the drain-bias dependence of the mobility was not included.
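The strong-inversion expressions (4.22) and (4.23) can be combined into a single piecewise function. The sketch below is an illustration only: the function names, the choice to return zero below threshold (weak inversion is not modeled here) and the parameter values are assumptions, and the current in saturation is simply (4.22) evaluated at $V_{DSsat}$, as described in the text.

```python
import math


def saturation_voltage(v_gs, v_t, beta, delta=0.0, g_out=0.0):
    """V_DSsat from (4.23)."""
    return (v_gs - v_t - g_out / beta) / (1.0 + delta)


def drain_current(v_gs, v_ds, v_t, beta, delta=0.0, g_out=0.0):
    """I_D from (4.22); V_DS is replaced by V_DSsat when V_DS > V_DSsat."""
    v_ov = v_gs - v_t
    if v_ov <= 0.0:
        return 0.0  # assumption: sub-threshold current is ignored in this sketch
    v_eff = min(v_ds, saturation_voltage(v_gs, v_t, beta, delta, g_out))
    return beta * (v_ov - (1.0 + delta) * v_eff / 2.0) * v_eff


BETA = 1.0e-3   # A/V^2, hypothetical current factor
V_T = 0.3       # V, hypothetical threshold voltage
DELTA = 0.2     # hypothetical body factor

i_lin = drain_current(1.0, 0.1, V_T, BETA, DELTA)
i_sat = drain_current(1.0, 1.2, V_T, BETA, DELTA)
print(f"linear regime (V_DS = 0.1 V): I_D = {i_lin:.3e} A")
print(f"saturation    (V_DS = 1.2 V): I_D = {i_sat:.3e} A")
```

With $g_{out} = 0$ the saturation current reduces to $\beta(V_{GS} - V_T)^2 / (2(1+\delta))$, which provides a quick consistency check of the implementation.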
4.1.2 Short- and narrow-channel effects
In the previous subsection it was assumed that the channel length of the transistor is long. Close to the source and drain junctions, one has to use the two dimensional Poisson equation to find the electric field. For long-channel devices these 2D effects are negligible. However, for short devices they start to play a role. This subsection discusses this short-channel effect (SCE) and channel-length modulation, which is another SCE. Besides SCEs, also the narrow-channel effect will be discussed.

2D field effects. Figure 4.4a shows the surface potential in weak inversion as a function of the lateral position for several transistor lengths at $V_{DS} = 0$ V. Figure 4.4b shows the surface potential for a short device, as a function of the lateral position, for several values of the drain bias.

Figure 4.4: Surface potential ($\psi_s$) as a function of the lateral position ($x$) with the length (a) or drain bias (b) as a parameter. $N_A = 1\cdot10^{18}$ cm$^{-3}$, $t_{ox} = 2.2$ nm. (a) L = 50, 100, 150 and 200 nm. (b) L = 50 nm, $V_{DS}$ = 0.0, 0.3, 0.6, 0.9 and 1.2 V.

The surface potential can be approximated by [97]:

$$\psi_s(x) \approx \psi_{si}^0 + (\psi_{si,end} - \psi_{si}^0)\frac{\sinh((x - x_{i,begin})/\lambda_i)}{\sinh(L_i/\lambda_i)} + (\psi_{si,begin} - \psi_{si}^0)\frac{\sinh((x_{i,end} - x)/\lambda_i)}{\sinh(L_i/\lambda_i)}, \tag{4.24}$$
where $x_{begin} = 0$, $x_{end} = L$, $\psi_{begin} = \psi_s(x = 0) = \psi_{bi} - V_{BS}$, $\psi_{end} = \psi_s(x = L) = \psi_{bi} - V_{BS} + V_{DS}$, the built-in potential $\psi_{bi} = E_g/2q + \phi_B$ and $\psi_s^0$ is the long-channel surface potential, given by (4.11). The meaning of the subscript $i$ will become clear when halos are introduced later in this subsection. The parameter $\lambda$ models the rate of change of the surface potential with the lateral position at the source and drain end of the transistor. It is given by:

$$\lambda = \xi\sqrt{\frac{\epsilon_{si}}{\epsilon_{ox}} t_{ox} W_D}, \tag{4.25}$$

where $\xi$ is equal to 1 for abrupt junctions, but is generally used as a fitting parameter. Two effects are observed. Firstly, for short transistors it is seen that the potential barrier between source and drain is smaller than $2\phi_F$, which also causes a smaller threshold voltage. The decrease in barrier height is calculated at the location of minimal potential:

$$x_{min} \cong \frac{L}{2} - \frac{\lambda}{2}\ln\left(\frac{\psi_{end} - \psi_s^0}{\psi_{begin} - \psi_s^0}\right), \tag{4.26}$$

$$\Delta\psi_s = \psi_s(x_{min}) - \psi_s^0 \cong 2\sqrt{(\psi_{end} - \psi_s^0)(\psi_{begin} - \psi_s^0)}\; e^{-L/2\lambda}, \tag{4.27}$$

which has been derived using the approximation $\sinh(z) \cong e^z/2$. The difference is seen to increase with increasing drain bias, which is called drain induced barrier lowering (DIBL). It is also observed that $d\psi_s(x_{min})/d\psi_s^0 < 1$. This explains the increase in subthreshold swing and decrease in $dV_T/dV_{BS}$ for short devices, since both the gate- and bulk-bias dependencies of $\psi_s$ are mainly determined by $\psi_s^0(V_{GS}, V_{BS})$. Secondly, a significant part of the channel is needed to build up the potential. This results in a shorter effective channel length ($L_{eff}$) than the metallurgical channel length ($L_{met}$). As measure of the effective channel length we take (see footnote 4):

$$\text{weak inversion:}\quad L_{eff} = \frac{\int_0^L Q_i(\psi_s(x))\,dx}{Q_i(\psi_s(x_{min}))}, \tag{4.28}$$

$$\text{strong inversion:}\quad L_{eff} = L_{met} - \lambda_{source} - \lambda_{drain}. \tag{4.29}$$
The inversion-layer charge in weak inversion is calculated with (4.10). Further note that it follows from (4.25) that for unequal source and drain bias $\lambda_{source} \neq \lambda_{drain}$, since the depletion layer widths at both sides differ. We will now make the following approximation: the short-channel transistor is assumed to behave as a long-channel transistor with $L = L_{eff}$ and $\psi_s = \psi_s^0 + \Delta\psi_s$. The effective channel length replaces $L$ in the formula for the current factor. The increase in surface potential lowers the threshold voltage by:

$$\Delta V_T(L) \equiv V_T(L) - V_{Tlw} \cong -(1+\delta)\Delta\psi_s, \tag{4.30}$$
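The chain from (4.24) to (4.30) can be illustrated for a single region without halos. The sketch below evaluates the characteristic length (4.25), compares the approximate barrier shift (4.27) with the exact value of (4.24) at $x_{min}$ from (4.26), and converts the result into a threshold-voltage shift with (4.30). All function names and numerical inputs (surface potentials, depletion-layer width, $\delta$) are hypothetical. Only $t_{ox} = 2.2$ nm is borrowed from figure 4.4.

```python
import math

EPS_RATIO = 11.7 / 3.9   # eps_si / eps_ox


def characteristic_length(t_ox, w_d, xi=1.0):
    """lambda from (4.25); all lengths in cm."""
    return xi * math.sqrt(EPS_RATIO * t_ox * w_d)


def surface_potential(x, length, lam, psi0, psi_begin, psi_end):
    """psi_s(x) from (4.24) for one region spanning 0 <= x <= length."""
    norm = math.sinh(length / lam)
    return (psi0
            + (psi_end - psi0) * math.sinh(x / lam) / norm
            + (psi_begin - psi0) * math.sinh((length - x) / lam) / norm)


def x_min(length, lam, psi0, psi_begin, psi_end):
    """Location of the potential minimum from (4.26)."""
    return length / 2.0 - (lam / 2.0) * math.log((psi_end - psi0) / (psi_begin - psi0))


def barrier_shift(length, lam, psi0, psi_begin, psi_end):
    """Delta psi_s from (4.27)."""
    return (2.0 * math.sqrt((psi_end - psi0) * (psi_begin - psi0))
            * math.exp(-length / (2.0 * lam)))


def vt_shift(delta, d_psi):
    """Delta V_T from (4.30)."""
    return -(1.0 + delta) * d_psi


LAM = characteristic_length(t_ox=2.2e-7, w_d=2.8e-6)  # hypothetical W_D = 28 nm
L_CH = 1.0e-5     # cm (100 nm)
PSI0 = 0.6        # V, hypothetical long-channel surface potential
PSI_BI = 1.5      # V, hypothetical built-in potential, V_BS = 0
DELTA = 0.2       # hypothetical body factor

for v_ds in (0.0, 1.2):
    d_psi = barrier_shift(L_CH, LAM, PSI0, PSI_BI, PSI_BI + v_ds)
    xm = x_min(L_CH, LAM, PSI0, PSI_BI, PSI_BI + v_ds)
    exact = surface_potential(xm, L_CH, LAM, PSI0, PSI_BI, PSI_BI + v_ds) - PSI0
    print(f"V_DS = {v_ds:.1f} V: dpsi (4.27) = {1e3 * d_psi:.1f} mV, "
          f"dpsi (4.24) = {1e3 * exact:.1f} mV, "
          f"dV_T = {1e3 * vt_shift(DELTA, d_psi):.1f} mV")
```

The larger shift at high drain bias is the drain induced barrier lowering described above. Repeating the calculation for a longer channel shows the exponential decay of the effect with $L/2\lambda$.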
where $V_{Tlw}$ is the long-channel threshold voltage. To counter the short-channel effect, in modern-day devices extra doping is implanted around the source and drain regions (see figure 4.5a). These regions with a higher doping level are called halos. Figure 4.5b shows the surface potential as a function of the lateral position for transistors with different halo doses ($N_{halo}$). The transistor is now divided in three regions: 1) halo at source side, 2) center, 3) halo at drain side. In each of these three regions (4.24) is valid. The subscript $i$ denotes the region. The boundary conditions are $\psi_{1begin} = \psi_{bi} - V_{BS}$, $\lim_{x \uparrow x_{1end}} d\psi_s/dx = \lim_{x \downarrow x_{2begin}} d\psi_s/dx$, $\lim_{x \uparrow x_{2end}} d\psi_s/dx = \lim_{x \downarrow x_{3begin}} d\psi_s/dx$ and $\psi_{3end} = \psi_{bi} - V_{BS} + V_{DS}$. The overall shift in surface potential is approximated by:

$$\Delta\psi_s = \frac{1}{L_{eff}}\int_{\lambda_{source}}^{L-\lambda_{drain}} \psi_s(x)\,dx - \psi^0_{s2,long}. \tag{4.31}$$
The subscript long is added to $\psi^0_{s2}$ to distinguish it from the short-device case, in which the doping level in region 2 is determined by overlapping halos. The threshold voltage is again calculated with (4.30). Figure 4.6a compares experimental values of the threshold voltage with calculated values. Model parameters are presented in table 4.1. Results for the NMOS transistors are seen to be well described. The PMOS devices, that suffer more strongly from the short channel effect, are less well described. This can be explained by the usage of (4.8) to calculate the depletion layer width. For too strong 2D field-effects the depletion layer width increases and this approximation is not valid.

Footnote 4: At this point it is useful to briefly summarize the different lengths that are used in this thesis. The gate length ($L_{gate}$) refers to the length of the poly-silicon gate. The gate length of the minimum sized digital transistor is called the nominal gate length ($L_{nominal}$). The metallurgical channel length ($L_{met}$ or $L_{channel}$) is equal to the distance between the source and drain. The effective channel length ($L_{eff}$) is equal to the electrical channel length. In this thesis, the length $L$ usually refers to the metallurgical channel length. For long-channel transistors the differences between these lengths are negligible.

Figure 4.5: a) Schematic drawing of a MOSFET with halos (halo regions of length $L_{halo}$ with doping $N_A + N_{halo}$ next to the extensions). b) Surface potential ($\psi_s$) as a function of the lateral position ($x$) with the halo dose as a parameter ($N_{channel} = 1\cdot10^{18}$; $N_{halo}$ = 0, $1\cdot10^{18}$, $3\cdot10^{18}$ and $7\cdot10^{18}$). L = 100 nm, $L_{halo}$ = 25 nm and $t_{ox}$ = 2.2 nm.

Channel-length modulation. In strong inversion, when the device is operated in saturation ($V_{DS} > V_{DSsat}$), the channel is pinched off when the potential reaches $V(x) = V_{DSsat}$. This point is not exactly located at $x = L$, but a distance $l_p$ closer to the source. This distance can be approximated by [95]:

$$l_p = \sqrt{\frac{2\epsilon_{si}}{qN_A}}\left(\sqrt{\frac{\epsilon_{si}E_1^2}{2qN_A} + (V_{DS} - V_{DSsat})} - \sqrt{\frac{\epsilon_{si}E_1^2}{2qN_A}}\right), \tag{4.32}$$

where $E_1 = d\psi_s/dx|_{x=L-l_p} \sim 10^4 - 2\cdot10^5$ V/cm. To properly describe the current, in (4.22) and (4.23), $L$ needs to be replaced by $L - l_p$. This effect is called channel-length modulation.

Narrow-channel effect. For narrow transistors deviations from wide-channel behavior can be expected. For transistors with shallow-trench isolation (STI), the gate curves a bit around the edge. This causes a larger effective gate-area at the side of the channel, which results in a lowering of the threshold voltage. For NMOS transistors, close to the isolation the boron channel-doping can be reduced due to segregation of dopants into the STI. This also results in a lower threshold voltage. Finally, stress induced by the isolation can also affect transistor operation through a change in the band gap and a change in the mobility. These effects can be modeled by dividing the transistor in three parallel segments: the center segment with width $W_{middle} = W - 2W_{narrow}$ has the 'normal' threshold voltage ($V_{Tlw}$), while the two transistors at the edge with width $W_{narrow}$ have a threshold voltage ($V_{Tnarrow}$) adjusted for the narrow-channel effects. The overall threshold voltage ($V_T$) is given by:

$$V_T(W) = \frac{W_{middle}}{W}V_{Tlw} + \frac{2W_{narrow}}{W}V_{Tnarrow}. \tag{4.33}$$

Figure 4.6: Threshold voltage as a function of the gate length (a) and gate width (b) for two 130 nm technologies. The first has a nominal gate length of 90 nm with $t_{ox}$ = 1.5 nm and $|V_{DD}|$ = 1.2 V. The second has a nominal gate length of 130 nm with $t_{ox}$ = 2.0 nm and $|V_{DD}|$ = 1.5 V. Model parameters are listed in table 4.1. $V_{DS}$ = 50 mV. [Axes: $V_T$ (V) versus L (µm) in (a) and W (µm) in (b); symbols distinguish NMOS and PMOS for both oxide thicknesses, lines show the model.]

Table 4.1: Model parameters describing the short- and narrow-channel effects related to $V_T$, $\sigma^2_{\Delta V_T}$ and $\sigma^2_{\Delta\beta/\beta}$

tox (nm)                      1.5        1.5        2.0        2.0
Lnominal (nm)                 90         90         130        130
type                          NMOS       PMOS       NMOS       PMOS
VDD (V)                       1.2        -1.2       1.5        -1.5
NA (cm^-3)                    6·10^17    6·10^17    5·10^17    5·10^17
VTlong (V)                    0.291      -0.296     0.343      -0.325
A0,∆VT (mVµm)                 3.7        2.3        3.8        2.4
A0,∆β/β (%µm)                 1.17       0.86       1.00       0.86
Lhalo (nm)                    135        130        130        130
Nhalo (cm^-3)                 7.5·10^17  6.0·10^17  9.5·10^17  6.0·10^17
ξ (-)                         1.0        2.3        2.5        4.5
∆L∆VT (nm)                    40         150        65         155
∆L∆β/β (nm)                   60         100        80         155
Wnarrow (nm)                  50         50         70         50
VTnarrow (V)                  0.118      -0.296     0.156      -0.325
A∆VT,narrow (mVµm)            1.65       3.55       2.45       3.55
A∆β/β,narrow (%µm)            1.17       1.6        1.17       1.6
Figure 4.6b shows that this model gives a good description of the threshold voltage as a function of the width. Model parameters are listed in table 4.1.
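The width-weighted average of (4.33) is straightforward to evaluate. The sketch below uses the table 4.1 values of the NMOS transistor with $t_{ox}$ = 1.5 nm ($V_{Tlong}$ = 0.291 V, $V_{Tnarrow}$ = 0.118 V, $W_{narrow}$ = 50 nm). The function name and the decision to reject widths below $2W_{narrow}$ are assumptions of this example, since the text does not state how the model behaves there.

```python
import math


def narrow_channel_vt(width, w_narrow, vt_long, vt_narrow):
    """V_T(W) from (4.33); width and w_narrow in the same length unit."""
    if width < 2.0 * w_narrow:
        # Assumption: the three-segment picture needs a non-negative center width.
        raise ValueError("width must be at least 2 * w_narrow")
    w_middle = width - 2.0 * w_narrow
    return (w_middle / width) * vt_long + (2.0 * w_narrow / width) * vt_narrow


# Table 4.1, NMOS, t_ox = 1.5 nm
VT_LONG = 0.291     # V
VT_NARROW = 0.118   # V
W_NARROW = 50.0     # nm

for width_nm in (100.0, 200.0, 1000.0, 10000.0):
    vt = narrow_channel_vt(width_nm, W_NARROW, VT_LONG, VT_NARROW)
    print(f"W = {width_nm / 1000.0:5.2f} um: V_T = {vt:.4f} V")
```

For wide devices the result approaches $V_{Tlong}$, while at $W = 2W_{narrow}$ it equals $V_{Tnarrow}$.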
4.1.3 Gate depletion
The total amount of charge in the MOSFET is equal to 0. Therefore, the negative charge under the oxide is balanced by the same amount of positive charge in the gate. This results in a small depletion layer on top of the oxide, that decreases the total gate-to-channel capacitance ($C_{GC}$). The equivalent increase in oxide thickness ($t_{GD}$) is defined as $t_{GD} \equiv \epsilon_{ox}/C_{GC} - t_{ox}$. Using the depletion approximation (see footnote 5), we can write:

$$t_{GD} = -\frac{\epsilon_{ox}Q_s}{\epsilon_{si}qN_p} \cong \frac{\epsilon_{ox}^2(V_{GS} - \phi_{MS} - \phi_B)}{\epsilon_{si}(t_{ox} + t_{GD})qN_p} = \sqrt{\left(\frac{t_{ox}}{2}\right)^2 + \frac{\epsilon_{ox}^2(V_{GS} - \phi_{MS} - \phi_B)}{\epsilon_{si}qN_p}} - \frac{t_{ox}}{2}, \tag{4.34}$$

where $N_p$ is the doping concentration in the poly gate at the interface with the oxide. This concentration can be significantly lower than the average doping concentration in the gate and is estimated to have an approximate value of $N_p \sim 5\cdot10^{19}$ cm$^{-3}$. At an oxide thickness of $t_{ox} = 1.5$ nm and gate bias of $V_{GS} = 1.2$ V (see footnote 6) this results in an effective increase in oxide thickness of $t_{GD} = 0.75$ nm. In the calculation of the drain current $t_{ox}$ needs to be replaced by $(t_{ox} + t_{GD})$. Note that the drain current is decreased, because of the decrease in oxide capacitance directly (see (4.15)) and because of the related increase in threshold voltage (see (4.14)).

Footnote 5: The depletion-layer width in the gate ($(\epsilon_{si}/\epsilon_{ox})t_{GD}$) is of the same order of magnitude as the Debye-length (compensated for Fermi-Dirac statistics). Strictly speaking, this means that the use of the depletion approximation is not valid.
Footnote 6: $\phi_{MS} + \phi_B \sim 0$.

4.1.4 Quantum-mechanical effects

Figure 4.7 schematically plots the potential in the channel region as a function of the distance from the oxide-silicon interface. It is seen that at the surface the potential can be approximated by a triangular well in which the electron wave-functions are confined. Also the electron distribution of the ground state is plotted in figure 4.7.

Figure 4.7: Schematic drawing of the potential and electron concentration taking quantum-mechanical effects into account. [Labels in the drawing: gate, channel region, conduction-band edge, electron concentration, ground-state energy $E_0$, $\Delta\psi_{sQM}$ and $(\epsilon_{si}/\epsilon_{ox})\cdot t_{QM}$.]

Two effects are observed. Firstly, the ground state has an energy that is slightly higher than $\psi_s$, and extra band bending ($\Delta\psi_{sQM}$) is required to reach the threshold condition [96]:

$$\Delta\psi_{sQM} \approx B_{QM1}\cdot E_s^{2/3} - \frac{kT}{q}\ln\left(\frac{E_s}{E_{QM2}}\right), \tag{4.35}$$

where the surface field $E_s = Q_D/\epsilon_{si}$, $B_{QM1} = 1.73\cdot10^{-5}$ V$^{1/3}$cm$^{2/3}$ and $E_{QM2} = 2.02\cdot10^5$ Vcm$^{-1}$. This approximation only takes the lowest energy subband into account, which is accurate when $E_s \gtrsim 5\cdot10^5$ Vcm$^{-1}$ or $N_A \gtrsim 1\cdot10^{18}$ cm$^{-3}$. Secondly, it is observed that the peak electron concentration is not located at the interface but a certain distance $(\epsilon_{si}/\epsilon_{ox})t_{QM}$ away from it:

$$\frac{\epsilon_{si}}{\epsilon_{ox}}t_{QM} = \frac{B_{QM3}}{\left(Q_D + \frac{11}{32}Q_i\right)^{1/3}}, \tag{4.36}$$
where BQM 3 = 1.25 · 10−9 cm1/3 C−1/3 . This results in an increase of the effective oxide thickness of tQM ∼ 0.4 nm, that lowers the current factor.
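To get a feeling for the size of the extra band bending in (4.35), the expression can be evaluated directly. This is a minimal sketch that uses the two constants quoted in the text and assumes a thermal voltage of 25.85 mV (room temperature); the function name and the sample fields are illustrative.

```python
import math

B_QM1 = 1.73e-5   # V^(1/3) cm^(2/3), value quoted in the text
E_QM2 = 2.02e5    # V/cm, value quoted in the text
KT_Q = 0.02585    # V, assumed room-temperature thermal voltage

def qm_band_bending(e_s):
    """Extra band bending of (4.35); e_s in V/cm, result in V."""
    return B_QM1 * e_s ** (2.0 / 3.0) - KT_Q * math.log(e_s / E_QM2)

for e_s in (5e5, 1e6):   # sample surface fields in the stated validity range
    print(f"E_s = {e_s:.0e} V/cm -> {1e3 * qm_band_bending(e_s):.0f} mV")
```

Under these assumptions the shift is of the order of 0.1 V for surface fields in the range where the single-subband approximation is stated to hold.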
4.1.5
Low field mobility
As was mentioned in section 2.3.3, the mobility is determined by several scattering mechanisms. It can be split up into bulk mobility ($\mu_B$), surface and fixed oxide-charge scattering ($\mu_{fc}$), Coulomb scattering ($\mu_C$) and surface-roughness scattering ($\mu_{sr}$). The overall mobility ($\mu$) is calculated by Matthiessen's rule:
\[
\frac{1}{\mu} = \frac{1}{\mu_B} + \frac{1}{\mu_{fc}} + \frac{1}{\mu_C} + \frac{1}{\mu_{sr}}. \tag{4.37}
\]
The mobilities will be expressed in terms of the effective field ($E_{eff}$):
\[
E_{eff} = |Q_B + \eta Q_i| / \epsilon_{si}. \tag{4.38}
\]
The parameter $\eta$ is related to the inversion-layer thickness. Theoretically it is equal to 11/32 when only the lowest subband is taken into account. Experimentally it is found that $\eta = 1/2$ for electrons and $\eta = 1/3$ for holes. The value of 1/2 for electrons is due to the occupancy of higher subbands. The value of 1/3 for holes is very close to the theoretical 11/32. A physically correct approach to model the different components is presented in e.g. [98–100]. The full geometry of the problem needs to be taken into account, e.g. by using Green's functions. Generally, mobility is calculated by averaging over all possible device configurations. Fluctuations could be introduced by realizing that a device with finite dimensions cannot, in itself, possess all possible configurations. The resulting expressions are quite complicated. Therefore, we choose to use the simpler semi-empirical expressions published in [101] and listed below:
\[
\mu_{fc} = \frac{z_\mu}{3.2 \cdot 10^{-9}\, p_\mu}, \tag{4.39}
\]
\[
p_\mu = 0.09 + 9.06 \cdot 10^{-13} \left(z_\mu / |Q_i|\right)^{1/4} N_f, \tag{4.40}
\]
\[
z_\mu = 0.388 / E_{eff} + 1.73 \cdot 10^{-5} / E_{eff}^{1/3}, \tag{4.41}
\]
\[
\mu_C = \frac{1.1 \cdot 10^{21}}{\ln(1 + \gamma_{BH}^2) - \dfrac{\gamma_{BH}^2}{1 + \gamma_{BH}^2}} \, \frac{m_C}{N_A}, \tag{4.42}
\]
\[
\gamma_{BH}^2 = 3.2\, z_\mu / |Q_i|, \tag{4.43}
\]
\[
\mu_{sr} = K_{sr} / E_{eff}^2, \tag{4.44}
\]
where $K_{sr}$ and $m_C$ are proportionality constants and $N_f$ is the fixed oxide-charge density. Figure 4.8a shows a fit of the model to experimental transconductance curves as a function of the gate bias for several values of the bulk bias. The model describes the experimental curves reasonably well. The fit yields $K_{sr} = 1.0 \cdot 10^{15}$ Vs$^{-1}$, $m_C = 0.70$, $N_f = 1.16 \cdot 10^{11}$ cm$^{-2}$ and $\mu_B = 715$ cm$^2$V$^{-1}$s$^{-1}$. The large amount of fixed oxide charge is typical for heavily nitrided gate oxides. The doping concentration close to the oxide-silicon interface was obtained by SIMS and is equal to $N_A = 3.1 \cdot 10^{17}$ cm$^{-3}$. Figure 4.8b shows the overall mobility and the magnitude of its components. The main components are $\mu_B$ and $\mu_{fc}$. At low gate bias, Coulomb scattering starts to play a role, but at high gate bias the inversion layer screens the dopants. In this region surface-roughness scattering becomes important.
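Matthiessen's rule (4.37) combines the components like parallel resistors, so the smallest component dominates the total. The sketch below only illustrates that combination step: the bulk mobility is the fitted value from the text, while the other three numbers are hypothetical placeholders and are not read from figure 4.8b.

```python
def matthiessen(*mobilities):
    """Combine mobility components (cm^2/Vs) according to (4.37)."""
    return 1.0 / sum(1.0 / mu for mu in mobilities)

mu_b = 715.0                                # fitted bulk mobility from the text
mu_fc, mu_c, mu_sr = 600.0, 5000.0, 3000.0  # hypothetical values at one bias point

mu_total = matthiessen(mu_b, mu_fc, mu_c, mu_sr)
print(f"total mobility = {mu_total:.0f} cm^2/Vs")
```

The total is always lower than the smallest individual component, which is why the two lowest terms set the overall level.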
4.2
Mismatch in the drain current
We will now proceed by examining the impact of a mismatch in the threshold voltage and of a mismatch in the current factor on the drain current. In subsection 2.4.1 and in literature it is assumed that the overall mismatch in a parameter ($\Delta P$) can be calculated by averaging out the microscopic mismatch ($\delta P(x, z)$) over the area of the transistor. The impact on the drain current follows from (2.3). This is called the charge-sheet approximation. This approximation is examined by calculating the mismatch in the drain current directly from the microscopic mismatch, using (4.2). We find that the charge-sheet approximation is only valid in strong inversion at low values of the drain bias. Deviations for long-channel devices are calculated in subsection 4.2.1 for the weak inversion regime and in subsection 4.2.2 for the strong inversion regime. Subsection 4.2.3 discusses short- and narrow-channel effects, for which models from literature are applied. Subsection 4.2.4 investigates the differences between the weak and strong inversion regimes that were earlier observed in chapter 2 and chapter 3. By closely examining the averaging effects, we are able to explain most of the differences. Finally, in subsection 4.2.5, the symmetry of the MOSFET is examined, which is closely related to its matching properties.

Figure 4.8: a) Experimental and modeled transconductance as a function of the gate overdrive at three values of the bulk bias. b) Total mobility and its components as a function of the gate overdrive at $V_{BS} = 0$ V. The experimental data was obtained from NMOS transistors with $t_{ox} = 2$ nm. (Panel a plots $g_m/V_{DS}$ in A V$^{-2}$ versus $V_{GS} - V_T$ in V for $V_{BS}$ = 0, −1 and −2 V together with the model. Panel b plots $\mu$ in cm$^2$V$^{-1}$s$^{-1}$ versus $V_{GS} - V_T$ in V with curves for total, Coulomb, surface roughness, bulk, and surface + fixed charge. Annotation: $V_{DS} = 50$ mV.)
4.2.1
Solution of the current equation in weak inversion
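In weak inversion, the mismatch in drain current will mainly be caused by fluctuations in the surface potential $\psi_s = \psi_{s0} + \delta\psi_s(x)$. These are expected to dominate, since $\psi_s$ is the only fluctuating parameter in the exponential of (4.17). When we write $n(x) = \delta f_n(x) \cdot n_0(x)$ and $\delta f_n(x) = e^{q \delta\psi_s(x)/kT}$, it follows from (4.2) and (4.3) that:
\[
\begin{aligned}
J_n &= \frac{1}{L} \int_0^L \left( -q\, \delta f_n(x)\, n_0(x)\, \mu_n \frac{d\,\delta\psi_s(x)}{dx} + kT \mu_n n_0(x) \frac{d\,\delta f_n(x)}{dx} \right) dx \\
&\quad + \frac{1}{L} \int_0^L \left( kT \mu_n\, \delta f_n(x) \frac{d n_0(x)}{dx} \right) dx = \frac{1}{L} \int_0^L \left( kT \mu_n\, \delta f_n(x) \frac{d n_0(x)}{dx} \right) dx.
\end{aligned} \tag{4.45}
\]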
Figure 4.9: a) Schematic drawing of the simulated MOSFETs with a slightly higher doping concentration between $x_h - 5.5$ nm and $x_h + 5.5$ nm. b) Schematic drawing of a MOSFET with a slightly higher threshold voltage, current factor or $\delta$ between $x - \Delta x/2$ and $x + \Delta x/2$. (Labels in panel a: poly-silicon gate, oxide with $t_{ox} = 2.3$ nm, extensions, $N_D = 5 \cdot 10^{19}$, $N_D = 5 \cdot 10^{20}$, $N_A = 1.3 \cdot 10^{18}$ and $N_A = 1.4 \cdot 10^{18}$. Labels in panel b: regions 1, 2 and 3, extensions, $\beta_0$, $V_{T00}$, $\delta_0$ and $+\delta\beta$, $+\delta V_{T0}$, $+\delta\delta$.)

Equation (4.45) has the same shape as:
\[
J_n = \frac{1}{\rho(x)} \frac{dV}{dx}. \tag{4.46}
\]
In other words, the resistivity that the current locally experiences is proportional to $1/\delta f_n(x)$ and the driving force is proportional to the concentration gradient. In the two-dimensional case $dn_0(x)/dx$ does not vary with $x$, i.e. a microscopic difference at the source side of the transistor has the same impact as a microscopic difference in the middle or at the drain side (footnote 7). In order to test this theory, 2D simulations were performed in MEDICI [103] of a MOSFET which has a slightly higher doping concentration between $x_h - 5.5$ nm and $x_h + 5.5$ nm (see figure 4.9a) (footnote 8). To avoid errors due to small differences in the grid, the compared simulations were given exactly the same grid. Figure 4.10a shows that the relative decrease in drain current due to the extra doping is independent of $x_h$ and of the drain bias. It is also observed that the theory of subsection 4.1.1 gives a good prediction of the shift.

Now assume that $\delta\psi_s$ is normally distributed with mean 0 and variance $\sigma^2_{\delta\psi_s}$ and that its spatial distribution is described by a normalized power spectrum $f_{\delta\psi_s}(\omega_r)$. From this, it follows that $\delta f_n$ has a lognormal distribution with mean ($\mu_{\delta f_n}$) and variance ($\sigma^2_{\delta f_n}$) equal to:
\[
\mu_{\delta f_n} = e^{(q/kT)^2 \sigma^2_{\delta\psi_s}/2}, \tag{4.47}
\]
\[
\sigma^2_{\delta f_n} = e^{(q/kT)^2 \sigma^2_{\delta\psi_s}} \left( e^{(q/kT)^2 \sigma^2_{\delta\psi_s}} - 1 \right). \tag{4.48}
\]
Note that when $\sigma_{\delta\psi_s} \ll kT/q$, we can linearize the problem, i.e. $\delta f_n \cong 1 + (q/kT)\delta\psi_s$. In that case (4.47) simplifies to $\mu_{\delta f_n} \cong 1$ and (4.48) simplifies to $\sigma^2_{\delta f_n} \cong (q/kT)^2 \sigma^2_{\delta\psi_s}$. However, in general this approximation will not be valid. The deviations from the ideal linear case will now be investigated.

Footnote 7: This conclusion was also reached in [102].
Footnote 8: At low drain bias, a similar kind of analysis was performed in [74].
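The lognormal moments (4.47) and (4.48) and their linearized limits are easy to compare numerically. The following sketch expresses $\sigma_{\delta\psi_s}$ in units of $kT/q$ and adds a small Monte Carlo cross-check; the sample size and seed are arbitrary choices, not values from the text.

```python
import math
import random

def lognormal_moments(s):
    """Mean and variance of delta_f_n = exp(u), u ~ N(0, s^2), per (4.47)-(4.48).

    s is sigma_delta_psi_s expressed in units of kT/q.
    """
    mean = math.exp(s * s / 2.0)
    var = math.exp(s * s) * (math.exp(s * s) - 1.0)
    return mean, var

def linearized_moments(s):
    """Small-signal limit: delta_f_n ~ 1 + u."""
    return 1.0, s * s

rng = random.Random(0)
for s in (0.1, 1.0, 2.0):
    samples = [math.exp(rng.gauss(0.0, s)) for _ in range(100_000)]
    mc_mean = sum(samples) / len(samples)
    mean, var = lognormal_moments(s)
    print(f"s = {s}: mean {mean:.3f} (MC {mc_mean:.3f}), "
          f"variance {var:.3f} vs linearized {linearized_moments(s)[1]:.3f}")
```

For small $s$ the two variances nearly coincide, while at $s$ of order one the exact variance is already several times the linearized value, which is why the linear approximation breaks down in general.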
Figure 4.10: Simulated and calculated increase in the drain current as a function of the lateral position of a slight increase in the doping concentration as depicted in figure 4.9. a) Weak inversion. b) Strong inversion. (Both panels plot $\Delta I_D/I_D$ in % versus $x_h$ in µm, from source to drain, for $V_{DS}$ = 0.05, 0.10, 0.25, 0.50 and 1.20 V together with the model. Annotations: $V_{GS} = 0.20$ V; $V_{GS} = 1.20$ V with $\Delta x = 120$ nm.)

Figure 4.11: Representations of the MOSFET in weak inversion by resistor networks. a) Extreme situation of a very narrow or very short transistor. b) Both the length and width of the transistor are larger than the correlation length $l_{\delta\psi_s}$. (Labels: in panel a each resistor equals $R' = 1/\delta f_n$; in panel b each resistor equals $R'/2 = 1/(2\delta f_n)$, drawn for $N_W = 5$ and $N_L = 3$ between source and drain.)

We will introduce the following nomenclature: the difference in a parameter $P$ between a device that suffers from microscopic fluctuations and an imaginary device without these fluctuations is denoted as $\Delta' P$. The mismatch between two macroscopically identical devices with only microscopic differences is denoted as $\Delta P$. Furthermore, the correlation length of the mismatch-causing stochastic process $f_{\delta P}(\omega_r)$ is defined as $l_{\delta P} \equiv \sqrt{2\pi f_{\delta P}(0)}$. Consider the two extreme device shapes as depicted in figure 4.11a. In case of a very short device ($L \ll l_{\delta\psi_s}$) the macroscopic conductance is given by the sum of the local conductances, which for a wide enough transistor yields:
\[
\mu_{\Delta' f_n} = \mu_{\delta f_n} - 1, \tag{4.49}
\]
\[
\sigma^2_{\Delta' f_n} = \frac{l_{\delta\psi_s}}{W} \sigma^2_{\delta f_n}. \tag{4.50}
\]
When $\sigma_{\delta\psi_s}$ is a significant fraction of $kT/q$, the average current and the variation increase, which is mainly due to the exponentially higher conductance of local regions with high $\psi_s$. Now consider the other extreme. For a very narrow device ($W \ll l_{\delta\psi_s}$) the macroscopic resistance is given by the sum of the local resistances, which for a long enough transistor yields:
\[
\mu_{\Delta' f_n} = 1/\mu_{\delta f_n} - 1, \tag{4.51}
\]
\[
\sigma^2_{\Delta' f_n} = \frac{l_{\delta\psi_s}}{L} \sigma^2_{1/\delta f_n} = \frac{l_{\delta\psi_s}}{L} \frac{\sigma^2_{\delta f_n}}{\mu^4_{\delta f_n}}. \tag{4.52}
\]
When $\sigma_{\delta\psi_s}$ is a significant fraction of $kT/q$, the average current and the variation decrease, which is mainly due to the exponentially higher resistance of local regions with low $\psi_s$. The relative current mismatch between two transistors is $\Delta I_D/I_D = \Delta f_n/(1 + \mu_{\Delta' f_n})$. From this it follows that $\Delta I_D/I_D$ has mean $\mu_{\Delta I_D/I_D} = 0$ and variance:
\[
\sigma^2_{\Delta I_D/I_D} = \frac{2 \sigma^2_{\Delta' f_n}}{(1 + \mu_{\Delta' f_n})^2}. \tag{4.53}
\]
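The two one-dimensional extremes can be put side by side with a few lines of code. This is a sketch of (4.49) to (4.53) only; `s` is $\sigma_{\delta\psi_s}$ in units of $kT/q$, the geometric ratio $l_{\delta\psi_s}/W$ or $l_{\delta\psi_s}/L$ is passed in directly, and the sample value 0.01 is an arbitrary illustration.

```python
import math

def delta_fn_moments(s):
    """Lognormal mean and variance of delta_f_n, (4.47)-(4.48)."""
    return math.exp(s * s / 2.0), math.exp(s * s) * (math.exp(s * s) - 1.0)

def short_device(s, l_over_w):
    """L << l: parallel conductances, (4.49)-(4.50). Returns (mean shift, variance)."""
    mu, var = delta_fn_moments(s)
    return mu - 1.0, l_over_w * var

def narrow_device(s, l_over_l):
    """W << l: series resistances, (4.51)-(4.52). Returns (mean shift, variance)."""
    mu, var = delta_fn_moments(s)
    return 1.0 / mu - 1.0, l_over_l * var / mu ** 4

def sigma_relative_current(mean_shift, variance):
    """Standard deviation of Delta I_D / I_D between two devices, (4.53)."""
    return math.sqrt(2.0 * variance) / (1.0 + mean_shift)

for s in (0.1, 1.0, 2.0):
    short = short_device(s, 0.01)
    narrow = narrow_device(s, 0.01)
    print(f"s = {s}: short {short[0]:+.3f}, sigma {sigma_relative_current(*short):.4f}; "
          f"narrow {narrow[0]:+.3f}, sigma {sigma_relative_current(*narrow):.4f}")
```

The mean shift is positive for the short device and negative for the narrow one, as stated in the text.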
According to the central limit theorem, $\Delta I_D/I_D$ has a normal distribution for transistors with large enough area. Note that, until now, two-dimensional current flows were neglected. A more realistic representation of the transistor is depicted in figure 4.11b. It consists of $N_W = W/l_{\delta\psi_s}$ segments in the width direction and $N_L = L/l_{\delta\psi_s}$ segments in the length direction. Each segment contains four resistors. In [3] a solution to this problem was found by considering each segment to be either switched on or off. A more accurate quasi-resistance approach was presented in [102]. We will follow a similar approach, but no attempt is made to model the effect analytically. To obtain the current variation, the resistor network of figure 4.11b is simulated. The resistivity of each segment is calculated from a randomly assigned $\delta\psi_s$ that is taken from a normally distributed set. In order to reach high enough accuracy, 1000 microscopically different resistors were included. Results of this exercise are presented in figure 4.12. Figure 4.12a shows the relative increase in current ($\mu_{\Delta' f_n} + 1$) as a function of $\sigma_{\delta\psi_s}/(kT/q)$. Different line shapes represent different areas. The arrow indicates the direction of increasing $N_W$. Thus, for the $N_W N_L = 256$ case (full lines), the bottom line represents ($N_W = 2$, $N_L = 128$) and the top line represents ($N_W = 128$, $N_L = 2$). Figure 4.12b shows the same data as a function of $N_W/N_L$ at $\sigma_{\delta\psi_s}/(kT/q) = 2$. An increase is observed only for short transistors. For longer and wide transistors the current decreases with respect to the case without microscopic fluctuations. Figures 4.12c and 4.12d show the increase in $\sigma_{\Delta I_D/I_D}$ with respect to the results obtained with the linear approximation. At $\sigma_{\delta\psi_s}/(kT/q) = 2$ an increase of 30 % to 50 % is observed, which becomes larger for extreme $N_W/N_L$ ratios.

Note that, although the one-dimensional analysis yields the same result for very short or narrow devices, it seriously overestimates the increase for the intermediate cases. This demonstrates that current tends to flow around regions of high resistivity, while it concentrates in regions of low resistivity. Finally, figures 4.12e and 4.12f plot the correlation between $\Delta' f_n$ and $\Delta' \psi_s$. This parameter is a measure of the correlation between current mismatch in weak and strong inversion. At $\sigma_{\delta\psi_s}/(kT/q) = 2$ its value is seen to be around 0.5 to 0.8, which is significantly smaller than 1. These simulation results can partly explain the differences between weak and strong inversion that were observed in the previous two chapters. Furthermore, it will be found in subsection 4.2.4 that short- and narrow-channel effects also give rise to significant differences.

Figure 4.12: Simulation results of the resistor network of figure 4.11b. Figures a+b show the increase in current, c+d the increase in standard deviation and e+f the correlation of current mismatch with $\Delta' \psi_s$. In figures a+c+e the results are plotted as a function of $\sigma_{\delta\psi_s}/(kT/q)$. The arrow indicates increasing $N_W/N_L$ ratio. Figures b+d+f plot the results as a function of $N_W/N_L$ for $\sigma_{\delta\psi_s}/(kT/q) = 2$. (Curves are shown for $N_W N_L$ = 4, 16, 64 and 256 together with the 1D extremes. The vertical axes are $\mu_{\Delta' f_n} + 1$, the increase in $\sigma_{\Delta I_D/I_D}$, and $\rho(\Delta' f_n, \Delta' \psi_s)$.)
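A resistor network of the kind drawn in figure 4.11b can be solved with plain nodal analysis. The sketch below is not the simulator used for figure 4.12; it assumes that each segment carries four resistors of $1/(2\delta f_n)$ meeting at the segment centre, that source and drain are ideal contacts along the two ends, and that no current crosses the side edges. The solver, function names, sample sizes and seed are all illustrative choices.

```python
import math
import random

def solve_linear(a, b):
    """Dense Gaussian elimination with partial pivoting (fine for small grids)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def relative_current(psi):
    """Network current divided by the current of a fluctuation-free device.

    psi[i][j] is delta_psi_s / (kT/q) of the segment at width index i and
    length index j. Neighbouring segment centres are linked by two
    half-resistors in series.
    """
    nw, nl = len(psi), len(psi[0])
    g = [[math.exp(p) for p in row] for row in psi]   # delta_f_n per segment
    n = nw * nl
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(nw):
        for j in range(nl):
            k = i * nl + j
            for ii, jj in ((i, j + 1), (i + 1, j)):
                if ii < nw and jj < nl:
                    c = 2.0 * g[i][j] * g[ii][jj] / (g[i][j] + g[ii][jj])
                    kk = ii * nl + jj
                    a[k][k] += c
                    a[kk][kk] += c
                    a[k][kk] -= c
                    a[kk][k] -= c
            if j == 0:                 # half-resistor to the source (0 V)
                a[k][k] += 2.0 * g[i][j]
            if j == nl - 1:            # half-resistor to the drain (1 V)
                a[k][k] += 2.0 * g[i][j]
                b[k] += 2.0 * g[i][j]
    v = solve_linear(a, b)
    current = sum(2.0 * g[i][0] * v[i * nl] for i in range(nw))
    return current * nl / nw           # uniform device carries nw / nl

def mean_relative_current(nw, nl, sigma, n_devices=200, seed=1):
    """Monte Carlo estimate of mu_{Delta' f_n} + 1 for an nw x nl network."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_devices):
        psi = [[rng.gauss(0.0, sigma) for _ in range(nl)] for _ in range(nw)]
        total += relative_current(psi)
    return total / n_devices

if __name__ == "__main__":
    for nw, nl in ((2, 8), (4, 4), (8, 2)):
        print(nw, nl, round(mean_relative_current(nw, nl, sigma=2.0), 3))
```

For a single column of segments the network reduces to the arithmetic mean of $\delta f_n$, and for a single row to the harmonic mean, which are the two one-dimensional extremes discussed in the text.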
4.2.2
Solution of the current equation in strong inversion
In strong inversion the mismatch in the drain current is due to mismatch in the current factor, mismatch in the threshold voltage and mismatch in $\delta$ as defined by (4.19). Their contributions will now be calculated. Consider a transistor that, between $x - \Delta x/2$ and $x + \Delta x/2$, has $\beta = \beta_0 + \delta\beta$, $V_{T0} = V_{T00} + \delta V_{T0}$ or $\delta = \delta_0 + \delta\delta$, while no other microscopic fluctuations are present (see figure 4.9b). This divides the transistor into three regions. In each region the potential $V_{CS}$ is given by (4.21). However, the constant $C$ is only equal to 0 in region 1. In the other regions it follows from the continuity of $V_{CS}$. Again, the drain current is found by using $V_{CS}(x = L) = V_{DS}$. To first order this results in (footnote 9):
\[
\left.\frac{\delta I_D}{I_D}\right|_{\delta\beta} = \frac{\Delta x}{L} \frac{\delta\beta}{\beta}, \tag{4.54}
\]
\[
\begin{aligned}
\left.\frac{\delta I_D}{I_D}\right|_{\delta V_{T0}} &= \frac{-\beta}{I_D} \int_{x - \Delta x/2}^{x + \Delta x/2} \delta V_{T0} \frac{dV_{CS}}{dx'}\, dx' \\
&= \frac{-1}{L} \int_{x - \Delta x/2}^{x + \Delta x/2} \frac{\delta V_{T0}\, dx'}{V_{GS} - V_T - (1 + \delta) V_{CS}(x')} \equiv \int_{x - \Delta x/2}^{x + \Delta x/2} \delta V_{T0} \cdot w'_{\delta V_{T0}}(x')\, dx',
\end{aligned} \tag{4.55}
\]
\[
\begin{aligned}
\left.\frac{\delta I_D}{I_D}\right|_{\delta\delta} &= \frac{-\beta (1 + \delta)}{I_D} \int_{x - \Delta x/2}^{x + \Delta x/2} \delta\delta \cdot V_{CS}(x') \frac{dV_{CS}}{dx'}\, dx' \\
&= \frac{-1}{L} \int_{x - \Delta x/2}^{x + \Delta x/2} \frac{V_{CS}(x')\, \delta\delta\, dx'}{V_{GS} - V_T - (1 + \delta) V_{CS}(x')} \equiv \int_{x - \Delta x/2}^{x + \Delta x/2} \delta\delta \cdot w'_{\delta\delta}(x')\, dx',
\end{aligned} \tag{4.56}
\]
where $w'_{\delta V_{T0}}(x)$ and $w'_{\delta\delta}(x)$ represent the sensitivity of $\Delta I_D/I_D$ to $\delta V_{T0}(x)$ and $\delta\delta(x)$, respectively. Outside the interval $0 < x < L$ these functions are set to 0.

Footnote 9: For the mismatch in the threshold voltage, a similar analysis, but using somewhat different mathematics, was published in [41]. It was assumed that $\Delta x \to 0$. This approximation will turn out to be invalid. In parallel to our work, similar results were published in [25]. In this paper a logarithmic deviation from the $1/\sqrt{\text{area}}$ law was derived, which will turn out to be in accordance with our results.
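The lateral dependence of the weight $w'_{\delta V_{T0}}(x)$ in (4.55) can be made concrete with a short numerical sketch. It rests on an assumption that is not spelled out in this section: the long-channel relation $I_D = \beta L\,(V_{GS} - V_T - (1+\delta) V_{CS})\, dV_{CS}/dx$ with $V_{CS}(0) = 0$ and $V_{CS}(L) = V_{DS}$, which is the form implied by the second equality in (4.55). All parameter values are illustrative.

```python
import math

def w_prime_vt0(x, length, v_ov, delta, v_ds):
    """Local sensitivity w'_dVT0(x) = -1 / (L * u(x)), u = V_ov - (1 + delta) V_CS(x).

    Under the assumed long-channel relation, u(x)^2 falls linearly from
    V_ov^2 at the source to u_d^2 at the drain.
    """
    u_d = v_ov - (1.0 + delta) * v_ds
    if u_d <= 0.0:
        raise ValueError("V_DS must stay below V_ov / (1 + delta)")
    u = math.sqrt(v_ov ** 2 - (x / length) * (v_ov ** 2 - u_d ** 2))
    return -1.0 / (length * u)

def integrated_sensitivity(length, v_ov, delta, v_ds, n=20000):
    """Midpoint-rule integral of w'_dVT0 over the channel (uniform dV_T0)."""
    dx = length / n
    return dx * sum(w_prime_vt0((i + 0.5) * dx, length, v_ov, delta, v_ds)
                    for i in range(n))

L_UM, V_OV, DELTA = 1.0, 0.5, 0.2           # illustrative values
for v_ds in (0.001, 0.1, 0.3, 0.4):
    ratio = w_prime_vt0(L_UM, L_UM, V_OV, DELTA, v_ds) / w_prime_vt0(0.0, L_UM, V_OV, DELTA, v_ds)
    total = integrated_sensitivity(L_UM, V_OV, DELTA, v_ds)
    print(f"V_DS = {v_ds} V: drain/source weight = {ratio:.2f}, integral = {total:.3f} 1/V")
```

Under this assumption the integral has the closed form $-2/(V_{ov} + u_d)$, which tends to $-1/(V_{GS} - V_T)$ at low drain bias, while the weight at the drain side grows relative to the source side as $V_{DS}$ approaches saturation.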
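It is seen that the impact of $\delta\beta$ is independent of its lateral position. Therefore, we can apply the analysis of subsection 2.4.1 and it follows that:
\[
\left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta\beta} = \sigma^2_{\Delta\beta/\beta} = \frac{2 l^2_{\delta\beta} \sigma^2_{\delta\beta/\beta}}{WL} \equiv \frac{A^2_{0,\Delta\beta/\beta}}{WL}, \tag{4.57}
\]
where the correlation length $l_{\delta\beta} \equiv \sqrt{2\pi f_{\delta\beta}(0)}$. Now consider $\delta V_{T0}$ and $\delta\delta$ and note that, from (4.14) and (4.19), they are expected to be fully correlated. Using the same kind of approximation as in (4.20) we can write:
\[
\delta V_T = \left(1 + \delta_v \frac{V_{CS}}{\phi_B}\right) \delta V_{T0}, \tag{4.58}
\]
where $\delta_v$ models the sensitivity of $\delta V_T$ to $V_{CS}/\phi_B$. The weighting functions $w_{\delta V_T}(x, \Delta x)$ and $w'_{\delta V_T}(x)$ are now defined as:
\[
w_{\delta V_T}(x, \Delta x) = \int_{x - \Delta x/2}^{x + \Delta x/2} w'_{\delta V_T}(x')\, dx' = \int_{x - \Delta x/2}^{x + \Delta x/2} \left( w'_{\delta V_{T0}}(x') + \frac{\delta_v\, w'_{\delta\delta}(x')}{\phi_B} \right) dx'. \tag{4.59}
\]
Furthermore, it will turn out useful to define the following functions:
\[
k_{V_T} = 1 + \frac{\delta_v}{1 + \delta} \frac{V_{GS} - V_T}{\phi_B}, \tag{4.60}
\]
\[
k_\delta = \frac{\delta_v}{(1 + \delta) \phi_B}. \tag{4.61}
\]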
At high enough values of the drain bias, it follows from (4.55), (4.56) and (4.59) that a lateral dependence is expected of the sensitivity of the drain current to a local fluctuation in threshold voltage. It follows from (4.15) and (4.55) that this lateral dependence is approximately proportional to the lateral dependence of one over the inversion-layer charge, or the local resistivity of the inversion layer. To test (4.59), the simulations described by figure 4.9a are repeated for strong inversion (footnote 10). The mobility was taken as a constant in the simulations to avoid a change in current factor due to the change in doping. The results are plotted in figure 4.10b and are seen to be well described for $\Delta x = 120$ nm. It was assumed that $\delta_v = 0.5$. The value of $\delta V_{T0}$ was calculated from the theory presented in section 4.1.1, but had to be multiplied by 1.1 to give a good fit. The magnitude of $\Delta x$ seems too large to be caused by the introduced disturbance. A possible explanation is that the mismatch itself becomes smaller close to the source and drain regions. This could be caused by e.g. charge sharing. Taking this into consideration, the following weighting function will be assumed:
\[
w'_{\delta V_T,fin}(x, \Delta x_s) =
\begin{cases}
0 & x < 0 \,\vee\, x > L \\
\dfrac{x}{\Delta x_s} \cdot w'_{\delta V_T}(\Delta x_s) & 0 < x < \Delta x_s \\
w'_{\delta V_T}(x) & \Delta x_s < x < L - \Delta x_s \\
\dfrac{L - x}{\Delta x_s} \cdot w'_{\delta V_T}(L - \Delta x_s) & L - \Delta x_s < x < L
\end{cases}, \tag{4.62}
\]
where $\Delta x_s$ is the range over which the extension regions affect $\delta V_T$. A similar fit as depicted in figure 4.10b is obtained when $\Delta x_s = 60$ nm. Generally, it can be assumed that $(V_{GS} - V_{T0})^2 \gg (L/\Delta x_s)(g_{out}/\beta)^2$, as follows from figure 4.3. The height of the peak is then fully determined by $\Delta x_s$, while the output conductance plays no significant role. At low values of the drain bias the impact of a local disturbance of the threshold voltage on the drain current is given by:
\[
\left.\frac{\delta I_D}{I_D}\right|_{\delta V_T} \cong \frac{-\delta V_{T0}}{V_{GS} - V_T} \frac{\Delta x}{L} \cong -\frac{g_m}{I_D} \frac{\delta V_{T0}\, \Delta x}{L}, \tag{4.63}
\]
which is the same result as was obtained in subsection 2.3.2. This shows that, at low drain bias, all equations are linear and averaging effects can be interchanged. The variance of the drain current is calculated as follows (footnote 11):
\[
\left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta V_T} = \frac{2 l_{\delta V_T}}{W} \sigma^2_{\delta V_T} \left[ \rho_{\delta V_T} * w'_{\delta V_T,fin} * w'_{\delta V_T,fin} \right](0) \approx \frac{2 l^2_{\delta V_T}}{W} \sigma^2_{\delta V_T} \left[ w'_{\delta V_T,fin} * w'_{\delta V_T,fin} \right](0), \tag{4.64}
\]
where the autocorrelation function $\rho_{\delta V_T}(x)$ describes the spatial properties of $\delta V_T$. It is equal to the Fourier transform of the normalized power spectrum $f_{\delta V_T}(\omega_r)$. Note that (4.64) is identical to (2.38), but with an adapted geometry function that takes the weight of $\delta V_T$ as a function of the lateral position into account. The last approximate equality in (4.64) is valid when the weighting function does not vary too rapidly over a distance of the correlation length $l_{\delta V_T}$, or, in other words, when $d^2 w'_{\delta V_T,fin}/dx^2 \ll w'_{\delta V_T,fin}(x)/l^2_{\delta V_T}$. This holds at low drain bias, or at high $V_{DS}$ when $\Delta x_s \gg l_{\delta V_T}$ for all values of $x$. At low drain bias (4.64) simplifies into:
\[
\left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta V_T} \cong \frac{1}{(V_{GS} - V_T)^2} \frac{2 l^2_{\delta V_T} \sigma^2_{\delta V_T}}{WL} \cong \left(\frac{g_m}{I_D}\right)^2 \sigma^2_{\Delta V_T}, \tag{4.65}
\]
which, again, is the same result as was obtained in subsection 2.3.2. However, at higher drain bias, assuming $L \gg \Delta x_s \gg l_{\delta V_T}$:
\[
\left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta V_T} \approx \left( \frac{k^2_{V_T} \ln\left(\frac{L}{\Delta x_s}\right)}{(V_{GS} - V_{T0})^2} - \frac{4 k_{V_T} k_\delta}{V_{GS} - V_{T0}} + k^2_\delta \right) \frac{2 l^2_{\delta V_T} \sigma^2_{\delta V_T}}{WL}. \tag{4.66}
\]
Applying the analysis of chapter 2 instead of the analysis presented here, thus using (2.3) on (4.22) with $V_{DSsat} = (V_{GS} - V_{T0})/(1 + \delta)$, yields:
\[
\left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta V_T} = \left( \frac{4 k^2_{V_T}}{(V_{GS} - V_{T0})^2} \right) \frac{2 l^2_{\delta V_T} \sigma^2_{\delta V_T}}{WL}. \tag{4.67}
\]

Footnote 10: A similar simulation was presented in [104], but only at low drain bias. In accordance with our result, no lateral dependence was observed.
Footnote 11: The symbol $*$ denotes the convolution integral: $[f_1 * f_2](x) \equiv \int_{-\infty}^{\infty} f_1(x') \cdot f_2(x - x')\, dx'$.
Figure 4.13: Ratio of $\sigma^2_{\Delta V_T}(V_{DS} = V_{DD})$ and $\sigma^2_{\Delta V_T}(V_{DS} = 50$ mV$)$ as a function of the gate length. (Axes: $\sigma^2(\Delta V_{T,sat})/\sigma^2(\Delta V_{T,lin})$ versus $L$ in µm. Symbols: $t_{ox}$ = 1.5 nm NMOS and PMOS, $t_{ox}$ = 2.0 nm NMOS and PMOS. Lines: long-channel model for $\Delta x_s$ = 5, 8, 16 and 48 nm.)
It is seen that (4.66) can give both smaller and larger results than (4.67) (footnote 12), depending on the length. Figure 4.13 plots the ratio of $\sigma^2_{\Delta V_T}(V_{DS} = V_{DD})$ and $\sigma^2_{\Delta V_T}(V_{DS} = 50$ mV$)$ as a function of the gate length. The measurements were performed on the same technologies as described in subsection 4.1.2. In saturation, threshold-voltage mismatch was extracted from the mismatch in the drain current by applying (2.7) at $V_{GS} = V_T + 0.3$ V. Current-factor mismatch was neglected, which might cause small errors. The change in threshold-voltage mismatch between the two regimes was calculated by dividing the right-hand side of (4.66) by the right-hand side of (4.67). Despite a lot of scatter on the experimental data, the increase of the ratio with length is significant. The long-channel transistors are seen to be reasonably well described by the model. Further justification will be presented in subsection 4.2.5, where the symmetry of the MOSFET will be examined. The magnitude of $\Delta x_s$ is found to be in the order of 5 to 50 nm. The highest $\Delta x_s$ is observed for the $t_{ox} = 2.0$ nm PMOS transistors, which also suffer most severely from short-channel effects (see subsection 4.1.2). For the observed values of $\Delta x_s$, the approximation $\Delta x_s \gg l_{\delta V_T}$ is not expected to be fully valid. Therefore, besides short-channel effects, $\Delta x_s$ is expected to be partly determined by the correlation length of the mismatch-causing stochastic process. The model is seen to be invalid for short devices. This is due to the fact that the equations for $V_{CS}$, inserted in (4.55) and (4.56), are incorrect for these devices. For short devices $dV_{CS}/dx$ becomes independent of the lateral position due to velocity saturation. This implies that we can again safely use the equations presented in chapter 2. However, note that $\sigma_{\Delta V_T}$ itself is expected to vary with $V_{DS}$ when short-channel effects become too severe. Again looking at (4.66) and (4.67), it is seen that at higher gate bias they do not converge to 0, which is due to the non-zero value of $\delta$. This was neglected in subsection 2.3.2 and it explains the correlation between $\Delta V_T$ and $\Delta(1/\zeta_{sat})$ as observed in subsection 2.4.3, table 2.3 and figure 2.15a.

Footnote 12: The difference between (4.66) and (4.67) has a similar origin as the difference between physical mobility and effective mobility, as used in most drain-current models. Generally, the drain current is derived by solving (4.2) and neglecting the bias dependence of the mobility. This bias dependence is only introduced in the solution for the drain current. A physically more correct approach introduces the bias dependence before solving (4.2).
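The model curves described for figure 4.13 follow from dividing the right-hand side of (4.66) by that of (4.67), in which the common factor $2 l^2_{\delta V_T}\sigma^2_{\delta V_T}/WL$ cancels. The sketch below evaluates that ratio. The gate overdrive of 0.3 V and $\delta_v = 0.5$ are taken from the text; the values of $\delta$ and $\phi_B$ are hypothetical placeholders, and $V_{GS} - V_T$ is treated as equal to $V_{GS} - V_{T0}$.

```python
import math

def k_factors(delta_v, delta, v_ov, phi_b):
    """k_VT and k_delta as defined in (4.60) and (4.61)."""
    k_vt = 1.0 + delta_v / (1.0 + delta) * v_ov / phi_b
    k_d = delta_v / ((1.0 + delta) * phi_b)
    return k_vt, k_d

def saturation_ratio(length, dx_s, delta_v, delta, v_ov, phi_b):
    """Right-hand side of (4.66) divided by the right-hand side of (4.67)."""
    k_vt, k_d = k_factors(delta_v, delta, v_ov, phi_b)
    rhs_466 = (k_vt ** 2 * math.log(length / dx_s) / v_ov ** 2
               - 4.0 * k_vt * k_d / v_ov + k_d ** 2)
    rhs_467 = 4.0 * k_vt ** 2 / v_ov ** 2
    return rhs_466 / rhs_467

DELTA, PHI_B = 0.2, 0.9          # hypothetical device parameters
for dx_s in (0.005, 0.016, 0.048):             # micrometres
    ratios = [saturation_ratio(l, dx_s, 0.5, DELTA, 0.3, PHI_B) for l in (1.0, 10.0)]
    print(f"dx_s = {dx_s * 1e3:.0f} nm: L = 1 um -> {ratios[0]:.2f}, L = 10 um -> {ratios[1]:.2f}")
```

The ratio grows logarithmically with $L/\Delta x_s$, so it can end up below or above 1 depending on the length, in line with the remark that (4.66) can give both smaller and larger results than (4.67).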
4.2.3
Short- and narrow-channel effects
As was already observed in section 2.4, deviations from the $\sigma^2_{\Delta P} \propto 1/WL$ law are expected for short or narrow transistors.

Short-channel effects. For short devices deviations can be due to 1) a smaller effective channel length than the metallurgical channel length [37, 38, 51], 2) the increase in surface potential, caused by the proximity of the extension regions [17, 35, 61], 3) the increase in doping level due to the halos [22, 74, 87, 91, 93] (footnote 13) and 4) fluctuations in the short-channel effects themselves [94]. The second effect can cause a decrease in threshold-voltage mismatch, while the other effects increase the mismatch. In this subsection mainly the contribution of the lower effective channel length is investigated. The dependence of MOSFET mismatch on technology-related parameters will be investigated in section 4.3 and chapter 5. To describe the impact of the smaller effective channel length, the following model is tried:
\[
\sigma_{\Delta P} = \frac{A_{0,\Delta P}}{\sqrt{W (L - \Delta L_{\Delta P})}}, \tag{4.68}
\]
where $\Delta L_{\Delta P}$ models the change in channel length. Figures 4.14a+b present fits to experimental data for threshold-voltage mismatch and current-factor mismatch, respectively. The values for $\Delta L_{\Delta P}$ are listed in table 4.1. The NMOS devices are seen to be well described by the model. Their short-channel effects are well controlled, as was earlier observed in subsection 4.1.2. The results for the PMOS transistors are less well fitted. The short devices suffer severely from short-channel effects and $\Delta L_{\Delta P}$ becomes a function of the gate length. When trying to describe short-channel effects, it is better to use (2.41) as opposed to (4.68), to avoid singularities. However, note that (4.68) has a more physical basis. Figure 4.14c plots the ratio of $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$. This ratio is seen to be constant for NMOS devices, which indicates that the main short-channel effect is the reduction in channel length. The difference in the extracted $\Delta L_{\Delta V_T}$ and $\Delta L_{\Delta\beta/\beta}$ is due to scatter on the experimental data. For the PMOS devices, other effects are also seen to play a role. Finally note that in figure 4.13 an increase of threshold-voltage mismatch with increasing drain bias was observed for short transistors. This can be explained by the decrease in effective channel length, as described by (4.8), (4.25) and (4.29) and as reported in literature [51, 80].

Footnote 13: or other structural changes related to short-channel devices.
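The effect of the length correction in (4.68) is easiest to see on the normalized quantity $\sqrt{WL}\,\sigma_{\Delta P}$, which is the quantity plotted in figure 4.14. In the sketch below the values of $A_{0,\Delta P}$ and $\Delta L_{\Delta P}$ are hypothetical and are not the fitted values of table 4.1; the explicit error makes the singularity at $L = \Delta L_{\Delta P}$ visible.

```python
import math

def sigma_mismatch(width, length, a0, delta_l=0.0):
    """Mismatch standard deviation according to (4.68); lengths in micrometres."""
    effective = length - delta_l
    if effective <= 0.0:
        raise ValueError("(4.68) is singular for L <= delta_L")
    return a0 / math.sqrt(width * effective)

A0_VT = 3.0      # hypothetical A_0 for threshold-voltage mismatch (mV um)
DELTA_L = 0.03   # hypothetical length correction (um)
WIDTH = 10.0     # um

for length in (0.1, 0.25, 1.0, 10.0):
    normalized = math.sqrt(WIDTH * length) * sigma_mismatch(WIDTH, length, A0_VT, DELTA_L)
    print(f"L = {length} um: sqrt(WL) * sigma = {normalized:.2f} mV um")
```

For long devices the normalized value approaches $A_{0,\Delta P}$, and it rises as $L$ approaches $\Delta L_{\Delta P}$.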
Figure 4.14: $\sqrt{WL}\,\sigma_{\Delta V_T}$ (a), $\sqrt{WL}\,\sigma_{\Delta\beta/\beta}$ (b) and their ratio (c) as a function of the transistor length. $V_{DS} = 50$ mV. Symbols have the same meaning as in figure 4.13. (Axes: a) $\sqrt{WL}\,\sigma(\Delta V_{T,strong})$ in mV µm, b) $\sqrt{WL}\,\sigma(\Delta\beta/\beta)$ in % µm, c) $\sigma(\Delta V_{T,strong})/\sigma(\Delta\beta/\beta)$ in mV %$^{-1}$, all versus $L$ in µm. Legend: $t_{ox}$ = 1.5 nm NMOS and PMOS, $t_{ox}$ = 2.0 nm NMOS and PMOS, model.)

Narrow-channel effects. To describe the impact of narrow-channel effects on $\sigma_{\Delta P}$, we will make the same approximation as in subsection 4.1.2: the device is assumed to consist of three transistors in parallel, namely one center transistor and two transistors at the side. The transistors at the side can have a different threshold voltage, and $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ are also expected to differ. The overall variation is determined by averaging the fluctuations over the width of the transistor. The increase in current density at the edges needs to be taken into account. In general, this gives:
\[
\sigma^2_{\Delta P} = \frac{\left(\dfrac{dI_{Dmiddle}}{dP}\right)^2 \dfrac{A^2_{0,\Delta P}}{W_{middle} L} + 2 \left(\dfrac{dI_{Dnarrow}}{dP}\right)^2 \dfrac{A^2_{narrow,\Delta P}}{W_{narrow} L}}{\left(\dfrac{dI_{Dmiddle}}{dP} + 2 \dfrac{dI_{Dnarrow}}{dP}\right)^2}, \tag{4.69}
\]
where $I_{Dmiddle}$ and $I_{Dnarrow}$ are the currents flowing in the middle and edge transistors, respectively. The variation at the edge is described by $A^2_{narrow,\Delta P}/W_{narrow} L$. Consider the linear regime as an example. For this regime (4.69) yields:
\[
\sigma^2_{\Delta V_T} = \frac{W_{middle}}{W} \frac{A^2_{0,\Delta V_T}}{WL} + \frac{2 W_{narrow}}{W} \frac{A^2_{narrow,\Delta V_T}}{WL}, \tag{4.70}
\]
\[
\sigma^2_{\Delta\beta/\beta} = \frac{W_{middle}}{W} \left(\frac{V_{GS} - V_{T,lw}}{V_{GS} - V_T(W)}\right)^2 \frac{A^2_{0,\Delta\beta/\beta}}{WL} + \frac{2 W_{narrow}}{W} \left(\frac{V_{GS} - V_{T,narrow}}{V_{GS} - V_T(W)}\right)^2 \frac{A^2_{narrow,\Delta\beta/\beta}}{WL}. \tag{4.71}
\]

Figure 4.15: $\sqrt{WL}\,\sigma_{\Delta V_T}$ (a) and $\sqrt{WL}\,\sigma_{\Delta\beta/\beta}$ (b) as a function of the transistor width. $V_{DS} = 50$ mV. Symbols have the same meaning as in figure 4.14. (Axes: a) $\sqrt{WL}\,\sigma(\Delta V_{T,strong})$ in mV µm, b) $\sqrt{WL}\,\sigma(\Delta\beta/\beta)$ in % µm, both versus $W$ in µm. Annotation: $V_{GS} - V_T \approx 0.25$ V.)
Figure 4.15 shows that these models give a good description of experimental data. Model parameters are listed in table 4.1. For narrow NMOS transistors $\sqrt{WL} \cdot \sigma_{\Delta V_T}$ is seen to be smaller than for wide transistors. This could be explained by a lower doping level at the edge of the transistor, since this would result in reduced doping fluctuations. For the PMOS devices the doping level is more or less constant and STI is seen to increase $\sqrt{WL} \cdot \sigma_{\Delta V_T}$ for narrower transistors. An increase in $\sigma_{\Delta\beta/\beta}$ is also observed for the PMOSFETs, which could be caused by sidewall roughness. This effect is seen to be less prominent for the NMOSFETs.
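The step from the general three-transistor expression (4.69) to the linear-regime result (4.70) can be verified numerically. The sketch assumes $W = W_{middle} + 2W_{narrow}$ and, for the threshold-voltage case in the linear regime, current sensitivities proportional to the width of each sub-transistor. The numerical values are hypothetical and are not the fitted parameters of table 4.1.

```python
def sigma2_three_transistors(s_mid, s_nar, a0, a_nar, w_mid, w_nar, length):
    """General expression (4.69); s_mid and s_nar are dI_D/dP of the sub-transistors."""
    numerator = (s_mid ** 2 * a0 ** 2 / (w_mid * length)
                 + 2.0 * s_nar ** 2 * a_nar ** 2 / (w_nar * length))
    return numerator / (s_mid + 2.0 * s_nar) ** 2

def sigma2_vt_linear(width, length, w_nar, a0, a_nar):
    """Linear-regime threshold-voltage mismatch, (4.70)."""
    w_mid = width - 2.0 * w_nar          # assumes W = W_middle + 2 W_narrow
    return ((w_mid / width) * a0 ** 2 / (width * length)
            + (2.0 * w_nar / width) * a_nar ** 2 / (width * length))

def sigma2_beta_linear(width, length, w_nar, a0, a_nar, ov_lw, ov_nar, ov_w):
    """Linear-regime current-factor mismatch, (4.71); ov_* are V_GS - V_T values."""
    w_mid = width - 2.0 * w_nar
    return ((w_mid / width) * (ov_lw / ov_w) ** 2 * a0 ** 2 / (width * length)
            + (2.0 * w_nar / width) * (ov_nar / ov_w) ** 2 * a_nar ** 2 / (width * length))

W, L, W_NAR = 0.5, 1.0, 0.05             # micrometres, hypothetical
A0, A_NAR = 3.0, 2.0                     # mV um, hypothetical
general = sigma2_three_transistors(W - 2 * W_NAR, W_NAR, A0, A_NAR, W - 2 * W_NAR, W_NAR, L)
print(general, sigma2_vt_linear(W, L, W_NAR, A0, A_NAR))
```

With identical edge and center parameters both linear-regime expressions fall back to the familiar $A^2_0/WL$.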
4.2.4
Comparison of mismatch in weak and strong inversion
The analysis presented in the previous subsection is valid in strong inversion. Deviations in weak inversion will now be discussed. Firstly, consider the effect of halos. It was found in subsection 4.2.2 that the weight attributed to the local fluctuations is inversely proportional to the local value of the inversion-layer charge. From this it follows that:
\[
\sigma^2_{\Delta V_T} = \frac{L \displaystyle\int_0^L \frac{dx}{Q_i(\psi_s(x))^2}}{\left( \displaystyle\int_0^L \frac{dx}{Q_i(\psi_s(x))} \right)^2} \frac{A^2_{0,\Delta V_T}}{WL}. \tag{4.72}
\]
This equation is valid in weak inversion. In strong inversion $\psi_s(x)$ needs to be replaced by:
\[
V_T(x) = V_{T,lw} - (1 + \delta)(\psi_s(x) - \psi_{s0}). \tag{4.73}
\]
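The prefactor in (4.72) equals 1 for a uniform channel and exceeds 1 as soon as the inversion-layer charge varies along the channel. The sketch below illustrates this with a made-up surface-potential profile containing two Gaussian dips near source and drain. The profile, its depth and width, the thermal voltage of 25.85 mV, and the strong-inversion form $Q_i \propto V_{GS} - V_T(x)$ with $V_T(x)$ from (4.73) are all assumptions for illustration; the text itself obtains the surface potential from (4.24).

```python
import math

KT_Q = 0.02585   # V, assumed room-temperature thermal voltage

def profile_factor(q_i):
    """Discrete form of L * int(dx / Q_i^2) / (int(dx / Q_i))^2 on a uniform grid."""
    n = len(q_i)
    s1 = sum(1.0 / q for q in q_i)
    s2 = sum(1.0 / (q * q) for q in q_i)
    return n * s2 / (s1 * s1)

def halo_profile(n, length, depth, width, position):
    """Hypothetical psi_s(x) - psi_s0 (V): Gaussian dips near both channel ends."""
    xs = [(i + 0.5) * length / n for i in range(n)]
    return [-depth * (math.exp(-((x - position) / width) ** 2)
                      + math.exp(-((x - (length - position)) / width) ** 2))
            for x in xs]

def weak_inversion_factor(dpsi):
    """Q_i proportional to exp(psi_s / (kT/q))."""
    return profile_factor([math.exp(p / KT_Q) for p in dpsi])

def strong_inversion_factor(dpsi, v_ov, delta):
    """Q_i proportional to V_GS - V_T(x), with V_T(x) taken from (4.73)."""
    return profile_factor([v_ov + (1.0 + delta) * p for p in dpsi])

for length in (0.2, 1.0, 10.0):          # micrometres
    dpsi = halo_profile(2000, length, depth=0.05, width=0.03, position=0.05)
    ratio = weak_inversion_factor(dpsi) / strong_inversion_factor(dpsi, 0.3, 0.2)
    print(f"L = {length} um: weak/strong variance ratio = {ratio:.2f}")
```

Because of the exponential dependence of the charge on surface potential, the factor is larger in weak inversion than in strong inversion for the same profile.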
Figure 4.16: a+b) Ratio of $\sigma^2_{\Delta V_T}$ in weak inversion and $\sigma^2_{\Delta V_T}$ in strong inversion as a function of the transistor length (a) and width (b). c+d) Correlation between $\Delta V_T$ in weak and strong inversion as a function of the transistor length (c) and width (d). Device widths (a+c) or lengths (b+d) are included in the figures and are given in µm. $V_{DS} = 50$ mV. (Axis labels: $\sigma(\Delta V_{T,weak})/\sigma(\Delta V_{T,strong})$ in panels a and b, $\rho(\Delta V_{T,weak}, \Delta V_{T,strong})$ in panels c and d, versus $L$ or $W$ in µm. Legend: $t_{ox}$ = 1.5 nm NMOS and PMOS, $t_{ox}$ = 2.0 nm NMOS and PMOS, model.)
The surface potential is calculated with (4.24). Note that we have assumed that the fluctuation mechanism does not change with lateral position x. In strong inversion, a halo that is not too strong mainly reduces the short-channel effect. The variation of $\psi_s$ with x is then not too large and can be neglected.¹⁴ In weak inversion, however, halos are expected to play a more significant role, because of the exponential dependence of the inversion-layer charge on the surface potential.

¹⁴ The impact of halos on device mismatch will be investigated experimentally in more detail in section 5.3.

Figure 4.16a compares the calculated ratio of $\sigma^2_{\Delta V_T}$ in weak¹⁵ and strong inversion to experimental data. The parameters from table 4.1 were used in the calculation. For long-channel NMOS devices, an increase in the ratio is expected. This can be explained from figure 4.5b. The halos cause two bumps in the surface-potential profile. For short transistors these bumps overlap and the only difference between weak and strong inversion is in the effective channel length. For increasing length, the bumps appear and gain in relative importance. For very long transistors the impact of the halos is expected to decrease again. For the PMOS transistors the halos were found to be less effective (see figure 4.6a) and it follows from the calculation that no significant increase is expected.

¹⁵ In weak inversion, threshold-voltage mismatch is equal to the mismatch in gate bias.

In figure 4.16a, it is observed that (4.72) underestimates the experimental data. This could be partly related to an inaccurate estimate of the surface-potential profile. However, the width dependence has also been neglected, and it will turn out to explain most of the experimentally observed differences. For the short tox = 1.5 nm PMOS transistors, an increase of the weak-inversion mismatch is also observed. This could be due to the decrease in $L_{eff}$, but it is not observed for the other technologies.

Equation (4.69) will be used to investigate the impact of the narrow-channel effect on the difference between weak and strong inversion. However, first note that the width of the edge transistor can be different in the two regimes, as is illustrated in figure 4.17. When the change in threshold voltage in the width direction (z) is abrupt, $W^{weak}_{narrow} = W^{strong}_{narrow}$. For a non-abrupt change, $W^{weak}_{narrow} < W^{strong}_{narrow}$, due to the exponential dependence of drain current on surface potential. In case of a trapezoidal profile, $W^{weak}_{narrow} \cong (1+\delta)(kT/q)\,W^{strong}_{narrow}/(V_{Tlw} - V_{Tnarrow})$.

[Figure 4.17: schematic with the threshold voltage and the increase in current plotted against z, with the left and right device edges, the width W, the edge widths $W_{narrow}$ (and $2W_{narrow}$), the trapezoidal shape, and the weak- and strong-inversion curves labelled.]

Figure 4.17: Schematic drawing of the threshold voltage (left axis) as a function of position in the width direction (z). On the right axis the increase in drain current for the trapezoidal $V_T$ profile is plotted.

Figure 4.16b shows the calculated and experimental increase of $\sigma^2_{\Delta V_T}$ as a function of the width. Model parameters are again taken from table 4.1. Their values were obtained by a combined fit to the experimental results presented in figures 4.6b, 4.15a and 4.16b. Good agreement between the fits and the experimental data is observed.
However, note that in this case the impact of the halos was neglected. These results can be explained by the same kind of reasoning as before. For narrow transistors, the device consists mainly of the edge transistor and no difference is expected. For wider transistors, the weak-inversion $\sigma^2_{\Delta V_T}$ is still mainly determined by the edge transistors, because of their much larger current density. In other words, the effective width of the device is reduced, which increases the variation. In strong inversion, this effect is much less pronounced. As expected, the PMOS transistors do not suffer from this increase in threshold-voltage variation, since they do not possess a lower threshold voltage at the edge.

Finally, consider figures 4.16c+d, which show the correlation between $\Delta V_T$ in weak and strong inversion as a function of the gate length and width, respectively. As expected, the correlation decreases with increasing width. It seems to be less sensitive to the length, which indicates that the edge effect is the more significant one for the technologies under test. The correlation factor has a maximum value of ∼80 %, which is significantly lower than 100 %. This could be due to the percolation effects that were observed in subsection 4.2.1, but that have been neglected in this subsection. The halo creates a sharp potential peak at the source and drain sides. Local increases in potential can create current paths through this barrier, which decrease its impact.¹⁶ In the edge transistor the current can be blocked by local regions of low potential. Again, this will decrease the impact.

To describe the mixture of all these effects, detailed knowledge of the mismatch-causing stochastic processes is required. Next, the problem has to be translated into a resistor-network problem, taking into account the influence of halos and edge effects. Another option is to use a 3D simulator. However, this approach could be very time consuming.
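The relation between the weak- and strong-inversion edge widths quoted earlier in this subsection for a trapezoidal profile can be motivated by a short calculation. The sketch below is not taken from the source. It assumes that $V_T$ rises linearly from $V_{Tnarrow}$ to $V_{Tlw}$ over the strong-inversion edge width, and that the weak-inversion current density falls off exponentially with the local threshold voltage with slope factor $(1+\delta)$, in line with (4.73).

```latex
% Assumptions (illustrative, not from the source):
%   V_T(z) = V_{Tnarrow} + (V_{Tlw} - V_{Tnarrow}) z / W^{strong}_{narrow},  0 <= z <= W^{strong}_{narrow}
%   weak-inversion current density ~ exp( -(V_T(z) - V_{Tnarrow}) / ((1+\delta) kT/q) )
\begin{align*}
W^{weak}_{narrow}
  &= \int_0^{W^{strong}_{narrow}}
     \exp\!\left(-\frac{(V_{Tlw}-V_{Tnarrow})\,z}{(1+\delta)(kT/q)\,W^{strong}_{narrow}}\right)dz \\
  &= \frac{(1+\delta)(kT/q)\,W^{strong}_{narrow}}{V_{Tlw}-V_{Tnarrow}}
     \left[1-\exp\!\left(-\frac{V_{Tlw}-V_{Tnarrow}}{(1+\delta)(kT/q)}\right)\right] \\
  &\cong \frac{(1+\delta)(kT/q)\,W^{strong}_{narrow}}{V_{Tlw}-V_{Tnarrow}}
     \qquad\text{for } V_{Tlw}-V_{Tnarrow} \gg (1+\delta)\,kT/q .
\end{align*}
```

Under these assumptions the effective weak-inversion edge width is set by the distance over which $V_T$ rises by a few thermal voltages, which is why it is smaller than the strong-inversion width.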
¹⁶ In [93], this effect was observed by comparing 2D and 3D simulations.

4.2.5 Asymmetry of MOSFET mismatch

To end this section, we investigate the asymmetry of MOSFET mismatch. This creates extra insight in the position dependence of the impact of microscopic fluctuations on macroscopic parameters [13, 105]. In addition to the literature, we will demonstrate that asymmetry is also present for long transistors, and that it increases with increasing length. This is directly linked to the deviation from the $1/\sqrt{\mathrm{area}}$ law, as was observed in subsection 4.2.2. The asymmetry in the mismatch of the drain current is defined as:

$$ \mathrm{asymmetry} = \frac{\sigma^2(I_{D2f} - I_{D1f} - I_{D2r} + I_{D1r})}{\sigma^2(I_{D2f} - I_{D1f}) + \sigma^2(I_{D2r} - I_{D1r})}, \tag{4.74} $$

where the subscript 1 or 2 denotes the transistor number, the subscript f means that the transistor is measured with normal source and drain definitions, and the subscript r means reversed source and drain definitions. From the analysis in subsection 4.2.1 no asymmetry is expected for long transistors in weak inversion. For short transistors at higher drain bias, asymmetry in the lateral surface-potential profile (see figure 4.4b) can cause asymmetry in the current. For long-channel transistors in strong inversion, at high enough values of the drain bias, asymmetry in the drain current is expected due to asymmetry in the inversion-layer charge density. From (4.74) and the analysis in subsection 4.2.2 it is expected to be equal to:

$$ \mathrm{asymmetry}(\Delta x_s) = \frac{\int_0^L \left(w'_{\delta V_T,fin}(x, \Delta x_s) - w'_{\delta V_T,fin}(L - x, \Delta x_s)\right)^2 dx}{2L \cdot \sigma^2_{\Delta I_D/I_D}}. \tag{4.75} $$

For short transistors the inversion layer is expected to become less asymmetrical. However, channel-length modulation could cause extra asymmetry.

[Figure 4.18, four panels: asymmetry (%) versus $V_{GS}$ (V) for NMOS transistors at $V_{DS}$ = 10 mV, 100 mV, 0.50 V and 1.20 V. Curves for W/L = 10.0/0.09, 10.0/0.20, 10.0/0.45, 1.0/1.0, 4.0/4.0 and 10.0/7.2. The weak-inversion, saturation and linear regimes are marked in the $V_{DS}$ = 0.50 V panel, and the $V_{DS}$ = 1.20 V panel includes the model curve for 10.0/7.2.]

Figure 4.18: Asymmetry for four different values of the drain bias. The gate length is used as a parameter. The width over length ratio is given in µm/µm.

Figure 4.18 shows experimentally obtained curves of the asymmetry as a function of the gate bias for four different values of the drain bias. The presented results are for the tox = 1.5 nm NMOS transistors, but similar results were obtained for the other cases. In order to determine the asymmetry, we had to measure each transistor separately, instead of using the measurement algorithm described in subsection 2.1.3. Measurement repeatability was checked, and found to be no issue. However, as a side
effect of using a different measurement algorithm, the source current was measured instead of the drain current.

At a drain bias of $V_{DS}$ = 100 mV (figure 4.18b) no significant asymmetry is observed in either weak or strong inversion. In moderate inversion the device becomes asymmetric. In this regime the transistor operates in weak inversion at the drain side, while at the source side it is in saturation. This is a highly asymmetric situation.

At very low drain bias ($V_{DS}$ = 10 mV) and high gate bias, asymmetry was observed for some of the examined pair dimensions. This could be attributed to a measurement issue. One transistor in a module shares its source¹⁷ with many other transistors. Therefore, at high gate bias, the total tunnelling current through the source-gate overlap capacitance becomes significant compared to the low drain current at low drain bias. Drains are connected separately, so the reverse measurement does not suffer from this problem.

Now consider the intermediate drain bias ($V_{DS}$ = 0.50 V) case. At high gate bias, the transistors operate in the linear regime and are seen to be symmetric. Lowering the gate bias moves the transistors into saturation and causes asymmetry, which is most prominent for long-channel transistors. The high asymmetry for the W = L = 4.0 µm devices is not fully understood, but might be caused by inaccuracies due to limited statistics.¹⁸ In weak inversion the transistors are again symmetrical, although signs of asymmetry start to be observed.

Finally consider the high drain bias case ($V_{DS}$ = 1.20 V). In strong inversion, the transistors are fully operating in saturation. For long transistors, the model is seen to give a reasonable description of the experimental data for $\Delta x_s$ = 75 nm. This value is higher than the value found in subsection 4.2.1, which was $\Delta x_s$ = 8 nm.
Note that both the measurements out of figure 4.13 and out of figure 4.18 are not very accurate and that we neglected current-factor mismatch. Furthermore, in the next section it will be found that most mismatch causing mechanisms contain both a threshold-voltage and current-factor component. This is not properly accounted for. However, the predictions of our model are qualitatively verified. Finally consider the short transistors, for which asymmetry due to channel length modulation is observed. In weak inversion, the asymmetry is most prominent for intermediate gate lengths. This can only be explained by the impact of the halos. The drain bias lowers the barrier due to the halo at the drain side, while the halo barrier at the source side remains fully intact. Similar behavior was experimentally observed and simulated in [106].
¹⁷ In the case of normal terminal definitions.
¹⁸ Due to a measurement problem, only 42 device pairs per geometry were measured successfully. The other experimental results presented in this chapter are based on 84 measured device pairs per geometry.
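As an illustration of how the asymmetry measure of (4.74) could be evaluated from forward and reverse measurements, a minimal Python sketch follows. The function name, the argument layout (one current per device pair in each list) and the sample numbers are assumptions made for this example and do not come from the thesis.

```python
import statistics


def asymmetry(id1f, id2f, id1r, id2r):
    """Asymmetry of the drain-current mismatch following (4.74).

    Each argument holds one current per device pair: transistor 1 or 2,
    measured forward (f) or with source and drain reversed (r).
    Returns a fraction; multiply by 100 for a percentage.
    """
    diff_f = [i2 - i1 for i1, i2 in zip(id1f, id2f)]  # I_D2f - I_D1f
    diff_r = [i2 - i1 for i1, i2 in zip(id1r, id2r)]  # I_D2r - I_D1r
    numerator = statistics.variance([f - r for f, r in zip(diff_f, diff_r)])
    denominator = statistics.variance(diff_f) + statistics.variance(diff_r)
    return numerator / denominator


# Hypothetical, perfectly symmetric devices: reversing source and drain
# leaves every current unchanged, so the asymmetry vanishes.
i1 = [1.00, 1.02, 0.98, 1.01]
i2 = [1.01, 1.00, 1.00, 0.99]
print(asymmetry(i1, i2, i1, i2))
```

With this definition, fully correlated forward and reverse mismatch gives 0, while statistically independent forward and reverse mismatch of equal variance gives 1 (100 %).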
4.3 Physical origins of fluctuations
This section describes the fluctuation mechanisms that cause the variability of MOS transistors and calculates their impact. The main origins of the fluctuations are identified as: 1) doping fluctuations in the channel, 2) doping fluctuations in the gate, 3) fluctuations in the oxide charge, and 4) fluctuations in surface-roughness scattering. The magnitude of the doping fluctuations and oxide charge will be modeled in subsection 4.3.1, following the work published in [3]. Our calculation of the magnitude of the fluctuations in surface-roughness scattering is presented in subsection 4.3.5 and is based on the statistical properties of the oxide-silicon interface.

The four fluctuation mechanisms affect transistor operation by influencing: 1) the threshold voltage, 2) the amount of gate depletion, 3) the magnitude of quantum-mechanical effects, and 4) the mobility. The impact of the fluctuation mechanisms on these macroscopic transistor parameters is calculated by using the charge-sheet approximation, i.e. it is again assumed that:

$$ \Delta' N_{dope}(y)\,dy = \frac{dy}{WL}\int_{area} \delta N_{dope}(x, y, z)\,dx\,dz, \tag{4.76} $$

where $N_{dope}\,dy$ is related to the channel or gate doping or any other charge sheet in the transistor.¹⁹ This equation was shown to be valid in strong inversion at low drain bias. Deviations in other regimes and for short- and narrow-channel transistors were discussed in the previous section.

¹⁹ For surface-roughness scattering a similar equation applies.

In order to calculate the variation in a macroscopic parameter P, the fluctuations in the charge sheets need to be averaged out over the depth of the transistor:

$$ \sigma^2_{\Delta P} = \int_{depth} \left(\frac{dP}{dN_{dope}(y)}\right)^2 \sigma^2_{\Delta N_{dope}}(y)\,dy. \tag{4.77} $$

The first factor in the integral models the sensitivity of P to a fluctuation in the doping at depth y. This sensitivity can be determined by e.g. simulations or modeling. Here, the modeling approach is followed, since it creates insight in the origin of the sensitivity. Higher accuracy might be obtained by using simulations.

We will start our calculations by examining one of the best-studied mismatch phenomena, namely the impact of doping fluctuations on threshold-voltage mismatch (subsection 4.3.2). The following subsections deal with gate depletion (subsection 4.3.3), quantum-mechanical effects (subsection 4.3.4), and mobility fluctuations (subsection 4.3.5). Subsection 4.3.6 combines all these effects in one model in order to determine which of them are relevant and to make a comparison to experimental data. The physical content of our models will be tested by examining gate- and bulk-bias dependencies. Finally, in subsection 4.3.7 the results will be discussed. We will find that our calculations provide results that are close to the experimentally observed mismatch, while only two unknown parameters, related to gate depletion and surface-roughness scattering, need to be fitted. Finally, note that the presented equations related to quantum-mechanical effects and mobility are valid for NMOS transistors. For PMOS transistors appropriate changes in proportionality constants are required.
4.3.1 Doping fluctuations
Variation in the number of dopants can be caused by numerous effects. Overall, the probability (p) that one dopant is present in a small volume element (dV) is equal to $p = N_A\,dV$, independent of the presence of dopants in other volume elements. When the volume element is taken small enough, the chance that two dopants are present is negligible. The total number of dopants ($N_{Atot}$) in a certain volume (V) is then Poisson distributed²⁰ with mean $N_{Atot}$ and variance $\sigma^2_{N_{Atot}} = N_{Atot}$. It follows that the doping concentration $N_A = N_{Atot}/V$ has mean $N_A$ and variance $\sigma^2_{N_A} = N_A/V$. In a MOSFET, the average distance between dopants is of the order of 10 nm. This means that in most practical cases the Poisson distribution can be approximated by a normal distribution with the same mean and variance.

The same kind of analysis holds for dopants in the gate. However, note that the variance can be higher due to the poly-grain structure of the gate material. This will be looked into further in subsection 4.3.3 and section 5.2.

Finally, note that besides the implanted channel doping, extra charge sheets ($Q_{cs}$) can exist, due to e.g. interface states, oxide charge or boron penetration. Using the same statistics, one can write $\sigma^2_{Q_{cs}} = q|Q_{cs}|/(WL)$. In the case of boron penetration the variance is again expected to be related more to the randomness of the gate structure than to number fluctuations, and it will be higher.
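To make the Poisson picture concrete, the following sketch evaluates the dopant count, its spread and the average dopant spacing for a box-shaped region. The function name, the box geometry and the example numbers (0.1 µm × 0.1 µm gate area, 35 nm depth, $N_A$ = 1·10¹⁸ cm⁻³) are illustrative assumptions, not values taken from this subsection.

```python
import math


def dopant_statistics(n_a_cm3, width_um, length_um, depth_nm):
    """Poisson statistics of the dopants in a W x L x depth box.

    Returns (mean count, sigma of count, sigma of concentration in cm^-3,
    average dopant spacing in nm).
    """
    volume_cm3 = (width_um * 1e-4) * (length_um * 1e-4) * (depth_nm * 1e-7)
    mean_count = n_a_cm3 * volume_cm3            # mean of the Poisson count
    sigma_count = math.sqrt(mean_count)          # variance equals the mean
    sigma_conc = math.sqrt(n_a_cm3 / volume_cm3)  # sigma^2 = N_A / V
    spacing_nm = (1.0 / n_a_cm3) ** (1.0 / 3.0) * 1e7
    return mean_count, sigma_count, sigma_conc, spacing_nm


mean, sigma, sigma_conc, spacing = dopant_statistics(1e18, 0.1, 0.1, 35.0)
print(f"dopants: {mean:.0f} +/- {sigma:.1f} ({100 * sigma / mean:.1f} %)")
print(f"average spacing: {spacing:.1f} nm")
```

For these assumed numbers the box holds a few hundred dopants with a relative spread of about 5 %, and the average spacing comes out at the ∼10 nm scale mentioned in the text.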
4.3.2 Impact of fluctuations in channel doping on threshold voltage
Doping fluctuations are considered to determine the lower obtainable limit to the variation of MOSFET parameters. Therefore, random dopant effects have been extensively studied in literature [10–12, 14, 51, 74, 76–80, 82, 83, 85–94, 104, 107]. To describe the impact of doping fluctuations, the same approach as in [10, 11, 75] will be followed. Furthermore, we will estimate the correlation length related to threshold-voltage fluctuations due to random dopants. Finally, the impact of the doping profile is examined.

²⁰ In [14] it was shown that clustering of dopants increases the variance by the average number of dopants that are clustered together. We will assume that no clustering takes place.

Table 4.2: Technology parameters out of the ITRS roadmap that are used in the calculations presented in figure 4.19.

high performance
  tox,eff (nm)    2.3        2.0        1.9        1.4        1.0
  Lgate (nm)      65         45         32         25         13
  NA (cm⁻³)       1.5·10¹⁸   2.5·10¹⁸   2.5·10¹⁸   5.0·10¹⁸   9.0·10¹⁸

low power
  tox,eff (nm)    3.0        2.6        2.2        1.7        1.4
  Lgate (nm)      90         65         45         32         16
  NA (cm⁻³)       1.5·10¹⁸   2.5·10¹⁸   2.5·10¹⁸   5.0·10¹⁸   9.0·10¹⁸

We will now calculate the impact of doping fluctuations on the gate potential for fixed surface potential. In strong inversion this equals the mismatch in threshold voltage. Consider a charge sheet with thickness dy at a distance y from the interface that has a doping concentration that is $\delta N_A(y)$ higher than the average $N_A$. It follows from (4.6) that this increase results in such a decrease in depletion-layer width that at its edge an amount of charge equal to $(y/W_D)\,\delta N_A(y)\,dy$ is covered. In total, the shift in gate bias ($\delta V_{GS}$) due to $\delta N_A\,dy$ equals:

$$ \delta V_{GS} = \frac{q\,t_{ox}}{\epsilon_{ox}}\left(1 - \frac{y}{W_D}\right)\delta N_A\,dy. \tag{4.78} $$

From this the variance of $\Delta' V_T$ ($\sigma^2_{\Delta' V_T}$) follows directly:

$$ \sigma^2_{\Delta' V_T} = \frac{q^2 t_{ox}^2}{WL\,\epsilon_{ox}^2}\int_0^{W_D}\left(1 - \frac{y}{W_D}\right)^2 N_A(y)\,dy = \frac{t_{ox}^2\sqrt{2q^3\epsilon_{si}N_A\psi_s}}{3WL\,\epsilon_{ox}^2}. \tag{4.79} $$
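For a uniform profile, the closed form on the right-hand side of (4.79) can be checked in two steps. The sketch below assumes the usual depletion-approximation width $W_D = \sqrt{2\epsilon_{si}\psi_s/(qN_A)}$; that expression is not repeated in this section, so it is an assumption here, although it is consistent with the stated result.

```latex
% Assumption: uniform N_A and W_D = sqrt(2 \epsilon_{si} \psi_s / (q N_A)).
\begin{align*}
\int_0^{W_D}\left(1-\frac{y}{W_D}\right)^2 N_A\,dy &= \frac{N_A W_D}{3}, \\
q^2 N_A W_D &= q^2 N_A \sqrt{\frac{2\epsilon_{si}\psi_s}{qN_A}}
             = \sqrt{2q^3\epsilon_{si}N_A\psi_s}, \\
\Rightarrow\quad \sigma^2_{\Delta' V_T}
  &= \frac{q^2 t_{ox}^2}{WL\,\epsilon_{ox}^2}\cdot\frac{N_A W_D}{3}
   = \frac{t_{ox}^2\sqrt{2q^3\epsilon_{si}N_A\psi_s}}{3WL\,\epsilon_{ox}^2}.
\end{align*}
```

The factor 1/3 reflects that dopants near the depletion-layer edge contribute much less to the threshold voltage than dopants near the interface.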
The last equality holds for uniform doping profiles. Note that this equation was derived considering only one device and that $\sigma^2_{\Delta V_T} = 2\sigma^2_{\Delta' V_{GS}}$.

Besides the implanted channel doping, extra charge sheets ($Q_{cs}$) can exist, due to e.g. interface states, oxide charge or boron penetration. From (4.78), it follows directly that:

$$ \sigma^2_{\Delta' V_T} = \frac{q|Q_{cs}|\,t_{ox}^2}{WL\,\epsilon_{ox}^2}\left(1 - \frac{y_{cs}}{W_D}\right)^2, \tag{4.80} $$

where $y_{cs}$ is the depth of the charge sheet. For modern-day heavily-nitrided gate oxides, the fixed oxide-charge density can be as high as $N_f$ = 2·10¹¹ cm⁻². Although not dominant, its contribution cannot be neglected.²¹ Note that this situation might worsen once high-k dielectrics are introduced.

²¹ The simulations presented in [84] showed that the effects of fluctuations in the fixed oxide charge can be neglected. However, they considered $N_f$ = 2·10¹⁰ cm⁻², which is low for heavily-nitrided gate oxides.

Using (4.79), figure 4.19 shows the calculated threshold-voltage fluctuations as a function of the effective oxide thickness for transistors with minimum dimensions ($WL = 3L_{gate}^2$) and for transistors with W = L = 1.0 µm. Technology parameters are taken out of the ITRS roadmap [108], and listed in table 4.2. For devices with a constant area, the variation lessens for each technology generation due to the decrease in oxide thickness. Clearly, this is advantageous for analog operation. However, it is also seen that the variation of the minimum device increases when technologies are scaled down.²² Since some modern-day digital circuits can contain an enormous number of transistors (> 10⁹), the requirements on $\sigma_{\Delta V_T}$ are quite stringent. From figure 4.19 it is clear that parameter variations are starting to play an important role in determining the design rules for digital circuits (see e.g. [7, 8]). Note again that doping variations give the lower limit to parameter fluctuations. In practice the fluctuations will be larger.

²² In reality the variations are even slightly higher since $L_{eff} < L_{gate}$.

[Figure 4.19: $\sigma(\Delta' V_T)$ (mV, left axis) and $A_{0,\Delta V_T}$ (mV µm, right axis) versus effective tox (nm), for the high-performance and low-power technologies.]

Figure 4.19: Threshold-voltage fluctuations as a function of the effective oxide thickness for transistors with minimum dimensions ($WL = 3L_{gate}^2$) (open symbols, left axis) and for transistors with W = L = 1.0 µm (filled symbols, right axis). Technology parameters are listed in table 4.2.

We will proceed by estimating the correlation length $l_{\delta\psi_s}$ and variance $\sigma^2_{\delta\psi_s}$, as defined in subsection 4.2.1. The impact on the surface potential ($V_{q_p}(y, r)$) of a point charge ($q_p$) at a distance y from the interface is approximated by:

$$ V_{q_p}(y, r) = \frac{q_p}{4\pi\epsilon_{si}}\left(\frac{1}{\sqrt{y^2 + r^2}} - \frac{1}{\sqrt{(y + 2(\epsilon_{si}/\epsilon_{ox})t_{ox})^2 + r^2}}\right). \tag{4.81} $$

The first term inside the brackets is due to the charge itself, the second term is due to its mirror charge in the gate. Assume that a dopant at depth y influences the surface potential over an area $l'^2_{\delta\psi_s}(y)$ of:

$$ l'^2_{\delta\psi_s}(y) = \frac{1}{V_{q_p}(y, 0)}\int_0^\infty 2\pi r\,V_{q_p}(y, r)\,dr = 2\pi y\left(y + 2(\epsilon_{si}/\epsilon_{ox})t_{ox}\right). \tag{4.82} $$

As a rough estimate for $l_{\delta\psi_s}$ one can now use:

$$ l^2_{\delta\psi_s} \sim \frac{\int_0^{W_D}\left(1 - \frac{y}{W_D}\right)^2 N_A(y)\cdot l'^2_{\delta\psi_s}(y)\,dy}{\int_0^{W_D}\left(1 - \frac{y}{W_D}\right)^2 N_A(y)\,dy} = \frac{\pi}{5}W_D^2 + \frac{\pi\epsilon_{si}}{\epsilon_{ox}}W_D\,t_{ox}. \tag{4.83} $$
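As a numerical illustration of (4.83) for a uniform profile, the sketch below evaluates the correlation length and, by putting W = L = $l_{\delta\psi_s}$ in the uniform-doping form of (4.79), the local spread in surface potential for $N_A$ = 1·10¹⁸ cm⁻³ and tox = 2.0 nm. The surface potential is not taken from (4.24) here. It is assumed to equal $2\phi_F$ with $n_i$ ≈ 10¹⁰ cm⁻³ and kT/q ≈ 25.85 mV, and the depletion width uses the standard depletion approximation, so the output only indicates the order of magnitude.

```python
import math

Q = 1.602176634e-19          # elementary charge (C)
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
EPS_SI = 11.7 * EPS0
EPS_OX = 3.9 * EPS0
KT_Q = 0.02585               # thermal voltage near 300 K (V), assumed
N_I = 1.0e16                 # intrinsic density (m^-3), assumed


def depletion_width(n_a, psi_s):
    """Depletion approximation for uniform doping (assumption)."""
    return math.sqrt(2.0 * EPS_SI * psi_s / (Q * n_a))


def correlation_length(w_d, t_ox):
    """Right-hand side of (4.83), valid for a uniform profile."""
    return math.sqrt(math.pi / 5.0 * w_d ** 2
                     + math.pi * (EPS_SI / EPS_OX) * w_d * t_ox)


def sigma_single_device(n_a, w_d, t_ox, width, length):
    """Uniform-doping form of (4.79): q^2 tox^2 N_A W_D / (3 W L eps_ox^2)."""
    return math.sqrt(Q ** 2 * t_ox ** 2 * n_a * w_d
                     / (3.0 * width * length * EPS_OX ** 2))


n_a = 1.0e24                                  # 1e18 cm^-3 in m^-3
t_ox = 2.0e-9
psi_s = 2.0 * KT_Q * math.log(n_a / N_I)      # assumed 2*phi_F
w_d = depletion_width(n_a, psi_s)
l_corr = correlation_length(w_d, t_ox)
sigma = sigma_single_device(n_a, w_d, t_ox, l_corr, l_corr)
print(f"W_D = {w_d * 1e9:.1f} nm, l = {l_corr * 1e9:.1f} nm, "
      f"sigma = {sigma * 1e3:.1f} mV")
```

With these assumptions the correlation length comes out at a few tens of nanometres and the spread at roughly the thermal voltage, the same order as the ∼36 nm and ∼28 mV quoted in the text; the remaining difference depends on the value used for $\psi_s$.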
The last equality is valid for a uniform doping profile. The variance $\sigma^2_{\delta\psi_s}$ is estimated by using (4.79) and putting W = L = $l_{\delta\psi_s}$. When $N_A$ = 1·10¹⁸ cm⁻³ and tox = 2.0 nm, this gives $l_{\delta\psi_s}$ ∼ 36 nm and $\sigma_{\delta\psi_s}$ ∼ 28 mV ≈ kT/q. In other words, the magnitude of the local variation in the surface potential is comparable to the thermal voltage. This means that to accurately predict the mismatch in weak inversion, 3D analysis is required, as was derived in subsection 4.2.1. For a device to fully operate in strong inversion, the gate overdrive needs to be significantly larger than $\sigma_{\delta\psi_s}$ ($V_{GS} - V_T \gtrsim 3\sigma_{\delta\psi_s}$). This is the case at the commonly used bias condition of $V_{GS} - V_T$ = 150 mV.

[Figure 4.20: $N_A$ (cm⁻³, logarithmic axis) versus depth y (nm) for boron and arsenic, with Gaussian fits. The poly-gate, the oxide-silicon interface and the silicon substrate are marked.]

Figure 4.20: SIMS profiles of the boron and arsenic concentration for a 0.13 µm CMOS technology. The high concentrations of dopant atoms in the gate are caused by the halo implantations.

In reality, the doping profile is not uniform. Figure 4.20 shows SIMS profiles of the doping concentration in the NMOS and PMOS transistors of a 0.13 µm technology with tox = 2.0 nm,²³ $L_{nominal}$ = 130 nm and $|V_{DD}|$ = 1.5 V. These profiles are seen to be well described by Gaussian peaks:

$$ N_A(y) = N_{A0}\cdot\exp\!\left[-\left(\frac{y - D_{N_A}}{W_{N_A}}\right)^2\right], \tag{4.84} $$
where $N_{A0}$ is the peak concentration, $D_{N_A}$ the peak position and $W_{N_A}$ the width of the peak. Knowing the profile, $W_D$, $A_{0,\Delta V_T}$, $l_{\delta V_T}$ and $\sigma_{\delta V_T}$ can be calculated using (4.7), (4.79) and (4.83). Extracted and calculated values for all parameters are listed in table 4.3. Also listed are the values obtained by assuming a uniform doping profile. The uniform doping concentration is chosen such that the depletion-layer charge at threshold equals that of the non-uniform case. Depending on the sharpness of the peak, these values are about 20 % to 35 % smaller than those obtained from the body coefficient, which is extracted by varying the bulk bias.

²³ This value is the physical oxide thickness. The effective (or electrical) oxide thickness is $t_{ox,eff}$ ≅ 2.7 nm. This value takes into account gate depletion and quantum-mechanical effects. These phenomena will be studied in more detail in subsections 4.3.3 and 4.3.4, respectively.

Table 4.3: Parameters related to doping fluctuations, calculated from the arsenic and boron SIMS profiles out of figure 4.20. These parameters are compared to calculations that assume uniform doping profiles. For the Gaussian profiles the listed concentration is the peak value.

                      Gaussian profile           uniform profile
                      boron       arsenic        boron       arsenic
  NA (cm⁻³)           6.78·10¹⁷   6.25·10¹⁷      4.07·10¹⁷   3.63·10¹⁷
  DNA (nm)            96          75             -           -
  WNA (nm)            103         72             -           -
  WD (nm)             52.3        53.3           56.5        59.8
  A0,ΔVT (mV µm)      1.39        1.30           1.55        1.51
  lδVT (nm)           59.7        63.0           58.7        61.5
  σδVT (mV)           16.5        14.6           18.7        17.3

In the table, it is observed that the differences between using the correct doping profiles and assuming a uniform doping concentration are not very large. For the Gaussian profiles the values for $A_{0,\Delta V_T}$ and $\sigma_{\delta V_T}$ are slightly lower and $l_{\delta V_T}$ is somewhat larger. However, using extreme retrograde doping profiles could significantly lower the fluctuations, since most of the doping is then moved away from the oxide-silicon interface. This improvement follows directly from (4.79) and has been demonstrated by experiment [10] and simulations [74, 76, 79, 80, 83, 107].
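The integral in (4.79) is easy to evaluate numerically for an arbitrary profile. The sketch below does this for the boron entries of table 4.3, once with the Gaussian profile of (4.84) and once with the equivalent uniform profile. Two choices are assumptions of this example and are not stated explicitly in the text: the effective oxide thickness of 2.7 nm is used for tox, and the single-device result is multiplied by √2 to obtain the pair quantity $A_{0,\Delta V_T}$. The function names are hypothetical.

```python
import math

Q = 1.602176634e-19               # elementary charge (C)
EPS_OX = 3.9 * 8.8541878128e-12   # oxide permittivity (F/m)
T_OX_EFF = 2.7e-9                 # assumed: effective oxide thickness (m)


def a0_delta_vt(profile, w_d, t_ox=T_OX_EFF, steps=2000):
    """sqrt(W L) * sigma(Delta V_T) of a device pair, in V*m, from (4.79).

    profile(y) returns N_A in m^-3 at depth y (m); w_d is the depletion width.
    """
    h = w_d / steps
    total = 0.0
    for i in range(steps + 1):                      # trapezoidal rule
        y = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (1.0 - y / w_d) ** 2 * profile(y)
    integral = total * h
    sigma_single = Q * t_ox / EPS_OX * math.sqrt(integral)
    return math.sqrt(2.0) * sigma_single            # assumed pair factor


def gaussian(n_a0, d_na, w_na):
    """Gaussian profile of (4.84)."""
    return lambda y: n_a0 * math.exp(-((y - d_na) / w_na) ** 2)


def uniform(n_a):
    return lambda y: n_a


# Boron values from table 4.3, converted to SI units.
a0_gauss = a0_delta_vt(gaussian(6.78e23, 96e-9, 103e-9), 52.3e-9)
a0_unif = a0_delta_vt(uniform(4.07e23), 56.5e-9)
# 1 V*m equals 1e9 mV*um.
print(f"Gaussian: {a0_gauss * 1e9:.2f} mV um, uniform: {a0_unif * 1e9:.2f} mV um")
```

Under these assumptions the results land close to the 1.39 mV µm and 1.55 mV µm listed in table 4.3, which suggests, but does not prove, that the table was computed in this way.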
4.3.3 Gate depletion
This subsection investigates the impact of gate depletion on MOSFET parameter fluctuations. It was shown in [21, 109–112] that gate depletion can severely degrade the matching performance. In [109–112] models are presented to describe the effect. However, the contribution of the inversion layer charge was not taken into account. The simulations of [109, 111] were performed in the weak inversion regime, while we are interested in strong inversion. Actually, in strong inversion matters are somewhat simplified by not making this approximation. Doping fluctuations cause an equal, but opposite change in QD and Qi , i.e. their contribution can be neglected. Gate depletion then causes parameter fluctuations through two mechanisms, namely 1) the increase in oxide thickness itself, and 2) the variation in this increase. The increase in oxide thickness was modeled in subsection 4.1.3. The increase in threshold voltage mismatch it causes is calculated by replacing tox by tox + tGD in (4.79) and (4.80). Note that tGD is a function of the gate bias. Therefore, the increase both affects threshold-voltage and current-factor mismatch in the simple model, developed in chapter 2 and used in chapter 3 to extract parameters. Microscopic fluctuations in tGD itself can be caused by fluctuations in tox and NP . In modern-day CMOS technologies the oxide thickness is very well controlled. Therefore,
it is not expected to give a significant contribution.²⁴ It follows from (4.34) that:

$$ \Delta t_{GD} = \frac{-1}{t_{ox} + 2t_{GD}}\,\frac{\epsilon_{ox}^2\,(V_{GS} - \phi_{MS} - \phi_B)}{\epsilon_{si}\,q N_P}\,\frac{\Delta N_P}{N_P}. \tag{4.85} $$

Modeling $\sigma_{\Delta N_P/N_P}$ is quite complicated. In general we can write:

$$ \sigma^2_{\Delta N_P/N_P} = \frac{2\epsilon_{ox}}{\epsilon_{si}\,WL\,t_{GD}\,N_P} + \frac{A^2_{0,\Delta N_P/N_P,poly.str.}}{WL}. \tag{4.86} $$

The first term on the right-hand side is related to the number fluctuations of $N_P$ and provides the lower boundary to the variation. For $N_P$ = 5·10¹⁹ cm⁻³, tox = 1.5 nm and a gate bias at which $t_{GD}$ = 0.75 nm, it is equal to (0.4 %µm)². The second term models the additional variation in $N_P$ related to the stochastic nature of the poly-silicon gate material. In subsection 4.3.6 it will be found that $A_{0,\Delta N_P/N_P,poly.str.}$ ≈ 2.6 %µm for the technology under consideration. $A_{0,\Delta N_P/N_P,poly.str.}$ is expected to decrease with decreasing grain size and is a function of e.g. the implantation conditions of the poly doping and the subsequent annealing steps. The impact of this variation on the drain current follows from (4.57) and (4.65):

$$ \left.\frac{\Delta I_D}{I_D}\right|_{\Delta t_{GD}} = -\left(\frac{1}{t_{ox} + t_{GD}} + \frac{|Q_D|}{\epsilon_{ox}(V_{GS} - V_T)}\right)\Delta t_{GD}. \tag{4.87} $$

The terms in between the brackets are related to the current-factor and threshold-voltage dependence on $t_{GD}$, respectively. As an example we fill in the same parameter values as earlier and $N_A$ = 1·10¹⁸ cm⁻³. By only taking number fluctuations in $N_P$ into account, the first term of (4.87) gives a contribution of 0.10 %µm. The contribution of the second term is 0.40 mVµm. However, their importance increases for significant $A_{0,\Delta N_P/N_P,poly.str.}$. As mentioned earlier, $t_{GD}$ is a function of the gate bias. To end this subsection, note that effects related to gate depletion are expected to disappear once metal gates are introduced, as is planned for the 45 nm technology node.
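The numbers quoted for the number-fluctuation limit can be reproduced approximately with a few lines of code. The sketch below evaluates the first term of (4.86) and then the current-factor term of (4.87) via (4.85). The gate-bias term $V_{GS} - \phi_{MS} - \phi_B$ = 1.2 V is an assumption borrowed from the value quoted in subsection 4.3.4, since no value is given here. The function names are hypothetical.

```python
import math

Q = 1.602176634e-19          # elementary charge (C)
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
EPS_SI = 11.7 * EPS0
EPS_OX = 3.9 * EPS0


def sigma_rel_np_number(n_p, t_gd, area):
    """First term of (4.86): sigma of dN_P/N_P from number fluctuations only."""
    return math.sqrt(2.0 * EPS_OX / (EPS_SI * area * t_gd * n_p))


def dtgd_per_rel_np(t_ox, t_gd, n_p, v_gate):
    """(4.85): change in t_GD (m) per unit relative change dN_P/N_P.

    v_gate stands for V_GS - phi_MS - phi_B.
    """
    return -(EPS_OX ** 2) * v_gate / ((t_ox + 2.0 * t_gd) * EPS_SI * Q * n_p)


n_p = 5e25        # 5e19 cm^-3 expressed in m^-3
t_ox = 1.5e-9
t_gd = 0.75e-9
area = 1e-12      # W*L = 1 um^2, so the results read directly in "% um"
v_gate = 1.2      # V; assumed, taken from the value quoted in subsection 4.3.4

rel_np = sigma_rel_np_number(n_p, t_gd, area)
sigma_t_gd = abs(dtgd_per_rel_np(t_ox, t_gd, n_p, v_gate)) * rel_np
current_factor_term = sigma_t_gd / (t_ox + t_gd)   # first term of (4.87)

print(f"sigma(dN_P/N_P)     = {100 * rel_np:.2f} % um")
print(f"current-factor term = {100 * current_factor_term:.2f} % um")
```

With these inputs the two results are close to the 0.4 %µm and 0.10 %µm mentioned in the text. The threshold-voltage term of (4.87) would additionally require $|Q_D|$, which depends on the surface potential and is therefore left out of this sketch.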
4.3.4 Quantum-mechanical effects
To describe quantum-mechanical effects on threshold-voltage fluctuations, three-dimensional simulations are usually applied [78, 82, 83]. The use of a one-dimensional approach was validated by simulations in [89]. In [24], a more analytical approach is followed, but numerical solving was required to obtain final results. In [82], 3D simulation results are compared to calculations which only take the quantum-mechanical increase in oxide thickness into account. In this subsection simple analytical expressions are obtained by extending the analysis of subsection 4.1.4 to take parameter fluctuations into account. Quantum-mechanical effects result in an increase in surface potential and an increase in oxide thickness. These effects will be dealt with separately.

²⁴ This conclusion will be further justified in subsection 4.3.5. However, note that it might change once high-k dielectrics are introduced.

Increase in surface potential. The quantum-mechanical increase in surface potential (see (4.35)) enhances threshold-voltage fluctuations through two mechanisms: 1) it increases the depletion-layer width, which results in extra fluctuations in the depletion-layer charge (see subsection 4.3.1), and 2) the increase in $\psi_s$ itself is proportional to $Q_D$ and will therefore vary from transistor to transistor.

From (4.6), the quantum-mechanical increase in depletion-layer width ($\Delta W_D^{QM}$) is calculated to be:

$$ \Delta W_D^{QM} \cong \frac{\epsilon_{si}\,\Delta\psi_s^{QM}}{q N_A(W_D)\,W_D}. \tag{4.88} $$

The increase in $\sigma^2_{\Delta V_T}$ due to variations in $Q_D$ follows immediately by replacing $W_D$ with $W_D + \Delta W_D^{QM}$ in (4.79) and (4.80).

Now look at the threshold-voltage variation due to the dependence of $\Delta\psi_s^{QM}$ on $Q_D$. As in subsection 4.3.2, consider a charge sheet with thickness dy at a distance y from the interface that has a doping concentration that is $\delta N_A(y)$ higher than the average $N_A$. The related decrease in depletion-layer width is smaller than classically expected, because of the quantum-mechanical increase in surface potential. Using (4.35) and (4.79), we can write:

$$ \sigma^2_{\Delta' V_T} = \frac{q^2\left(t_{ox} + (\epsilon_{ox}/\epsilon_{si})\,d_{QM}\right)^2}{WL\,\epsilon_{ox}^2}\int_0^{W_D^{QM}}\left(1 - \frac{y - d_{QM}}{W_D^{QM} - d_{QM}}\right)^2 N_A(y)\,dy, \tag{4.89} $$
$$ d_{QM} = \frac{2\,\epsilon_{si}^{1/3} B_{QM}}{3|Q_D|^{1/3}} - \frac{\epsilon_{si}\,kT}{q|Q_D|}. \tag{4.90} $$
Figure 4.21 compares the calculated increase in $\sigma^2_{\Delta V_T}$ ($= 2\sigma^2_{\Delta' V_{GS}}$) due to quantum-mechanical effects with the results obtained in [82] by 3D atomistic simulations. Also shown is the calculated increase in $\sigma^2_{\Delta V_T}$ when a quantum-mechanical increase in oxide thickness of 0.37 nm is assumed, as was done in [82]. It is observed that both models give a reasonably accurate description of the simulation results. In case of significant gate depletion, tox in (4.89) needs to be replaced by $t_{ox} + t_{GD}$. It follows that this somewhat reduces the relative increase due to quantum-mechanical effects.
Increase in oxide thickness. The increase in oxide thickness due to the non-zero peak location of the electron concentration is given by (4.36). Note again that this increase only affects the current factor. The threshold-voltage shift is fully modeled by the increase in surface potential. It was found that $t_{QM} \propto E_{eff}^{-1/3}$. It follows that $\Delta t_{QM}/t_{QM} = -\Delta E_{eff}/(3E_{eff})$. The mismatch in the drain current is then given by:

$$ \frac{\Delta I_D}{I_D} = \frac{-\Delta t_{QM}}{t_{ox} + t_{GD} + t_{QM}} = \frac{t_{QM}}{t_{ox} + t_{GD} + t_{QM}}\,\frac{\Delta E_{eff}}{3E_{eff}}. \tag{4.91} $$
[Figure 4.21: increase in $\sigma^2(\Delta V_T)$ (-) versus tox (nm); 3D simulations, our model, and a shift in tox of 0.37 nm.]

Figure 4.21: Increase in $\sigma^2_{\Delta V_T}$ due to quantum-mechanical effects. Compared are data obtained from 3D atomistic simulations (symbols) (manually extracted from [82]), our model (solid line) and the calculated results assuming an increase in oxide thickness of 0.37 nm (dashed line). W = L = 50 nm and $N_A$ = 5·10¹⁸ cm⁻³.
The mismatch in the effective field is calculated to be

$$ \frac{\Delta E_{eff}}{E_{eff}} = \left.\frac{\Delta E_{eff}}{E_{eff}}\right|_{\Delta Q_D} + \left.\frac{\Delta E_{eff}}{E_{eff}}\right|_{\Delta t_{GD}} = \frac{(1-\eta)\,\Delta Q_D}{|Q_D + \eta Q_i|} - \frac{\eta\,\epsilon_{ox}(V_{GS} - \phi_{MS} - \phi_B)\,\Delta t_{GD}}{|Q_D + \eta Q_i|\,(t_{ox} + t_{GD} + t_{QM})^2}. \tag{4.92} $$

The first term on the right-hand side is related to doping fluctuations. This effect is introduced in (4.89) by replacing $(t_{ox} + (\epsilon_{ox}/\epsilon_{si})d_{QM})$ with $(t_{ox} + (\epsilon_{ox}/\epsilon_{si})d_{QM} + t_{QM Q_D})$, where:

$$ t_{QM Q_D} = -\frac{(1-\eta)\,t_{QM}\,\epsilon_{ox}(V_{GS} - V_T)}{3\,(t_{ox} + t_{GD} + t_{QM})\,|Q_D + \eta Q_i|}. \tag{4.93} $$

At $N_A$ = 1·10¹⁸ cm⁻³, $t_{GD}$ = 0.75 nm and $V_{GS} - V_T$ = 0.9 V, $t_{QM Q_D}$ = −0.07 nm, i.e. the effect can safely be neglected. The second term on the right-hand side of (4.92) is related to fluctuations in gate depletion. This results in an extra term $+1/t_{QM GD}$ inside the brackets of (4.87). The thickness $t_{QM GD}$ is given by:

$$ t_{QM GD} = \frac{3\,|Q_D + \eta Q_i|\,(t_{ox} + t_{GD} + t_{QM})^3}{\eta\,\epsilon_{ox}\,t_{QM}\,(V_{GS} - \phi_{MS} - \phi_B)}. \tag{4.94} $$

Filling in the same numbers and $V_{GS} - \phi_{MS} - \phi_B$ = 1.2 V yields $1/t_{QM GD}$ ≈ 1/80 nm⁻¹, which is small compared to $1/(t_{ox} + t_{GD})$.
Finally note, that quantummechanical effects were not included in our treatment of gate-depletion in the previous subsection. To include them, in (4.85) and (4.87) tox needs to be replaced by tox + tQM .
4.3.5 Mobility fluctuations
Variation in the current factor can be due to variation in width and length (this will be discussed in chapter 6), variation in oxide capacitance (see subsection 4.3.3) and variation in mobility. In this subsection mobility fluctuations will be discussed. Based on subsection 4.1.5, the responsible mechanisms are separated into fluctuations due to: 1) scattering to fixed oxide charges ($\Delta\mu_{fc}/\mu_{fc}$), 2) Coulomb scattering ($\Delta\mu_C/\mu_C$), and 3) surface-roughness scattering ($\Delta\mu_{sr}/\mu_{sr}$). From (4.22) and (4.37), the overall mismatch in drain current due to mobility mismatch is equal to:

$$\left.\frac{\Delta I_D}{I_D}\right|_{\Delta\mu} = \frac{\Delta\mu}{\mu} = \frac{\mu}{\mu_{fc}}\frac{\Delta\mu_{fc}}{\mu_{fc}} + \frac{\mu}{\mu_C}\frac{\Delta\mu_C}{\mu_C} + \frac{\mu}{\mu_{sr}}\frac{\Delta\mu_{sr}}{\mu_{sr}}. \tag{4.95}$$

Furthermore, extra fluctuations can arise from the variation in effective field. Also, fluctuations in the inversion-layer charge affect mobility through the screening terms in (4.40) and (4.43). We will proceed by modeling the different components.$^{25}$

Fluctuations in effective field. To calculate the impact of fluctuations in effective field on mobility, we write $\mu \propto E_{eff}^{\,n}$. From this it follows by definition that:

$$n \equiv \frac{\partial \ln(\mu)}{\partial \ln(E_{eff})}. \tag{4.96}$$
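The definition (4.96) can be illustrated with a short sketch that estimates the exponent $n$ as a logarithmic slope and uses it for a first-order mobility mismatch, as in (4.97). The power-law mobility, its prefactor `mu0`, the reference field `e0` and the 1 % field mismatch are hypothetical choices made only for this example.

```python
import math

# Sketch: estimate n = d ln(mu) / d ln(E_eff) numerically, cf. (4.96).
# The mobility model below is a hypothetical power law, not the model of this work.

def mobility(e_eff, mu0=300.0, e0=0.5, n=-1.0 / 3.0):
    """Hypothetical mobility mu = mu0 * (E_eff / e0)**n (arbitrary units)."""
    return mu0 * (e_eff / e0) ** n


def log_slope(f, x, rel_step=1e-4):
    """Central-difference estimate of d ln f / d ln x at x."""
    hi, lo = x * (1.0 + rel_step), x * (1.0 - rel_step)
    return (math.log(f(hi)) - math.log(f(lo))) / (math.log(hi) - math.log(lo))


n_est = log_slope(mobility, 0.8)
rel_de = 0.01                     # assumed 1 % mismatch in the effective field
rel_dmu = n_est * rel_de          # first-order mobility mismatch n * dE/E

print(f"estimated n = {n_est:.4f}")
print(f"dmu/mu for dE/E = 1 %: {100 * rel_dmu:.3f} %")
```

For a measured or simulated mobility curve the same `log_slope` helper could be applied at each bias point, which would expose the bias dependence of $n$ that the text assumes to be weak.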
The magnitude of $n$ lies in between 0 and $-2$, depending on the dominant scattering mechanisms, and it is a function of the applied bias conditions. When we assume that this bias dependence is not too strong, it follows that:

$$\left.\frac{\Delta\mu}{\mu}\right|_{\Delta E_{eff}} \cong n\,\frac{\Delta E_{eff}}{E_{eff}}. \tag{4.97}$$

The mismatch in the effective field is given by (4.92). In analogy with the definitions of $t_{QM\,QD}$ and $t_{QM\,GD}$ in the previous subsection, the parameters $t_{\mu QD}$ and $t_{\mu GD}$ are defined as:

$$t_{\mu QD} = -n\,\frac{(1-\eta)\,\epsilon_{ox}\,(V_{GS}-V_T)}{|Q_D+\eta Q_i|}, \tag{4.98}$$

$$t_{\mu GD} = \frac{1}{n}\,\frac{|Q_D+\eta Q_i|\,(t_{ox}+t_{GD}+t_{QM})^2}{\eta\,\epsilon_{ox}\,(V_{GS}-\phi_{MS}-\phi_B)}. \tag{4.99}$$

25 A similar approach to the modeling of mobility fluctuations was presented in [65], but a somewhat different model was used as base. We distinguish ourselves at several points: 1) scattering to fixed oxide charges is not neglected, since the oxide charge density of modern-day heavily nitrided gate oxides can be very high, 2) for Coulomb scattering, we take into account that the charge only scatters to a limited part of the channel doping, 3) we do not neglect fluctuations in screening by the inversion layer and 4) our model for fluctuations due to surface-roughness scattering is directly related to the physical properties of the oxide-silicon interface.
Filling in the same parameter values as earlier and $n = -1/3$ yields$^{26}$ $t_{\mu QD} = 0.44$ nm and $t_{\mu GD} = -12$ nm. Both effects cannot be neglected and are introduced in (4.89) and (4.87) in the same way as $t_{QM\,QD}$ and $t_{QM\,GD}$.

Scattering to fixed oxide charges. Fluctuations in the fixed oxide charge were calculated in subsection 4.3.1. Introducing them in (4.39) to (4.41) gives:

$$\sigma^2_{\Delta\mu_{fc}/\mu_{fc}} = \frac{8.21 \cdot 10^{-25}\,(z_\mu/|Q_i|)^{1/2}}{p_\mu^2}\,\frac{2}{N_f\,WL}. \tag{4.100}$$
These fluctuations are fully correlated with the threshold voltage fluctuations calculated with (4.80).

Coulomb scattering. Fluctuations in channel doping cause fluctuations in the Coulomb-scattering-limited mobility ($\mu_C$). According to (4.77), to estimate the magnitude, we need to know the sensitivity of the mobility to a variation in the doping concentration as a function of the depth. We assume that the range over which dopants impact the mobility is equal to the inversion-layer thickness $z_\mu \approx 5-10$ nm, which is supported by the simulation results presented in [88]. It follows that:

$$\sigma^2_{\Delta\mu_C/\mu_C|\Delta N_A} \cong \frac{2}{WL\int_0^{z_\mu} N_A(y)\,dy}. \tag{4.101}$$
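Equation (4.101) is straightforward to evaluate for a uniform doping profile. The sketch below does so for $N_A = 1 \cdot 10^{18}$ cm$^{-3}$ and a few values of $z_\mu$ in the 5 to 10 nm range mentioned above. The uniform profile and the specific value $z_\mu = 6.2$ nm are assumptions made for illustration; the function name is hypothetical.

```python
import math

# Sketch: area-normalised mismatch A0 = sigma * sqrt(W*L) of the
# Coulomb-limited mobility from (4.101), assuming a UNIFORM doping profile.

def a0_coulomb_percent_um(n_a_cm3, z_mu_nm):
    """Return A0 of sigma(dmu_C/mu_C) in %*um for uniform doping N_A."""
    n_a_um3 = n_a_cm3 * 1e-12                       # cm^-3 -> um^-3
    dopants_per_um2 = n_a_um3 * z_mu_nm * 1e-3      # integral of N_A over 0..z_mu
    return 100.0 * math.sqrt(2.0 / dopants_per_um2)


for z_mu in (5.0, 6.2, 10.0):                       # nm, assumed values
    a0 = a0_coulomb_percent_um(1e18, z_mu)
    print(f"z_mu = {z_mu:4.1f} nm -> A0 = {a0:.2f} %um")
```

Under these assumptions the result ranges from about 2.0 %µm ($z_\mu = 5$ nm) to about 1.4 %µm ($z_\mu = 10$ nm), and $z_\mu \approx 6$ nm gives approximately 1.8 %µm.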
At a doping concentration of $N_A = 1 \cdot 10^{18}$ cm$^{-3}$, this results in $A_{0,\Delta\mu_C/\mu_C|\Delta N_A} \cong 1.8$ %µm. Note that remote impurity scattering is not taken into account in our formulation. More accurate expressions can be obtained by applying the theory of [98, 99, 113], as was mentioned earlier. The mismatch $\Delta\mu_C/\mu_C|_{\Delta N_A}$ is correlated with other equations related to doping fluctuations. Combining (4.89) and (4.101), the correlation is found to be:

$$\rho\!\left(\left.\frac{\Delta\mu_C}{\mu_C}\right|_{\Delta N_A},\,\Delta Q_D\right) \cong \frac{-\int_0^{z_\mu}\left(1-\frac{y-d_{QM}}{W_D^{QM}-d_{QM}}\right)N_A(y)\,dy}{\sqrt{\int_0^{z_\mu}N_A(y)\,dy\,\cdot\int_0^{W_D^{QM}}\left(1-\frac{y-d_{QM}}{W_D^{QM}-d_{QM}}\right)^2 N_A(y)\,dy}}. \tag{4.102}$$

For $N_A = 1 \cdot 10^{18}$ cm$^{-3}$ this equals $\rho \sim -50$ %.

Screening by the inversion layer. Besides fluctuations in doping, variations in $\mu_{fc}$ and $\mu_C$ are caused by fluctuations in the inversion-layer charge through the screening terms in (4.39) to (4.43). It follows that:

$$\left.\frac{\Delta\mu_{fc}}{\mu_{fc}}\right|_{\Delta Q_i} = \frac{2.27 \cdot 10^{-13}\,(z_\mu/|Q_i|)^{1/4}\,N_f}{p_\mu}\,\frac{\Delta Q_i}{Q_i}, \tag{4.103}$$

26 The value of $n = -1/3$ is related to phonon scattering.
$$\left.\frac{\Delta\mu_C}{\mu_C}\right|_{\Delta Q_i} = \frac{\gamma_{BH}^4}{(1+\gamma_{BH}^2)^2\left(\ln(1+\gamma_{BH}^2)-\frac{\gamma_{BH}^2}{1+\gamma_{BH}^2}\right)}\,\frac{\Delta Q_i}{Q_i}. \tag{4.104}$$
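To get a feeling for the magnitude of the prefactor in (4.104), the following sketch simply evaluates it for several values of $\gamma_{BH}$. The function name and the chosen values of $\gamma_{BH}$ are hypothetical; which values are relevant in practice depends on the mobility model of subsection 4.1.5, which is not reproduced here.

```python
import math

# Sketch: evaluate the prefactor of dQ_i/Q_i in (4.104) as a function of gamma_BH.

def coulomb_screening_prefactor(gamma_bh):
    """gamma^4 / ((1 + gamma^2)^2 * (ln(1 + gamma^2) - gamma^2 / (1 + gamma^2)))."""
    g2 = gamma_bh ** 2
    bracket = math.log1p(g2) - g2 / (1.0 + g2)   # log1p keeps accuracy for small g2
    return g2 ** 2 / ((1.0 + g2) ** 2 * bracket)


for gamma in (0.01, 0.3, 1.0, 3.0, 10.0):        # illustrative values only
    value = coulomb_screening_prefactor(gamma)
    print(f"gamma_BH = {gamma:5.2f} -> prefactor = {value:.3f}")
```

The prefactor approaches 2 for small $\gamma_{BH}$ and decreases monotonically for larger values, so the sensitivity of $\mu_C$ to the inversion-layer charge is largest for small $\gamma_{BH}$.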
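Note that $\Delta Q_i/Q_i$ is equal to $\Delta I_D/I_D$ as calculated in the previous subsections, i.e. without taking mobility fluctuations into account. Combining this with (4.95) gives:

$$\left.\frac{\Delta I_D}{I_D}\right|_{\Delta Q_i} = \left(1 + \frac{\mu}{\mu_{fc}}\left.\frac{\Delta\mu_{fc}}{\mu_{fc}}\right|_{\Delta Q_i}\frac{Q_i}{\Delta Q_i} + \frac{\mu}{\mu_C}\left.\frac{\Delta\mu_C}{\mu_C}\right|_{\Delta Q_i}\frac{Q_i}{\Delta Q_i}\right)\frac{\Delta Q_i}{Q_i}. \tag{4.105}$$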
The factor inside the brackets has values in the range of 1.0-1.2, depending on the applied bias conditions.

Surface-roughness scattering. To calculate the effect of surface-roughness scattering, we need to go a bit deeper into the model that was presented in subsection 4.1.5. Generally, the roughness of the surface is described by a first-order autoregressive process with an autocovariance function $R(r)$ [100]:

$$R(r) = \Delta^2 e^{-r/L_\Delta}, \tag{4.106}$$
where $\Delta \sim 0.2-0.4$ nm represents the magnitude of the roughness and $L_\Delta \sim 1-3$ nm its correlation length. This $L_\Delta$ can be related to the correlation length ($l_{\delta\mu_{sr}}$) as defined in subsection 4.2.1. Since the power-density function is the Fourier transform of the autocovariance function:

$$l^2_{\delta\mu_{sr}} = \int_0^\infty 2\pi r\,e^{-r/L_\Delta}\,dr = 2\pi L_\Delta^2. \tag{4.107}$$
It is assumed that $\Delta^2$ represents the variance of a process with a Gaussian distribution. In a device with area $WL$, $\Delta^2$ is determined by $WL/l^2_{\delta\mu_{sr}}$ independent events. This results in $\sigma^2_{\Delta^2}/\Delta^4 = 2l^2_{\delta\mu_{sr}}/WL$ and:

$$\sigma^2_{\Delta\mu_{sr}/\mu_{sr}} = \frac{8\pi L_\Delta^2}{WL}. \tag{4.108}$$

For $L_\Delta = 2$ nm this gives $A_{0,\Delta\mu_{sr}/\mu_{sr}} = 1.0$ %µm. The expected linear dependence of the standard deviation on $L_\Delta$, within reasonable limits, also followed from the simulations published in [81, 114]. The quantum-mechanical effects included in those simulations have been neglected in our model. Note that from (4.106) we can now also calculate the variation in the oxide thickness itself. This is given by $A_{0,\Delta t_{ox}} = \sqrt{4\pi}\,\Delta \cdot L_\Delta = 0.7-4 \cdot 10^{-3}$ nm µm. As was concluded in subsection 4.3.3, this can be neglected. However, with further down-scaling of the oxide thickness it might become an issue, and it would therefore be an interesting topic for future study.
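The two numerical values quoted above follow directly from (4.108) and from the expression for $A_{0,\Delta t_{ox}}$. The short sketch below reproduces them; the function names are hypothetical, and the inputs are the roughness amplitude and correlation length ranges given in the text.

```python
import math

# Sketch: evaluate the surface-roughness related matching constants.

def a0_mu_sr_percent_um(l_delta_nm):
    """A0 of sigma(dmu_sr/mu_sr) in %*um from (4.108): sqrt(8*pi) * L_delta."""
    return 100.0 * math.sqrt(8.0 * math.pi) * l_delta_nm * 1e-3   # nm -> um


def a0_tox_nm_um(delta_nm, l_delta_nm):
    """A0 of sigma(dt_ox) in nm*um: sqrt(4*pi) * Delta * L_delta."""
    return math.sqrt(4.0 * math.pi) * delta_nm * l_delta_nm * 1e-3


print(f"A0(dmu_sr/mu_sr), L_delta = 2 nm : {a0_mu_sr_percent_um(2.0):.2f} %um")
print(f"A0(dt_ox), Delta = 0.2 nm, L_delta = 1 nm: {a0_tox_nm_um(0.2, 1.0):.2e} nm um")
print(f"A0(dt_ox), Delta = 0.4 nm, L_delta = 3 nm: {a0_tox_nm_um(0.4, 3.0):.2e} nm um")
```

Because both constants scale linearly with $L_\Delta$, the choice of correlation length directly sets the predicted magnitude of the surface-roughness contribution.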
4.3.6 Combination of all effects and comparison with experiments
In this subsection, the developed theory will be compared to experimental data. The technology under consideration has a nominal gate length of 0.13 µm, a physical oxide thickness of 2.0 nm, and a supply voltage of 1.5 V. Measured device pairs have $W = L = 1.0$ µm and are n-type. The drain bias is set to 50 mV. The channel doping profile is depicted in figure 4.20. It will be shown in section 5.3 that halos can be implanted through the gate and thus affect long-channel transistor behavior. This seriously complicates matters. Therefore, the transistors to which our models are experimentally compared did not receive a halo implantation. The magnitude of the different mobility components and the amount of fixed oxide charge were determined in subsection 4.1.5. The doping concentration in the gate is assumed to be $5 \cdot 10^{19}$ cm$^{-3}$. Only two unknown parameters are left, namely the increase in gate-doping fluctuations due to the poly-silicon structure of the gate ($A_{0,\Delta N_P/N_P,poly.str.}$) and the correlation length of the surface roughness ($L_\Delta$). The latter is set to its minimal value of $L_\Delta = 1$ nm, which resulted in the best description of the experimental data. $A_{0,\Delta N_P/N_P,poly.str.}$ is obtained from a fit to this data. In this subsection, we choose not to represent our data as $\Delta I_D/I_D$, but as:

$$\Delta U_T \equiv -(V_{GS}-V_T)\,\frac{\Delta I_D}{I_D}, \tag{4.109}$$
which is a measure of the difference in gate voltage at constant drain current, as follows from the following approximation:

$$\frac{\Delta I_D}{I_D} \cong \frac{-\Delta V_T}{V_{GS}-V_T} + \frac{\Delta\beta}{\beta}. \tag{4.110}$$

Based on this, the mismatch in the current factor is defined as

$$\frac{\Delta\beta}{\beta} \equiv \frac{-d\Delta U_T}{dV_{GS}}, \tag{4.111}$$
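The definitions (4.109) to (4.111) can be illustrated with a small sketch. For a hypothetical device pair with a constant threshold-voltage mismatch of 3 mV and a constant current-factor mismatch of 1 % (both values are assumptions chosen for this example), $\Delta U_T$ is a straight line in the gate overdrive, and its slope and intercept return the two mismatch parameters.

```python
# Sketch: relation between dI_D/I_D, dU_T, dV_T and dbeta/beta, cf. (4.109)-(4.111).
# The mismatch values are hypothetical and assumed to be bias independent.

def delta_id_rel(v_ov, d_vt, d_beta_rel):
    """Relative drain-current mismatch according to (4.110)."""
    return -d_vt / v_ov + d_beta_rel


def delta_ut(v_ov, d_vt, d_beta_rel):
    """Delta U_T according to (4.109), with v_ov = V_GS - V_T."""
    return -v_ov * delta_id_rel(v_ov, d_vt, d_beta_rel)


d_vt, d_beta_rel = 3.0e-3, 0.01                   # 3 mV and 1 % (assumed)
v_ov = [0.2 + 0.1 * k for k in range(9)]          # gate overdrive 0.2 V ... 1.0 V
u_t = [delta_ut(v, d_vt, d_beta_rel) for v in v_ov]

# Slope of dU_T versus overdrive equals -dbeta/beta, cf. (4.111);
# extrapolating to zero overdrive returns dV_T.
slope = (u_t[-1] - u_t[0]) / (v_ov[-1] - v_ov[0])
intercept = u_t[0] - slope * v_ov[0]

print(f"recovered dbeta/beta = {-slope * 100:.2f} %")
print(f"recovered dV_T       = {intercept * 1e3:.2f} mV")
```

With measured data the mismatch parameters are bias dependent, so the slope would be taken locally (as a tangent) rather than over the whole overdrive range as done here.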
which is a measure of the mismatch in the transconductance. Analyzing ∆UT instead of ∆ID /ID has the advantage of avoiding the singularity at VGS = VT . It is furthermore easier to relate a plot of ∆UT versus VGS to a mismatch in the threshold voltage and a mismatch in the current factor. It follows from (4.110) that ∆UT = ∆VT − (VGS − VT )∆β/β. In other words, when the tangent is taken at an overdrive voltage Vov , its slope represents the mismatch in the current factor times −1 and the intercept with the y-axis is equal to the mismatch in the threshold voltage. The gate overdrive usually lies in the range of Vov = 0.15 − 0.4 V. Figure 4.22 shows the experimental and modeled values of σ∆UT and σ∆β/β as a function of the gate overdrive. The fit is performed on the σ∆UT − (VGS − VT ) curve. It is found that A0,∆NP /NP ,poly.str. = 2.6 %µm, which is a reasonable value. Based on this, it is seen that the magnitude of the σ∆β/β − (VGS − VT ) curve is somewhat underestimated. The experimental curve displays a lot of scatter at higher values for the gate
bias. Therefore, no claims can be made about the functional shape. Measurement repeatability is plotted in figure 4.23. It is observed that the measurements of $\Delta U_T$ are almost fully repeatable, but that the repeatability drops slightly at higher gate bias. This is related to the switching of measurement range, as was earlier observed in subsection 3.3.2. The effect on the measurement of $\Delta\beta/\beta$ is much larger and it has been compensated for in figure 4.22.

Figure 4.22: Modeled and experimental $\sigma_{\Delta U_T}$ (mV) and $\sigma_{\Delta\beta/\beta}$ (%) as a function of the gate overdrive $V_{GS} - V_T$ (V). Also shown are the contributions of the different fluctuation mechanisms that cause the mismatch (channel doping, fixed charge, gate doping, Coulomb scattering and surface-roughness scattering). The parameters $\Delta U_T$ and $\Delta\beta/\beta$ are defined in (4.109) and (4.111), respectively.

Figure 4.23: Measurement repeatability ($\rho_{repeat}$) of $\Delta U_T$ (a) and $\Delta\beta/\beta$ (b) as function of the gate overdrive. $V_{BS} = 0$ V. (Legend entries in the plots: $V_{BS} = 0$ V, $-1$ V and $-2$ V.)

Let us return to figure 4.22. Also plotted are the contributions of the different components. These components originate from uncorrelated physical mechanisms and the modeled total is calculated by adding them quadratically. It is observed that all components play a role. For $\sigma_{\Delta U_T}$, at low gate overdrive, fluctuations in channel doping dominate. At higher gate bias, fluctuations in gate doping play a more prominent role. For $\sigma_{\Delta\beta/\beta}$, fluctuations in gate doping affect the mismatch over the whole bias range. At low gate overdrive, fluctuations in channel doping and Coulomb scattering are also observed.

Most of the components in figure 4.22 consist of several subcomponents, as derived in the previous subsections. Figure 4.24 shows how the major components are built up. Subcomponents add up linearly, because they are caused by the same physical mechanism. In case of $\Delta U_T$, it is observed that about 50 % of the total doping fluctuations is explained by the calculations presented in subsections 4.3.2 and 4.3.4. Furthermore, a big portion is caused by Coulomb scattering. The fluctuations due to Coulomb scattering were split up in a part that is fully correlated with $\Delta Q_D$ (as depicted in figure 4.24a) and a fully uncorrelated part (as depicted in figure 4.22). In case of $\Delta\beta/\beta$ (see figure 4.24b), besides the effects calculated in subsections 4.3.2 and 4.3.4, all effects play a role. At low gate bias, Coulomb scattering is the dominating subcomponent. In figures 4.24c and 4.24d it is observed that the fluctuations in gate doping are partly compensated by the fluctuations in effective field. A somewhat thicker oxide reduces the inversion-layer charge, but it increases the mobility.

To further test the theory, the results of figure 4.22 are extrapolated to lower values of the bulk bias, as is displayed in figure 4.25. For $\sigma_{\Delta U_T}$, the bulk bias dependence is well predicted at low values of the gate overdrive. However, the prediction is incorrect at higher $V_{GS} - V_T$. This might indicate that (4.86) is too simple to accurately describe fluctuations in gate doping. E.g. one can imagine that $A_{0,\Delta N_P/N_P,poly.str.}$ decreases with increasing $t_{GD}$. Another possibility is that our mobility model is not accurate enough.
The bulk bias dependence of $\sigma_{\Delta\beta/\beta}$ is hidden by measurement noise. Finally, the correlation of $\Delta U_T$ with itself at different bias conditions is examined. This correlation shows to what extent $\Delta U_T$ is determined by the same physical mechanisms when the bias conditions are varied. Figure 4.26a plots the correlation of $\Delta U_T$ at $V_{GS} - V_T = 0.25$ V with $\Delta U_T$ at other values of the gate bias. This correlation is seen to decrease with increasing gate bias, which is reasonably well predicted by our models. Figure 4.26b shows the correlation of $\Delta U_T$ at $V_{BS} = 0$ V with $\Delta U_T$ at lower values of the bulk bias. The decrease in correlation with decreasing bulk bias is underestimated by our model, but the decrease itself is observed to be small.
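The way contributions are combined in this subsection (linear addition of fully correlated subcomponents, quadratic addition of uncorrelated mechanisms) can be summarized in a short sketch. All numbers are hypothetical $\sigma_{\Delta U_T}$ contributions in mV, chosen only to show the bookkeeping; they are not read from the figures.

```python
import math

# Sketch: combining mismatch contributions.
# Subcomponents of ONE mechanism are fully correlated -> add linearly (with sign).
# Different mechanisms are uncorrelated -> add quadratically (root-sum-square).

def combine_subcomponents(signed_sigmas):
    """Linear sum of fully correlated subcomponents (signs matter)."""
    return sum(signed_sigmas)


def combine_mechanisms(sigmas):
    """Root-sum-square of uncorrelated mechanisms."""
    return math.sqrt(sum(s * s for s in sigmas))


# Hypothetical values in mV.
channel = combine_subcomponents([1.6, 0.3, 0.9])   # e.g. doping, depletion, correlated Coulomb part
gate = combine_subcomponents([2.5, -0.6])          # gate doping, partly compensated by effective field
fixed_charge = 0.8

total = combine_mechanisms([channel, gate, fixed_charge])
print(f"channel = {channel:.2f} mV, gate = {gate:.2f} mV, total = {total:.2f} mV")
```

A negative subcomponent, as used for the effective-field term in the gate-doping group, represents partial compensation within one mechanism; this cannot occur in the quadratic sum over uncorrelated mechanisms.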
Figure 4.24: Calculation of major components out of figure 4.22 and their subcomponents, plotted as $\sigma_{\Delta U_T}$ (mV) or $\sigma_{\Delta\beta/\beta}$ (%) versus $V_{GS} - V_T$ (V). Panels (a) and (b): channel doping, with subcomponents doping + quantum, gate depletion, effective field, Coulomb scattering and screening. Panels (c) and (d): gate doping, with subcomponents gate doping, effective field and screening.

Figure 4.25: Modeled and experimental $\sigma_{\Delta U_T}$ (a) and $\sigma_{\Delta\beta/\beta}$ (b) as a function of the gate overdrive with the bulk bias as parameter ($V_{BS} = 0$ V, $-1$ V and $-2$ V).

Figure 4.26: a) Correlation of $\Delta U_T$ at $V_{GS} - V_T = 0.25$ V with $\Delta U_T$ at other values of the gate overdrive as a function of this gate overdrive. b) Correlation of $\Delta U_T$ at $V_{BS} = 0$ V with $\Delta U_T$ at lower values of the bulk bias as a function of the gate overdrive.

4.3.7 Discussion

From the experimental work presented in the previous subsection it can be concluded that our physical models provide a good prediction of the order of magnitude of the matching properties of the MOSFET. To obtain a good description, only two fitting parameters are required; they are related to the magnitude of doping fluctuations in the gate and to the correlation length of the surface roughness. We also conclude that Coulomb scattering plays a more important role than has been attributed to it in the literature. Besides explaining the mismatch in current factor at low values of the gate bias, it also causes an apparent mismatch in the threshold voltage. The more or less constant contribution of the mismatch in Coulomb scattering to $\Delta U_T$, as observed in figures 4.22 and 4.24a, can be explained as follows: to first order, due to screening by the inversion layer, the impact of Coulomb scattering is inversely proportional to the inversion-layer charge. This gives rise to a $1/(V_{GS} - V_T)$ dependence, which means that fluctuations due to Coulomb scattering cannot be distinguished from fluctuations in the threshold voltage. Therefore, it could indeed explain the generally observed inconsistency between the calculated impact of doping fluctuations on the threshold voltage and the experimentally observed values.

Although predicting correct orders of magnitude, our model only showed limited predictive properties when extrapolated to other bias conditions. This can have two origins: 1) The structural properties of the MOSFET are not fully known, especially for the poly-silicon gate. 2) The applied model for gate depletion is very simple, while the models for mobility are semi-empirical at best. Furthermore, we are extrapolating these models to regions for which they were not developed and in which they were not tested. E.g., our equations related to the fluctuations in Coulomb scattering predict that it cannot be neglected, even at high values of the gate bias (see figure 4.22). It would be worthwhile to try to develop these models based on more sound physical principles, as was for example recently done for the surface potential [115]. Another approach would be to try to accurately simulate the above-mentioned effects. Note that the presented models do have their use. They allowed us to qualitatively identify dominant fluctuation mechanisms and at which bias conditions they play a role.
The created insight is important when analyzing process splits, as will be found in the next chapter. It also suggests possible technology improvements or future bottlenecks. Fluctuations due to the poly-grain structure of the gate are process related and might be improved. Fluctuations due to Coulomb scattering can be reduced by reducing the doping concentration close to the oxide-silicon interface, e.g. by implementing an undoped epitaxially grown silicon layer on top of the doped substrate. Remote Coulomb scattering to dopants in the gate could be a possible bottleneck when oxide thicknesses are reduced further.
4.4 Conclusions
In this chapter the physical origins of MOSFET mismatch were examined. Firstly, a brief overview of MOSFET operation was presented. Secondly, in the main part of the chapter, it was examined how microscopic mismatch affects macroscopic transistor behavior and which physical mechanisms are responsible for the microscopic mismatch.

It was found that the commonly used $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law only holds in the linear regime at low drain bias. In saturation a logarithmic deviation with the length has been observed. This deviation is caused by the higher resistivity of the channel at the drain side than at the source side. Its magnitude is limited by the correlation length of the mismatch-causing stochastic process or by short-channel effects. Short- or narrow-channel devices also show deviations, as was reported earlier in literature. For short transistors, these deviations are mainly caused by geometrical effects. For narrow transistors other effects like the lower doping level and sidewall roughness can play a role.

In weak inversion, several mechanisms have been identified that cause deviations from the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law. Firstly, in weak inversion, local variations are large, because of the exponential dependence of the inversion-layer-charge density on the surface potential. The current tends to flow around regions with high resistivity, while it prefers to flow in regions with low resistivity. This effect has been investigated by simulating resistor networks. Secondly, halos and narrow-channel effects have a relatively large impact on the mismatch in weak inversion. A strong halo determines most of the weak-inversion current, even for reasonably long transistors. Therefore, it effectively decreases the channel length, which leads to a relative increase in the magnitude of the fluctuations. The narrow-channel effect gives rise to similar behavior. Most of the weak-inversion current flows in the edge transistors.
For wider transistors this effectively decreases the width, which again results in a relative increase of the magnitude of the fluctuations. For NMOS transistors, this width effect was found to be the main reason for the difference in mismatch between weak and strong inversion. PMOS transistors were found not to possess severe narrow-channel effects, and do not suffer from this behavior.

Most of the above-mentioned effects were expected to impact the symmetry of the MOSFET. At low drain bias the device was found to be fully symmetrical. In saturation, asymmetry was observed for long transistors, because of the higher resistivity of the channel at the drain side. For short transistors, channel-length modulation was seen to cause asymmetry. In weak inversion it was observed that higher drain bias lowers the halo barrier at the drain side of the transistor, again giving rise to a loss of symmetry.

The physical mechanisms that are responsible for microscopic fluctuations were identified as: 1) doping fluctuations in the channel, 2) doping fluctuations in the gate, 3) fluctuations in fixed-oxide charge, 4) fluctuations due to Coulomb scattering and 5) fluctuations due to surface-roughness scattering. The fluctuations in gate doping can be enhanced by the poly-grain structure of the gate material. These mechanisms affect transistor behavior by influencing: 1) the threshold voltage, 2) the amount of gate depletion, 3) the magnitude of quantum-mechanical effects, 4) the effective field, 5) the amount of carrier screening and 6) the mobility in general. It was found that none of the above-mentioned effects can safely be neglected. At low gate overdrive the most dominant mechanism was found to be the fluctuation in channel doping. Besides directly influencing the threshold voltage mismatch, it also causes an apparent mismatch in threshold voltage due to Coulomb scattering. At higher values of the gate bias, fluctuations in gate doping become more prominent.

The predictive quality of the developed model was tested by varying the bulk bias and by looking at the correlation of the mismatch at different bias conditions. The bulk bias dependence was found to be well predicted at low values of the gate overdrive voltage. At higher gate bias the prediction became less accurate. Correlations at different bias conditions were reasonably well predicted. To obtain higher accuracy, it was reasoned that the models related to variations in gate doping and the models related to the variation in mobility need to be improved. The presented equations provide qualitative insight and can be used in the optimization of a technology with respect to its matching performance.
Chapter 5
Technological aspects

Until now, we have addressed the matching properties of MOS transistors from a device point of view. However, these devices need to be fabricated. Ideally, the fabrication process does not influence the matching behavior of a technology. In this case, the magnitude of the microscopic fluctuations is bounded from below by the Poisson statistics attributed to the dopants, as was discussed in the previous chapter. However, we will see that the fabrication process can have an impact on the matching behavior. This can be due to the intrinsic properties of a certain process step or of a material used in this step, but it can also be caused by unwanted side effects.

In literature, only a limited number of papers have been published about the impact of processing on MOS transistor mismatch. These deal with the impact of the granular structure of the gate material [21, 112], channel engineering [10] and the impact of charging damage during processing [116]. Others deal with layout-related issues like the impact of the proximity of a capacitor [47], the impact of metal lines [20, 46] or more complex layouts of the transistors themselves [117, 118]. In [8] an overview is presented of how the matching performance of a 0.18 µm CMOS technology improves with the optimization of several process steps. However, it is not specified how these optimizations are done.

In this chapter it is not attempted to present a full overview of the impact of CMOS process steps on MOSFET mismatch. We shall limit ourselves to the examples encountered during this work. These will demonstrate how certain process steps can have a devastating effect on the matching properties of a technology, while hardly affecting average MOSFET operation. In the first section of this chapter the examined technologies are briefly introduced. The second section examines the choice of gate material and the third section looks at the impact of the halo implantation. In the fourth section the examined technologies are compared and the impact of scaling on the matching properties is discussed. The fifth section briefly addresses the matching behavior for future device architectures, after which this chapter is concluded.
5.1 Technology descriptions
In this chapter we will examine four CMOS technologies that were developed in IMEC. These technologies are optimized towards transistors with physical gate lengths of 100 nm to 180 nm. A schematic overview of their process flows is presented in table 5.1. The different steps will now briefly be described. Shallow Trench Isolation (STI). Shallow trench isolation is applied to electrically isolate one device from the other. This is achieved by depositing oxide in etched trenches. The depth of these trenches ranges from 325 nm to 400 nm for the examined technologies. Deep well implantations. The deep well implantation is also used to provide isolation between transistors. This implantation fixes the doping concentration under the STI. Together with the thickness of the STI, this doping concentration determines the possible leakage from one transistor to the other: The higher the doping level, the better the isolation. The n-well is implanted with phosphorus with an energy of 380 keV and a dose of 1.0 · 1013 cm−2 . The p-well is implanted with boron with an energy of 180 keV and a dose of 1.2 · 1013 cm−2 . Channel implantations. The channel implantation is performed at a lower energy than the deep well implantation. It determines the doping concentration at the top silicon interface and by this the threshold voltage. In the n-well an arsenic implantation is used with energies in the range of 100 − 200 keV and doses in the range of 3 · 1012 − 6 · 1012 cm−2 . In the p-well a boron implantation is used with energies in the range of 20 − 40 keV and doses in the range of 0.5 · 1013 − 2 · 1013 cm−2 . Gate stack. The gate stack consists of the gate insulator (SiO2 ) and the gate electrode. Nitrogen is introduced in the gate oxide to prevent boron penetration for PMOS transistors. 
In case of the $L_{nominal} = 100$ nm and $L_{nominal} = 130$ nm technologies, the very high nitrogen concentration also gives rise to an increase in the relative permittivity to a value in between $\epsilon_r \approx 3.9$ and $\epsilon_r \approx 6$, thus reducing the SiO2-equivalent oxide thickness. These equivalent thicknesses are also listed in table 5.1 for the examined technologies. For the $L_{nominal} = 180$ nm technology the gate consists of amorphous silicon, which after recrystallization results in grain sizes of ∼100 nm. Poly-silicon is used for technologies with shorter nominal gate lengths. This material has grain sizes of ∼30 nm. In case of the $L_{nominal} = 100$ nm technology, gate pre-doping is applied for the NMOSFETs to lower the gate resistance and to decrease the gate depletion. This gate pre-doping consists of a phosphorus implantation with an energy of 25 keV and a dose of $2.0 \cdot 10^{15}$ cm$^{-2}$. Together with the channel doping, the gate stack determines the long-channel threshold voltage and current factor. These parameters are shown in figure 5.1 as a function of the gate length$^{1}$. As expected, the threshold voltage decreases and the current factor increases with decreasing oxide thickness.

1 In this chapter, the average values of the threshold voltages and current factors are extracted with the maximum slope method. However, for reasons explained in chapter 3, the mismatch in these parameters is extracted by the three-points method.

Table 5.1: Front end of line process steps of the four examined CMOS technologies.

  L_nominal (nm)                      100    130    150    180
  V_DD (V)                            1.2    1.5    1.5    1.8
  oxide thickness (nm)                1.5    2.0    3.0    3.5
  gate pre-doping (nm)                yes    no     no     no

  Process steps:
  Shallow Trench Isolation
  Deep NWELL implantation
  NWELL channel implantation
  Deep PWELL implantation
  PWELL channel implantation
  Gate stack
  N-halo implantation
  N-LDD implantation
  P-halo implantation                 yes    yes    no     no
  P-LDD implantation                  yes    yes    no     no
  Spacers
  N-HDD and P-HDD implantations
  Silicidation

Halo implantations. The halo implantations are performed to counter the short-channel effect. They are responsible for the roll-up of the $V_T - L$ curves in figure 5.1. In a well optimized technology the nominal transistor lies close to the peak of this curve. This is observed to be the case for the NMOS transistors. The PMOS transistors are seen to be not as well optimized. For the halos of the NMOS transistors a boron or BF2 implantation is used at an angle in the range of 25° − 45°, with energies in the range of 10 − 75 keV and with doses in the range of $1 \cdot 10^{13} - 3 \cdot 10^{13}$ cm$^{-2}$. For the halos of the PMOS transistors an arsenic implantation is used at an angle in the range of 25° − 45°, with energies of ∼100 keV and with doses of ∼$2 \cdot 10^{13}$ cm$^{-2}$. Note that for the $L_{nominal} = 150$ nm and $L_{nominal} = 180$ nm technologies the halo implantations for the PMOSFETs were not yet introduced.

Lightly Doped Drain (LDD) implantations, spacers and Highly Doped Drain (HDD) implantations. Two implantation steps are executed in order to dope the source and drain regions. Firstly, the low-energy LDD implant is performed. This determines the junction depth close to the channel, which impacts short-channel behavior. However, the resistance of the LDD regions is not negligible and it decreases the drive current. The n-LDD is implanted with arsenic with energies in the range of 5 − 30 keV and doses
in the range of $1 \cdot 10^{14} - 2 \cdot 10^{15}$ cm$^{-2}$. The p-LDD is implanted with boron or BF2 with energies in the range of 1 − 10 keV and doses in the range of $1 \cdot 10^{14} - 2 \cdot 10^{15}$ cm$^{-2}$. Secondly, after forming spacers at the side of the gate, the HDD is implanted at a higher energy and with a higher dose than the LDD implantation. This lowers the resistance of the main part of the source and drain regions, but because of the spacers it does not affect the junction depth and the doping level close to the channel. Furthermore, the HDD implantation also dopes the gate. The n-HDD is implanted with arsenic with energies in the range of 25 − 75 keV and doses in the range of $1 \cdot 10^{15} - 1 \cdot 10^{16}$ cm$^{-2}$. The p-HDD is implanted with boron or BF2 with energies in the range of 2 − 25 keV and doses in the range of $2 \cdot 10^{15} - 5 \cdot 10^{15}$ cm$^{-2}$. A schematic overview of the resulting doping profile was presented in figure 4.1. Note that for the $L_{nominal} = 150$ nm and $L_{nominal} = 180$ nm technologies the LDD implantations for the PMOSFETs were omitted, because of a too strong lateral diffusion of the HDD regions.

Figure 5.1: Threshold voltage $V_T$ (V) and normalized current factor $(L/W)\beta$ (µA V$^{-2}$) as a function of the gate length $L$ (µm) for the four CMOS technologies that are examined in this chapter, for NMOS and PMOS transistors. The legends list the nominal gate lengths of the technologies (100 nm, 130 nm, 150 nm and 180 nm).

Thermal steps. After most implantation steps a thermal step is given in order to electrically activate the dopants. These thermal steps also cause diffusion of dopants
from strongly doped regions to regions with lower doping levels.
Silicidation. To further lower the resistance of the gate, source and drain, a titanium-cobalt silicide is formed on top of these regions.
[Figure 5.2, four panels: $V_T$ (V) versus $L$ (µm) for NMOS and PMOS; $(L/W)\beta$ (µA V$^{-2}$) versus $L$ (µm) for NMOS and PMOS. Curves: amorphous gate, poly gate. Data points are labeled with the transistor width in µm.]
Figure 5.2: Threshold voltage and normalized current factor as a function of the gate length for the $L_\mathrm{nominal}$ = 150 nm technology with $t_{ox}$ = 3.0 nm. Transistors with poly-silicon gates and with amorphous gates are compared. The numbers in the figures give the width of the transistors in µm.
5.2 Impact of the gate
As was reasoned in the previous chapter, the gate can influence the matching properties of a technology 1) by increasing the effective oxide thickness due to gate depletion, 2) by fluctuations in the gate doping itself and 3) by boron penetration through the gate oxide in case of PMOS transistors [21]. In this section two experiments related to the processing of the gate stack are evaluated. The first investigates the impact of changing the gate material from amorphous silicon to poly-crystalline silicon. The second looks at the way the gate is doped.
[Figure 5.3, four panels: $\sigma_{\Delta V_T}$ (mV) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS; $\sigma_{\Delta\beta/\beta}$ (%) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS. Curves: amorphous gate, poly gate.]
Figure 5.3: $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for transistors with $W$ > 0.5 µm and $L$ > 0.4 µm. Transistors with poly-silicon gates and with amorphous gates are compared. Error bars represent 99 % confidence intervals.
Table 5.2: Proportionality constants of the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law for transistors with an amorphous gate and with poly-silicon gate material. The one-sigma confidence intervals are also given.
gate material    $A_{0,\Delta V_T}$ (mVµm)    $A_{0,\Delta\beta/\beta}$ (%µm)
NMOS
amorphous        5.41 ± 0.16                  1.25 ± 0.15
poly             4.31 ± 0.33                  1.01 ± 0.12
PMOS
amorphous        5.98 ± 0.36                  1.40 ± 0.14
poly             3.27 ± 0.37                  1.09 ± 0.08
5.2.1 Amorphous or poly-crystalline silicon as gate material?
The impact of the gate material on transistor performance was examined in the $L_\mathrm{nominal}$ = 150 nm technology with $t_{ox}$ = 3.0 nm. Figure 5.2 shows the average threshold voltage and normalized current factor for transistors processed with poly-silicon gate material and for transistors with amorphous gates. No significant differences are observed for the NMOS transistors. For the PMOS transistors a higher absolute threshold voltage and a slightly lower current factor are observed for the devices with a poly-silicon gate. This means that these devices suffer more strongly from gate depletion than the devices with an amorphous gate. The shift in threshold voltage can also be partly caused by a higher level of boron penetration for the transistors with an amorphous gate. However, if this effect were dominant, it would cause an opposite shift in the current factor.
Figure 5.3 shows $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for the transistors from table 2.1 with $W$ > 0.5 µm and $L$ > 0.4 µm. It is observed that transistors with poly-silicon gates possess a better matching performance than those with amorphous gates. Proportionality constants of the expected linear relationships are listed in table 5.2. The use of poly-silicon gate material results in better matching behavior because of the smaller grain size compared to the grain size of the amorphous gate after recrystallization. The correlation lengths of mismatch-causing stochastic processes in the gate are directly proportional to this grain size. This also means that, even though the PMOS transistors with a poly-silicon gate suffer more from gate depletion, the impact of the variation in this gate depletion is smaller, since it is more effectively averaged out over the transistor.
In general, we can state that one way to decrease the impact of the gate on the matching behavior of a technology is to decrease the poly-grain size of the gate material. An even better matching performance could be achieved after the introduction of metal gates.
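As an illustration of how a proportionality constant such as those in table 5.2 can be obtained, the sketch below fits a straight line through the origin to $\sigma_{\Delta P}$ versus $1/\sqrt{WL}$. It is a minimal example under stated assumptions: the fit is unweighted, the function name `fit_area_law` is hypothetical, and the device dimensions and sigma values are synthetic rather than measured data from this work.

```python
import math

def fit_area_law(dims_um, sigmas):
    """Slope of a least-squares line through the origin for sigma versus 1/sqrt(W*L)."""
    x = [1.0 / math.sqrt(w * l) for w, l in dims_um]
    return sum(xi * si for xi, si in zip(x, sigmas)) / sum(xi * xi for xi in x)

# Hypothetical (W, L) pairs in um and noise-free sigmas in mV, for illustration only.
dims = [(10.0, 10.0), (4.0, 4.0), (2.0, 2.0), (1.0, 1.0)]
sigmas = [4.31 / math.sqrt(w * l) for w, l in dims]

a0 = fit_area_law(dims, sigmas)  # proportionality constant in mV um
print(f"A0 = {a0:.2f} mV um")
```

With real data the individual points carry different uncertainties, so a weighted fit may be more appropriate than the unweighted one assumed here.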
5.2.2 Impact of the gate doping
In [21] it was reasoned that the temperature of the rapid thermal anneal step after the HDD implantation impacts the matching behavior in the following way: when the temperature is too low, the matching performance is degraded by too strong gate depletion. However, when the temperature is too high, it is degraded because of the boron penetration that occurs for PMOS transistors. We investigated the impact of the gate doping on the $L_\mathrm{nominal}$ = 130 nm technology with $t_{ox}$ = 2.0 nm.
First consider the NMOS transistors. The following variations in the process were applied: for the reference, the arsenic HDD implantation is performed with an energy of 40 keV and a dose of $6.0\cdot10^{15}$ cm$^{-2}$. To reduce possible gate depletion, gate pre-doping was added to two device wafers. This gate pre-doping consists of a phosphorus implant with an energy of 25 keV and a dose of $2.0\cdot10^{15}$ cm$^{-2}$. The third NMOS process split had a reduced HDD dose of $4.0\cdot10^{15}$ cm$^{-2}$, for which increased gate depletion is expected.
For the PMOS reference the boron HDD is implanted with an energy of 4.0 keV and a dose of $3.0\cdot10^{15}$ cm$^{-2}$. For the second experimental split the thermal budget after the HDD implant is increased, which is expected to result in lower gate depletion, but possibly higher boron penetration. For the third split the HDD dose is increased to $4.5\cdot10^{15}$ cm$^{-2}$, which should also lead to lower gate depletion.
[Figure 5.4, four panels: $V_T$ (V) versus $L$ (µm) for NMOS and PMOS; $(L/W)\beta$ (µA V$^{-2}$) versus $L$ (µm) for NMOS and PMOS. NMOS curves: reference, gate pre-doping, reduced HDD dose. PMOS curves: reference, increased HDD thermal budget, decreased HDD implantation energy. Data points are labeled with the transistor width in µm.]
Figure 5.4: Threshold voltage and normalized current factor as a function of the gate length for the $L_\mathrm{nominal}$ = 130 nm technology with $t_{ox}$ = 2.0 nm. The figures related to the NMOS transistors show data for the reference, for transistors with gate pre-doping and for transistors that received a reduced HDD dose. The figures related to the PMOS transistors show data for the reference, for transistors with increased thermal budget and for transistors for which the HDD implantation energy was decreased. The numbers in the figures give the width of the transistors in µm.
Figure 5.4 compares the average threshold voltage and normalized current factor for the examined experimental splits. It is indeed observed that in cases with expected lower gate depletion the absolute value of the threshold voltage is lower and the value of the current factor is higher with respect to the reference. The opposite behavior is observed when the gate depletion is expected to be higher.
Figure 5.5 shows $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for the same experimental splits. The proportionality constants are listed in table 5.3. In case of the PMOS devices, no significant differences are observed. For the NMOSFETs, the transistors that suffer most from gate depletion also possess the worst matching characteristics. However, the differences are small and it is doubtful whether they are significant. More accuracy might be obtained in a future experiment by increasing the population size above the 84 device pairs per dimension that were available for this experiment. However, at this point we can conclude that the gate doping is not the dominant
mismatch-causing effect of the examined technology.
[Figure 5.5, four panels: $\sigma_{\Delta V_T}$ (mV) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS; $\sigma_{\Delta\beta/\beta}$ (%) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS. NMOS curves: reference, gate pre-doping, reduced HDD dose. PMOS curves: reference, increased HDD thermal budget, decreased HDD implantation energy.]
Figure 5.5: $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for transistors with $W$ > 0.5 µm and $L$ > 0.4 µm. The experimental splits are the same as in figure 5.4. Error bars represent 99 % confidence intervals.
Table 5.3: Proportionality constants of the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law related to the measurements presented in figure 5.5. The one-sigma confidence intervals are also given.
experimental split                  $A_{0,\Delta V_T}$ (mVµm)    $A_{0,\Delta\beta/\beta}$ (%µm)
NMOS
reference                           4.01 ± 0.32                  1.04 ± 0.08
gate pre-doping                     3.64 ± 0.33                  1.06 ± 0.12
reduced HDD dose                    4.29 ± 0.49                  1.18 ± 0.10
PMOS
reference                           2.51 ± 0.26                  0.98 ± 0.10
increased HDD thermal budget        2.51 ± 0.28                  0.89 ± 0.11
reduced HDD implantation energy     2.59 ± 0.29                  0.95 ± 0.08
Table 5.4: Experimental splits on the halo implantation conditions.
                  no halo    reference              increased dose         increased angle
NMOS
type              no halo    BF$_2$                 BF$_2$                 BF$_2$
angle                        35°                    35°                    45°
dose (cm$^{-2}$)             $1.6\cdot10^{13}$      $2.5\cdot10^{13}$      $1.6\cdot10^{13}$
energy (keV)                 120                    120                    120
PMOS
type              no halo    As                     As                     As
angle                        35°                    35°                    45°
dose (cm$^{-2}$)             $2.1\cdot10^{13}$      $3.0\cdot10^{13}$      $2.1\cdot10^{13}$
energy (keV)                 120                    120                    120
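The doubt about the significance of the NMOS gate-doping differences in table 5.3 can be made concrete with a small calculation. The sketch below expresses the difference between two proportionality constants in units of their combined one-sigma uncertainty, and estimates the relative uncertainty of a standard deviation obtained from 84 device pairs. It assumes independent, approximately Gaussian estimation errors; the helper names are hypothetical and this is not necessarily the procedure used to produce the quoted confidence intervals.

```python
import math

def z_score(a, sa, b, sb):
    """Difference a - b in units of the combined one-sigma uncertainty (independent errors assumed)."""
    return (a - b) / math.sqrt(sa ** 2 + sb ** 2)

def rel_std_error_of_sigma(n):
    """Approximate relative standard error of a sample standard deviation for n Gaussian samples."""
    return 1.0 / math.sqrt(2.0 * (n - 1))

# NMOS A0 for the threshold voltage (mV um), values taken from table 5.3.
z_reduced_dose = z_score(4.29, 0.49, 4.01, 0.32)  # reduced HDD dose versus reference
z_pre_doping = z_score(3.64, 0.33, 4.01, 0.32)    # gate pre-doping versus reference

rel_err_84 = rel_std_error_of_sigma(84)           # 84 device pairs per dimension

print(f"z (reduced dose) = {z_reduced_dose:.2f}")
print(f"z (pre-doping)   = {z_pre_doping:.2f}")
print(f"relative error of sigma for N = 84: {100 * rel_err_84:.1f} %")
```

Both differences stay below one combined standard deviation, which is in line with the statement that the observed trend may not be significant.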
5.3 Impact of the halo implantation
The impact of the halo implantation on MOS transistor matching is examined on the $L_\mathrm{nominal}$ = 130 nm technology with $t_{ox}$ = 2.0 nm by varying the implantation conditions as listed in table 5.4. Four experimental splits are investigated: 1) no halo is implanted, 2) the reference implantation conditions, 3) the halo dose is increased and 4) the implantation angle is increased.
Figure 5.6 shows the average threshold voltage and normalized current factor as a function of the gate length for the different experimental splits. As expected, the short-channel threshold voltage becomes larger for increasing halo dose. The same effect is observed when the implantation angle is increased, which is due to the fact that doping close to the source and drain regions contributes less to the threshold voltage than doping in the center region of the channel. For the current factor no significant differences are observed for the long-channel transistors. For short-channel transistors the current factor is largest when no halos are present. This is most probably due to the smaller effective channel length of these transistors. This effect is much stronger for the PMOS transistors than for the NMOS transistors. In addition, the extra channel doping introduced by the halos causes more Coulomb scattering, which reduces the current factor.
We will now look at the extra parameter fluctuations that the halos introduce. The first subsection of this section discusses the matching properties for long and wide transistors, while the second subsection deals with short- and narrow-channel effects.
[Figure 5.6, four panels: $V_T$ (V) versus $L$ (µm) for NMOS and PMOS; $(L/W)\beta$ (µA V$^{-2}$) versus $L$ (µm) for NMOS and PMOS. NMOS curves: no halo; $1.6\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $2.5\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $1.6\cdot10^{13}$ cm$^{-2}$, 120 keV, 45°. PMOS curves: no halo; $2.1\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $3.0\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $2.1\cdot10^{13}$ cm$^{-2}$, 120 keV, 45°. Data points are labeled with the transistor width in µm.]
Figure 5.6: Threshold voltage and normalized current factor as a function of the gate length for the $L_\mathrm{nominal}$ = 130 nm technology with $t_{ox}$ = 2.0 nm. The displayed experimental splits are described in table 5.4. The numbers in the figures give the width of the transistors in µm.
5.3.1 Long- and wide-channel transistors
Figure 5.7 shows $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for the experimental splits described by table 5.4. The proportionality constants are listed in table 5.5. Since halos are only supposed to be located around the source and drain regions, they are not expected to affect the matching properties of long-channel transistors. However, it is observed that the mismatch increases when halos are implanted. Furthermore, it is observed that $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ are proportional to $1/\sqrt{WL}$ for all splits. This means that the mismatch-causing stochastic process is not only located at the source and drain sides of the transistor, but that it is present over the whole area of the transistor. We therefore believe that the gate does not act as a perfect mask and that part of the halos are implanted through the gate. This results in localized regions of high concentration of boron or arsenic at the gate side of the oxide or at the channel side, as is displayed in figure 5.9. Localized concentrations at the gate side result in extra fluctuations in gate depletion, while localized concentrations in the channel result in extra fluctuations in the threshold voltage and Coulomb scattering.
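To get a feeling for the size of the halo-related contribution, one can subtract the no-halo proportionality constant from the others in quadrature. The sketch below does this for the NMOS $A_{0,\Delta V_T}$ values of table 5.5. It rests on an assumption that is not made in the text: that the halo-induced fluctuations are statistically independent of the remaining mismatch sources, so that variances add. The function name is hypothetical.

```python
import math

def excess_in_quadrature(a_total, a_base):
    """Extra contribution under the assumption that independent variances add."""
    return math.sqrt(a_total ** 2 - a_base ** 2)

# NMOS A0 for the threshold voltage (mV um), values taken from table 5.5.
a_no_halo = 3.08
a_with_halo = {
    "reference": 4.01,
    "increased dose": 5.53,
    "increased angle": 5.20,
}

extra = {split: excess_in_quadrature(a, a_no_halo) for split, a in a_with_halo.items()}

for split, value in extra.items():
    print(f"{split}: extra A0 of about {value:.2f} mV um")
```

Given the confidence intervals in table 5.5, these numbers should be read as rough indications only.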
[Figure 5.7, four panels: $\sigma_{\Delta V_T}$ (mV) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS; $\sigma_{\Delta\beta/\beta}$ (%) versus $1/\sqrt{WL}$ (µm$^{-1}$) for NMOS and PMOS. NMOS curves: no halo; $1.6\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $2.5\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $1.6\cdot10^{13}$ cm$^{-2}$, 120 keV, 45°. PMOS curves: no halo; $2.1\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $3.0\cdot10^{13}$ cm$^{-2}$, 120 keV, 35°; $2.1\cdot10^{13}$ cm$^{-2}$, 120 keV, 45°.]
Figure 5.7: $\sigma_{\Delta V_T}$ and $\sigma_{\Delta\beta/\beta}$ as a function of $1/\sqrt{WL}$ for transistors with $W$ > 0.5 µm and $L$ > 0.4 µm. The displayed experimental splits are listed in table 5.4. Error bars represent 99 % confidence intervals.
[Figure 5.8, two panels: $\rho(\Delta V_T, \Delta\beta/\beta)$ (-) versus $L$ (µm) for NMOS and PMOS. Data points are labeled with the transistor width in µm.]
Figure 5.8: Correlation between $\Delta V_T$ and $\Delta\beta/\beta$ as a function of the gate length. The displayed experimental splits are listed in table 5.4. Symbols have the same meaning as in figure 5.7. The numbers in the figures give the width of the transistors in µm.
Table 5.5: Proportionality constants of the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ law and the long-channel correlation between $\Delta V_T$ and $\Delta\beta/\beta$ for transistors with different halo implantation conditions, as listed in table 5.4. The one-sigma confidence intervals are also given.
split              $A_{0,\Delta V_T}$ (mVµm)    $A_{0,\Delta\beta/\beta}$ (%µm)    $\rho(\Delta V_T, \Delta\beta/\beta)$ (-)
NMOS
no halo            3.08 ± 0.42                  0.92 ± 0.10                        0.15 ± 0.12
reference          4.01 ± 0.32                  1.04 ± 0.08                        0.30 ± 0.21
increased dose     5.53 ± 0.92                  1.58 ± 0.39                        0.62 ± 0.24
increased angle    5.20 ± 0.30                  1.46 ± 0.23                        0.51 ± 0.23
PMOS
no halo            2.25 ± 0.27                  0.89 ± 0.11                        0.02 ± 0.19
reference          2.51 ± 0.26                  0.98 ± 0.11                        0.13 ± 0.16
increased dose     3.13 ± 0.44                  1.02 ± 0.16                        0.27 ± 0.10
increased angle    2.97 ± 0.46                  1.07 ± 0.13                        0.27 ± 0.14
[Figure 5.9, schematic cross-section with labels: poly-silicon gate; oxide; local concentrations of boron or arsenic due to halo implantation; gate depletion; extra channel doping.]
Figure 5.9: Schematic drawing of a MOSFET gate including localized regions of charge due to the halo implant. This charge can result in localized regions of extra gate depletion or in regions of extra charge in the channel.
Now let us compare the different splits. As expected, it is found that the fluctuations increase when the halo dose is increased. Furthermore, it is found that the fluctuations become larger when the angle of the implant is increased. This can be due to two reasons. Firstly, it is possible that a 45° implantation results in charge that is located closer to the oxide-silicon interface than in the 35° case. It was found in section 4.3 that this would increase the fluctuations. Secondly, it is possible that 45° is a preferential direction with respect to channelling through a stack of grains or implantation along grain boundaries. Based on this, as future work it would be interesting to examine the increase in mismatch due to the halos as a function of the poly-grain size and structure of the gate.
Figure 5.8 shows the correlation between $\Delta V_T$ and $\Delta\beta/\beta$ as a function of the gate length. A positive correlation is observed, which increases with the amount of charge that is implanted through the gate. We will present two possible explanations for this positive correlation. For the first we need to take a look at figures 4.22 and 4.24, which display the calculated mismatch in $\Delta U_T$ and of its components as a function of the gate bias. The function $\Delta U_T$ was defined in such a way that its derivative gives the mismatch in the current factor, $\Delta\beta/\beta = -\mathrm{d}\Delta U_T/\mathrm{d}V_{GS}$ at $V_{GS} - V_T = V_{ov}$, and that the intercept with the y-axis is equal to the mismatch in the threshold voltage, $\Delta V_T = \Delta U_T + (V_{GS} - V_T)\,\Delta\beta/\beta$ at $V_{GS} - V_T = V_{ov}$. Looking at the figures, it is observed that the Coulomb scattering contribution to $\Delta U_T$ has a negative slope for overdrive voltages $V_{ov}$ > 0.5 V. This negative slope indeed causes a positive correlation between $\Delta V_T$ and $\Delta\beta/\beta$. Note that applied gate-overdrive voltages generally lie around $V_{ov}$ ≈ 0.3 V. This indicates that our model for Coulomb scattering needs to be refined. Further note that we are not claiming that the mobility increases when more Coulomb scattering is present. We claim that Coulomb scattering is not properly accounted for in our strong-inversion drain-current model. As a result, it lowers the current by apparently increasing the threshold voltage, which is somewhat compensated for by an apparent increase in the current factor.
The second explanation for the positive correlation between $\Delta V_T$ and $\Delta\beta/\beta$ considers the possibility that our original explanation for the increase in the mismatch was not correct. The correlation could be an artifact of the fact that the drain-current model on which the three-points extraction method is based was not derived for devices with halos. However, if this were the problem, the correlation would be expected to decrease for longer transistors, which is not observed.
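The relations between $\Delta U_T$, $\Delta V_T$ and $\Delta\beta/\beta$ quoted above can be illustrated numerically. The sketch below takes a mismatch contribution $\Delta U_T(V_{ov})$, differentiates it at a chosen overdrive voltage and returns the corresponding apparent $\Delta V_T$ and $\Delta\beta/\beta$. The linear test function with a negative slope and its coefficients are hypothetical; they only mimic the sign of the slope described in the text and are not taken from figures 4.22 or 4.24.

```python
def extract_at_overdrive(delta_ut, v_ov, h=1e-3):
    """Return (delta_vt, delta_beta_rel) from a function delta_ut(v_ov), evaluated at v_ov.

    Uses delta_beta_rel = -d(delta_ut)/dV_GS (central difference) and
    delta_vt = delta_ut + v_ov * delta_beta_rel.
    """
    slope = (delta_ut(v_ov + h) - delta_ut(v_ov - h)) / (2.0 * h)
    delta_beta_rel = -slope
    delta_vt = delta_ut(v_ov) + v_ov * delta_beta_rel
    return delta_vt, delta_beta_rel

def contribution(v_ov):
    """Hypothetical contribution to Delta U_T (in V) with a negative slope versus overdrive."""
    return 2.0e-3 - 1.0e-3 * v_ov

delta_vt, delta_beta_rel = extract_at_overdrive(contribution, 0.7)
print(f"apparent Delta V_T      = {1e3 * delta_vt:.2f} mV")
print(f"apparent Delta beta/beta = {100 * delta_beta_rel:.2f} %")
```

For this contribution both extracted quantities have the same sign. If its amplitude varies randomly from pair to pair, $\Delta V_T$ and $\Delta\beta/\beta$ therefore move together, which is the positive correlation discussed above.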
In order to examine the exact influence of halos on device behavior, two dimensional simulations could be employed. Finally, we note that, in parallel to this work, it was found in [119] that also LDDs can be implanted through the gate. It was shown that the matching performance of a 0.25 µm process can be improved by 1) reducing the LDD implantation energy, 2) increasing the thickness of the gate and 3) increasing the thickness of the implantation oxide on top of the gate. It was also reasoned that the matching performance can be improved by using poly-silicon as gate material instead of an amorphous gate. In conclusion it can be stated that one has to be careful whenever the gate is assumed to act as a mask for an implantation step. When such an implantation goes through the gate, it can seriously degrade the matching performance of a technology.
5.3.2 Short- and narrow-channel effects
This subsection experimentally investigates the effect of the halo implantation on the short- and narrow-channel behavior of the matched parameters. Figure 5.10 shows $\sqrt{WL}\,\sigma_{\Delta V_T}$ and $\sqrt{WL}\,\sigma_{\Delta\beta/\beta}$ as a function of $W/L$ for the experimental splits listed in table 5.4. All the measured devices have the same area of approximately $WL$ = 1.6 µm$^2$. The shortest measured device is 0.13 µm long, while the narrowest
device is 0.15 µm wide. In case of absence of short- and narrow-channel effects, the results plotted in figure 5.10 should yield horizontal lines. This is clearly not the case. Comparing the different splits, it is observed that the mismatch increases for square devices when halos are implanted. This is because the halos are implanted through the gate, as was reasoned in the previous subsection.
[Figure 5.10, four panels: $\sqrt{WL}\,\sigma_{\Delta V_T}$ (mV µm) versus $W/L$ (-) for NMOS and PMOS; $\sqrt{WL}\,\sigma_{\Delta\beta/\beta}$ (% µm) versus $W/L$ (-) for NMOS and PMOS. All panels: $WL$ ≈ 1.6 µm$^2$; guide lines marked "with halo" and "no halo".]
Figure 5.10: $\sqrt{WL}\,\sigma_{\Delta V_T}$ and $\sqrt{WL}\,\sigma_{\Delta\beta/\beta}$ as a function of $W/L$ for the experimental splits listed in table 5.4. Symbols have the same meaning as in figure 5.7. The lines are introduced to guide the eye.
We will now consider the deviations for the short transistor pairs, thus the pairs with high $W/L$-ratio. As was mentioned in subsection 4.2.3, deviations from the $\sigma_{\Delta P} \propto 1/\sqrt{WL}$ relationship can originate from four effects, namely: 1) a shorter effective channel length than the metallurgical length, 2) the increase in surface potential, caused by the proximity of the extension regions, 3) the increase in doping level due to the halos, and 4) fluctuations in the short-channel effects themselves. Looking at figure 5.10, it is observed that the relative increase in mismatch for short-channel transistors is more prominent when no halos are implanted, because devices with halos possess a larger effective channel length. In case of the NMOS transistors without halos, it is observed that for the shortest gate length the matching performance becomes slightly better again. If significant, this could be explained as follows: when the gate length becomes very short, the main difference between the splits will
no longer be determined by the effective channel length, but by the amount that the potential barrier between source and drain is lowered. This barrier lowering is strongest for the transistors without halos. This behavior is not observed for the PMOS transistors. Note that the PMOS transistors suffer more from short-channel effects than the NMOS transistors in this technology and fluctuations in the short-channel effect itself might start to play a prominent role. These fluctuations are expected to be strongest for the transistors without halos.
We will now consider the narrow-channel effects, displayed in figure 5.10 by the devices with low $W/L$-ratio. As was seen in subsection 4.2.3, the threshold voltage mismatch for the NMOS transistors decreases with decreasing width, while a slight increase is observed for the PMOS transistors. The mismatch in the current factor increases for both NMOS and PMOS transistors. Furthermore, it is observed that the difference in mismatch between devices with and without halo decreases as the width is reduced. A possible explanation is that the STI introduces topography in the poly-silicon gate, which effectively reduces the amount of halo charge that is implanted through the gate.
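The range of aspect ratios covered in figure 5.10 follows directly from the constant device area. As a small worked example, the sketch below computes the $W/L$-ratio of the shortest and of the narrowest device, taking the quoted area of 1.6 µm$^2$ as exact (the text states it only approximately). The helper name is hypothetical.

```python
def aspect_ratio_from_length(area_um2, length_um):
    """W/L for a device of given area (um^2) and gate length (um)."""
    width_um = area_um2 / length_um
    return width_um / length_um

def aspect_ratio_from_width(area_um2, width_um):
    """W/L for a device of given area (um^2) and gate width (um)."""
    length_um = area_um2 / width_um
    return width_um / length_um

AREA_UM2 = 1.6  # approximate device area quoted in the text

ratio_shortest = aspect_ratio_from_length(AREA_UM2, 0.13)   # shortest device, L = 0.13 um
ratio_narrowest = aspect_ratio_from_width(AREA_UM2, 0.15)   # narrowest device, W = 0.15 um

print(f"shortest device:  W/L of about {ratio_shortest:.1f}")
print(f"narrowest device: W/L of about {ratio_narrowest:.4f}")
```

These two values bracket the horizontal axis of figure 5.10, which runs from roughly 0.01 to 100.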
5.4 Comparison of different CMOS technologies
By comparing the matching properties of several CMOS processes, it was found in [6] that, as a rule of thumb, $A_{0,\Delta V_T}$ in mVµm is equal to the effective oxide thickness in nm for a well optimized process. Current-factor mismatch was found to be independent of the technology generation, with $A_{0,\Delta\beta/\beta}$ = 1.0 %µm. In order to try to understand these empirical laws we first need to assume some scaling relationships. Let us assume that $t_{ox,\mathrm{eff}} \propto \kappa^{-1}$, where $\kappa$ is the scaling factor. In this case the doping concentration $N_A \propto \kappa^{1.5}$, the depletion-layer width $W_D \propto \kappa^{-0.75}$ and the inversion-layer width $z_\mu \propto \kappa^{-0.5}$. The inversion-layer charge at constant gate overdrive scales as $Q_i \propto \kappa$. These are crude approximations, but they will serve for the purpose of an order of magnitude calculation.
In figure 4.22a of subsection 4.3.6 it was observed that the mismatch in the drain current$^2$ at low gate overdrive is mainly determined by fluctuations in channel doping. The fluctuation in channel doping affects the drain current through threshold-voltage fluctuations and fluctuations in the amount of Coulomb scattering, as was observed in figure 4.24a. From (4.79), it directly follows that fluctuations in the threshold voltage scale like $A_{0,\Delta V_T} \propto \kappa^{-0.625}$. To estimate the scaling of the fluctuations in Coulomb scattering, it is assumed that $\mu/\mu_C \propto z_\mu N_A/Q_i$. Together with (4.101) this yields $A_{0,\Delta U_T}|_{\Delta\mu_C} \propto \kappa^{-0.5}$. We conclude that the two main components of the mismatch scale less strongly than the oxide thickness, contrary to the experimental observations. This suggests that people learn from mistakes in older technologies, which helps to improve the newer ones. It also indicates that research remains necessary to keep on obtaining the same levels of improvement with down-scaling.
$^2$ Remember that $\sigma_{\Delta I_D/I_D} = \sigma_{\Delta U_T}/(V_{GS} - V_T)$.
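The exponent bookkeeping behind these scaling estimates can be checked with a few lines of code. The sketch below stores the assumed scaling exponents and combines them. Equation (4.79) is not reproduced here, so the form $A_{0,\Delta V_T} \propto t_{ox,\mathrm{eff}}\sqrt{N_A W_D}$ used below is an assumption: it is the common expression for threshold-voltage mismatch due to dopant fluctuations, and it reproduces the quoted exponent of −0.625.

```python
from fractions import Fraction as F

# Assumed scaling exponents: quantity is proportional to kappa**exponent.
exponent = {
    "t_ox_eff": F(-1),
    "N_A": F(3, 2),
    "W_D": F(-3, 4),
    "z_mu": F(-1, 2),
    "Q_i": F(1),
}

# Assumed form (not copied from (4.79)): A0_dVT ~ t_ox_eff * sqrt(N_A * W_D).
a0_vt_exponent = exponent["t_ox_eff"] + (exponent["N_A"] + exponent["W_D"]) / 2

# Ratio used for the Coulomb scattering estimate: mu/mu_C ~ z_mu * N_A / Q_i.
coulomb_ratio_exponent = exponent["z_mu"] + exponent["N_A"] - exponent["Q_i"]

# Depletion charge per unit area: Q_D ~ N_A * W_D.
q_d_exponent = exponent["N_A"] + exponent["W_D"]

print(f"A0_dVT scales as kappa^{float(a0_vt_exponent)}")
print(f"mu/mu_C scales as kappa^{float(coulomb_ratio_exponent)}")
print(f"Q_D scales as kappa^{float(q_d_exponent)}")
```

Under these assumptions the ratio $\mu/\mu_C$ does not scale at all, while the depletion charge grows as $\kappa^{0.75}$.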
[Figure 5.11, two panels: (a) $A_{0,\Delta V_T}$ (mV µm) versus nominal gate length (nm); (b) $A_{0,\Delta\beta/\beta}$ (% µm) versus nominal gate length (nm). Series: NMOS, PMOS, expected value.]
Figure 5.11: $A_{0,\Delta V_T}$ (a) and $A_{0,\Delta\beta/\beta}$ (b) for several technology generations as a function of their nominal gate length. Error bars are 1σ long. The solid lines plot the expected values: $A_{0,\Delta V_T}$ in mVµm is expected to be equal to the effective oxide thickness in nm. The physical oxide thicknesses of the investigated technologies are listed in table 5.1.
Now consider the current factor. It is observed in figures 4.22b and 4.24b+d that the current factor is determined by several relatively small mechanisms. Some of them decrease when the transistor is scaled down, while others increase. From this, it can be understood that the overall mismatch in the current factor did not drastically differ for different technology generations. Let us consider the fluctuations in the gate doping separately, since they contribute the most. It follows from (4.85) and (4.87) that approximately $\sigma_{\Delta\beta/\beta}|_{\Delta N_p} \propto (Q_D + Q_i)\,\sigma_{\Delta N_p/N_p}/(t_{ox,\mathrm{eff}} N_p)$. When we assume that the doping concentration in the gate and $A_{0,\Delta N_p/N_p}$ do not scale, it follows that $A_{0,\Delta\beta/\beta}|_{\Delta N_p}$ scales at a rate between $\kappa^{1.75}$ and $\kappa^{2.0}$. This clearly demonstrates the need for the scaling of $A_{0,\Delta N_p/N_p}$, which is generally done by reducing the poly-grain size, and ultimately by introducing metal gates.
We will now look further into how well the technologies discussed in this chapter follow the empirical scaling laws. Note that none of these technologies was optimized with respect to their matching performance. Figure 5.11 shows $A_{0,\Delta V_T}$ and $A_{0,\Delta\beta/\beta}$ as a function of the nominal gate length of the technologies presented in table 5.1. Reference processing conditions were used. The solid line shows the expected value. It is observed that the technology with $L_\mathrm{nominal}$ = 180 nm has a significantly worse matching performance than later technologies. This is caused by the change in gate material from amorphous to poly-silicon. In case of $A_{0,\Delta V_T}$, the PMOS transistors perform as expected. For the NMOS transistors, the observed matching properties lie above the expected values. This could be due to halos that are unintentionally implanted through the gate. Note that the poly-grain size was not scaled. In case of $A_{0,\Delta\beta/\beta}$, it is observed that the mismatch indeed lies around 1.0 %µm for technologies with poly-silicon gates.
In [8] it is calculated that, in order to obtain a 90 % yield on a 1 Mbit SRAM, it is required that $A_{0,\Delta V_T}$ < 6.0 mVµm for an $L_\mathrm{nominal}$ = 180 nm technology and that
$A_{0,\Delta V_T}$ < 2.5 mVµm for an $L_\mathrm{nominal}$ = 100 nm technology. It is observed in figure 5.11a that for the NMOS transistors these specs are not reached. The picture becomes worse when we look at the matching properties of the minimum size transistor, which is displayed in figure 5.12 for the examined technology generations. In this figure the experimental data is represented by solid symbols. The open symbols are calculations based on the area of the minimum transistor and on $A_{0,\Delta V_T}$ and $A_{0,\Delta\beta/\beta}$. The mismatch in PMOS transistors is seriously increased because of short-channel effects. In order to improve SRAM yield, these short-channel effects need to be brought under control. The mismatch of the NMOS transistors might also be improved by reducing the poly-grain size. Finally note that one can also decrease the sensitivity of SRAM yield to parameter fluctuations by increasing the threshold voltage of the SRAM transistors by an extra implantation step. However, this solution requires an extra mask and lithography step, which increases the cost of the process.
[Figure 5.12, two panels: (a) $\sigma_{\Delta V_T}$ (mV) versus nominal gate length (nm); (b) $\sigma_{\Delta\beta/\beta}$ (%) versus nominal gate length (nm). Series: NMOS, PMOS. Data points are labeled with $W/L$ in µm/µm: 0.20/0.10, 0.25/0.13, 0.20/0.15, 0.25/0.18.]
Figure 5.12: $\sigma_{\Delta V_T}$ (a) and $\sigma_{\Delta\beta/\beta}$ (b) for the minimum size transistor of several technology generations as a function of their nominal gate length. The $W/L$-ratios are included in the figure and are in (µm/µm). Solid symbols are based on experimental data, while the values of the open symbols are calculated from $A_{0,\Delta V_T}$ (a) and $A_{0,\Delta\beta/\beta}$ (b) and the device areas. Error bars are 1σ long. The physical oxide thicknesses of the investigated technologies are listed in table 5.1.
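The calculation behind the open symbols in figure 5.12 is a direct application of the area law, $\sigma_{\Delta P} = A_{0,\Delta P}/\sqrt{WL}$. As a worked example, the sketch below evaluates it for two of the minimum-size geometries labeled in the figure. The $A_{0,\Delta V_T}$ inputs are the SRAM specification limits quoted from [8], used here purely as illustrative inputs and not as measured values of these technologies. The function name is hypothetical.

```python
import math

def sigma_from_area_law(a0, width_um, length_um):
    """Mismatch standard deviation predicted by sigma = A0 / sqrt(W * L)."""
    return a0 / math.sqrt(width_um * length_um)

# Illustrative inputs: specification limits for A0 (mV um) and minimum-size W/L (um/um).
sigma_180nm = sigma_from_area_law(6.0, 0.25, 0.18)  # L_nominal = 180 nm technology
sigma_100nm = sigma_from_area_law(2.5, 0.20, 0.10)  # L_nominal = 100 nm technology

print(f"180 nm technology: sigma of about {sigma_180nm:.1f} mV")
print(f"100 nm technology: sigma of about {sigma_100nm:.1f} mV")
```

The estimate ignores short- and narrow-channel effects, which is why measured minimum-size devices (the solid symbols) can deviate from such a calculation.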
5.5 Alternative device concepts
Ever since the semiconductor industry started with the down-scaling of device dimensions, people have tried to predict the end of this down-scaling. These predictions are based on economical, technological and physical considerations and the relationships between them. However, it is wrong to assume that everything will end at a fixed point. It is better to talk about a gradual decline or, to put it more positively, change. For example, for economic reasons, many companies stopped developing new processes or even went fabless. However, semiconductor foundries took advantage of this situation and started to grow more strongly. A constant technological issue is the development of new lithography tools with rising costs. Physical barriers started to play a role when the supply voltage needed to be scaled down, because of the reliability of the gate oxide.
At this moment one of the main concerns and physical barriers is caused by the leakage current, which can result in unacceptably large power dissipation. To keep leakage levels under control, threshold-voltage scaling slowed down. Together with a decreasing supply voltage, this would result in a lowering of the drive current. To solve this problem, alternative device concepts are introduced. For example, one can use germanium to strain the silicon (see figure 5.13a), which increases the mobility. Another option is to use a silicon-on-insulator (SOI) substrate instead of a standard silicon one (see figure 5.13b). This reduces the junction capacitances, which increases the speed. Another possibility is the use of FinFETs, which are double- or triple-gate devices (see figure 5.13c). Because of the multiple gates, the drain current increases. Another advantage of SOI devices and FinFETs is that they allow for a better control of short-channel effects. The question to ask ourselves is how this change in device structure affects the matching behavior. As an introduction, some of the issues will be briefly discussed.
[Figure 5.13, three schematic cross-sections: (a) strained Si, with labels gate, source, drain, strained Si, Si$_{1-x}$Ge$_x$; (b) SOI, with labels gate, source, channel, drain, buried oxide, back gate; (c) FinFET, with labels gate, channel, buried oxide, silicon substrate.]
Figure 5.13: Schematic representation of a MOSFET with a strained silicon channel (a), a silicon-on-insulator (SOI) MOSFET (b) and a FinFET (c). The source and drain of the FinFET are located in front and to the back of the channel, respectively.
Strained silicon MOSFETs. A strained silicon MOSFET contains a silicon-germanium layer on which a thin silicon layer ($t_{Si}$ ∼ 10 nm) is grown epitaxially. The silicon is strained, because it has a smaller lattice constant than the relaxed Si$_{1-x}$Ge$_x$ layer, where $x$ is the fraction of germanium atoms. Besides increasing the mobility, the difference in lattice constants causes a band offset, which results in a negative shift of the threshold voltage in the range of 0−500 mV.
This shift is a function of the germanium concentration and the thickness of the strained silicon film, which could give rise to extra parameter fluctuations [120]. The device is most sensitive to these fluctuations when the depletion-layer width is equal to the silicon film thickness. To compensate for the negative shift in threshold voltage, extra doping needs to be implanted for the NMOS transistors. This results in an increase in the parameter fluctuations. Furthermore, because of the increase in bulk mobility, the contribution of other mobility components becomes more dominant, and therefore also their contribution to the amount of fluctuations. Fluctuations in the bulk mobility itself are also expected to increase, because of fluctuations in the germanium content and film thickness. Overall, we conclude that a strained silicon MOSFET will be more difficult to optimize with respect to parameter fluctuations than a standard MOSFET.

Silicon-on-insulator MOSFETs. The channel region of an SOI MOSFET consists of a thin silicon layer on top of an oxide. When this silicon layer is thicker than the depletion-layer width, the SOI MOSFET is called partially depleted. The fabrication and operation of such a MOSFET are very similar to those of a normal MOSFET and no major changes in the matching behavior are expected. When the silicon layer is thinner than the depletion-layer width, the SOI MOSFET is called fully depleted (FD). In this case a higher doping concentration is needed to reach the required threshold voltage. In a fully depleted device an increase in the doping level is not compensated for by a decrease in depletion-layer width, which makes the device more sensitive to doping fluctuations. The threshold voltage is also dependent on the silicon-film thickness, which could cause an increase in the fluctuations. As was mentioned before, using an FD-SOI device improves short-channel behavior. Actually, when the silicon film is thin enough, no doping would be required at all. In this case the threshold voltage is determined by the gate work function, which needs to be properly engineered. This is not straightforward, but it solves the problem of doping fluctuations. The sensitivity of such a device to film thickness and gate length was examined in [121]. It was also found that these sensitivities decrease when a negative voltage is applied to the back gate.

FinFETs. FinFETs are fabricated by etching silicon fins on an SOI substrate. The gate goes around the whole fin, making it a double- or triple-gate device. The device is contacted at the front and at the back (not shown in the two-dimensional figure 5.13c). The sensitivity of FinFETs to device fluctuations was examined in [121, 122]. The same issues play a role as for FD-SOI MOSFETs. However, in the case of FinFETs the silicon film thickness is determined by the lithography and etching processes, which means that it is less well controlled than the silicon film thickness of SOI devices. Also, the gate oxide is grown on the side of the fins, which will result in a worse control of its thickness.
5.6 Conclusions
This chapter presented some examples of how certain process steps can influence the matching behavior of a technology. Furthermore, the scaling of matching properties was discussed and issues for alternative device structures were briefly addressed. It was found that decreasing the grain size of the poly- or amorphous-silicon gate material can greatly improve the matching behavior of a technology, because it reduces
the correlation length of the mismatch-causing stochastic process.

By comparing devices with and without halo implantation, it was found that the halo seriously degrades the matching performance, which was mainly observed for the NMOS transistors. The gate did not act as a perfect mask for the implantation step and the halo was implanted through the gate. This resulted in extra localized charge in the channel region or at the oxide-gate interface. Increasing the halo dose or the implantation angle worsens the effect. However, for short devices, the halo improves the matching behavior, because of the increase in effective channel length. For very short NMOS transistors, the devices without halo start to perform better again. For a very short device the effective channel length is no longer determined by the doping concentration, but the higher doping concentration in the devices with halos still causes a decrease in the matching performance. For narrow devices the impact of the halo becomes smaller. This could be due to the topography introduced by the shallow-trench isolation, which can scatter the implantation.

With respect to the scaling of the matching properties of CMOS technologies, it is concluded that the matching performance improves faster with scaling than what is expected theoretically. This indicates that newer technologies profit from what is learned during the development of older technologies. However, it also indicates that research efforts remain necessary to keep parameter fluctuations under control.

The matching performances of four experimental technologies were compared to what is expected based on literature. The nominal gate lengths of these technologies ranged from 100 nm to 180 nm. It was found that the matching performance significantly improved after changing from amorphous to poly-silicon gate material. Furthermore, the PMOS transistors were observed to follow the empirical scaling laws.
The NMOS transistors have worse matching performance than empirically expected. This could be improved by reducing the grain size of the gate material or by making sure that the halos are not implanted through the gate. For the minimum size transistors, the matching performance of the PMOS devices is also poor. This is caused by the poor short-channel control of these transistors, which needs to be improved in order to be able to obtain acceptable SRAM yield.
Chapter 6
Impact of line-edge roughness on parameter fluctuations, off-state current and yield

Doping fluctuations are considered to determine one of the fundamental lower limits to parameter fluctuations. These effects have been extensively studied in literature and in chapter 4. A less well-studied effect is the impact of line-edge roughness (LER). While the previous chapter discussed some current technology issues, LER is considered to be one of the main limiting factors for future technologies.

In general, the printed gates of transistors exhibit a certain roughness. This roughness is ultimately limited by Poisson statistics on the number of photons during the exposure of the resist [123]. However, in practice, chemical properties of the resist make up the main contribution to LER [124]. As transistor gate lengths are scaled down, LER is expected to have an impact on parameter fluctuations, off-state current and yield. In literature, the effects of LER have mainly been investigated by 2D [125, 126] or 3D [127, 128] device simulations. In the case of 2D simulations, the poly-gate is divided into small segments and for each segment the current is found from the simulation. The same approach is followed in [129], but here an analytical model is used to describe the drain current of a segment. In [126]¹ the simulations were calibrated on a 0.13 µm process on which the LER was exaggerated. These simulations were then used to fix the requirements on LER for 34 nm gate-length transistors. However, it has not been investigated how the importance of LER increases as MOSFETs scale down and when it will really become an issue. This is essential information for gate-patterning process optimization as well as device optimization.

In this chapter we will start in section 6.1 by characterizing the LER itself. In section 6.2, the theory presented in section 2.4 and section 4.2 will be used to describe the impact of LER on parameter fluctuations, off-state current and yield. This will result in analytical expressions that directly link properties of the LER to the above-mentioned effects. In section 6.3 these models are experimentally verified, which, in section 6.4, allows us to make predictions of the impact of LER and to present guidelines for LER scaling. Section 6.5 concludes this chapter.

¹ The work presented in this publication was done in parallel to the work presented in this chapter.

[Figure 6.1: image not reproduced. Panel (b) plots the line width (nm) against z (nm).]
Figure 6.1: a) SEM picture of three poly-silicon lines after etch. The edges of the fully displayed center line have been highlighted. b) Local line width as a function of the position along the line (z).
6.1 Characterization of line-edge roughness
When examining mismatch effects, one generally does not have any directly measurable information about the stochastic microscopic processes that cause the mismatch. Line-edge roughness forms an exception, because the lines are visible after gate patterning, and stochastic properties can be extracted. Figure 6.1a shows a SEM picture of a printed poly-silicon line for which the two edges are highlighted. Figure 6.1b plots the local line width as a function of the position along the line (z). To detect the edge, we look at the intensity profile of the SEM picture at a certain position along the line (figure 6.2a). Four peaks are observed, which are related to the four line edges in the picture. Figure 6.2b zooms in on one of these peaks, which is fitted by a function that consists of two halves of Gaussian functions:
\[
\mathrm{Intensity} =
\begin{cases}
\mathit{nil} + \mathit{amp} \cdot e^{-((x-\mathit{pos})/\mathit{brl})^2} & x < \mathit{pos} \\
\mathit{nir} + (\mathit{amp} - \mathit{nir} + \mathit{nil}) \cdot e^{-((x-\mathit{pos})/\mathit{brr})^2} & x > \mathit{pos}
\end{cases}
\tag{6.1}
\]
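As an illustration of (6.1), the following minimal Python sketch evaluates the two-half-Gaussian profile. The function name `edge_intensity` and all numerical values are hypothetical and chosen for illustration only; they are not fitted values from this work. The sketch merely shows that the two halves meet at x = pos, where the intensity equals nil + amp, and that the profile tends to nil on the far left and to nir on the far right.

```python
import math

def edge_intensity(x, pos, amp, nil, nir, brl, brr):
    """Two-half-Gaussian intensity profile of equation (6.1)."""
    if x < pos:
        # Left half: baseline nil, peak height amp, width parameter brl.
        return nil + amp * math.exp(-((x - pos) / brl) ** 2)
    # Right half: baseline nir; the amplitude (amp - nir + nil) makes both
    # halves meet at x = pos.
    return nir + (amp - nir + nil) * math.exp(-((x - pos) / brr) ** 2)

# Hypothetical parameter values (arbitrary units / nm), not taken from the thesis.
params = dict(pos=300.0, amp=60.0, nil=50.0, nir=70.0, brl=8.0, brr=12.0)

peak = edge_intensity(params["pos"], **params)            # value at the edge
just_left = edge_intensity(params["pos"] - 1e-9, **params)
far_left = edge_intensity(0.0, **params)                  # tends to nil
far_right = edge_intensity(1000.0, **params)              # tends to nir
print(peak, just_left, far_left, far_right)
```

In an actual extraction the six parameters would be obtained by a least-squares fit to each intensity peak; that step is not shown here.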
[Figure 6.2: plots not reproduced. Panel (a): intensity (a.u.) versus x (nm). Panel (b): zoom-in on one peak, intensity (a.u.) versus x (nm), annotated with amp, brl, brr, nil, nir and pos.]
Figure 6.2: a) Intensity profile of the SEM picture displayed in figure 6.1a at a certain value of z. b) Zoom-in on one of the peaks.

The fitted parameters amp, brl, brr, nil, nir and pos are defined in figure 6.2b. The edge of the line is assumed to be located at the point of maximum intensity, thus at x = pos. According to the ITRS roadmap [108], the standard deviation of this edge position ($\sigma_{LER}$) should be smaller than 3.3 % of the gate length, but no justification is presented for this number. Furthermore, figure 6.1 contains information about the shape of the line, which is neglected when only $\sigma_{LER}$ is taken into account, but which is required to describe the impact of LER on transistor behavior. Also the correlation between the roughness of the two edges can be calculated. However, it is found to be insignificant ($\rho = 0.19 \pm 0.32$). This means that the variance of the line-width roughness (LWR) is equal to two times the variance in LER, i.e. $\sigma_{LWR}^2 = 2\sigma_{LER}^2$. From this point on, we choose to analyze LWR instead of LER. This has the advantage that slight rotations in an analyzed SEM picture are to first order cancelled out. The full spectral properties of the LWR are contained in its autocovariance function ($R_{LWR}$), which is the Fourier transform of its power spectrum, and is defined by:
\[
R_{LWR}(d) \equiv \sigma_{LWR}^2 \cdot \rho(L_{local}(z), L_{local}(z+d)),
\tag{6.2}
\]
where the autocorrelation function $\rho(L_{local}(z), L_{local}(z+d))$ is the correlation between the width of the line² at a certain position $z$ and its width at a position $z+d$ further down the line. Note that $R_{LWR}(0) = \sigma_{LWR}^2$ and that the width of this function presents a measure of the correlation width of the process. The autocovariance function is calculated from experimental data by the biased estimate as follows:
\[
R_{LWR}(n \cdot step) = \frac{1}{N} \sum_{i=1}^{N-n} \left(L_{local}[i] - \overline{L_{local}}\right)\left(L_{local}[i+n] - \overline{L_{local}}\right).
\tag{6.3}
\]
These experimental data consist of N measurements of the local length at a distance step from each other. Figure 6.3 shows the autocovariance function of a state-of-the-art 193 nm lithography process. This autocovariance function is extracted from 5 lines with a length of 700 nm each.

² The local width of the line is equal to the local length of the gate.

[Figure 6.3: plot not reproduced. Axes: autocovariance (nm²) versus d (nm). Curves: experimental, 1st order autoregressive, 2nd order autoregressive.]
Figure 6.3: Autocovariance function of the LWR of a 193 nm gate-patterning process. A first-order and second-order autoregressive model are fitted to the experimental data.

Also, fits are shown of a first-order and second-order autoregressive model. The autocovariance function of a first-order autoregressive process is given by:
\[
R_{LWR}(d) = \sigma_{LWR}^2 \cdot e^{-\alpha_1 |d|}.
\tag{6.4}
\]
For the examined lithography process it is found that $\sigma_{LWR} = 3.2$ nm and $\alpha_1 = 0.020$ nm$^{-1}$. The autocovariance function of a second-order autoregressive process is given by:
\[
R_{LWR}(d) = \sigma_{LWR}^2 \cdot e^{-\alpha_2 |d|} \left( \cos(p|d|) + \frac{\alpha_2}{p} \sin(p|d|) \right).
\tag{6.5}
\]
For the lithography process under study, the fit yields $\sigma_{LWR} = 3.1$ nm, $\alpha_2 = 0.036$ nm$^{-1}$ and $p = 0.030$ nm$^{-1}$. It is observed that the second-order process gives the best fit. However, since the data plotted in figure 6.3 are only accurate at the lowest values of $d$, we cannot justify using the second-order process. Furthermore, a second-order autoregressive process indicates some kind of damped oscillation in space related to the LER-causing process, which is not physical. We will find in the next section that parameter fluctuations due to LER are proportional to the total area under the autocovariance function, which is larger for the first-order process. Since we want to use our calculations to provide upper boundaries for the maximum allowable LER, we choose to use the first-order autoregressive process to describe the roughness of the line.
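The estimator (6.3) and the two fitted models (6.4) and (6.5) can be sketched in a few lines of Python. This is a minimal illustration, not the extraction code used in this work: the function names are hypothetical, and the closed-form areas $2\sigma_{LWR}^2/\alpha_1$ and $4\sigma_{LWR}^2\alpha_2/(\alpha_2^2+p^2)$ are obtained here by integrating (6.4) and (6.5) from $-\infty$ to $\infty$. With the fit values quoted above, the area under the first-order model indeed comes out larger than that under the second-order model.

```python
import math

def autocovariance_biased(lengths, max_lag):
    """Biased autocovariance estimate of equation (6.3) for lags 0..max_lag."""
    n_samples = len(lengths)
    mean = sum(lengths) / n_samples
    result = []
    for lag in range(max_lag + 1):
        acc = sum((lengths[i] - mean) * (lengths[i + lag] - mean)
                  for i in range(n_samples - lag))
        result.append(acc / n_samples)  # divide by N, not N - lag (biased)
    return result

def r_first_order(d, sigma, alpha1):
    """First-order autoregressive autocovariance, equation (6.4)."""
    return sigma ** 2 * math.exp(-alpha1 * abs(d))

def r_second_order(d, sigma, alpha2, p):
    """Second-order autoregressive autocovariance, equation (6.5)."""
    d = abs(d)
    return sigma ** 2 * math.exp(-alpha2 * d) * (
        math.cos(p * d) + alpha2 / p * math.sin(p * d))

def area_first_order(sigma, alpha1):
    """Integral of (6.4) over all d (derived for this sketch)."""
    return 2.0 * sigma ** 2 / alpha1

def area_second_order(sigma, alpha2, p):
    """Integral of (6.5) over all d (derived for this sketch)."""
    return 4.0 * sigma ** 2 * alpha2 / (alpha2 ** 2 + p ** 2)

# Fit values quoted in the text (sigma in nm, alpha and p in 1/nm).
area_1 = area_first_order(3.2, 0.020)
area_2 = area_second_order(3.1, 0.036, 0.030)
print(area_1, area_2)  # areas in nm^3
```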
[Figure 6.4: plots not reproduced. Panels: (a) NMOS, VDS = 50 mV; (b) PMOS, VDS = -50 mV; (c) NMOS, VDS = 1.2 V; (d) PMOS, VDS = -1.2 V. Panels (a) and (b): VT (V) on the left axis and dVT/dL (mV nm^-1) on the right axis versus Lgate (µm). Panels (c) and (d): Ioff/W (A µm^-1) on the left axis and dln(Ioff)/dL (% nm^-1) on the right axis versus Lgate (µm).]
Figure 6.4: Left axes: threshold voltage (a+b) and off-state current (c+d) as a function of the gate length. Right axes: the derivative of the threshold voltage (a+b) and of the logarithm of the off-state current (c+d) with respect to the gate length. The oxide thickness is equal to 1.5 nm.
6.2 Modeling the impact of line-width roughness
Most transistor parameters are a function of the gate length, as is displayed in figure 6.4 for the threshold voltage and the off-state current³,⁴. Line-width roughness influences transistor behavior through this length dependence. In order to calculate its impact, we will use the one-dimensional equivalent of the theory published in [5], which was summarized in section 2.4. As an example of a strong-inversion parameter, the impact of LWR on the threshold-voltage mismatch is calculated in subsection 6.2.1⁵. It was found in subsection 4.2.1 that deviations are to be expected in the weak-inversion regime, and the impact of LWR on the off-state current is therefore calculated separately in subsection 6.2.2. Subsection 6.2.3 models the impact of LWR on yield.

³ The off-state current can also be referred to as the leakage current.
⁴ All experimental results shown in this section are for a 130 nm technology with a nominal gate length of 100 nm, an oxide thickness of 1.5 nm and a supply voltage of 1.2 V.
⁵ For other strong-inversion parameters, e.g. the on-state current, the same approach can be followed.
[Figure 6.5: drawings not reproduced. Each panel shows the poly-silicon gate crossing the active area, with the local gate length $L_{local}$ indicated. Panel (a): $W \gg W_c$. Panel (b): $W \ll W_c$.]
Figure 6.5: Schematic top-view of a transistor with a much larger width than the correlation width of the LWR (a) and of a transistor with a much smaller width than this correlation width (b).
6.2.1 Impact of line-width roughness on the threshold voltage
Locally, i.e. at a certain value of $z$, the variation in threshold voltage due to LWR is calculated by:
\[
\sigma_{\Delta V_T,local} = \sqrt{2} \left| \frac{dV_T}{dL} \right| \sigma_{LWR}.
\tag{6.6}
\]
The overall variation in the mismatch of the threshold voltage is calculated by averaging the local variations over the width of the transistor. Mathematically this translates into:
\[
\sigma_{\Delta V_T}^2 = 2 \left( \frac{dV_T}{dL} \right)^2 [G * G * R_{LWR}](0)
= 2 \left( \frac{dV_T}{dL} \right)^2 \frac{2}{\alpha_1 W} \left( 1 - \frac{1}{\alpha_1 W} \left( 1 - e^{-\alpha_1 W} \right) \right) \cdot \sigma_{LWR}^2,
\tag{6.7}
\]
where the geometry function $G(z) = 1/W$ for $|z| < W/2$ and $G(z) = 0$ for $|z| > W/2$. The last equality holds when the LWR is described by a first-order autoregressive model. Two extreme cases are distinguished. Firstly, when the transistor is much wider than the correlation width ($W_c$) of the LWR, as is schematically displayed in figure 6.5a, then:
\[
\sigma_{\Delta V_T}^2 \cong \frac{2}{W} \left( \frac{dV_T}{dL} \right)^2 \int_{-\infty}^{\infty} R_{LWR}(z)\,dz = \frac{W_c}{W} \cdot \sigma_{\Delta V_T,local}^2,
\tag{6.8}
\]
from which it follows that $W_c$ is equal to the area under the autocorrelation function⁶. Since the maximum of the autocorrelation function is equal to 1, the total area under this function is a measure of its width. When the LWR is represented by a first-order autoregressive process, then $W_c = 2/\alpha_1$, which gives $W_c = 100$ nm for the case displayed in figure 6.3.

⁶ Note that the total area under the autocorrelation function is equal to the zero-frequency component of the normalized power spectrum.
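The width dependence in (6.7) and its wide-transistor limit (6.8) can be checked numerically with the short Python sketch below. The function names are hypothetical, and the sensitivity $dV_T/dL = 1$ mV/nm in the usage lines is an assumed value for illustration, not a measured one. For $\alpha_1 W \ll 1$ the averaging factor tends to 1 (local variation), while for $\alpha_1 W \gg 1$ it tends to $W_c/W$ with $W_c = 2/\alpha_1$.

```python
import math

def width_factor(width, alpha1):
    """[G*G*R_LWR](0) / sigma_LWR^2 for a first-order autoregressive LWR,
    i.e. the width-dependent factor in equation (6.7)."""
    a = alpha1 * width
    # -expm1(-a) equals 1 - exp(-a) but stays accurate for small a.
    return 2.0 / a * (1.0 + math.expm1(-a) / a)

def sigma_delta_vt(dvt_dl, sigma_lwr, width, alpha1):
    """Threshold-voltage mismatch due to LWR, combining (6.6) and (6.7)."""
    local = math.sqrt(2.0) * abs(dvt_dl) * sigma_lwr      # equation (6.6)
    return local * math.sqrt(width_factor(width, alpha1))

# sigma_LWR and alpha1 are the fit values quoted for figure 6.3;
# dVT/dL = 1.0 mV/nm is an assumed, illustrative sensitivity.
for width_nm in (10.0, 100.0, 1000.0, 10000.0):
    print(width_nm, sigma_delta_vt(1.0, 3.2, width_nm, 0.020))  # result in mV
```

The printed values decrease with increasing width, approaching the $1/\sqrt{W}$ behavior of (6.8) for the widest devices.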
[Figure 6.6: plots not reproduced. Panels: NMOS and PMOS. Axes: $\sqrt{W}\,\sigma(\Delta V_T)$ (mV µm^1/2) versus $L_{gate}$, both on logarithmic scales. Curves: experimental and LWR contribution; the flat part of the VT roll-off curve is marked.]
Figure 6.6: Experimental mismatch in the threshold voltage (symbols) and calculated mismatch due to LWR (lines). The calculations use the first-order autoregressive model to describe the LWR, for which the parameters are extracted from figure 6.3.

Secondly, in the other extreme case when $W \ll W_c$, as is displayed in figure 6.5b, (6.7) can be approximated by:
\[
\sigma_{\Delta V_T}^2 \cong 2 \left( \frac{dV_T}{dL} \right)^2 R_{LWR}(0) = \sigma_{\Delta V_T,local}^2.
\tag{6.9}
\]
This means that the mismatch-causing process does not have enough space to change the local length over the width of the transistor, so that the variation between transistors is equal to the local variation.

Figure 6.6 compares the experimentally obtained mismatch in threshold voltage with the calculated mismatch due to LWR. The $\sigma_{\Delta V_T}$ is normalized to $1/\sqrt{W}$. It is observed that LWR does not give a significant contribution to the fluctuations down to the minimum available gate length of 80 nm.
6.2.2 Impact of line-width roughness on the off-state current
The off-state current varies much more strongly with the gate length than the threshold voltage, as can be seen in figure 6.4. For this reason a linearization like in (6.6) is not accurate. However, the logarithm of the off-state current can be linearized:
\[
\sigma_{ln} \equiv \sigma_{\Delta' \ln(I_{off}),local} = \left| \frac{d\ln(I_{off})}{dL} \right| \sigma_{LWR}.
\tag{6.10}
\]
Locally, this results in a lognormal distribution of the off-state current due to LWR. In our analysis we consider the ratio of the off-state current of a device that suffers from LWR and the off-state current of an ideal device with no LWR. This is denoted as $r'I_{off}$. Firstly, consider the average of this ratio ($\mu_{r'I_{off}}$), which ideally is equal to one. However, the asymmetry of the lognormal distribution causes an increase in the off-state current, which is given by:
\[
\mu_{r'I_{off}} = e^{\sigma_{ln}^2/2},
\tag{6.11}
\]
and is independent of the transistor width. Using $\sigma_{LWR} = 3.2$ nm and figure 6.4, it follows that an increase of about 20 % is expected for the $L_{gate} = 80$ nm transistors, which is relatively small.

Secondly, consider the fluctuations in $r'I_{off}$. Because of the nonlinearity introduced by the lognormal distribution, the shape of the autocovariance function of this ratio ($R_{r'I_{off}}(d)$) is not the same as that of the LWR. In order to calculate $R_{r'I_{off}}(d)$ we will assume that the current only flows in the x-direction. Then, by using (3.4), it is found that:
\[
R_{r'I_{off}}(d) = e^{\sigma_{ln}^2} \cdot \left( e^{\rho_{LWR}(d) \cdot \sigma_{ln}^2} - 1 \right),
\tag{6.12}
\]
where $\rho_{LWR}(d)$ is the autocorrelation function of the LWR. The variance in $r'I_{off}$ is now given by:
\[
\sigma_{r'I_{off}}^2 = [G * G * R_{r'I_{off}}](0).
\tag{6.13}
\]
For wide transistors ($W \gg W_{c,I_{off}}$) this equation simplifies to:
\[
\sigma_{r'I_{off}}^2 \cong e^{\sigma_{ln}^2} \cdot \left( e^{\sigma_{ln}^2} - 1 \right) \frac{W_{c,I_{off}}}{W},
\tag{6.14}
\]
where $W_{c,I_{off}} \equiv \int_{-\infty}^{\infty} R_{r'I_{off}}(z)/R_{r'I_{off}}(0)\,dz$. For the first-order autoregressive process shown in figure 6.3, $W_{c,I_{off}} = 90$ to $100$ nm, depending on the sensitivity of the off-state current to the gate length as displayed in figure 6.4. For narrow transistors (6.13) simplifies to:
\[
\sigma_{r'I_{off}}^2 \cong e^{\sigma_{ln}^2} \cdot \left( e^{\sigma_{ln}^2} - 1 \right).
\tag{6.15}
\]
It is not straightforward to compare this theory to experiment, because $r'I_{off}$ is defined with respect to an ideal device of which we do not know the electrical properties. This problem is overcome by matching two transistors in such a way that the ideal device drops out of the equation. This is obtained by analyzing the mismatch in the logarithm of the off-state current ($\Delta\ln(I_{off})$). In order to calculate the variance of this quantity from (6.13), the distribution of $r'I_{off}$ needs to be known. For a very narrow device this distribution is expected to be lognormal, while it is normal for a very wide device, as follows from the central limit theorem. However, generally the distribution lies somewhere in between these extreme cases. As an approximation, we will assume $r'I_{off}$ to be lognormally distributed with a mean given by (6.11) and a variance given by (6.13). This results in:
\[
\sigma_{\Delta\ln(I_{off})}^2 \approx 2 \ln\left( \sigma_{r'I_{off}}^2 \cdot e^{-\sigma_{ln}^2} + 1 \right).
\tag{6.16}
\]
This equation gives correct results for very narrow and very wide transistors. For the intermediate cases the inaccuracy has been checked numerically and it is found to be smaller than 20 % when $\sigma_{ln}^2 < 1.0$. Figure 6.7 compares the experimentally obtained mismatch in the off-state current with the calculated mismatch due to LWR. As for the threshold voltage, it is observed that LWR does not give a significant contribution to the fluctuations down to the minimum available gate length of 80 nm.
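The chain (6.10) to (6.16) can be evaluated numerically as in the Python sketch below. This is an illustration under stated assumptions rather than the calculation used for figure 6.7: the function name is hypothetical, the LWR is taken as the first-order autoregressive process of (6.4), the convolution in (6.13) is written for the rectangular geometry function $G$ as $(2/W^2)\int_0^W (W-d)\,R(d)\,dd$, and the sensitivity $d\ln(I_{off})/dL = 0.19$ nm$^{-1}$ is an assumed value chosen so that (6.11) gives roughly the 20 % increase mentioned in the text.

```python
import math

def off_current_stats(dln_ioff_dl, sigma_lwr, alpha1, width, n_steps=4000):
    """Return (mean ratio, variance of ratio, variance of Delta ln(Ioff))."""
    sigma_ln2 = (dln_ioff_dl * sigma_lwr) ** 2             # square of (6.10)
    mean_ratio = math.exp(sigma_ln2 / 2.0)                 # (6.11)

    def r_ratio(d):                                        # (6.12)
        rho = math.exp(-alpha1 * abs(d))                   # first-order LWR
        return math.exp(sigma_ln2) * (math.exp(rho * sigma_ln2) - 1.0)

    # (6.13) for a rectangular G: (2 / W^2) * integral_0^W (W - d) R(d) dd,
    # evaluated with the midpoint rule.
    step = width / n_steps
    integral = 0.0
    for k in range(n_steps):
        d = (k + 0.5) * step
        integral += (width - d) * r_ratio(d) * step
    var_ratio = 2.0 * integral / width ** 2

    var_dln = 2.0 * math.log(var_ratio * math.exp(-sigma_ln2) + 1.0)  # (6.16)
    return mean_ratio, var_ratio, var_dln

# sigma_LWR = 3.2 nm and alpha1 = 0.020 1/nm are the fit values of figure 6.3;
# the sensitivity 0.19 1/nm and the widths (in nm) are assumed for illustration.
for width_nm in (10.0, 100.0, 1000.0):
    print(width_nm, off_current_stats(0.19, 3.2, 0.020, width_nm))
```

For a very narrow device the last returned value approaches $2\sigma_{ln}^2$, in line with the narrow-transistor limit (6.15) inserted into (6.16).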
[Figure 6.7: plots not reproduced. Panels: NMOS and PMOS. Axes: $\sqrt{W}\,\sigma(\Delta\ln(I_{off}))$ (µm^1/2) versus $L_{gate}$, both on logarithmic scales. Curves: experimental and LWR contribution.]
Figure 6.7: Experimental mismatch in the logarithm of the off-state current (symbols) and calculated mismatch due to LWR (lines). The calculations use the first-order autoregressive model to describe the LWR, for which the parameters are extracted from figure 6.3.
6.2.3 Impact of line-width roughness on yield
Besides causing parameter fluctuations and increasing the off-state current, LWR can also decrease yield. To calculate this decrease, it is assumed that a device fails when it is locally shorter than a certain critical gate length ($L_{crit}$). The probability ($p_{local}$) that this happens at a specific location follows from the normal distribution of the LWR and is given by:
\[
p_{local} = \frac{1}{2}\,\mathrm{erfc}\left( \frac{L_{gate} - L_{crit}}{\sqrt{2}\,\sigma_{LWR}} \right),
\tag{6.17}
\]
where erfc is the complementary error function. We now assume that the device consists of $W/W_c$ segments when $W > W_c$ and of one segment when $W < W_c$. Within one segment the line width is constant, the standard deviation of the line width is equal to $\sigma_{LWR}$, and the deviation of one segment from the average is uncorrelated with the deviations of the other segments. From this it follows that the probability that one device fails ($p_{device}$) is equal to:
\[
p_{device} = 1 - (1 - p_{local})^{\max(1,\,W/W_c)}.
\tag{6.18}
\]
When a circuit contains $N_{device}$ devices and the circuit fails when one transistor fails, the circuit yield is given by:
\[
\mathrm{yield} = (1 - p_{device})^{N_{device}}.
\tag{6.19}
\]
As an example, consider a 1 Mbit SRAM, which has 6 million minimum-size transistors. We assume $W < W_c$ and $L_{crit} = 0.7 L_{gate}$. If we allow for a maximum yield loss due to LWR of 0.5 %, then the requirement on LWR is $6\sigma_{LWR} < 0.3 L_{gate}$, which is equal to the ITRS roadmap requirement. Requirements on LWR based on parameter fluctuations and the off-state current will be presented in section 6.4.
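The SRAM example can be reproduced approximately with the Python sketch below. It is a minimal illustration: the function names are hypothetical, the gate length of 100 nm is an assumed value (only the ratios $L_{crit}/L_{gate}$ and $\sigma_{LWR}/L_{gate}$ matter), and the exponent in the device failure probability is taken as the number of segments described in the text, i.e. $W/W_c$ for $W > W_c$ and one segment otherwise. `math.log1p` and `math.expm1` are used only to keep the very small probabilities numerically accurate.

```python
import math

def p_local(l_gate, l_crit, sigma_lwr):
    """Probability that one segment is shorter than l_crit, equation (6.17)."""
    return 0.5 * math.erfc((l_gate - l_crit) / (math.sqrt(2.0) * sigma_lwr))

def p_device(l_gate, l_crit, sigma_lwr, width, w_c):
    """Device failure probability for max(1, W/Wc) uncorrelated segments."""
    segments = max(1.0, width / w_c)
    p = p_local(l_gate, l_crit, sigma_lwr)
    # 1 - (1 - p)^segments, written so that tiny p does not lose precision.
    return -math.expm1(segments * math.log1p(-p))

def circuit_yield(p_dev, n_devices):
    """Yield of a circuit that fails when any one device fails, (6.19)."""
    return math.exp(n_devices * math.log1p(-p_dev))

# Assumed example: Lgate = 100 nm, Lcrit = 0.7 * Lgate, 6 * sigma_LWR = 0.3 * Lgate,
# W = 50 nm < Wc = 100 nm, and 6 million transistors.
p_dev = p_device(100.0, 70.0, 5.0, 50.0, 100.0)
print(p_dev, 1.0 - circuit_yield(p_dev, 6_000_000))  # failure prob., yield loss
```

With these inputs the yield loss comes out at roughly half a percent, consistent with the order of magnitude of the requirement quoted above.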
(a) no extra LWR
(b) extra LWR
(c) one side
(d) in phase
(e) 90◦ phase difference
(f) 180◦ phase difference
Figure 6.8: Top-view SEM pictures of the gates of the special transistors that were fabricated using e-beam patterning to study LWR effects
6.3 Experimental investigation of the impact of line-width roughness
As was shown in the previous section, LWR does not significantly affect MOSFET behavior down to gate lengths of 80 nm. Therefore, in order to experimentally investigate LWR, it needs to be artificially increased. This will allow us to make predictions for future technologies. The setup of our experiments is described in the first subsection of this section. Results are given in the second and third subsections while the last subsection deals with the issue of yield.
6.3.1 Experimental setup
To create transistors with extra rough gates, electron-beam (e-beam) lithography was used. E-beam lithography has the advantage that it can produce any gate shape with a resolution of approximately 20 nm. This allowed us to create transistors with sinusoidally shaped gate edges, as displayed in figures 6.8c-f. Four types of such transistors were fabricated: 1) transistors with one sinusoidal edge and the other edge smooth (figure 6.8c), and transistors with two sinusoidally shaped edges that 2) are in phase (figure 6.8d), 3) have 90° phase difference (figure 6.8e), and 4) have 180° phase difference (figure 6.8f). As reference, normal transistors were also available (figure 6.8a). The transistors are made in the 130 nm process that was described earlier. The examined average gate lengths range from 50 nm to 100 nm, while the gate width was fixed at 1.0 µm. Only NMOS transistors were available. In the first experiment, the amplitude of the sinusoidally shaped edges of the gates was varied from 0 % to 40 % of the average gate length, while the period was kept constant at 120 nm. In the second experiment, the period was varied from 40 nm to 1000 nm at a constant amplitude of 20 % of the gate length. A SEM picture was made of each of the fabricated transistors, from which the exact line shapes were extracted. This information is required later to calculate the expected drain current of the transistors.

Besides transistors with sinusoidally shaped gate edges, transistors with extra rough edges have also been fabricated, as displayed in figure 6.8b. This was achieved by randomly varying the e-beam dose along the edges of the transistor. Two varieties of extra rough transistors were measured, one with medium extra-rough gates and the other with strong extra-rough gates. The respective autocovariance functions of the LWR are displayed in figure 6.9. Per transistor length 65 device pairs were available.

[Figure 6.9: plots not reproduced. Panels: (a) medium extra rough, (b) strong extra rough. Axes: autocovariance (nm²) versus d (nm). Curves: original and corrected. Annotations in panel (a): σLWR = 6.2 nm with Wc = 80 nm, and σLWR = 5.5 nm with Wc = 100 nm. Annotations in panel (b): σLWR = 8 nm with Wc = 125 nm, and σLWR = 7.4 nm with Wc = 135 nm.]
Figure 6.9: Autocovariance functions of the LWR of transistors with medium extra rough gates and with strong extra rough gates. Circles: based on the extracted gate edges. Squares: compensated for the smoothing out of the LWR due to diffusion of the extensions.
6.3.2 Sinusoidally-shaped gate edges
In this subsection the experimental results for the transistors with sinusoidally shaped gate edges are compared with calculations. As a first approximation, the current is calculated by:
\[
I_D = \frac{1}{W} \int_W I_{D,noLWR}(L_{local}(z))\,dz,
\tag{6.20}
\]
where $I_{D,noLWR}(L_{local}(z))$ is the drain current of an ideal transistor without LWR with a gate length of $L_{local}(z)$. This ideal current is measured on the reference transistors, which have approximately straight edges. Only a limited discrete set of reference gate lengths is available, and an interpolation algorithm is used to determine $I_{D,noLWR}$ for all available local gate lengths.

[Figure 6.10: plots not reproduced. Panels (a), (c), (e): calculated Ioff (A) versus experimental Ioff (A). Panels (b), (d), (f): calculated Ion (A) versus experimental Ion (A). Panel titles: (a), (b) simple calculation; (c), (d) using smoothing window; (e), (f) varying period. Legends: average gate lengths of 50, 60, 70 and 100 nm; arrows mark increasing amplitude.]
Figure 6.10: Comparison of calculated and experimental off-state currents (a+c+e) and on-state currents (b+d+f) for the transistors with one sinusoidal gate edge. The average length of the transistors is given in the legends and the arrows indicate increasing amplitude from 0 % to 40 % of the gate length (a-d). a+b) Equation 6.20 is used in the calculation. c+d) The smoothed length (6.21) is used in the calculation. e+f) The amplitude is fixed at 20 % of the gate length and the period is varied.

[Figure 6.11: drawings not reproduced. Panels (a) to (d).]
Figure 6.11: Schematic top-view drawings of transistors with one sinusoidal gate edge (a+b) and two sinusoidal gate edges in phase (c+d). The full lines represent the gate edge and the dashed lines the tips of the source and drain regions. The arrows denote the directions of the current flows.

Figures 6.10a+b compare the calculated off- and on-state currents with the experimental data for the transistors of which one edge has the sinusoidal shape. These figures should be read as follows. On the x-axis the experimentally obtained current is plotted and on the y-axis the calculated current. Each symbol represents one transistor and the shape of the symbol is related to its average length. For each average length results are shown for amplitudes of the sine of 0 % (reference), 5 %, 10 %, 20 % and 40 % of the gate length. The amplitude increases in the direction of the arrows. For the off-state current (figure 6.10a) it is observed that (6.20) overestimates the experimentally observed current for increasing amplitudes. This can be explained by the fact that, by using (6.20), we have implicitly assumed that the tips of the extension regions exactly follow the gate, as is schematically displayed in figure 6.11a. However, in reality the roughness is smoothed out because of diffusion of the extensions during the processing after their implantation. This results in the situation displayed in figure 6.11b. Mathematically this can be taken into account by replacing $L_{local}$ in (6.20) by $L_{smooth}$, which is given by:
\[
L_{smooth}(z) = \frac{1}{2W_{smooth}} \int_{z-W_{smooth}}^{z+W_{smooth}} L_{local}(z')\,dz',
\tag{6.21}
\]
where $2W_{smooth}$ is the width of the applied rectangular smoothing window. Figure 6.10c compares the experimental results with the new calculations, in which $W_{smooth}$ was used as a fitting parameter and found to be equal to 30 nm. A reasonable agreement between calculation and experiment is observed. Note that the value of 30 nm is comparable to the junction depth of the extensions. Therefore it can be considered a realistic value.

For the on-state current (figure 6.10b) it is seen that the calculation gives a reasonable prediction of the experimental results, except for some extreme points. Applying the smoothing window does not significantly change the picture, as can be seen in figure 6.10d. This demonstrates that in strong inversion the effect of varying gate length can indeed be linearized to first order. The smoothing reduces the increase in the current for the shorter parts of the channel, while it also reduces the decrease in current for the longer parts. In the linear approximation these two effects exactly compensate each other. Figures 6.10e+f compare the calculations of the off- and on-state current with experimental data for the case where the amplitude is fixed and the period is varied. The smoothing window is applied. A good agreement between experiment and theory is observed.

[Figure 6.12: plots not reproduced. Panels (a), (c): calculated Ioff (A) versus experimental Ioff (A). Panels (b), (d): calculated Ion (A) versus experimental Ion (A). Panel titles: (a), (b) current in x-direction; (c), (d) shortest distance. Legends: average gate lengths of 50, 60, 70 and 100 nm; arrows mark increasing amplitude.]
Figure 6.12: Comparison of calculated and experimental off-state current (a+c) and on-state current (b+d) for the transistors with two sinusoidal gate edges in phase. The average length of the transistors is given in the legends and the arrows indicate increasing amplitude from 0 % to 40 % of the gate length. a+b) The current is assumed to flow in the x-direction. c+d) The current is assumed to flow in the direction of the shortest distance between source and drain.

Now consider the case where both gate edges are sinusoidally shaped and in phase. Figures 6.12a+b compare the calculations of the off- and on-state current with experimental data for the case where the period is fixed and the amplitude is varied. The
smoothed gate length was used in the calculations. It is observed that the current is seriously underestimated. This is caused by the assumption that the current only flows in the x-direction, as is schematically displayed in figure 6.11c. Since both edges are in phase, the gate length does not vary and the calculated current is independent of the amplitude. However, in reality the current mainly flows in the direction of the shortest distance between source and drain, as is shown in figure 6.11d. When the length in (6.20) is replaced with this shortest distance, while still taking into account the smoothing of (6.21), the calculation is found to accurately describe the experimental data, as is shown in figures 6.12c+d.

Figure 6.13: Comparison of calculated and experimental off-state current (a+c) and on-state current (b+d) for the transistors with two sinusoidal gate edges with a phase difference of 90° (a+b) and 180° (c+d). The average length of the transistors is given in the legends and the arrows indicate increasing amplitude from 0 % to 20 % of the gate length. [Axes: calculated Ioff (A) versus experimental Ioff (A), and calculated Ion (A) versus experimental Ion (A); legend entries 50, 60, 70 and 100.]

Finally, figure 6.13 shows the results for the cases where the two edges have a phase difference of 90° and of 180°. In these cases, to avoid excessively short transistors, the amplitude per edge is varied from 0 % to 20 %, instead of 40 %. The experimental data are seen to be well described. In the calculation a smoothing window with Wsmooth = 30 nm is applied and the current is assumed to flow in the direction of the shortest distance between source and drain.
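The effect of the smoothing window (6.21) on a sinusoidal gate edge can be illustrated with a short numerical sketch. The geometry below (period, average length, amplitude) and the exponential off-state weighting with constant k are hypothetical values chosen for illustration; they are not the measured devices, and (6.20) itself is not reproduced. The edge is also assumed to be periodic along the width. The sketch only shows that the rectangular window leaves the average length untouched, which is why the linearized on-state current does not change, while it reduces the amplitude and therefore the increase in a strongly non-linear quantity such as the off-state current.

```python
import numpy as np

def smooth_local_length(l_local, dz, w_smooth):
    """Average l_local over a rectangular window of total width 2*w_smooth, cf. (6.21).

    The edge is treated as periodic along the width (z) direction, which is
    an assumption of this sketch that suits sinusoidal test structures.
    """
    half = int(round(w_smooth / dz))
    if half == 0:
        return np.array(l_local, dtype=float)
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    # Periodic padding so that the output has the same length as the input.
    padded = np.concatenate([l_local[-half:], l_local, l_local[:half]])
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical example geometry (not taken from the measured devices).
dz = 1.0                 # nm, grid step along the width
period = 200.0           # nm, period of the sinusoidal edge
l_avg = 60.0             # nm, average gate length
amplitude = 0.2 * l_avg  # nm, 20 % of the gate length
w_smooth = 30.0          # nm, the fitted value quoted in the text

z = np.arange(0.0, 5 * period, dz)
l_local = l_avg + amplitude * np.sin(2 * np.pi * z / period)
l_smooth = smooth_local_length(l_local, dz, w_smooth)

# Toy off-state weighting: local current ~ exp(-k * L); k is illustrative only.
k = 0.3                  # 1/nm
ioff_raw = np.mean(np.exp(-k * (l_local - l_avg)))
ioff_smooth = np.mean(np.exp(-k * (l_smooth - l_avg)))

print("mean length before/after:", l_local.mean(), l_smooth.mean())
print("relative Ioff increase before/after smoothing:", ioff_raw, ioff_smooth)
```

For a sinusoid the rectangular window scales the amplitude by approximately sin(x)/x with x = 2π·Wsmooth/period, so the smoothing is most effective for short periods.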
6.3.3 Extra rough gates
To evaluate the impact that the diffusion of the extension regions has on the LWR of the extra rough gates, (6.21) is applied to the extracted local gate lengths. From this, new autocovariance functions (RLWR,smooth) are extracted, which are compared to the original autocovariance functions in figure 6.9. It is observed that $R(0) \equiv \sigma_{LWR}^2$ decreases. This means that the increase in off-state current due to LWR decreases because of smoothing. However, together with the decrease in $\sigma_{LWR}^2$, Wc is seen to increase in such a way that the area under the autocovariance functions remains unchanged. In other words, smoothing out of the roughness does not change the magnitude of the parameter fluctuations. Mathematically the impact of smoothing on the autocovariance function can be calculated as follows:

$$R_{LWR,smooth}(d) = [S_W * S_W * R_{LWR}](d),$$ (6.22)
where the smoothing window SW(z) = 1/(2Wsmooth) for |z| < Wsmooth and SW(z) = 0 for |z| > Wsmooth. Since the area under SW(z) is equal to one, it immediately follows that smoothing indeed does not change the area under the autocovariance function. Applying (6.22) to the autocovariance function of the first-order autoregressive process (6.4) gives:

$$\sigma_{LWR,smooth}^2 = \frac{W_c}{2W_{smooth}} \left( 1 - \frac{W_c}{4W_{smooth}} \left( 1 - e^{-4W_{smooth}/W_c} \right) \right) \cdot \sigma_{LWR}^2,$$ (6.23)

where $\sigma_{LWR,smooth}^2 = R_{LWR,smooth}(0)$. From this it follows that smoothing effectively reduces σLWR when Wc < 4Wsmooth and that it is more effective for smaller Wc. Reducing Wc also reduces the parameter fluctuations, as follows from (6.8) and (6.14), but it could also decrease yield, as follows from (6.18) and (6.19).

It is more difficult to calculate the impact of the fact that the current does not flow purely in the x-direction. Numerical evaluation of the extra rough lines revealed no significant changes in the autocovariance functions. However, the average gate length is found to be reduced by approximately 0.5 nm. This is considered to be insignificant and will be neglected.

We will now look at some experimental results. Figure 6.14 shows the Ion-Ioff curves for the transistors with no extra LWR and for the transistors with strong extra LWR. The symbols represent measurement data and the full line shows the average for the transistors without extra LWR. From this average, the Ion-Ioff curve for the transistors with extra rough gates is calculated using (6.11) with σLWR,smooth (dashed line). Reasonable agreement with the experimental data is observed, but the data show a lot of scatter. In a future experiment it would be advisable to increase the population size above the 65 device pairs per geometry that were available for this experiment. This would also allow the investigation of the expected increase in parameter fluctuations itself.

Figure 6.14: Ion-Ioff curves for transistors without extra roughness and for transistors with strong extra roughness. Symbols represent experimental data and the full line gives the average for the transistors with no extra roughness. The dashed line is the calculated curve for the extra rough transistors. The full Ion-Ioff curve is shown in (a) and the first part of the curve in (b). [Axes: Ioff/W (A µm^-1) versus Ion/W (A µm^-1).]
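As a numerical cross-check of (6.22) and (6.23), the sketch below evaluates the smoothed variance in two ways: with the closed form (6.23), and by integrating the autocovariance function against the triangular kernel SW ∗ SW. Since (6.4) is not repeated in this section, the autocovariance is assumed here to have the form σLWR² exp(−2|d|/Wc), which is the form that reproduces (6.23) and has an area of σLWR²·Wc. The function names and the example values Wc = Wsmooth = 30 nm are illustrative.

```python
import numpy as np

def sigma2_smooth_closed_form(sigma2, w_c, w_smooth):
    """Smoothed variance according to (6.23)."""
    x = 4.0 * w_smooth / w_c
    return sigma2 * (w_c / (2.0 * w_smooth)) * (1.0 - (1.0 - np.exp(-x)) / x)

def sigma2_smooth_numeric(sigma2, w_c, w_smooth, n=200001):
    """R_smooth(0) from (6.22): integral of (S_W * S_W)(d) times R(d).

    Assumes R(d) = sigma2 * exp(-2|d|/w_c) for the autoregressive process.
    """
    d = np.linspace(-2.0 * w_smooth, 2.0 * w_smooth, n)
    # Two rectangular windows of width 2*w_smooth convolve to a triangle
    # with base 4*w_smooth and unit area.
    triangle = (1.0 - np.abs(d) / (2.0 * w_smooth)) / (2.0 * w_smooth)
    r = sigma2 * np.exp(-2.0 * np.abs(d) / w_c)
    f = triangle * r
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(d)))

closed = sigma2_smooth_closed_form(1.0, 30.0, 30.0)
numeric = sigma2_smooth_numeric(1.0, 30.0, 30.0)
print(closed, numeric)
```

With these example values the variance is reduced to roughly 0.38 of its original value, in line with the statement that smoothing is effective when Wc < 4Wsmooth.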
6.3.4 Yield
To illustrate the impact of LWR on yield, a criterion with respect to punch through is defined. Figure 6.15a shows the range of measured drain currents as a function of the gate bias for the population of transistors with no extra roughness and for the population of transistors with strong extra roughness. The gate length in this figure is equal to Lgate = 57 nm. In the presence of extra roughness, it is observed that the upper boundary of this range bends upwards at low values of the gate bias. This shows that the gate starts to lose its control over the channel and it is an indication of the onset of punch through. Therefore, as yield criterion we will say that a transistor fails when $d^2 \ln(I_{off})/dV_{GS}^2 > 0$ at VGS = 0 V. Using this criterion, figure 6.15b shows the experimentally obtained device yield (symbols) as a function of the device width for transistors with an average gate length of Lgate = 62 nm. Results are shown for the case with no extra roughness, medium extra roughness and strong extra roughness. It is seen that the yield decreases with increasing LWR and with increasing device width. This is in agreement with the analysis presented in subsection 6.2.3. The model from that subsection is seen to be in reasonable agreement with the measurement data for Lcritical = 50 nm.
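The yield criterion can be applied to a measured ID-VGS curve as sketched below. The text does not state how the second derivative was evaluated, so the quadratic fit of ln(ID) in a small window around VGS = 0 V, the window size and the synthetic curves are all assumptions made for this illustration.

```python
import numpy as np

def fails_punch_through(v_gs, i_d, window=0.25):
    """Return True when d^2 ln(I_D)/dV_GS^2 > 0 at V_GS = 0 V.

    The curvature is estimated from a quadratic fit of ln(I_D) over the
    points with |V_GS| <= window (an assumption of this sketch).
    """
    v = np.asarray(v_gs, dtype=float)
    ln_i = np.log(np.asarray(i_d, dtype=float))
    near_zero = np.abs(v) <= window
    curvature = 2.0 * np.polyfit(v[near_zero], ln_i[near_zero], 2)[0]
    return bool(curvature > 0.0)

def device_yield(v_gs, currents):
    """Fraction of devices that pass the criterion."""
    failed = [fails_punch_through(v_gs, i_d) for i_d in currents]
    return 1.0 - float(np.mean(failed))

# Synthetic curves: 'good' bends downwards in ln(I_D), 'bad' bends upwards.
v_gs = np.linspace(0.0, 1.2, 25)
good = np.exp(-20.0 + 25.0 * v_gs - 8.0 * v_gs**2)
bad = np.exp(-14.0 + 6.0 * v_gs + 5.0 * v_gs**2)

print(device_yield(v_gs, [good, good, bad, good]))
```

On real data the fit window has to be wide enough to suppress measurement noise, yet narrow enough that the transition to strong inversion does not dominate the fitted curvature.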
Figure 6.15: a) Upper and lower boundaries of the measured population of drain currents as a function of the gate bias (L = 57 nm, W = 1 µm). Results are shown for transistors with no extra roughness (full lines) and for transistors with strong extra roughness (dashed lines). b) Device yield as a function of the device width for transistors with no extra roughness, medium extra roughness, and strong extra roughness (symbols; L = 62 nm). Lines represent the model from subsection 6.2.3 with Lcrit = 50 nm. [Axes: a) ID (A) versus VGS (V); b) fraction of working devices versus W (µm).]
6.4 Prediction of the impact of line-width roughness and scaling guidelines
After deriving and experimentally testing models regarding the impact of LWR, we will now use these models to predict the impact of LWR on future technologies. Based on this, requirements for LWR will be specified. In order to be able to make these predictions, it is necessary to know dVT/dL and dln(Ioff)/dL. For this, 2D device simulations have been employed. Technology parameters are taken from the ITRS roadmap and are listed in table 6.1. The channel doping is assumed to be uniform, which means that the halos are fully overlapping. The doping level is chosen in such a way that the ITRS requirements regarding the on-state current are met. Both the high-performance and the low-power option are investigated. Also listed in table 6.1 are the simulation results regarding the threshold voltage, on-state current, off-state current, dVT/dL and dln(Ioff)/dL. For the 130 nm technology node, it is observed that the simulated values of these parameters are approximately equal to their experimental counterparts for the NMOS transistors (also see figure 6.4). This provides some confidence in the simulated results for future technologies.

Table 6.1: Input for and results of the 2D simulations, used to determine the sensitivity to the gate length of the threshold voltage at VDS = 50 mV and of the off-state current at VDS = VDD.

node (nm)                  130          90           65           45           32
high performance
  VDD (V)                  1.2          1.0          0.9          0.7          0.5
  tox,eff (nm)             2.3          2.0          1.9          1.4          1.0
  Lchannel (nm)            65           45           32           25           13
  NA (cm^-3)               1.3·10^18    1.3·10^18    1.5·10^18    1.6·10^18    2.3·10^18
  VT (V)                   0.290        0.209        0.193        0.124        0.075
  Ion (µA µm^-1)           869          915          913          924          897
  Ioff (A µm^-1)           8.5·10^-9    4.0·10^-7    2.6·10^-6    7.2·10^-6    3.0·10^-5
  dVT/dL (mV nm^-1)        3.05         5.42         9.09         9.56         18.5
  dln(Ioff)/dL (% nm^-1)   -23.7        -32.6        -43.4        -42.3        -56.2
low power
  VDD (V)                  1.2          1.1          1.0          0.9          0.7
  tox,eff (nm)             3.0          2.6          2.2          1.7          1.4
  Lchannel (nm)            90           65           45           32           16
  NA (cm^-3)               0.8·10^18    1.1·10^18    1.7·10^18    2.8·10^18    3.4·10^18
  VT (V)                   0.320        0.329        0.359        0.375        0.309
  Ion (µA µm^-1)           600          619          607          612          599
  Ioff (A µm^-1)           9.8·10^-10   2.5·10^-9    3.3·10^-9    2.8·10^-9    1.5·10^-7
  dVT/dL (mV nm^-1)        2.13         2.74         5.04         7.84         18.3
  dln(Ioff)/dL (% nm^-1)   -13.7        -21.5        -33.8        -45.8        -86.4

Knowing the sensitivities of the threshold voltage and off-state current to the channel length, we can calculate the expected threshold voltage fluctuations and increase in off-state current, using the theory that was presented in section 6.2. In these calculations the LWR is assumed to be described by the first-order autoregressive process,
displayed in figure 6.2. Smoothing out of the roughness because of diffusion of the extension regions has not been taken into account, because this effect is expected to decrease for future technologies.

Figure 6.16a shows the expected threshold voltage mismatch (symbols) as a function of the channel lengths for the examined technology nodes. Also shown are the expected fluctuations, based on the scaling law that was presented in section 5.4 (dashed line). It is observed that, for a well optimized technology, LWR starts to become important for channel lengths below 40 nm. When the mismatch due to LWR is required to be smaller than the expected fluctuations without LWR, this gives a maximum to the allowed √Wc·σLWR. This requirement is plotted in figure 6.16b. The dashed line represents the current status of what a gate-patterning process can achieve. When LWR does not decrease for future technologies and we wish to keep parameter fluctuations under control, the transistor needs to be optimized in such a way that dVT/dL < 7 mV nm^-1. More generally, it can be stated that the magnitude of the LWR (partly) determines the size of the design space for MOSFET development.

Figure 6.16: a) Predicted mismatch in the threshold voltage due to LWR (symbols) as a function of the channel length. The dashed line gives the expected overall mismatch in the threshold voltage. b) Requirement on the LWR to keep the threshold-voltage mismatch below the overall expected value. The technology nodes corresponding to the channel lengths are listed in table 6.1. [Axes: a) √W·σ(ΔVT) (mV µm^1/2) versus Lchannel (nm); b) √Wc·σLWR (nm^3/2) versus Lchannel (nm); the dashed line in b) marks present-day lithography.]

Now consider the off-state current. Figure 6.17a shows the expected increase in off-state current as a function of the channel length. It is observed that below channel lengths of 40 nm this increase is larger than a factor 2. If we require the increase to be smaller, this results in a maximum allowed σLWR. This requirement is plotted in figure 6.17b. When LWR does not decrease for future technologies and we wish to keep the increase in off-state current under control, the transistor needs to be optimized in such a way that |dln(Ioff)/dL| < 37 % nm^-1. Again it can be stated that the magnitude of the LWR (partly) determines the size of the design space for MOSFET development.

Figure 6.17: a) Predicted increase in the off-state current due to LWR (symbols) as a function of the channel length. b) Requirement on the LWR to keep the increase below a factor 2. The technology nodes corresponding to the channel lengths are listed in table 6.1. [Axes: a) increase in Ioff (-) versus Lchannel (nm); b) σLWR (nm) versus Lchannel (nm); the dashed line in b) marks present-day lithography.]

Note that the requirement shown in figure 6.17b is somewhat more relaxed than the ITRS roadmap requirement, which is based on yield. The ITRS roadmap makes no mention of the correlation width Wc. Therefore, in addition to the ITRS roadmap, we propose a new figure of merit for LWR, namely √Wc·σLWR, for which the requirements are given in figure 6.16b.
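The off-state requirement can be made concrete with a small sketch. It assumes that ln(Ioff) depends linearly on the local gate length and that the local length is normally distributed, so that the local off-state current is log-normal and its mean is raised by the standard log-normal factor exp(s²/2), with s = |dln(Ioff)/dL|·σLWR. Equation (6.11) is not reproduced here and may contain further terms, so this is an approximation under the stated assumptions. The sensitivities are the high-performance values of table 6.1; the value σLWR = 3 nm and the function names are hypothetical.

```python
import math

# |dln(Ioff)/dL| in 1/nm per channel length in nm (table 6.1, high performance;
# the table lists these sensitivities in % per nm).
HP_SENSITIVITY = {65: 0.237, 45: 0.326, 32: 0.434, 25: 0.423, 13: 0.562}

def ioff_increase(sensitivity_per_nm, sigma_lwr_nm):
    """Mean off-state current relative to the roughness-free value.

    Log-normal mean exp(s^2 / 2) with s = |dln(Ioff)/dL| * sigma_LWR.
    """
    s = abs(sensitivity_per_nm) * sigma_lwr_nm
    return math.exp(0.5 * s * s)

def max_sigma_lwr(sensitivity_per_nm, max_increase=2.0):
    """Largest sigma_LWR (nm) that keeps the increase below max_increase."""
    return math.sqrt(2.0 * math.log(max_increase)) / abs(sensitivity_per_nm)

sigma_lwr = 3.0  # nm, hypothetical roughness of the patterning process
for l_channel, sens in HP_SENSITIVITY.items():
    print(l_channel, "nm:",
          "increase x%.2f," % ioff_increase(sens, sigma_lwr),
          "allowed sigma_LWR %.2f nm" % max_sigma_lwr(sens))
```

Under these assumptions a sensitivity of 37 % nm^-1 corresponds to a maximum σLWR of about 3.2 nm for a factor-2 increase, and the 32 nm channel length exceeds a factor 2 at σLWR = 3 nm while the 45 nm channel length does not.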
6.5 Conclusions
This chapter investigated the impact of line-width roughness on MOSFET mismatch, off-state current and yield. The LWR was described by a first-order autoregressive process, which is represented by an autocovariance function. The magnitude of this autocovariance function was found to be equal to $\sigma_{LWR}^2$ and its width equal to Wc. It was reasoned that line-width roughness (LWR) affects MOSFET parameters through their dependence on the gate length. In the calculation of the impact of LWR on threshold-voltage mismatch all equations were linearized. The variance followed from standard matching theory and was calculated by twice taking the convolution of the autocovariance function with the geometry function and by multiplying the result with the squared sensitivity of the threshold voltage to the gate length. For wide transistors the variance was found to be inversely proportional to the transistor width, which is the one-dimensional equivalent of the one-over-area model presented earlier. For very narrow transistors the gate length does not have the space to vary locally within the device and the variance in average length is equal to $\sigma_{LWR}^2$. By comparing the theoretically calculated fluctuations due to LWR to the mismatch of a 130 nm technology, it was found that LWR does not give a significant contribution to the parameter fluctuations for gate lengths ranging down to 80 nm. Locally the off-state current was reasoned to possess a log-normal distribution. The strong asymmetry of this distribution causes the LWR to increase the average off-state current. On a 130 nm technology this increase was still found to be small. The impact of LWR on yield was calculated by first evaluating the probability that locally a device has a shorter length than a certain critical gate length. The yield followed from the number of times that this probability appears in a circuit.
This resulted in the requirement that 6σLWR < 0.3Lgate, which is the same as the ITRS roadmap requirement on LWR. Transistors with sinusoidal gate shapes were fabricated in order to experimentally evaluate the averaging processes of the local properties of the LWR. It was found that diffusion of the extension regions smooths out the roughness. This was taken into account by applying a smoothing window to the LWR with a width of two times 30 nm. This smoothing results in a reduced increase in the off-state current due to LWR, but parameter fluctuations remain unchanged. It was also observed that the current mainly flows in the direction of the shortest distance between source and drain, but this has little impact on the more realistic situation where the roughness is random. Besides transistors with sinusoidally shaped gates, transistors with extra rough gates were also created. The developed models with respect to the increase in off-state current and yield were validated, but the experimental accuracy was low.
Using the developed models, predictions were made regarding the threshold voltage mismatch and increase in off-state current caused by LWR. It is concluded that these effects start to play a role for technologies for which the nominal transistor has a channel length smaller than 40 nm. Requirements on the LWR were presented to keep parameter fluctuations and increase in off-state current under control. This resulted in a new figure of merit that also takes into account the correlation width of the LWR.
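The wide- and narrow-transistor limits summarized in these conclusions can be checked with a small Monte Carlo experiment. The sketch below generates gate-length profiles from a discrete first-order autoregressive process and compares the variance of the width-averaged gate length with a closed-form expression. Both the discretization and the closed form assume an autocovariance of the form σLWR² exp(−2|d|/Wc); this form is not quoted in this section and is taken here as an assumption that is consistent with (6.23). The numerical values and function names are illustrative.

```python
import numpy as np

def avg_length_variance(sigma, w_c, width):
    """Variance of the gate length averaged over a device of the given width.

    Closed form for R(d) = sigma^2 * exp(-2|d|/w_c) (assumed form).
    """
    x = 2.0 * width / w_c
    return sigma**2 * (w_c / width) * (1.0 - (1.0 - np.exp(-x)) / x)

def simulate_avg_length_variance(sigma, w_c, width, dz=1.0, n_dev=20000, seed=0):
    """Monte Carlo estimate using a discrete AR(1) process along the width."""
    rng = np.random.default_rng(seed)
    n = int(round(width / dz))
    phi = np.exp(-2.0 * dz / w_c)             # correlation between neighbours
    innovation = sigma * np.sqrt(1.0 - phi**2)
    x = sigma * rng.standard_normal(n_dev)    # start in the stationary state
    total = x.copy()
    for _ in range(n - 1):
        x = phi * x + innovation * rng.standard_normal(n_dev)
        total += x
    return float(np.var(total / n))

sigma, w_c, width = 3.0, 30.0, 200.0          # nm, hypothetical values
print(simulate_avg_length_variance(sigma, w_c, width),
      avg_length_variance(sigma, w_c, width))
```

For width much larger than Wc the closed form tends to σLWR²·Wc/width, the one-over-width behavior, and for width much smaller than Wc it tends to σLWR². Multiplying by the squared sensitivity (dVT/dL)² from table 6.1 would turn this variance into a threshold-voltage variance within the linearized picture.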
Chapter 7
Conclusions, future work and outlook
7.1 Conclusions
In this work we have addressed the matching properties of deep submicron MOSFETs. In five chapters we have treated the modeling of the mismatch in the drain current, mismatch parameter extraction, the physical origins of mismatch, technological aspects and the impact of line-edge roughness. This includes all major areas of study related to MOSFET mismatch at the device level. The overall conclusions are presented chapter by chapter. The emphasis lies on the original contributions made by this work. For more extensive conclusions we refer to the corresponding chapters.

Chapter 2: Measurement and modeling of mismatch in the drain current. A physics-based deep-submicron model to describe the mismatch in the drain current has been developed and for the first time demonstrated on a 180 nm technology. In contrast to the literature, we model the impact of a mismatch in the threshold voltage and a mismatch in the current factor separately. This results in a continuous model that is valid from moderate to strong inversion and in both the linear and the saturation regime. The inaccuracy is smaller than 20 % at all bias conditions above threshold.

Chapter 3: Parameter extraction. The most common methods to extract the mismatch in threshold voltage and current factor are, for the first time, directly compared. Significant differences are observed, which can seriously affect the conclusions with respect to the matching performance of a technology. The differences between methods are related to small modeling errors or to the absence of a proper model for the weak inversion regime. The preferred method depends on the application. With respect to model and measurement accuracy, current-mismatch fitting methods yield the best results. A disadvantage of these methods is that they are slow. Applying a current criterion is much faster and also yields excellent measurement accuracy. However, the physical content of the extracted threshold voltage mismatch is less well defined, and it is difficult to use it to characterize a technology. The maximum slope method is reasonably fast and provides understandable results, but it is sensitive to the contact resistance, which can lead to inaccuracies. The three- and four-points methods are most sensitive to noise introduced by the measurement setup. However, they are very fast and provide understandable results. When the three- or four-points method is applied it is required to use fixed gate-overdrive voltages for the bias points.

Chapter 4: Physical origins of MOSFET mismatch. By solving the current equations we find that the most commonly applied 1/√area law for mismatch is only valid in strong inversion at low values of the drain bias. In addition to what is reported in the literature, we have found that in weak inversion deviations are mainly caused by an exponentially larger contribution of sidewall transistors to the mismatch. In strong inversion at higher values of the drain bias, we find that the lateral non-uniformity of the inversion layer causes a logarithmic, i.e. weak, deviation, which is in accordance with recent literature. We reason that this non-uniformity also results in asymmetry of the MOSFET. The impact on the mismatch of doping fluctuations in the channel and gate, mismatch in the oxide charge and mismatch in surface roughness scattering is calculated. Besides the direct impact of doping fluctuations on the threshold voltage, we derived that Coulomb scattering plays a significant role. In contrast to the literature, we reason that fluctuations in Coulomb scattering appear as apparent fluctuations in the threshold voltage.
By this, we can explain a large part of the gap between the calculated and experimentally obtained mismatch in the threshold voltage. Experimental testing of our model shows excellent descriptive and reasonable predictive behavior.

Chapter 5: Technological aspects. Examples of the impact of technological parameters on the matching behavior have been presented. It is confirmed that the granular structure of the poly-silicon gate has a significant impact. As a new technological issue, we find that halos can unintentionally be implanted through the gate. This results in a serious degradation of the matching performance by causing extra fluctuations in either the channel doping or gate depletion. The scaling of the matching performance of technologies has been addressed. It was concluded that the matching performance improves beyond what is expected from basic scaling laws. However, it was also reasoned that research efforts remain necessary to keep parameter fluctuations under control. This is especially important for the minimum sized transistor, for which the parameter fluctuations get worse as dimensions are scaled down.

Chapter 6: Impact of line-edge roughness on parameter fluctuations, off-state current and yield. When we started working on line-edge roughness (LER), this was a relatively new subject. Therefore, most of the presented work is original. In order to evaluate the impact of LER, we developed a method to characterize the roughness itself, we derived models to calculate the impact of LER on transistor behavior, we experimentally tested these models and we predicted the impact of LER for future technologies. Edge roughness of the gate is described by a first-order autoregressive process. It is characterized by the standard deviation of the roughness and a correlation width. The impact of LER on parameter fluctuations, increase in off-state current and yield is calculated. Compared to experimental values for parameter fluctuations, line-edge roughness is not expected to have a significant impact down to the minimum measured gate length of 80 nm. The developed models are verified on specially fabricated transistors with sinusoidally shaped gate edges. Based on these models and device simulations, it is predicted that line-edge roughness will start to become important for devices with 32 nm channel lengths for modern-day gate-patterning processes. Based on the standard deviation and the correlation width of the LER, a new figure of merit has been introduced to describe the impact of LER on parameter fluctuations.
7.2 Future work
The variability of the minimum sized transistor increases with the down-scaling of transistor dimensions. Good matching performance has always been a technology requirement for analog applications, and it has also become a necessity for digital designs. We end this thesis by introducing seven possible topics of future research regarding the stochastic properties of MOSFETs and by presenting an outlook.

Develop one model for the mismatch in the drain current for the complete inversion regime. It was observed that the mismatch in the weak inversion regime cannot be predicted from the mismatch in strong inversion, because of the effect of the isolation and because of halos. As supply voltages scale down, analog design is pushed more and more into the weak inversion region. Therefore, it is necessary to develop a model that is valid in the whole inversion regime, i.e. weak, moderate and strong inversion. This could be a model that divides the transistor into six sub-transistors, as displayed in figure 7.1. Half of these sub-transistors model the edge transistors, while the other half model the center transistor; four of the sub-transistors model the halos at the source and drain sides, while two sub-transistors model the center. The mismatch of each of the sub-transistors would be modeled separately by the mismatch model provided in chapter 2.

Provide more physical models for the mismatch in the mobility. It was reasoned in chapter 4 that mismatch due to Coulomb scattering can give a significant contribution to the overall mismatch. However, this analysis was based on a semi-empirical model for the mobility. It would be of great interest to use more physical models to derive the stochastic properties of the mobility.

Evaluate the mismatch in the gate tunnelling current. As the thickness of the gate oxide scales down, the gate tunnelling current becomes significant. This changes the MOSFET characteristics and requires adaptation of models and extraction routines.
The tunnelling current itself is strongly dependent on the oxide thickness. Therefore, it is by itself susceptible to stochastic variations that need to be studied.

Figure 7.1: Proposed model to describe the mismatch in the drain current in the complete inversion regime. [Labels in the drawing: source, drain, gate, center transistor, edge transistor, source halo, drain halo.]

Evaluate the mismatch in devices with high-k dielectrics. To get rid of large tunnelling currents, high-k materials are foreseen as gate dielectric. Besides changing the effective oxide thickness, using a different dielectric will give rise to different values of dielectric charge and a different concentration of interface states, and it can significantly influence mobility. All these effects will have an impact on the matching behavior of a technology, which needs to be studied before a new dielectric is introduced.

Matching properties of new devices. In order to reach sufficient performance at ultrasmall dimensions, alternative device concepts are introduced, as discussed at the end of chapter 5. These new concepts will give rise to new sources of fluctuations, which need to be examined. As an example consider the FinFET. The lower boundary to the stochastic fluctuations of such a device is foreseen to be caused by fin-width and fin-length roughness. Their impact can be examined by following the same approach and defining similar experiments as were used in chapter 6 to evaluate the impact of line-edge roughness.

Investigate the matching properties of MOSFETs at the circuit level. In this thesis the matching properties of MOSFETs were investigated by looking at matched transistor pairs. In reality a MOSFET is operated in a circuit environment. This gives rise to asymmetries that can cause, often unforeseen, systematic contributions to the mismatch. These need to be investigated by a proper set of test structures in order to define a set of layout rules. Furthermore, other circuit elements, such as interconnects, can add to the variability. As an example, consider the following experiment. As test structures one could design a matched transistor pair (NMOS and PMOS), a matched inverter, and an SRAM cell. In each of these structures the dimensions of the transistors are kept the same.
From the matching properties of the matched transistor pair one should be able to predict the matching properties of the inverter, from which one should be able to predict the symmetry of the SRAM cell. Other circuit elements can cause deviations from this expected behavior, which can now be evaluated.

Investigate the small-signal matching properties of MOSFETs. In matching analysis only the DC behavior of MOSFETs is considered. Small-signal parameters are generally ignored, while it would be very interesting to measure the matching properties of e.g. the overlap capacitance. However, measurement accuracy is always an issue in matching analysis. Therefore, this kind of research probably requires dedicated test structures.
7.3 Outlook
Variability is but one of many scaling issues. Nevertheless, the stochastic properties of MOSFETs will become a limiting factor. This means that a significant effort remains necessary to keep variability under control and to try to decrease it. Using retrograde doping profiles improves the matching performance, but it seriously complicates the process. Fully depleted SOI devices and FinFETs have the potential for very good matching performance, but this has yet to be demonstrated, while these architectures also contain new possible sources of parameter fluctuations. In general, variability increases when device dimensions scale down, and the point has been reached where a digital transistor can no longer be considered as fully digital with 100 % certainty. For instance, MOSFET mismatch could result in significant variability in the MOSFET delay, which could cause timing issues. This requires digital design methodologies to take variability into account, as has always been the case for analog design. Based on new design methodologies, new figures of merit for device optimization could be derived. For instance, at some point it might be more efficient for a certain application to have somewhat more accurate transistors instead of very fast ones. More generally, it is predicted that device optimization will become part of the circuit-design methodology. This kind of optimization would be strongly application dependent, but it can potentially lead to better performing circuits. When solutions to the variability problem are no longer possible at the device level, solutions need to be found at the circuit and system levels.
Appendix S
Summary in Dutch
S.1
Introduction
Two MOSFETs that are intended to be identical will nevertheless show differences. This variability originates at several levels. Differences will occur, for example, when circuits are not produced in the same fab. They are caused by the machines not being exactly equal, but also by the fact that different people are involved in the fabrication. Within one fab, the properties of a machine will vary slowly in time. When a wafer is heated during processing, the temperature at the edge of the wafer will differ slightly from the temperature at the center. All of the above effects cause differences between transistors located on different chips. Transistors within the same chip that are intended to be identical will, however, not be exactly equal either. This variability is called intra-die variability. The difference between two transistors is called mismatch. This local difference has a systematic and a stochastic component. The systematic difference between two transistors is the same on different chips. It is caused, for example, by a difference in surroundings: when one of the two MOSFETs lies close to a metal line, this can cause mechanical stress that slightly changes the properties of the transistor [20]. The stochastic component is caused by very local differences, with dimensions of approximately 1 nm to 100 nm, that cannot be predicted in advance. One can think, for example, of the discreteness of doping: the exact location and number of dopant atoms differs from transistor to transistor [3]. The unintended roughness of the gate of the MOSFET can also cause differences.
Figure S.1: Distributions of the mismatch in the threshold voltage (a) and of the mismatch in the leakage current (b). [Histograms (counts) for W/L = 12.0/0.10 µm/µm; horizontal axes: (a) ∆VT (mV), (b) ∆ln(Ioff) (-).]
In general, it can be stated that local differences average out over the area of a transistor. This leads to the following law [5]: the variance of the difference of a parameter (σ²∆P) is inversely proportional to the area (WL),
$$ \sigma^2_{\Delta P} = \frac{A^2_{0,\Delta P}}{W L}. \tag{S.1} $$
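As a rough numerical illustration of (S.1), the sketch below evaluates the standard deviation of a parameter mismatch for a few gate areas. The function name and the value of 4 mV·µm for the proportionality constant are assumptions chosen for illustration only; they are not taken from the thesis data.

```python
import math

def sigma_delta_p(a0, width_um, length_um):
    """Standard deviation of a parameter mismatch according to (S.1).

    a0 is the proportionality constant A_0 (parameter unit times um);
    width_um and length_um are the gate dimensions in um.
    """
    return a0 / math.sqrt(width_um * length_um)

# Hypothetical A_0 of 4 mV*um for the threshold voltage (assumed value).
A0_VT = 4.0
for w, l in [(10.0, 0.18), (1.0, 1.0), (10.0, 7.2)]:
    print(f"W/L = {w}/{l} um: sigma(dVT) = {sigma_delta_p(A0_VT, w, l):.2f} mV")
```

Quadrupling the area halves the standard deviation, which is the practical content of the area law.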
The proportionality constant A0,∆P characterizes the matching performance of a given technology. Figure S.1 shows two examples of commonly occurring distributions of parameter differences at the intra-die level. The first example concerns the mismatch in the threshold voltage (∆VT). This parameter has a normal distribution, which is fully described by a mean (µ∆P or $\overline{\Delta P}$) and a standard deviation (σ∆P). As mentioned earlier, the mean is nonzero if a systematic difference is present between the two transistors or in their surroundings. The width of the distribution is determined by the stochastic component. As a second example, figure S.1b shows the distribution of the mismatch in the leakage current, which can be approximated by a lognormal distribution. In general, it can be stated that in strong inversion the mismatch in a parameter can be approximated by a normal distribution and in weak inversion by a lognormal distribution.
Why are intra-die variations important? For analog applications it was shown in [4] that the magnitude of the mismatch determines the trade-off between the speed at which a circuit operates, the power it consumes and the accuracy. As an example, [6] shows that the more accurate one tries to make an analog-to-digital converter, the better the matching properties must be to reach a given yield. In most digital circuits, embedded memory
is necessary. It was shown in [7, 8] that poor matching properties of a technology can result in insufficient yield of SRAMs. As transistor dimensions shrink, variability becomes a problem that can no longer be neglected for this kind of circuit, in which size plays a crucial role.
This thesis deals with the matching properties of deep sub-micron MOSFETs. The following objectives were pursued:
• Developing a physics-based model for the mismatch in the drain current as a function of the bias voltages, valid over as large a bias range as possible.
• Comparing different methods for characterizing the matching properties of MOSFETs.
• Understanding and modeling the physical origin of MOSFET mismatch.
• Investigating the technological causes of MOSFET mismatch.
• Investigating the impact of future causes of MOSFET mismatch.
The realization of these objectives is described in five technical chapters, which are summarized in the following five sections. These sections cover the measurement and modeling of mismatch in the drain current (section S.2), parameter extraction (section S.3), the physical origin of the mismatch between MOSFETs (section S.4) and technological aspects (section S.5). As one of the main future causes of MOSFET mismatch, the impact of line-edge roughness is investigated in section S.6. Section S.7 concludes this thesis.
S.2
Measuring and modeling mismatch in the drain current
In modeling the mismatch in the drain current as a function of the bias voltages, several properties are aimed for. First, the model should be valid over a large bias range, so that it can be used for as many applications as possible. Second, the model should be based on the physical properties of the mismatch. This provides the link with the technology whose matching properties the model describes. Third, the model should be continuous over the entire bias range. Next, the method for extracting the model parameters also has to be considered. Furthermore, the maximum inaccuracy of the standard deviation is required to stay within 20 %. Before going into the modeling of the mismatch in subsection S.2.2, the measurement of the mismatch in the drain current is described first.
Figure S.2: a) Layout of a standard transistor pair. b) Transistor dimensions for the evaluation of the stochastic mismatch for a 0.18 µm CMOS process. [Panel (b) plots L (µm) against W (µm); series: constant W/L and constant area.]
S.2.1
Measuring mismatch in the drain current
To measure the mismatch in the drain current, the following is required: 1) a set of test structures with the right dimensions, 2) a measurement algorithm and 3) a sufficient population size. Figure S.2a shows the layout of the standard test structure. It consists of two MOSFETs placed right next to each other. They share common gate, source and bulk connections. The currents of the two MOSFETs flow in the same direction, and the layout is as symmetric as possible to avoid systematic deviations. This test structure is available with different dimensions, which are shown in figure S.2b for a 0.18 µm CMOS process. These dimensions were chosen with (S.1) in mind. The proportionality constant is estimated from the square transistor pairs. For short or narrow dimensions the results can deviate from (S.1). These deviations are investigated with the transistor pairs that have a constant area but a different length-to-width ratio.
The ID − VGS curves of both transistors are measured as follows. First, the voltages on the common gate, source and bulk are applied. Next, the voltage on the drain of the first transistor is applied and the current is measured, after which the same is done for the second transistor. Once this has been measured, the gate voltage is stepped up. This whole process is iterated until the complete curves have been measured. The measurement is then repeated with the pin definitions of the drains swapped to filter out a possible offset, which can for example be caused by a temperature drift. The whole procedure takes approximately 300 ms per bias point.
A large number of measurements is needed to determine a standard deviation (σ) accurately. In general, for a normally distributed parameter it holds that
$$ \sigma_\sigma = \frac{\sigma}{\sqrt{2 N_{dev}}}, \tag{S.2} $$
where Ndev equals the number of measured transistor pairs per geometry and σσ denotes the standard deviation of the estimated standard deviation. Suppose that a transistor pair occurs once on a test chip, that 40 to 200 of these test chips are present on a wafer, and that there are two wafers per experimental split; this then gives a relative statistical inaccuracy (σσ/σ) of 3.5 % to 8 %. For most experiments this is sufficiently accurate.
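The numbers quoted for the statistical inaccuracy can be reproduced directly from (S.2). The short sketch below does this for the two population sizes mentioned in the text; the function name is an assumption introduced for illustration.

```python
import math

def relative_sigma_uncertainty(n_dev):
    """Relative uncertainty sigma_sigma / sigma of an estimated standard
    deviation for n_dev measured pairs, following (S.2)."""
    return 1.0 / math.sqrt(2.0 * n_dev)

# One pair per test chip, 40 to 200 chips per wafer, two wafers per split.
for chips_per_wafer in (40, 200):
    n_dev = 2 * chips_per_wafer
    print(n_dev, f"{100 * relative_sigma_uncertainty(n_dev):.1f} %")
```

With 80 pairs the relative uncertainty is about 7.9 %, and with 400 pairs about 3.5 %, in line with the range given above.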
S.2.2
Modeling mismatch in the drain current
Modeling the relative mismatch in the drain current (∆ID/ID) consists of three parts. First, the mismatch in the drain current is modeled as a function of the bias voltages and the validity range of the model is examined. Second, it is examined how the model parameters should be estimated. Third, the length and width dependence of the parameters is modeled.
The modeling of the mismatch in the drain current starts from a given model for the drain current itself. In general, the drain current can be expressed as a function of a number of model parameters (P1, P2, ...) and the bias voltages (VGS, VDS, VBS). In short, the drain current can be written as ID(VGS, VDS, VBS, P1, P2, ...). When the mismatch (∆P1, ∆P2, ...) is small, it follows that:
$$ \frac{\Delta I_D}{I_D} \cong \frac{1}{I_D}\frac{\partial I_D}{\partial P_1}\Delta P_1 + \frac{1}{I_D}\frac{\partial I_D}{\partial P_2}\Delta P_2 + \ldots \tag{S.3} $$
The mismatch in the drain current will now be expressed in terms of the mismatch in the threshold voltage (∆VT) and the mismatch in the current factor (∆(1/β0), ∆(1/ζsr), ∆(1/ζsat))¹. In contrast to previously published work, we examine separately for the threshold voltage and the current factor which approximations regarding the drain-current model are necessary. This will turn out to enable us to describe the mismatch in the drain current over a very large bias range with a fairly simple model.
In the case of the threshold voltage we can assume a very general model for the drain current. We assume that the drain current is a function of the gate overdrive voltage (VGS − VT) and not of VGS or VT separately. This leads to the following contribution of the mismatch in the threshold voltage to the mismatch in the drain current:
$$ \left.\frac{\Delta I_D}{I_D}\right|_{\Delta V_T} \cong \frac{1}{I_D}\frac{\partial I_D}{\partial V_T}\Delta V_T = -\frac{1}{I_D}\frac{dI_D}{dV_{GS}}\Delta V_T = -\frac{g_m}{I_D}\Delta V_T, \tag{S.4} $$
where gm denotes the transconductance.
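The threshold-voltage term of (S.4) can be evaluated from a measured ID − VGS curve by differentiating the curve numerically. The sketch below is one possible way to do this, assuming NumPy and a finite-difference estimate of gm; the function name and the synthetic square-law curve are illustrative assumptions, not the procedure used in the thesis.

```python
import numpy as np

def vt_contribution(vgs, i_d, delta_vt):
    """Contribution of a threshold-voltage mismatch to delta_ID/ID, following (S.4).

    vgs and i_d describe the measured ID-VGS curve of one transistor
    (arrays of equal length); delta_vt is the VT mismatch in volts.
    """
    vgs = np.asarray(vgs, dtype=float)
    i_d = np.asarray(i_d, dtype=float)
    gm = np.gradient(i_d, vgs)  # finite-difference estimate of dID/dVGS
    return -(gm / i_d) * delta_vt

# Synthetic square-law curve, used only to exercise the function (assumption).
vgs = np.linspace(0.6, 1.8, 25)
i_d = 1e-4 * (vgs - 0.4) ** 2
rel_mismatch = vt_contribution(vgs, i_d, delta_vt=2e-3)
```

For the square-law curve gm/ID equals 2/(VGS − VT), so the relative mismatch shrinks as the gate overdrive grows.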
Since our assumption is valid in the entire inversion region, the expectation was that this model would also be valid in the entire inversion region. It will turn out, however, that (S.4) describes the mismatch accurately only for VGS > VT. This is discussed in section S.4. We could now proceed with modeling gm/ID, but this requires further assumptions and goes beyond our purpose. The value of gm/ID can be determined depending on the application. For example, for a hand calculation a simple model for the drain current can be used, while for circuit simulations a more extensive model can give more accuracy. For parameter extraction and for investigating the accuracy of the model, gm/ID can be determined from the measurement data and therefore does not need to be modeled. This is the approach followed here.
¹ These parameters are defined further later in this subsection.
Figure S.3: The ∆ID/ID − VGS curves of a particular transistor pair at VDS = 50 mV and VDS = VDD. Also shown are the absolute least-squares fit (solid lines) and the relative least-squares fit (dashed lines) of our model. [NMOS, W = 10.0 µm, L = 0.18 µm; vertical axis: ∆ID/ID (%); horizontal axis: VGS (V); curves for VDS = 50 mV and VDS = 1.8 V.]
The dependence on the drain and bulk voltage can be split into the known bias dependence of gm/ID and that of ∆VT(VDS, VBS). The dependence of ∆VT on the drain voltage was investigated experimentally by extracting ∆VT for different values of VDS using a current criterion. For the investigated process no significant dependence was visible. In contrast to the literature, the bulk-bias dependence of ∆VT is not modeled with a threshold-voltage mismatch at VBS = 0 V (∆VT0) and a mismatch in the body-effect coefficient (∆γ). When the depletion region under the gate grows because a negative bulk voltage is applied, the associated additional doping fluctuations are not correlated with the fluctuations already present. The same holds when the bulk voltage is lowered further. It follows that splitting ∆VT into ∆VT0 and ∆γ cannot be fully accurate. We therefore choose not to model the bulk-bias dependence of ∆VT, but only the bulk-bias dependence of σ∆VT.
Figure S.4: Experimental (symbols) and modeled (lines) σ∆ID/ID − VGS curves for several values of the drain voltage and transistor dimensions. VBS = 0 V. [Four NMOS panels: W/L = 10.0/0.18, 0.25/0.18, 1.0/1.0 and 0.25/7.2 µm/µm; vertical axis: σ(∆ID/ID) (%); horizontal axis: VGS (V); curves for VDS = 0.05 V, 0.3 V, 0.9 V and 1.8 V.]
It has been shown that the following semi-empirical model gives a good description:
$$ \sigma_{\Delta V_T}(V_{BS}) = \left.\sigma_{\Delta V_T}\right|_{V_{BS}=0} \cdot \left(1 - \frac{V_{BS}}{\phi_B}\right)^{\alpha}. \tag{S.5} $$
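To get a feeling for the size of the bulk-bias effect described by (S.5), the sketch below evaluates the model for a few bulk voltages. The function name and the numbers (3 mV at VBS = 0 V, a surface potential of 0.9 V) are assumptions for illustration; the default exponent of 0.5 corresponds to the doping-fluctuation case named in the text.

```python
def sigma_dvt_bulk(sigma_dvt_0, vbs, phi_b, alpha=0.5):
    """sigma(dVT) at bulk bias vbs according to (S.5).

    sigma_dvt_0 is sigma(dVT) at VBS = 0 V, phi_b the surface potential
    in volts and alpha the fitting exponent.
    """
    return sigma_dvt_0 * (1.0 - vbs / phi_b) ** alpha

# Hypothetical numbers: 3 mV at VBS = 0 V and phi_B = 0.9 V (assumed values).
for vbs in (0.0, -0.9, -1.8):
    print(vbs, round(sigma_dvt_bulk(3.0, vbs, 0.9), 3))
```

With these assumed values, a reverse bulk bias equal to the surface potential increases the mismatch by a factor of √2 when α = 0.5, and by a factor of 2 when α = 1.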
In (S.5), φB is the surface potential, and the fitting parameter α equals 0.5 when the mismatch in the threshold voltage is caused by doping fluctuations and 1 when the mismatch is caused by fluctuations in the oxide capacitance. For short transistors α becomes smaller, because in that case the bulk is shielded from the channel by the source and drain. To calculate the contribution of the mismatch in the current factor to the drain current, a more extensive model for the drain current is needed:
$$ I_D = \beta \left(V_{GS} - V_T - V_{DS}/2\right) V_{DS}. \tag{S.6} $$
The current factor (β ≡ W Cox µ(VGS − VT, VDS, VBS)/L) is modeled by:
$$ \frac{1}{\beta} = \frac{1}{\beta_0} + \frac{V_{GS} - V_T - V_{DS}/2}{\zeta_{sr}} + \frac{V_{DS}}{\zeta_{sat}}. \tag{S.7} $$
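Equations (S.6) and (S.7) together define a compact drain-current model that is easy to evaluate. The sketch below implements both; the function names and all parameter values are hypothetical and serve only to show how the three terms of (S.7) reduce the current factor.

```python
def current_factor(vgs, vds, vt, beta0, zeta_sr, zeta_sat):
    """Current factor beta according to (S.7)."""
    inv_beta = 1.0 / beta0 + (vgs - vt - vds / 2.0) / zeta_sr + vds / zeta_sat
    return 1.0 / inv_beta

def drain_current(vgs, vds, vt, beta0, zeta_sr, zeta_sat):
    """Drain current according to (S.6), with beta taken from (S.7)."""
    beta = current_factor(vgs, vds, vt, beta0, zeta_sr, zeta_sat)
    return beta * (vgs - vt - vds / 2.0) * vds

# Hypothetical parameter values, chosen only for illustration.
params = dict(vt=0.4, beta0=1e-3, zeta_sr=5e-3, zeta_sat=2e-3)
i_lin = drain_current(vgs=1.0, vds=0.05, **params)
```

Because the model is written in terms of 1/β, a small change in 1/β0, 1/ζsr or 1/ζsat changes the relative drain current by −β times that change weighted by the corresponding bias term.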
Here the term containing β0 models the current factor when the mobility is limited by the bulk mobility. The terms with ζsr and ζsat model, to first order, the deviations with increasing gate and drain voltage, respectively. These deviations are caused, among other things, by surface-roughness scattering, velocity saturation and the resistance of the source and drain. The contribution of the mismatch in the current factor to the drain current follows from applying (S.3) to (S.6), which yields:
$$ \left.\frac{\Delta I_D}{I_D}\right|_{\Delta(1/\beta)} = -\beta\,\Delta\frac{1}{\beta_0} - \beta\left(V_{GS} - V_T - V_{DS}/2\right)\Delta\frac{1}{\zeta_{sr}} - \beta V_{DS}\,\Delta\frac{1}{\zeta_{sat}}. \tag{S.8} $$
Figure S.5: Relative difference between the modeled and experimental σ∆ID/ID − VGS curves for all measured transistor dimensions. The W/L values are given in the figures. |VDS| = 1.8 V, VBS = 0 V and |VBS| = 1.8 V. [Panels: (a) NMOS, VBS = 0 V; (b) PMOS, VBS = 0 V; (c) NMOS, VBS = −1.8 V; (d) PMOS, VBS = 1.8 V; vertical axis: σ(model)/σ(exp.) − 100 (%); horizontal axis: VGS (V); W/L = 0.25/0.18, 10.0/0.18, 0.25/0.25, 7.1/0.25, 0.35/0.35, 4.2/0.45, 1.0/1.0, 1.4/1.9, 1.9/2.7, 0.55/3.3, 4.0/4.0, 0.35/5.3, 0.25/7.2 and 10.0/7.2 µm/µm.]
This model is only valid in strong inversion. However, for lower values of the gate voltage the contribution of the mismatch in the threshold voltage is dominant, and the contribution of the mismatch in the current factor can be taken as constant. Finally, it should be mentioned that in saturation VDS has to be replaced by the saturation voltage, which is calculated by setting dID/dVDS = 0. The total mismatch in the drain current is calculated by adding the right-hand sides of (S.4) and (S.8). The parameters ∆VT, ∆(1/β0), ∆(1/ζsr) and ∆(1/ζsat)
are estimated by means of a least-squares fit to two ∆ID/ID − VGS curves, the first at a low value of VDS and the second at VDS = VDD. The gate voltage runs from VGS = VT to VGS = VDD. This least-squares fit can be carried out in two ways. The first minimizes the sum of the absolute squared differences, which is the standard in the literature, and the second minimizes the sum of the relative squared differences. Figure S.3 shows that the second method gives the best result. Figure S.4 compares experimental σ∆ID/ID − VGS curves with the modeled curves for four extreme transistor dimensions. The drain voltage is used as a parameter. The model gives an accurate description of the experimental data, both at the drain voltages used in the fits (VDS = 50 mV and VDS = 1.8 V) and at the intermediate values of the drain voltage. Figure S.5 shows the relative accuracy of the model for all dimensions at |VDS| = 1.8 V with VBS = 0 V and with |VBS| = 1.8 V. Above the threshold voltage the model is accurate. As mentioned earlier, the model becomes inaccurate in the subthreshold region.
Figure S.6: Experimental (open symbols) and modeled (filled symbols) values of σ∆VT, σ∆(1/β0), σ∆(1/ζsr) and σ∆(1/ζsat) as a function of 1/√(WL). The W/L ratios of the transistors are given in the figures in µm/µm. The lines give the modeled result if only A0 is taken into account. The error bars give the 99 % confidence intervals. [Four NMOS panels; vertical axes: σ(∆VT) (mV), (W/L) σ(∆1/β0) (Ω V), (W/L) σ(∆1/ζsr) (Ω) and (W/L) σ(∆1/ζsat) (Ω); horizontal axis: 1/√(WL).]
The length and width dependence of the mismatch is modeled for long and wide transistors by (S.1). For short or narrow transistors deviations from this model are possible. This is modeled semi-empirically as follows:
$$ \sigma^2_{\Delta P} = \frac{A^2_{0,\Delta P}}{W L} + \frac{A_{L,\Delta P}}{W L^2} + \frac{A_{W,\Delta P}}{W^2 L} + \frac{A_{WL,\Delta P}}{W^2 L^2}. \tag{S.9} $$
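The four terms of (S.9) can be evaluated side by side to see when the short- and narrow-channel corrections start to matter. The sketch below is a direct transcription of the equation; the function name and argument names are assumptions, and any coefficient values passed in are up to the user.

```python
def variance_delta_p(w, l, a0, a_l=0.0, a_w=0.0, a_wl=0.0):
    """Mismatch variance according to (S.9); w and l are in um.

    a0 is the area-law constant; a_l, a_w and a_wl are the coefficients of
    the short-channel, narrow-channel and combined correction terms.
    """
    return (a0 ** 2 / (w * l)
            + a_l / (w * l ** 2)
            + a_w / (w ** 2 * l)
            + a_wl / (w ** 2 * l ** 2))

# With all correction coefficients zero the expression reduces to (S.1).
print(variance_delta_p(1.0, 1.0, a0=4.0))
```

Setting the three correction coefficients to zero recovers the area law, which makes it easy to compare both models on the same set of geometries.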
In (S.9), the second term on the right-hand side gives the deviation for short transistors, the third term the deviation for narrow transistors and the fourth term the deviation for short and narrow transistors. This model is only valid when, for long transistors, the parameter whose mismatch is characterized is itself not a function of the length and width. This is the case for the threshold voltage. For σ²∆(1/β0), σ²∆(1/ζsr) and σ²∆(1/ζsat) the right-hand side of (S.9) has to be multiplied by L²/W². Figure S.6 shows that this model gives a good description of the length and width dependences of the matching properties of the investigated 0.18 µm CMOS process. Correlations between the parameters (ρ(∆P1, ∆P2)) can also occur. The length and width dependence of these correlations is modeled by:
$$ \rho(\Delta P_1, \Delta P_2) = A^2_{0,\Delta P_1,\Delta P_2} + \frac{A_{L,\Delta P_1,\Delta P_2}}{L} + \frac{A_{W,\Delta P_1,\Delta P_2}}{W} + \frac{A_{WL,\Delta P_1,\Delta P_2}}{W L}. \tag{S.10} $$
For long and wide transistors the correlation is independent of the length and width. The second term on the right-hand side of (S.10) models the deviation for short transistors, the third term the deviation for narrow transistors and the fourth term the deviation for short and narrow transistors.
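The two fitting variants discussed in this subsection, absolute versus relative least squares, differ only in how the residuals are weighted. Since the mismatch model is linear in its parameters, both can be sketched with an ordinary linear least-squares solver. The code below is a minimal illustration using NumPy; the function name, the two-column design matrix and the synthetic data are assumptions, not the thesis implementation.

```python
import numpy as np

def fit_mismatch_parameters(design, measured, relative=True):
    """Least-squares estimate of mismatch parameters from a delta_ID/ID curve.

    design   : (n_points, n_params) matrix; column j holds the sensitivity of
               delta_ID/ID to parameter j at each bias point, e.g. -gm/ID for dVT.
    measured : measured delta_ID/ID values, shape (n_points,), all nonzero
               when relative=True.
    relative : minimize relative squared errors if True, absolute ones if False.
    """
    design = np.asarray(design, dtype=float)
    measured = np.asarray(measured, dtype=float)
    if relative:
        # Dividing every row by the measured value turns the residual
        # (model - measured) into (model - measured) / measured.
        weights = 1.0 / measured
        design = design * weights[:, None]
        measured = np.ones_like(measured)
    params, *_ = np.linalg.lstsq(design, measured, rcond=None)
    return params

# Synthetic example (all numbers are assumptions, not thesis data).
vgs = np.linspace(0.6, 1.8, 13)
gm_over_id = 2.0 / (vgs - 0.4)           # square-law gm/ID with VT = 0.4 V
design = np.column_stack([-gm_over_id, -np.ones_like(vgs)])
true_params = np.array([2e-3, 1e-2])     # dVT (V) and a constant current-factor term
measured = design @ true_params
print(fit_mismatch_parameters(design, measured, relative=True))
print(fit_mismatch_parameters(design, measured, relative=False))
```

On noise-free data both variants return the same parameters. They only differ once the model cannot describe all bias points equally well, in which case the relative fit gives the small mismatch values at high gate voltage the same weight as the large values near threshold.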
S.3
Parameter extraction
Several methods are available to extract the mismatch in the threshold voltage and the current factor. Although the definitions of these parameters have a physical basis, it will turn out that, depending on the method used, large differences in the extracted values can occur. In this section and in chapter 3 the trade-off between the different methods is made. This trade-off depends on model accuracy, measurement accuracy and time, and the physical meaning of the parameters. These three properties are treated in the following three subsections, respectively.
The investigated methods are divided into direct extraction methods and methods that fit to the mismatch in the drain current. Direct extraction methods extract the absolute values of the threshold voltage and current factor for both transistors in a pair and calculate the mismatch by subtracting these values from each other. The investigated direct extraction methods are the maximum slope method, the three-point method, the four-point method and applying a current criterion. Two versions of the three- and four-point methods are considered. In the first, the absolute values of the applied gate voltages are fixed, while in the second the overdrives of the gate with respect to the threshold voltage are fixed. Fitting methods extract the mismatch in the threshold voltage and in the current factor by means of a linear least-squares fit to experimental ∆ID/ID curves. Two fitting methods are investigated. Method A is described in [23, 35, 117] and method B is the method described in the previous section and in chapter 2.
The measurement data presented in this section originate from a 130 nm CMOS process with an oxide thickness of 2.0 nm, a nominal gate length of 130 nm and a supply voltage of 1.5 V. The investigated transistor dimensions are: W/L = 0.25/0.18 (narrow and short), 10.0/0.18 (wide and short), 1.0/1.0 (wide and long), 0.25/7.2 (narrow and long) and 10.0/7.2 µm/µm (wide and long with large area).
S.3.1
Model accuracy
Most of the investigated extraction methods² are related to a model for the drain current or a model for the mismatch in the drain current. The model accuracy is defined as the relative difference between the modeled and experimentally found values of σ∆ID/ID. Model accuracy is mainly important when a model is used for the simulation of circuits. At a low drain voltage all models give good accuracy. At a high drain voltage only the fitting methods give good accuracy. This is because for the fitting methods the measurement data at high drain voltage are part of the fit, whereas for the direct extraction methods all parameters are extracted at low drain voltage. This is mainly visible at shorter gate lengths, for which series resistance and velocity saturation cannot be neglected.
S.3.2
Measurement accuracy and time
The matching properties of a technology are determined by measuring two almost identical quantities, subtracting them from each other and analyzing the difference. This makes the measurement of mismatch very sensitive to inaccuracies. One extraction method can be more sensitive to this than another. The measurement accuracy is investigated by performing the measurement twice and analyzing the correlation between the two measurements (ρrepeat). The fraction of the measured or extracted variance of the mismatch that is determined by the measurement inaccuracy equals 1 − ρrepeat. If the measurement adds no noise, ρrepeat = 1.
² Except the maximum slope method and applying a current criterion.
Most extraction methods have a very good measurement accuracy (ρrepeat > 0.99) for all investigated transistor dimensions. There are, however, two clear exceptions. First, for the mismatch of the transistor pairs with W/L = 10.0/7.2 µm/µm extracted with the three- or four-point method, ρrepeat(∆VT) ∼ 0.6, ρrepeat(∆β0/β0) ∼ 0.4 and ρrepeat(∆θ) ∼ 0.2³. This poor accuracy is caused by noise added by the measurement system. This is demonstrated in figure S.7. This figure shows the drain current as a function of the gate voltage on the left axis. On the right axis ρrepeat(∆VT) is plotted as a function of the gate voltage of the second bias point. When the current at this second bias point ends up in a higher measurement range, the measurement accuracy decreases drastically, which is consistent with the property that a measurement system adds relatively more noise at the bottom of a measurement range than at the top.
Figure S.7: Measurement accuracy (ρrepeat) of ∆VT (right axis) as a function of the second bias voltage for the three-point method. The left axis shows the drain current and the point at which the measurement system changes measurement range (dashed line). VGS1 = VGS2 − 0.2 V, VGS3 = VGS2 + 0.4 V. [NMOS, VDS = 50 mV, W = 10.0 µm, L = 7.2 µm; left axis: ID (µA); right axis: ρrepeat (-); horizontal axis: VGS2 (V); annotations: change of current range, ID @ VGS2 shifts to the next current range.]
A few remarks on the observed measurement inaccuracy are in order. First, it was only observed for the W/L = 10.0/7.2 µm/µm transistor pairs. These pairs have the largest area of the investigated pairs and therefore the lowest mismatch, which makes them the most sensitive to measurement noise. Second, only the three- and four-point methods suffer from this measurement noise. Apart from the maximum slope method, the other methods use more measurement points over which the noise is averaged. In the case of the maximum slope method the extraction still just takes place in the lower measurement range, but it is to be expected that this method is also sensitive to measurement noise. Third, it should be mentioned that the experimentally found measurement accuracy is much better than the absolute accuracy specified in the manual of the measurement system.
For the measurement of mismatch, it is mainly the short-term reproducibility of the measurement that matters, rather than the absolute measurement inaccuracy. As mentioned earlier, the relative magnitude of the measurement noise is a function of the measured current and thus depends on how the transistor is biased. As a rule of thumb we state that measurement noise can affect the measurement for transistor areas larger than 10 µm².
³ ∆θ is the mismatch in the mobility reduction factor. In the case of the four-point method there are two mobility reduction factors, both of which are determined with poor accuracy.
A second form of measurement inaccuracy was observed for the W/L = 10.0/0.18 µm/µm transistor pairs, for which ρrepeat(∆θ) ∼ 0.85, which is somewhat lower than for other transistor dimensions. This is caused by the resistance between the probe tips and the bonding pads, which varies by approximately σRcontact = 150 mΩ. This variation can increase significantly if the probe tip is worn or if the bonding pads have been degraded by earlier measurements. It is therefore advisable to use an extraction method that also extracts the contribution of the mismatch in the series resistance. When this is not done, as with the maximum slope method and fitting method A, a measurement inaccuracy due to a bad contact can propagate into the extraction of the mismatch in the threshold voltage and the current factor. The transistor pairs with W/L = 10.0/0.18 µm/µm are the most sensitive to fluctuations in the contact resistance because their intrinsic resistance fluctuations (σ∆R) are the smallest of the investigated pairs. In general, one should watch out for this form of inaccuracy when σ∆R < 2 Ω. Note that the above only holds when the transistors operate in the linear region. In saturation the current does not depend strongly on the drain voltage, and fluctuations in the contact resistance have a negligible influence.
Finally, measurement time is briefly addressed. As mentioned earlier, it takes approximately 0.3 s to acquire a measurement point. It follows that the three- and four-point methods are the fastest in use (1 − 2 s). Applying a current criterion or the maximum slope method takes 2 − 3 s. The fitting methods are the slowest in use (5 − 15 s).
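The repeatability measure ρrepeat used in this subsection is the correlation between two repeated extractions on the same set of pairs. The sketch below shows how it could be computed with NumPy and checks it on synthetic data; the function name and the noise levels (1 mV true mismatch, 0.5 mV measurement noise per run) are assumptions. For independent noise of equal size in both runs, the expected correlation is the ratio of the true variance to the total variance, here 1/1.25 = 0.8.

```python
import numpy as np

def repeatability(first_run, second_run):
    """Return rho_repeat and the noise fraction 1 - rho_repeat of the variance."""
    rho = np.corrcoef(first_run, second_run)[0, 1]
    return rho, 1.0 - rho

# Synthetic data (assumption): true mismatch with sigma 1 mV,
# independent measurement noise with sigma 0.5 mV in each run.
rng = np.random.default_rng(0)
true_dvt = rng.normal(0.0, 1.0, 20000)
run_1 = true_dvt + rng.normal(0.0, 0.5, true_dvt.size)
run_2 = true_dvt + rng.normal(0.0, 0.5, true_dvt.size)
rho_repeat, noise_fraction = repeatability(run_1, run_2)
```

In a real experiment the number of pairs is far smaller than in this synthetic example, so the estimated correlation itself carries a noticeable statistical uncertainty.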
S.3.3
Physical meaning
When the physical cause of the mismatch of MOSFETs is investigated, it is important that the physical meaning of the extracted parameters is clear. This physical meaning has been investigated for ∆VT and ∆β/β in two ways. First, the methods were compared with each other by correlating the parameters extracted with different methods. Second, the physical meaning of the current factor was investigated by deliberately introducing a known length variation and checking how well it is recovered by the extraction. The results are now discussed method by method.
Three- and four-point methods. The correlation between the values of ∆VT and ∆β/β extracted with the three- and four-point methods is larger than 90 % in practically all investigated situations. This implies, for example, that the ∆VT extracted with the three-point method is caused by the same physical mechanisms as the ∆VT extracted with the four-point method. A difference does occur, however, when the values of σ∆VT and σ∆β/β are compared. When the bias voltages are chosen relative to an initially determined threshold voltage (fixed overdrive), the extracted value of σ∆VT is 30 % to 50 % higher than when the bias voltages have fixed values. The explanation of this phenomenon can be found in figure S.8a. This figure shows the value of the threshold voltage extracted with the three-point method as a function of the position of the bias voltages. Due to model inaccuracies this value is not independent of the applied bias voltages. This causes a kind of feedback: for a transistor with a slightly higher threshold voltage, the extracted value will be underestimated by a fraction when the bias voltages are fixed. Figure S.8b shows the extracted σ∆VT as a function of the position of the bias voltages. As can be seen, with the help of figure S.8a the σ∆VT estimated with fixed overdrive can be predicted from the σ∆VT estimated with fixed values of the gate voltage. It also follows that the value estimated with fixed overdrive does not depend strongly on the position of the bias voltages. When the estimated value of ∆β/β is compared with a deliberately introduced length variation, this variation is recovered somewhat more accurately by the three- or four-point method with fixed overdrive (accuracy > 90 % for transistors that are not too short) than by the methods with fixed bias voltages (accuracy > 70 % for transistors that are not too short). In summary, for estimating ∆VT and ∆β/β it does not matter much whether the three- or four-point method is used. It is important, however, that the bias voltages do not have fixed values, but are applied with a fixed overdrive with respect to an initially determined threshold voltage.
Figure S.8: Threshold voltage (a) and mismatch in the threshold voltage (b) estimated with the three-point method as a function of the location of the bias voltages. (•) fixed bias voltages, (■) bias voltages with a fixed overdrive with respect to a previously determined threshold voltage. The dashed line in figure b predicts the σ∆VT estimated with fixed overdrive from the value extracted with fixed bias voltages. VGS2 = VGS1 + 0.2 V, VGS3 = VGS1 + 0.6 V. [NMOS, VDS = 50 mV, W = 10.0 µm, L = 0.18 µm; vertical axes: (a) VT (V), (b) σ(∆VT) (mV); horizontal axis: VGS1 (V).]
Maximum slope method. The maximum slope method extracts values for ∆VT
[Figure S.9 plots, NMOS, VDS = 50 mV: (a) standard deviation, σ(∆VT) (mV) versus IDcrit (A), the drain current at VGS = VT; (b) correlation, ρ(∆VT1, ∆VT2) versus IDcrit (A).]
Figure S.9: a) Standard deviation of the threshold-voltage mismatch estimated with a current criterion, as a function of the applied current level. b) Correlation with the threshold-voltage mismatch determined with the three-point method, as a function of the applied current level. The solid lines assume that ∆VT in weak inversion is equal to ∆VT in strong inversion. (•) W = 10.0 µm, L = 0.18 µm. (■) W = 1.0 µm, L = 1.0 µm.
and ∆β/β that lie close to the values extracted with the three-point method with fixed overdrive. A clear difference appears, however, when the estimated value of ∆β/β is compared with a deliberately introduced length variation. A poor contact between the probe tip and the bonding pad causes the value of ∆β/β to have a large systematic deviation for some dimensions. In the case of the three- and four-point methods this inaccuracy causes a deviation in the estimated values of ∆θ, but ∆β/β is still estimated well.
Applying a current criterion. Figure S.9a shows the value of σ∆VT extracted with a current criterion as a function of the current level used (IDcrit). Figure S.9b shows the correlation of ∆VT with the ∆VT determined with the three-point method. The dotted line indicates the level that is normally used to determine the threshold voltage. Applying a current criterion assumes that the mismatch in the current factor is negligible. This assumption is not valid for high gate voltages, which explains both the overestimated σ∆VT and the low correlation with the three-point method. For lower gate voltages, however, the mismatch is dominated by ∆VT and the extracted value of σ∆VT should be independent of the gate voltage. In figure S.9a this is indicated by the solid line. Experimentally, however, it is observed that the mismatch increases in weak inversion. Apparently we cannot assume that ∆VT is the same in the weak and strong inversion regions. A current criterion extracts the mismatch in the region between weak and strong inversion, which makes the exact physical meaning of ∆VT difficult to determine.
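The current-criterion extraction described above can be sketched in a few lines. This is a minimal illustration, not the measurement procedure of the thesis: the function names, the interpolation in log10(ID) and the synthetic exponential characteristics (80 mV/decade, 5 mV threshold shift) are all assumptions made for the example.

```python
import numpy as np

def vt_current_criterion(vgs, i_d, id_crit):
    """Gate voltage at which the drain current equals id_crit.

    Interpolates linearly in log10(ID), which suits the exponential
    sub-threshold characteristic. Assumes i_d is positive and strictly
    increasing with vgs (hypothetical helper, not from the thesis).
    """
    vgs = np.asarray(vgs, dtype=float)
    log_id = np.log10(np.asarray(i_d, dtype=float))
    return float(np.interp(np.log10(id_crit), log_id, vgs))

def delta_vt_current_criterion(vgs, id_a, id_b, id_crit):
    """Threshold mismatch of a pair; current-factor mismatch is neglected."""
    return (vt_current_criterion(vgs, id_a, id_crit)
            - vt_current_criterion(vgs, id_b, id_crit))

def synthetic_id(vgs, vt, slope=0.080, i0=1e-7):
    """Idealised sub-threshold current: one decade per `slope` volts (assumed)."""
    return i0 * 10.0 ** ((np.asarray(vgs, dtype=float) - vt) / slope)

vgs = np.linspace(0.0, 0.4, 81)
dvt = delta_vt_current_criterion(
    vgs, synthetic_id(vgs, 0.305), synthetic_id(vgs, 0.300), id_crit=1e-8)
print(f"extracted delta VT = {dvt * 1e3:.3f} mV")
```

Repeating the extraction for a range of `id_crit` values over many pairs gives curves of the kind shown in figure S.9a; at high current levels the neglected current-factor mismatch would bias the result, as the text explains.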
Table S.1: Qualitative comparison of the extraction methods with respect to model accuracy, measurement accuracy, sensitivity to a bad contact, measurement speed and physical meaning of the parameters. The abbreviation f.b. stands for fixed bias voltages and f.o. stands for fixed overdrive of the gate voltage.
Methods: maximum slope | 3-point, f.b. | 3-point, f.o. | 4-point, f.b. | 4-point, f.o. | current criterion | fit method A | fit method B
model accuracy: n/a n/a + +
measurement accuracy: + + +
Rs sensitivity: + + + + + +
measurement speed: 0 + 0 + 0 0 -
physical meaning: 0 0 + 0 + 0
Fit method A. Fit method A [23, 35, 117] can be applied in the linear region and in saturation. The extracted values of σ∆VT can deviate by up to 30 % from the values determined with the three-point method, and the correlation between the methods lies between 50 % and 98 %, depending on the transistor dimensions. For ∆β/β the differences can be even slightly larger. Fit method A neglects effects such as surface-roughness scattering, series resistance and velocity saturation. Fluctuations in these parameters are partly absorbed by an increase in the extracted fluctuations in ∆VT and ∆β/β. Since the series resistance is neglected, a bad contact causes a systematic deviation of the estimated ∆β/β when the extraction is performed in the linear region. In saturation, the series resistance at the drain side has a negligible influence on the behaviour of the transistor and a bad contact plays no role. In this case, a deliberate variation in the length is recovered with an accuracy > 70 % for transistors that are not too short.
Fit method B. Fit method B attempts to give an accurate description of the mismatch in the drain current over as large a bias range of the transistor as possible. As a result, the physical content of the finally extracted parameters is somewhat less clear. The correlation of ∆VT and ∆β/β with the three-point method lies between 60 % and 95 %, depending on the transistor dimensions. A deliberate variation in the length is recovered with an accuracy > 80 % for transistors that are not too short.
S.3.4 Summary
Table S.1 summarizes the results of this section. For measurement accuracy it is advantageous to use fit methods, since measurement noise is averaged out over many measurement points. These methods also give the most accurate description of the mismatch in the drain current, which is important for circuit design. However, these methods are slow to use. The three- and four-point methods are faster, but measurement noise can cause inaccuracies for transistors with a large area. The three- and four-point methods use a limited bias range, which benefits the physical interpretation of the parameters. When the three- or four-point method is used, the bias points must be applied at a fixed overdrive with respect to a previously determined initial threshold voltage. The maximum slope method and fit method A have the disadvantage that they are sensitive to the contact resistance. Applying a current criterion extracts ∆VT in the region between weak and strong inversion, where the physical meaning of ∆VT is unclear. This method is, however, suitable for determining ∆VT in weak inversion.
S.4 Physical origin of mismatch between MOSFETs
Research into the physical origin of the mismatch between transistors can lead to improved models, to insight into which fluctuation mechanisms dominate and, ultimately, to a technology with better matching properties. The mismatch of macroscopic transistor parameters is determined by two things, namely 1) how a microscopic deviation in the transistor affects the macroscopic current and 2) the actual origin of this microscopic deviation. These two issues are treated in the following two subsections, respectively. In the first subsection the drift-diffusion equation is solved in the presence of a microscopic deviation. In the second subsection the influence on the parameter fluctuations of doping fluctuations in the channel, doping fluctuations in the gate, fluctuations in the oxide charge and surface-roughness scattering is modelled.
S.4.1 Mismatch in the drain current
The current density (Jn) in a MOSFET is described by the drift-diffusion equation:
\[ J_n = -q n \mu_n \frac{d\psi_s}{dx} + kT \mu_n \frac{dn}{dx}. \tag{S.11} \]
Here k is the Boltzmann constant, n the electron density, q the elementary charge, T the temperature, µn the mobility and ψs the surface potential. The first term on the right-hand side describes the drift component of the current and the second term describes the diffusion component. Based on this equation, the mismatch in the drain current in the weak and strong inversion regions is now calculated. After that, short-channel effects are discussed and the previously observed differences between weak and strong inversion are explained.
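As a small numerical illustration of (S.11), the sketch below evaluates the drift and diffusion terms with finite differences. The function name, the mobility value and the two test profiles are assumptions chosen for the example; they are not taken from the thesis.

```python
import numpy as np

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q = 1.602176634e-19     # elementary charge (C)

def current_density(x, n, psi_s, mu_n, temp=300.0):
    """Return the (drift, diffusion) terms of equation (S.11) on grid x.

    x in m, n in m^-3, psi_s in V, mu_n in m^2/(V s). Derivatives are
    taken with second-order finite differences.
    """
    dpsi_dx = np.gradient(psi_s, x, edge_order=2)
    dn_dx = np.gradient(n, x, edge_order=2)
    drift = -Q * n * mu_n * dpsi_dx
    diffusion = K_B * temp * mu_n * dn_dx
    return drift, diffusion

MU_N = 0.03  # hypothetical mobility, m^2/(V s)

# Case 1: uniform surface potential, linearly decreasing electron density.
# The current is then carried by diffusion only.
x1 = np.linspace(0.0, 1e-6, 101)
n1 = np.linspace(1e22, 5e21, 101)
drift1, diff1 = current_density(x1, n1, np.zeros_like(x1), MU_N)

# Case 2: n proportional to exp(q psi_s / kT) with a constant prefactor.
# Drift and diffusion then cancel, so the net current vanishes.
x2 = np.linspace(0.0, 1e-6, 1001)
psi2 = np.linspace(0.3, 0.4, 1001)
n2 = 1e20 * np.exp(Q * psi2 / (K_B * 300.0))
drift2, diff2 = current_density(x2, n2, psi2, MU_N)
```

The second case shows why potential fluctuations alone do not drive a current: only a gradient in the fluctuation-free concentration does, which is the starting point of the weak-inversion analysis.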
Weak inversion. In weak inversion, mismatch is mainly determined by local fluctuations in the surface potential (δψs). The total surface potential is the sum of the potential without fluctuations and the microscopic potential fluctuations, so ψs(x) = ψs0 + δψs(x). Furthermore n ∝ exp(qψs/kT), and we define δfn(x) ≡ exp(qδψs(x)/kT), from which it follows that n(x) = δfn(x)·n0(x). Here n0(x) is the electron concentration in the absence of microscopic fluctuations. Substituting this into (S.11) shows that the driving force of the current is proportional to dn0(x)/dx and that the resistance experienced by the current at position x is proportional to 1/δfn(x). It follows that the relative variation in the drain current can be calculated with the quasi-resistance method as published in [102] and demonstrated in figure S.10a. In this network every cell has a slightly different resistance value, which is determined from the distribution of δfn. The number of cells that must be included in the resistor network depends on the size of the transistor and on the correlation length of the stochastic process that causes the mismatch. In general it can be assumed that δψs is normally distributed, from which it follows that δfn has a lognormal distribution. The asymmetry of the lognormal distribution causes a shift of the mean value of the drain current, an increase in the standard deviation with respect to strong inversion and an incomplete correlation between the threshold-voltage fluctuations in weak and strong inversion. These effects play a role when σδψs is of the same order of magnitude as, or larger than, kT/q. In this case it is difficult to solve the resistor network of figure S.10a analytically, and it was therefore chosen to simulate it.
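A Monte Carlo simulation of such a network can be sketched as follows. This is an illustration under stated assumptions, not the simulator used in the thesis: each square cell is taken to have conductance δfn with half-resistors R′/2 = 1/(2δfn) from its centre to each edge, source and drain are ideal contacts at unit bias, and cells are statistically independent. All function names are hypothetical.

```python
import numpy as np

def network_current(g):
    """Source-to-drain current of the cell network for a unit bias.

    g[i, j] is the conductance (delta_fn) of the cell at width index i and
    length index j. Neighbouring cell centres are linked by the two
    half-resistors in series; the first and last columns connect to the
    source (potential 1) and drain (potential 0) through one half-resistor.
    """
    g = np.asarray(g, dtype=float)
    n_w, n_l = g.shape
    idx = np.arange(n_w * n_l).reshape(n_w, n_l)
    G = np.zeros((n_w * n_l, n_w * n_l))
    b = np.zeros(n_w * n_l)

    def link(a, c, cond):
        G[a, a] += cond
        G[c, c] += cond
        G[a, c] -= cond
        G[c, a] -= cond

    for i in range(n_w):
        for j in range(n_l):
            if j + 1 < n_l:   # neighbour along the channel length
                link(idx[i, j], idx[i, j + 1],
                     2 * g[i, j] * g[i, j + 1] / (g[i, j] + g[i, j + 1]))
            if i + 1 < n_w:   # neighbour along the channel width
                link(idx[i, j], idx[i + 1, j],
                     2 * g[i, j] * g[i + 1, j] / (g[i, j] + g[i + 1, j]))
        G[idx[i, 0], idx[i, 0]] += 2 * g[i, 0]      # source contact
        b[idx[i, 0]] += 2 * g[i, 0]                 # source held at 1
        G[idx[i, -1], idx[i, -1]] += 2 * g[i, -1]   # drain contact at 0
    v = np.linalg.solve(G, b)
    return float(np.sum(2 * g[:, 0] * (1.0 - v[idx[:, 0]])))

def simulate_relative_current(n_w, n_l, sigma_norm, n_samples, seed=0):
    """Current relative to the fluctuation-free network (n_w / n_l).

    sigma_norm is sigma(delta psi_s) in units of kT/q, so each cell gets
    delta_fn = exp(normal(0, sigma_norm)), i.e. a lognormal conductance.
    """
    rng = np.random.default_rng(seed)
    rel = np.empty(n_samples)
    for s in range(n_samples):
        dpsi = rng.normal(0.0, sigma_norm, size=(n_w, n_l))
        rel[s] = network_current(np.exp(dpsi)) / (n_w / n_l)
    return rel

rel = simulate_relative_current(n_w=4, n_l=4, sigma_norm=2.0, n_samples=200)
print(rel.mean(), rel.std())
```

The mean and standard deviation of `rel` for different `n_w`/`n_l` ratios correspond in spirit to the quantities plotted in figures S.10b and S.10c; the dense solver keeps the sketch short and is only practical for small networks.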
Figure S.10b shows the relative shift in the drain current⁴ (µ∆′fn + 1) as a function of W/L (= NW/NL) for a number of transistor dimensions and σδψs = 2kT/q. For narrow transistors the current is lower than average, because the current can be blocked by regions with a low surface potential. For wide transistors the current is larger than average because of current paths from source to drain with a high surface potential. Figure S.10c shows the increase of the standard deviation with respect to the situation in strong inversion, and figure S.10d shows the correlation between the mismatch in weak and strong inversion. These results can partly explain the previously observed differences between weak and strong inversion.
⁴ ∆′P is the difference in a macroscopic transistor parameter P between a transistor that suffers from microscopic fluctuations and an ideal transistor without these fluctuations. ∆P is the difference in P between two transistors that both suffer from microscopic fluctuations. δP(x, y, z) is the microscopic deviation in P at position {x, y, z}.
Strong inversion. In strong inversion the diffusion term of (S.11) can be neglected. In this case the local resistance experienced by the current is inversely proportional to the local electron concentration. At low drain voltage the average electron concentration is practically uniformly distributed over the channel. At higher values of the drain voltage, however, the electron concentration decreases at the drain side. As a consequence, a microscopic deviation in the threshold voltage has a larger effect on the drain current when this deviation is located at the drain side than when it is located at the source side. This is illustrated in figure S.11, which shows the simulated decrease of the drain current as a function of the
[Figure S.10 plots, σ(δψs)/(kT/q) = 2: (a) quasi-resistor network between source and drain, drawn for NW = 5 and NL = 3, with half-resistors R′/2 = 1/(2δfn); (b) µ(∆′fn) + 1 versus NW/NL; (c) increase in σ(∆ID/ID) versus NW/NL; (d) ρ(∆′fn, ∆′ψs) versus NW/NL. Curves are shown for NW·NL = 4, 16, 64 and 256.]
Figure S.10: a) Quasi-resistor network with which the variation of the current in the weak inversion region can be determined. The shift in the mean of the current (b), the increase in the standard deviation (c) and the correlation between the mismatch in weak and strong inversion (d) are shown for several transistor dimensions as a function of the ratio of the number of cells across the width of the transistor (NW) to the number of cells along the length (NL). σδψs/(kT/q) = 2.
position at which the doping concentration is locally increased. The height of the peak is limited by three effects: 1) the finite output resistance of the MOSFET, 2) the correlation length of the stochastic process that causes the mismatch and 3) the influence of the drain, which reduces the effect of fluctuations at the drain side. The first effect plays a minimal role, but the latter two can both be important. For long transistors in saturation, the non-uniform electron concentration results in a logarithmic deviation from the σ∆P ∝ 1/√(WL) law. The influence of local fluctuations in the threshold voltage on the drain current in saturation is approximately
[Figure S.11 plots: (a) schematic cross-section of the simulated MOSFET, with ND = 5·10^20 and ND = 5·10^19 in the source and drain regions, tox = 2.3 nm, NA = 1.3·10^18 in the channel and NA = 1.4·10^18 between xh − 5.5 nm and xh + 5.5 nm; (b) ∆ID/ID (%) versus xh (µm), from source (0) to drain (1), for VDS = 0.05 V, 0.10 V, 0.25 V, 0.50 V and 1.20 V at VGS = 1.20 V, together with the model for ∆x = 120 nm.]
Figure S.11: a) Schematic drawing of the simulated MOSFET with a slightly higher doping concentration between xh − 5.5 nm and xh + 5.5 nm. b) Simulated and calculated relative change in the drain current as a function of xh.
given by:
\[ \left.\sigma^2_{\Delta I_D/I_D}\right|_{\Delta V_T} \approx \frac{2\, l_{\delta V_T}^2\, \sigma_{\delta V_T}^2}{W L} \left( \frac{k_{V_T}^2}{(V_{GS}-V_{T0})^2} \ln\frac{L}{\Delta x_s} - \frac{4 k_{V_T} k_\delta}{V_{GS}-V_{T0}} + k_\delta^2 \right), \tag{S.12} \]
Here kVT ≈ 1 and kδ ≈ 0, σδVT and lδVT represent the standard deviation and the correlation length of the local variation in the threshold voltage, and ∆xs is the distance between the peak in figure S.11b and the drain. Figure S.12 compares the σ∆VT determined in saturation with the σ∆VT determined in the linear region for two different technologies. The logarithmic deviation for long transistors is indeed found experimentally.
Short-channel effects. Short-channel effects influence the mismatch in four ways: 1) the smaller effective length of the channel compared to the gate length increases the mismatch for short transistors, 2) the increase in the surface potential due to the proximity of the source and drain regions lowers the mismatch for short transistors, 3) the higher doping concentration due to the halos increases the mismatch for shorter transistors and 4) fluctuations in the short-channel effects themselves increase the mismatch for shorter transistors. The short-channel behaviour of the investigated NMOS transistors can be described well by means of a shorter effective length. The short-channel effects for the PMOS transistors are larger, so that other effects also play a role.
Difference between weak and strong inversion. To counteract short-channel effects, halos are implanted that cause a higher doping concentration close to the source and drain. In weak inversion this leads to a higher surface potential close to the source and drain than in the centre of the channel. In strong inversion the threshold voltage is higher at the source and drain. In weak inversion the current depends exponentially on the surface potential, while the current in strong
[Figure S.12 plot: σ²(∆VT,sat)/σ²(∆VT,lin) versus L (µm) for tox = 1.5 nm (NMOS, PMOS) and tox = 2.0 nm (NMOS, PMOS), together with the long-channel model for ∆xs = 5 nm, 8 nm, 16 nm and 48 nm.]
Figure S.12: Ratio between σ²∆VT (VDS = VDD) and σ²∆VT (VDS = 50 mV) as a function of the gate length.
inversion depends linearly or quadratically on the threshold voltage. The influence of halos is therefore much larger in weak inversion than in strong inversion, and in weak inversion the effect of the halos will be visible for much longer transistors. The halos cause a smaller effective area and therefore more fluctuations. For short transistors the halos overlap and this effect does not occur. A similar effect occurs at the sides of the transistors. For NMOS transistors, the surface potential (in weak inversion) or the threshold voltage (in strong inversion) is generally smaller at the sides. In weak inversion this causes an exponentially higher current density at the sides. This reduces the effective area and, for wide transistors, it can likewise cause an increase of the mismatch in weak inversion. For narrow transistors the potential profile in the channel is fairly uniform and this effect does not occur. Figure S.13 compares the σ∆VT extracted in weak and strong inversion as a function of the transistor length and width. The largest difference between these two regions occurs for the investigated NMOS transistors, and the difference clearly increases with increasing width. This can therefore be explained by the relatively large contribution in weak inversion of the parasitic transistors at the sides of the MOSFET. For the PMOS transistors these parasitic transistors play no role and the difference between weak and strong inversion is much less prominent.
S.4.2 Physical origin of the fluctuations
[Figure S.13 plots: σ(∆VT,weak)/σ(∆VT,strong) versus (a) L (µm) and (b) W (µm) for tox = 1.5 nm (NMOS, PMOS) and tox = 2.0 nm (NMOS, PMOS), together with the model.]
Figure S.13: Ratio of σ²∆VT in weak inversion to σ²∆VT in strong inversion as a function of the transistor length (a) and width (b). The widths (a) or the lengths (b) of the transistors are indicated in µm in the figures. VDS = 50 mV.
Macroscopic fluctuations in transistor parameters can be caused by microscopic fluctuations in 1) the doping concentration in the channel, 2) the doping concentration in the gate, 3) the oxide charge and 4) surface-roughness scattering. These four fluctuation mechanisms influence the drain current through the threshold voltage, the gate capacitance (gate depletion) and the mobility. The mobility is limited by the bulk mobility, oxide charge, Coulomb scattering and surface-roughness scattering. To assess the importance of these different fluctuation mechanisms, their influence has been calculated using the charge-sheet approximation, which is valid in strong inversion at low drain voltage. In formula form this amounts to:
\[ \sigma^2_{\Delta P} = \int_{\mathrm{depth}} \left( \frac{dP}{dN_{dope}(y)} \right)^2 \sigma^2_{\Delta N_{dope}}(y)\, dy. \tag{S.13} \]
Here σ²∆P is the variance of the mismatch in a macroscopic parameter P, dP/dNdope(y) is the sensitivity of P to a variation in the charge concentration at depth y and σ²∆Ndope(y) represents the variance of the charge concentration at depth y. This charge concentration refers to the charge in the channel, the fixed charge in the oxide or the charge in the gate. The magnitude of the variation in the channel charge or the fixed oxide charge is determined from Poisson statistics: the variance of the total number of dopant atoms in a given volume is equal to the average number of dopant atoms in this volume. The variation in the doping concentration in the gate is determined by the structure of the gate and is a fit parameter in our analysis. The influence of the variation in surface-roughness scattering on the mobility is calculated from the magnitude of the variation and the sensitivity of the mobility to surface-roughness scattering. This is done on the basis of the stochastic properties of the oxide-silicon interface, namely the magnitude of the oxide-thickness variation and the correlation length of the variation. This correlation length has also been used as a fit parameter.
Besides the magnitude of the variations, the sensitivities to these variations must also be calculated. This has been done on the basis of the equations for MOS transistor operation as published in [95,96]. For the mobility, the semi-empirical model published in [101] is used. The result of this exercise is presented in figure S.14. It shows the contributions of the different causes of the fluctuations⁵ to σ∆UT and σ∆β/β for transistor pairs from a 0.13 µm CMOS technology with an oxide thickness of 2.0 nm, a supply voltage of 1.5 V and W = L = 1.0 µm. The parameter ∆UT is defined as ∆UT ≡ −(VGS − VT)∆ID/ID and the parameter ∆β/β is defined as ∆β/β ≡ −d∆UT/dVGS. The mismatch in the threshold voltage is directly related to ∆UT and is given by ∆VT = ∆UT + (VGS − VT)∆β/β. The fit parameters related to the magnitude of the fluctuations in the gate doping and the surface-roughness scattering were determined from the σ∆UT versus VGS curve, which is described well. Based on these parameters, the σ∆β/β versus VGS curve is predicted reasonably well. The causes of the fluctuations shown are independent of each other and are therefore uncorrelated. This means that their contributions must be added quadratically. For low values of the gate voltage, σ∆UT (and thus also σ∆ID/ID) is mainly determined by fluctuations in the channel doping, while for higher values of the gate voltage the fluctuations in the gate doping also start to play a role. The entire σ∆β/β versus VGS curve is dominated by the variations in the gate doping, but for low gate voltage the fluctuations in the channel doping and Coulomb scattering also play a non-negligible role.
[Figure S.14 plots: σ(∆UT) (mV) and σ(∆β/β) (%) versus VGS − VT (V). Legend: experimental, modelled total, channel doping, oxide charge, gate doping, Coulomb scattering, surface-roughness scattering.]
Figure S.14: Modelled and measured values of σ∆UT and σ∆β/β as a function of VGS − VT. The individual contributions of the causes of the mismatch are also shown.
⁵ Variations in Coulomb scattering are caused only by doping fluctuations close to the oxide-silicon interface and not by the doping fluctuations in the entire depletion region. Variations in Coulomb scattering can be split into a part that is 100 % correlated with the total doping fluctuations and a part that is not correlated with these fluctuations. The 100 % correlated part is included in the fluctuations due to the channel doping, while the fully uncorrelated part is shown explicitly in figure S.14.
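The definitions of ∆UT and ∆β/β given above can be checked numerically. The sketch below applies them to a synthetic pair; the function name and the synthetic data, generated under the simple assumption ∆ID/ID = ∆β/β − ∆VT/(VGS − VT) with constant ∆VT and ∆β/β, are illustrative choices and not measurement data from the thesis.

```python
import numpy as np

def split_mismatch(v_ov, rel_did):
    """Apply the definitions of delta U_T and delta beta / beta.

    v_ov    : gate overdrive VGS - VT (V), increasing array
    rel_did : relative drain-current mismatch delta ID / ID at each point
    Returns (d_ut, d_beta, d_vt) evaluated at every bias point.
    """
    v_ov = np.asarray(v_ov, dtype=float)
    d_ut = -v_ov * np.asarray(rel_did, dtype=float)
    # VT is fixed for a given pair, so d/dVGS equals d/d(VGS - VT).
    d_beta = -np.gradient(d_ut, v_ov)
    d_vt = d_ut + v_ov * d_beta
    return d_ut, d_beta, d_vt

# Synthetic pair (assumed values): 3 mV threshold mismatch and
# 0.4 % current-factor mismatch.
TRUE_DVT, TRUE_DBETA = 3e-3, 4e-3
v_ov = np.linspace(0.2, 1.2, 11)
rel_did = TRUE_DBETA - TRUE_DVT / v_ov
d_ut, d_beta, d_vt = split_mismatch(v_ov, rel_did)
```

For this idealised input ∆UT is linear in VGS, so the numerical derivative recovers ∆β/β at every bias point and ∆VT follows from ∆VT = ∆UT + (VGS − VT)∆β/β. Measured data would show a bias dependence of the kind plotted in figure S.14.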
[Figure S.15 plots, all versus VGS − VT (V): (a) σ(∆UT) (mV), channel doping; (b) σ(∆β/β) (%), channel doping; (c) σ(∆UT) (mV), gate doping; (d) σ(∆β/β) (%), gate doping. Legend for (a) and (b): total, doping + quantum, gate depletion, effective field, Coulomb scattering, screening. Legend for (c) and (d): total, gate doping, effective field, screening.]
Figure S.15: Influence of the channel and gate doping on σ∆UT and σ∆β/β, split into the mechanisms through which these parameters are affected.
Figure S.15 shows the mechanisms through which the most important causes of the fluctuations affect the drain current. When the cause is the same and only the mechanism differs, the curves are added linearly. Doping fluctuations in the channel affect ∆UT mainly through fluctuations in the threshold voltage and fluctuations in Coulomb scattering⁶. In calculating the fluctuations in the threshold voltage, quantum-mechanical effects have also been taken into account. It is striking that the absolute contribution of Coulomb scattering to ∆UT does not depend strongly on the gate voltage⁷. As a consequence, this contribution cannot be distinguished well from fluctuations in the threshold voltage. This could partly explain the difference between the experimental values of the threshold-voltage fluctuations and the values modelled in the literature. The doping fluctuations in the channel also affect the fluctuations in ∆β/β, mainly through fluctuations in Coulomb scattering and fluctuations in the effective field⁸. Fluctuations in the gate doping affect ∆UT and ∆β/β through fluctuations in gate depletion and fluctuations in the effective field.
The accuracy of the physical model has been investigated by predicting the bulk-voltage dependence of σ∆UT and comparing it with experimental results. Figure S.16a shows a good prediction for low values of the gate voltage, but the prediction becomes less accurate for higher VGS. The correlation of ∆UT at low VGS with ∆UT at other values of VGS has also been examined. Figure S.16b shows that our model predicts the drop in the correlation reasonably, but not perfectly. We note here that neither the bulk-voltage dependence nor the drop in the correlation is a strong effect. Higher accuracy requires more precise modelling of, for example, the mobility, the influence of the fluctuations in the gate doping and the gate material itself. Accurate simulations could further increase the insight.
⁶ In contrast to figure S.14, the fluctuations in Coulomb scattering that are fully correlated with the fluctuations in the channel doping are shown here.
⁷ For higher values of the gate voltage, mobility fluctuations become more important. However, for higher values of the gate voltage the influence of Coulomb scattering decreases due to screening by the inversion layer. To first order, these two effects compensate each other.
[Figure S.16 plots, versus VGS − VT (V): (a) σ(∆UT) (mV); (b) correlation. Legend: VBS = 0 V, VBS = −1 V, VBS = −2 V, model.]
Figure S.16: a) Modelled and experimental values of σ∆UT as a function of VGS − VT for different values of the bulk voltage. b) Correlation of ∆UT at VGS − VT = 0.25 V with ∆UT at other values of the gate voltage, as a function of VGS − VT, for different values of the bulk voltage.
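The two addition rules used in figures S.14 and S.15 (linear addition of the mechanisms that share one cause, quadratic addition of independent causes) can be written down compactly. The sketch below is illustrative only: the function name is hypothetical and the numbers are made up, not values read from the figures.

```python
import math

def combine_contributions(contributions):
    """Combine per-mechanism contributions into a total standard deviation.

    contributions maps each cause to a list of signed contributions, one
    per mechanism. Mechanisms driven by the same cause are fully
    correlated, so they add linearly (with sign). Different causes are
    independent, so their sums add in quadrature.
    """
    per_cause = {cause: sum(parts) for cause, parts in contributions.items()}
    total = math.sqrt(sum(value ** 2 for value in per_cause.values()))
    return per_cause, total

# Made-up contributions to sigma(delta U_T) in mV at one bias point.
example = {
    "channel doping": [2.5, 0.5, -0.1],   # e.g. threshold, Coulomb, screening
    "gate doping": [4.0, -1.0],           # e.g. gate depletion, effective field
}
per_cause, total = combine_contributions(example)
print(per_cause, total)
```

Because the per-mechanism curves can be negative, as in figure S.15, the sign has to be kept until the linear sum for each cause has been formed.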
S.5 Technological aspects
Fabricating a transistor requires several process steps. These process steps can have expected, but also unexpected, consequences for the matching properties of a technology. The following two subsections discuss, respectively, the choice between an amorphous or poly-silicon gate material and the influence of the halo implantation. Of the process steps investigated during this doctoral work, these two had the clearest influence on the mismatch. The last subsection of this section deals with the scaling laws of the matching properties.
⁸ The effective field affects the drain current through the mobility.
[Figure S.17 plots: σ(∆VT) (mV) and σ(∆β/β) (%) versus 1/√(WL) (µm⁻¹), for NMOS and PMOS transistors. Legend: amorphous gate, poly gate.]
Figure S.17: σ∆VT and σ∆β/β as a function of 1/√(WL). Transistors with poly-silicon gates are compared with transistors with amorphous-silicon gates. The error bars represent 99 % confidence intervals. The process under study has transistors with a nominal length of 150 nm, an oxide thickness of 3.0 nm and a supply voltage of 1.5 V.
S.5.1 Amorphous or poly-silicon gate material?
Figure S.17 shows σ∆VT and σ∆β/β as a function of 1/√(WL) for transistor pairs with an amorphous-silicon gate and for transistors with a poly-silicon gate. It is striking that both the NMOS and the PMOS transistors have better matching properties in the case of a poly-silicon gate. This is explained by the fact that, after recrystallization, the silicon grain size in the amorphous-silicon gate is much larger than the grain size in the poly-silicon gate. It follows that more silicon grains are available in a given area when a poly-silicon gate is used, so that fluctuations related to the gate are averaged out more [21].
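Plots such as figure S.17 are usually condensed into a single area-scaling coefficient, the slope of σ versus 1/√(WL). A minimal sketch of that step is given below, assuming a least-squares line through the origin; the function name and the synthetic data are illustrative and not taken from the thesis.

```python
import numpy as np

def area_coefficient(w_um, l_um, sigma):
    """Least-squares slope through the origin of sigma versus 1/sqrt(W*L).

    w_um, l_um : transistor widths and lengths in µm
    sigma      : measured standard deviations (e.g. sigma(delta VT) in mV)
    Returns the coefficient A such that sigma ~ A / sqrt(W*L), in the
    units of sigma times µm.
    """
    x = 1.0 / np.sqrt(np.asarray(w_um, dtype=float) * np.asarray(l_um, dtype=float))
    y = np.asarray(sigma, dtype=float)
    return float(np.dot(x, y) / np.dot(x, x))

# Synthetic geometries that follow the area law exactly with A = 4 mV·µm.
w = np.array([10.0, 4.0, 2.0, 1.0, 1.0])
l = np.array([10.0, 4.0, 2.0, 1.0, 0.7])
sigma_vt = 4.0 / np.sqrt(w * l)
a_vt = area_coefficient(w, l, sigma_vt)
```

Forcing the line through the origin encodes the area law itself; for short or narrow devices, where the text reports deviations from that law, a fit restricted to the larger geometries may be more appropriate.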
[Figure S.18 plots, NMOS: σ(∆VT) (mV) and σ(∆β/β) (%) versus 1/√(WL) (µm⁻¹). Legend: no halo; 1.6·10^13 cm⁻², 120 keV, 35°; 2.5·10^13 cm⁻², 120 keV, 35°; 1.6·10^13 cm⁻², 120 keV, 45°.]
Figure S.18: σ∆VT and σ∆β/β as a function of 1/√(WL) for different halo implantation conditions. The error bars represent 99 % confidence intervals. The process under study has transistors with a nominal length of 130 nm, an oxide thickness of 2.0 nm and a supply voltage of 1.5 V.
S.5.2 Influence of implanting a halo
Figure S.18 shows σ∆VT and σ∆β/β as a function of 1/√(WL) for transistor pairs with different halo implantation conditions, and also for the case in which no halos are implanted⁹. It is striking that the implantation of halos degrades the matching properties. Since most transistors in the figure are longer than 1.0 µm, this degradation cannot be explained by the extra doping around the source and drain. We conclude that the gate is not an ideal mask for the halo implantation and that, also in the centre of the channel, part of the halo dose is implanted through the gate. The resulting increase in the mismatch could be reduced by a thicker gate, a thicker implantation oxide or a finer grain structure of the poly-silicon gate material.
S.5.3 Scaling of the matching properties
The scaling of matching properties has been investigated empirically in [6]. There it was found that, for a technology with reasonable matching performance, the value of A0,∆VT in mV·µm is approximately equal to the electrical oxide thickness in nm. For the current factor, A0,∆β/β ≈ 1 %·µm, independent of the technology generation. Figure S.19 tests these scaling laws for four different technologies developed at IMEC. Note that none of these technologies has been optimized for matching performance. The technology with a nominal gate length of 180 nm deviates most from the expected value. This is because this technology has an amorphous-silicon gate, whereas the other technologies have a poly-silicon gate. The PMOS transistors follow the scaling laws. The matching properties of the NMOS transistors are worse, which may be caused by the unwanted implantation of the halos through the gate.

Figure S.19: A0,∆VT (a) and A0,∆β/β (b) for a number of technology generations as a function of their nominal gate length. Error bars are 1σ long. The lines show the expected values: A0,∆VT in mV·µm equals the oxide thickness in nm, and A0,∆β/β ≈ 1 %·µm. [Axes: A(∆VT) (mV·µm) and A(∆β/β) (%·µm) versus nominal gate length (nm); legend: NMOS, PMOS, expected value.]

Figure S.20 shows σ∆VT and σ∆β/β for the minimum transistor dimensions as a function of the technology generation. The filled symbols represent the experimental data, while the open symbols are calculated from the data of Figure S.19. For the technologies with the shortest nominal gate lengths, the minimum-size PMOS transistors show worse matching performance than expected from the 1/√(WL) law. From this it is concluded that it is important to keep short-channel effects under control. Finally, we note that good matching performance is necessary for good analog performance and for good yield in digital circuits. The analog performance is determined by the value of A0,∆VT, which decreases with scaling. The variation of the minimum-size transistors, however, increases with scaling, so that parameter fluctuations are becoming a serious problem.

Figure S.20: σ∆VT (a) and σ∆β/β (b) of the minimum-size transistors for a number of technology generations as a function of their nominal gate length. The W/L ratios are indicated in the figure in µm/µm. Filled symbols represent the experimental data, while open symbols are calculated from the data of Figure S.19. Error bars are 1σ long. [Axes: σ(∆VT) (mV) and σ(∆β/β) (%) versus nominal gate length (nm); W/L labels shown: 0.20/0.10, 0.25/0.13, 0.20/0.15, 0.25/0.18.]

⁹ The figure shows results for NMOS transistors only. For PMOS transistors the observed difference was less prominent.
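The two scaling rules quoted from [6] can be combined with the 1/√(WL) law into a quick estimate of the mismatch of a given transistor pair. The sketch below is only an illustration: the function names are hypothetical, and the 3.0 nm oxide thickness and the 0.25/0.18 µm geometry are assumed example values, not results of this work.

```python
import math

def a0_vt_from_tox(tox_nm: float) -> float:
    """Empirical rule of [6]: A0 for dVT in mV*um is about the electrical tox in nm."""
    return tox_nm

def sigma_mismatch(a0: float, w_um: float, l_um: float) -> float:
    """1/sqrt(WL) law: standard deviation of the pair mismatch."""
    return a0 / math.sqrt(w_um * l_um)

# Assumed example values (hypothetical): tox = 3.0 nm, W/L = 0.25/0.18 um
a0_vt = a0_vt_from_tox(3.0)                 # mV*um
s_vt = sigma_mismatch(a0_vt, 0.25, 0.18)    # mV
s_beta = sigma_mismatch(1.0, 0.25, 0.18)    # %, using A0 of about 1 %*um
print(f"sigma(dVT) = {s_vt:.1f} mV, sigma(dbeta/beta) = {s_beta:.1f} %")
```

Because the area WL shrinks faster than A0,∆VT, an estimate of this kind for minimum-size devices grows from one generation to the next, which is the trend described above.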
S.6
Impact of line-edge roughness on parameter fluctuations, leakage current and yield
In the last technical chapter of this thesis, and in this section, the impact of line-edge roughness (LWR¹⁰) on the behavior of the MOSFET is investigated. This roughness causes parameter fluctuations, an increase in the leakage current, and a reduction of the yield. LWR is regarded as one of the fundamental problems in scaling the transistor and therefore needs to be investigated. The first subsection of this section deals with the characterization of the roughness itself, the second subsection models the impact of the roughness, the third subsection studies this impact experimentally, and the fourth subsection makes a prediction regarding the impact of LWR on future technologies.
S.6.1
Characterization of the edge roughness
Figure S.21a shows a top view of one complete and two half poly-silicon lines. It can clearly be seen that the edges are not perfectly straight, but that a certain roughness is present. The width of the line in the center is shown in Figure S.21b as a function of the position z along the line. This width is described by a first-order autoregressive process whose autocovariance RLWR(d) is given by:

\[ R_{LWR}(d) = \sigma_{LWR}^{2}\, e^{-2|d|/W_c}. \tag{S.14} \]

Here σ²LWR is the variance of the local line width and Wc describes the distance beyond which the LWR is no longer correlated with itself. This parameter is equal to the area under the autocorrelation function: \( W_c \equiv \int_{-\infty}^{\infty} R_{LWR}(z)\,dz / \sigma_{LWR}^{2} \). For a state-of-the-art 193 nm lithography process developed at IMEC, σLWR = 3.2 nm and Wc = 100 nm. These values are used in the calculations in the remainder of this section, unless stated otherwise. It is also necessary to know the correlation between the roughness of the two edges. No significant correlation was found between the roughness of different line edges. The LWR itself is assumed to be normally distributed.

Figure S.21: a) SEM image of three poly-silicon lines after etch. The edges of the fully displayed line in the center are indicated. b) Local line width as a function of the position z along the line. [Axes of (b): line width (nm) versus z (nm).]

¹⁰ LWR stands for line-width roughness.
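As a numerical check of the definition of the correlation width, the area under the normalized autocovariance of (S.14) can be integrated and compared with Wc. The sketch below uses the quoted values σLWR = 3.2 nm and Wc = 100 nm; the function names, the integration span and the step size are assumptions made for this illustration. The last function gives the coefficient that a first-order autoregressive sequence sampled on a grid with spacing dz would need in order to reproduce (S.14).

```python
import math

SIGMA_LWR = 3.2   # nm, value quoted for the 193 nm lithography process
W_C = 100.0       # nm, correlation width

def autocov(d_nm: float) -> float:
    """Autocovariance of the line width according to (S.14)."""
    return SIGMA_LWR**2 * math.exp(-2.0 * abs(d_nm) / W_C)

def correlation_width(step_nm: float = 0.1, span_nm: float = 2000.0) -> float:
    """Trapezoidal estimate of the area under the autocorrelation function."""
    n = round(2.0 * span_nm / step_nm)
    ys = [autocov(-span_nm + i * step_nm) / SIGMA_LWR**2 for i in range(n + 1)]
    return step_nm * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def ar1_coefficient(dz_nm: float) -> float:
    """Correlation between two samples dz apart, i.e. the AR(1) coefficient."""
    return math.exp(-2.0 * dz_nm / W_C)

print(f"area under autocorrelation = {correlation_width():.2f} nm")
```

The integral returns Wc itself, which is why the factor 2 appears in the exponent of (S.14).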
S.6.2
Modeling of the impact of edge roughness
Most transistor properties are a function of the length. Through this length dependence, LWR affects the MOSFET in three ways: 1) it increases the parameter fluctuations, 2) it increases the leakage current, and 3) it degrades the yield. These effects are now discussed.

Fluctuations in the threshold voltage. As an example of the impact of LWR on parameter fluctuations, the fluctuations in the threshold voltage are calculated. The approach is valid for any parameter in strong inversion. Weak inversion is discussed when treating the impact of LWR on the leakage current. When the standard deviation of ∆VT due to LWR is not too large, it can be approximated by:

\[ \sigma_{\Delta V_T} = \sqrt{2}\,\left|\frac{dV_T}{dL}\right|\,\sigma_{L}. \tag{S.15} \]

Here dVT/dL is the sensitivity of the threshold voltage to a variation in the transistor length and σL is the standard deviation of the average length of the transistor. For very narrow transistors (W ≪ Wc), σL = σLWR. This means that within one transistor the LWR does not have room to change the local length significantly. For wide transistors (W ≫ Wc), σL = √(Wc/W)·σLWR. This is the one-dimensional equivalent of the 1/√(WL) law. The σ∆VT calculated with (S.15) has been compared with the experimentally found values of σ∆VT for a 130 nm technology with an oxide thickness of 1.5 nm and a supply voltage of 1.2 V. The shortest gate that was available and investigated had a length of 80 nm. From this comparison it can be concluded that LWR does not contribute significantly to the parameter fluctuations of the investigated technology.

Increase of and fluctuations in the leakage current. The leakage current depends very strongly on the gate length. In modeling the leakage current, linearization as in (S.15) is therefore not allowed. The logarithm of the leakage current (Ioff), however, can be linearized with respect to a deviation in the length, from which it follows that the local leakage current is approximately lognormally distributed. The asymmetry of the lognormal distribution causes the average leakage current to increase by a factor \( e^{\sigma_{ln}^{2}/2} \), in which σln ≡ |d ln(Ioff)/dL|·σLWR. For the investigated 130 nm technology this gives a maximum increase of 20 % for the 80 nm long transistor, which is considered not significant. For modeling the fluctuations in the leakage current it is assumed that they can be approximated by a lognormal distribution, which is accurate if σln < 1. In this case the fluctuations are expressed as σ∆ln(Ioff). For narrow transistors (W ≪ Wc,Ioff, with Wc,Ioff ≈ Wc), σ∆ln(Ioff) ≈ √2·σln. For wide transistors:

\[ \sigma_{\Delta\ln(I_{off})}^{2} \approx 2\ln\!\left(1 + \left(e^{\sigma_{ln}^{2}} - 1\right)\frac{W_{c,I_{off}}}{W}\right). \tag{S.16} \]
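The expressions (S.15) and (S.16), together with the leakage increase factor, are simple enough to evaluate directly. The sketch below is a minimal illustration, not the implementation used in this work: the function names are hypothetical, the sensitivity dVT/dL = 2 mV/nm is an assumed example value, and switching between the narrow and wide limits exactly at W = Wc is a simplification introduced here to join the two limiting cases given in the text.

```python
import math

def sigma_avg_length(sigma_lwr: float, w: float, w_c: float) -> float:
    """Std. dev. of the average gate length: narrow limit below Wc, wide limit above.
    Switching at W = Wc is an assumption of this sketch."""
    return sigma_lwr if w <= w_c else math.sqrt(w_c / w) * sigma_lwr

def sigma_dvt_lwr(dvt_dl: float, sigma_lwr: float, w: float, w_c: float) -> float:
    """(S.15): threshold-voltage mismatch of a pair caused by LWR."""
    return math.sqrt(2.0) * abs(dvt_dl) * sigma_avg_length(sigma_lwr, w, w_c)

def leakage_increase(sigma_ln: float) -> float:
    """Increase of the mean leakage current due to the lognormal asymmetry."""
    return math.exp(sigma_ln**2 / 2.0)

def sigma_dln_ioff(sigma_ln: float, w: float, w_c: float) -> float:
    """Narrow limit sqrt(2)*sigma_ln, otherwise the square root of (S.16)."""
    if w <= w_c:
        return math.sqrt(2.0) * sigma_ln
    return math.sqrt(2.0 * math.log(1.0 + (math.exp(sigma_ln**2) - 1.0) * w_c / w))

# Assumed example: dVT/dL = 2 mV/nm (hypothetical), W = 1000 nm,
# sigma_LWR = 3.2 nm and Wc = 100 nm as quoted in subsection S.6.1
print(f"sigma(dVT) due to LWR = {sigma_dvt_lwr(2.0, 3.2, 1000.0, 100.0):.2f} mV")
```

At W = Wc, (S.16) reduces to √2·σln, so the two branches of `sigma_dln_ioff` join continuously.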
When these fluctuations are compared with the experimentally observed fluctuations of the leakage current of a 130 nm CMOS technology, it again follows that the impact of LWR is not yet important.

Impact on yield. In calculating the impact of LWR on yield, it is assumed that a transistor fails when the local transistor length somewhere drops below a critical limit Lcrit. The probability that this is locally the case (plocal) can be calculated from the normal distribution of the LWR. The number of times this can happen independently within one transistor is equal to W/Wc when W > Wc, and equal to 1 for W < Wc. Suppose that a circuit fails when one of the Ndevice transistors within that circuit fails. The yield is then given by:

\[ \text{yield} = \left(1 - p_{local}\right)^{N_{device}\cdot\max(1,\,W/W_c)}. \tag{S.17} \]
As an example, a circuit with 6 million transistors is considered (for example an SRAM) with W < Wc and Lcrit = 0.7·Lgate. If we allow a yield loss of 0.5 % due to LWR, then 6σLWR < 0.3·Lgate must hold. This is equal to the specification that the ITRS roadmap [108] assigns to the LWR.
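The SRAM example can be retraced numerically. The sketch below assumes normally distributed LWR and one independent failure opportunity per narrow transistor (W < Wc), as in the text; the function names are hypothetical. It inverts the yield expression to find how many standard deviations must fit between Lgate and Lcrit for 6 million transistors and a 0.5 % yield loss.

```python
import math
from statistics import NormalDist

def p_local(l_gate: float, l_crit: float, sigma_lwr: float) -> float:
    """Probability that the local length falls below l_crit (normal LWR)."""
    z = (l_gate - l_crit) / sigma_lwr
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def circuit_yield(p_loc: float, n_device: float, w: float, w_c: float) -> float:
    """Yield with max(1, W/Wc) independent failure opportunities per transistor."""
    n_independent = n_device * max(1.0, w / w_c)
    return math.exp(n_independent * math.log1p(-p_loc))

def required_margin_in_sigma(n_device: float, yield_loss: float) -> float:
    """Number of sigma_LWR needed between Lgate and Lcrit when W < Wc."""
    p_max = -math.expm1(math.log1p(-yield_loss) / n_device)
    return -NormalDist().inv_cdf(p_max)

z_req = required_margin_in_sigma(6_000_000, 0.005)
print(f"required (Lgate - Lcrit) / sigma_LWR = {z_req:.2f}")
```

The required margin comes out very close to 6 standard deviations, which with Lgate − Lcrit = 0.3·Lgate corresponds to the condition 6σLWR < 0.3·Lgate quoted above.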
(a) no extra LWR
(b) extra LWR
(c) single-sided
(d) in phase
(e) 90° phase difference
(f) 180° phase difference
Figure S.22: SEM images of top views of the gates of the special transistors that were fabricated using e-beam lithography to study LWR-related effects.
S.6.3
Experimental study of the impact of edge roughness
As mentioned in the previous subsection, the impact of LWR is not yet visible in the technologies that were available during this PhD work. In order to nevertheless investigate the impact of LWR experimentally, the LWR was deliberately degraded by means of e-beam lithography. Figure S.22 shows top views of poly-silicon gates fabricated specially for this study. Figure S.22a shows the gate as it is written by default with e-beam. Figure S.22b shows a gate for which the roughness was deliberately increased by randomly varying the electron dose along the edge during writing. Figures S.22c through S.22f show gates with sinusoidal edges, with which the averaging of the current can be studied. Both the amplitude and the period of the sinusoids were varied. First, the transistors with sinusoidal edges are discussed. Next, the transistors with extra roughness are treated, and finally we briefly address yield.

Transistors with sinusoidal edges. In calculating the drain current of a transistor of which one of the edges is sinusoidal, it is initially assumed that the source and drain extensions have exactly the same shape as the gate. This is shown schematically in Figure S.23a. When this calculation is compared with experimentally found values of the drain current, the calculation turns out to overestimate the current. The explanation is that, due to diffusion of the dopant atoms in the source and drain, the extensions do not follow the gate exactly. This is shown schematically in Figure S.23b. This can be taken into account in the calculation by replacing the local gate length (Llocal) by a 'smoothed' gate length (Lsmooth), given by:

\[ L_{smooth}(z) = \frac{1}{2W_{smooth}} \int_{z-W_{smooth}}^{z+W_{smooth}} L_{local}(z')\,dz'. \tag{S.18} \]

Here 2Wsmooth is the width of the applied smoothing window. The calculated drain current agrees with the measured value when Wsmooth = 30 nm. This is more or less equal to the junction depth, which in turn is more or less equal to the diffusion distance of the dopant atoms in the source and drain. From this it is concluded that the value found for Wsmooth is realistic. We now examine the transistors of which both sides are sinusoidal and in phase. In this case the local length is the same throughout the transistor, as indicated in Figure S.23c. However, when the current is calculated in this way, the current is seriously underestimated. This is because in reality the current direction is not uniform, but follows the shortest path from source to drain, as shown in Figure S.23d. When this is taken into account in the calculation, the result again agrees with the experimentally found currents. When both the smoothing window and the non-uniform current direction are taken into account, the experimental results for the transistors with two sinusoidal edges with a phase difference of 90° and 180° are also predicted well. Finally, we note that the effect of a non-uniform current direction is negligible in most cases for realistic transistors.

Figure S.23: Schematic top views of transistors with one sinusoidal edge (a+b) and two sinusoidal edges in phase (c+d). The solid lines represent the edge of the gate and the dashed lines represent the ends of the source and drain. The arrows indicate the current direction.

Transistors with extra roughness. Figure S.24 compares curves of the on-current (Ion) versus the leakage current (Ioff) for the investigated transistors with normal edges and for the transistors with extra rough edges. In addition, the leakage current for the transistors with extra rough edges has been calculated from the mean Ioff−Ion curve of the transistors with normal edges. This calculation takes the previously determined smoothing window into account. As can be seen in the figure, the increase in leakage current is predicted well by the theory of subsection S.6.2.

Figure S.24: Ioff−Ion curves for transistors with normal edges and for transistors with extra rough edges. The symbols represent experimental data and the solid line shows the mean of the Ioff−Ion curve of the transistors with normal edges. The dashed line is the mean for the transistors with rough edges calculated from this curve. a) Complete Ioff−Ion curve. b) The first part of the Ioff−Ion curve. [Axes: Ioff/W versus Ion/W, both in A·µm⁻¹; legend: normal edges, extra LWR, normal edges (mean), extra LWR (calculated).]

Yield. The impact of LWR on transistor yield has been investigated by looking at normal transistors, at transistors with light extra LWR, and at transistors with heavy extra LWR. We now introduce the chosen failure criterion. Figure S.25a shows the range within which the drain current falls for transistors with a normal gate and for transistors with heavy extra LWR. It is striking that the upper bound for the transistors with heavy extra roughness curls upward, whereas this is not the case for the other bounds shown. This upward curl¹¹ indicates punch-through, and when it occurs we say that a transistor fails¹². Figure S.25b shows the transistor yield as a function of the transistor width for the three investigated situations. We see that the theory developed in subsection S.6.2 gives a reasonable description of the observations when the critical length is chosen equal to 50 nm. As expected, the transistor yield drops when the LWR increases and when the width of the transistor increases.

Figure S.25: a) Upper and lower bounds of the measured populations of drain currents as a function of the gate voltage. Results are shown for transistors with normal edges (solid lines) and with heavy extra rough edges (dashed lines). b) Transistor yield as a function of the transistor width for transistors without extra roughness (•), with light extra roughness, and with heavy extra roughness (each shown with a different symbol). The lines show a fit of the model described in subsection S.6.2. [Axes: (a) ID (A) versus VGS (V), annotated L = 57 nm, W = 1 µm; (b) fraction of working transistors versus W (µm), annotated L = 62 nm, Lcrit = 50 nm.]

¹¹ d²ln(Ioff)/dVGS² > 0
¹² This is a very strict failure criterion, but it suffices to demonstrate the usefulness of the theory developed earlier.
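The effect of the smoothing window of (S.18) on a sinusoidal gate edge can be illustrated with a small calculation. The sketch below is an assumption-laden illustration rather than the calculation used in this work: the nominal length of 100 nm, the amplitude of 10 nm and the period of 200 nm are hypothetical values, and only Wsmooth = 30 nm is taken from the text. For a sinusoid, the box average reduces the amplitude by the factor sin(x)/x with x = 2π·Wsmooth/period, which the numerical integration reproduces.

```python
import math

def l_local(z: float, l0: float, amp: float, period: float) -> float:
    """Local gate length for one sinusoidal edge (hypothetical test profile)."""
    return l0 + amp * math.sin(2.0 * math.pi * z / period)

def l_smooth(z: float, w_smooth: float, l0: float, amp: float,
             period: float, n: int = 2000) -> float:
    """Midpoint-rule evaluation of the box average in (S.18)."""
    dz = 2.0 * w_smooth / n
    total = sum(l_local(z - w_smooth + (i + 0.5) * dz, l0, amp, period)
                for i in range(n))
    return total * dz / (2.0 * w_smooth)

def attenuation(w_smooth: float, period: float) -> float:
    """Closed-form amplitude reduction of a sinusoid under the box average."""
    x = 2.0 * math.pi * w_smooth / period
    return math.sin(x) / x

# Assumed example: L0 = 100 nm, amplitude 10 nm, period 200 nm; Wsmooth = 30 nm
peak = l_smooth(50.0, 30.0, 100.0, 10.0, 200.0)   # z at the crest of the sinusoid
print(f"smoothed peak length = {peak:.2f} nm, "
      f"attenuation = {attenuation(30.0, 200.0):.3f}")
```

Short-period edge variations are thus strongly suppressed in the effective gate length, whereas variations with a period much longer than 2Wsmooth pass almost unchanged.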
S.6.4
Prediction of the impact of edge roughness
Now that the models of subsection S.6.2 have been tested experimentally, they can be used to predict the future impact of LWR and to set requirements on the maximum allowable LWR. This prediction requires knowledge of the magnitude of the LWR and of the sensitivity of the threshold voltage and the leakage current to a variation in the length. For the magnitude of the LWR, the value determined in subsection S.6.1 is used. The length sensitivity of the transistor with nominal dimensions has been determined for several future technologies using two-dimensional transistor simulations. For the fluctuations in the threshold voltage, the results are summarized in Figure S.26a. The x-axis shows the channel length of the nominal transistor of the technology concerned¹³. The y-axis shows the fluctuations in the threshold voltage, normalized to 1/√W. The dashed line gives the magnitude of the threshold-voltage mismatch if LWR were not present. The height of this line is determined from the scaling laws discussed in subsection S.5.3. From the figure it follows that for transistors with channel lengths shorter than 40 nm the fluctuations due to LWR start to dominate. If it is required that σ∆VT due to LWR be smaller than the expected value of σ∆VT, the maximum allowed value of √Wc·σLWR is fixed; this quantity is introduced as a new figure of merit for the impact of LWR on parameter fluctuations. The maximum allowed value of this figure of merit is shown by the symbols in Figure S.26b. The dashed line gives the current status of what can be achieved with lithography. Figure S.27a shows the expected relative increase in the leakage current as a function of the channel length. If this increase is required to stay below a factor of two, then again, for the current properties of lithographic processes, this requirement is not met for technologies with Lchannel < 40 nm. In this case a maximum requirement is imposed on σLWR, which is shown in Figure S.27b. Finally, we mention that the impact of LWR can also be limited by making dVT/dL and d ln(Ioff)/dL increase less rapidly with scaling. This, however, causes a significant increase in the complexity of process development.

Figure S.26: a) Predicted mismatch in the threshold voltage due to LWR (symbols) as a function of the channel length of the nominal transistor of the technology concerned. The dashed line gives the expected value of the mismatch. b) Requirement on the LWR to keep its impact on the mismatch smaller than the total expected value. [Legend: high performance, low power, expected value; x-axis: Lchannel (nm).]

Figure S.27: a) Predicted increase in the leakage current due to LWR (symbols) as a function of the channel length of the nominal transistor of the technology concerned. b) Requirement on the LWR to keep this increase below a factor of 2. [Legend: high performance, low power, present-day lithography; x-axis: Lchannel (nm).]

¹³ Thus each point in Figure S.26 and Figure S.27 represents a different technology.
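The two requirements on the LWR can be written out explicitly. For wide transistors, setting the LWR contribution √2·|dVT/dL|·√(Wc/W)·σLWR equal to the expected mismatch A0,∆VT/√(WL) makes the width cancel and leaves a bound on √Wc·σLWR. Likewise, requiring the leakage increase factor exp(σln²/2) to stay below 2 gives a bound on σLWR. The sketch below evaluates both bounds; the function names and the sensitivities used in the example (A0,∆VT = 2 mV·µm, dVT/dL = 5 mV/nm, d ln(Ioff)/dL = 0.3 nm⁻¹) are hypothetical and are not the simulated values behind Figures S.26 and S.27.

```python
import math

def max_figure_of_merit(a0_mv_um: float, dvt_dl_mv_per_nm: float,
                        l_nm: float) -> float:
    """Largest sqrt(Wc)*sigma_LWR (in nm^1.5) for which the LWR-induced
    sigma(dVT) stays below A0/sqrt(WL); wide transistors (W >> Wc) assumed."""
    a0_mv_nm = a0_mv_um * 1000.0          # convert mV*um to mV*nm
    return a0_mv_nm / (math.sqrt(2.0) * abs(dvt_dl_mv_per_nm) * math.sqrt(l_nm))

def max_sigma_lwr_for_leakage(dln_ioff_dl_per_nm: float,
                              max_factor: float = 2.0) -> float:
    """Largest sigma_LWR (nm) keeping exp(sigma_ln^2 / 2) below max_factor."""
    return math.sqrt(2.0 * math.log(max_factor)) / abs(dln_ioff_dl_per_nm)

# Figure of merit of the lithography process of subsection S.6.1
fom_litho = math.sqrt(100.0) * 3.2        # nm^1.5, from Wc = 100 nm, 3.2 nm

# Hypothetical technology: A0 = 2 mV*um, dVT/dL = 5 mV/nm, L = 40 nm
fom_max = max_figure_of_merit(2.0, 5.0, 40.0)
sigma_max = max_sigma_lwr_for_leakage(0.3)
print(f"lithography: {fom_litho:.1f} nm^1.5, allowed: {fom_max:.1f} nm^1.5, "
      f"max sigma_LWR for leakage: {sigma_max:.2f} nm")
```

Both bounds tighten as the length sensitivities grow with scaling, which is why limiting the growth of dVT/dL and d ln(Ioff)/dL is mentioned as an alternative to reducing the LWR itself.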
S.7
Conclusions
In this PhD work the matching properties of deep sub-micron MOSFETs have been investigated. In five technical chapters the following topics were treated: 1) the measurement and modeling of mismatch in the drain current, 2) parameter extraction, 3) the physical origins of the fluctuations, 4) technological aspects, and 5) the impact of line-edge roughness on parameter fluctuations, leakage current and yield. The main conclusions are discussed per topic.

The measurement and modeling of mismatch in the drain current. A physics-based model has been developed for the mismatch in the drain current of deep sub-micron technologies. The model has been tested on a 0.18 µm CMOS technology. In contrast to the literature, we model the impact of the threshold voltage and the current factor independently of each other. The result is a continuous model that is valid from moderate through strong inversion, both in the linear region and in saturation. The inaccuracy of the model is smaller than 20 % for all bias conditions above the threshold voltage.

Parameter extraction. The most commonly used methods for extracting the mismatch in the threshold voltage and the mismatch in the current factor have been compared. Differences occur that are large enough to significantly affect the conclusions regarding the matching performance of a given technology. The differences between the methods are mainly caused by small model errors or by the lack of a good model for the weak-inversion region. The preferred method depends on the application. With respect to model and measurement accuracy, fit methods give the best result. A disadvantage of these methods is that they are very slow in use. Applying a current criterion is much faster and also yields excellent measurement accuracy.
The physical content of the extracted threshold-voltage mismatch, however, is less well defined, because the threshold voltage is determined between the weak- and strong-inversion regions. The maximum-slope method is reasonably fast and gives results that are easy to interpret. For short and wide transistors, however, this method is sensitive to variation in the contact resistance between probe and bond pad. The three- and four-point methods are the most sensitive to the noise introduced by the measurement system. These methods are, however, fast and produce results that are easy to understand. When the three- or four-point method is applied, it is necessary to apply a fixed overdrive voltage with respect to a previously determined threshold voltage, instead of choosing fixed bias conditions.

The physical origins of the fluctuations. By solving the drift-diffusion equation, it was found that the 1/√(WL) law commonly used in literature is only valid in strong inversion at low drain voltage. In weak inversion, the deviation for the NMOS transistors is mainly caused by the higher threshold voltage at the sides of the transistor. In strong inversion at higher drain bias, we find that the lateral non-uniformity of the inversion layer causes a logarithmic, and therefore weak, deviation from the 1/√(WL) model. The causes of mismatch in transistor parameters are: 1) doping fluctuations in the channel, 2) doping fluctuations in the gate, 3) fluctuations in the oxide charge, and 4) fluctuations in the surface roughness. The impact of these fluctuations on the mismatch in the drain current has been modeled. Besides the impact of doping fluctuations on the threshold voltage, fluctuations in Coulomb scattering also play an important role; these are seemingly identical to fluctuations in the threshold voltage. This partly explains the traditionally observed difference between the calculated threshold-voltage mismatch and the experimentally observed values. Experimentally it is found that the developed models describe the mismatch excellently and predict it reasonably well.

Technological aspects. The two main influences of the technology on the matching properties that were found during this PhD work are the following: 1) The grain size of the gate material has an important impact on the matching properties. In the case of a poly-silicon gate (small grain size) the matching properties are better than for an amorphous-silicon gate (large grain size after recrystallization). 2) It was also found that the gate does not act as an ideal mask for the halo implantation. When the halo is partly implanted through the gate, this can lead to a serious degradation of the mismatch.
Extra fluctuations can arise both in the channel doping and in the gate. The scaling behavior of the parameters that describe the mismatch has been investigated. The values of A0,∆VT found in literature decrease slightly faster with scaling than expected from the scaling rules. It nevertheless remains important to optimize the matching properties of a technology, certainly when we consider the minimum-size transistor, for which the parameter fluctuations even increase with scaling.

The impact of line-edge roughness on parameter fluctuations, leakage current and yield. Line-edge roughness (LWR) is described by a first-order autoregressive process. This process is characterized by the standard deviation of the LWR and by a correlation width. The impact of LWR on 1) parameter fluctuations, 2) the increase in leakage current, and 3) yield has been calculated. Comparison with experimental data shows that LWR does not play a significant role down to the shortest measured gate length of 80 nm. The developed models have been examined on test structures with sinusoidal gate edges designed specially for this purpose. With these models it is predicted that LWR becomes important when the channel length reaches 32 nm. Based on the standard deviation of the LWR and the correlation width, a new figure of merit for the impact of LWR on parameter fluctuations has been introduced.
Bibliography
[1] C. McDonald, “Copy EXACTLY! A paradigm shift in technology transfer method,” Proc. of the 1997 IEEE Advanced Semiconductor Manufacturing Conference, pp. 414–417, 1997.
[2] H. Tuinhout, A. Bretveld, and W. Peters, “Current mirror test structures for studying adjacent layout effects on systematic transistor mismatch,” Proc. of the 2003 International Conference on Microelectronic Test Structures, pp. 221–226, 2003.
[3] R. Keyes, “The effect of randomness in the distribution of impurity atoms on FET thresholds,” Applied Physics, vol. 8, pp. 251–259, 1975.
[4] P. Kinget and M. Steyaert, “Impact of transistor mismatch on the speed-accuracy-power trade-off of analog CMOS circuits,” Proc. of the 1996 Custom Integrated Circuits Conference, pp. 333–336, 1996.
[5] M. Pelgrom, A. Duinmaijer, and A. Welbers, “Matching properties of MOS transistors,” IEEE Journal of Solid-State Circuits, vol. 24, no. 4, pp. 1433–1440, 1989.
[6] M. Pelgrom, H. Tuinhout, and M. Vertregt, “Transistor matching in analog CMOS applications,” International Electron Devices Meeting 1998, pp. 915–918, 1998.
[7] A. Bhavnagarwala, X. Tang, and J. Meindl, “The impact of intrinsic device fluctuations on CMOS SRAM cell stability,” IEEE Journal of Solid-State Circuits, vol. 36, no. 4, pp. 658–665, 2001.
[8] P. Stolk, H. Tuinhout, R. Duffy, E. Augendre, L. Bellefroid, M. Bolt, J. Croon, C. Dachs, F. Huisman, A. Moonen, Y. Ponomarev, R. Roes, M. Da Rold, E. Seevinck, K. Sreerambhatla, R. Surdeanu, R. Velghe, M. Vertregt, M. Webster, N. van Winkelhoff, and A. Zegers-Van Duijnhoven, “CMOS device optimization for mixed-signal technologies,” International Electron Devices Meeting 2001, pp. 215–218, 2001.
[9] R. Van Overstraeten, G. Declerck, and G. Broux, “The influence of surface potential fluctuations on the operation of the MOS transistor in weak inversion,” IEEE Transactions on Electron Devices, vol. ED-20, no. 12, pp. 1154–1158, 1973.
[10] K. Takeuchi, T. Tatsumi, and A. Furukawa, “Channel engineering for the reduction of random-dopant-placement-induced threshold voltage fluctuations,” International Electron Devices Meeting 1997, pp. 841–844, 1997.
[11] P. Stolk, F. Widdershoven, and D. Klaassen, “Modeling statistical dopant fluctuations in MOS transistors,” IEEE Transactions on Electron Devices, vol. 45, no. 9, pp. 1960–1971, 1998.
[12] A. Asenov, “Random dopant induced threshold voltage lowering and fluctuations in sub-0.1 µm MOSFET’s: A 3-D ‘atomistic’ simulation study,” IEEE Transactions on Electron Devices, vol. 45, no. 12, pp. 2505–2513, 1998.
[13] T. Mizuno and A. Toriumi, “Experimental evidence for statistical-inhomogeneous distributed dopant atoms in Si metal-oxide-semiconductor field-effect transistor,” Journal of Applied Physics, vol. 77, no. 7, pp. 3538–3540, 1995.
[14] H. Tuinhout, F. Widdershoven, P. Stolk, J. Schmitz, B. Dirks, K. van der Tak, P. Bancken, and J. Politiek, “Impact of ion implantation statistics on VT fluctuations in MOSFETs: Comparison between decaborane and boron channel implants,” Proc. of 2000 Symposium on VLSI Technology, pp. 134–135, 2000.
[15] E. Vittoz, “The design of high-performance analog circuits on digital CMOS chips,” IEEE Journal of Solid-State Circuits, vol. SC-20, no. 3, pp. 657–665, 1985.
[16] K. Lakshmikumar, R. Hadaway, and M. Copeland, “Characterization and modeling of mismatch in MOS transistors for precision analog design,” IEEE Journal of Solid-State Circuits, vol. SC-21, no. 6, pp. 1057–1066, 1986.
[17] J. Bastos, M. Steyaert, B. Graindourze, and W. Sansen, “Matching of MOS transistors with different layout styles,” Proc. of the 1996 International Conference on Microelectronic Test Structures, pp. 17–18, 1996.
[18] T. Serrano-Gotarredona and B. Linares-Barranco, “A new five-parameter MOS transistor mismatch model,” IEEE Electron Device Letters, vol. 21, no. 1, pp. 37–39, 2000.
[19] P. Drennan and C. McAndrew, “A comprehensive MOSFET mismatch model,” International Electron Devices Meeting 1999, pp. 167–170, 1999.
[20] H. Tuinhout, M. Pelgrom, R. de Vries, and M. Vertregt, “Effects of metal coverage on MOSFET matching,” International Electron Devices Meeting 1996, pp. 735–738, 1996.
[21] H. Tuinhout, A. Montree, J. Schmitz, and P. Stolk, “Effects of gate depletion and boron penetration on matching of deep submicron CMOS transistors,” International Electron Devices Meeting 1997, pp. 631–634, 1997.
[22] R. Difrenza, P. Llinares, G. Ghibaudo, E. Robillart, and E. Granger, “Dependence of channel width and length on MOSFET matching for 0.18 µm CMOS technology,” Proc. of the 30th European Solid-State Device Research Conference, pp. 584–587, 2000.
[23] J. Bastos, Characterization of MOS transistor mismatch for analog design. K.U. Leuven, 1998.
[24] R. Difrenza, Impact des fluctuations technologiques sur l’appariement du transistor MOS des filières 0.18 et 0.12 µm. Institut National Polytechnique de Grenoble, 2002.
[25] H. Yang, V. Macary, J. Huber, W.-G. Min, B. Baird, and J. Zuo, “Current mismatch due to local dopant fluctuations in MOSFET channel,” IEEE Transactions on Electron Devices, vol. 50, no. 11, pp. 2248–2254, 2003.
[26] C. Michael and M. Ismail, “Statistical modeling of device mismatch for analog MOS integrated circuits,” IEEE Journal of Solid-State Circuits, vol. 27, no. 2, pp. 154–166, 1992.
[27] M. Conti, P. Crippa, S. Orcioni, and C. Turchetti, “Statistical modeling of MOS transistor mismatch based on the parameters’ autocorrelation function,” Proc. 1999 IEEE International Symposium on Circuits and Systems, vol. 6, pp. 222–225, 1999.
[28] Q. Zhang, J. Liou, J. McMacken, J. Thomson, and P. Layman, “Modeling of mismatch effect in submicron MOSFETs based on BSIM3 model and parametric tests,” IEEE Electron Device Letters, vol. 22, no. 3, pp. 133–135, 2001.
[29] Q. Zhang, J. Liou, J. McMacken, J. Thomson, and P. Layman, “SPICE modeling and quick estimation of MOSFET mismatch based on BSIM3 model and parametric tests,” IEEE Journal of Solid-State Circuits, vol. 36, no. 10, pp. 1592–1595, 2001.
[30] P. Drennan and C. McAndrew, “Understanding MOSFET mismatch for analog design,” Proc. of the 2002 Bipolar/BiCMOS Circuits and Technology Meeting, pp. 449–452, 2002.
[31] J.-B. Shyu, G. Temes, and F. Krummenacher, “Random error effects in matched MOS capacitors and current sources,” IEEE Journal of Solid-State Circuits, vol. SC-19, no. 6, pp. 948–955, 1984.
[32] C. Abel, C. Michael, M. Ismail, C. Teng, and R. Lahri, “Characterization of transistor mismatch for statistical CAD of submicron CMOS analog circuits,” Proc. 1993 IEEE International Symposium on Circuits and Systems, pp. 1401–1404, 1993.
[33] S.-C. Wong, J.-K. Ting, and S.-L. Hsu, “Characterization and modeling of MOS mismatch in analog CMOS technology,” Proc. of the 1995 International Conference on Microelectronic Test Structures, pp. 171–176, 1995.
[34] S.-C. Wong, K.-H. Pan, D.-J. Ma, M. Liang, and P. Tseng, “On matching properties and process factors for submicrometer CMOS,” Proc. of the 1996 International Conference on Microelectronic Test Structures, pp. 43–47, 1996.
[35] J. Bastos, M. Steyaert, A. Pergoot, and W. Sansen, “Mismatch characterization of submicron MOS transistors,” Analog Integrated Circuits and Signal Processing, vol. 12, no. 2, pp. 95–106, 1997.
[36] S.-C. Wong, K.-H. Pan, and D.-J. Ma, “A CMOS mismatch model and scaling effects,” IEEE Electron Device Letters, vol. 18, no. 6, pp. 261–263, 1997.
[37] T. Serrano-Gotarredona and B. Linares-Barranco, “Mismatch characterization of submicron MOS transistors,” Analog Integrated Circuits and Signal Processing, vol. 21, no. 3, pp. 271–296, 1999.
[38] T. Serrano-Gotarredona and B. Linares-Barranco, “A new 5-parameter MOS transistors mismatch model,” Proc. 6th IEEE International Conference on Electronics, Circuits and Systems, pp. 315–318, 1999.
[39] T. Serrano-Gotarredona and B. Linares-Barranco, “A methodology for MOS transistor mismatch parameter extraction and mismatch simulation,” Proc. 2000 IEEE International Symposium on Circuits and Systems, pp. 109–112, 2000.
[40] T. Serrano-Gotarredona and B. Linares-Barranco, “A new strong inversion 5-parameter transistor mismatch model,” Proc. 2000 IEEE International Symposium on Circuits and Systems, pp. 381–384, 2000.
[41] M.-F. Lan and R. Geiger, “Modeling of random channel parameter variations in MOS transistors,” Proc. 2001 IEEE International Symposium on Circuits and Systems, vol. 1, pp. 85–88, 2001.
[42] HP4063 manual. Agilent Technologies.
[43] H. Tuinhout, “Design of matching test structures,” Proc. of the 1994 International Conference on Microelectronic Test Structures, pp. 21–27, 1994.
[44] A. Steegen, M. Stucchi, A. Lauwers, and K. Maex, “Silicide induced pattern density and orientation dependent transconductance in MOS transistors,” International Electron Devices Meeting 1999, pp. 497–500, 1999.
[45] S. Chetlur, S. Sen, E. Harris, H. Vaidya, I. Kizilyalli, R. Gregor, and B. Harding, “Influence of passivation anneal position on metal coverage dependent mismatch and hot carrier reliability,” Proc. of 7th International Symposium on Physical and Failure Analysis of Integrated Circuits, pp. 21–24, 1999.
[46] H. Tuinhout and M. Vertregt, “Characterization of systematic MOSFET current factor mismatch caused by metal CMP dummy structures,” IEEE Transactions on Semiconductor Manufacturing, vol. 14, no. 4, pp. 302–310, 2001.
[47] R. Gregor, “On the relationship between topography and transistor matching in an analog CMOS technology,” IEEE Transactions on Electron Devices, vol. 39, no. 2, pp. 275–282, 1992.
[48] G. Badenes, C. Perelló, A. Rupp, E. Vandamme, E. Augendre, S. Pochet, and L. Deferm, “Optimisation of critical parameters in a low cost high performance deep submicron CMOS technology,” Proc. of the 29th European Solid-State Device Research Conference, pp. 628–631, 1999.
[49] F. Forti and M. Wright, “Measurement of MOS current mismatch in the weak inversion region,” IEEE Journal of Solid-State Circuits, vol. 29, no. 2, pp. 138–142, 1994.
[50] M. Denison, A. Pergoot, and M. Tack, “Prediction of MOS matching in weak and moderate inversion from threshold matching in strong inversion,” Proc. of the 28th European Solid-State Device Research Conference, pp. 648–651, 1998.
[51] T. Mizuno, “Influence of statistical spatial-nonuniformity of dopant atoms on threshold voltage in a system of many MOSFETs,” Japanese Journal of Applied Physics, Part 1, vol. 35, no. 2B, pp. 842–848, 1996.
[52] S. Sun and J. Plummer, “Electron mobility in inversion and accumulation layers on thermally oxidized silicon surfaces,” IEEE Journal of Solid-State Circuits, vol. SC-15, no. 4, pp. 562–573, 1980.
[53] S. Schwarz and S. Russek, “Semi-empirical equations for electron velocity in silicon: part I-bulk,” IEEE Transactions on Electron Devices, vol. ED-30, no. 12, pp. 1629–1633, 1983.
[54] S. Schwarz and S. Russek, “Semi-empirical equations for electron velocity in silicon: part II-MOS inversion layer,” IEEE Transactions on Electron Devices, vol. ED-30, no. 12, pp. 1634–1639, 1983.
[55] S. Takagi, A. Toriumi, M. Iwase, and H. Tango, “On the universality of inversion layer mobility in Si MOSFET’s: part I-effects of substrate impurity concentration,” IEEE Transactions on Electron Devices, vol. 41, no. 12, pp. 2357–2362, 1994.
[56] S. Takagi, A. Toriumi, M. Iwase, and H. Tango, “On the universality of inversion layer mobility in Si MOSFET’s: part II-effects of surface orientation,” IEEE Transactions on Electron Devices, vol. 41, no. 12, pp. 2363–2368, 1994.
[57] J. Hauser, “Extraction of experimental mobility data for MOS devices,” IEEE Transactions on Electron Devices, vol. 43, no. 11, pp. 1981–1988, 1996.
[58] R. van Langevelde and F. Klaassen, “Effect of gate-field dependent mobility degradation on distortion analysis in MOSFET’s,” IEEE Transactions on Electron Devices, vol. 44, no. 11, pp. 2044–2052, 1997.
[59] E. Li, C. Liu, and H. Ng, “Effect of statistical variation on threshold voltage in narrow-channel MOSFETs,” Electronics Letters, vol. 26, no. 17, pp. 1390–1391, 1990.
[60] E. Felt, A. Narayan, and A. Sangiovanni-Vincentelli, “Measurement and modeling of MOS transistor current mismatch in analog IC’s,” Proc. of 1994 IEEE International Conference on Computer Aided Design, pp. 272–277, 1994.
[61] M. Steyaert, J. Bastos, R. Roovers, P. Kinget, W. Sansen, B. Graindourze, A. Pergoot, and E. Janssen, “Threshold voltage mismatch in short-channel MOS transistors,” Electronics Letters, vol. 30, no. 18, pp. 1546–1548, 1994.
[62] C. Kühn, S. Marksteiner, T. Kopley, and W. Weber, “New method for verification of analytical device models using transistor parameter fluctuations,” International Electron Devices Meeting 1997, pp. 145–148, 1997.
[63] S. Lovett, M. Welten, A. Mathewson, and B. Mason, “Optimizing MOS transistor mismatch,” IEEE Journal of Solid-State Circuits, vol. 33, no. 1, pp. 147–150, 1998.
[64] K. Okada, H. Onodera, and K. Tamaru, “Layout dependent matching analysis of CMOS circuits,” Analog Integrated Circuits and Signal Processing, vol. 25, no. 3, pp. 309–318, 2000.
[65] A. Maxim and M. Gheorghe, “A novel physical based model of deep-submicron CMOS transistors mismatch for Monte Carlo SPICE simulation,” Proc. 2001 IEEE International Symposium on Circuits and Systems, vol. 5, pp. 511–514, 2001.
[66] J. Schmitz, H. Tuinhout, A. Montree, Y. Ponomarev, P. Stolk, and P. Woerlee, “Gate polysilicon optimization for deep-submicron MOSFETs,” Proc. of the 29th European Solid-State Device Research Conference, pp. 156–159, 1999.
[67] H. Thibieroz and A. Duvallet, “Mismatch characterization and modelization of deep submicron CMOS transistors,” Proceedings of the SPIE, vol. 3881, pp. 121–128, 1999.
[68] A. Van den Bosch, M. Steyaert, and W. Sansen, “The extraction of transistor mismatch parameters: the CMOS current-steering D/A converter as a test structure,” Proc. 2000 IEEE International Symposium on Circuits and Systems, pp. 745–748, 2000.
[69] U. Schaper, C. Linnenbank, and R. Thewes, “Precise characterization of long-distance mismatch of CMOS devices,” IEEE Transactions on Semiconductor Manufacturing, vol. 14, no. 4, pp. 311–317, 2001.
[70] M. Hamer, “First-order parameter extraction on enhancement silicon MOS transistors,” IEE Proceedings I (Solid-State and Electron Devices), vol. 133, no. 2, pp. 49–54, 1986.
[71] C. Mourrain, B. Cretu, G. Ghibaudo, and P. Cottin, “New method for parameter extraction in deep sub-micrometer MOSFETs,” Proc. of the 2000 International Conference on Microelectronic Test Structures, pp. 181–186, 2000.
[72] S. Kubicek, W. Henson, A. De Keersgieter, G. Badenes, P. Jansen, H. van Meer, D. Kerr, A. Naem, L. Deferm, and K. De Meyer, “Investigation of intrinsic transistor performance of advanced CMOS devices with 2.5 nm NO gate oxides,” International Electron Devices Meeting 1999, pp. 823–826, 1999.
[73] H. Tuinhout, “Characterisation of systematic MOSFET transconductance mismatch,” Proc. of the 2000 International Conference on Microelectronic Test Structures, pp. 131–136, 2000.
[74] T. Cochet, T. Skotnicki, G. Ghibaudo, and A. Poncet, “Lateral dependence of dopant-number threshold voltage fluctuations in MOSFETs,” Proc. of the 29th European Solid-State Device Research Conference, pp. 680–683, 1999.
[75] N. Sano, K. Matsuzawa, A. Hiroki, and N. Nakayama, “Influence of statistical spatial-nonuniformity of dopant atoms on threshold voltage in a system of many MOSFETs,” Japanese Journal of Applied Physics, Part 2, vol. 41, no. 5B, pp. 552–554, 2002.
[76] A. Asenov, “Random dopant threshold voltage fluctuations in 50 nm epitaxial channel MOSFETs: A 3D ‘atomistic’ simulation study,” Proc. of the 28th European Solid-State Device Research Conference, pp. 300–303, 1998.
[77] A. Asenov, A. Brown, J. Davies, and S. Saini, “Hierarchical approach to ‘atomistic’ 3D MOSFET simulation,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 18, no. 11, pp. 1558–1565, 1999.
[78] A. Asenov, G. Slavcheva, A. Brown, J. Davies, and S. Saini, “Quantum mechanical enhancement of the random dopant induced threshold voltage fluctuations and lowering in sub 0.1 micron MOSFETs,” International Electron Devices Meeting 1999, pp. 535–538, 1999.
[79] A. Asenov and S. Saini, “Suppression of random dopant-induced threshold voltage fluctuations in sub-0.1-µm MOSFET’s with epitaxial and δ-doped channels,” IEEE Transactions on Electron Devices, vol. 46, no. 8, pp. 1718–1724, 1999.
[80] D. Frank, Y. Taur, M. Ieong, and H.-S. Wong, “Monte Carlo modeling of threshold variation due to dopant fluctuations,” Proc. of 1999 Symposium on VLSI Technology, pp. 169–170, 1999.
[81] A. Asenov and S. Kaya, “Effect of oxide interface roughness on the threshold voltage fluctuations in decanano MOSFETs with ultrathin gate oxides,” Proc. 2000 International Conference on Simulation of Semiconductor Processes and Devices, pp. 135–138, 2000.
[82] A. Asenov, G. Slavcheva, A. Brown, J. Davies, and S. Saini, “Increase in the random dopant induced threshold fluctuations and lowering in sub-100 nm MOSFETs due to quantum effects: A 3-D density-gradient simulation study,” IEEE Transactions on Electron Devices, vol. 48, no. 4, pp. 722–729, 2001.
[83] A. Asenov, G. Slavcheva, A. Brown, R. Balasubramaniam, and J. Davies, “Statistical 3D ‘atomistic’ simulation of decanano MOSFETs,” Superlattices and Microstructures, vol. 27, no. 2/3, pp. 215–227, 2000.
[84] C.-R. Ryou, S. Hwang, H. Shin, C.-H. Lee, Y. Park, and H. Min, “Three-dimensional simulation of discrete oxide charge effects in 0.1 µm MOSFETs,” Solid-State Electronics, vol. 45, pp. 1165–1172, 2001.
[85] H. Yamamoto, Y. Okada, and N. Sano, “Quantitative prediction of threshold voltage fluctuations in sub-100 nm MOSFETs by a new dopant model,” Proc. of the Device Research Conference, pp. 171–172, 2001.
[86] T. Ezaki, T. Ikezawa, A. Notsu, K. Tanaka, and M. Hane, “3D MOSFET simulation considering long-range Coulomb potential effects for analyzing statistical dopant-induced fluctuations associated with atomistic process simulator,” Proc. 2002 International Conference on Simulation of Semiconductor Processes and Devices, pp. 91–94, 2002.
[87] T. Ezaki, T. Ikezawa, and M. Hane, “Investigation of realistic dopant fluctuation induced device characteristics variation for sub-100 nm CMOS by using atomistic 3D process/device simulator,” International Electron Devices Meeting 2002, pp. 311–314, 2002.
[88] W. Gross, D. Vasileska, and D. Ferry, “Three-dimensional simulations of ultrasmall metal-oxide-semiconductor field-effect transistors: The role of the discrete impurities on the device terminal characteristics,” Journal of Applied Physics, vol. 91, no. 6, pp. 3737–3740, 2002.
[89] G. Iannaccone and E. Amirante, “Quantum and semiclassical modeling of the threshold voltage dispersion due to random dopants in deep submicron MOSFETs,” Proc. of the 2002 2nd IEEE Conference on Nanotechnology, pp. 197–200, 2002.
[90] Y. Oda, Y. Ohkura, K. Suzuki, S. Ito, H. Amakawa, and K. Nishi, “Statistical fluctuation analysis by Monte Carlo ion implantation method,” Proc. 2002 International Conference on Simulation of Semiconductor Processes and Devices, pp. 199–202, 2002.
[91] T. Ezaki, T. Ikezawa, A. Notsu, K. Tanaka, and M. Hane, “Three dimensional MOSFET simulation for analyzing statistical dopant-induced fluctuations associated with atomistic process simulator,” IEICE Transactions on Electronics, vol. E86-C, no. 3, pp. 409–415, 2003.
[92] Y. Oda, Y. Ohkura, K. Suzuki, S. Ito, H. Amakawa, and K. Nishi, “Statistical threshold voltage analysis by Monte Carlo ion implantation method,” IEICE Transactions on Electronics, vol. E86-C, no. 3, pp. 416–420, 2003.
[93] V.-Y. Thean, M. Sadd, and E. White Jr., “Effects of dopant granularity on superhalo-channel MOSFET’s according to two- and three-dimensional computer simulations,” IEEE Transactions on Nanotechnology, vol. 2, no. 2, pp. 97–101, 2003.
[94] Y. Yasuda, M. Takamiya, and T. Hiramoto, “Separation of effects of statistical impurity number fluctuations and position distribution on Vth fluctuations in scaled MOSFETs,” IEEE Transactions on Electron Devices, vol. 47, no. 10, pp. 1838–1842, 2000.
[95] Y. Tsividis, Operation and modeling of the MOS transistor. McGraw-Hill international editions, 2nd ed., 1999.
[96] Y. Taur and T. Ning, Fundamentals of modern VLSI devices. Cambridge university press, 1998.
[97] N. Collaert, Alternative transistor structures: Modeling and optimisation of the vertical advanced heterojunction MOSFET. K.U. Leuven, 2000.
[98] J. Brews, “Carrier-density fluctuations and the IGFET mobility near threshold,” Journal of Applied Physics, vol. 46, no. 5, pp. 2193–2203, 1975.
[99] T. Ando, A. Fowler, and F. Stern, “Electronic properties of two-dimensional systems,” Reviews of Modern Physics, vol. 54, no. 2, pp. 437–672, 1982.
[100] M. Lundstrom, Fundamentals of carrier transport. Cambridge university press, 2nd ed., 2000.
[101] H. Shin, G. Yeric, A. Tasch, and C. Maziar, “Physically-based models for effective mobility and local-field mobility of electrons in MOS inversion layers,” Solid-State Electronics, vol. 34, no. 6, pp. 545–552, 1991.
[102] K. Takeuchi, “Channel size dependence of dopant-induced threshold voltage fluctuation,” Proc. of 1998 Symposium on VLSI Technology, pp. 72–73, 1998.
[103] Medici manual. Avant!
[104] Y. Yasuda, M. Takamiya, and T. Hiramoto, “Threshold voltage fluctuations induced by statistical ‘position’ and ‘number’ impurity fluctuations in bulk MOSFETs,” Superlattices and Microstructures, vol. 28, no. 5/6, pp. 357–361, 2000.
[105] T. Tanaka, T. Usuki, Y. Momiyama, and T. Sugii, “Direct measurement of Vth fluctuations caused by impurity positioning,” Proc. of 2000 Symposium on VLSI Technology, pp. 136–137, 2000.
[106] T. Tanaka, T. Usuki, T. Futatsugi, Y. Momiyama, and T. Sugii, “Vth fluctuations induced by statistical variation of pocket dopant profile,” International Electron Devices Meeting 2000, pp. 271–274, 2000.
[107] A. Asenov, “Random dopant induced threshold voltage lowering and fluctuations in sub 50 nm MOSFET’s: a statistical 3D ‘atomistic’ simulation study,” Nanotechnology, vol. 10, pp. 153–158, 1999.
[108] International technology roadmap for semiconductors. Sematech, 2001 ed.
[109] A. Asenov and S. Saini, “Influence of the polysilicon gate on the random dopant induced threshold voltage fluctuations in sub 100 nm MOSFETs with thin gate oxides,” Proc. of the 29th European Solid-State Device Research Conference, pp. 188–191, 1999.
[110] R. Difrenza, P. Llinares, G. Morin, E. Granger, and G. Ghibaudo, “A new model for threshold voltage mismatch based on the random fluctuations of dopant number in the MOS transistor gate,” Proc. of the 31st European Solid-State Device Research Conference, pp. 299–302, 2001.
[111] A. Asenov and S. Saini, “Polysilicon gate enhancement of the random dopant induced threshold voltage fluctuations in sub-100 nm MOSFET’s with ultrathin gate oxide,” IEEE Transactions on Electron Devices, vol. 47, no. 4, pp. 805–812, 2000.
[112] R. Difrenza, J. Vildeuil, P. Llinares, and G. Ghibaudo, “Impact of grain number fluctuations in the MOS transistor gate on matching performance,” Proc. of the 2003 International Conference on Microelectronic Test Structures, pp. 244–249, 2003.
[113] J. Brews, “Theory of the carrier-density fluctuations in an IGFET near threshold,” Journal of Applied Physics, vol. 46, no. 5, pp. 2181–2192, 1975.
[114] A. Asenov, S. Kaya, and J. Davies, “Oxide thickness variation induced threshold voltage fluctuations in decanano MOSFETs: a 3D density gradient simulation study,” Superlattices and Microstructures, vol. 28, no. 5/6, pp. 507–515, 2000.
[115] G. Slavcheva, J. Davies, A. Brown, and A. Asenov, “Potential fluctuations in metal-oxide-semiconductor field-effect transistors generated by random impurities in the depletion layer,” Journal of Applied Physics, vol. 91, no. 7, pp. 4326–4334, 2002.
[116] J. Zhao, H. Chen, and C. Teng, “Investigation of charging damage induced Vt mismatch for submicron mixed-signal technology,” Proc. of the Reliability Physics Symposium, pp. 33–36, 1996.
[117] J. Bastos, M. Steyaert, R. Roovers, P. Kinget, W. Sansen, B. Graindourze, A. Pergoot, and E. Janssens, “Mismatch characterization of small size MOS transistors,” Proc. of the 1995 International Conference on Microelectronic Test Structures, pp. 271–276, 1995.
[118] A. Van den Bosch, M. Steyaert, and W. Sansen, “A high density matched hexagonal transistor structure in standard CMOS technology for high speed applications,” Proc. of the 1999 International Conference on Microelectronic Test Structures, pp. 212–215, 1999.
[119] J. Dubois, J. Knol, M. Bolt, H. Tuinhout, J. Schmitz, and P. Stolk, “Impact of source/drain implants on threshold voltage matching in deep sub-micron CMOS technologies,” Proc. of the 32nd European Solid-State Device Research Conference, pp. 115–118, 2002.
[120] J.-S. Goo, Q. Xiang, Y. Takamura, F. Arasnia, E. Paton, P. Besser, J. Pan, and M.-R. Lin, “Band offset induced threshold variation in strained-Si nMOSFETs,” IEEE Electron Device Letters, vol. 24, no. 9, pp. 568–570, 2003.
[121] K. Takeuchi, R. Koh, and T. Mogami, “A study of the threshold voltage variation for ultra-small bulk and SOI CMOS,” IEEE Transactions on Electron Devices, vol. 48, no. 9, pp. 1995–2001, 2001.
[122] S. Xiong and J. Bokor, “Sensitivity of double-gate and FinFET devices to process variations,” IEEE Transactions on Electron Devices, vol. 50, no. 11, pp. 2255–2261, 2003.
[123] D. He, H. Solak, W. Li, and F. Cerrina, “Extreme ultraviolet and x-ray resist: Comparison study,” Journal of Vacuum Science & Technology B, vol. 17, no. 6, pp. 3379–3383, 1999.
[124] G. Schmid, M. Stewart, V. Singh, and C. Willson, “Spatial distribution of reaction products in positive tone chemically amplified resists,” Journal of Vacuum Science & Technology B, vol. 20, no. 1, pp. 185–190, 2002.
[125] P. Oldiges, Q. Lin, K. Petrillo, M. Sanches, M. Ieong, and M. Hargrove, “Modeling line edge roughness effects in sub 100 nanometer gate length devices,” Proc. 2000 International Conference on Simulation of Semiconductor Processes and Devices, pp. 131–134, 2000.
[126] T. Linton, M. Chandhok, B. Rice, and G. Schrom, “Determination of the line edge roughness specification for 34 nm devices,” International Electron Devices Meeting 2002, pp. 303–306, 2002.
[127] S. Kaya, A. Brown, A. Asenov, D. Magot, and T. Linton, “Analysis of statistical fluctuations due to line edge roughness in sub-0.1 µm MOSFETs,” Proc. 2001 International Conference on Simulation of Semiconductor Processes and Devices, pp. 78–81, 2001.
[128] A. Asenov, S. Kaya, and A. Brown, “Intrinsic parameter fluctuations in decananometer MOSFETs introduced by gate line edge roughness,” IEEE Transactions on Electron Devices, vol. 50, no. 5, pp. 1254–1260, 2003.
[129] C. Diaz, H.-J. Tao, Y.-C. Ku, A. Yen, and K. Young, “An experimentally validated analytical model for gate line-edge roughness (LER) effects on technology scaling,” IEEE Electron Device Letters, vol. 22, no. 6, pp. 287–289, 2001.