KATHOLIEKE UNIVERSITEIT LEUVEN
FACULTEIT WETENSCHAPPEN
DEPARTMENT OF COMPUTER SCIENCE
Celestijnenlaan 200A, B-3001 Heverlee, Belgium

ACCELERATION OF WAVEFORM RELAXATION METHODS FOR LINEAR ORDINARY AND PARTIAL DIFFERENTIAL EQUATIONS

Supervisors: Prof. Dr. ir. S. Vandewalle, Prof. Dr. H. Van de Vel

Dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Science by

Jan JANSSEN

December 1997
Jury:
Prof. Dr. ir. S. Vandewalle, supervisor
Prof. Dr. H. Van de Vel, supervisor
Prof. Dr. ir. A. Lumsdaine (University of Notre Dame, U.S.A.)
Prof. Dr. S. Poedts
Prof. Dr. ir. D. Roose
Prof. Dr. W. Van Assche

Dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Science by

Jan JANSSEN

December 1997
ACCELERATION OF WAVEFORM RELAXATION METHODS FOR LINEAR ORDINARY AND PARTIAL DIFFERENTIAL EQUATIONS

Jan Janssen
Department of Computer Science, K.U.Leuven
Celestijnenlaan 200A, 3001 Heverlee, Belgium
Abstract

Waveform relaxation is an iterative method for numerically solving large systems of ordinary differential equations. Its key idea is to solve the system of differential equations by integrating a sequence of subsystems in fewer variables within an iterative procedure. As such, the method can be regarded as the natural extension of the classical relaxation methods for solving systems of algebraic equations, with iteration vectors consisting of functions in time (waveforms) instead of scalar values. Although waveform relaxation methods have been applied to a wide variety of problems, most convergence studies so far concentrated on a certain class of linear first-order differential systems. In this thesis, we develop a theoretical framework to study the methods for more general problems. This framework is first used to investigate the convergence behaviour of the Jacobi and Gauss-Seidel waveform relaxation methods. The results are applied to several model parabolic partial differential equations, discretised by finite differences or finite elements. The latter waveform variants are shown to converge as fast as the Jacobi and Gauss-Seidel relaxation methods for the corresponding steady-state problems. Next, we study acceleration methods to improve the convergence of the standard waveform methods. These acceleration techniques are analogous to the ones used to accelerate classical relaxation methods for algebraic systems (e.g., derived from an elliptic partial differential equation). In particular, we consider the successive overrelaxation, Chebyshev and multigrid techniques. For some problems, we are again able to prove that the convergence behaviour of the accelerated waveform methods is similar to that of the associated techniques for systems of algebraic equations.
Acknowledgements

When I was a last-year student, I started dreaming about doing scientific research and obtaining a Ph.D. Therefore, I was pleased to be offered an assistant research position at the Department of Computer Science right after my graduation. Since then, five years of hard and stressful, but very satisfying work have passed, in which I managed to finish this thesis. It would not be complete without a special word of thanks to the many people that somehow helped to make my dream come true.

In the first place, I wish to express my gratitude to my supervisor Prof. Stefan Vandewalle for his continuous guidance. He introduced me to the field of waveform relaxation methods, assisted me in writing papers and helped me to establish my first scientific contacts. I especially appreciated his friendliness, broad background and punctuality, as well as his remarkable sense of humour. I also want to thank my supervisor Prof. Hugo Van de Vel for giving me wide freedom in pursuing the research topics that most interested me and for the interest he showed in my work. I wish to thank Prof. Stefaan Poedts, Prof. Dirk Roose and Prof. Walter Van Assche for accepting to be in my jury. A special word of thanks goes out to Prof. Andrew Lumsdaine, as he specially flew over from the U.S.A. to judge this Ph.D. thesis.

This dissertation would not have existed but for the financial support that I received from diverse sources. During the first years of my work, I was financed by the Belgian Incentive Program "Information Technology - Computer Science of the Future (IT/IF/5)", initiated by the Belgian State, Prime Minister's Service, Federal Office for Scientific, Technical and Cultural Affairs. After that, I was privileged to work during two extra years with support from the Research Fund K.U.Leuven (OT/94/16). I am most indebted to Dirk Roose for offering me these opportunities. In addition, his continuous search for sponsoring allowed me to visit several interesting scientific meetings.

I belonged to the parallel/scientific computing group of the Department of Computer Science. As its members over the years I got to know about twenty young people seriously interested in numerical mathematics. They all created a lively atmosphere and helped me in various ways, some with clever hints on LaTeX or UNIX, others with interesting and stimulating discussions. I also appreciated the good understanding with my office mates Kurt Lust, Karl Meerbergen, Hilde Vanaenroyde, Pierre Verlinden and Jan Verschelde, the excellent service of the computer systems group and the administrative efforts of our secretaries.

This thesis has greatly benefited from numerous discussions held at conferences, and from an intense e-mail correspondence with scientists working in closely related areas. In particular, I want to mention interesting conversations with Morten Bjørhus, Wai-Shing (Danny) Luk, Andrew Lumsdaine, Ulla Miekkala, Mark Reichelt and my co-authors Min Hu and Ken Jackson. Special thanks also go to Johan Quaegebeur of the Department of Mathematics for helping me with some functional analysis problems, and to Jo Simoens for the close collaboration during the work on his engineering thesis.

Next, I want to thank some people who are very close to me. First of all, I am most grateful to my parents for their endless support and encouragement throughout my studies. I could also rely on my sister, her husband and my godchild Fleur at the hardest times, and had some wonderful friends (you know who you are!) to pass my free time with. Odd as it may seem, this list would not be complete without paying tribute to Asja, the adorable Bernese Mountain Dog of my future parents-in-law. The numerous walks I took her on always relaxed me completely and motivated me over and over again.

Finally, I wish to dedicate this dissertation to my fiancée Nancy, for her great love, her patience and her understanding displayed throughout the duration of this work. Her joy of living and optimism cheered me up at my most gloomy moments, and continuously gave me the strength to bring this thesis to a successful conclusion.

Jan Janssen
Heverlee, December 1997
Contents

Acknowledgements i
Contents iii
Notations and Abbreviations v
Nederlandse Samenvatting ix

1 Introduction 1
  1.1 Waveform relaxation methods 1
    1.1.1 Basic ideas 1
    1.1.2 Some standard waveform methods 2
    1.1.3 An example 4
  1.2 A survey of the waveform relaxation literature 5
    1.2.1 General convergence results 5
    1.2.2 Acceleration techniques 6
  1.3 Thesis overview 8

2 Functional Analysis Preliminaries 11
  2.1 Banach spaces 11
  2.2 Spectral properties of convolution-like operators 14
    2.2.1 The continuous-time case 14
    2.2.2 The discrete-time case 19

3 Basic Waveform Relaxation Methods 23
  3.1 Description of the method 23
    3.1.1 The continuous-time case 23
    3.1.2 The discrete-time case 24
  3.2 Convergence analysis 26
    3.2.1 The continuous-time case 26
    3.2.2 The discrete-time case 33
    3.2.3 Discrete-time versus continuous-time results 37
  3.3 Model problem analysis 37
    3.3.1 Description of the model problems 38
    3.3.2 Theoretical results 40
    3.3.3 Numerical results 43
  3.4 Some concluding remarks 47

4 Waveform Relaxation Methods and Successive Overrelaxation 49
  4.1 Description of the method 49
    4.1.1 The continuous-time case 49
    4.1.2 The discrete-time case 51
  4.2 Convergence analysis 51
    4.2.1 The continuous-time case 52
    4.2.2 The discrete-time case 61
    4.2.3 Discrete-time versus continuous-time results 64
    4.2.4 An extension of the theory for more general problems 69
  4.3 Model problem analysis 71
    4.3.1 Theoretical results 71
    4.3.2 Numerical results 78
  4.4 Some concluding remarks 85

5 Chebyshev Acceleration of Waveform Relaxation Methods 87
  5.1 Polynomial acceleration of waveform relaxation 87
  5.2 Convolution-based polynomial acceleration of waveform relaxation 89
    5.2.1 Chebyshev acceleration in the frequency domain 89
    5.2.2 Convolution-based Chebyshev acceleration in the time domain 93
  5.3 Model problem analysis 95
    5.3.1 The Chebyshev-Picard method 95
    5.3.2 Chebyshev-Jacobi waveform relaxation 99
    5.3.3 Chebyshev-Gauss-Seidel waveform relaxation 103
    5.3.4 Numerical results 106
  5.4 Some concluding remarks 106

6 Multigrid Waveform Relaxation Methods 109
  6.1 Description of the method 109
    6.1.1 The continuous-time case 109
    6.1.2 The discrete-time case 110
  6.2 Convergence analysis 110
    6.2.1 The continuous-time case 111
    6.2.2 The discrete-time case 113
    6.2.3 Discrete-time versus continuous-time results 115
    6.2.4 An extension of the two-grid results to the multigrid case 116
  6.3 Model problem analysis 118
    6.3.1 Theoretical results 118
    6.3.2 Numerical results 123
  6.4 Some concluding remarks 128

7 Concluding Remarks and Suggestions for Future Research 129
  7.1 Concluding remarks 129
  7.2 Suggestions for future research 130

Bibliography 133
Notations and Abbreviations

Below we present lists of symbols and abbreviations used in the text, together with a brief explanation of their meaning. We have added the page number(s) on which they first occur and/or where more information can be found. The lists are restricted to symbols/abbreviations that are used more or less frequently. Symbols with a generally known meaning, such as the ones for number spaces (R, C, ...), differentiation operators (d/dt, ∂/∂t, ∂/∂x, ∂²/∂x², ...) and relational operators (∈, ⊂, <, ≤, ...) are omitted, as well as various well-known abbreviations (det, lim, max, sup, ...). We will occasionally use the same symbol to denote different objects. In these cases, we will list all possible meanings of the symbol, the correct interpretation of which should be clear from the context in which it is used.
List of symbols

≡ : equal everywhere, p. 4
≈ : approximate equality, p. 40
∗ : convolution, p. 14, p. 19
[n] : n-th element of a sequence (e.g., x[n], u[n]), p. 11, p. 19
(ν) : ν-th iterate (e.g., u^(ν)), p. 2
T : transposition operator, p. 2
(·)_τ : sequence of length N (e.g., u_τ), p. 19
~ : (discrete) Laplace transform (e.g., x̃(z), ẽ^(ν)(z)), p. 21, p. 27
| · | : modulus of a complex number, p. 11
‖ · ‖ : (Euclidean) vector norm and induced matrix norm, p. 12, p. 15, p. 18
‖ · ‖_{C[0,T]} : maximum norm on C[0,T], p. 12
‖ · ‖_{l_p(N)} : p-norm on l_p(N), p. 12
‖ · ‖_{L_p(0,T)} : p-norm on L_p(0,T), p. 12
‖ · ‖_X : norm on X, p. 11
a(z) : first characteristic polynomial of a linear multistep method, p. 24
a(·,·) : bilinear form corresponding to the operator L or −Δ_m, p. 31, p. 39
ȧ(z) : derivative of a(z) with respect to z, p. 25
a_ij : matrix elements/blocks of A, p. 39, p. 50
A : d × d (stiffness) matrix, p. 5, p. 39
A_h, A_H : fine-grid, coarse-grid stiffness matrix, p. 110
Arg(·) : principal argument of a complex number (e.g., Arg(−z)), p. 25
b(z) : second characteristic polynomial of a linear multistep method, p. 24
b_ij : matrix elements/blocks of B, p. 39, p. 50
B : d × d nonsingular (mass) matrix, p. 6, p. 39
B_h, B_H : fine-grid, coarse-grid mass matrix, p. 110
C[0,T] : Banach space of continuous functions on [0,T], p. 11
C̄ : set of complex numbers with the point at infinity, p. 25
d : dimension of the general ODE, p. 2, p. 5, p. 6
d(z) : midpoint of the ellipse E(z), p. 89
d_a : diagonal element of the constant diagonal matrix D_A = d_a I, p. 30
D : diagonal part of a matrix (e.g., D_A, D_B), p. 6, p. 24
e^(ν), e^(ν)(t) : continuous-time error waveform, p. 27
e^(ν)_τ, e^(ν)[n] : discrete-time error sequence, element, p. 33, p. 43
E(z), E_τ(z) : ellipse which is the boundary of R(z), R_τ(z), p. 70, p. 89
h, h_i, H : mesh sizes, p. 38, p. 109, p. 116
H : d × d matrix, p. 14; Sobolev space, p. 31, p. 39
H(z) : H + H_c(z), with H_c(z) the Laplace transform of h_c, p. 17
H_τ(z) : discrete Laplace transform or Z-transform of h_τ, p. 19
𝓗 : general linear operator in a Banach space X, p. 12; operator of the form H + H_c, p. 14
H_c, h_c : linear Volterra convolution operator, kernel, p. 14
H_τ, h_τ : linear discrete convolution operator, kernel, p. 19
I : identity matrix/operator, p. 13, p. 17
K(z) : symbol of the continuous-time WR operator K, p. 27
K_τ(z) : symbol of the discrete-time WR operator K_τ, p. 34
K : continuous-time WR operator, p. 27
K_c, k_c : linear Volterra convolution operator, kernel, p. 27
K_τ, k_τ : discrete-time WR operator, kernel, p. 33, p. 34
l_p(N) : Banach space of p-summable sequences, p. 12
L_p(0,T) : Banach space of p-th power Lebesgue-integrable functions on (0,T), p. 12
L : lower-triangular part of a matrix (e.g., L_A, L_B), p. 6, p. 24
L : elliptic differential operator, p. 6
L(·) : Laplace transform (e.g., L(k(t))), p. 17
(M, N) : matrix splitting (e.g., A = M_A − N_A, B = M_B − N_B), p. 6, p. 23
M(z) : symbol of the continuous-time two-grid WR operator M, p. 111
M_τ(z) : symbol of the discrete-time two-grid WR operator M_τ, p. 113
M : continuous-time two-grid WR operator, p. 111
M_c, m_c : linear Volterra convolution operator, kernel, p. 111
M_τ : discrete-time two-grid WR operator, p. 113
N : number of time steps, p. 19
N(·) : null space (e.g., N(I − 𝓗)), p. 14
O : big "O" (order) symbol, p. 16
p : prolongation operator, p. 110
p(z), p_τ(z) : largest semi-axis of the ellipse E(z), E_τ(z), p. 70, p. 89
P_ν : ν-th degree polynomial Q_ν(z, ·) that is "optimal" with respect to the region R, p. 90
q(z), q_τ(z) : smallest semi-axis of the ellipse E(z), E_τ(z), p. 70, p. 89
q_ν(·) : ν-th degree polynomial for which q_ν(1) = 1, p. 87
Q_ν(z, ·) : ν-th degree polynomial with z-dependent coefficients for which Q_ν(z, 1) = 1, p. 89
r : restriction operator, p. 110
R(z), R_τ(z) : elliptical region containing the eigenvalues of K(z), K_τ^JAC(z), p. 70, p. 89
Re(·) : real part of a complex number (e.g., Re(z)), p. 17
R(·) : range (e.g., R(I − 𝓗)), p. 13
S (∂S, int S) : stability region (boundary, interior), p. 25, p. 35, p. 36
T : length of the time interval, p. 5, p. 43
T_ν(·) : ν-th degree Chebyshev polynomial of the first kind, p. 90
u : solution of the linear system, p. 6
u, u(t) : solution of the ODE, p. 2
u̇, u̇(t) : derivative of u with respect to t, p. 2
u_0 : initial value, p. 2
u_τ : fully discrete solution of the ODE, p. 19
u, u(x, t) : solution of the PDE, p. 6, p. 38
U : upper-triangular part of a matrix (e.g., U_A, U_B), p. 6, p. 24
X : (normed) vector space or Banach space, p. 11
Z(·) : discrete Laplace transform or Z-transform (e.g., Z(h_τ)), p. 19
α_0, α_1, ..., α_k : linear multistep coefficients, p. 24
β_0, β_1, ..., β_k : linear multistep coefficients, p. 24
δ(t), δ_τ : (discrete) delta function, p. 50, p. 51
Δ_m : m-dimensional Laplace operator, p. 38
θ(z), θ_τ(z) : inclination angle of the ellipse E(z), E_τ(z), p. 70, p. 89
μ_1 : largest-magnitude eigenvalue of K^JAC(0), p. 30
μ_1(z), μ_τ^(1)(z) : largest-magnitude eigenvalue of K^JAC(z), K_τ^JAC(z), p. 55, p. 63
ν_1, ν_2 : number of pre-smoothing, post-smoothing steps, p. 109, p. 110
ω : overrelaxation parameter, p. 49
Ω, Ω(t) : overrelaxation function or kernel of the form ω δ(t) + ω_c(t), p. 50
Ω_h, Ω_H : spatial grids with mesh sizes h and H, p. 38, p. 109
Ω_τ : overrelaxation sequence or kernel of the form ω δ_τ + (ω_c)_τ, p. 51
ρ(·) : spectral radius (e.g., ρ(𝓗), ρ(H_τ)), p. 13, p. 15
ρ^(ν) : iteration convergence factor, p. 43
ρ^(ν)[k] : time-level convergence factor, p. 46
ρ_R(·) : virtual spectral radius with respect to the region R/R_τ (e.g., ρ_R(K_τ^CSOR(z)), ρ_R(Q_ν(z, K(z)))), p. 70, p. 89
ϱ(·) : asymptotic averaged spectral radius (e.g., ϱ(q_ν(K)), ϱ(q_ν(K(z)))), p. 88
ϱ_R(·) : virtual asymptotic averaged spectral radius with respect to the region R (e.g., ϱ_R(Q_ν(z, K(z)))), p. 90
σ(·) : spectrum (e.g., σ(𝓗), σ(H_τ)), p. 13, p. 15
σ_e(·) : essential spectrum (e.g., σ_e(𝓗), σ_e(H_τ)), p. 14, p. 15
τ : fixed time step, p. 19

List of abbreviations

BDF : backward differentiation formula(e), p. 25
CH : Chebyshev, p. 94
CN : Crank-Nicolson, p. 25
codim : co-dimension, p. 14
CSOR : convolution SOR, p. 7
DSSOR : double-splitting SOR, p. 50
FFT : fast Fourier transform, p. 69
GS : Gauss-Seidel, p. 29
IVP : initial-value problem, p. 1
JAC : Jacobi, p. 29
ODE : ordinary differential equation, p. 1
opt : optimal, p. 55
PIC : Picard, p. 95
PDE : partial differential equation, p. 6
SSSOR : single-splitting SOR, p. 50
SOR : successive overrelaxation, p. 6
WR : waveform relaxation, p. 1
Nederlandse Samenvatting (Summary in Dutch): Acceleration of Waveform Relaxation Methods for Linear Ordinary and Partial Differential Equations

Contents

1 Introduction x
  1.1 Waveform relaxation methods x
  1.2 A survey of the waveform relaxation literature xi
  1.3 Thesis overview xiii
2 Functional analysis preliminaries xiv
  2.1 Banach spaces xiv
  2.2 Spectral properties of convolution-like operators xiv
3 Basic waveform relaxation methods xvi
  3.1 Description of the method xvi
  3.2 Convergence analysis xviii
  3.3 Model problem analysis xxi
4 Waveform relaxation methods and successive overrelaxation xxii
  4.1 Description of the method xxii
  4.2 Convergence analysis xxiv
  4.3 Model problem analysis xxix
5 Chebyshev acceleration of waveform relaxation methods xxxi
  5.1 Polynomial acceleration of waveform relaxation xxxi
  5.2 Convolution-based polynomial acceleration of waveform relaxation xxxii
  5.3 Model problem analysis xxxiv
6 Multigrid waveform relaxation methods xxxv
  6.1 Description of the method xxxv
  6.2 Convergence analysis xxxvi
  6.3 Model problem analysis xxxix
7 Conclusions and suggestions for further research xxxix
1 Introduction

1.1 Waveform relaxation methods

Basic ideas

Waveform relaxation is an iterative solution method for initial-value problems consisting of ordinary differential equations (or differential-algebraic equations). The method was introduced as an alternative to the classical step-by-step methods in the context of the numerical simulation of large-scale electrical networks [50]. Waveform relaxation methods decouple the given problem at the level of the differential equation. The large initial-value problem is then solved iteratively by solving the smaller decoupled subproblems in a process that is repeated until convergence occurs. Values from the previous iteration are used for the variables of the other subproblems. Waveform relaxation methods are continuous iterative methods: from a given vector of functions approximating the solution, the methods compute a new approximation on a whole time interval. The methods differ from classical iterative techniques in that they iterate with continuous functions or waveforms, and can therefore be defined without reference to a time discretisation. The subproblems that arise in a waveform iteration are usually chosen to be much smaller than the original problem; an additional requirement for a practical method is that these decoupled problems be easy to solve. A computer implementation uses a standard integration method to solve the decoupled subproblems. In the resulting discrete waveform algorithm it suffices to choose the time step of this integration scheme such that only the behaviour of the local variables is resolved accurately. Waveform relaxation methods are thus, in effect, genuine variable-step integration methods. They can also easily be implemented on a parallel architecture by distributing the subproblems over the different processors.

A disadvantage of waveform methods is that they require a considerable amount of computer memory, since they store at least one waveform for every unknown.
Some standard waveform methods

Consider the following system of d nonlinear, first-order differential(-algebraic) equations

  F(t, u, u̇) = 0,  u(0) = u_0,  t > 0,  (1)

or, componentwise,

  F_i(t, u_1, u_2, ..., u_d, u̇_1, u̇_2, ..., u̇_d) = 0,  u_i(0) = (u_0)_i,  i = 1, 2, ..., d.

The Jacobi waveform relaxation algorithm for (1) computes a "new" waveform u^(ν) from the "old" one u^(ν−1) by solving the equations

  F_i(t, u_1^(ν−1), ..., u_{i−1}^(ν−1), u_i^(ν), u_{i+1}^(ν−1), ..., u_d^(ν−1), u̇_1^(ν−1), ..., u̇_{i−1}^(ν−1), u̇_i^(ν), u̇_{i+1}^(ν−1), ..., u̇_d^(ν−1)) = 0,  (2)

with u_i^(ν)(0) = (u_0)_i and i = 1, 2, ..., d. The iteration is started from an initial approximation u^(0).

The above algorithm transforms the problem of solving the d-dimensional system (1) into one in which equations in a single variable must be solved repeatedly. Note the resemblance to the Jacobi relaxation method for algebraic equations, which is based on an analogous principle. The Gauss-Seidel waveform relaxation algorithm is obtained by a simple modification of the differential(-algebraic) equations (2), i.e.,

  F_i(t, u_1^(ν), ..., u_i^(ν), u_{i+1}^(ν−1), ..., u_d^(ν−1), u̇_1^(ν), ..., u̇_i^(ν), u̇_{i+1}^(ν−1), ..., u̇_d^(ν−1)) = 0.

In what follows we will often use the term dynamic iteration as a synonym for waveform relaxation, while the classical relaxation methods for algebraic systems will be referred to as static iterations. In the thesis, the waveform relaxation Newton method, the waveform Picard method and the waveform Newton method are also defined.
An example

We illustrate the operation of the Jacobi waveform relaxation algorithm on the classical system of differential equations defining the sine and cosine functions,

  F_1(t, u_1, u_2, u̇_1, u̇_2) = u̇_1 − u_2 = 0,  with u_1(0) = 0,
  F_2(t, u_1, u_2, u̇_1, u̇_2) = u̇_2 + u_1 = 0,  with u_2(0) = 1.

The iteration is started with the initial approximations u_1^(0)(t) ≡ 0 and u_2^(0)(t) ≡ 1. For ν ≥ 1, the Jacobi iterates u_1^(ν) and u_2^(ν) are computed by solving (2). One easily verifies that these functions are given by partial sums of the Taylor expansions of the solution functions,

  u_1^(2ν−1)(t) = u_1^(2ν)(t) = Σ_{i=0}^{ν−1} (−1)^i t^{2i+1}/(2i+1)!,
  u_2^(2ν)(t) = u_2^(2ν+1)(t) = Σ_{i=0}^{ν} (−1)^i t^{2i}/(2i)!,  ν ≥ 1.

For a graphical illustration we refer to Figure 1.1 (p. 5).
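The Jacobi sweeps above are easy to reproduce numerically. The sketch below is purely illustrative (it is not code from the thesis): each sweep integrates the previous iterate with the trapezoidal rule, standing in for the exact solution of the decoupled scalar subsystems, and the iterates approach sin t and cos t on the window [0, 2].

```python
import numpy as np

# Jacobi waveform relaxation for  u1' = u2, u1(0)=0  and  u2' = -u1, u2(0)=1.
# Each sweep uses only the previous iterate: the new u1 is the cumulative
# integral of the old u2, and the new u2 is 1 minus the integral of the old u1.

def jacobi_wr(T=2.0, n=2000, sweeps=20):
    t = np.linspace(0.0, T, n + 1)
    u1 = np.zeros_like(t)          # initial approximation u1^(0)(t) = 0
    u2 = np.ones_like(t)           # initial approximation u2^(0)(t) = 1
    du = np.diff(t)
    for _ in range(sweeps):
        # trapezoidal cumulative integrals of the *old* iterates (Jacobi)
        i1 = np.concatenate(([0.0], np.cumsum(0.5 * du * (u2[:-1] + u2[1:]))))
        i2 = np.concatenate(([0.0], np.cumsum(0.5 * du * (u1[:-1] + u1[1:]))))
        u1, u2 = i1, 1.0 - i2      # simultaneous update
    return t, u1, u2

t, u1, u2 = jacobi_wr()
print(np.max(np.abs(u1 - np.sin(t))), np.max(np.abs(u2 - np.cos(t))))
```

After 20 sweeps the iterates agree with the Taylor partial sums given above, so the remaining error on [0, 2] is dominated by the quadrature step, not by the iteration.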
1.2 A survey of the waveform relaxation literature

General convergence results

The most general convergence analysis of waveform relaxation for nonlinear first-order problems was presented in the electrical engineering community [51, 101, 102]. This analysis gave considerable insight into the qualitative convergence behaviour of dynamic methods, but is of little use if one is interested in quantitative predictions of the convergence. Miekkala and Nevanlinna therefore studied the convergence behaviour of waveform methods for linear systems of differential equations

  u̇ + Au = f,  u(0) = u_0,  t > 0,  (3)

with A ∈ C^{d×d} [63, 64]. For such problems the Jacobi scheme (2) can be written as

  u̇^(ν) + M_A u^(ν) = N_A u^(ν−1) + f,  u^(ν)(0) = u_0,  (4)

with A = M_A − N_A, M_A = D_A and N_A = L_A + U_A (where A = D_A − L_A − U_A, with D_A, L_A and U_A a diagonal, a lower-triangular and an upper-triangular matrix, respectively). Analogously, the Gauss-Seidel waveform variant for (3) is defined by M_A = D_A − L_A and N_A = U_A. Miekkala and Nevanlinna proved that (under suitable assumptions) the asymptotic convergence rate of these Jacobi and Gauss-Seidel waveform methods equals the asymptotic convergence rate of the corresponding static relaxation methods for Au = f.
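Scheme (4) can be illustrated with a short numerical sketch. The code below is an assumption-laden toy (the name `jacobi_wr_linear`, the fixed-step implicit Euler subsystem solver and all parameter values are illustrative choices, not the thesis's discretisation): it applies Jacobi waveform relaxation with the splitting A = D_A − (L_A + U_A) to a one-dimensional Laplacian matrix with f = 0, and checks that the waveform iterates converge to the solution of the undecoupled discrete system.

```python
import numpy as np

# Jacobi waveform relaxation, scheme (4), for  u' + A u = 0,  u(0) = u0.
# Each sweep solves the decoupled diagonal system  u' + D_A u = N_A u_old
# with implicit Euler on a fixed time grid (an illustrative choice).

def jacobi_wr_linear(A, u0, T=1.0, n=200, sweeps=30):
    Dvec = np.diag(A).astype(float)      # diagonal of A: M_A = D_A
    N = np.diag(Dvec) - A                # N_A = L_A + U_A
    h = T / n
    U = np.tile(u0.astype(float), (n + 1, 1))   # initial guess: constant waveform
    for _ in range(sweeps):
        V = np.empty_like(U)
        V[0] = u0
        for k in range(n):
            # (I + h D_A) v_{k+1} = v_k + h N_A u_old(t_{k+1})
            V[k + 1] = (V[k] + h * (N @ U[k + 1])) / (1.0 + h * Dvec)
        U = V
    return U

d = 8
A = 2.0 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)   # 1-D Laplacian stencil
u0 = np.ones(d)
U = jacobi_wr_linear(A, u0)

# reference: implicit Euler applied to the full, undecoupled system
n, T = 200, 1.0
h = T / n
ref = np.zeros((n + 1, d))
ref[0] = u0
M = np.eye(d) + h * A
for k in range(n):
    ref[k + 1] = np.linalg.solve(M, ref[k])
print(np.max(np.abs(U - ref)))
```

On a finite time window the iteration converges superlinearly, so 30 sweeps already reproduce the reference solution to well below discretisation accuracy; the fixed point of the sweep satisfies (I + hA) v_{k+1} = v_k, exactly the undecoupled implicit Euler step.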
Acceleration techniques

Waveform relaxation methods have already been applied to practical first-order initial-value problems from various application areas, among them the simulation of large-scale electrical circuits, see e.g. [17, 102, 104]. We can also apply the methods to parabolic initial-boundary-value problems, described by a partial differential equation of the form

  ∂u(x,t)/∂t + Lu(x,t) = f(x,t),  x ∈ Ω,  t > 0,  (5)

with L an elliptic differential operator on the spatial domain Ω. Indeed, semi-discretisation of the above equation (only the spatial variables are discretised) yields a system of first-order ordinary differential equations. Interesting partial differential equations that have already been solved with waveform relaxation include diffusion equations [10] and the Navier-Stokes equations [72]. In this thesis we consider only linear, second-order operators L with time-independent coefficients. The use of finite differences then results in a system of the form (3), while a finite-element method leads to a system

  B u̇ + Au = f,  u(0) = u_0,  t > 0,  (6)

with B nonsingular. For the applications mentioned above, a naive implementation of the Jacobi and Gauss-Seidel waveform methods of the previous paragraph is not advisable. One can try to improve the convergence by means of acceleration techniques originally developed for static iterative methods. We restrict ourselves below to a survey of the techniques that will be studied more closely in this thesis, and recall the results known so far for linear problems of the form (3).

Successive overrelaxation (SOR). The convergence results derived by Miekkala and Nevanlinna for the SOR waveform relaxation method, defined as a natural extension of the classical SOR procedure for systems of algebraic equations, are rather disappointing [63, 64]. Recently, however, Reichelt, White and Allen developed a convolution SOR or CSOR waveform relaxation algorithm, in which the multiplication by an overrelaxation parameter in the former algorithm is replaced by a convolution with a time-dependent overrelaxation function [82]. For certain semiconductor simulations they obtained promising numerical results with the latter method.

Linear or polynomial acceleration. In contrast to the situation in the static case, Nevanlinna showed that waveform relaxation methods cannot be accelerated in an essential way by linear or polynomial acceleration techniques, i.e., by taking linear combinations of the iterates [68].

Multigrid techniques. Multigrid techniques are very fast solution methods for elliptic partial differential equations, which can easily be combined with waveform relaxation methods in the context of parabolic problems, see e.g. Lubich and Ostermann [55] or Vandewalle [95]. The resulting multigrid waveform relaxation methods converge almost as fast as their static variants; their convergence behaviour is independent of the mesh size used in the discretisation of the partial differential equation.

1.3 Thesis overview

In this thesis we extend the existing convergence theory for linear systems of differential equations (3) to problems of the form (6) with B nonsingular. We study several "accelerated" waveform variants for such general linear differential problems, with emphasis on the convolution-based SOR and Chebyshev iterations. The multigrid waveform relaxation method is also examined more closely. These results have been published in [31, 34, 35, 36, 37, 38, 39]. The convergence behaviour of waveform relaxation methods for (6) with nonsingular B is determined by the spectral radius/norm of the operators obtained by rewriting the iterative waveform relaxation schemes in explicit form. In Chapter 2 we discuss the spectral properties of the resulting operators. Using these results, we investigate in the subsequent chapters the convergence behaviour of the waveform methods mentioned above for general linear systems of ordinary differential equations. The theoretical results obtained are each time applied to several semi-discretised diffusion problems and verified by means of extensive numerical experiments. In Chapter 3 we discuss the convergence of the so-called basic waveform relaxation methods, while in Chapter 4 the SOR iterations (both the standard and the convolution-based method) are analysed. Part of the work in the latter chapter was carried out in collaboration with Min Hu and Ken Jackson of the Department of Computer Science of the University of Toronto, Canada. Chapter 5 contains the study of the convolution Chebyshev waveform methods. The multigrid waveform variant for (6), obtained by finite-element discretisation of a linear parabolic initial-boundary-value problem (5), is the subject of Chapter 6. Conclusions and suggestions for further research, finally, are given in Chapter 7.
NEDERLANDSE SAMENVATTING
2 Some concepts from functional analysis

2.1 Banach spaces

We briefly recall some frequently used definitions and properties from functional analysis, see e.g. [43, 105].

Definition 2.1 A Banach space X is a normed, complete vector space, i.e., a space in which every Cauchy sequence converges to a vector in X.

Below we give some classical examples of such spaces. A first example is the space of continuous functions C[0,T], equipped with the maximum norm

    ‖x‖_{C[0,T]} = max_{t ∈ [0,T]} ‖x(t)‖,

where ‖·‖ denotes an arbitrary C^d vector norm. We also mention the spaces of p-integrable (Lebesgue-measurable) functions L_p(0,T), for which

    ‖x‖_{L_p(0,T)} = ( ∫_0^T ‖x(t)‖^p dt )^{1/p},   1 ≤ p < ∞,
    ‖x‖_{L_p(0,T)} = inf { M : ‖x(t)‖ ≤ M (almost everywhere) in (0,T) },   p = ∞.

Finally, we consider the space of p-summable sequences l_p(N), whose norm is given by

    ‖x‖_{l_p(N)} = ( Σ_{n=0}^{N−1} ‖x[n]‖^p )^{1/p},   1 ≤ p < ∞,
    ‖x‖_{l_p(N)} = sup_{0 ≤ n} ‖x[n]‖,   p = ∞.

The asymptotic convergence behaviour of iterative methods in such (infinite-dimensional) Banach spaces is determined by the spectral radius of the corresponding operator, as follows from the next theorem [43, p. 382].

Theorem 2.1 Let H be a bounded linear operator in a Banach space X. Suppose ρ(H) < 1. Then the iterative scheme

    x^{(ν)} = H x^{(ν−1)} + φ,   ν = 1, 2, …,   (7)

with x^{(0)} and φ chosen arbitrarily in X, converges to the unique solution of x − Hx = φ. Suppose ρ(H) > 1. Then (7) with x^{(0)} = 0 does not converge for every φ ∈ X.
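The fixed-point iteration of Theorem 2.1 can be sketched in the finite-dimensional case. The matrix and right-hand side below are made-up illustration values, not taken from the thesis.

```python
import numpy as np

# Iteration (7): x_new = H x + phi. With rho(H) < 1 the iterates
# converge to the unique solution of (I - H) x = phi.
H = np.array([[0.5, 0.2],
              [0.1, 0.4]])            # spectral radius 0.6 < 1
phi = np.array([1.0, 2.0])

x = np.zeros(2)                       # arbitrary starting vector
for _ in range(200):
    x = H @ x + phi                   # one sweep of scheme (7)

exact = np.linalg.solve(np.eye(2) - H, phi)
assert np.allclose(x, exact)
```

The observed asymptotic error reduction per sweep approaches ρ(H), which is what motivates the spectral-radius computations in the rest of this chapter.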
2.2 Spectral properties of convolution-like operators

The continuous case

In what follows we shall show that every two successive waveform relaxation iterates for (6) with B nonsingular satisfy a relation of the form

    u^{(ν)} = 𝓗 u^{(ν−1)} + φ = (H + 𝓗_c) u^{(ν−1)} + φ,

with H ∈ C^{d×d} and 𝓗_c a linear Volterra convolution operator with kernel h_c,

    𝓗_c x(t) = (h_c ∗ x)(t) := ∫_0^t h_c(t − s) x(s) ds.
Below, the spectral radius of such operators is studied, on both finite and infinite time intervals.

Finite time intervals

The following lemma is an immediate consequence of a stability result from perturbation theory. It can also be proved in a more elementary way.

Lemma 2.2 Consider 𝓗 = H + 𝓗_c as an operator in C[0,T], and assume h_c ∈ C[0,T]. Then 𝓗 is a bounded operator and ρ(𝓗) = ρ(H).

Infinite time intervals

The proof of the analogue of Lemma 2.2 on infinite intervals is based on a theorem of Paley and Wiener, see e.g. [20, p. 45] or [73, p. 60], which gives a necessary and sufficient condition for the solution of the Volterra equation x + k ∗ x = f to be bounded.

Lemma 2.3 Consider 𝓗 = H + 𝓗_c as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, and assume h_c ∈ L_1(0,∞). Then 𝓗 is a bounded operator and

    ρ(𝓗) = sup_{Re(z) ≥ 0} ρ(H(z))   (8)
         = sup_{ξ ∈ ℝ} ρ(H(iξ)),   (9)

with H(z) = H + ĥ_c(z), and ĥ_c(z) = 𝓛(h_c(t)) := ∫_0^∞ h_c(t) e^{−zt} dt the Laplace transform of h_c.

Note that h_c ∈ L_1(0,∞) implies lim_{z→∞} H(z) = H. Consequently, the spectral radius of 𝓗 on finite intervals is never larger than the spectral radius of 𝓗 on infinite intervals.
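Formula (9) can be evaluated numerically for a concrete kernel. The scalar example below (H = 0.3, h_c(t) = e^{−t}, so ĥ_c(z) = 1/(z + 1)) is an assumed illustration, not a problem from the thesis; for a scalar the spectral radius is just the modulus.

```python
import numpy as np

H_inf = 0.3                                  # the "matrix" part H (scalar here)

def rho_symbol(xi):
    """Spectral radius of H(i*xi) = H + Laplace transform of exp(-t)."""
    return abs(H_inf + 1.0 / (1j * xi + 1.0))

# scan the imaginary axis as in (9)
xis = np.linspace(-50.0, 50.0, 100001)
rho = max(rho_symbol(xi) for xi in xis)      # supremum attained at xi = 0
assert abs(rho - 1.3) < 1e-9                 # |0.3 + 1| = 1.3
```

Far out on the axis the symbol tends to H alone (here 0.3), illustrating the remark above that the finite-interval spectral radius ρ(H) is dominated by the infinite-interval one.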
The discrete case

An implementation of waveform relaxation methods requires the continuous iterative schemes to be discretised in time. We shall show that, if linear multistep methods are used for this purpose, the resulting waveform iterations can be written in explicit form as u_τ^{(ν)} = 𝓗_τ u_τ^{(ν−1)} + φ_τ, with u_τ^{(ν)} = {u_τ^{(ν)}[n]}_{n=0}^{N−1}, where u_τ^{(ν)}[n] is the approximation of u^{(ν)}(t) at t = nτ and τ the chosen time step. The operator 𝓗_τ is a linear discrete convolution operator with kernel h_τ,

    (𝓗_τ x_τ)[n] = (h_τ ∗ x_τ)[n] := Σ_{m=0}^{n} h_τ[n − m] x_τ[m].
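The discrete convolution above is exactly multiplication by a lower-triangular Toeplitz matrix, which is the observation behind Lemma 2.4 below. The kernel values in this sketch are made up.

```python
import numpy as np

h = np.array([0.5, 0.25, 0.125, 0.0625])      # kernel h_tau[0..3]
N = 4
T = np.zeros((N, N))
for n in range(N):
    for m in range(n + 1):
        T[n, m] = h[n - m]                    # Toeplitz: entry depends on n - m

x = np.array([1.0, 0.0, 2.0, 0.0])
direct = np.array([sum(h[n - m] * x[m] for m in range(n + 1)) for n in range(N)])
assert np.allclose(T @ x, direct)             # matrix form = convolution sum

# T is lower triangular with constant diagonal h[0], so all its
# eigenvalues equal h[0] -- the content of formula (10).
assert np.allclose(np.linalg.eigvals(T), 0.5)
```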
We recall the spectral properties of such operators, see e.g. [64], in a general context.

Finite time intervals

In the finite-dimensional space l_p(N) we can write 𝓗_τ x_τ as a matrix–vector product, where the matrix is an N × N lower-triangular block Toeplitz matrix with h_τ[0] as the constant block on the diagonal. We immediately obtain the following lemma.

Lemma 2.4 Consider 𝓗_τ as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite. Then 𝓗_τ is a bounded operator and

    ρ(𝓗_τ) = ρ(h_τ[0]) = ρ(H_τ(∞)),   (10)

with H_τ(z) = 𝓩(h_τ) := Σ_{n=0}^{N−1} h_τ[n] z^{−n} the discrete Laplace or Z-transform of h_τ.
Infinite time intervals

The proof of the following lemma is based on a discrete version of the Paley–Wiener theorem [53].

Lemma 2.5 Consider 𝓗_τ as an operator in l_p(∞) with 1 ≤ p ≤ ∞, and assume h_τ ∈ l_1(∞). Then 𝓗_τ is a bounded operator and

    ρ(𝓗_τ) = max_{|z| ≥ 1} ρ(H_τ(z))   (11)
           = max_{|z| = 1} ρ(H_τ(z)),   (12)

with H_τ(z) = 𝓩(h_τ) := Σ_{n=0}^{∞} h_τ[n] z^{−n} the discrete Laplace or Z-transform of h_τ.

Just as in the continuous case, it follows from (10) and (11) that the spectral radius of 𝓗_τ is smaller on finite than on infinite intervals.
3 Basic waveform relaxation methods

3.1 Description of the method

The continuous case

Consider a general linear initial value problem

    B u̇ + A u = f,   u(0) = u_0,   t > 0,   (13)

with B, A ∈ C^{d×d} and B nonsingular. The basic waveform relaxation method for (13) can be defined in terms of the matrix splittings B = M_B − N_B and A = M_A − N_A, resulting in the iterative scheme

    M_B u̇^{(ν)} + M_A u^{(ν)} = N_B u̇^{(ν−1)} + N_A u^{(ν−1)} + f,   u^{(ν)}(0) = u_0.   (14)
The convergence behaviour and the computational complexity of the above iteration clearly depend on the nature of the splitting matrices. A natural choice splits B and A in the same manner, analogous to the splittings (M_A, N_A) used for solving the linear system Au = f with classical relaxation methods,

    M_A u^{(ν)} = N_A u^{(ν−1)} + f   or   u^{(ν)} = M_A^{−1} N_A u^{(ν−1)} + M_A^{−1} f,   (15)

see [4, 24, 98, 106]. We refer to Table 3.1 (p. 24) for some examples. Finally, we note that the schemes (4) for linear systems of the form (3) can also be analysed by means of the more general iteration (14); to this end it suffices to set M_B = I and N_B = 0.
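The static relaxation (15) can be sketched for the Gauss–Seidel splitting M_A = D_A − L_A (diagonal plus strict lower part), N_A = U_A. The small matrix below is a made-up example.

```python
import numpy as np

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
f = np.array([1.0, 2.0, 3.0])

M = np.tril(A)                 # M_A: diagonal + strictly lower part
N = M - A                      # N_A = M_A - A (minus the strict upper part)

u = np.zeros(3)
for _ in range(100):
    # one sweep of (15): u_new = M_A^{-1} (N_A u + f)
    u = np.linalg.solve(M, N @ u + f)

assert np.allclose(A @ u, f)   # iterates converge to the solution of A u = f
```

The waveform iteration (14) applies the same splitting idea, but each sweep solves a small system of differential equations in time rather than an algebraic system.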
The discrete case

We first recall the definition of a linear multistep method for solving u̇ = f(t, u), u(0) = u_0, see e.g. [48, p. 11]:

    (1/τ) Σ_{l=0}^{k} α_l u[n+l] = Σ_{l=0}^{k} β_l f[n+l],   n ≥ 0,   (16)

with α_l and β_l real constants and τ a constant step size. We assume that k starting values u[0], u[1], …, u[k−1] are always given. The characteristic polynomials are defined as a(z) := Σ_{l=0}^{k} α_l z^l and b(z) := Σ_{l=0}^{k} β_l z^l, while S denotes the stability region of the method. If the wedge Λ_θ := {z : |Arg(−z)| < θ, z ≠ 0} with 0 < θ ≤ π/2 belongs to S, the method is called A(θ)-stable. If the entire left complex half-plane belongs to the stability region, we are dealing with an A-stable method. We mainly use the backward differentiation formulas and the trapezoidal rule (or Crank–Nicolson method), which are discussed extensively in Example 3.3.1 (p. 25).

Applying (16) to the continuous scheme (14), we obtain

    (1/τ) Σ_{l=0}^{k} α_l M_B u^{(ν)}[n+l] + Σ_{l=0}^{k} β_l M_A u^{(ν)}[n+l] =
        (1/τ) Σ_{l=0}^{k} α_l N_B u^{(ν−1)}[n+l] + Σ_{l=0}^{k} β_l N_A u^{(ν−1)}[n+l] + Σ_{l=0}^{k} β_l f[n+l].   (17)

Here we do not iterate on the k given starting values, i.e., u^{(ν)}[n] = u^{(ν−1)}[n] = u[n] for n < k. Since we shall further concentrate on implicit methods (with β_k ≠ 0), the above equation can be solved uniquely for every n if and only if the discrete solvability condition is satisfied, i.e., if

    α_k / (τ β_k) ∉ σ(−M_B^{−1} M_A).   (18)
3.2 Convergence analysis

The continuous case

The waveform relaxation operator and its symbol

We can apply the general solution formula for linear systems of differential equations (3) to (14), after multiplying both sides by M_B^{−1} [15, p. 119]. This yields a scheme of the form

    u^{(ν)} = 𝓚 u^{(ν−1)} + φ,

in which the continuous waveform relaxation operator 𝓚 is given by 𝓚 = M_B^{−1} N_B + 𝓚_c, with 𝓚_c a linear Volterra convolution operator whose kernel equals

    k_c(t) = e^{−M_B^{−1} M_A t} M_B^{−1} (N_A − M_A M_B^{−1} N_B).

Let e^{(ν)} := u^{(ν)} − u denote the error of the ν-th iterate. It satisfies a relation of the form e^{(ν)} = 𝓚 e^{(ν−1)}; that is, e^{(ν)} is the solution of the differential equation

    M_B ė^{(ν)} + M_A e^{(ν)} = N_B ė^{(ν−1)} + N_A e^{(ν−1)},   e^{(ν)}(0) = 0.

If we denote the Laplace transform of e^{(ν)} by ẽ^{(ν)}(z), Laplace transformation gives ẽ^{(ν)}(z) = K(z) ẽ^{(ν−1)}(z), with

    K(z) = (z M_B + M_A)^{−1} (z N_B + N_A)   (19)

the symbol of the operator 𝓚. Often (e.g., in the Jacobi or Gauss–Seidel case) this symbol corresponds to the iteration matrix of the corresponding static relaxation method for the linear system

    (z B + A) ũ(z) = f̃(z) + B u_0,   (20)

obtained by Laplace transformation of (13). In particular, K(0) always equals the iteration matrix of the static method (15) for Au = f.
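The symbol (19) and the observation that K(0) is the static iteration matrix can be checked directly. B and A below are small made-up matrices, split Jacobi-fashion.

```python
import numpy as np

B = np.array([[2.0, 0.5],
              [0.5, 2.0]])
A = np.array([[ 3.0, -1.0],
              [-1.0,  3.0]])

# Jacobi splittings: M = diagonal part, N = M - matrix
MB, NB = np.diag(np.diag(B)), np.diag(np.diag(B)) - B
MA, NA = np.diag(np.diag(A)), np.diag(np.diag(A)) - A

def K(z):
    """Symbol (19): (z M_B + M_A)^{-1} (z N_B + N_A)."""
    return np.linalg.solve(z * MB + MA, z * NB + NA)

# K(0) equals the static Jacobi iteration matrix M_A^{-1} N_A for A u = f.
assert np.allclose(K(0.0), np.linalg.solve(MA, NA))
```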
Convergence on finite time intervals

Since k_c ∈ C[0,T], the following theorem follows immediately from Lemma 2.2.

Theorem 3.1 Consider 𝓚 as an operator in C[0,T]. Then 𝓚 is a bounded operator and

    ρ(𝓚) = ρ(M_B^{−1} N_B).   (21)

Convergence on infinite time intervals

Note that k_c ∈ L_1(0,∞) if all eigenvalues of M_B^{−1} M_A have a positive real part. Application of Lemma 2.3 then yields the following theorem.
Theorem 3.2 Consider 𝓚 as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, and assume that all eigenvalues of M_B^{−1} M_A have a positive real part. Then 𝓚 is a bounded operator and

    ρ(𝓚) = sup_{Re(z) ≥ 0} ρ(K(z)) = sup_{ξ ∈ ℝ} ρ(K(iξ)).   (22)

Observe that ρ(𝓚) ≥ ρ(K(0)) = ρ(M_A^{−1} N_A). Hence waveform relaxation methods can never converge faster than their static counterparts; the spectral radii of the static and the dynamic iterations are equal if the supremum in (22) is attained at z = iξ = 0. In the thesis some further specific results are derived for the Jacobi and Gauss–Seidel waveform relaxation variants. The case in which there is convergence on the finite interval and divergence on the infinite interval is also commented upon.
The discrete case

The waveform relaxation operator and its symbol

Iteration (17) can be rewritten as u_τ^{(ν)} = 𝓚_τ u_τ^{(ν−1)} + φ_τ. To determine the nature of the discrete waveform relaxation operator 𝓚_τ, we rewrite (17) in terms of e^{(ν)}[n] := u^{(ν)}[n] − u[n], with u[n] the exact solution of (13) after discretisation with the linear multistep method. Since we do not iterate on the k starting values, we may assume without loss of generality that e^{(ν)}[n] = e^{(ν−1)}[n] = 0 for n < k. Combining the first N equations (the equations for the unknowns at time levels k, …, N + k − 1), we obtain

    E^{(ν)} = G_τ E^{(ν−1)},   (23)

with E^{(ν)} = [e^{(ν)}[k], e^{(ν)}[k+1], …, e^{(ν)}[N+k−1]]^t and G_τ a lower-triangular block Toeplitz matrix. Consequently, 𝓚_τ is a discrete convolution operator in the space of sequences of length N. The j-th component of the kernel k_τ equals the (constant) matrix on the j-th lower diagonal of G_τ.

Further on we need the Z-transform of the kernel k_τ. If ẽ_τ^{(ν)}(z) = 𝓩(e_τ^{(ν)}), we obtain ẽ_τ^{(ν)}(z) = K_τ(z) ẽ_τ^{(ν−1)}(z), with

    K_τ(z) = (a(z) M_B + τ b(z) M_A)^{−1} (a(z) N_B + τ b(z) N_A)

the symbol of the operator 𝓚_τ. Comparing this formula with (19), we obtain the following relation:

    K_τ(z) = K( (1/τ) (a/b)(z) ).   (24)
Convergence on finite time intervals

Application of Lemma 2.4 to the operator 𝓚_τ immediately yields the following theorem.
Theorem 3.3 Consider 𝓚_τ as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite, and assume that the discrete solvability condition (18) is satisfied. Then 𝓚_τ is a bounded operator and

    ρ(𝓚_τ) = ρ( K( (1/τ) (α_k/β_k) ) ).   (25)

Convergence on infinite time intervals

One can prove that k_τ ∈ l_1(∞) if σ(−τ M_B^{−1} M_A) ⊂ int S. If we then apply Lemma 2.5 to the operator 𝓚_τ, we find that

    ρ(𝓚_τ) = max_{|z| ≥ 1} ρ(K_τ(z)) = max_{|z| ≥ 1} ρ( K( (1/τ) (a/b)(z) ) ).

We can rewrite this expression by using the definition of the stability region, from which one derives that

    C \ int S = { (a/b)(z) : |z| ≥ 1 }.

More precisely, we obtain the following theorem.

Theorem 3.4 Consider 𝓚_τ as an operator in l_p(∞) with 1 ≤ p ≤ ∞, and assume σ(−τ M_B^{−1} M_A) ⊂ int S. Then 𝓚_τ is a bounded operator and

    ρ(𝓚_τ) = sup { ρ(K(z)) : τz ∈ C \ int S } = sup_{τz ∈ ∂S} ρ(K(z)).   (26)

Discrete versus continuous results

For finite as well as infinite time intervals it is easy to show that the discrete spectral radius formulas converge to their continuous counterparts as the time step decreases, i.e.,

    lim_{τ→0} ρ(𝓚_τ) = ρ(𝓚).
Furthermore, we prove the following theorem for A(θ)-stable linear multistep methods.

Theorem 3.5 Consider 𝓚_τ as an operator in l_p(∞) and 𝓚 as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞. Assume that the linear multistep method is A(θ)-stable and that σ(−M_B^{−1} M_A) ⊂ Λ_θ. Then

    ρ(𝓚_τ) ≤ sup_{z ∈ Λ_θ^c} ρ(K(z)) = sup_{z ∈ ∂Λ_θ^c} ρ(K(z)),   (27)

with Λ_θ^c := C \ Λ_θ = { z : |Arg(z)| ≤ π − θ }.

Corollary 3.6 Consider 𝓚_τ as an operator in l_p(∞) and 𝓚 as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞. Assume that the linear multistep method is A-stable and that all eigenvalues of M_B^{−1} M_A have a positive real part. Then ρ(𝓚_τ) ≤ ρ(𝓚).
3.3 Model problem analysis

Description of the model problem

In this summary we shall illustrate and verify the theoretical convergence results by means of the one-dimensional heat equation

    ∂u/∂t − ∂²u/∂x² = 0,   x ∈ (0,1),   t > 0,   (28)

discretised in space on a mesh Ω_h = { x_i = ih : 0 ≤ i ≤ 1/h } using linear finite elements. After incorporating the initial condition u(x,0) = u_0(x) and the Dirichlet boundary conditions, this yields an initial value problem (13) with

    B = (h/6) [1 4 1]   and   A = (1/h) [−1 2 −1]

the stencils of the mass and the stiffness matrix, respectively. In the thesis we also analyse the two- and three-dimensional variants of the above problem, and we consider several spatial discretisations with both finite-difference and finite-element methods.

Theoretical results

The continuous case First we compute the spectral radii of the continuous Jacobi and Gauss–Seidel waveform relaxation methods for finite time intervals.

Theorem 3.7 Consider the one-dimensional heat equation (28), discretised in space using linear finite elements. If we consider 𝓚_JAC and 𝓚_GS as operators in C[0,T], then

    ρ(𝓚_JAC) = (1/2) cos(πh)   and   ρ(𝓚_GS) = (1/4) cos²(πh).   (29)

On infinite time intervals, by contrast, the waveform relaxation methods turn out to be exactly as fast as their static counterparts.

Theorem 3.8 Consider the one-dimensional heat equation (28), discretised in space using linear finite elements. If we consider 𝓚_JAC and 𝓚_GS as operators in L_p(0,∞) with 1 ≤ p ≤ ∞, then

    ρ(𝓚_JAC) = ρ(K_JAC(0)) = cos(πh) ≈ 1 − π²h²/2   (30)

and

    ρ(𝓚_GS) = ρ(K_GS(0)) = cos²(πh) ≈ 1 − π²h².   (31)
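The identity ρ(K_JAC(0)) = cos(πh) in (30) can be verified numerically: at z = 0 the Jacobi symbol reduces to the static Jacobi iteration matrix of the stiffness matrix A with stencil (1/h)[−1 2 −1].

```python
import numpy as np

h = 1.0 / 16
d = 15                                    # number of interior grid points
A = (2.0 * np.eye(d)
     - np.diag(np.ones(d - 1), 1)
     - np.diag(np.ones(d - 1), -1)) / h   # stiffness matrix

D = np.diag(np.diag(A))
KJ0 = np.linalg.solve(D, D - A)           # Jacobi iteration matrix D^{-1}(D - A)
rho = max(abs(np.linalg.eigvals(KJ0)))

# eigenvalues of KJ0 are cos(k*pi*h), k = 1..d, so the radius is cos(pi*h)
assert abs(rho - np.cos(np.pi * h)) < 1e-10
```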
The discrete case To predict the convergence behaviour of an actual implementation, we analyse the discrete Gauss–Seidel waveform relaxation operators for several time-discretisation formulas with τ = 1/100. We concentrate in particular on the above model problem with h = 1/16, for which the spectral radii of the resulting operators on finite and infinite intervals can be computed using formulas (25) and (26), see Table 3.3 (p. 42). We can also interpret these results graphically by means of a so-called spectral picture, see Figure 3.5 (p. 42), in which contour lines of the function ρ(K_GS(z)) are plotted in a region of the complex plane around the origin. The finite-interval results can then be approximated by evaluating this function at the point (1/τ)(α_k/β_k), while the infinite-interval spectral radii can be estimated by taking the maximum of ρ(K_GS(z)) over the boundary of the scaled stability region. We observe convergence for the Crank–Nicolson method and the low-order backward differentiation methods. We also note that the spectral radius takes larger values with increasing order of the backward differentiation formulas. This could be expected from Theorem 3.5 and the knowledge that these methods are A(θ)-stable, with a θ that decreases as the order of the method increases.
Numerical results

Tables 3.4 and 3.6 (p. 44) contain the results of some experiments for our model problem, restricted to the Gauss–Seidel case. We note that the observed convergence factors agree closely with the theoretically derived spectral radii on infinite intervals, despite the fact that the time interval in the numerical experiments is of course finite. This behaviour is due to the non-normality of the waveform relaxation operators on finite intervals. A theoretical explanation in terms of the pseudospectra of the operators can be found in [60]; for an intuitive explanation of this phenomenon we refer to the thesis.
4 Golfvormrelaxatiemethoden en successieve overrelaxatie 4.1 Beschrijving van de methode Het continue geval
De meest logische manier om een SOR-golfvormrelaxatiemethode te de ni%eren voor (13) is gebaseerd op de natuurlijke uitbreiding van de statische SOR-procedure voor stelsels algebra%&sche vergelijkingen, die o.m. wordt beschreven in 4, 24, 98, 106]. Eerst wordt een functie u^(i ) berekend met een Gauss-Seidel-golfvormrelaxatieschema i 1 d X X () ( ) ( ) ( ) _ biiu^i + aiiu^i = ; bij u_ j + aij uj ; bij u_ (j ;
j =1
j =i+1
;
1) + a u( 1) ij j ;
+ fi (32)
with û_i^{(ν)}(0) = (u_0)_i. Then the old approximation u_i^{(ν−1)} is updated by multiplying the correction û_i^{(ν)} − u_i^{(ν−1)} with a scalar overrelaxation parameter ω,

    u_i^{(ν)} = u_i^{(ν−1)} + ω ( û_i^{(ν)} − u_i^{(ν−1)} ).   (33)

The first step of the convolution SOR or CSOR waveform relaxation method is identical to (32). Instead of multiplying the above correction with a scalar ω, we convolve the correction with a time-dependent function Ω [82],

    u_i^{(ν)} = u_i^{(ν−1)} + Ω ∗ ( û_i^{(ν)} − u_i^{(ν−1)} ).   (34)

We assume that this convolution kernel is of the form

    Ω(t) = ω δ(t) + ω_c(t),   (35)

with ω a scalar parameter, δ(t) the delta function and ω_c ∈ L_1. In this case we can treat the standard SOR waveform relaxation method introduced first above as a special case of the CSOR method by setting ω_c(t) ≡ 0. We shall also call this method the double-splitting SOR or DSSOR method, since elimination of û_i^{(ν)} from (32) and (33) yields an iterative scheme of the form (14) with (M_B, N_B) and (M_A, N_A) the SOR matrix splittings of B and A. A third variant, the single-splitting SOR or SSSOR method, is also treated in the thesis.
The discrete case

The first step of the discrete CSOR waveform relaxation algorithm is obtained by discretising (32) in time with a linear multistep method. This gives

    Σ_{l=0}^{k} ( (1/τ) α_l b_ii + β_l a_ii ) û_i^{(ν)}[n+l] = − Σ_{j=1}^{i−1} Σ_{l=0}^{k} ( (1/τ) α_l b_ij + β_l a_ij ) u_j^{(ν)}[n+l]
        − Σ_{j=i+1}^{d} Σ_{l=0}^{k} ( (1/τ) α_l b_ij + β_l a_ij ) u_j^{(ν−1)}[n+l] + Σ_{l=0}^{k} β_l f_i[n+l].   (36)

The second step approximates the convolution integral in (34) by a convolution sum with a discrete kernel Ω_τ,

    (u_i^{(ν)})_τ = (u_i^{(ν−1)})_τ + Ω_τ ∗ ( (û_i^{(ν)})_τ − (u_i^{(ν−1)})_τ ).   (37)

We assume that Ω_τ = ω δ_τ + τ (ω_c)_τ, with δ_τ = {1, 0, 0, …} the discrete delta function [75, p. 409] and (ω_c)_τ an l_1 sequence, so that we recover the discrete standard SOR waveform method for (ω_c)_τ[n] ≡ 0. Finally, we mention that the discrete solvability condition (18) in the case of (36) can be written as

    α_k / (τ β_k) ∉ σ(−D_B^{−1} D_A).   (38)
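The correction step (37) can be sketched componentwise. All numbers below are made-up illustration values; the τ-weighting of the convolution sum reflects the assumption, made above, that the sum is a quadrature of the continuous convolution integral.

```python
import numpy as np

omega = 1.5                               # scalar part of the kernel
omega_c = np.array([0.2, 0.1, 0.05])      # hypothetical l1 tail (omega_c)_tau
tau = 0.1                                 # time step

def csor_update(u_old, u_hat):
    """u_new[n] = u_old[n] + (Omega_tau * (u_hat - u_old))[n], as in (37)."""
    corr = u_hat - u_old
    u_new = np.empty_like(u_old)
    for n in range(len(u_old)):
        conv = omega * corr[n]            # delta part: omega * delta_tau
        for l in range(n + 1):            # tau-weighted convolution sum
            if n - l < len(omega_c):
                conv += tau * omega_c[n - l] * corr[l]
        u_new[n] = u_old[n] + conv
    return u_new

u_new = csor_update(np.zeros(3), np.ones(3))
```

With omega_c identically zero this reduces to the standard (DSSOR) update u_old + omega*(u_hat − u_old).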
4.2 Convergence analysis

The continuous case

The convolution SOR waveform relaxation operator and its symbol

The CSOR waveform relaxation algorithm implicitly defines a classical iteration scheme u^{(ν)} = 𝓚^CSOR u^{(ν−1)} + φ^CSOR. The form of the continuous convolution SOR waveform relaxation operator 𝓚^CSOR is described in the following lemma.

Lemma 4.1 The continuous convolution SOR waveform relaxation operator is of the form

    𝓚^CSOR = K_∞^CSOR + 𝓚_c^CSOR,   (39)

with K_∞^CSOR ∈ C^{d×d} and 𝓚_c^CSOR a linear Volterra convolution operator, whose kernel k_c^CSOR ∈ L_1(0,∞) if all eigenvalues of D_B^{−1} D_A have a positive real part and Ω is of the form (35) with ω_c ∈ L_1(0,∞).

The symbol K^CSOR(z) of the continuous CSOR waveform relaxation operator is obtained by Laplace transformation of (32)–(34). This yields the equation ũ^{(ν)}(z) = K^CSOR(z) ũ^{(ν−1)}(z) + φ̃(z), with

    K^CSOR(z) = [ z ( (1/Ω̃(z)) D_B − L_B ) + ( (1/Ω̃(z)) D_A − L_A ) ]^{−1}
                [ z ( ((1 − Ω̃(z))/Ω̃(z)) D_B + U_B ) + ( ((1 − Ω̃(z))/Ω̃(z)) D_A + U_A ) ]

and Ω̃(z) = ω + ω̃_c(z).

Convergence on finite time intervals
Theorem 4.2 Consider 𝓚^CSOR as an operator in C[0,T]. Then 𝓚^CSOR is a bounded operator and

    ρ(𝓚^CSOR) = ρ(K^CSOR(∞)) = ρ(K_∞^CSOR).   (40)

If Ω is of the form (35) with ω_c ∈ L_1(0,T), then

    K_∞^CSOR = ( (1/ω) D_B − L_B )^{−1} ( ((1 − ω)/ω) D_B + U_B ).   (41)
Convergence on infinite time intervals

Theorem 4.3 Consider 𝓚^CSOR as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞. Assume that all eigenvalues of D_B^{−1} D_A have a positive real part, and let Ω be of the form (35) with ω_c ∈ L_1(0,∞). Then 𝓚^CSOR is a bounded operator and

    ρ(𝓚^CSOR) = sup_{Re(z) ≥ 0} ρ(K^CSOR(z)) = sup_{ξ ∈ ℝ} ρ(K^CSOR(iξ)).   (42)
Note that Ω̃(z) must be the Laplace transform of a function of the form (35) with ω_c ∈ L_1(0,∞). A sufficient condition for this is that Ω̃(z) is a bounded, analytic function in an open region containing the closed right complex half-plane [41, Prop. 2.3].

The Laplace transform of the optimal convolution kernel

In classical SOR theory the determination of an optimal overrelaxation parameter relies on a so-called Young relation, i.e., a relation between the eigenvalues of the Jacobi and SOR iteration matrices. If we assume that the matrix zB + A is block consistently ordered [106, p. 445, Def. 3.2], such a relation also exists between the eigenvalues of the Jacobi and CSOR symbols. More precisely, we find that λ(z), defined by

    λ(z) + Ω̃(z) − 1 = √(λ(z)) Ω̃(z) μ(z),   (43)

is an eigenvalue of K^CSOR(z) whenever μ(z) ∈ σ(K^JAC(z)).

The following lemma determines the optimal (complex) value Ω̃(z) that minimises the spectral radius of K^CSOR(z). The result follows immediately from complex SOR theory, see e.g. [46, Thm. 4.1] or [70, Eq. (9.19)]. It was rediscovered in [82, Thm. 5.2], where it was presented in a waveform relaxation context for systems (13) with B = I.

Lemma 4.4 Assume that zB + A is a block consistently ordered matrix with nonsingular diagonal blocks. Assume that the spectrum σ(K^JAC(z)) lies on a line segment [−μ_1(z), μ_1(z)] with μ_1(z) ∈ C \ {(−∞, −1] ∪ [1, ∞)}. The spectral radius of K^CSOR(z) is then minimal for

    Ω̃_opt(z) = 2 / ( 1 + √(1 − μ_1²(z)) ),   (44)

with √· the root with positive real part. In particular,

    ρ( K^{CSOR_opt}(z) ) = | Ω̃_opt(z) − 1 | < 1.   (45)

In the thesis the spectral radii of the various (optimal) SOR waveform methods, applied to systems (13) with B = I, are studied next. For such problems an explicit formula for the optimal convolution kernel is also derived.
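Formula (44) and property (45) can be checked directly; the value of μ_1 below is a made-up example of a collinear Jacobi spectrum.

```python
import numpy as np

def omega_opt(mu1):
    """Optimal CSOR symbol value (44), taking the square root branch
    with positive real part."""
    root = np.sqrt(1.0 - mu1 ** 2 + 0j)
    if root.real < 0:
        root = -root
    return 2.0 / (1.0 + root)

mu1 = 0.9                      # spectrum of K_JAC(z) assumed in [-0.9, 0.9]
om = omega_opt(mu1)

# Property (45): the minimised spectral radius is |Omega_opt - 1| < 1.
assert abs(om - 1.0) < 1.0
```

For real μ_1 this reproduces the classical static result: |Ω̃_opt − 1| = (1 − √(1 − μ_1²)) / (1 + √(1 − μ_1²)), the optimal SOR convergence factor.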
The discrete case

The convolution SOR waveform relaxation operator and its symbol

The discrete CSOR scheme can be written in explicit form as u_τ^{(ν)} = 𝓚_τ^CSOR u_τ^{(ν−1)} + φ_τ. The precise form of the discrete convolution SOR waveform relaxation operator is determined below.

Lemma 4.5 The discrete convolution SOR waveform relaxation operator 𝓚_τ^CSOR is a discrete convolution operator, whose kernel k_τ^CSOR ∈ l_1(∞) if σ(−τ D_B^{−1} D_A) ⊂ int S and Ω_τ ∈ l_1(∞).

Z-transformation of the discrete CSOR waveform scheme yields ũ_τ^{(ν)}(z) = K_τ^CSOR(z) ũ_τ^{(ν−1)}(z) + φ̃_τ(z), with K_τ^CSOR(z) the symbol of 𝓚_τ^CSOR. This symbol is given by

    K_τ^CSOR(z) = [ (1/τ) a(z) ( (1/Ω̃_τ(z)) D_B − L_B ) + b(z) ( (1/Ω̃_τ(z)) D_A − L_A ) ]^{−1}
                  [ (1/τ) a(z) ( ((1 − Ω̃_τ(z))/Ω̃_τ(z)) D_B + U_B ) + b(z) ( ((1 − Ω̃_τ(z))/Ω̃_τ(z)) D_A + U_A ) ],

with Ω̃_τ(z) = ω + τ (ω̃_c)_τ(z).

Convergence on finite time intervals
Theorem 4.6 Consider 𝓚_τ^CSOR as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite, and assume that the discrete solvability condition (38) is satisfied. Then 𝓚_τ^CSOR is a bounded operator and

    ρ(𝓚_τ^CSOR) = ρ( K_τ^CSOR(∞) ).   (46)

Convergence on infinite time intervals

Theorem 4.7 Consider 𝓚_τ^CSOR as an operator in l_p(∞) with 1 ≤ p ≤ ∞. Assume σ(−τ D_B^{−1} D_A) ⊂ int S and Ω_τ ∈ l_1(∞). Then 𝓚_τ^CSOR is a bounded operator and

    ρ(𝓚_τ^CSOR) = max_{|z| ≥ 1} ρ(K_τ^CSOR(z)) = max_{|z| = 1} ρ(K_τ^CSOR(z)).   (47)

The requirement that Ω̃_τ(z) be the Z-transform of an l_1 kernel Ω_τ follows immediately from the assumption that Ω̃_τ(z) is a bounded, analytic function in an open domain containing the region { z ∈ C : |z| ≥ 1 }. Stronger conditions can be found in [27, p. 71].

The Z-transform of the optimal convolution kernel

The following lemma is the discrete counterpart of Lemma 4.4.
Lemma 4.8 Assume that (1/τ)(a(z)/b(z)) B + A is a block consistently ordered matrix with nonsingular diagonal blocks. Assume that the spectrum σ(K_τ^JAC(z)) lies on a line segment [−μ_τ^{(1)}(z), μ_τ^{(1)}(z)] with μ_τ^{(1)}(z) ∈ C \ {(−∞, −1] ∪ [1, ∞)}. The spectral radius of K_τ^CSOR(z) is then minimal for

    (Ω̃_opt)_τ(z) = 2 / ( 1 + √( 1 − (μ_τ^{(1)}(z))² ) ),   (48)

with √· the root with positive real part. In particular,

    ρ( K_τ^{CSOR_opt}(z) ) = | (Ω̃_opt)_τ(z) − 1 | < 1.   (49)
Discrete versus continuous results

Spectral radii If we assume that

    Ω̃_τ(z) = Ω̃( (1/τ) (a/b)(z) ),   (50)

the discrete and continuous CSOR symbols satisfy a relation like (24), i.e.,

    K_τ^CSOR(z) = K^CSOR( (1/τ) (a/b)(z) ).

Consequently, we can express the spectral radius of the discrete operator in terms of the symbol of the continuous operator, as in Theorems 3.3 and 3.4.

Theorem 4.9 Consider 𝓚_τ^CSOR as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite. Assume that the discrete solvability condition (38) is satisfied and that (50) holds for z = ∞. Then

    ρ(𝓚_τ^CSOR) = ρ( K^CSOR( (1/τ) (α_k/β_k) ) ).   (51)

Theorem 4.10 Consider 𝓚_τ^CSOR as an operator in l_p(∞) with 1 ≤ p ≤ ∞. Assume σ(−τ D_B^{−1} D_A) ⊂ int S, Ω_τ ∈ l_1(∞) and that (50) holds for |z| ≥ 1. Then

    ρ(𝓚_τ^CSOR) = sup { ρ(K^CSOR(z)) : τz ∈ C \ int S } = sup_{τz ∈ ∂S} ρ(K^CSOR(z)).   (52)

Under the necessary assumptions we can subsequently show that

    lim_{τ→0} ρ(𝓚_τ^CSOR) = ρ(𝓚^CSOR),

for finite as well as infinite time intervals. The CSOR variants of Theorem 3.5 and Corollary 3.6 can also be proved in a straightforward manner. Finally, we mention that (50), and hence the above results, do not always apply. It is easy to show, however, that (50) does hold for the standard SOR method with double splitting and for the optimal CSOR waveform method.
Optimal convolution kernels We can also derive a connection between the optimal continuous and discrete convolution kernels. To this end we rewrite (34) and (37) respectively as

    u_i^{(ν)}(t) = u_i^{(ν−1)}(t) + ω ( û_i^{(ν)}(t) − u_i^{(ν−1)}(t) ) + ∫_0^t ω_c(t − s) ( û_i^{(ν)}(s) − u_i^{(ν−1)}(s) ) ds

and

    u_i^{(ν)}[n] = u_i^{(ν−1)}[n] + ω ( û_i^{(ν)}[n] − u_i^{(ν−1)}[n] ) + τ Σ_{l=0}^{n} (ω_c)_τ[n − l] ( û_i^{(ν)}[l] − u_i^{(ν−1)}[l] ).

Comparing both expressions suggests that (ω_c)_τ[n] should be an approximation of ω_c(nτ) for small τ. In that case the discrete convolution sum approximates the continuous convolution integral as a simple numerical quadrature rule. In the thesis this intuitive observation is confirmed and cast into a limit relation for the optimal kernels for linear systems (13) with B = I.
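The quadrature interpretation above can be illustrated numerically. The kernel e^{−t} and the correction sin(t) are made-up smooth examples; sampling the kernel at the grid points and forming the τ-weighted sum reproduces the convolution integral up to O(τ).

```python
import numpy as np

omega_c = lambda t: np.exp(-t)           # continuous kernel (illustration)
corr = lambda t: np.sin(t)               # smooth correction function
t_end, tau = 1.0, 1e-3
n = int(t_end / tau)

# discrete sum: tau * sum_l omega_c((n-l)*tau) * corr(l*tau)
ls = np.arange(n + 1)
disc = tau * np.sum(omega_c((n - ls) * tau) * corr(ls * tau))

# exact value of int_0^1 exp(-(1-s)) sin(s) ds
exact = 0.5 * (np.sin(t_end) - np.cos(t_end) + np.exp(-t_end))
assert abs(disc - exact) < 5e-3          # rectangle-rule accuracy, O(tau)
```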
An extension of the theory to more general problems

So far, the applicability of Lemmas 4.4 and 4.8 is limited to problems for which the Jacobi symbols have collinear spectra. We now formulate similar results for more general problems. We restrict ourselves to the discrete case; the continuous case can be treated analogously.

The proof of Lemma 4.8 is based on classical SOR theory for the complex linear system ( (1/τ)(a(z)/b(z)) B + A ) u = f. If we assume that this matrix is block consistently ordered, the eigenvalues λ_τ(z) of K_τ^CSOR(z) are connected to the eigenvalues μ_τ(z) of K_τ^JAC(z) by a Young relation as in (43),

    λ_τ(z) + Ω̃_τ(z) − 1 = √(λ_τ(z)) Ω̃_τ(z) μ_τ(z).

Consequently, for a given Ω̃_τ(z) the spectral radius ρ(K_τ^CSOR(z)) equals

    max { |λ_τ(z)| : λ_τ(z) + Ω̃_τ(z) − 1 = √(λ_τ(z)) Ω̃_τ(z) μ_τ(z),  μ_τ(z) ∈ σ(K_τ^JAC(z)) }.

When the eigenvalues of K_τ^JAC(z) are not collinear, it is not easy to find an optimal Ω̃_τ(z). Recently, however, a complex SOR theory for such problems was developed by Hu, Jackson and Zhu [32]. These authors assume that the eigenvalues of K_τ^JAC(z) lie in a region R_τ(z) = R(p_τ(z), q_τ(z), φ_τ(z)). This region is the closed interior of an ellipse centred at the origin, given by

    E_τ(z) = E(p_τ(z), q_τ(z), φ_τ(z)) = { μ : μ = e^{iφ_τ(z)} ( p_τ(z) cos(θ) + i q_τ(z) sin(θ) ) }.

The semi-axes p_τ(z) and q_τ(z) satisfy p_τ(z) ≥ q_τ(z) ≥ 0, the angle φ_τ(z) lies between −π/2 and π/2, and θ varies from 0 to 2π. The spectral radius ρ(K_τ^CSOR(z)) is then clearly bounded by its virtual counterpart ρ_R(K_τ^CSOR(z)), defined in terms of Ω̃_τ(z) as

    max { |λ_τ(z)| : λ_τ(z) + Ω̃_τ(z) − 1 = √(λ_τ(z)) Ω̃_τ(z) μ_τ(z),  μ_τ(z) ∈ R_τ(z) }.

Moreover, the parameter (Ω̃_R)_τ(z) that minimises this upper bound for a given ellipse can be determined [32, Thm. 1].

Lemma 4.11 Assume that (1/τ)(a(z)/b(z)) B + A is a block consistently ordered matrix with nonsingular diagonal blocks. Assume that the spectrum σ(K_τ^JAC(z)) lies in the region R_τ(z), which does not contain the point 1. In terms of

    (Ω̃_R)_τ(z) = 2 / ( 1 + √( 1 − (p_τ²(z) − q_τ²(z)) e^{i 2 φ_τ(z)} ) ),   (53)

with √· the root with positive real part, we have

    ρ_R( K_τ^{CSOR_R}(z) ) ≤ ρ_R( K_τ^CSOR(z) ),   (54)

where the latter expression may be supplied with an arbitrary Ω̃_τ(z). In particular,

    ρ_R( K_τ^{CSOR_R}(z) ) = | (Ω̃_R)_τ(z) ( p_τ(z) + q_τ(z) ) / 2 |² < 1.   (55)
It is furthermore easy to see that

    ρ( K_τ^{CSOR_opt}(z) ) ≤ ρ( K_τ^{CSOR_R}(z) ) ≤ ρ_R( K_τ^{CSOR_R}(z) )   (56)

for any elliptical region containing σ(K_τ^JAC(z)). The following remarkable result from [32, §3] shows that there does exist an ellipse for which this upper bound is attained.

Lemma 4.12 There exists an optimal ellipse (enclosing the spectrum σ(K_τ^JAC(z))) for which (Ω̃_R)_τ(z) = (Ω̃_opt)_τ(z), and

    ρ( K_τ^{CSOR_opt}(z) ) = ρ( K_τ^{CSOR_R}(z) ) = ρ_R( K_τ^{CSOR_R}(z) ).   (57)

Moreover, such an optimal ellipse always contains an eigenvalue of K_τ^JAC(z).

If we want to apply Lemma 4.11 to compute the optimal convolution sequence (Ω_opt)_τ, we must determine the optimal ellipse for several values of z. When the eigenvalues of the Jacobi symbol lie on a straight line there is a simple solution to this problem, see Lemma 4.8. If the eigenvalues of K_τ^JAC(z) are not collinear, even finding a "good" ellipse (one that encloses the spectrum of the Jacobi symbol and for which the bound (56) is reasonably sharp) is a very difficult task.
4.3 Model problem analysis

Theoretical results

The continuous case In the following analysis we consider the DSSOR and CSOR waveform relaxation methods for the one-dimensional heat equation, discretised with linear finite elements. Note that the spectra of the resulting Jacobi symbols are collinear for this problem. For obvious reasons we restrict ourselves to the infinite time interval.

Theorem 4.13 Consider the one-dimensional heat equation (28), discretised in space with linear finite elements, and assume 0 < ω < 2. If we consider 𝓚^DSSOR as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, then

    ρ(𝓚^DSSOR) = 1 − ω + (1/2)(ωμ_1)² + ωμ_1 √( 1 − ω + (1/4)(ωμ_1)² ),   ω ≤ ω_d,

    ρ(𝓚^DSSOR) = (ω − 1) · ( 1 + (3/8)√ω · ωμ_1 √(1 + (1/8)ω²μ_1²) ) / ( 1 − (3/8)√ω · ωμ_1 √(1 + (1/8)ω²μ_1²) ),   ω > ω_d,   (58)

with μ_1 = cos(πh) and ω_d = ( 8 − 4√(4 − 2μ_1²) ) / (2μ_1²). Moreover, ω_opt > ω_d.

Table 4.4 (p. 75) shows some values of ω_opt; the corresponding spectral radii apparently satisfy a relation of the form ρ(𝓚^{DSSOR,ω_opt}) ≈ 1 − O(h²). The spectral radius of the CSOR waveform operator with optimal kernel is computed in the next theorem. In contrast to the spectral radius of the optimal DSSOR operator, it equals the spectral radius of the optimal static SOR method for the linear system Au = f with A the discretised Laplace operator.
NEDERLANDSE SAMENVATTING
xxx
Theorem 4.14 Consider the one-dimensional heat equation (28), spatially discretised with linear finite elements. If K_{CSOR,opt} is regarded as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have

  ρ(K_{CSOR,opt}) = ρ(K_{CSOR,opt}(0)) = cos²(πh) / (1 + sqrt(1 − cos²(πh)))² ≈ 1 − 2πh.  (59)
The discrete case. We analyse the use of different time-discretisation formulas for our model problem with h = 1/16, both for the DSSOR and for the CSOR iteration. The spectral pictures for both methods are given in Figures 4.8 and 4.9 (p. 77); the computed values of the spectral radii are reported in Table 4.5 (p. 78).
Numerical results

Practical approximation of a suitable convolution kernel. In the thesis we discuss the practical determination of a suitable convolution sequence (Θ_num)_τ for use in an implementation of a discrete CSOR algorithm. We do this first for systems (13) with B = I. We compare several ways of determining a suitable convolution sequence, among them methods based on the limit relation between the continuous and discrete optimal kernels, and a method that computes the inverse Z-transform of Θ̃_opt(z) by means of a fast Fourier transform. The latter method gives the most reliable results. It can also be extended to more general matrices B and will be used in all our numerical experiments. The method does, however, require the computation or approximation of Θ̃_opt(z) at several points (on the unit circle), which can be a very hard problem in itself. We therefore also mention an automatic procedure for determining the sequence (Θ_num)_τ, developed by Reichelt [82, §6]. There one uses, regardless of whether the eigenvalues of K_JAC(z) are collinear or not, the right-hand side of (48) as an approximation of Θ̃_opt(z) for a few specific values of z. The resulting numbers are fitted by a rational function of low-order polynomials in z^{-1}, of which finally the inverse Z-transform is computed.

Point and line relaxation. Tables 4.10 and 4.11 (p. 82) report the observed convergence factors of (pointwise) DSSOR and CSOR waveform relaxation for our model problem. The results agree closely with the theoretically derived spectral radii. Finally, the thesis considers some problems for which the Jacobi symbols K_JAC(z) do not have a collinear spectrum.
We show that the numerical CSOR results, obtained with a line-oriented relaxation and a convolution sequence (Θ_num)_τ computed via the (formally invalid) formula (48), lie very close to the "optimal" convergence results. We thus observe a certain robustness of the CSOR waveform relaxation method.
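The FFT-based route mentioned above, evaluating the transformed kernel on the unit circle and transforming back, can be sketched generically. The sketch below (function names are our own) uses a plain inverse discrete Fourier transform for clarity; `numpy.fft.ifft` would do the same thing faster:

```python
import cmath

def kernel_from_symbol(symbol, N=64):
    """Recover the first N terms of a causal sequence from samples of its
    Z-transform X(z) = sum_{n>=0} x[n] z^{-n} on the unit circle, via an
    inverse DFT.  The aliasing error decays with N when x[n] -> 0."""
    samples = [symbol(cmath.exp(2j * cmath.pi * k / N)) for k in range(N)]
    # inverse DFT, written out explicitly
    return [sum(samples[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N
            for n in range(N)]

# sanity check on a known pair: x[n] = 0.5**n  <->  X(z) = 1 / (1 - 0.5/z)
kern = kernel_from_symbol(lambda z: 1.0 / (1.0 - 0.5 / z))
print(kern[:4])  # close to [1.0, 0.5, 0.25, 0.125]
```

In the CSOR setting, `symbol` would be an (approximate) evaluation routine for Θ̃_opt(z), which, as noted above, is itself the hard part.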
5 Chebyshev acceleration of waveform relaxation methods

5.1 Polynomial acceleration of waveform relaxation
Suppose we have a sequence of iterates {x^{(i)}} generated by a classical waveform relaxation method of the form (14),

  M_B ẋ^{(ν)} + M_A x^{(ν)} = N_B ẋ^{(ν−1)} + N_A x^{(ν−1)} + f,  x^{(ν)}(0) = u_0,  ν ≥ 1.  (60)

A first way to accelerate the convergence of this sequence towards the solution of (13) is an immediate extension of the linear acceleration technique for static iterations, see e.g. [24, Chap. 3] or [106, Chap. 11]. More precisely, we construct a new sequence of functions {u^{(ν)}} by setting

  u^{(ν)} = Σ_{i=0}^{ν} α_{νi} x^{(i)},  ν ≥ 0.  (61)

We can rewrite (61) as u^{(ν)} = q_ν(K) u^{(0)} + φ_{q_ν} in terms of q_ν(s) = Σ_{i=0}^{ν} α_{νi} s^i. This polynomial satisfies a normalisation condition, q_ν(1) = 1. The symbol of the iteration operator q_ν(K) can be found by Laplace-transforming (61),

  ũ^{(ν)}(z) = q_ν(K(z)) ũ^{(0)}(z) + φ̃_{q_ν}(z).

This expression corresponds to the polynomial acceleration, towards the solution of (20), of the iteration

  x̃^{(ν)}(z) = K(z) x̃^{(ν−1)}(z) + φ̃(z),  (62)

obtained by Laplace-transforming (60). We can then apply Lemma 2.3 and obtain

  ρ(q_ν(K)) = sup_{Re(z)≥0} ρ(q_ν(K(z))) = sup_{z∈iR} ρ(q_ν(K(z))).  (63)

The values ρ(q_ν(K)) and ρ(q_ν(K(z))) yield convergence factors over ν iterations. The corresponding asymptotic averaged spectral radii are defined by

  ρ̄(q(K)) := lim_{ν→∞} (ρ(q_ν(K)))^{1/ν}  and  ρ̄(q(K(z))) := lim_{ν→∞} (ρ(q_ν(K(z))))^{1/ν},

see e.g. [106, p. 299]. Combined with (63), these definitions immediately lead to

  ρ̄(q(K)) = sup_{Re(z)≥0} ρ̄(q(K(z))) = sup_{z∈iR} ρ̄(q(K(z))).  (64)

The sequence of polynomials {q_ν(s) : q_ν(1) = 1} that minimises (64) was studied in [68]. It was shown there that only a marginal acceleration is possible; better results can be expected from a sequence of z-dependent polynomials.
5.2 Convolution-based polynomial acceleration of waveform relaxation

Chebyshev acceleration in the frequency domain

The convergence of the Laplace-transformed iterates {x̃^{(i)}(z)} can be improved by taking linear combinations of these functions, i.e.,

  ũ^{(ν)}(z) = Σ_{i=0}^{ν} β̃_{νi}(z) x̃^{(i)}(z),  ν ≥ 0.  (65)

While the notation in equation (61) indicates that the same sequence of polynomials is used for all linear systems of the form (20), (65) suggests a different choice of coefficients for every particular value of z. Repeatedly substituting (62), we obtain

  ũ^{(ν)}(z) = Q_ν(z, K(z)) ũ^{(0)}(z) + φ̃_{Q_ν}(z),  (66)

with Q_ν(z, s) = Σ_{i=0}^{ν} β̃_{νi}(z) s^i and Q_ν(z, 1) = 1. The spectral radius of the iteration matrix in (66) is given by

  ρ(Q_ν(z, K(z))) = max_{λ∈σ(K(z))} |Q_ν(z, λ)|.  (67)

Since the spectrum of K(z) is seldom known exactly, we look for polynomials {Q_ν(z, s)} that are small on a region containing σ(K(z)). More precisely, we assume that the eigenvalues of K(z) lie in a closed region R(z) = R(d(z), p(z), q(z), φ(z)) whose boundary is the ellipse E(z) = E(d(z), p(z), q(z), φ(z)), centred at the complex point d(z). This ellipse is given by

  λ = d(z) + e^{iφ(z)} (p(z) cos θ + i q(z) sin θ),  0 ≤ θ < 2π,

with semi-axes p(z) and q(z) satisfying p(z) ≥ q(z) ≥ 0, and −π/2 ≤ φ(z) < π/2. The spectral radius (67) is then clearly bounded by its virtual equivalent, defined as

  ρ_R(Q_ν(z, K(z))) := max_{λ∈R(z)} |Q_ν(z, λ)|.

To determine the averaged convergence factor per iteration, we also define the virtual asymptotic averaged spectral radius as

  ρ̄_R(Q(z, K(z))) := lim_{ν→∞} (ρ_R(Q_ν(z, K(z))))^{1/ν}.

This radius can be minimised by choosing {Q_ν(z, s)} as a sequence of scaled and translated Chebyshev polynomials of the first kind (in the variable s) [61].

Lemma 5.1 Assume the spectrum σ(K(z)) lies in the region R(z), which does not contain the point 1 and for which p(z) > q(z) > 0. In terms of

  P_ν^R(z, s) = T_ν((s − d(z))/c(z)) / T_ν((1 − d(z))/c(z)),  (68)

with T_ν the ν-th Chebyshev polynomial of the first kind and c(z) = sqrt(p²(z) − q²(z)) e^{iφ(z)}, we have

  ρ̄_R(P^R(z, K(z))) ≤ ρ̄_R(Q(z, K(z)))  (69)

for all sequences of polynomials {Q_ν(z, s) : Q_ν(z, 1) = 1}. In particular,

  ρ̄_R(P^R(z, K(z))) = (p(z) + q(z)) / |1 − d(z) + sqrt((1 − d(z))² − c²(z))|,  (70)

with the branch of the square root chosen such that sqrt((1 − d(z))²) = 1 − d(z).

We can prove analogous results for the degenerate cases of the region R(z). When σ(K(z)) lies on a line segment R(d(z), p(z), 0, φ(z)) = E(d(z), p(z), 0, φ(z)), we have the following lemma.

Lemma 5.2 If σ(K(z)) lies on the line segment R(d(z), p(z), 0, φ(z)), which does not contain the point 1, the results of Lemma 5.1 remain valid in terms of q(z) = 0 and c(z) = p(z) e^{iφ(z)}.
Next we consider the problem of determining an optimal ellipse, i.e., an ellipse for which (70) is as small as possible.

Lemma 5.3 Every optimal ellipse E_opt(z) contains an eigenvalue of K(z). If σ(K(z)) is collinear, the optimal ellipse E_opt(z) equals the line segment between the extremal eigenvalues of K(z).

Corollary 5.4 For every optimal ellipse the virtual asymptotic averaged spectral radius is actually attained, i.e.,

  ρ̄(P^{R_opt}(z, K(z))) = ρ̄_{R_opt}(P^{R_opt}(z, K(z))).  (71)

The above results do not provide enough information to determine this ellipse when σ(K(z)) is not collinear. Even the search for a "good" ellipse, for which (70) is close to its minimal value, is a very hard problem. Finally, we note that iteration (65) can be rewritten in a form better suited to a practical implementation when the β̃_{νi}(z) are the coefficients of the polynomials (68).
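The behaviour of the scaled and translated Chebyshev polynomials (68) can be checked numerically. A minimal sketch for the degenerate real-interval case (d = 0, φ = 0, an illustrative half-axis p = 0.9 standing in for a spectrum in [−0.9, 0.9]):

```python
import math

def cheb_T(n, x):
    # Chebyshev polynomial of the first kind, three-term recurrence
    if n == 0:
        return 1.0
    t_prev, t_cur = 1.0, x
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def P(n, s, d=0.0, c=0.9):
    # scaled and translated Chebyshev polynomial, cf. (68):
    #   P_n(s) = T_n((s - d)/c) / T_n((1 - d)/c)
    return cheb_T(n, (s - d) / c) / cheb_T(n, (1.0 - d) / c)

n, p = 8, 0.9
peak = max(abs(P(n, -p + 2.0 * p * k / 1000)) for k in range(1001))
# the per-iteration factor from (70) with q = 0, d = 0, c = p:
factor = p / (1.0 + math.sqrt(1.0 - p * p))
print(peak, factor ** n)   # peak is about 2 * factor**n for this n
```

The normalisation Q_ν(z, 1) = 1 holds by construction, while the maximum of |P_ν| over the spectrum interval is far below 1 and shrinks geometrically with ν at the rate predicted by (70).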
Convolution-based Chebyshev acceleration in the time domain

Inverse Laplace transformation of the z-dependent iteration scheme (65) yields

  u^{(ν)} = Σ_{i=0}^{ν} β_{νi} * x^{(i)},  ν ≥ 0,  (72)

with the normalisation condition Σ_{i=0}^{ν} β_{νi}(t) = δ(t). This method is called a convolution-based polynomial waveform relaxation method. If the polynomials Q_ν(z, s) are of Chebyshev type, we can again derive a scheme that is easier to implement. Under suitable assumptions we can then prove that u^{(ν)} = K_CH^{ν} u^{(0)} + φ_CH^{ν}, with K_CH an operator consisting of a matrix multiplication and a convolution with an L_1(0,∞) kernel. Lemma 2.3 implies that

  ρ(K_CH^{ν}) = sup_{Re(z)≥0} ρ(P_ν^R(z, K(z))) = sup_{z∈iR} ρ(P_ν^R(z, K(z))).

The corresponding asymptotic averaged spectral radius, which equals

  ρ̄(K_CH) = sup_{z∈iR} ρ̄(P^R(z, K(z))),  (73)

can then be bounded by its virtual equivalent,

  sup_{z∈iR} ρ̄_R(P^R(z, K(z))) = sup_{z∈iR} (p(z) + q(z)) / |1 − d(z) + sqrt((1 − d(z))² − c²(z))|.  (74)

The equality in this formula follows from Lemma 5.1 and the assumption that the ellipses R(z) do not contain the point 1. Moreover, (73) equals (74) if these ellipses are chosen optimally. In the remainder of this chapter we will often drop the subscript in the notation of the operator K_CH, which is called the (convolution-based) Chebyshev waveform relaxation operator.
5.3 Model problem analysis

In the thesis we begin the model problem analyses with a theoretical study of the Chebyshev-Picard method for equation (28), discretised by finite differences. We also discuss the relation between the Chebyshev-Picard method and the shifted Picard iteration introduced by Lubich and Skeel [54, 85]. In this summary we restrict ourselves to the convolution-based Chebyshev acceleration of the Jacobi and Gauss-Seidel waveform relaxation methods, applied to our model problem.

Chebyshev-Jacobi waveform relaxation

The next theorem shows that, for our model problem, the spectral radius of the Chebyshev-Jacobi waveform relaxation method equals the spectral radius of the corresponding static iteration.

Theorem 5.5 Consider the one-dimensional heat equation (28), spatially discretised with linear finite elements. If K_{CH-JAC,opt} is regarded as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have

  ρ̄(K_{CH-JAC,opt}) = ρ̄(P^{R_opt}(0, K_JAC(0))) = cos(πh) / (1 + sqrt(1 − cos²(πh))) ≈ 1 − πh.  (75)

Comparing this result with those of Theorems 3.8 and 4.14, we note that the Chebyshev-Jacobi waveform method is much faster than the unaccelerated Jacobi variant, but only half as fast as the optimal CSOR method.
Chebyshev-Gauss-Seidel waveform relaxation

The Gauss-Seidel counterpart of Theorem 5.5 is given below. Its proof relies, among other things, on the fact that the spectra of the Gauss-Seidel symbols K_GS(z) are collinear. Once more the convolution-based Chebyshev method is as fast as its static counterpart. Moreover, the Chebyshev-Gauss-Seidel waveform relaxation method is as fast as the optimal CSOR algorithm.

Theorem 5.6 Consider the one-dimensional heat equation (28), spatially discretised with linear finite elements. If K_{CH-GS,opt} is regarded as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have

  ρ̄(K_{CH-GS,opt}) = ρ̄(P^{R_opt}(0, K_GS(0))) = cos²(πh) / (1 + sqrt(1 − cos²(πh)))² ≈ 1 − 2πh.  (76)

Numerical results. Tables 5.2 and 5.3 (p. 107) report the observed convergence factors of the Chebyshev-Jacobi and Chebyshev-Gauss-Seidel waveform methods for our model problem. These results agree very well with those of the continuous theory. The influence of the time-discretisation method was not discussed here; with the knowledge of the previous chapters, however, we expect that the convergence behaviour of the discrete waveform methods can be derived from a spectral picture. For this we refer to Figures 5.6 (p. 102) and 5.9 (p. 105).
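Theorems 5.5 and 5.6 equate the waveform methods with their static counterparts. The static counterpart, Chebyshev (semi-iterative) acceleration of Jacobi for Au = f, can be sketched as follows; this is the classical Golub-Varga three-term form, and the grid size and iteration counts are illustrative choices of ours:

```python
import math

m = 15
h = 1.0 / (m + 1)
sigma = math.cos(math.pi * h)   # spectral radius of the Jacobi iteration matrix

def jacobi_step(x, b):
    # one Jacobi sweep for A = tridiag(-1, 2, -1) (zero Dirichlet ends)
    return [0.5 * (b[i] + (x[i-1] if i > 0 else 0.0)
                        + (x[i+1] if i < m-1 else 0.0)) for i in range(m)]

u_exact = [1.0] * m
b = [2*u_exact[i] - (u_exact[i-1] if i > 0 else 0.0)
                  - (u_exact[i+1] if i < m-1 else 0.0) for i in range(m)]
def err(x):
    return max(abs(x[i] - u_exact[i]) for i in range(m))

# plain Jacobi, 40 sweeps
xj = [0.0] * m
for _ in range(40):
    xj = jacobi_step(xj, b)

# Chebyshev-accelerated Jacobi, 40 matrix applications in total
xprev = [0.0] * m
xcur = jacobi_step(xprev, b)
omega = 2.0                      # seed so that omega_2 = 1/(1 - sigma^2/2)
for _ in range(39):
    omega = 1.0 / (1.0 - 0.25 * sigma * sigma * omega)
    step = jacobi_step(xcur, b)
    xprev, xcur = xcur, [omega * step[i] + (1.0 - omega) * xprev[i]
                         for i in range(m)]

print(err(xj), err(xcur))   # acceleration gains several orders of magnitude
```

The observed error reduction per step of the accelerated iteration approaches cos(πh)/(1 + sin(πh)), in line with the factor appearing in (75).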
6 Multigrid waveform relaxation methods

6.1 Description of the method

The continuous case

Multigrid methods are very efficient solvers for elliptic partial differential equations [22, 88, 100]. The principle of these methods, based on eliminating the highly oscillatory and the smoothly varying error components by, respectively, smoothing and a coarse-grid correction, is easily extended to time-dependent problems by defining all operations on functions. Below we state a two-grid variant for (13), obtained by finite-element discretisation of a parabolic equation (5), in which waveform relaxation is used as the smoother. The method is defined on two nested grids Ω_H and Ω_h.

Pre-smoothing. Set x_h^{(0)} = u_h^{(ν−1)}, and perform ν_1 waveform relaxation steps, i.e., solve

  M_{B_h} ẋ_h^{(ν)} + M_{A_h} x_h^{(ν)} = N_{B_h} ẋ_h^{(ν−1)} + N_{A_h} x_h^{(ν−1)} + f_h,  x_h^{(ν)}(0) = u_0,  (77)

for ν = 1, 2, …, ν_1.
Coarse-grid correction. Compute the defect

  d_h = B_h ẋ_h^{(ν_1)} + A_h x_h^{(ν_1)} − f_h = N_{B_h}(ẋ_h^{(ν_1−1)} − ẋ_h^{(ν_1)}) + N_{A_h}(x_h^{(ν_1−1)} − x_h^{(ν_1)}).

Solve the coarse-grid analogue of the defect equation,

  B_H v̇_H + A_H v_H = r d_h,  v_H(0) = 0,  (78)

with r : Ω_h → Ω_H the restriction operator, which transfers fine-grid quantities to the coarse grid. Then interpolate the correction v_H to Ω_h and update the current approximation, x̄_h = x_h^{(ν_1)} − p v_H, with p : Ω_H → Ω_h the prolongation operator.

Post-smoothing. Perform ν_2 iterations of the form (77), starting from x_h^{(0)} = x̄_h, and set u_h^{(ν)} = x_h^{(ν_2)}.

Since (78) is of the same form as (13), this two-grid variant can be applied recursively to arrive at a multigrid algorithm on more than two nested grids.
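A minimal runnable sketch of this two-grid waveform cycle for the 1D heat equation with B = I, finite differences in space and implicit Euler in time. The thesis uses red-black Gauss-Seidel smoothing; here a damped-Jacobi waveform smoother is substituted for brevity, and all grid and cycle parameters are illustrative:

```python
import math

m, N = 7, 16                         # fine grid: m interior points; N time steps
h, tau = 1.0 / (m + 1), 0.5 / 16     # mesh size and time step (T = 0.5)
mc, H = (m - 1) // 2, 2.0 / (m + 1)  # coarse grid: standard coarsening H = 2h

def apply_lap(v, hh):
    # A = tridiag(-1, 2, -1)/hh^2 with zero Dirichlet boundary values
    mm = len(v)
    return [(2*v[i] - (v[i-1] if i > 0 else 0.0)
                    - (v[i+1] if i < mm-1 else 0.0)) / hh**2 for i in range(mm)]

def solve_step(rhs, hh, mu):
    # Thomas algorithm for (mu*I + A) x = rhs, A as above
    mm = len(rhs)
    a, b = mu + 2.0/hh**2, -1.0/hh**2
    c, d = [0.0]*mm, [0.0]*mm
    c[0], d[0] = b/a, rhs[0]/a
    for i in range(1, mm):
        den = a - b*c[i-1]
        c[i], d[i] = b/den, (rhs[i] - b*d[i-1])/den
    x = [0.0]*mm
    x[-1] = d[-1]
    for i in range(mm-2, -1, -1):
        x[i] = d[i] - c[i]*x[i+1]
    return x

omega = 2.0/3.0
def smooth(x):
    # one damped-Jacobi waveform sweep of (77), implicit Euler in time
    new, dg = [x[0][:]], 2.0/h**2
    for n in range(1, N+1):
        Ax = apply_lap(x[n], h)
        new.append([(new[n-1][i]/tau + (dg/omega)*x[n][i] - Ax[i])
                    / (1.0/tau + dg/omega) for i in range(m)])
    return new

def restrict(d):   # full weighting
    return [0.25*(d[2*j] + 2.0*d[2*j+1] + d[2*j+2]) for j in range(mc)]

def prolong(v):    # piecewise-linear interpolation
    pv = [0.0]*m
    for j in range(mc):
        pv[2*j+1] = v[j]
    for i in range(0, m, 2):
        pv[i] = 0.5*((pv[i-1] if i > 0 else 0.0) + (pv[i+1] if i < m-1 else 0.0))
    return pv

def two_grid_cycle(x):
    for _ in range(2):               # pre-smoothing, nu_1 = 2
        x = smooth(x)
    # defect of the smoothed waveform: d = x' + A x - f  (here f = 0)
    defect = [None] + [[(x[n][i] - x[n-1][i])/tau + a
                        for i, a in enumerate(apply_lap(x[n], h))]
                       for n in range(1, N+1)]
    v_prev = [0.0]*mc                # coarse defect equation (78), v_H(0) = 0
    for n in range(1, N+1):
        rd = restrict(defect[n])
        v = solve_step([v_prev[j]/tau + rd[j] for j in range(mc)], H, 1.0/tau)
        pv = prolong(v)
        x[n] = [x[n][i] - pv[i] for i in range(m)]
        v_prev = v
    return smooth(x)                 # post-smoothing, nu_2 = 1

# reference: direct implicit Euler time-stepping of the full fine-grid system
u0 = [math.sin(math.pi*(i+1)*h) + 0.5*math.sin(3*math.pi*(i+1)*h) for i in range(m)]
u = [u0[:]]
for n in range(N):
    u.append(solve_step([ui/tau for ui in u[-1]], h, 1.0/tau))

x = [u0[:] for _ in range(N+1)]      # initial guess: waveforms constant in time
errs = []
for _ in range(4):
    x = two_grid_cycle(x)
    errs.append(max(abs(x[n][i] - u[n][i]) for n in range(N+1) for i in range(m)))
print(errs)                          # errors shrink by a large factor per cycle
```

The per-cycle contraction observed here is essentially independent of the mesh size, which is the qualitative behaviour the convergence analysis below makes precise.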
The discrete case

The discrete two-grid waveform relaxation algorithm is obtained by discretising its continuous counterpart (equations (77)-(78) and the differential equations of the post-smoothing steps) in time by a linear multistep method. The resulting discrete waveform relaxation variant is well defined if and only if the discrete solvability conditions of the two-grid algorithm are satisfied, i.e.,

  α_k/(τ β_k) ∉ σ(−M_{B_h}^{-1} M_{A_h})  and  α_k/(τ β_k) ∉ σ(−B_H^{-1} A_H),  (79)

with α_k and β_k the leading coefficients of the multistep formula and τ the time step.
6.2 Convergence analysis

The continuous case

The two-grid waveform relaxation operator and its symbol. The continuous two-grid cycle can be rewritten in explicit form as u^{(ν)} = M u^{(ν−1)} + φ. The continuous two-grid waveform relaxation operator M is given by

  M = K^{ν_2} C K^{ν_1},

with K the standard waveform relaxation operator and C the two-grid correction operator. By rearranging terms, we can rewrite M as

  M = (M_{B_h}^{-1} N_{B_h})^{ν_2} (I − p B_H^{-1} r B_h)(M_{B_h}^{-1} N_{B_h})^{ν_1} + M_c,

with M_c a linear Volterra convolution operator.

The error e^{(ν)} satisfies e^{(ν)} = M e^{(ν−1)}. Laplace transformation of this relation gives ẽ^{(ν)}(z) = M(z) ẽ^{(ν−1)}(z). Transforming the equations of the two-grid cycle, we find the following expression for the symbol M(z) of the operator M:

  M(z) = K^{ν_2}(z)(I − p (zB_H + A_H)^{-1} r (zB_h + A_h)) K^{ν_1}(z),  K(z) = (zM_{B_h} + M_{A_h})^{-1}(zN_{B_h} + N_{A_h}).

Convergence on finite time intervals. The following theorem is the analogue of Theorem 3.1.

Theorem 6.1 Regard M as an operator in C[0, T]. Then M is a bounded operator and

  ρ(M) = ρ((M_{B_h}^{-1} N_{B_h})^{ν_2} (I − p B_H^{-1} r B_h)(M_{B_h}^{-1} N_{B_h})^{ν_1}).  (80)

Convergence on infinite time intervals. If we assume that

  all eigenvalues of B_H^{-1} A_H and M_{B_h}^{-1} M_{A_h} have positive real parts,  (81)

then m_c ∈ L_1(0,∞). Applying Lemma 2.3 yields the following theorem.

Theorem 6.2 Regard M as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, and assume conditions (81) are satisfied. Then M is a bounded operator and

  ρ(M) = sup_{Re(z)≥0} ρ(M(z)) = sup_{ξ∈R} ρ(M(iξ)).  (82)
The discrete case

The two-grid waveform relaxation operator and its symbol. The discrete two-grid cycle can be rewritten in explicit form as u_τ^{(ν)} = M_τ u_τ^{(ν−1)} + φ_τ. To identify the nature of the discrete two-grid waveform relaxation operator, we observe that, just as in (23), one can derive

  E^{(ν)} = G E^{(ν−1)},

where E^{(ν)} = [e^{(ν)}[k], e^{(ν)}[k+1], …, e^{(ν)}[N+k−1]]^t and G is a lower triangular block-Toeplitz matrix. This implies that M_τ is a discrete linear convolution operator. The Z-transform M_τ(z) of the kernel of M_τ, also called the symbol of the operator M_τ, is obtained by Z-transforming the equations of the discrete two-grid cycle. This gives

  M_τ(z) = M((1/τ) a(z)/b(z)),

with a(z) and b(z) the characteristic polynomials of the linear multistep method.

Convergence on finite time intervals. Applying Lemma 2.4 to the operator M_τ immediately yields the following theorem.

Theorem 6.3 Regard M_τ as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite, and assume the discrete solvability conditions (79) are satisfied. Then M_τ is a bounded operator and

  ρ(M_τ) = ρ(M(α_k/(τ β_k))).  (83)

Convergence on infinite time intervals. If we assume that σ(−M_{B_h}^{-1} M_{A_h}) ∪ σ(−B_H^{-1} A_H) ⊂ int S, with S the (suitably scaled) stability region of the multistep method, then the kernel of the operator M_τ belongs to l_1(∞). Using Lemma 2.5 we then obtain the following theorem.

Theorem 6.4 Regard M_τ as an operator in l_p(∞) with 1 ≤ p ≤ ∞, and assume σ(−M_{B_h}^{-1} M_{A_h}) ∪ σ(−B_H^{-1} A_H) ⊂ int S. Then M_τ is a bounded operator and

  ρ(M_τ) = sup { ρ(M(z)) : z ∈ C \ int S } = sup_{z∈∂S} ρ(M(z)).  (84)
Discrete versus continuous results

The relation between the two-grid operators M_τ and M is similar to the relation between K_τ and K. In particular we have

  lim_{τ→0} ρ(M_τ) = ρ(M),

both for finite and for infinite time intervals. We also mention the two-grid analogues of Theorem 3.5 and Corollary 3.6.

Theorem 6.5 Regard M_τ as an operator in l_p(∞) and M as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞. Assume the linear multistep method is A(θ)-stable and σ(−M_{B_h}^{-1} M_{A_h}) ∪ σ(−B_H^{-1} A_H) ⊂ Φ. Then

  ρ(M_τ) ≤ sup_{z∈Φ^c} ρ(M(z)) = sup_{z∈∂Φ^c} ρ(M(z)),  (85)

with Φ^c := C \ Φ = {z : |Arg(z)| ≤ π − θ}.

Corollary 6.6 Regard M_τ as an operator in l_p(∞) and M as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞. Assume the linear multistep method is A-stable and conditions (81) are satisfied. Then ρ(M_τ) ≤ ρ(M).

An extension of the two-grid results to the multigrid case

Suppose we have more than two grids Ω_{h_0}, Ω_{h_1}, …, Ω_{h_l}, together with a set of prolongation operators p_{h_i}^{h_{i+1}} : Ω_{h_i} → Ω_{h_{i+1}}, 0 ≤ i ≤ l−1, a set of restriction operators r_{h_i}^{h_{i+1}} : Ω_{h_{i+1}} → Ω_{h_i}, 0 ≤ i ≤ l−1, and matrices B_{h_i} and A_{h_i}, 0 ≤ i ≤ l. We can then derive results analogous to the ones above. These results are expressed in terms of the symbol of the multigrid waveform relaxation operator, given by

  M_{h_l}(z) = K_{h_l}^{ν_2}(z) ( I − p_{h_{l−1}}^{h_l} (I − M_{h_{l−1}}(z)) L_{h_{l−1}}^{-1}(z) r_{h_{l−1}}^{h_l} L_{h_l}(z) ) K_{h_l}^{ν_1}(z),  l ≠ 1,
  M_{h_1}(z) = K_{h_1}^{ν_2}(z) ( I − p_{h_0}^{h_1} L_{h_0}^{-1}(z) r_{h_0}^{h_1} L_{h_1}(z) ) K_{h_1}^{ν_1}(z),  l = 1,

where L_{h_i}(z) = zB_{h_i} + A_{h_i} and K_{h_i}(z) = (zM_{B_{h_i}} + M_{A_{h_i}})^{-1}(zN_{B_{h_i}} + N_{A_{h_i}}).
6.3 Model problem analysis

Theoretical results

The continuous case. The following result bounds ρ(M) by a constant independent of the mesh size h. We use standard coarsening (H = 2h) and piecewise-linear interpolation, while the restriction operator is defined as r := p^t [22, p. 66].

Theorem 6.7 Consider the one-dimensional heat equation (28), spatially discretised with linear finite elements. If M, with red-black Gauss-Seidel smoothing and interpolation and restriction as defined above, is regarded as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, then ρ(M) satisfies a bound (86) of the form ρ(M) ≤ c η(ν), where η(ν) = ν^ν/(ν+1)^{ν+1} with ν = ν_1 + ν_2 ≥ 1, and c is an explicit constant independent of h.

The discrete case. The spectral picture of the two-grid waveform relaxation operator is given in Figure 6.3 (p. 121). The computed values of the discrete spectral radii are reported in Table 6.3 (p. 121).
Numerical results

Tables 6.7 (p. 124) and 6.13 (p. 126) report the numerical results of the two-grid and multigrid waveform relaxation algorithms for our model problem. As expected, the observed results agree well with the theoretical results for the infinite time interval. The thesis also contains numerous numerical results for other model problems, which provide further insight into the convergence behaviour of the multigrid methods.
7 Conclusions and suggestions for further research

Conclusions

An extension of the linear waveform relaxation convergence theory. The first aim of this thesis was to investigate whether the classical waveform relaxation convergence theory for linear systems of differential equations u̇ + Au = f [63, 64] could be extended to more general problems of the form Bu̇ + Au = f with B nonsingular. In Chapter 3 we introduced a waveform relaxation scheme for such generalised problems and studied its convergence behaviour, both for the continuous and for the discrete variants of the method. To this end we formulated these algorithms in explicit form and investigated the spectral properties of the resulting iteration operators. Our results turned out to be analogous to those for the B = I case. For certain semi-discretised heat problems we could show that the Jacobi and Gauss-Seidel waveform relaxation algorithms converge as fast as their static counterparts for Au = f.

Acceleration techniques. Next, we set out to improve the convergence of waveform relaxation methods by acceleration techniques. The well-known acceleration techniques for stationary linear systems were the natural candidates to be tried on time-dependent problems. In Chapter 4 we considered the acceleration of waveform relaxation by SOR techniques. The disappointing results of the standard SOR waveform relaxation method for the B = I case [63, 64] were easily generalised to the more general linear systems of differential equations: the SOR waveform relaxation method does not yield the same acceleration as its static counterpart. At that point we came across the results of Mark Reichelt, who had performed some promising numerical experiments with the so-called convolution SOR waveform relaxation algorithm [82], in which every multiplication with an overrelaxation parameter is replaced by a convolution with a time-dependent function. We studied this CSOR method in detail and were able to show, among other things, that the optimal variant (with the best possible convolution kernel) behaves in many cases like the optimal SOR method for Au = f.

In Chapter 5 we applied the same convolution idea to the Chebyshev acceleration of waveform relaxation methods. We first illustrated that no substantial acceleration is to be expected from taking linear combinations of the basic iterates [68]. We therefore tried to accelerate the convergence by a convolution-based approach.

Our convergence results were first applied to the convolution-Chebyshev acceleration of the Picard method, and compared with those of the closely related shifted Picard iteration [54, 85]. It was also shown that the convolution-based Chebyshev-Jacobi and Chebyshev-Gauss-Seidel waveform relaxation methods converge as fast as their static counterparts, for several model problems.

Finally, Chapter 6 investigated the performance of multigrid waveform relaxation methods for problems Bu̇ + Au = f, derived from a linear parabolic partial differential equation by spatial discretisation. We could extend the finite-difference results [55, 95] to the finite-element case. In particular, the multigrid waveform relaxation method behaves like its static counterpart for Au = f: the convergence behaviour is independent of the mesh size h of the spatial discretisation of the partial differential equation.
Suggestions for further research

The thesis closes with some suggestions for further research. Several (new) acceleration techniques remain to be investigated (e.g., convolution-Krylov waveform variants [59], multigrid with different coarse grids, ...), as does the applicability of waveform methods to nonlinear problems with variable coefficients. Also of interest is a study of the difficulties that arise when the continuous waveform relaxation algorithms are discretised by a time integrator with variable step size, chosen according to the behaviour of the local variables in time. Finally, we mention the possible application of waveform relaxation methods to other problems, such as integro-differential equations and hyperbolic partial differential equations.
Chapter 1

Introduction

We introduce the waveform relaxation (WR) method and illustrate its use for solving initial-value problems (IVPs). The subject of the thesis is set in the WR research domain and we recall the most important convergence results and acceleration techniques. Finally, we present an overview of the dissertation.
1.1 Waveform relaxation methods

1.1.1 Basic ideas

Waveform relaxation is an iterative solution technique for IVPs consisting of ordinary differential equations (ODEs) or differential-algebraic equations. The method was introduced by Lelarasmee to overcome the deficiencies of standard time-stepping schemes when applied to IVPs that model the behaviour of very large-scale electrical networks [50]. Standard time-stepping schemes require at each time level the accurate solution of a large system of algebraic equations, obtained by discretising the IVP in time. The determination of the numerical solution, even at one time level, can become very expensive, especially when direct linear solvers are used. In addition, the time increment of time-stepping methods is restricted to one which is fine enough to resolve the system's most rapidly changing component. Since large IVPs are often very stiff, i.e., their variables may change in time at very different rates, the resolution of the slowly changing variables is then unnecessarily accurate and valuable computation time is wasted. Thirdly, time-stepping schemes are inherently sequential as they advance the solution time step after time step. Waveform relaxation is an iterative method that tries to get around these shortcomings. The method decomposes the original problem at the differential equation level. The large initial-value system is then solved iteratively, by solving the smaller subsystems in sequence, in a loop that is repeated until convergence. Values from the previous iteration are used for the variables from other subsystems. Waveform relaxation methods are continuous-time iterative methods, that is, given a vector of functions which approximates the solution, they calculate a new approximation along a whole time interval. They differ from standard iterative techniques in that their iterates consist of waveforms or
functions in time instead of scalar values. The subsystems that arise in a WR iteration are usually chosen to be much smaller than the original system; an additional requirement for the method to be useful is that the former systems are easily solvable. An actual computer implementation of WR requires the use of a standard time integration method to solve the decomposed differential equation subsystems. The resulting discrete-time WR algorithms allow for a selection of the time step of this integration scheme to reflect the behaviour of the local variables only. Hence, they are genuine multirate integration methods. Appropriate WR variants can be implemented on a parallel computer by distributing the subsystems among different processors. Every processor is then responsible for computing the update to a subsystem of differential equations, a task which is often large enough to lead to very satisfactory parallel performance. On the other hand, it is well known that WR algorithms need a substantial amount of computer memory as they require the storage of at least one complete waveform for every variable.
1.1.2 Some standard waveform methods

The former ideas will be illustrated by means of the general nonlinear first-order IVP

  F(t, u, u̇) = 0,  u(0) = u_0,  t > 0,  (1.1)

where F : R × C^d × C^d → C^d; u_0 = [(u_0)_1, (u_0)_2, …, (u_0)_d]^t ∈ C^d is a vector which contains the initial values; u(t) = [u_1(t), u_2(t), …, u_d(t)]^t ∈ C^d is the solution vector at time t; and u̇ denotes the derivative of u with respect to t. Componentwise, this system reads

  F_i(t, u_1, u_2, …, u_d, u̇_1, u̇_2, …, u̇_d) = 0,  with u_i(0) = (u_0)_i,  i = 1, 2, …, d.

The WR iteration is started with an initial approximation u^{(0)}, a vector of d components, each of which is a function defined for t > 0. A natural choice is to take these waveforms constant, equal to the values specified by the initial condition,

  u_i^{(0)}(t) = (u_0)_i,  i = 1, 2, …, d,  t > 0.

We can then define a very simple iteration scheme for (1.1) which maps the "old" iterate u^{(ν−1)} into a "new" iterate u^{(ν)} by solving the following equations for i = 1, 2, …, d:

  F_i(t, u_1^{(ν−1)}, …, u_{i−1}^{(ν−1)}, u_i^{(ν)}, u_{i+1}^{(ν−1)}, …, u_d^{(ν−1)}, u̇_1^{(ν−1)}, …, u̇_{i−1}^{(ν−1)}, u̇_i^{(ν)}, u̇_{i+1}^{(ν−1)}, …, u̇_d^{(ν−1)}) = 0.  (1.2)

The new waveforms are also required to satisfy the initial condition u_i^{(ν)}(0) = (u_0)_i. The iterative scheme (1.2) converts the task of solving a differential equation in d variables into the task of solving a sequence of d differential equations, each in one variable. Due to its correspondence with the Jacobi relaxation method for linear and nonlinear
systems of algebraic equations, the resulting method is referred to as the Jacobi waveform relaxation method. Waveform relaxation methods are also called dynamic iteration methods, while the corresponding relaxation methods for systems of algebraic equations are then referred to as static iterations. We can define the Gauss-Seidel waveform relaxation method for (1.1) in complete analogy with the Jacobi case. More precisely, its defining iterative scheme is given by the d differential(-algebraic) equations

  F_i(t, u_1^{(ν)}, …, u_i^{(ν)}, u_{i+1}^{(ν−1)}, …, u_d^{(ν−1)}, u̇_1^{(ν)}, …, u̇_i^{(ν)}, u̇_{i+1}^{(ν−1)}, …, u̇_d^{(ν−1)}) = 0.  (1.3)

While the Jacobi WR method (1.2) is fully parallel (its d differential equations can be solved simultaneously), the Gauss-Seidel scheme is inherently sequential as the equations (1.3) usually need to be solved one after the other. However, for the problems we shall consider the Gauss-Seidel WR method turns out to be parallelisable by using an adequate ordering of the variables. The Jacobi and Gauss-Seidel WR schemes for (1.1) may require the solution of nonlinear differential(-algebraic) equations. Yet, these nonlinear equations need not be solved exactly. A good approximate solution often suffices to ensure convergence of the WR iteration. One defines the waveform relaxation Newton method as a WR method in which each equation is linearised by a Newton-Raphson procedure before it is solved. As an example, we mention the Jacobi WR Newton method, whose iterative scheme is given by

  F_i(t, u_1^{(ν−1)}, …, u_d^{(ν−1)}, u̇_1^{(ν−1)}, …, u̇_d^{(ν−1)})
    + (∂F_i/∂u_i)(t, u_1^{(ν−1)}, …, u_d^{(ν−1)}, u̇_1^{(ν−1)}, …, u̇_d^{(ν−1)}) (u_i^{(ν)} − u_i^{(ν−1)})
    + (∂F_i/∂u̇_i)(t, u_1^{(ν−1)}, …, u_d^{(ν−1)}, u̇_1^{(ν−1)}, …, u̇_d^{(ν−1)}) (u̇_i^{(ν)} − u̇_i^{(ν−1)}) = 0.
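To make the decoupling performed by schemes such as (1.2) and (1.3) concrete, here is a minimal sketch for a linear system u̇ + Au = 0, with implicit Euler as the inner time integrator; the matrix, step size and sweep counts are illustrative choices of ours:

```python
A = [[2.0, -1.0], [-1.0, 2.0]]
u0 = [1.0, 0.0]
Nst = 1000
tau = 1.0 / Nst            # time interval [0, 1]

def reference():
    # direct implicit Euler on the coupled system (2x2 solve per step)
    u, traj = u0[:], [u0[:]]
    for _ in range(Nst):
        a11, a12 = 1 + tau*A[0][0], tau*A[0][1]
        a21, a22 = tau*A[1][0], 1 + tau*A[1][1]
        det = a11*a22 - a12*a21
        u = [(a22*u[0] - a12*u[1])/det, (a11*u[1] - a21*u[0])/det]
        traj.append(u[:])
    return traj

def wr_sweep(x, gauss_seidel=False):
    # one waveform sweep for du/dt + A u = 0: each variable is integrated
    # over the whole interval, coupled to the previous waveforms (Jacobi)
    # or to the already-updated ones (Gauss-Seidel)
    new = [row[:] for row in x]
    for i in range(2):
        src = new if gauss_seidel else x
        for n in range(1, Nst + 1):
            coupling = -sum(A[i][j]*src[n][j] for j in range(2) if j != i)
            new[n][i] = (new[n-1][i] + tau*coupling) / (1 + tau*A[i][i])
    return new

ref = reference()
def err(x):
    return max(abs(x[n][i] - ref[n][i]) for n in range(Nst + 1) for i in range(2))

x = [u0[:] for _ in range(Nst + 1)]     # initial waveforms: constant in time
errs = []
for _ in range(10):
    x = wr_sweep(x)                      # Jacobi sweeps
    errs.append(err(x))

x_gs = [u0[:] for _ in range(Nst + 1)]
for _ in range(10):
    x_gs = wr_sweep(x_gs, gauss_seidel=True)
print(errs[0], errs[-1], err(x_gs))
```

On a finite window the iteration error decays superlinearly with the sweep count, and the Gauss-Seidel ordering, which reuses freshly computed waveforms, converges markedly faster than Jacobi at the same number of sweeps.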
Next, we point out the correspondence between the WR methods and a method introduced a century ago by Picard and Lindelöf in their existence proof for solutions of ODEs of the form

\dot u = f(t, u), \quad u(0) = u_0, \quad t > 0, \qquad (1.4)

with f : \mathbb{R} \times \mathbb{C}^d \to \mathbb{C}^d [52, 74]. The method, defined in terms of the iteration

\dot u_i^{(\nu)} = f_i\bigl(t,\, u_1^{(\nu-1)}, \ldots, u_d^{(\nu-1)}\bigr),
decomposes the original ODE system into trivial subsystems, i.e., it converts the task of solving the d-dimensional problem (1.4) into the task of integrating a known scalar function for every state variable separately. This is a genuine WR scheme, to which we will refer as the waveform Picard method. For this reason, WR methods are also known under the alternative name of Picard–Lindelöf iterations.

For completeness, we also mention the waveform Newton iteration, which is different from the WR Newton methods discussed above. The method implements a Newton–Raphson linearisation of the original system, leading to the iteration

F\bigl(t, u^{(\nu-1)}, \dot u^{(\nu-1)}\bigr) + \frac{\partial F}{\partial u}\bigl(t, u^{(\nu-1)}, \dot u^{(\nu-1)}\bigr)\,\bigl(u^{(\nu)} - u^{(\nu-1)}\bigr) + \frac{\partial F}{\partial \dot u}\bigl(t, u^{(\nu-1)}, \dot u^{(\nu-1)}\bigr)\,\bigl(\dot u^{(\nu)} - \dot u^{(\nu-1)}\bigr) = 0 . \qquad (1.5)
Observe that the latter is usually not considered to be a relaxation method, as it does not decouple the initial-value problem (1.1) into easily solvable subsystems. More precisely, each waveform Newton iteration requires the solution of a linear system of differential(-algebraic) equations (1.5) with the same dimension as the original problem. The latter step could be performed, e.g., by using a WR method, leading to the so-called waveform Newton relaxation method [101].
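The waveform Picard idea, integrating a known function at every sweep, is easy to reproduce numerically. The following Python fragment is an illustrative sketch, not part of the thesis; the test equation, grid size and sweep count are arbitrary choices. It applies Picard iteration to the scalar problem u' = λu, u(0) = 1, whose iterates reproduce the partial sums of the Taylor series of exp(λt):

```python
import numpy as np

# Waveform Picard iteration for the scalar test ODE u' = lam*u, u(0) = 1,
# with exact solution exp(lam*t).  Each sweep only integrates the previous
# waveform:  u_new(t) = u0 + int_0^t lam * u_old(s) ds.
lam, u0, T, n = -1.0, 1.0, 1.0, 2001
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]

u = np.full(n, u0)                  # initial waveform: the constant u0
for sweep in range(30):
    rhs = lam * u                   # a known function of t once u is frozen
    # cumulative trapezoidal rule approximates the integral from 0 to t
    integral = np.concatenate(([0.0], np.cumsum((rhs[1:] + rhs[:-1]) * dt / 2)))
    u = u0 + integral

err = np.max(np.abs(u - np.exp(lam * t)))
print(f"max error after 30 Picard sweeps: {err:.2e}")
```

Each sweep is trivially parallel over the components in the vector-valued case, which is precisely the appeal of the decoupling.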
1.1.3 An example
We shall apply the Jacobi WR scheme to the following ODE system,

F_1(t, u_1, u_2, \dot u_1, \dot u_2) = \dot u_1 - u_2 = 0, \quad u_1(0) = 0,
F_2(t, u_1, u_2, \dot u_1, \dot u_2) = \dot u_2 + u_1 = 0, \quad u_2(0) = 1 . \qquad (1.6)

The reader will certainly recognise (1.6) as the classical system defining the sine and cosine functions. The iteration is started by choosing the initial approximations u_1^{(0)}(t) \equiv 0 and u_2^{(0)}(t) \equiv 1. For \nu \ge 1, the iterates u_1^{(\nu)} and u_2^{(\nu)} are calculated by solving (1.2), which for the present problem becomes

\dot u_1^{(\nu)} = u_2^{(\nu-1)}, \quad u_1^{(\nu)}(0) = 0, \qquad \dot u_2^{(\nu)} = -u_1^{(\nu-1)}, \quad u_2^{(\nu)}(0) = 1 .

It can easily be verified that the WR method leads to

u_1^{(2\nu-1)}(t) = u_1^{(2\nu)}(t) = \sum_{i=0}^{\nu-1} (-1)^i \frac{t^{2i+1}}{(2i+1)!}, \qquad \nu \ge 1,
u_2^{(2\nu)}(t) = u_2^{(2\nu+1)}(t) = \sum_{i=0}^{\nu} (-1)^i \frac{t^{2i}}{(2i)!}, \qquad \nu \ge 0 .

These waveforms obviously converge to u_1^{(\infty)}(t) = \sin(t) and u_2^{(\infty)}(t) = \cos(t), as each pair of iterations picks up one additional term of the Taylor expansion of the solution functions. We have plotted successive iterates u_1^{(\nu)}(t) and u_2^{(\nu)}(t) for t \in [0, 5] and \nu = 0, 1, \ldots, 10 in Figure 1.1. We observe that the error (the difference between the computed waveform and the exact solution) is not uniform along the interval of integration. The approximation is good at the beginning of the time interval but deteriorates for increasing values of t. Moreover, the error is not necessarily reduced at every time point by every new iterate. Rather, each iteration (in this case, each pair of iterations) lengthens the time interval on which the approximation is satisfactory. This qualitative convergence behaviour is fairly typical, as it is also observed for more complicated problems. It will be shown further on that it is in agreement with the general convergence theory.
Figure 1.1: Jacobi WR iterates u_1^{(\nu)}(t) (left) and u_2^{(\nu)}(t) (right) versus t for (1.6).
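The behaviour described above can be checked numerically. The sketch below is illustrative only (the grid, the quadrature rule and the sweep counts are arbitrary choices, not taken from the thesis); it runs Jacobi WR sweeps for (1.6) with trapezoidal integration and records how the error depends on t:

```python
import numpy as np

# Jacobi WR sweeps for system (1.6):  u1' = u2_old, u1(0)=0;  u2' = -u1_old, u2(0)=1.
# Each sweep integrates known functions only; here via the trapezoidal rule.
T, n = 5.0, 5001
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]

def cumtrapz0(y):
    # cumulative trapezoidal integral of y(s) from s = 0 to s = t
    return np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) * dt / 2)))

u1, u2 = np.zeros(n), np.ones(n)        # initial approximations
errors = []
for sweep in range(20):
    u1, u2 = cumtrapz0(u2), 1.0 + cumtrapz0(-u1)   # both use the OLD pair (Jacobi)
    errors.append(np.abs(u1 - np.sin(t)))

# After 4 sweeps the approximation is good near t = 0 but poor near t = 5 ...
early, late = errors[3][t <= 1.0].max(), errors[3][-1]
# ... while after 20 sweeps the error is small on the whole window.
final = max(errors[-1].max(), np.max(np.abs(u2 - np.cos(t))))
print(f"sweep 4: err(t<=1) = {early:.1e}, err(t=5) = {late:.1e}; sweep 20: {final:.1e}")
```

The widening region of small error mirrors the "lengthening time interval" observation made in the example.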
1.2 A survey of the waveform relaxation literature

Due to the large number of publications in this area, we shall not attempt to overview all of the WR literature. Instead, we restrict ourselves to the results for first-order IVPs, with special emphasis on the convergence and acceleration aspects of the methods. For a more detailed overview, including a WR study of, e.g., second-order and delay differential equations, we refer to Burrage's book [8, Chap. 7–9] and the references cited therein.
1.2.1 General convergence results
The most general convergence analysis of WR methods for first-order IVPs is presented in the electrical engineering literature in articles by Lelarasmee, Ruehli and Sangiovanni-Vincentelli [51], and by White et al. [101, 102]. There, convergence is proven for certain nonlinear first-order ODE systems which are slightly more general than (1.4) and which are typically encountered in electrical circuit simulation problems (the left-hand side is multiplied by a matrix which depends on the solution u). In particular, the Jacobi and Gauss–Seidel WR methods are shown to be convergent iterations for (1.4) if f(t, u) is Lipschitz continuous in u on the interval [0, T]. This result is based on the fact that the WR iteration operators are contractions on the space of continuous functions on [0, T], equipped with an exponentially scaled norm. Hence, it does not require the error to decrease in every iteration at every point in time. Yet, one can show by a strict contractivity argument that there always exists a time interval on which the error decreases uniformly, i.e., in the maximum norm.

These convergence results provide some insight into the qualitative convergence behaviour of the Jacobi and Gauss–Seidel WR methods, as illustrated above in the example of Section 1.1.3. However, they are of little practical use when one is interested in quantitative convergence estimates. To this end, Miekkala and Nevanlinna developed a WR convergence theory for linear ODE systems of the form

\dot u + A u = f, \quad u(0) = u_0, \quad t > 0, \qquad (1.7)

with A \in \mathbb{C}^{d \times d} [63, 64, 66, 67]. For such differential systems, the WR schemes (1.2) and (1.3) can be rewritten in the form

\dot u^{(\nu)} + M_A u^{(\nu)} = N_A u^{(\nu-1)} + f, \quad u^{(\nu)}(0) = u_0 . \qquad (1.8)
Here, the matrices M_A and N_A are chosen such that A = M_A - N_A; they correspond respectively to the Jacobi (M_A = D_A, N_A = L_A + U_A) and Gauss–Seidel (M_A = D_A - L_A, N_A = U_A) splittings of A, defined in terms of A = D_A - L_A - U_A with D_A a diagonal, L_A a lower-triangular and U_A an upper-triangular matrix. Miekkala and Nevanlinna proved that, under the right assumptions, the asymptotic convergence rates of these continuous-time Jacobi and Gauss–Seidel WR algorithms equal those of the corresponding static methods for Au = f, which are studied extensively in classical textbooks [4, 24, 98, 106]. They also derived convergence results for discrete-time algorithms and were able to predict and understand the convergence behaviour of actual WR implementations.
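This equality of asymptotic rates can be observed experimentally. The following Python sketch is illustrative only and not taken from the thesis: the 1-D finite-difference Laplacian model, the implicit-Euler time grid and all sizes are arbitrary choices. It runs the Jacobi WR scheme (1.8) for u' + Au = 0 with zero initial condition (so the iterates themselves are the errors) and compares the observed error-reduction factor per sweep with the spectral radius of static Jacobi for Au = f:

```python
import numpy as np

# Jacobi waveform relaxation (1.8) for u' + A u = 0, u(0) = 0, where A is the
# 1-D finite-difference Laplacian and A = D - N is its Jacobi splitting.
# Each sweep solves u_new' + D u_new = N u_old, time-stepped by implicit Euler.
d = 20
h = 1.0 / (d + 1)
A = (2.0 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)) / h**2
D = np.diag(np.diag(A))
N = D - A
rho_static = max(abs(np.linalg.eigvals(np.linalg.solve(D, N))))

T, nt = 1.0, 400
dt = T / nt
Minv = np.linalg.inv(np.eye(d) + dt * D)   # implicit-Euler matrix for the D part
u = np.ones((nt + 1, d))                   # nonzero starting waveform (= error)
norms = []
for sweep in range(25):
    unew = np.zeros_like(u)                # enforces u_new(0) = u0 = 0
    for k in range(1, nt + 1):
        unew[k] = Minv @ (unew[k - 1] + dt * (N @ u[k]))
    u = unew
    norms.append(np.abs(u).max())
rho_observed = norms[-1] / norms[-2]       # asymptotic reduction factor per sweep

print(f"static Jacobi: {rho_static:.4f}, observed WR factor: {rho_observed:.4f}")
```

On a window that is long relative to the decay time of the kernel, the observed factor approaches the static Jacobi radius, in line with the infinite-interval theory; on very short windows the WR iteration instead converges superlinearly.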
1.2.2 Acceleration techniques
Throughout the years, WR methods have been applied to a wide variety of first-order IVPs. The main application area so far has been the simulation of large-scale integrated circuits, see, e.g., [17, 50, 51, 76, 101, 102, 104], but there have also been implementations for chemical process simulation [83, 86] and electric power systems analysis [12, 13, 14]. Waveform relaxation methods can also be applied to parabolic initial-boundary-value problems, governed by a partial differential equation (PDE) of the form

\frac{\partial}{\partial t} u(x, t) + L u(x, t) = f(x, t), \quad x \in \Omega, \quad t > 0, \qquad (1.9)

with L an elliptic differential operator on the spatial domain \Omega. Indeed, semi-discretising the latter problem, that is, discretising its spatial variables only, leads to a system of first-order ODEs. Partial differential equations that were solved by WR methods occur, e.g., in the simulation of semi-conductor devices [58, 81]; we also mention diffusion equations [10], the Navier–Stokes equations [71, 72] and the Brusselator problem [95, §4.7.4, §8.5.2]. In this thesis, we restrict ourselves to linear, second-order operators L with time-independent coefficients. The use of finite differences then results in an ODE system of the form (1.7), while a discretisation with finite elements yields

B \dot u + A u = f, \quad u(0) = u_0, \quad t > 0, \qquad (1.10)

with B, A \in \mathbb{C}^{d \times d} and B nonsingular. For such real-life applications, one should not naively implement the Jacobi and Gauss–Seidel WR algorithms presented in the previous section. The inherent parallel and multirate integration properties of the methods should be exploited, as well as other convergence-improving techniques such as windowing [101] and inaccurate iteration [67, 69]. Several acceleration techniques that are well-known for systems of algebraic equations have been extended to the dynamic iteration case too.
Some examples for general ODE systems are linear/polynomial acceleration [54, 56, 57, 59, 68, 85], optimal ordering of the equations [42], partitioning and blockwise iteration [40], preconditioning [9] and successive overrelaxation (SOR) [63, 64, 82]. Other acceleration techniques, such as domain decomposition [18], multigrid [55, 90, 95] and Toeplitz relaxation [45, 84], have only been applied to semi-discretised parabolic problems; they naturally extend related methods for elliptic PDEs.
Below, we shall briefly discuss some known convergence results for the techniques that will be studied further on in this thesis. These waveform techniques are developed to be applied to real-life applications. Yet, their theoretical convergence results are usually formulated in terms of equations of the form (1.7). These results lead to quantitative convergence estimates for linear constant-coefficient model problems, and to valuable insights for more involved, nonlinear problems.
Successive Overrelaxation. The SOR waveform relaxation method can be defined as the natural extension of the standard SOR procedure for systems of algebraic equations, as will be explained in detail further on. The convergence results for ODE systems (1.7), investigated in [63, 64] by Miekkala and Nevanlinna, are rather disappointing. That is, the SOR WR method yields only a small acceleration when compared to the optimal SOR acceleration for the linear system Au = f. Recently, however, Reichelt et al. introduced a novel method, the convolution SOR (CSOR) waveform relaxation algorithm, in which they replaced the multiplication with an overrelaxation parameter by a convolution with a time-dependent function [82]. Their numerical experiments for semi-conductor device simulation problems showed the convolution-based approach to have much more potential than the standard SOR WR method.
Linear/Polynomial acceleration. In [68], Nevanlinna investigated the possibility of accelerating the convergence of (1.8) by linear or polynomial acceleration techniques, i.e., by taking linear combinations of the waveform iterates. He concludes that, in contrast to the static iteration case, no essential speed-up should be expected from the application of such techniques. These observations are investigated in more detail in [56, 57] for several Krylov subspace waveform relaxation methods. Inspired by the CSOR approach, Lumsdaine and Wu defined the convolution Krylov waveform relaxation methods [59], in which every multiplication of a waveform iterate with a scalar parameter is replaced by a convolution with a time-dependent function. They were able to prove that these WR methods converge as fast as their static counterparts, if convergence is measured in the correct norm. Concerning the Chebyshev acceleration of waveform methods, we mention the papers by Lubich [54] and Skeel [85]. These authors derived promising results for a so-called shifted waveform Picard method, the shift parameters of which are defined in terms of the zeros of certain Chebyshev polynomials.
Multigrid. Multigrid is known to be a very fast method for solving elliptic PDEs. Its combination with WR was first analysed by Lubich and Ostermann in [55] for linear parabolic PDEs, spatially discretised by using finite differences. A further examination of the resulting multigrid waveform relaxation method can be found, e.g., in [90, 95]. The method is shown to converge very fast for a wide variety of problems, with a convergence rate which is almost as good as that of multigrid for the corresponding static problem, and, thus, independent of the mesh size of the spatial discretisation.
1.3 Thesis overview

In this dissertation, we shall study the convergence of WR methods for linear systems of ODEs. In particular, we extend the existing WR convergence theory for ODE systems of type (1.7) to general systems of type (1.10) with nonsingular B. We study several "accelerated" WR variants, with emphasis on the CSOR method, the convolution-based Chebyshev method and the multigrid method. The ODE problems considered for the numerical experiments will be derived by finite-difference or finite-element discretisation of a linear parabolic PDE. This thesis contains the results that have been published in the journal papers [31, 36, 37, 39], the technical report [38], and in two articles in conference proceedings [34, 35]. The text is organised as follows.

The convergence of WR methods is governed by the spectral radius and/or norm of the WR iteration operators. In Chapter 2, we discuss the spectral properties of these operators. We will show that the continuous-time operators consist of a matrix multiplication part and a linear Volterra convolution part. In the discrete-time case, the WR operators are discrete Volterra convolution operators. We will recall the spectral properties of such operators and present some new results. In the later chapters, the results of Chapter 2 will be used to examine the convergence of the basic waveform relaxation methods and also of the SOR, Chebyshev and multigrid WR accelerations. More precisely, we will derive spectral radius and norm formulae for both the continuous-time and discrete-time operators. We will apply these results to several semi-discretised model diffusion problems, and verify them by means of extensive numerical experiments.

Chapter 3 contains the convergence study of the basic WR methods for (1.10), i.e., those that can be defined by splitting the matrices B and A. The resulting theoretical formulae are generalisations of those of Miekkala and Nevanlinna for solving (1.7).
Application of the results to semi-discretised model equations shows the convergence rates of the Jacobi and Gauss–Seidel WR methods to be equal to those of the corresponding static iterations for Au = f. Numerical evidence is also supplied.

In Chapter 4, we investigate the convergence behaviour of the standard SOR WR method for (1.10). We will show that the acceleration over the Jacobi and Gauss–Seidel methods is small compared to the acceleration obtained with SOR in the static case. Therefore, we concentrate on the CSOR WR method. We extend and complete Reichelt's theoretical convergence study. Among other things, we provide model problem analyses showing that the optimal CSOR WR method (i.e., the one with the best possible convolution kernel) behaves exactly as its optimal static SOR variant. We also comment on both the theoretical and practical determination of the optimal convolution kernel, and illustrate a certain robustness of the method. Part of this work has been done in collaboration with Min Hu and Ken Jackson of the Computer Science Department of the University of Toronto, Canada.

A similar convolution idea is applied to the Chebyshev acceleration of WR methods in Chapter 5. In this chapter, we first recall Nevanlinna's result stating that one cannot benefit from taking linear combinations of waveform iterates. Instead of combining WR iterates multiplied by scalar coefficients, better convergence results are to be expected by convolving each of the iterates with a time-dependent function before their sum is computed. The optimal choice of the latter functions is related to certain complex Chebyshev polynomials. An exact relationship will be derived, together with a detailed theoretical analysis of the resulting (convolution-based) Chebyshev WR method. We first apply these results to the Chebyshev–Picard iteration, and compare our analysis to that of the shifted waveform Picard iteration. The convolution approach does not only allow the analysis of the Chebyshev–Picard method, but can also be used to study, e.g., the Chebyshev–Jacobi and Chebyshev–Gauss–Seidel WR methods. We will show that the convergence behaviour of these methods is similar to that of the corresponding static iterations, at least for certain model problems.

Although the actual performance of the above WR methods is illustrated by means of semi-discretised model heat equations, they can also be applied to ODEs that do not stem from a parabolic PDE. This is not the case for the multigrid WR method of Chapter 6, which is only defined for semi-discretised parabolic PDEs. In this chapter, we generalise the convergence theory of Lubich and Ostermann for the finite-difference case towards general ODE systems (1.10), obtained by finite-element discretisation of a linear parabolic PDE (1.9). In particular, a model problem analysis allows us to draw the conclusion that the multigrid WR method converges in a mesh-size independent manner.

Finally, in Chapter 7, some general conclusions are given and some possible directions for future research are suggested.
Chapter 2

Functional Analysis Preliminaries

We introduce some notation and recall the necessary background on Banach spaces, in which we shall study the convergence behaviour of WR methods. We survey the most important spectral properties of linear operators and apply these results to the operators that are obtained by rewriting the iterative waveform schemes for linear ODE systems in explicit form.
2.1 Banach spaces

The following definitions can be found in standard functional analysis textbooks, see, e.g., [15, 19, 43, 91, 105].
Definition 2.1.1 Let X be a vector (or linear) space over \mathbb{C}. A norm on X is a real-valued function \|\cdot\|_X which has the following properties for x, y \in X and \lambda \in \mathbb{C}:
i) \|x\|_X \ge 0, and \|x\|_X = 0 implies x = 0;
ii) \|\lambda x\|_X = |\lambda| \, \|x\|_X;
iii) \|x + y\|_X \le \|x\|_X + \|y\|_X.
The space X, equipped with \|\cdot\|_X, is called a normed vector space.
Definition 2.1.2 A sequence \{x[n]\}_{n=0}^{\infty} in a normed vector space X is a Cauchy sequence if \|x[n] - x[m]\|_X \to 0 as n, m \to \infty. The space X is called complete if every Cauchy sequence converges to a vector in X.
Definition 2.1.3 A Banach space is a complete normed vector space.

Observe that the norm associated with a Banach space does not necessarily stem from an inner product. Some classical examples of Banach spaces are given below.
Example 2.1.1 Let C([0,T]; \mathbb{C}^d), or C[0,T] for short, denote the set of continuous \mathbb{C}^d-valued functions defined on the interval [0,T]. With the usual definitions of addition and scalar multiplication of functions, C[0,T] is a Banach space with respect to the maximum norm

\|x\|_{C[0,T]} = \max_{t \in [0,T]} \|x(t)\|,

with \|\cdot\| any usual vector norm in \mathbb{C}^d.
Example 2.1.2 Denote the set of \mathbb{C}^d-valued Lebesgue-measurable functions which are p-th power integrable by L_p((0,T); \mathbb{C}^d), or L_p(0,T) for short. It is a Banach space with the norm

\|x\|_{L_p(0,T)} = \left( \int_0^T \|x(t)\|^p \, dt \right)^{1/p}, \qquad 1 \le p < \infty . \qquad (2.1)

Here, \|\cdot\| denotes any usual vector norm in \mathbb{C}^d, and T may be equal to \infty. For the sake of completeness, one technicality is to be mentioned. Observe that \|x\|_{L_p(0,T)} = 0 does not necessarily imply x(t) = 0 everywhere, since x(t) may differ from 0 on a set of measure zero. In order to make (2.1) satisfy the first requirement of Definition 2.1.1, it is customary not to distinguish between functions that differ only on such a set. That is, "x = y" means x(t) = y(t) almost everywhere (except on a set of measure zero), a fact which is sometimes accentuated by writing "x(t) = y(t) a.e.". A member of L_p(0,T) is then defined as an equivalence class of functions equal a.e. Closely related is the Banach space of essentially bounded functions L_\infty(0,T), whose norm is given by

\|x\|_{L_\infty(0,T)} = \inf \{ M : \|x(t)\| \le M \text{ a.e. on } (0,T) \} .
Example 2.1.3 Consider the space of \mathbb{C}^d-valued, p-summable sequences x = \{x[n]\}_{n=0}^{N-1} of length N. This space, denoted by l_p(N; \mathbb{C}^d) or l_p(N) for short, is of Banach type with its norm given by

\|x\|_{l_p(N)} = \begin{cases} \left( \sum_{n=0}^{N-1} \|x[n]\|^p \right)^{1/p}, & 1 \le p < \infty, \\[4pt] \sup_{0 \le n \le N-1} \|x[n]\|, & p = \infty . \end{cases}

Again, \|\cdot\| stands for any usual vector norm in \mathbb{C}^d, while N is possibly equal to \infty. (For this reason, we wrote "sup" instead of "max" for p = \infty.)

The asymptotic convergence behaviour of relaxation methods for linear algebraic systems in finite-dimensional vector spaces is determined by the spectral radius of the iteration operator (matrix). We shall point the reader's attention to a similar property for iterations defined in infinite-dimensional Banach spaces. First, however, we recall some useful definitions (in which X denotes an ordinary Banach space over \mathbb{C}).
Definition 2.1.4 A linear operator \mathcal{H} (defined everywhere) in X is said to be bounded if there exists a constant M such that \|\mathcal{H}x\|_X \le M \|x\|_X for all x \in X.
Definition 2.1.5 The norm of a bounded linear operator \mathcal{H} in X is defined by

\|\mathcal{H}\|_X := \sup_{0 \ne x \in X} \bigl\{ \|\mathcal{H}x\|_X / \|x\|_X \bigr\} = \sup_{\|x\|_X = 1} \|\mathcal{H}x\|_X . \qquad (2.2)
Definition 2.1.6 Let \mathcal{H} be a linear operator in X. If \lambda is such that the range of \lambda I - \mathcal{H},

R(\lambda I - \mathcal{H}) := \{ y \in X : (\lambda I - \mathcal{H})x = y \text{ for some } x \in X \}, \qquad (2.3)

is dense in X and \lambda I - \mathcal{H} has a bounded inverse, then \lambda belongs to the resolvent set of \mathcal{H}. All scalar values \lambda that are not in this resolvent set comprise the spectrum of \mathcal{H}, denoted by \sigma(\mathcal{H}).
Definition 2.1.7 Let \mathcal{H} be a bounded linear operator in X whose spectrum \sigma(\mathcal{H}) is nonempty and bounded. The spectral radius of \mathcal{H} is defined by

\rho(\mathcal{H}) := \sup_{\lambda \in \sigma(\mathcal{H})} |\lambda| . \qquad (2.4)

Particular elements of the spectrum are the eigenvalues of \mathcal{H}, i.e., the scalars \lambda for which there exists a nontrivial x \in X such that \mathcal{H}x = \lambda x. Yet, in contrast to the situation in finite-dimensional vector spaces, the spectrum of \mathcal{H} may also contain other elements, and the spectral radius does not necessarily equal the modulus of the largest eigenvalue. Two alternative characterisations of \rho(\mathcal{H}) are given below. The second one is based on the observation that a bounded linear operator \mathcal{H} in X is closed, such that R(\lambda I - \mathcal{H}) = X if \lambda belongs to the resolvent set of \mathcal{H} [91, p. 254].
Property 2.1.1 Consider a bounded linear operator \mathcal{H} in X. Then,

\rho(\mathcal{H}) = \lim_{n \to \infty} \sqrt[n]{\|\mathcal{H}^n\|_X} . \qquad (2.5)

Property 2.1.2 Consider a bounded linear operator \mathcal{H} in X. Then, \rho(\mathcal{H}) equals the smallest number r \ge 0 for which |\lambda| > r implies that \lambda I - \mathcal{H} has a bounded inverse.
We now state the main theorem of this introduction on Banach spaces. It deals with the convergence of a successive approximation scheme, and may be found, e.g., in [43, p. 382].

Theorem 2.1.3 Let \mathcal{H} be a bounded linear operator in X. Suppose \rho(\mathcal{H}) < 1. Then, for all \varphi \in X the successive approximations

x^{(\nu)} = \mathcal{H} x^{(\nu-1)} + \varphi, \qquad \nu = 1, 2, \ldots, \qquad (2.6)

with arbitrary x^{(0)} \in X converge to the unique solution of x - \mathcal{H}x = \varphi. Suppose \rho(\mathcal{H}) > 1. Then, (2.6) with x^{(0)} = 0 cannot converge for all \varphi \in X.
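In the finite-dimensional case, Theorem 2.1.3 reduces to the familiar statement for matrix iterations. The snippet below is an illustrative sketch (the contraction \mathcal{H} is an arbitrarily chosen random matrix rescaled to spectral radius 0.9, not an example from the thesis) verifying the first half of the theorem numerically:

```python
import numpy as np

# Successive approximations x_new = H x_old + phi converge to the unique
# solution of (I - H) x = phi whenever rho(H) < 1, for any starting vector.
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 5))
H *= 0.9 / max(abs(np.linalg.eigvals(H)))     # rescale so that rho(H) = 0.9
phi = rng.standard_normal(5)

x_exact = np.linalg.solve(np.eye(5) - H, phi)

x = rng.standard_normal(5)                    # arbitrary x^(0)
for _ in range(300):
    x = H @ x + phi

rho = max(abs(np.linalg.eigvals(H)))
err = np.linalg.norm(x - x_exact)
print(f"rho(H) = {rho:.2f}, error after 300 sweeps = {err:.2e}")
```

The error decays roughly like rho(H)^nu, which is exactly the asymptotic behaviour predicted by Property 2.1.1.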
For the reader's convenience, we also recall the notion of the essential spectrum of a closed operator \mathcal{H} [44, Chap. IV, §5.6], which shall be used in the proof of Lemma 2.2.1.
Definition 2.1.8 Let \mathcal{H} be a closed linear operator in X. The essential spectrum of \mathcal{H}, denoted by \sigma_e(\mathcal{H}), consists of those scalars \lambda such that either R(\lambda I - \mathcal{H}) is not closed, or R(\lambda I - \mathcal{H}) is closed but \dim N(\lambda I - \mathcal{H}) = \infty and \operatorname{codim} R(\lambda I - \mathcal{H}) = \infty, with

N(\lambda I - \mathcal{H}) := \{ x \in X : (\lambda I - \mathcal{H})x = 0 \} \qquad (2.7)

the null space of \lambda I - \mathcal{H}.

Property 2.1.4 Let \mathcal{H} be a closed linear operator in X. Then, \sigma_e(\mathcal{H}) \subset \sigma(\mathcal{H}).
2.2 Spectral properties of convolution-like operators

In the sequel of this dissertation, the continuous-time and discrete-time WR algorithms for linear ODE systems shall be rewritten as successive approximation schemes of the form (2.6). In this section, we study the spectral properties of the resulting convolution(-like) operators in some specific Banach spaces.
2.2.1 The continuous-time case

The continuous-time WR scheme (1.8) can be rewritten as an explicit relation between successive waveform iterates. This is done by application of the general solution formula for linear ODEs of the form (1.7) [15, p. 119]. This yields

u^{(\nu)} = \mathcal{H} u^{(\nu-1)} + \varphi, \qquad (2.8)

where \mathcal{H} is a linear Volterra convolution operator with matrix-valued kernel h(t) = e^{-t M_A} N_A,

\mathcal{H}x(t) = (h * x)(t) := \int_0^t h(t-s) \, x(s) \, ds,

and \varphi depends on the initial condition and the right-hand side f. The spectral radius of this operator \mathcal{H} was studied by Miekkala and Nevanlinna [63], both for the finite time interval (in C[0,T]) and the infinite time interval (in L_p(0,\infty)).

In this thesis, we shall investigate several continuous-time WR methods for the linear ODE system (1.10) with nonsingular B. Any two successive iterates of such methods will be shown to satisfy an explicit relation of the form (2.8) too. Yet, the resulting operator \mathcal{H} will be of a more general type than the former one. That is, we will show that

\mathcal{H} = H_\infty + \mathcal{H}_c, \qquad (2.9)

where H_\infty \in \mathbb{C}^{d \times d}, and \mathcal{H}_c is a linear Volterra convolution operator with kernel h_c. Below, we shall investigate the spectral radii of these generalised operators, both on finite and infinite time intervals. Observe that the resulting formulae should incorporate those of [63] as a special case, i.e., by setting H_\infty = 0 and h_c(t) = e^{-t M_A} N_A.
Finite time intervals

Lemma 2.2.1 Consider \mathcal{H} = H_\infty + \mathcal{H}_c as an operator in C[0,T], and suppose h_c \in C[0,T]. Then, \mathcal{H} is a bounded operator and \rho(\mathcal{H}) = \rho(H_\infty).
The first proof given below is based on a stability result from perturbation theory and was suggested by an (anonymous) referee of [37]. For the reader's convenience, we include a second, more elementary proof as well.

Proof 1. Bounding \mathcal{H}x gives

\|\mathcal{H}x\|_{C[0,T]} \le \|H_\infty x\|_{C[0,T]} + \|\mathcal{H}_c x\|_{C[0,T]} \le \bigl( \|H_\infty\| + T \|h_c\|_{C[0,T]} \bigr) \|x\|_{C[0,T]},

where \|\cdot\| denotes both the vector norm in \mathbb{C}^d and the matrix norm induced by it. Hence, \mathcal{H} is a bounded operator with \|\mathcal{H}\|_{C[0,T]} \le \|H_\infty\| + T \|h_c\|_{C[0,T]}. Since the linear convolution operator \mathcal{H}_c is compact, the operator \mathcal{H} is a compact perturbation of H_\infty. From [44, Chap. IV, Thm. 5.35], it then follows that

\sigma_e(H_\infty) = \sigma_e(\mathcal{H}),

where \sigma_e(H_\infty) and \sigma_e(\mathcal{H}) denote the essential spectra of the (closed) operators H_\infty and \mathcal{H}, respectively. We show below that equality also holds for the spectra.

It is easily seen that the spectrum of the matrix multiplication operator H_\infty is equal to the spectrum of the matrix H_\infty. For any \lambda \in \sigma(H_\infty), both \dim N(\lambda I - H_\infty) and \operatorname{codim} R(\lambda I - H_\infty) in C[0,T] are infinite. Hence, the essential spectrum of the matrix multiplication operator H_\infty equals the spectrum of the matrix H_\infty, or

\sigma_e(H_\infty) = \sigma(H_\infty) .

It follows that \sigma_e(\mathcal{H}) is a finite set, and any point \lambda \in \sigma(\mathcal{H}) \setminus \sigma_e(\mathcal{H}) must be an isolated eigenvalue of \mathcal{H} [44, Chap. IV, Thm. 5.33]. We will show that there are no such points, i.e., \sigma(\mathcal{H}) = \sigma_e(\mathcal{H}). Suppose we have some x \ne 0 such that

\mathcal{H}x = H_\infty x + \mathcal{H}_c x = \lambda x .

Since \lambda \notin \sigma_e(\mathcal{H}) = \sigma(H_\infty), this can be rewritten as

(\lambda I - H_\infty)^{-1} \mathcal{H}_c x = x,

which means that (\lambda I - H_\infty)^{-1} \mathcal{H}_c has 1 as an eigenvalue. However, (\lambda I - H_\infty)^{-1} \mathcal{H}_c is a linear Volterra convolution operator with continuous kernel, whose spectrum equals the singleton \{0\} [47, p. 33]. This contradicts our assumption. Hence, \sigma(\mathcal{H}) = \sigma_e(\mathcal{H}), and thus \rho(\mathcal{H}) = \rho(H_\infty), which completes the proof. □

Proof 2. The boundedness of the operator \mathcal{H} was already proven in Proof 1. If H_\infty = 0, we refer to the result used at the end of the previous proof, stating that the spectrum of a linear Volterra convolution operator with continuous kernel equals the singleton \{0\}. Hence, \rho(\mathcal{H}) = \rho(H_\infty) = 0.
Further on, we assume H_\infty \ne 0. The n-fold application of \mathcal{H} to x then includes 2^n terms. Each term consists of a combination of matrix multiplication and Volterra convolution operators applied to x. The norm of a term with n-i matrix multiplications and i convolutions can be bounded by

\|H_\infty\|^{\,n-i} \, \|h_c\|_{C[0,T]}^{\,i} \int_0^t \int_0^{s_1} \int_0^{s_2} \cdots \int_0^{s_{i-1}} ds_i \cdots ds_3 \, ds_2 \, ds_1 \; \|x\|_{C[0,T]},

which, for t \in [0,T], is smaller than

\|H_\infty\|^{\,n-i} \, \|h_c\|_{C[0,T]}^{\,i} \, \frac{T^i}{i!} \, \|x\|_{C[0,T]} .

Note that for each i there are \binom{n}{i} terms satisfying the above bound. We get

\|\mathcal{H}^n x\|_{C[0,T]} \le \|H_\infty\|^n \sum_{i=0}^{n} \binom{n}{i} \frac{c^i}{i!} \, \|x\|_{C[0,T]},

with c = (T \|h_c\|_{C[0,T]}) / \|H_\infty\|. Using Property 2.1.1 of the spectral radius, we obtain

\rho(\mathcal{H}) \le \|H_\infty\| \lim_{n \to \infty} \sqrt[n]{\sum_{i=0}^{n} \binom{n}{i} \frac{c^i}{i!}} .

To calculate the limit, we observe that [1, Eqs. 22.3.9 and 22.5.16]

\sum_{i=0}^{n} \binom{n}{i} \frac{c^i}{i!} = L_n(-c),

where L_n(\cdot) denotes the n-th Laguerre polynomial. Using Perron's asymptotic formula for Laguerre polynomials [89, Thm. 8.22.3], we obtain

\sqrt[n]{L_n(-c)} = \sqrt[n]{\frac{1}{2\sqrt{\pi}} \, e^{-c/2} \, \frac{c^{-1/4}}{n^{1/4}} \, e^{2\sqrt{cn}} \bigl( 1 + O(n^{-1/2}) \bigr)} .

Taking the limit n \to \infty, both factors tend to 1, and, by consequence, \rho(\mathcal{H}) \le \|H_\infty\|.

In order to prove that \rho(\mathcal{H}) is independent of the choice of the vector norm in \mathbb{C}^d, we use two different vector norms \|\cdot\| and \|\cdot\|', and their associated maximum norms \|\cdot\|_{C[0,T]} and \|\cdot\|'_{C[0,T]}. Since all vector norms are equivalent [33, p. 7, Thm. 2], there exist m, M > 0 such that m \|x\|_{C[0,T]} \le \|x\|'_{C[0,T]} \le M \|x\|_{C[0,T]}. Hence, \lambda I - \mathcal{H} has a bounded inverse with regard to the maximum norm \|\cdot\|_{C[0,T]} if and only if (\lambda I - \mathcal{H})^{-1} is bounded with regard to \|\cdot\|'_{C[0,T]}, and the independence follows by the definition of \rho(\mathcal{H}). Consequently, \rho(\mathcal{H}) \le \|H_\infty\| for every induced matrix norm, or

\rho(\mathcal{H}) \le \inf_{\{\|\cdot\|\}} \|H_\infty\| = \rho(H_\infty), \qquad (2.10)
where the infimum is taken over all matrix norms induced by a vector norm in \mathbb{C}^d. The equality in (2.10) follows from a well-known characterisation of the spectral radius of a matrix [33, p. 14].

Finally, suppose \lambda \ne 0, and

\lambda x - \mathcal{H}x = f .

Evaluating at t = 0 gives

(\lambda I - H_\infty) \, x(0) = f(0) . \qquad (2.11)

If \det(\lambda I - H_\infty) = 0, then (2.11) has either no solution or an infinite number of solutions. The eigenvalues of H_\infty are therefore not regular values of \mathcal{H}, and, consequently, \rho(\mathcal{H}) \ge \rho(H_\infty), which completes the proof. □
Infinite time intervals

The proof of the infinite-interval equivalent of Lemma 2.2.1 is based on a theorem by Paley and Wiener, see, e.g., [20, p. 45] or [73, p. 60]. The theorem deals with the solution of a linear Volterra integral equation x + k * x = f. Its solution can be expressed in terms of a resolvent function r, which is defined by the two equations r + k * r = r + r * k = k. In particular, x = f - r * f. A necessary and sufficient condition for the boundedness of the resolvent r, and hence, for the boundedness of the solution x, is given in the theorem. Note that it holds both for scalar and for vector-valued functions.

Theorem 2.2.2 (Paley–Wiener) Let k \in L_1(0,\infty). Then the resolvent r of k satisfies r \in L_1(0,\infty) if and only if

\det\bigl(I + K(z)\bigr) \ne 0 \quad \text{for } \operatorname{Re}(z) \ge 0,

where K(z) = \mathcal{L}(k(t)) := \int_0^\infty k(t) \, e^{-zt} \, dt denotes the Laplace transform of k.

Lemma 2.2.3 Consider \mathcal{H} = H_\infty + \mathcal{H}_c as an operator in L_p(0,\infty) with 1 \le p \le \infty, and suppose h_c \in L_1(0,\infty). Then, \mathcal{H} is a bounded operator and

\rho(\mathcal{H}) = \sup_{\operatorname{Re}(z) \ge 0} \rho\bigl(H(z)\bigr) \qquad (2.12)
\phantom{\rho(\mathcal{H})} = \sup_{\xi \in \mathbb{R}} \rho\bigl(H(i\xi)\bigr), \qquad (2.13)

where H(z) = H_\infty + H_c(z), and H_c(z) denotes the Laplace transform of h_c.
Proof. The boundedness of $\mathcal{H}$ is an immediate consequence of Young's inequality for the convolution operator [78, p. 28]. Property 2.1.2 yields that the spectral radius of $\mathcal{H}$ is the smallest value of $\mu$ for which $|\lambda| > \mu$ implies that $\lambda I - \mathcal{H}$ has a bounded inverse in $L_p(0,\infty)$. Suppose $\lambda \neq 0$, and
\[
\lambda x - \mathcal{H}x = (\lambda I - H_\infty)\, x - h_c * x = f. \tag{2.14}
\]
First, we suppose that $\lambda$ is not an eigenvalue of $H_\infty$, i.e., $\lambda \notin \sigma(H_\infty)$. In that case, (2.14) can be rewritten as
\[
x - (\lambda I - H_\infty)^{-1}\, h_c * x = (\lambda I - H_\infty)^{-1} f.
\]
Applying the Paley–Wiener Theorem, one finds that $x$ is bounded if and only if
\[
\det\left(I - (\lambda I - H_\infty)^{-1} H_c(z)\right) \neq 0 \quad \text{for } \operatorname{Re}(z) \ge 0,
\]
CHAPTER 2. FUNCTIONAL ANALYSIS PRELIMINARIES
or, equivalently,
\[
\det\left(\lambda I - (H_\infty + H_c(z))\right) \neq 0 \quad \text{for } \operatorname{Re}(z) \ge 0.
\]
The set $\Phi$ of all $\lambda$, with $\lambda \notin \sigma(H_\infty)$, that lead to an unbounded solution $x$, is
\[
\Phi = \bigcup_{\operatorname{Re}(z) \ge 0} \sigma(H_\infty + H_c(z)) \setminus \sigma(H_\infty).
\]
Define $\eta$ as $\sup\{|\lambda| : \lambda \in \Phi\}$ (note that $\rho(\mathcal{H}) \ge \eta$, with "$\ge$" instead of "$=$" since we did not yet take all possible $\lambda$ into account). By the continuity of the eigenvalues of $H_\infty + H_c(z)$ as a function of $z$, it is clear that
\[
\eta = \sup_{\operatorname{Re}(z) \ge 0} \rho(H_\infty + H_c(z)).
\]
We still need to consider the $\lambda$ that are eigenvalues of $H_\infty$. However, because
\[
\lim_{z \to \infty} \left(H_\infty + H_c(z)\right) = H_\infty, \tag{2.15}
\]
these eigenvalues are in magnitude smaller than or equal to $\eta$. Thus, $\rho(\mathcal{H}) = \eta$, and thereby (2.12) follows. The second equality (2.13) is obtained by application of the maximum principle. □

In $L_2(0,\infty)$, an analogous result holds for the norm.
Lemma 2.2.4 Consider $\mathcal{H} = H_\infty + \mathcal{H}_c$ as an operator in $L_2(0,\infty)$, and suppose $h_c \in L_1(0,\infty)$. Then,
\[
\|\mathcal{H}\|_{L_2(0,\infty)} = \sup_{\operatorname{Re}(z) \ge 0} \|H(z)\| \tag{2.16}
\]
\[
\phantom{\|\mathcal{H}\|_{L_2(0,\infty)}} = \sup_{\xi \in \mathbb{R}} \|H(i\xi)\|, \tag{2.17}
\]
where $H(z) = H_\infty + H_c(z)$, $H_c(z)$ denotes the Laplace transform of $h_c$ and $\|\cdot\|$ denotes the matrix norm induced by the standard Euclidean vector norm.

Proof. This result is a consequence of Parseval's formula, see, e.g., [7, p. 8]. □
Remark 2.2.1 Consider $\mathcal{H}$ as an operator in $C[0,T]$. From (2.15), we derive
\[
\rho(\mathcal{H}) = \rho(H_\infty) = \rho(H(\infty)),
\]
which means that the spectral radius of $\mathcal{H}$ on finite time intervals is smaller than the spectral radius of $\mathcal{H}$ on infinite time intervals.
2.2.2 The discrete-time case
In [64], Miekkala and Nevanlinna first considered the discrete-time variants of (1.8), obtained by discretising the latter WR schemes in time using a linear multistep method. They represented the continuous-time waveforms discretely as vectors defined on $N$ (possibly infinite) successive time levels, i.e., $u_\tau = \{u[n]\}_{n=0}^{N-1}$, and rewrote the resulting WR iterations in explicit form as
\[
u_\tau^{(\nu)} = \mathcal{H}_\tau u_\tau^{(\nu-1)} + \varphi_\tau. \tag{2.18}
\]
Here, $u^{(\nu)}[n]$ is the approximation of $u^{(\nu)}(t)$ at $t = n\tau$, and $\tau$ denotes the chosen time step. The operator $\mathcal{H}_\tau$ was proven to be a linear discrete convolution operator with matrix-valued kernel $h_\tau$,
\[
(\mathcal{H}_\tau x_\tau)[n] = (h_\tau * x_\tau)[n] := \sum_{m=0}^{n} h_\tau[n-m]\, x_\tau[m], \tag{2.19}
\]
whose spectral radius was investigated both in the spaces $l_p(N)$ (finite time interval) and $l_p(\infty)$ (infinite time interval). In this thesis, several discrete-time WR methods for the more general system (1.10) will be studied theoretically. We will show that the resulting iterations can also be written in explicit form (2.18), with $\mathcal{H}_\tau$ an operator of type (2.19). Therefore, we rephrase the spectral properties of such operators, given in [64], in a more general context.
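As an illustration (not part of the original analysis), the action (2.19) of such a discrete convolution operator can be sketched in a few lines of Python with NumPy; the helper name and the test data are ours:

```python
import numpy as np

def conv_apply(h, x):
    """Apply (H_tau x_tau)[n] = sum_{m=0}^{n} h[n-m] @ x[m], cf. (2.19).

    h: matrix-valued kernel of shape (N, d, d); x: sequence of shape (N, d).
    """
    N = len(x)
    y = np.zeros_like(x, dtype=float)
    for n in range(N):
        for m in range(n + 1):
            y[n] += h[n - m] @ x[m]
    return y
```

For the scalar kernel that is identically 1, the operator computes running sums of the input sequence.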
Finite time intervals

Lemma 2.2.5 Consider $\mathcal{H}_\tau$ as an operator in $l_p(N)$ with $1 \le p \le \infty$ and $N$ finite. Then, $\mathcal{H}_\tau$ is a bounded operator and
\[
\rho(\mathcal{H}_\tau) = \rho(h_\tau[0]) = \rho(H_\tau(\infty)), \tag{2.20}
\]
where $H_\tau(z) = \mathcal{Z}(h_\tau) := \sum_{n=0}^{N-1} h_\tau[n]\, z^{-n}$ denotes the discrete Laplace transform or $\mathcal{Z}$-transform of $h_\tau$.
Proof. Since $\mathcal{H}_\tau$ is a linear operator in a finite-dimensional space, boundedness of $\mathcal{H}_\tau$ follows. The operation $\mathcal{H}_\tau x_\tau$ can be represented in standard linear algebra notation as a matrix–vector product,
\[
\begin{bmatrix}
h_\tau[0] & & & \\
h_\tau[1] & h_\tau[0] & & \\
h_\tau[2] & h_\tau[1] & h_\tau[0] & \\
\vdots & \ddots & \ddots & \ddots \\
h_\tau[N-1] & \cdots & h_\tau[1] & h_\tau[0]
\end{bmatrix}
\begin{bmatrix}
x_\tau[0] \\ x_\tau[1] \\ \vdots \\ x_\tau[N-1]
\end{bmatrix}. \tag{2.21}
\]
The spectral radius of operator $\mathcal{H}_\tau$ equals the spectral radius of the $N \times N$ lower-triangular block-Toeplitz matrix in (2.21). By consequence, $\rho(\mathcal{H}_\tau) = \rho(h_\tau[0])$. The second equality follows immediately. □
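A hedged numerical illustration of this proof (our own Python/NumPy sketch): assemble the lower-triangular block-Toeplitz matrix of (2.21) and compare its spectral radius with $\rho(h_\tau[0])$; the two agree because the eigenvalues of a block-triangular matrix are those of its diagonal blocks.

```python
import numpy as np

def toeplitz_matrix(h):
    """Assemble the Nd x Nd lower-triangular block-Toeplitz matrix of (2.21)."""
    N, d, _ = h.shape
    T = np.zeros((N * d, N * d))
    for i in range(N):
        for j in range(i + 1):
            T[i*d:(i+1)*d, j*d:(j+1)*d] = h[i - j]
    return T

def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))

rng = np.random.default_rng(0)
h = rng.standard_normal((3, 2, 2))   # random kernel: N = 3 time steps, d = 2
T = toeplitz_matrix(h)
# the eigenvalues of T are those of h[0], each with multiplicity N, so the
# two spectral radii agree (up to roundoff for the multiple eigenvalues)
gap = abs(spectral_radius(T) - spectral_radius(h[0]))
```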
Infinite time intervals

Lemma 2.2.6 Consider $\mathcal{H}_\tau$ as an operator in $l_p(\infty)$ with $1 \le p \le \infty$, and suppose $h_\tau \in l_1(\infty)$. Then, $\mathcal{H}_\tau$ is bounded and
\[
\rho(\mathcal{H}_\tau) = \max_{|z| \ge 1} \rho(H_\tau(z)) \tag{2.22}
\]
\[
\phantom{\rho(\mathcal{H}_\tau)} = \max_{|z| = 1} \rho(H_\tau(z)), \tag{2.23}
\]
where $H_\tau(z) = \mathcal{Z}(h_\tau) := \sum_{n=0}^{\infty} h_\tau[n]\, z^{-n}$ denotes the discrete Laplace transform or $\mathcal{Z}$-transform of $h_\tau$.
The outline of our proof is very similar to the one given in [64, Thm. 3.1]. Yet, here, it is phrased in terms of general convolution operators. A similar line of arguments is implied in the proof of [55, Prop. 9]. The proof is based on the discrete version of the Paley–Wiener Theorem [53]. This theorem states that the solution of a discrete Volterra convolution equation $x_\tau + k_\tau * x_\tau = f_\tau$ with $f_\tau \in l_p(\infty)$ and $k_\tau \in l_1(\infty)$ is bounded in $l_p(\infty)$ if and only if $\det(I + K_\tau(z)) \neq 0$ for $|z| \ge 1$, with $K_\tau(z)$ the $\mathcal{Z}$-transform of $k_\tau$.

Proof. The boundedness of $\mathcal{H}_\tau$ follows from the fact that $l_1 * l_p \subseteq l_p$. Indeed, applying Young's inequality for discrete convolution products [26, p. 198] yields
\[
\|\mathcal{H}_\tau x_\tau\|_{l_p(\infty)} \le \|h_\tau\|_{l_1(\infty)}\, \|x_\tau\|_{l_p(\infty)}.
\]
Property 2.1.2 characterises the spectral radius of $\mathcal{H}_\tau$ as the smallest value of $\mu$ for which $|\lambda| > \mu$ implies that $\lambda I - \mathcal{H}_\tau$ has a bounded inverse in $l_p(\infty)$. Consider $\lambda x_\tau - \mathcal{H}_\tau x_\tau = \lambda x_\tau - h_\tau * x_\tau = f_\tau$ with $f_\tau \in l_p(\infty)$. Suppose $\lambda \neq 0$; then this can be rewritten as a convolution equation
\[
x_\tau - \tfrac{1}{\lambda}\, h_\tau * x_\tau = \tfrac{1}{\lambda}\, f_\tau.
\]
By the Paley–Wiener Theorem, it follows that $x_\tau$ is bounded if and only if
\[
\det\left(I - \tfrac{1}{\lambda} H_\tau(z)\right) \neq 0 \quad \text{for } |z| \ge 1,
\]
or, equivalently,
\[
\rho(\mathcal{H}_\tau) = \sup_{|z| \ge 1} \rho(H_\tau(z)).
\]
Note that $H_\tau(z)$ is analytic for $|z| > 1$, including $z = \infty$, and, since $h_\tau \in l_1(\infty)$, it is continuous for $|z| \ge 1$. Also, the spectral radius satisfies the maximum principle. Hence, we obtain (2.22) and (2.23). □
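Relation (2.23) also suggests a simple numerical estimate: sample the symbol on the unit circle. A sketch under our own naming (Python/NumPy; not thesis code):

```python
import numpy as np

def rho_symbol_max(h, samples=720):
    """Estimate rho(H_tau) = max_{|z|=1} rho(H_tau(z)), cf. (2.23).

    h: kernel of shape (N, d, d); H_tau(z) = sum_n h[n] z^{-n}.
    """
    best = 0.0
    for theta in np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False):
        z = np.exp(1j * theta)
        Hz = sum(h[n] * z ** (-n) for n in range(len(h)))
        best = max(best, max(abs(np.linalg.eigvals(Hz))))
    return best
```

For the scalar kernel $h_\tau = (1/2,\, 1/4)$, the symbol is $H_\tau(z) = 1/2 + 1/(4z)$ and the maximum $3/4$ is attained at $z = 1$.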
Remark 2.2.2 In the case of $d = 1$, this lemma corresponds to a well-known spectral property of semi-infinite Toeplitz operators [79, Thm. 2.1].

In $l_2(\infty)$, an analogous result holds for the norm.
Lemma 2.2.7 Consider $\mathcal{H}_\tau$ as an operator in $l_2(\infty)$, and suppose $h_\tau \in l_1(\infty)$. Then,
\[
\|\mathcal{H}_\tau\|_{l_2(\infty)} = \max_{|z| \ge 1} \|H_\tau(z)\| \tag{2.24}
\]
\[
\phantom{\|\mathcal{H}_\tau\|_{l_2(\infty)}} = \max_{|z| = 1} \|H_\tau(z)\|, \tag{2.25}
\]
where $H_\tau(z)$ denotes the discrete Laplace transform or $\mathcal{Z}$-transform of $h_\tau$ and $\|\cdot\|$ denotes the matrix norm induced by the standard Euclidean vector norm.

Proof. The proof is based on Parseval's relation for vector-valued $l_2$-sequences,
\[
\|x_\tau\|_{l_2(\infty)} = \|\tilde{x}(z)\|_{H_2},
\]
where $\tilde{x}(z) = \mathcal{Z}(x_\tau)$ denotes the $\mathcal{Z}$-transform of $x_\tau$ and $\|\cdot\|_{H_2}$ is the norm in the Hardy–Lebesgue space of square integrable functions analytic outside the unit disk,
\[
\|\tilde{x}(z)\|_{H_2} = \sup_{r > 1} \left( \frac{1}{2\pi} \int_0^{2\pi} \|\tilde{x}(r e^{i\theta})\|^2\, d\theta \right)^{1/2}.
\]
The Parseval relation for the scalar case can be found, e.g., in [105, p. 41]. By definition of the operator norm and by Parseval's relation, we have
\[
\|\mathcal{H}_\tau\|_{l_2(\infty)} = \sup \frac{\|\mathcal{H}_\tau x_\tau\|_{l_2(\infty)}}{\|x_\tau\|_{l_2(\infty)}} = \sup \frac{\|H_\tau(z)\,\tilde{x}(z)\|_{H_2}}{\|\tilde{x}(z)\|_{H_2}}.
\]
The latter can be seen to be equal to $\sup_{|z| \ge 1} \|H_\tau(z)\|$ (for the technical details of this last step, we refer to the proof of a very similar theorem [7, Thm. 2.2], which deals with operator norms of Fourier multipliers). Consideration of the analyticity and continuity of $H_\tau(z)$ leads to (2.24) and (2.25). □
Remark 2.2.3 As in the continuous-time case, it follows from (2.20) and (2.22) that the spectral radius of $\mathcal{H}_\tau$ on finite time intervals is smaller than the spectral radius of $\mathcal{H}_\tau$ on infinite time intervals.
Chapter 3

Basic Waveform Relaxation Methods

We define the basic WR methods for general, linear systems of ODEs and perform a detailed analysis of the methods for systems with constant coefficients. The emphasis lies on the Jacobi and Gauss–Seidel waveform variants, the convergence results of which are illustrated and verified by means of a semi-discretised model parabolic initial-boundary-value problem.
3.1 Description of the method

In this section, we shall describe the basic continuous-time WR schemes and give some examples. We also present the resulting discrete-time variants, derived by using linear multistep formulae for time integration.
3.1.1 The continuous-time case
Consider a general, linear initial-value problem
\[
B\dot{u} + Au = f, \quad u(0) = u_0, \quad t > 0, \tag{3.1}
\]
where $B, A \in \mathbb{C}^{d \times d}$ and $B$ is assumed to be nonsingular. For such problems, the basic waveform relaxation method can be defined in terms of the matrix splittings
\[
B = M_B - N_B \quad \text{and} \quad A = M_A - N_A,
\]
where $M_B$ is assumed to be invertible. More precisely, these splittings allow us to rewrite the original ODE system in the equivalent form
\[
M_B \dot{u} + M_A u = N_B \dot{u} + N_A u + f,
\]
leading in a natural way to the WR iteration scheme
\[
M_B \dot{u}^{(\nu)} + M_A u^{(\nu)} = N_B \dot{u}^{(\nu-1)} + N_A u^{(\nu-1)} + f, \quad u^{(\nu)}(0) = u_0. \tag{3.2}
\]
As noted before, the initial approximation $u^{(0)}$ is usually chosen equal to $u_0$ for $t > 0$. The convergence behaviour and the computational complexity of the above iteration obviously depend on the nature of the splitting matrices. A natural choice is to split the
CHAPTER 3. BASIC WR METHODS
matrices $B$ and $A$ in an identical way, similar to the splittings $(M_A, N_A)$ that are typically used when solving linear systems of the form $Au = f$ by standard relaxation methods,
\[
M_A u^{(\nu)} = N_A u^{(\nu-1)} + f \quad \text{or} \quad u^{(\nu)} = M_A^{-1} N_A u^{(\nu-1)} + M_A^{-1} f, \tag{3.3}
\]
see [4, 24, 98, 106]. In Table 3.1 several WR variants for (3.1) are defined by listing some classical splittings of $B$ and $A$ for use with (3.2). Here, $B$ and $A$ are decomposed as in Section 1.2.1, that is, $B = D_B - L_B - U_B$ and $A = D_A - L_A - U_A$, with $D_B$ and $D_A$ diagonal, $L_B$ and $L_A$ strictly lower-triangular and $U_B$ and $U_A$ strictly upper-triangular matrices. It is straightforward to verify that the WR methods with Jacobi and Gauss–Seidel matrix splittings are equivalent to the more general iteration schemes (1.2) and (1.3), applied to the linear ODE system (3.1).

| WR variant   | $M_B$                         | $N_B$                                    | $M_A$                         | $N_A$                                    |
|--------------|-------------------------------|------------------------------------------|-------------------------------|------------------------------------------|
| Richardson   | $\frac{1}{\omega}I$           | $\frac{1}{\omega}I - B$                  | $\frac{1}{\omega}I$           | $\frac{1}{\omega}I - A$                  |
| Jacobi       | $D_B$                         | $L_B + U_B$                              | $D_A$                         | $L_A + U_A$                              |
| Gauss–Seidel | $D_B - L_B$                   | $U_B$                                    | $D_A - L_A$                   | $U_A$                                    |
| SOR          | $\frac{1}{\omega}D_B - L_B$   | $\frac{1-\omega}{\omega}D_B + U_B$       | $\frac{1}{\omega}D_A - L_A$   | $\frac{1-\omega}{\omega}D_A + U_A$       |

Table 3.1: Several splittings $(M_B, N_B)$ and $(M_A, N_A)$ for use with (3.2).

Remark 3.1.1 The WR methods of type (1.8) for ODE systems (3.1) with $B = I$ require only a splitting of the matrix $A$. Yet, they can also be written and analysed in terms of the general iteration (3.2) by setting $M_B = I$ and $N_B = 0$.

Remark 3.1.2 If, for ODE systems (3.1) with $B = I$, we take $M_B = I$, $N_B = 0$, $M_A = 0$ and $N_A = -A$, then (3.2) corresponds to the Picard iteration.
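The splittings of Table 3.1 are mechanical to set up. A minimal Python/NumPy sketch of our own (function name and structure are illustrative, not from the thesis), using the decomposition $M = D - L - U$ into diagonal, strictly lower and strictly upper parts:

```python
import numpy as np

def splitting(M, variant, omega=1.0):
    """Return (M_part, N_part) with M = M_part - N_part, as in Table 3.1."""
    D = np.diag(np.diag(M))
    L = -np.tril(M, -1)                  # M = D - L - U
    I = np.eye(M.shape[0])
    M_part = {
        "richardson":   I / omega,
        "jacobi":       D,
        "gauss-seidel": D - L,
        "sor":          D / omega - L,
    }[variant]
    return M_part, M_part - M            # N_part follows from M = M_part - N_part
```

Applying this helper to both $B$ and $A$ yields the four WR variants of the table.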
3.1.2 The discrete-time case
We shall concentrate on discrete-time WR schemes for (3.1), obtained by discretising their continuous-time equivalents (3.2) using linear multistep formulae. For the use of Runge–Kutta integration methods in combination with WR, we refer to [8, §8.3] and the references cited therein. We first recall the definition of a constant-step-size linear multistep formula for calculating the solution to the ODE (1.4), see, e.g., [48, p. 11],
\[
\frac{1}{\tau}\sum_{l=0}^{k} \alpha_l\, u[n+l] = \sum_{l=0}^{k} \beta_l\, f[n+l], \quad n \ge 0. \tag{3.4}
\]
Here, $\alpha_l$ and $\beta_l$ are real constants, $\tau$ denotes the step size, $u[l]$ approximates the ODE solution at $t = l\tau$ and $f[l] = f(l\tau, u[l])$. We shall assume that $k$ starting values $u[0], u[1], \ldots, u[k-1]$ are given.

Definition 3.1.1 The characteristic polynomials of the linear multistep method are given by
\[
a(z) := \sum_{l=0}^{k} \alpha_l z^l \quad \text{and} \quad b(z) := \sum_{l=0}^{k} \beta_l z^l. \tag{3.5}
\]
3.1. DESCRIPTION OF THE METHOD
Throughout this thesis we adhere to some common assumptions: the linear multistep method is irreducible: $a(z)$ and $b(z)$ have no common roots; the linear multistep method is consistent: $a(1) = 0$ and $\dot{a}(1) = b(1)$; the linear multistep method is zero-stable: all roots of $a(z)$ are inside the closed unit disk and every root with modulus one is simple. When 1 is the only (simple) root of $a(z)$ on the unit circle, the method is called strictly stable. For future reference, we also define the stability region of a linear multistep method, as well as the related notion of $A(\alpha)$-stability [25, 48].

Definition 3.1.2 The stability region $S$ consists of those $\mu \in \mathbb{C}$ for which the polynomial $a(z) - \mu\, b(z)$ (around $\mu = \infty$: $\frac{1}{\mu}a(z) - b(z)$) satisfies the root condition: all roots $z_j$ satisfy $|z_j| \le 1$ and those of modulus one are simple.

Definition 3.1.3 A multistep method is called

i) $A(\alpha)$-stable, $0 < \alpha < \frac{\pi}{2}$, if $S \supseteq \Sigma_\alpha := \{z : |\operatorname{Arg}(-z)| < \alpha,\ z \neq 0\}$;

ii) $A$-stable if $S$ contains the left-half complex plane $\{z : \operatorname{Re}(z) < 0\}$.
Example 3.1.1 We give in the table below the coefficients of the backward differentiation formulae (BDF) of order one to five [48, p. 242].

| order $k$ | $\beta_k$          | $\alpha_k$ | $\alpha_{k-1}$        | $\alpha_{k-2}$       | $\alpha_{k-3}$        | $\alpha_{k-4}$      | $\alpha_{k-5}$       |
|-----------|--------------------|------------|-----------------------|----------------------|-----------------------|---------------------|----------------------|
| 1         | $1$                | $1$        | $-1$                  |                      |                       |                     |                      |
| 2         | $\frac{2}{3}$      | $1$        | $-\frac{4}{3}$        | $\frac{1}{3}$        |                       |                     |                      |
| 3         | $\frac{6}{11}$     | $1$        | $-\frac{18}{11}$      | $\frac{9}{11}$       | $-\frac{2}{11}$       |                     |                      |
| 4         | $\frac{12}{25}$    | $1$        | $-\frac{48}{25}$      | $\frac{36}{25}$      | $-\frac{16}{25}$      | $\frac{3}{25}$      |                      |
| 5         | $\frac{60}{137}$   | $1$        | $-\frac{300}{137}$    | $\frac{300}{137}$    | $-\frac{200}{137}$    | $\frac{75}{137}$    | $-\frac{12}{137}$    |

Table 3.2: Coefficients of the BDF methods of order one to five.

We have plotted the boundaries of their stability regions in Figure 3.1. Another integration method which will be used frequently further on is the trapezoidal rule, also known (mainly in the context of solving PDEs) as the Crank–Nicolson (CN) method. The method, which is defined by
\[
\frac{1}{\tau}\left(u[n+1] - u[n]\right) = \frac{1}{2} f[n+1] + \frac{1}{2} f[n],
\]
is a second-order method whose stability region equals the left-half complex plane.

Application of linear multistep formula (3.4) to the continuous-time iteration scheme (3.2) leads to
\[
\frac{1}{\tau}\sum_{l=0}^{k} \alpha_l M_B u^{(\nu)}[n+l] + \sum_{l=0}^{k} \beta_l M_A u^{(\nu)}[n+l]
= \frac{1}{\tau}\sum_{l=0}^{k} \alpha_l N_B u^{(\nu-1)}[n+l] + \sum_{l=0}^{k} \beta_l N_A u^{(\nu-1)}[n+l] + \sum_{l=0}^{k} \beta_l f[n+l]. \tag{3.6}
\]
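The stability region boundaries of Figure 3.1 can be traced with the standard boundary-locus technique: for $z = e^{i\theta}$ on the unit circle, $\mu = a(z)/b(z)$ lies on $\partial S$. A sketch in Python/NumPy (the coefficient dictionary and helper are ours; coefficients are listed from $l = 0$ up to $l = k$, i.e., in the reverse order of Table 3.2):

```python
import numpy as np

# (alpha_l, beta_l), l = 0..k, for BDF(1) and BDF(2), cf. (3.5) and Table 3.2
BDF = {
    1: ([-1.0, 1.0],      [0.0, 1.0]),
    2: ([1/3, -4/3, 1.0], [0.0, 0.0, 2/3]),
}

def boundary_locus(order, samples=360):
    """Return points mu = a(z)/b(z), z = e^{i theta}, on the stability boundary."""
    alpha, beta = BDF[order]
    z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False))
    a = sum(al * z ** l for l, al in enumerate(alpha))
    b = sum(bl * z ** l for l, bl in enumerate(beta))
    return a / b
```

For BDF(1), $\mu = (z-1)/z = 1 - 1/z$, so the traced boundary is the circle $|\mu - 1| = 1$.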
Figure 3.1: Stability region boundaries of the BDF methods of order one to five.

We do not iterate on the $k$ given starting values, i.e., $u^{(\nu)}[n] = u^{(\nu-1)}[n] = u[n]$ for $n < k$. In the remainder of the text we shall concentrate on the use of implicit methods, i.e., $\beta_k \neq 0$. Equation (3.6) can then be solved uniquely for every $n$ if and only if $\frac{\alpha_k}{\tau} M_B + \beta_k M_A$ is invertible, or,
\[
\frac{\alpha_k}{\tau \beta_k} \notin \sigma\!\left(-M_B^{-1} M_A\right). \tag{3.7}
\]
Further on we shall refer to this condition as the discrete solvability condition.
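Condition (3.7) is straightforward to verify numerically before starting a discrete-time WR iteration; a hedged sketch (our helper name, Python/NumPy):

```python
import numpy as np

def discretely_solvable(MB, MA, tau, alpha_k, beta_k):
    """Check the discrete solvability condition (3.7):
    alpha_k / (tau beta_k) must not be an eigenvalue of -MB^{-1} MA."""
    lam = alpha_k / (tau * beta_k)
    eigs = np.linalg.eigvals(-np.linalg.solve(MB, MA))
    return not np.any(np.isclose(eigs, lam))
```

For the implicit Euler rule ($\alpha_1 = 1$, $\beta_1 = 1$) and $M_B = M_A = I$, the condition holds for every step size $\tau > 0$, since the eigenvalues of $-M_B^{-1}M_A$ are negative.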
3.2 Convergence analysis

We shall study the convergence behaviour of WR methods by using the general framework of Chapter 2. We will generalise Miekkala's and Nevanlinna's convergence results for the case $B = I$ [63, 64] towards general ODE systems (3.1). More precisely, we rewrite the continuous-time and discrete-time WR iterative schemes (3.2) and (3.6) in explicit form, and analyse the spectral properties of the resulting operators on finite and infinite time intervals. We derive some specific results for the Jacobi and Gauss–Seidel WR methods, and compare the continuous-time results with their discrete-time counterparts.
3.2.1 The continuous-time case
The waveform relaxation operator and its symbol
After multiplying the left-hand and right-hand sides of (3.1) with $B^{-1}$, we can apply the general solution formula for linear ODE systems of the form (1.7) [15, p. 119]. As such, the solution to (3.1) is formally given by
\[
u(t) = e^{-B^{-1}A t}\, u_0 + \int_0^t e^{B^{-1}A (s-t)}\, B^{-1} f(s)\, ds. \tag{3.8}
\]
3.2. CONVERGENCE ANALYSIS
We can then rewrite iteration (3.2) as an explicit successive approximation scheme
\[
u^{(\nu)} = \mathcal{K} u^{(\nu-1)} + \varphi, \tag{3.9}
\]
with the continuous-time waveform relaxation operator $\mathcal{K}$ and the right-hand side function $\varphi$ given by
\[
\mathcal{K} = M_B^{-1} N_B + \mathcal{K}_c \tag{3.10}
\]
and
\[
\varphi(t) = e^{-M_B^{-1} M_A t}\,\left(I - M_B^{-1} N_B\right) u_0 + \int_0^t e^{M_B^{-1} M_A (s-t)}\, M_B^{-1} f(s)\, ds.
\]
Here, $\mathcal{K}_c$ is a linear Volterra convolution operator with matrix-valued kernel
\[
k_c(t) = e^{-M_B^{-1} M_A t}\, M_B^{-1}\left(N_A - M_A M_B^{-1} N_B\right). \tag{3.11}
\]
Let $e^{(\nu)}$ be the error of the $\nu$-th WR iterate, i.e., $e^{(\nu)} := u^{(\nu)} - u$. It satisfies $e^{(\nu)} = \mathcal{K} e^{(\nu-1)}$. That is, it is the solution to the differential equation
\[
M_B \dot{e}^{(\nu)} + M_A e^{(\nu)} = N_B \dot{e}^{(\nu-1)} + N_A e^{(\nu-1)}, \quad e^{(\nu)}(0) = 0. \tag{3.12}
\]
If $\tilde{e}^{(\nu)}(z) = \mathcal{L}(e^{(\nu)}(t))$ denotes the Laplace transform of $e^{(\nu)}$, then we get by Laplace transforming (3.12) that
\[
\tilde{e}^{(\nu)}(z) = K(z)\, \tilde{e}^{(\nu-1)}(z), \tag{3.13}
\]
with
\[
K(z) = (zM_B + M_A)^{-1} (zN_B + N_A). \tag{3.14}
\]
We shall hereafter refer to $K(z)$ as the continuous-time dynamic iteration matrix or symbol of operator $\mathcal{K}$. This symbol can be rewritten as $K(z) = M_B^{-1}N_B + K_c(z)$, with $K_c(z) = (zM_B + M_A)^{-1}(N_A - M_A M_B^{-1} N_B)$ the Laplace transform of the kernel $k_c$.

Remark 3.2.1 For the Jacobi or Gauss–Seidel methods, $K(z)$ equals the iteration matrix of the corresponding static relaxation method for the linear system
\[
(zB + A)\, \tilde{u}(z) = \tilde{f}(z) + B u_0, \tag{3.15}
\]
obtained by Laplace transforming (3.1).

Remark 3.2.2 The analogue of the statement of Remark 3.2.1 is not necessarily valid for other WR solvers, an easy counterexample being the Richardson method.

Remark 3.2.3 Independent of the choice of the matrix splittings, $K(0) = M_A^{-1} N_A$ always equals the iteration matrix of the static iteration (3.3) for $Au = f$.
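The symbol (3.14) is cheap to evaluate numerically. A sketch of our own (Python/NumPy; names are illustrative) that also exercises Remark 3.2.3, $K(0) = M_A^{-1}N_A$, for the Jacobi splitting of a small model matrix:

```python
import numpy as np

def symbol(z, MB, NB, MA, NA):
    """Continuous-time dynamic iteration matrix K(z) = (z MB + MA)^{-1} (z NB + NA)."""
    return np.linalg.solve(z * MB + MA, z * NB + NA)

# Jacobi splitting of A = [[2, -1], [-1, 2]] with B = I (so MB = I, NB = 0):
MB, NB = np.eye(2), np.zeros((2, 2))
MA, NA = np.diag([2.0, 2.0]), np.array([[0.0, 1.0], [1.0, 0.0]])
K0 = symbol(0.0, MB, NB, MA, NA)
print(np.allclose(K0, NA / 2.0))   # K(0) = MA^{-1} NA, the static Jacobi matrix -> True
```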
Convergence on finite time intervals

The spectral radius of the WR operator on finite time intervals is known to be zero when $B$ is the identity matrix, i.e., the resulting WR method converges in a superlinear way [63, p. 461]. Below, we generalise this result towards ODE systems of type (3.1) with general, nonsingular $B$. In that case, the asymptotic convergence of the WR method turns out to be linear, and solely depends on the splitting $(M_B, N_B)$.

Theorem 3.2.1 Consider $\mathcal{K}$ as an operator in $C[0,T]$. Then, $\mathcal{K}$ is bounded and
\[
\rho(\mathcal{K}) = \rho\!\left(M_B^{-1} N_B\right). \tag{3.16}
\]
Proof. Since $k_c \in C[0,T]$, the theorem follows immediately from Lemma 2.2.1. □
Convergence on infinite time intervals

The solution to (3.1), given by (3.8), is obviously bounded if and only if all eigenvalues of $B^{-1}A$ have positive real parts. (We assume the right-hand side function $f$ to be in $L_p(0,\infty)$.) The following lemma deals with the boundedness of the WR operator.

Lemma 3.2.2 If all eigenvalues of $B^{-1}A$ have positive real parts, then the following statements are equivalent:

i) $\mathcal{K}$ is a bounded operator in $L_p(0,\infty)$ with $1 \le p \le \infty$;

ii) all eigenvalues of $M_B^{-1} M_A$ have positive real parts.

Proof. The lemma is a direct consequence of [62, Thm. 1]. □

The converse of the lemma is as follows.

Lemma 3.2.3 Consider $\mathcal{K}$ as an operator in $L_p(0,\infty)$ with $1 \le p \le \infty$, and assume all eigenvalues of $M_B^{-1}M_A$ have positive real parts. If $\rho(\mathcal{K}) < 1$, then all eigenvalues of $B^{-1}A$ also have positive real parts.

Proof. By inspection of (3.10) and (3.11) we can conclude that, under the assumption of the lemma, $\mathcal{K}$ is bounded in $L_p(0,\infty)$, and, therefore, $\rho(\mathcal{K}) < \infty$. If $\rho(\mathcal{K}) < 1$, then the WR iteration is a convergent successive approximation scheme in $L_p(0,\infty)$. Its fixed point satisfies (3.1) and is therefore given by (3.8). Hence, since the fixed point is in $L_p(0,\infty)$, all eigenvalues of $B^{-1}A$ must have positive real parts. □
Theorem 3.2.4 Consider $\mathcal{K}$ as an operator in $L_p(0,\infty)$ with $1 \le p \le \infty$, and assume all eigenvalues of $M_B^{-1} M_A$ have positive real parts. Then,
\[
\rho(\mathcal{K}) = \sup_{\operatorname{Re}(z) \ge 0} \rho(K(z)) \tag{3.17}
\]
\[
\phantom{\rho(\mathcal{K})} = \sup_{\xi \in \mathbb{R}} \rho(K(i\xi)). \tag{3.18}
\]
This theorem is a special case of [62, Thm. 2]. Here, we prefer to deduce the theorem from Lemma 2.2.3, i.e., we give a proof based on the Paley–Wiener Theorem.

Proof. Notice that $k_c \in L_1(0,\infty)$ since all eigenvalues of $M_B^{-1} M_A$ have positive real parts. As such we have verified the assumptions of the lemma, and the result follows. □

For a better understanding of the theorem, we recall relation (3.13) between the Laplace transforms of successive errors. Asymptotically, any "frequency" component $\tilde{e}^{(0)}(i\xi)$ of the initial error converges with the corresponding convergence factor $\rho(K(i\xi))$. According to (3.18), the spectral radius $\rho(\mathcal{K})$ of the WR operator equals the supremum of these factors, taken over all frequencies $\xi$. Hence, the asymptotic convergence behaviour of operator $\mathcal{K}$ is determined by the slowest converging frequency component of the initial error.
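In practice, the supremum in (3.18) can be estimated by sampling the symbol along the imaginary axis; a hedged Python/NumPy sketch (our function name):

```python
import numpy as np

def rho_wr(MB, NB, MA, NA, xis):
    """Estimate rho(K) = sup_xi rho(K(i xi)), cf. (3.18), on a frequency grid."""
    rho = 0.0
    for xi in xis:
        K = np.linalg.solve(1j * xi * MB + MA, 1j * xi * NB + NA)
        rho = max(rho, max(abs(np.linalg.eigvals(K))))
    return rho
```

For the Jacobi splitting of $A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ with $B = I$, the supremum is attained at $\xi = 0$ and coincides with the static Jacobi spectral radius $1/2$, in line with Remark 3.2.4 below.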
Remark 3.2.4 Observe also that $\rho(\mathcal{K}) \ge \rho(K(0)) = \rho(M_A^{-1} N_A)$. As such, WR methods never converge faster than the corresponding static iteration (3.3); the spectral radii of the static and dynamic iterations are equal when the supremum in (3.17)–(3.18) is attained at $z = i\xi = 0$.

Setting $\mathcal{H} = \mathcal{K}$ in Lemma 2.2.4 yields the norm equivalent of Theorem 3.2.4.

Theorem 3.2.5 Consider $\mathcal{K}$ as an operator in $L_2(0,\infty)$, and assume all eigenvalues of $M_B^{-1} M_A$ have positive real parts. Then,
\[
\|\mathcal{K}\|_{L_2(0,\infty)} = \sup_{\operatorname{Re}(z) \ge 0} \|K(z)\| \tag{3.19}
\]
\[
\phantom{\|\mathcal{K}\|_{L_2(0,\infty)}} = \sup_{\xi \in \mathbb{R}} \|K(i\xi)\|, \tag{3.20}
\]
where $\|\cdot\|$ denotes the matrix norm induced by the standard Euclidean vector norm.
Remark 3.2.5 If not all eigenvalues of $M_B^{-1}M_A$ have positive real parts, we can still derive a similar result by switching to an exponentially weighted $L_p$-space [63, p. 463]. Assuming that for all eigenvalues $\lambda_i$ of $M_B^{-1} M_A$ it holds that $\operatorname{Re}(\lambda_i) + \sigma > 0$, we consider the Banach space $L_p^\sigma(0,\infty)$, whose norm is defined as
\[
\|x\|_{L_p^\sigma(0,\infty)} := \left\|e^{-\sigma t} x(t)\right\|_{L_p(0,\infty)}. \tag{3.21}
\]
With this change of norm, both Theorems 3.2.4 and 3.2.5 apply with the supremum taken over $\operatorname{Re}(z) \ge \sigma$, or, after application of the maximum principle, over the line $z = \sigma + i\xi$.

Remark 3.2.6 From an analysis in exponentially weighted spaces, one may derive the length of a time window $[0, T_r]$ wherein convergence is approximately geometric with a given rate of decay $r$. This is especially interesting when $\mathcal{K}$ is unbounded or divergent in $L_p(0,\infty)$. In particular, Leimkuhler shows that $T_r = 1/\sigma_r$, provided that
\[
\sigma_r = \inf\left\{\sigma : \mathcal{K} \text{ is bounded in } L_p^\sigma(0,\infty) \text{ and } \rho_\sigma(\mathcal{K}) < r\right\}
\]
exists, with $\rho_\sigma(\mathcal{K})$ the spectral radius of $\mathcal{K}$ in $L_p^\sigma(0,\infty)$ [49, §2.1]. That is,
\[
\left\|\mathcal{K}^\nu e^{(0)}\right\|_{C[0,T_r]} \lesssim e\, r^\nu \left\|e^{(0)}\right\|_{C[0,T_r]},
\]
where the symbol "$\lesssim$" means "approximately bounded by". For obvious reasons, the latter author refers to $[0, T_1]$ as the window of (stable) convergence.
Jacobi and Gauss–Seidel results

The above framework to analyse general WR methods of the form (3.2) will be used in this section to derive some specific results for the well-known Jacobi and Gauss–Seidel variants. Let, in the following, $\mathcal{K}^{\mathrm{JAC}}$ and $\mathcal{K}^{\mathrm{GS}}$ denote the Jacobi and Gauss–Seidel WR operators. For completeness, we recall their dynamic iteration matrices
\[
K^{\mathrm{JAC}}(z) = (zI + D_A)^{-1}(L_A + U_A), \qquad K^{\mathrm{GS}}(z) = (zI + (D_A - L_A))^{-1} U_A,
\]
in the case $B = I$, and
\[
K^{\mathrm{JAC}}(z) = (zD_B + D_A)^{-1}\left(z(L_B + U_B) + (L_A + U_A)\right), \tag{3.22}
\]
\[
K^{\mathrm{GS}}(z) = \left(z(D_B - L_B) + (D_A - L_A)\right)^{-1}(zU_B + U_A),
\]
for general $B$.

For ODE systems (3.1) with $B = I$, the Gauss–Seidel WR method is known to be superlinearly convergent on finite time intervals. For general ODE systems, matrix $B$ has to satisfy several assumptions to obtain a convergent dynamic Gauss–Seidel iteration scheme, as shown in the next theorem.

Theorem 3.2.6 Consider $\mathcal{K}^{\mathrm{GS}}$ as an operator in $C[0,T]$, and assume $B$ is Hermitian with positive diagonal elements. Then the Gauss–Seidel waveform relaxation method converges, i.e., $\rho(\mathcal{K}^{\mathrm{GS}}) < 1$, if and only if $B$ is positive definite.

Proof. From Theorem 3.2.1, we derive $\rho(\mathcal{K}^{\mathrm{GS}}) = \rho((D_B - L_B)^{-1} U_B)$. The theorem then follows from [33, p. 71, Cor. 1]. □

Corollary 3.2.7 For a system of ODEs (3.1), derived from a linear, parabolic PDE (1.9) by finite-element discretisation, the Gauss–Seidel waveform relaxation method is convergent in $C[0,T]$.

Proof. If (3.1) is derived from a parabolic PDE by finite-element discretisation, matrix $B$ is real, symmetric and positive definite, see, e.g., [93, p. 5]. Since the diagonal elements of a positive definite matrix are positive, we can apply Theorem 3.2.6 to obtain the result. □

On infinite time intervals, we first recall a result of Miekkala and Nevanlinna for linear ODE systems with $B = I$ [63, Cor. 4.1]. In particular, the latter authors proved that, under some additional assumptions on the matrix $A$, the Jacobi and Gauss–Seidel WR methods have the same spectral radii as their static counterparts.

Theorem 3.2.8 Consider an ODE system (3.1) with $B = I$. Assume $A$ has a constant positive diagonal $D_A = d_a I$ ($d_a > 0$) and $\rho(K^{\mathrm{JAC}}(0)) = \beta_1$. Then, if we consider $\mathcal{K}^{\mathrm{JAC}}$ and $\mathcal{K}^{\mathrm{GS}}$ as operators in $L_p(0,\infty)$ with $1 \le p \le \infty$, we have
\[
\rho\!\left(\mathcal{K}^{\mathrm{JAC}}\right) = \rho\!\left(K^{\mathrm{JAC}}(0)\right) = \beta_1. \tag{3.23}
\]
If, in addition, $A$ is consistently ordered, we obtain
\[
\rho\!\left(\mathcal{K}^{\mathrm{GS}}\right) = \rho\!\left(K^{\mathrm{GS}}(0)\right) = \beta_1^2. \tag{3.24}
\]
We are not able to prove a similar, optimal convergence result for the case $B \neq I$. Yet, the spectral radius of the Gauss–Seidel WR method also equals the square of the spectral radius of the Jacobi method for such ODE systems.

Lemma 3.2.9 Consider $\mathcal{K}^{\mathrm{JAC}}$ and $\mathcal{K}^{\mathrm{GS}}$ as operators in $L_p(0,\infty)$ with $1 \le p \le \infty$. Assume all diagonal elements of $D_B^{-1} D_A$ have positive real parts, and let $A$ and $B$ be such that $(zB + A)$ is a consistently ordered matrix for $\operatorname{Re}(z) \ge 0$. Then,
\[
\rho\!\left(\mathcal{K}^{\mathrm{GS}}\right) = \rho\!\left(\mathcal{K}^{\mathrm{JAC}}\right)^2. \tag{3.25}
\]
Proof. Observe that, for $\operatorname{Re}(z) \ge 0$,
\[
\det\!\left(\lambda I - K^{\mathrm{GS}}(z)\right)
= \frac{\det\!\left(\lambda\left(z(D_B - L_B) + (D_A - L_A)\right) - (zU_B + U_A)\right)}{\det(zD_B + D_A)}. \tag{3.26}
\]
Introducing the shorthands $D = zD_B + D_A$, $L = zL_B + L_A$ and $U = zU_B + U_A$, the numerator of (3.26) becomes
\[
\det(\lambda D - \lambda L - U)
= \det\!\left(\sqrt{\lambda}\left(\sqrt{\lambda}\,D - \left(\sqrt{\lambda}\,L + \tfrac{1}{\sqrt{\lambda}}\,U\right)\right)\right)
= (\sqrt{\lambda})^d \det\!\left(\sqrt{\lambda}\,D - \left(\sqrt{\lambda}\,L + \tfrac{1}{\sqrt{\lambda}}\,U\right)\right). \tag{3.27}
\]
Since $zB + A$, $\operatorname{Re}(z) \ge 0$, is a consistently ordered matrix, we can use [106, p. 147, Thm. 3.3] to rewrite (3.27) as
\[
(\sqrt{\lambda})^d \det\!\left(\sqrt{\lambda}\,D - (L + U)\right).
\]
Hence, $\lambda$ is an eigenvalue of $K^{\mathrm{GS}}(z)$ if and only if $\lambda = 0$ or $\sqrt{\lambda}$ is an eigenvalue of $K^{\mathrm{JAC}}(z)$. The latter means that
\[
\rho\!\left(K^{\mathrm{GS}}(z)\right) = \rho\!\left(K^{\mathrm{JAC}}(z)\right)^2, \tag{3.28}
\]
which implies (3.25) by application of Theorem 3.2.4. □
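Relation (3.28) is easy to check numerically for a consistently ordered example. The sketch below is ours (Python/NumPy): a tridiagonal $A$ with $B = I$, so that $zB + A$ is consistently ordered for every $z$, comparing $\rho(K^{\mathrm{GS}}(z))$ with $\rho(K^{\mathrm{JAC}}(z))^2$ at a few sample points.

```python
import numpy as np

A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])  # tridiagonal, hence
B = np.eye(3)                                                  # zB + A consistently ordered
D = np.diag(np.diag(A)); L = -np.tril(A, -1); U = -np.triu(A, 1)

def rho(M):
    return max(abs(np.linalg.eigvals(M)))

gaps = []
for z in [0.0, 1.0j, 1.0 + 2.0j]:
    KJ = np.linalg.solve(z * B + D, L + U)        # Jacobi symbol (B = I, so N_B = 0)
    KG = np.linalg.solve(z * B + D - L, U)        # Gauss-Seidel symbol
    gaps.append(abs(rho(KG) - rho(KJ) ** 2))
```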
Remark 3.2.7 The first assumption of Lemma 3.2.9 can be loosened if (3.1) is derived by spatial finite-element discretisation from a linear, parabolic PDE (1.9). The positive definiteness of $B$ implies that it is sufficient to assume that all diagonal elements of $A$ are positive. The latter assumption can be dropped completely if $a(\cdot,\cdot)$ is $H$-elliptic, with $a(\cdot,\cdot)$ the bilinear form corresponding to operator $\mathcal{L}$ and $H$ the Sobolev space used in the finite-element discretisation of the parabolic PDE [11, p. 24].
Finite-interval versus infinite-interval results

Remark 2.2.1, applied to $\mathcal{H} = \mathcal{K}$, indicates that the spectral radius of $\mathcal{K}$ as an operator on finite time intervals is smaller than the spectral radius of $\mathcal{K}$ as an operator on the infinite time interval. Therefore, it is possible that the WR method is convergent on any finite time interval, but divergent on the infinite time interval. In a situation like that, computations on a sufficiently long time interval will at first seem to diverge. Eventually, however, the computations must start to converge. This effect is illustrated in the following example.
Example 3.2.1 Consider the linear IVP
\[
\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \dot{u}_1 \\ \dot{u}_2 \end{bmatrix}
+
\begin{bmatrix} \frac{1}{2} & 1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
=
\begin{bmatrix} \cos(t) \\ 0 \end{bmatrix} \tag{3.29}
\]
with the initial conditions $u_1(0) = 1$ and $u_2(0) = 0$. The solutions are $u_1(t) = \cos(t)$ and $u_2(t) = \sin(t)$. We solve this problem using the Gauss–Seidel WR method. Therefore, we apply the following splittings,
\[
M_B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad
N_B = \begin{bmatrix} 0 & -\frac{1}{2} \\ 0 & 0 \end{bmatrix}, \quad
M_A = \begin{bmatrix} \frac{1}{2} & 0 \\ -1 & 1 \end{bmatrix}, \quad
N_A = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix},
\]
such that the dynamic iteration matrix is given by
\[
K^{\mathrm{GS}}(z) =
\begin{bmatrix} z + \frac{1}{2} & 0 \\ z - 1 & z + 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0 & -\frac{1}{2}z - 1 \\ 0 & 0 \end{bmatrix}.
\]
We have plotted in Figure 3.2 the spectral radius of the dynamic iteration matrix evaluated along the imaginary axis, i.e., for $z = i\xi$, as a function of the frequency $\xi$. On finite time intervals, we can apply the result of Theorem 3.2.1, which yields
\[
\rho\!\left(\mathcal{K}^{\mathrm{GS}}\right) = \rho\!\left(K^{\mathrm{GS}}(\infty)\right) = \frac{1}{2}. \tag{3.30}
\]
Figure 3.2: $\rho(K^{\mathrm{GS}}(i\xi))$ versus $\xi$ for (3.29).
Figure 3.3: Error $e_2^{(\nu)}(t)$ of the $\nu$-th Gauss–Seidel WR iterate versus $t$ for (3.29), for $\nu = 0, 4, 8, 12$.
This result ensures linear convergence on any time interval of finite length. The spectral radius on infinite time intervals can be calculated with the aid of (3.18),
\[
\rho\!\left(\mathcal{K}^{\mathrm{GS}}\right) = \sup_{\xi \in \mathbb{R}} \rho\!\left(K^{\mathrm{GS}}(i\xi)\right) = 2. \tag{3.31}
\]
Thus, WR is divergent on the interval $(0,\infty)$. In order to clarify these results, we have plotted in Figure 3.3 the error $e_2^{(\nu)}$ of the second component iterate after 0, 4, 8 and 12 iterations. Roughly speaking, one observes two subintervals with different convergence characteristics, corresponding to the results (3.30) and (3.31) respectively. The finite-length interval with small errors extends as more iterations are applied. Consequently, the region of divergent behaviour recedes backwards after a large number of iterations. Hence, asymptotically, the convergence behaviour will be dictated by the finite-interval analysis.
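The two spectral radii (3.30) and (3.31) of this example can be reproduced numerically; a sketch of our own (Python/NumPy, illustrative variable names):

```python
import numpy as np

# Gauss-Seidel splitting of Example 3.2.1
MB = np.array([[1.0, 0.0], [1.0, 1.0]]); NB = np.array([[0.0, -0.5], [0.0, 0.0]])
MA = np.array([[0.5, 0.0], [-1.0, 1.0]]); NA = np.array([[0.0, -1.0], [0.0, 0.0]])

def rho(M):
    return max(abs(np.linalg.eigvals(M)))

rho_finite = rho(np.linalg.solve(MB, NB))               # (3.30): rho = 1/2
sym = lambda z: np.linalg.solve(z * MB + MA, z * NB + NA)
rho_infinite = max(rho(sym(1j * xi)) for xi in np.linspace(-100.0, 100.0, 2001))
print(rho_finite, rho_infinite)   # approximately 0.5 and 2.0 (supremum at xi = 0)
```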
3.2.2 The discrete-time case
The waveform relaxation operator and its symbol
Iteration (3.6) can be rewritten as $u_\tau^{(\nu)} = \mathcal{K}_\tau u_\tau^{(\nu-1)} + \varphi_\tau$. Because we do not iterate on the starting values, we use a slightly different subscript τ-notation here than the one in (2.18); that is,
\[ u_\tau = \{u[k+n]\}_{n=0}^{N-1} \tag{3.32} \]
with N the (possibly infinite) number of time steps. (Alternatively, we could have used negative indices to denote the time levels associated with the k starting values, as is done in [53, 55]. This, however, would require some shifting in the indices of formulae (3.4) and (3.6).) The precise expression for $\varphi_\tau$ can be calculated following the lines of [67, p. 536–537]. It depends on the values of f[n], n ≥ 0, and on the starting values u[n], n < k.

In order to determine the nature of $\mathcal{K}_\tau$, the discrete-time waveform relaxation operator, we rewrite (3.6) in terms of $e^{(\nu)}[n] := u^{(\nu)}[n] - u[n]$. Here, u[n] is the exact solution of ODE (3.1) when discretised using the linear multistep method. This gives
\[ \frac{1}{\tau}\sum_{l=0}^{k} \alpha_l M_B\, e^{(\nu)}[n+l] + \sum_{l=0}^{k} \beta_l M_A\, e^{(\nu)}[n+l] = \frac{1}{\tau}\sum_{l=0}^{k} \alpha_l N_B\, e^{(\nu-1)}[n+l] + \sum_{l=0}^{k} \beta_l N_A\, e^{(\nu-1)}[n+l] . \]
With $C_l = \frac{1}{\tau}\alpha_l M_B + \beta_l M_A$ and $D_l = \frac{1}{\tau}\alpha_l N_B + \beta_l N_A$, this becomes
\[ \sum_{l=0}^{k} C_l\, e^{(\nu)}[n+l] = \sum_{l=0}^{k} D_l\, e^{(\nu-1)}[n+l] . \tag{3.33} \]
As we do not iterate on the starting values, we may assume without loss of generality that $e^{(\nu)}[n] = e^{(\nu-1)}[n] = 0$ for n < k. When we combine the first N equations, i.e., the
CHAPTER 3. BASIC WR METHODS
equations for the unknowns on time steps $k, \ldots, N+k-1$, and after introducing the vector $E^{(\nu)} = \bigl(e^{(\nu)}[k],\, e^{(\nu)}[k+1],\, \ldots,\, e^{(\nu)}[N+k-1]\bigr)$, we get
\[ E^{(\nu)} = C^{-1} D\, E^{(\nu-1)} . \tag{3.34} \]
Matrices C and D are N × N block-lower-triangular matrices with k + 1 constant block diagonals. The blocks on the j-th diagonal are given respectively by $C_{k-j}$ and $D_{k-j}$. It follows immediately that the matrix $C^{-1}D$ is an N × N lower-triangular block-Toeplitz matrix. Hence, $\mathcal{K}_\tau$ is a discrete linear convolution operator on the $l_p$-space of vectors or sequences of length N. The j-th component of its matrix-valued discrete convolution kernel $k_\tau$ equals the (constant) submatrix on the j-th lower block diagonal of $C^{-1}D$.

In the theory we shall need the Z-transform of the convolution kernel $k_\tau$, which can be found by Z-transforming equation (3.33). If $\tilde{e}^{(\nu)}(z) = Z(e^{(\nu)})$, we obtain
\[ \tilde{e}^{(\nu)}(z) = K_\tau(z)\, \tilde{e}^{(\nu-1)}(z) \]
with the discrete-time dynamic iteration matrix or symbol given by
\[ K_\tau(z) = \Bigl(\tfrac{1}{\tau}a(z)M_B + b(z)M_A\Bigr)^{-1}\Bigl(\tfrac{1}{\tau}a(z)N_B + b(z)N_A\Bigr) . \tag{3.35} \]
By comparison to (3.14) the following relation results,
\[ K_\tau(z) = K\Bigl(\frac{1}{\tau}\frac{a}{b}(z)\Bigr) . \tag{3.36} \]
Note that (3.36) still holds when $\frac{1}{\tau}\frac{a}{b}(z)$ is set to ∞ in the case of b(z) = 0. (In this case a(z) ≠ 0, since the characteristic polynomials have no common roots.)
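The identity (3.36) is easy to check numerically. The sketch below is a minimal illustration, not the model problem of this chapter: it takes the hypothetical 2×2 data $B = I$, $A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ with a Jacobi splitting, and the backward Euler method, for which $a(z) = z - 1$ and $b(z) = z$, and compares both sides of (3.36) at a point on the unit circle.

```python
import cmath

# illustrative data (not the model problem): B = I, A = [[2, -1], [-1, 2]],
# Jacobi splitting M_B = I, N_B = 0, M_A = diag(A), N_A = off-diagonal part of -A's splitting
tau = 0.1
MA = [[2.0, 0.0], [0.0, 2.0]]
NA = [[0.0, 1.0], [1.0, 0.0]]

def inv2(m):
    # inverse of a 2x2 (complex) matrix via the cofactor formula
    (a, b), (c, d) = m
    det = a*d - b*c
    return [[d/det, -b/det], [-c/det, a/det]]

def mul2(m, n):
    return [[sum(m[i][k]*n[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def K_cont(s):
    # continuous-time symbol K(s) = (s*M_B + M_A)^{-1} (s*N_B + N_A); here M_B = I, N_B = 0
    return mul2(inv2([[s + 2.0, 0.0], [0.0, s + 2.0]]), NA)

def K_disc(z):
    # discrete-time symbol (3.35) for backward Euler: a(z) = z - 1, b(z) = z
    a, b = z - 1.0, z
    M = [[a/tau + b*MA[0][0], 0.0], [0.0, a/tau + b*MA[1][1]]]
    N = [[b*NA[0][0], b*NA[0][1]], [b*NA[1][0], b*NA[1][1]]]
    return mul2(inv2(M), N)

z = cmath.exp(0.7j)  # a point on the unit circle
lhs, rhs = K_disc(z), K_cont((1.0/tau) * (z - 1.0)/z)
err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(err < 1e-12)
```

Multiplying numerator and denominator of the continuous symbol by $b(z)$ shows that the two expressions agree exactly; the printed comparison only reflects rounding error.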
Convergence on finite time intervals

Theorem 3.2.10 Consider $\mathcal{K}_\tau$ as an operator in $l_p(N)$ with 1 ≤ p ≤ ∞ and N finite, and assume the discrete solvability condition (3.7) is satisfied. Then, $\mathcal{K}_\tau$ is bounded and
\[ \rho(\mathcal{K}_\tau) = \rho\Bigl(K\bigl(\tfrac{1}{\tau}k_\sigma\bigr)\Bigr) . \tag{3.37} \]

Proof. The theorem follows from Lemma 2.2.5 and the observation that
\[ \lim_{z\to\infty} K_\tau(z) = \lim_{z\to\infty} K\Bigl(\frac{1}{\tau}\frac{a}{b}(z)\Bigr) = K\Bigl(\frac{1}{\tau}k_\sigma\Bigr) . \qquad\Box \]
Convergence on infinite time intervals

The following lemma deals with the boundedness of the discrete-time WR operator $\mathcal{K}_\tau$. It is proved using a matrix-valued version of Wiener's inversion theorem, see, e.g., [53, p. 446] or [64, p. 577].
Theorem 3.2.11 (Wiener's inversion theorem) Given a matrix-valued sequence x such that $x \in l_1(\infty)$ and
\[ \det\Bigl(\sum_{i=0}^{\infty} x[i]\, z^{-i}\Bigr) \ne 0 \quad\text{for } |z| \ge 1 . \]
Setting $\sum_{i=0}^{\infty} y[i]\, z^{-i} = \bigl(\sum_{i=0}^{\infty} x[i]\, z^{-i}\bigr)^{-1}$, we have $y \in l_1(\infty)$.

Lemma 3.2.12 If $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$, then $\mathcal{K}_\tau$ is bounded in $l_p(\infty)$ with 1 ≤ p ≤ ∞.

Proof. It is sufficient to prove that the kernel $k_\tau$ of the discrete convolution operator $\mathcal{K}_\tau$ is an $l_1$-sequence. To this end, consider first the $l_1$-sequence
\[ \bigl(C_k,\; C_{k-1},\; \ldots,\; C_0,\; 0,\; 0,\; \ldots\bigr) . \]
Its Z-transform equals the matrix function $z^{-k}\bigl(\tfrac{1}{\tau}a(z)M_B + b(z)M_A\bigr)$. By Wiener's theorem, we have that the inverse, $\bigl(\tfrac{1}{\tau}a(z)M_B + b(z)M_A\bigr)^{-1} z^{k}$, is the transform of another $l_1$-sequence, say $r_\tau$, if
\[ \det\Bigl(\tfrac{1}{\tau}a(z)M_B + b(z)M_A\Bigr) \ne 0 \quad\text{for } |z| \ge 1 . \tag{3.38} \]
Next, consider the $l_1$-sequence
\[ s_\tau = \bigl(D_k,\; D_{k-1},\; \ldots,\; D_0,\; 0,\; 0,\; \ldots\bigr) , \]
the Z-transform of which is given by $z^{-k}\bigl(\tfrac{1}{\tau}a(z)N_B + b(z)N_A\bigr)$. The convolution of $r_\tau$ and $s_\tau$ is another $l_1$-sequence, which can be seen to be equal to the kernel $k_\tau$. Indeed, the Z-transform of $r_\tau * s_\tau$ is identical to $K_\tau(z)$. As a result, it follows that $\mathcal{K}_\tau$ is bounded if (3.38) is satisfied.

Suppose there is a z with |z| ≥ 1 such that
\[ \det\Bigl(\tfrac{1}{\tau}a(z)M_B + b(z)M_A\Bigr) = 0 . \tag{3.39} \]
Then necessarily b(z) ≠ 0. (If b(z) = 0 then a(z) ≠ 0, because a(z) and b(z) have no common roots. Since $M_B$ is assumed to be invertible, equality (3.39) could then not hold.) We obtain
\[ \det\Bigl(\frac{1}{\tau}\frac{a}{b}(z)\, M_B + M_A\Bigr) = 0 \]
and therefore $\frac{1}{\tau}\frac{a}{b}(z) \in \sigma(-M_B^{-1}M_A)$. By definition of the stability region, we note that
\[ \mathbb{C} \setminus \frac{1}{\tau}\,\mathrm{int}\,S = \Bigl\{\frac{1}{\tau}\frac{a}{b}(z) : |z| \ge 1\Bigr\} . \tag{3.40} \]
Hence, it follows from |z| ≥ 1 that $-M_B^{-1}M_A$ has an eigenvalue which is not an interior point of $\frac{1}{\tau}S$. This contradicts the assumption of the lemma. In other words, we have shown that if $\mathcal{K}_\tau$ is not bounded, i.e., (3.38) is not satisfied, then $\sigma(-M_B^{-1}M_A) \not\subset \frac{1}{\tau}\,\mathrm{int}\,S$. The latter is equivalent to the statement of the lemma. □

Remark 3.2.8 Condition $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$ implies the discrete solvability condition (3.7). Indeed, since $k_\sigma = \frac{a}{b}(\infty)$, it follows that $\frac{1}{\tau}k_\sigma \notin \frac{1}{\tau}\,\mathrm{int}\,S$, and, therefore, $\frac{1}{\tau}k_\sigma \notin \sigma(-M_B^{-1}M_A)$.
Remark 3.2.9 Condition $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$ implies that all poles of K(z) are in the interior of the scaled stability region $\frac{1}{\tau}S$.
Theorem 3.2.13 Consider $\mathcal{K}_\tau$ as an operator in $l_p(\infty)$ with 1 ≤ p ≤ ∞, and assume $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$. Then,
\[ \rho(\mathcal{K}_\tau) = \sup\Bigl\{\rho\bigl(K(z)\bigr) : z \in \mathbb{C}\setminus\tfrac{1}{\tau}\,\mathrm{int}\,S\Bigr\} \tag{3.41} \]
\[ \hphantom{\rho(\mathcal{K}_\tau)} = \sup_{z\in\frac{1}{\tau}\partial S} \rho\bigl(K(z)\bigr) . \tag{3.42} \]

Proof. As $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$, it follows that $k_\tau \in l_1(\infty)$. Lemma 2.2.6 yields
\[ \rho(\mathcal{K}_\tau) = \max_{|z|\ge 1} \rho\bigl(K_\tau(z)\bigr) = \max_{|z|\ge 1} \rho\Bigl(K\bigl(\tfrac{1}{\tau}\tfrac{a}{b}(z)\bigr)\Bigr) . \tag{3.43} \]
From (3.40), we then derive (3.41), while equality (3.42) is obtained by the maximum principle. Note that we write "sup" instead of "max", since the maximum may be approached at infinity. □
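Formula (3.43) suggests a direct way to estimate $\rho(\mathcal{K}_\tau)$: sample $\rho(K_\tau(z))$ on the unit circle. The following minimal sketch does this for a hypothetical 2×2 Jacobi splitting ($M_B = I$, $N_B = 0$, $M_A = 2I$, $N_A$ with unit off-diagonal entries; illustrative data, not the model problem) and the backward Euler method, for which the symbol's spectral radius reduces to $|\tau b(z)| / |a(z) + 2\tau b(z)|$.

```python
import cmath, math

# backward Euler (BDF(1)): a(z) = z - 1, b(z) = z; hypothetical 2x2 Jacobi splitting,
# whose symbol has spectral radius |tau*b(z)| / |a(z) + 2*tau*b(z)|
tau = 0.1

def rho_Ktau(z):
    a, b = z - 1.0, z
    return abs(tau*b) / abs(a + 2.0*tau*b)

# sample rho(K_tau(z)) over the unit circle, where the maximum in (3.43) is attained
samples = [rho_Ktau(cmath.exp(2j*math.pi*k/3600)) for k in range(3600)]
rho = max(samples)
# for this example the maximum sits at z = 1, so rho(K_tau) = rho(K(0)) = 1/2
print(abs(rho - 0.5) < 1e-9, samples.index(rho) == 0)
```

Here the maximum is attained at z = 1, so the discrete-time spectral radius coincides with the static value $\rho(K(0)) = 1/2$, in line with Remark 3.2.10 below.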
Remark 3.2.10 Since the linear multistep methods are chosen to be irreducible and consistent, we have $\frac{1}{\tau}\frac{a}{b}(1) = 0$. As in the continuous-time case, it then easily follows from (3.43) that $\rho(\mathcal{K}_\tau) \ge \rho(K(0))$. In particular, the spectral radius of the discrete-time WR operator equals that of the corresponding static iteration matrix $K(0) = M_A^{-1}N_A$ when the maximum in (3.43) is attained at z = 1.

In $l_2(\infty)$, a similar result holds for the norm by application of Lemma 2.2.7.
Theorem 3.2.14 Consider $\mathcal{K}_\tau$ as an operator in $l_2(\infty)$, and assume $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S$. Then,
\[ \|\mathcal{K}_\tau\|_{l_2(\infty)} = \sup\bigl\{\|K(z)\| : z \in \mathbb{C}\setminus\tfrac{1}{\tau}\,\mathrm{int}\,S\bigr\} \tag{3.44} \]
\[ \hphantom{\|\mathcal{K}_\tau\|_{l_2(\infty)}} = \sup_{z\in\frac{1}{\tau}\partial S} \|K(z)\| \tag{3.45} \]
where ‖·‖ denotes the matrix norm induced by the standard Euclidean vector norm.
In analogy to the discussion in [67, Thm. 4.2] we can make the following note. It is the discrete-time equivalent of Remark 3.2.5.

Remark 3.2.11 When the assumption in the above theorems is violated, a weaker condition may be satisfied: $\sigma(-M_B^{-1}M_A) \subset \frac{1}{\tau}\,\mathrm{int}\,S_\eta$, where $S_\eta$ consists of all λ for which $a(e^{\eta\tau}z) - \lambda\tau\, b(e^{\eta\tau}z)$ satisfies the root condition. The analysis can then be redone in the Banach space $l_p(\infty)$ with the exponentially scaled norm
\[ \|x_\tau\|_{l_p(\eta)} := \bigl\|\{e^{-\eta n\tau} x[n]\}_{n=0}^{\infty}\bigr\|_{l_p(\infty)} . \]
With this change of norm, the suprema in Theorems 3.2.13 and 3.2.14 have to be taken over all z in $\mathbb{C}\setminus\frac{1}{\tau}\,\mathrm{int}\,S_\eta$, or, after application of the maximum principle, over $\frac{1}{\tau}\partial S_\eta$.
3.2.3 Discrete-time versus continuous-time results

The continuous-time results are regained when we let τ → 0 in the convergence formulae for the operator $\mathcal{K}_\tau$. For finite time intervals, we have
\[ \lim_{\tau\to 0} \rho(\mathcal{K}_\tau) = \lim_{\tau\to 0} \rho\Bigl(K\bigl(\tfrac{1}{\tau}k_\sigma\bigr)\Bigr) = \rho\bigl(K(\infty)\bigr) = \rho(\mathcal{K}) . \]
A similar result is found for infinite time intervals. Note that for any consistent linear multistep method, the tangent to ∂S at the origin of the complex plane is the imaginary axis. As such, the boundary of the scaled stability region $\partial(\frac{1}{\tau}S)$ tends to the imaginary axis as τ → 0. Consequently,
\[ \lim_{\tau\to 0} \rho(\mathcal{K}_\tau) = \lim_{\tau\to 0} \sup_{z\in\frac{1}{\tau}\partial S} \rho\bigl(K(z)\bigr) = \sup_{\xi\in\mathbb{R}} \rho\bigl(K(i\xi)\bigr) = \rho(\mathcal{K}) . \]
Furthermore, for a fixed time step τ, we can prove the following theorem for A(α)-stable linear multistep methods. The theorem is closely related to [55, Prop. 9], where multigrid WR on finite-difference grids is analysed. We reformulate the proof, using our notations, for completeness.

Theorem 3.2.15 Consider $\mathcal{K}_\tau$ as an operator in $l_p(\infty)$ and $\mathcal{K}$ as an operator in $L_p(0,\infty)$ with 1 ≤ p ≤ ∞. Assume the linear multistep method is A(α)-stable and $\sigma(-M_B^{-1}M_A) \subset \Lambda_\alpha$ (as defined in Definition 3.1.3). Then,
\[ \rho(\mathcal{K}_\tau) \le \sup_{z\in\Lambda_\alpha^c} \rho\bigl(K(z)\bigr) = \sup_{z\in\partial\Lambda_\alpha^c} \rho\bigl(K(z)\bigr) \tag{3.46} \]
with $\Lambda_\alpha^c := \mathbb{C}\setminus\Lambda_\alpha = \{z : |\mathrm{Arg}(z)| \le \pi - \alpha\}$.

Proof. If the multistep method is A(α)-stable, then $\frac{1}{\tau}\frac{a}{b}(z) \in \Lambda_\alpha^c$ for |z| ≥ 1. Combining the latter with (3.43) yields the inequality of (3.46). The equality is obtained by the maximum principle. □

Corollary 3.2.16 Consider $\mathcal{K}_\tau$ as an operator in $l_p(\infty)$ and $\mathcal{K}$ as an operator in $L_p(0,\infty)$ with 1 ≤ p ≤ ∞. Assume the linear multistep method is A-stable and all eigenvalues of $M_B^{-1}M_A$ have positive real parts. Then, $\rho(\mathcal{K}_\tau) \le \rho(\mathcal{K})$.

Proof. The corollary is a special case of Theorem 3.2.15 with α = π/2, combined with Theorem 3.2.4. □
3.3 Model problem analysis

We describe our model parabolic heat flow problem and give an overview of the resulting ODE systems, depending on the dimension of the problem and the choice of the spatial discretisation method. We shall apply the convergence analysis of Section 3.2 to some of these ODE systems, and compare the results with those from our numerical experiments.
3.3.1 Description of the model problems
In this thesis, our theoretical WR convergence results will be illustrated and verified by means of the model heat flow problem
\[ \frac{\partial u}{\partial t} - \Delta_m u = 0 , \qquad x \in (0,1)^m ,\; t > 0 , \tag{3.47} \]
where $\Delta_m$ denotes the m-dimensional Laplace operator $\Delta_m = \sum_{i=1}^{m} \frac{\partial^2}{\partial x_i^2}$ and $m \in \{1,2,3\}$. Equation (3.47) is supplied with an initial condition $u(x,0) = u_0(x)$ and Dirichlet boundary conditions, which are, with $x = x_1$, $y = x_2$ and $z = x_3$, chosen such that the analytical solution of the resulting initial-boundary-value problem equals
\[ \begin{cases} u(x,t) = \sin(\pi x)\, e^{-\pi^2 t} , & m = 1 ,\\ u(x,y,t) = 1 + \sin(\pi x/2)\sin(\pi y/2)\, e^{-\pi^2 t/2} , & m = 2 ,\\ u(x,y,z,t) = 1 + \sin(\pi x/3)\sin(\pi y/3)\sin(\pi z/3)\, e^{-\pi^2 t/3} , & m = 3 . \end{cases} \]
We shall discretise the latter problem (in space) using finite-difference or finite-element methods on a discrete grid $\Omega_h$ with equal mesh size h in all spatial directions, given by
\[ \Omega_h = \begin{cases} \{x_i = ih : 0 \le i \le 1/h\} , & m = 1 ,\\ \{(x_i = ih,\, y_j = jh) : 0 \le i,j \le 1/h\} , & m = 2 ,\\ \{(x_i = ih,\, y_j = jh,\, z_k = kh) : 0 \le i,j,k \le 1/h\} , & m = 3 . \end{cases} \]
We will only recall the results needed for our presentation here, and refer, e.g., to [93, 94] for a more general, in-depth analysis of these semi-discretisation procedures.

In the numerical method of lines, all spatial derivatives in (3.47) are replaced by a finite-difference operator at each grid point of $\Omega_h$. If we use second-order accurate, central differences, this leads (after elimination of the Dirichlet boundary conditions) to a $(1/h - 1)^m$-dimensional IVP of the form (3.1). Here, u is a column vector containing the functions which approximate the exact solution at the internal grid points of $\Omega_h$, B = I, and matrix A is given in stencil notation by
\[ \frac{1}{h^2}\begin{bmatrix} -1 & 2 & -1 \end{bmatrix} \quad\text{and}\quad \frac{1}{h^2}\begin{bmatrix} & -1 & \\ -1 & 4 & -1 \\ & -1 & \end{bmatrix} \]
for m = 1 and m = 2, respectively; for m = 3, the analogous seven-point stencil has centre coefficient $6/h^2$ and coefficient $-1/h^2$ at each of the six neighbouring grid points. The latter notation can be regarded as a symbolic representation of (a part of) the discrete grid $\Omega_h$, and it indicates which grid points are involved in the matrix-vector product Au, together with the corresponding coefficients. For example, for m = 2 the following ODEs result,
\[ \dot{u}_{ij} + \frac{1}{h^2}\bigl(-u_{i-1,j} - u_{i,j-1} + 4u_{ij} - u_{i,j+1} - u_{i+1,j}\bigr) = 0 , \qquad 1 \le i,j \le 1/h - 1 , \]
with $u_{ij}$ the approximant of the PDE solution at $(x_i, y_j)$ for t > 0 and $u_{ij}(0) = u_0(x_i, y_j)$.
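The one-dimensional semi-discretisation can be exercised with a few lines of code. The sketch below (pure Python; the choice h = 1/16 is illustrative) applies the central-difference stencil and checks the classical fact underlying the analysis of Section 3.3.2: $v_i = \sin(\pi(i{+}1)h)$ is a discrete eigenvector of A, and the associated Jacobi iteration matrix $D^{-1}(L+U)$ has eigenvalue $\cos(\pi h)$.

```python
import math

# 1-D method of lines: A = (1/h^2) * tridiag(-1, 2, -1) on the n = 1/h - 1 interior points
h = 1.0/16
n = round(1/h) - 1

def matvec_A(u):
    # matrix-vector product with the central-difference stencil (Dirichlet values eliminated)
    return [(-(u[i-1] if i > 0 else 0.0) + 2.0*u[i] - (u[i+1] if i < n-1 else 0.0)) / h**2
            for i in range(n)]

# v_i = sin(pi*(i+1)*h) is a discrete eigenvector:
# A v = (2 - 2*cos(pi*h))/h^2 * v, and the Jacobi matrix D^{-1}(L+U) has eigenvalue cos(pi*h)
v = [math.sin(math.pi*(i+1)*h) for i in range(n)]
Av = matvec_A(v)
lam_A = (2.0 - 2.0*math.cos(math.pi*h)) / h**2
err_A = max(abs(Av[i] - lam_A*v[i]) for i in range(n))

Jv = [((v[i-1] if i > 0 else 0.0) + (v[i+1] if i < n-1 else 0.0)) / 2.0 for i in range(n)]
err_J = max(abs(Jv[i] - math.cos(math.pi*h)*v[i]) for i in range(n))
print(err_A < 1e-9, err_J < 1e-12)
```

The boundary terms vanish because $\sin(0) = \sin(\pi) = 0$, so the eigenvector relation holds exactly up to rounding.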
We also discretised our model problem using a finite-element approach, in which one replaces (3.47) by its weak formulation, given as follows: find $u(\cdot, t) \in H$ such that
\[ \Bigl(\frac{\partial u}{\partial t}, v\Bigr) + a(u,v) = (f,v) \quad\text{for all } v \in H . \]
Here, H is usually a Sobolev space on $[0,1]^m$, $(\cdot,\cdot)$ denotes the $L_2$-inner product, and $a(\cdot,\cdot)$ is the bilinear form corresponding to $-\Delta_m$, i.e.,
\[ a(u,v) = (-\Delta_m u, v) = \begin{cases} \displaystyle\int_0^1 \frac{\partial u}{\partial x}\frac{\partial v}{\partial x}\,dx , & m = 1 ,\\[2mm] \displaystyle\int_0^1\!\!\int_0^1 \Bigl(\frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y}\Bigr)\,dx\,dy , & m = 2 . \end{cases} \]
(To keep the discussion simple, we omitted the treatment of the boundary conditions, for which we refer to the literature. That is, we assume for simplicity that the Dirichlet boundary conditions are zero.) In a Galerkin-based approach, $u(\cdot, t)$ is then approximated in a finite-dimensional subspace $H_d$ of H, spanned by a set of linearly independent basis functions $\{\varphi_1, \varphi_2, \ldots, \varphi_d\}$. The approximate solution $u(x,t) = \sum_{i=1}^{d} u_i(t)\varphi_i(x)$ is found by solving the following set of equations:
\[ \Bigl(\frac{\partial u}{\partial t}, \varphi_j\Bigr) + a(u, \varphi_j) = (f, \varphi_j) , \qquad j = 1, 2, \ldots, d . \]
In terms of the mass matrix $B = [b_{ij}]_{i,j=1}^{d} = [(\varphi_j, \varphi_i)]_{i,j=1}^{d}$ and the stiffness matrix $A = [a_{ij}]_{i,j=1}^{d} = [a(\varphi_j, \varphi_i)]_{i,j=1}^{d}$, we may rewrite these equations in our standard form (3.1). The initial condition is obtained by means of interpolation or projection of the given initial condition for the PDE.

Usually, the subspace $H_d$ corresponds to a space of piecewise polynomial functions on a set of disjunct polygons or elements, the vertices of which coincide with the grid points of $\Omega_h$. In this case, the basis functions $\varphi_i$ of the latter subspace can be chosen to have compact support, resulting in sparse matrices B and A. For example, we investigated the use of linear finite elements for the one-dimensional variant of (3.47). That is, we approximated its solution in the space of functions that are piecewise linear on the elements $[x_i, x_{i+1}]$, leading to a $(1/h - 1)$-dimensional ODE system (3.1) with stencils
\[ B = \frac{h}{6}\begin{bmatrix} 1 & 4 & 1 \end{bmatrix} \quad\text{and}\quad A = \frac{1}{h}\begin{bmatrix} -1 & 2 & -1 \end{bmatrix} . \tag{3.48} \]
We also used quadratic and cubic finite elements to discretise the latter problem in space, but refer to [99, Ex. 5.2.2] and [87, p. 56] for the precise expressions of the matrices B and A in these cases. Finally, we recall the results of some specific finite-element discretisations of our two-dimensional model problem (3.47). The relevant stencils of the resulting $(1/h - 1)^2$-dimensional system (3.1) are given by
\[ B = \frac{h^2}{12}\begin{bmatrix} & 1 & 1 \\ 1 & 6 & 1 \\ 1 & 1 & \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} & -1 & \\ -1 & 4 & -1 \\ & -1 & \end{bmatrix} \]
in the case of linear finite elements (linear basis functions; triangular elements), and by
\[ B = \frac{h^2}{36}\begin{bmatrix} 1 & 4 & 1 \\ 4 & 16 & 4 \\ 1 & 4 & 1 \end{bmatrix} \quad\text{and}\quad A = \frac{1}{3}\begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} \]
when bilinear finite elements (bilinear basis functions; rectangular elements) are used.
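The one-dimensional stencils (3.48) can be reproduced by assembling the standard element matrices for linear elements, namely the element mass matrix $(h/6)\begin{bmatrix}2&1\\1&2\end{bmatrix}$ and the element stiffness matrix $(1/h)\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$. A small sketch in exact rational arithmetic (the eight-element mesh is an illustrative choice):

```python
from fractions import Fraction as F

# assemble mass and stiffness matrices for 1-D linear elements on [0, 1], 8 elements (h = 1/8);
# element matrices: mass (h/6)*[[2, 1], [1, 2]], stiffness (1/h)*[[1, -1], [-1, 1]]
nel = 8
h = F(1, nel)
npts = nel + 1
B = [[F(0)] * npts for _ in range(npts)]
A = [[F(0)] * npts for _ in range(npts)]
for e in range(nel):
    # element e couples the two nodes e and e+1
    for a_loc, i in enumerate((e, e + 1)):
        for b_loc, j in enumerate((e, e + 1)):
            B[i][j] += h/6 * (2 if a_loc == b_loc else 1)
            A[i][j] += (1 if a_loc == b_loc else -1) / h

# an interior row reproduces the stencils (3.48): B ~ (h/6)[1 4 1], A ~ (1/h)[-1 2 -1]
i = 4
print([B[i][i-1], B[i][i], B[i][i+1]] == [h/6, 4*h/6, h/6],
      [A[i][i-1], A[i][i], A[i][i+1]] == [-1/h, 2/h, -1/h])
```

Eliminating the two boundary rows and columns then yields exactly the $(1/h - 1)$-dimensional matrices used in the text.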
3.3.2 Theoretical results

The continuous-time case

First, we consider our model problem (3.47), discretised using finite differences. We recall once more that for this problem the Jacobi and Gauss–Seidel WR methods converge in a superlinear way on finite time intervals, see, e.g., [63, p. 461] or Theorem 3.2.1 with $M_B = I$ and $N_B = 0$.

Theorem 3.3.1 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider $\mathcal{K}_{JAC}$ and $\mathcal{K}_{GS}$ as operators in C[0, T], we have
\[ \rho(\mathcal{K}_{JAC}) = \rho(\mathcal{K}_{GS}) = 0 . \tag{3.49} \]

In order to determine the spectral radius of $\mathcal{K}_{JAC}$ and $\mathcal{K}_{GS}$ on infinite time intervals, the spectral radii of $K_{JAC}(z)$ and $K_{GS}(z)$ are to be calculated for every value of z along the imaginary axis. This is generally a very difficult task. However, for the current problem, we have that
\[ \rho\bigl(K_{JAC}(0)\bigr) = \cos(\pi h) \tag{3.50} \]
independent of the spatial dimension m. As such, the following result is an immediate application of Theorem 3.2.8. It can be found, e.g., in [63, p. 473].

Theorem 3.3.2 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider $\mathcal{K}_{JAC}$ and $\mathcal{K}_{GS}$ as operators in $L_p(0,\infty)$ with 1 ≤ p ≤ ∞, we have for small h that
\[ \rho(\mathcal{K}_{JAC}) \approx 1 - \pi^2 h^2/2 \quad\text{and}\quad \rho(\mathcal{K}_{GS}) \approx 1 - \pi^2 h^2 . \tag{3.51} \]

Next, we consider the finite-element discretisation of (3.47). We restrict ourselves to the one-dimensional case, discretised using linear basis functions, and calculate both the finite-interval and infinite-interval spectral radii of the Jacobi and Gauss–Seidel WR operators.
Theorem 3.3.3 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider $\mathcal{K}_{JAC}$ and $\mathcal{K}_{GS}$ as operators in C[0, T], we have
\[ \rho(\mathcal{K}_{JAC}) = \tfrac{1}{2}\cos(\pi h) \quad\text{and}\quad \rho(\mathcal{K}_{GS}) = \tfrac{1}{4}\cos^2(\pi h) . \tag{3.52} \]
Proof. The proof follows immediately from the stencil (3.48) of B and Theorem 3.2.1. In particular, we have in the Jacobi case that $M_B^{-1}N_B = D_B^{-1}(L_B + U_B)$ is a tridiagonal Toeplitz matrix with zero diagonal and constant entries 1/4 on the subdiagonal and superdiagonal. The expression of the spectral radius of the Gauss–Seidel WR operator then follows immediately from (3.28), applied for z = ∞. □
Theorem 3.3.4 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider $\mathcal{K}_{JAC}$ and $\mathcal{K}_{GS}$ as operators in $L_p(0,\infty)$ with 1 ≤ p ≤ ∞, we have (for small h) that
\[ \rho(\mathcal{K}_{JAC}) = \rho\bigl(K_{JAC}(0)\bigr) = \cos(\pi h) \approx 1 - \pi^2 h^2/2 \tag{3.53} \]
and
\[ \rho(\mathcal{K}_{GS}) = \rho\bigl(K_{GS}(0)\bigr) = \cos^2(\pi h) \approx 1 - \pi^2 h^2 . \tag{3.54} \]

Proof. Discretising (3.47) for m = 1 with linear finite elements yields
\[ \rho\bigl(K_{JAC}(z)\bigr) = \Bigl|\frac{-2zh^2 + 12}{4zh^2 + 12}\Bigr| \cos(\pi h) . \]
As a consequence of Theorem 3.2.4, equation (3.18), we find that
\[ \rho(\mathcal{K}_{JAC}) = \sup_{\xi\in\mathbb{R}} \Bigl|\frac{-2i\xi h^2 + 12}{4i\xi h^2 + 12}\Bigr| \cos(\pi h) = \cos(\pi h) \approx 1 - \pi^2 h^2/2 . \]
Since the assumptions of Lemma 3.2.9 are satisfied, formula (3.54) follows immediately from (3.53) by application of (3.25). □

Equation (3.54) is illustrated for h = 1/16 in Figure 3.4, where we plotted values of $\rho(K_{GS}(z))$ over the imaginary axis, i.e., for z = iξ. Since the supremum in (3.18) is attained at ξ = 0, the dynamic Gauss–Seidel iteration turns out to be as fast as its static counterpart for the current model problem.
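The supremum appearing in the proof of Theorem 3.3.4 is easily checked numerically. The following sketch samples the Jacobi symbol's spectral radius along the imaginary axis for h = 1/16 (the sampling range is an illustrative choice):

```python
import math

# rho(K_JAC(i*xi)) = |(-2*i*xi*h^2 + 12)/(4*i*xi*h^2 + 12)| * cos(pi*h), for h = 1/16
h = 1.0/16

def rho_jac(xi):
    z = complex(0.0, xi)
    return abs((-2.0*z*h**2 + 12.0) / (4.0*z*h**2 + 12.0)) * math.cos(math.pi*h)

vals = [rho_jac(float(xi)) for xi in range(-2000, 2001)]
# supremum attained at xi = 0 with value cos(pi*h); as |xi| -> infinity the
# modulus factor tends to 1/2, recovering the finite-interval value (1/2)*cos(pi*h)
print(abs(max(vals) - math.cos(math.pi*h)) < 1e-12, vals.index(max(vals)) == 2000)
```

The maximum sits at ξ = 0 with value cos(πh), while the large-|ξ| limit reproduces the finite-interval spectral radius of Theorem 3.3.3.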
The discrete-time case
In order to predict the convergence behaviour of an actual implementation of the WR methods, we analyse the discrete-time WR operators for the different time-discretisation formulae introduced in Example 3.1.1 with a constant time step τ = 1/100. In particular, we concentrate on the Gauss–Seidel WR algorithm for the one-dimensional variant of (3.47), discretised in space using linear finite elements with h = 1/16. The spectral radii of the finite-interval and infinite-interval Gauss–Seidel WR operators are reported in Table 3.3. The results were computed by direct numerical evaluation of formulae (3.37) and (3.42). These results can be understood by looking at the spectral picture [95, p. 107], which enables a graphical inspection of convergence. In the spectral picture a set of contour lines of the function $\rho(K_{GS}(z))$ is plotted for z in a region of the complex plane close to
[Figure: a single curve, decreasing from its maximum at ξ = 0; vertical axis from 0.4 to 1.0, horizontal axis from −1200 to 1200.]

Figure 3.4: $\rho(K_{GS}(i\xi))$ versus ξ for (3.47) (m = 1, linear finite elements, h = 1/16).
multistep method    CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
finite interval     0.458  0.658   0.548   0.486   0.445   0.414
infinite interval   0.962  0.962   0.962   0.976   1.149   1.865

Table 3.3: Spectral radii of discrete-time Gauss–Seidel WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100).
[Figure: contour lines of $\rho(K_{GS}(z))$ for the values 0.8 up to 2.0, with the scaled stability region boundaries of CN and BDF(1)–BDF(5) superimposed; real axis from −400 to 1800, imaginary axis from −1200 to 1200.]

Figure 3.5: Spectral picture of Gauss–Seidel WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100).
the complex origin. On top of this picture, the scaled stability boundaries of the linear multistep methods can be plotted. Figure 3.5 displays contour lines of $\rho(K_{GS}(z))$ (for the values 0.8, 1.0, 1.2, 1.4, 1.6, 1.8 and 2.0) for the model problem, together with the scaled stability region boundaries of the CN and BDF methods.

The values of the finite-interval spectral radii can be estimated by checking the values of the contoured function at the points on the real axis given by $\frac{1}{\tau}k_\sigma$ (which are not shown in the picture). With increasing order of the BDF methods, these points move to the right. Indeed, $k_\sigma$ equals 1 (BDF(1)), 3/2 (BDF(2)), 11/6 (BDF(3)), 25/12 (BDF(4)), and 137/60 (BDF(5)). A value of 2 is found for the CN method. The values of the infinite-interval spectral radii can be estimated by taking the maximum of $\rho(K_{GS}(z))$ over the plotted scaled stability region boundaries. If CN time discretisation is used, this means taking the largest value of $\rho(K_{GS}(z))$ over the imaginary axis. From Figure 3.5 (and also Figure 3.4) it then follows that the infinite-length discrete-time waveform method is convergent in this case. The discrete-time waveform variants are convergent for the low-order BDF methods, while divergence is observed for some high-order methods. In particular, the spectral radius seems to increase with increasing order of the BDF method. This was to be expected from Theorem 3.2.15 and the knowledge that the BDF methods are A(α)-stable with α = 90° (BDF(1), BDF(2)), α = 88° (BDF(3)), α = 73° (BDF(4)) and α = 51° (BDF(5)). Note also that the maximum of $\rho(K_{GS}(z))$ over $\frac{1}{\tau}\partial S$ is found at the origin for CN, BDF(1) and BDF(2). Hence the equality of the corresponding values in Table 3.3.
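The quoted values of $k_\sigma$ follow from the standard construction of the BDF formulae: with the normalisation $\beta_k = 1$, the leading coefficient $\alpha_k$ (and hence $k_\sigma = \frac{a}{b}(\infty)$) is the k-th harmonic number. A quick check in exact arithmetic:

```python
from fractions import Fraction

# for the k-step BDF method (normalised with beta_k = 1, built from sum_{j=1}^{k} (1/j) nabla^j),
# the leading coefficient alpha_k, and so k_sigma = (a/b)(infinity), is the k-th harmonic number
def k_sigma(k):
    return sum(Fraction(1, j) for j in range(1, k + 1))

print([str(k_sigma(k)) for k in range(1, 6)])
# -> ['1', '3/2', '11/6', '25/12', '137/60'], the values quoted above for BDF(1)-BDF(5)
```

For CN, both characteristic polynomials are degree one with $b(z) = (z+1)/2$, so $\frac{a}{b}(\infty) = 2$, the value quoted for that method.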
3.3.3 Numerical results
In this section, we shall derive some experimental results for our model heat equation (3.47). In correspondence with Section 3.2.2, the solution of the latter problem shall be approximated for N equidistant points t = lτ in the interval of interest [0, T], with τ = T/(k + N − 1) and k ≤ l ≤ k + N − 1. We restrict ourselves to the Gauss–Seidel WR method, although similar results can be obtained for the Jacobi variant.

In each waveform iteration we calculated the $l_2$-norm of the discrete error $e_\tau^{(\nu)} := u_\tau^{(\nu)} - u_\tau$, where $u_\tau$ is the solution of (3.1) after time discretisation, or, in practice, the best approximation to it. The ν-th iteration convergence factor $\rho^{(\nu)}$ is determined by dividing this result for successive iterates, i.e.,
\[ \rho^{(\nu)} := \|e_\tau^{(\nu)}\|_{l_2(N)} \,/\, \|e_\tau^{(\nu-1)}\|_{l_2(N)} . \tag{3.55} \]
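The factors (3.55), and their geometric average over later iterations, can be measured with a few lines of code. The sketch below runs Gauss–Seidel WR with backward Euler on a small illustrative 2×2 system ($\dot{u} = -Au$ with $A = \begin{bmatrix}2&-1\\-1&2\end{bmatrix}$, not the model problem itself), using a converged iterate as the reference solution:

```python
import math

# illustrative 2x2 system u' = -A u, A = [[2, -1], [-1, 2]], u(0) = (1, -1),
# backward Euler with tau = 0.01 on [0, 2]; Gauss-Seidel waveform relaxation
tau, N = 0.01, 200
u0 = [1.0, -1.0]

def gs_wr_sweep(prev):
    # one Gauss-Seidel WR sweep: u1 uses the previous iterate of u2, u2 uses the new u1
    u1, u2 = [u0[0]], [u0[1]]
    for n in range(1, N + 1):
        u1.append((u1[n-1] + tau*prev[1][n]) / (1.0 + 2.0*tau))
        u2.append((u2[n-1] + tau*u1[n]) / (1.0 + 2.0*tau))
    return [u1, u2]

def err(u, ref):
    # discrete l2-error of the waveform iterate against the reference waveform
    return math.sqrt(sum((u[i][n] - ref[i][n])**2 for i in range(2) for n in range(N + 1)))

# reference solution: iterate the sweep until converged to working accuracy
ref = [[u0[0]]*(N + 1), [u0[1]]*(N + 1)]
for _ in range(200):
    ref = gs_wr_sweep(ref)

# measure the iteration convergence factors (3.55) and their geometric average
u = [[u0[0]]*(N + 1), [u0[1]]*(N + 1)]
factors, e_prev = [], err(u, ref)
for _ in range(12):
    u = gs_wr_sweep(u)
    e = err(u, ref)
    factors.append(e/e_prev)
    e_prev = e
avg = math.exp(sum(math.log(f) for f in factors[6:]) / len(factors[6:]))
print(0.0 < avg < 1.0)
```

The geometric average over the later sweeps plays the role of the averaged convergence factor used in the tables below.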
After a sufficiently large number of iterations, this factor takes a more or less constant value. The averaged convergence factor is then defined as the geometric average of these iteration convergence factors over the region of nearly constant behaviour. (As customary in the numerical community, we could also calculate the convergence factors from (3.55) with $e_\tau^{(\nu)}$ being replaced by the defect or residual of the ν-th discrete approximation. The obtained results and insights are very similar.)

In Table 3.4, we report observed averaged convergence factors for the one-dimensional variant of (3.47) with t ∈ [0, 1]. We used the CN method for time discretisation and take a small time step (τ = 1/1000) in order to approximate the continuous-time convergence
h                          1/8    1/16   1/32   1/64
finite differences         0.841  0.959  0.990  0.997
linear finite elements     0.841  0.959  0.990  0.997
quadratic finite elements  0.967  0.992  0.998  0.999
cubic finite elements      0.862  0.965  0.991  0.998

Table 3.4: Averaged convergence factors of Gauss–Seidel WR for (3.47) (m = 1, T = 1, CN, τ = 1/1000).

h                         1/4    1/8    1/16   1/32
finite differences        0.478  0.846  0.960  0.990
linear finite elements    0.478  0.845  0.960  0.990
bilinear finite elements  0.356  0.780  0.941  0.985

Table 3.5: Averaged convergence factors of Gauss–Seidel WR for (3.47) (m = 2, T = 1, CN, τ = 1/1000).
[Figure: two panels of successive iterates versus t ∈ [0, 1], vertical axis from 0.0 to 1.0; left panel (h = 1/8) shows ν = 0, 3, 6 and 30; right panel (h = 1/32) shows ν = 0, 50, 100 and 500.]

Figure 3.6: Successive Gauss–Seidel WR iterates $u^{(\nu)}(1/2, t)$ versus t for (3.47) (m = 1, linear finite elements).

multistep method             CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
averaged convergence factor  0.960  0.961   0.961   0.974   1.147   1.858

Table 3.6: Averaged convergence factors of Gauss–Seidel WR for (3.47) (m = 1, linear finite elements, h = 1/16, T = 10, τ = 1/100). Compare with Table 3.3.
multistep method          CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
linear finite elements    0.990  0.990   0.990   0.997   -       -
bilinear finite elements  0.985  0.985   0.985   0.996   -       -

Table 3.7: Averaged convergence factors of Gauss–Seidel WR for (3.47) (m = 2, h = 1/32, T = 1, τ = 1/200).

results. Similar results are given for our two-dimensional model heat flow problem in Table 3.5. The measured convergence factors (for the finite-difference and the one-dimensional linear finite-element case) closely match the ones that can be obtained by evaluation of the infinite-interval Gauss–Seidel spectral radii of Theorems 3.3.2 and 3.3.4. (For both discretisations, the latter are given by cos²(πh), yielding 0.854, 0.962, 0.990 and 0.998 for h = 1/8, h = 1/16, h = 1/32 and h = 1/64, respectively.) For the other spatial discretisations of (3.47), the WR convergence factors seem to satisfy a relation of the form

  ρ(𝒦^GS) ≈ 1 − O(h²)

too, although no explicit theoretical formulae were found. In Figure 3.6, we illustrate this mesh-size dependent convergence behaviour by means of the one-dimensional heat equation, discretised using linear finite elements. In particular, we have pictured successive iterates u^(ν) of the former experiments, evaluated in the middle of the discrete grid. Next, we investigate the influence of the chosen time-discretisation method upon WR convergence. In Table 3.6, we report observed convergence factors of the Gauss–Seidel WR method for the latter equation, discretised with mesh size h = 1/16. We solved the problem for t ∈ [0, 10] with time step τ = 1/100, and chose an oscillatory initial approximation of the solution in order to excite all possible error frequencies. We notice that the measured values correspond well to the theoretical infinite-interval spectral radii of Table 3.3. The dependence of the convergence of WR on the nature of the time-discretisation method is further illustrated in Table 3.7, where we reported averaged convergence factors for the two-dimensional variant of (3.47) with T = 1 and τ = 1/200. The dashes ("-") in the table indicate that the WR method showed divergence over a large number of iterations. The former results clearly indicate that even though the time interval in the experiments is finite, the observed averaged convergence factors (for long enough time windows) closely match the theoretical infinite-interval spectral radii. This fact, which is well known in the WR literature, is due to the nonnormality of the WR operators on finite intervals. Lumsdaine and Wu explained this theoretically in terms of the so-called pseudospectra of the WR operators [60]. For completeness, we shall clarify this relation between the finite-interval and infinite-interval spectral radii in another way. To this end, we solve the one-dimensional variant of (3.47), discretised using linear finite elements with h = 1/16, for t ∈ [0, 1]. (Note that a similar analysis could be done for larger time windows and/or for other model problems. It would lead to similar conclusions and insights.) We use BDF(2) and BDF(5) time discretisation with constant time step τ = 1/100. In Figure 3.7 successive convergence factors are plotted for the first 400 waveform Gauss–Seidel iterations, when BDF(2) discretisation is used. These factors
CHAPTER 3. BASIC WR METHODS
Figure 3.7: Convergence factors ρ^(ν) versus ν for (3.47) (m = 1, linear finite elements, h = 1/16, T = 1, BDF(2), τ = 1/100).
Figure 3.8: Time-level convergence factors ρ^(ν)[k] versus k for (3.47) (m = 1, linear finite elements, h = 1/16, BDF(2), τ = 1/100).

appear to remain more or less constant for a large number of iterations. The height of the plateau matches the value obtained in Table 3.3 for infinite time intervals, i.e., 0.962. Eventually, the plateau in Figure 3.7 is left, and the factors start to decrease. Ultimately, they start to rise again and reach the value 1. This is for purely technical reasons, because at that time the solution has converged within the finite-precision arithmetic of the implementation. A similar plot is given in Figure 3.9 for the BDF(5) discretisation. Here, the evolution is much more erratic. The results clearly indicate divergence for a large number of iterations. After a sufficient number of iterations, the convergence factors decrease below 1, and the iteration starts to converge. This behaviour can be explained by examining the time-level convergence factors. These factors are similar to the standard convergence factors (3.55), but are evaluated for each time level separately,
  ρ^(ν)[k] := ‖e^(ν)[k]‖ / ‖e^(ν−1)[k]‖ ,

with ‖·‖ the standard Euclidean vector norm. In Figure 3.8, we plotted such time-level convergence factors for the BDF(2) method (for ν = 10, ν = 100, ν = 200 and ν = 300). The factor measured at the first time level
is close to 0.548, the value predicted by the finite time interval analysis in Table 3.3. The convergence factors at the next time levels increase, and eventually become constant. The height of the plateau matches the spectral radius value for infinite time intervals. As more iterations are applied, the plateau is forced out of the time window and the corresponding convergence factors decrease. In Figure 3.10, we have plotted time-level convergence factors for the BDF(5) method (for ν = 1, ν = 5, ν = 50 and ν = 100). Again, we observe that the factor at the first time level corresponds to the value predicted by the finite time interval analysis (0.414). The pictures illustrate the onset of oscillations which rapidly explode. As more iterations are applied, the region of divergent behaviour moves to the right, and is forced out of the time window. From then on, the iteration converges rapidly.

Figure 3.9: Convergence factors ρ^(ν) versus ν for (3.47) (m = 1, linear finite elements, h = 1/16, T = 1, BDF(5), τ = 1/100).

Figure 3.10: Time-level convergence factors ρ^(ν)[k] versus k for (3.47) (m = 1, linear finite elements, h = 1/16, BDF(5), τ = 1/100).
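The interplay between finite-window and infinite-interval behaviour discussed above is easy to observe in a small experiment. The sketch below is our own illustration (a finite-difference analogue of the Table 3.6 setting; all implementation details are assumptions, not code from this thesis): it applies pointwise Gauss–Seidel WR with BDF(1) (backward Euler) time stepping to the one-dimensional heat equation and measures an averaged convergence factor, which for h = 1/16, T = 10 and τ = 1/100 should come out close to the infinite-interval value cos²(πh) ≈ 0.962.

```python
import numpy as np

# Pointwise Gauss-Seidel waveform relaxation for u_t = u_xx on (0,1):
# B = I, A = (1/h^2) tridiag(-1, 2, -1), backward Euler (BDF(1)) in time.
h, tau, T = 1/16, 1/100, 10.0
n, N = round(1/h) - 1, round(T/tau)      # interior points / time steps
d, o = 2/h**2, -1/h**2                   # diagonal / off-diagonal of A

def gs_wr_sweep(u_old, u0):
    """One Gauss-Seidel waveform sweep: each component solves a scalar ODE
    over the whole window, using new left and old right neighbour waveforms."""
    u_new = np.zeros_like(u_old)         # shape (n, N+1)
    zero = np.zeros(N + 1)
    for i in range(n):
        left = u_new[i-1] if i > 0 else zero
        right = u_old[i+1] if i < n - 1 else zero
        u_new[i, 0] = u0[i]
        for m in range(1, N + 1):        # backward Euler recursion per step
            rhs = -o*(left[m] + right[m])
            u_new[i, m] = (u_new[i, m-1]/tau + rhs)/(1/tau + d)
    return u_new

rng = np.random.default_rng(0)
u0 = np.array([(-1.0)**i for i in range(n)])      # oscillatory initial data
u_prev = rng.standard_normal((n, N + 1))          # rough initial waveform guess
u_prev[:, 0] = u0                                 # iterates keep u(0) = u0
u_curr = gs_wr_sweep(u_prev, u0)
ratios = []
for _ in range(20):
    u_next = gs_wr_sweep(u_curr, u0)
    ratios.append(np.linalg.norm(u_next - u_curr)/np.linalg.norm(u_curr - u_prev))
    u_prev, u_curr = u_curr, u_next
rho = float(np.exp(np.mean(np.log(ratios[-10:]))))  # averaged convergence factor
print(rho)   # should hover near cos^2(pi*h) ~ 0.962 (cf. Table 3.6, BDF(1))
```

Because the window T = 10 is long, the measured factors sit on the pseudospectral plateau for many sweeps, which is exactly the finite- versus infinite-interval effect described above.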
3.4 Some concluding remarks

In this chapter, we have generalised the basic WR convergence theory, originally developed for (3.1) with B = I [63, 64], to linear ODE systems with general B. The theory has
been applied to a model parabolic problem, discretised in space using finite differences and several finite-element variants, and checked against results obtained by numerical experiments. We observed the convergence behaviour of the Jacobi and Gauss–Seidel dynamic iterations to be similar to that of their static counterparts for the corresponding time-independent elliptic problem. Because of their poor convergence rates, the static Jacobi and Gauss–Seidel relaxation methods are in practice always used in combination with some acceleration technique. In the following chapters, we shall try to extend and adapt these acceleration techniques to the WR case.
Chapter 4

Waveform Relaxation Methods and Successive Overrelaxation

We discuss the use of SOR techniques in combination with WR and analyse the convergence properties of the resulting methods. These methods include both the splitting-based SOR waveform variant, defined (but not analysed) in Chapter 3, and the recent convolution-based SOR waveform iteration. In particular, a model problem analysis will show the latter method to be far superior to the former one, a fact that will be verified by means of extensive numerical experiments.
4.1 Description of the method

In this section, we classify the different existing SOR WR algorithms. We consider both the continuous-time and discrete-time case, as well as the pointwise and blockwise relaxation variants.
4.1.1 The continuous-time case
The most obvious way to define an SOR waveform method for systems of differential equations (3.1) is based on the natural extension of the static SOR procedure for algebraic equations, which can be found, e.g., in [4, 24, 98, 106]. First, a function û_i^(ν) is computed with a Gauss–Seidel WR scheme

  b_ii dû_i^(ν)/dt + a_ii û_i^(ν) = − Σ_{j=1}^{i−1} ( b_ij du_j^(ν)/dt + a_ij u_j^(ν) ) − Σ_{j=i+1}^{d} ( b_ij du_j^(ν−1)/dt + a_ij u_j^(ν−1) ) + f_i ,   (4.1)

with û_i^(ν)(0) = (u_0)_i. Next, the old approximation u_i^(ν−1) is updated by multiplying the correction û_i^(ν) − u_i^(ν−1) with a scalar overrelaxation parameter ω,

  u_i^(ν) = u_i^(ν−1) + ω ( û_i^(ν) − u_i^(ν−1) ) .   (4.2)
Elimination of the intermediate approximation û_i^(ν) from (4.1) and (4.2) leads to an iterative scheme of the form (3.2) with (M_B, N_B) and (M_A, N_A) the SOR matrix splittings of Table 3.1, i.e.,

  M_B = (1/ω) D_B − L_B ,   N_B = ((1−ω)/ω) D_B + U_B   (4.3)

and

  M_A = (1/ω) D_A − L_A ,   N_A = ((1−ω)/ω) D_A + U_A .   (4.4)
This method was briefly considered for ODE systems (3.1) with B = I in [82]. Its nonlinear variant was studied, e.g., in [2, 3]. The first step of the convolution SOR (CSOR) waveform relaxation method is similar to the first step of the previous scheme, and consists of the computation of a Gauss–Seidel iterate û_i^(ν) using (4.1). Instead of multiplying the resulting correction by a scalar ω, as in (4.2), the correction is convolved with a time-dependent function Ω [82],

  u_i^(ν) = u_i^(ν−1) + Ω ∗ ( û_i^(ν) − u_i^(ν−1) ) .   (4.5)
In this thesis, we will allow for fairly general convolution kernels of the form

  Ω(t) = ω δ(t) + ω_c(t)   (4.6)

with ω a scalar parameter, δ(t) the delta function, and ω_c ∈ L₁. In that case, (4.5) can be rewritten as

  u_i^(ν) = u_i^(ν−1) + ω ( û_i^(ν) − u_i^(ν−1) ) + ω_c ∗ ( û_i^(ν) − u_i^(ν−1) ) .   (4.7)

The latter equation corresponds to (4.2) with an additional correction based on a Volterra convolution. As such, the former standard SOR waveform relaxation method can be treated as a special case of the CSOR method by setting ω_c(t) ≡ 0.
Remark 4.1.1 The original SOR WR method, developed and analysed by Miekkala and Nevanlinna for systems of ODEs (3.1) with B = I [63, 64], differs from the standard SOR waveform method described above in that only the coefficient matrix A is split in an SOR manner. To distinguish between both SOR methods, we will further refer to the Miekkala–Nevanlinna method as the single-splitting SOR (SSSOR) waveform relaxation method, whereas the standard SOR method will be referred to as the double-splitting SOR (DSSOR) method. The SSSOR method, which can be cast into the framework of (3.2) by setting M_B = I, N_B = 0 and using (4.4), does not have any use for general ODE systems (3.1) with B ≠ I. Hence, in that case we will distinguish only between the CSOR method and its variant for ω_c(t) ≡ 0, i.e., the standard SOR or DSSOR method.

Remark 4.1.2 The pointwise SOR WR methods described above can be adapted easily to the blockwise relaxation case. Matrices B and A are then partitioned into similar systems of d_b × d_b rectangular blocks b_ij and a_ij. Correspondingly, D_B, L_B and U_B (and D_A, L_A and U_A) are then block-diagonal, block-lower-triangular and block-upper-triangular matrices.
4.1.2 The discrete-time case
The first step of the discrete-time CSOR WR algorithm is obtained by discretising (4.1) in time using a linear multistep formula, yielding

  Σ_{l=0}^{k} ( (1/τ) α_l b_ii + β_l a_ii ) û_i^(ν)[n+l]
    = − Σ_{j=1}^{i−1} Σ_{l=0}^{k} ( (1/τ) α_l b_ij + β_l a_ij ) u_j^(ν)[n+l]
      − Σ_{j=i+1}^{d_b} Σ_{l=0}^{k} ( (1/τ) α_l b_ij + β_l a_ij ) u_j^(ν−1)[n+l]
      + Σ_{l=0}^{k} β_l f_i[n+l] .   (4.8)
The second step approximates the convolution integral in (4.5) by a convolution sum with a discrete kernel Ω_τ,

  (u_i^(ν))_τ = (u_i^(ν−1))_τ + Ω_τ ∗ ( (û_i^(ν))_τ − (u_i^(ν−1))_τ ) .   (4.9)

We obtain from this a discrete-time analogue of (4.7) if we set Ω_τ = ω δ_τ + (ω_c)_τ, with δ_τ = {1, 0, 0, …} the discrete delta function [75, p. 409]. Equation (4.9) then becomes

  (u_i^(ν))_τ = (u_i^(ν−1))_τ + ω ( (û_i^(ν))_τ − (u_i^(ν−1))_τ ) + (ω_c)_τ ∗ ( (û_i^(ν))_τ − (u_i^(ν−1))_τ ) .   (4.10)

Note that the discrete convolution operator is bounded if Ω_τ (or (ω_c)_τ) is an l₁-sequence. Clearly, the discrete-time analogue to the standard SOR WR method is obtained by setting ω_c[n] ≡ 0. The latter can also be derived by discretising (3.2) in time and using the splittings (4.3) and (4.4). As before, we assume that we do not iterate on the k given starting values, i.e., û_i^(ν)[n] = u_i^(ν)[n] = u_i^(ν−1)[n] = u_i[n], n < k, while the discrete solvability condition (3.7) simplifies to

  α_k / (τ β_k) ∉ σ( −D_B^{−1} D_A )   (4.11)

in the case of (4.8).
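In code, the discrete update (4.10) is just a scalar relaxation step plus a truncated, causal convolution sum. The sketch below is our own illustration (the names are ours; any τ-weighting of the kernel samples is assumed absorbed into `wc`):

```python
import numpy as np

# Sketch of the discrete CSOR update (4.10): scale the correction by a scalar
# omega and additionally convolve it with a discrete kernel (omega_c)_tau.
def csor_update(u_old, u_hat, omega, wc):
    """u_old, u_hat: waveforms sampled at t_n = n*tau; wc: l1 kernel samples."""
    corr = u_hat - u_old
    conv = np.convolve(wc, corr)[:len(corr)]   # causal Volterra convolution sum
    return u_old + omega*corr + conv

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 2.0])
print(csor_update(x, y, 1.0, np.zeros(3)))     # -> [2. 2. 2.]
```

With `wc` identically zero, the update reduces to the standard (double-splitting) SOR step, consistent with the remark above.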
4.2 Convergence analysis

We will analyse the convergence properties of the SOR WR methods outlined in the previous section. We will consider the general case of blockwise relaxation, which includes pointwise relaxation as a limiting case. This analysis will follow the framework of Chapter 2 and extends and completes the results of [82] (for ODEs (3.1) with B = I) to systems with general, nonsingular B. We will concentrate on deriving the nature and the properties of the iteration operator of the CSOR WR method, as well as on the determination of the optimal convolution kernels. Most results for the standard (double-splitting) SOR waveform method follow immediately by setting ω_c(t) ≡ 0 or ω_c[n] ≡ 0. For the SOR waveform method with single splitting (which finds use only in the B = I case) we refer to [63, 64].
4.2.1 The continuous-time case
The convolution SOR waveform relaxation operator and its symbol

In order to identify the nature of the CSOR waveform iteration operator we need the following elementary result, which we formulate as a lemma.

Lemma 4.2.1 The solution to the ODE b u̇ + a u = q v̇ + p v + w, t > 0, with constants b, a, q, p ∈ C^{m×m} and b nonsingular, is given by u = 𝒦v + φ, where

  𝒦x = b^{−1} q x + k_c ∗ x   with   k_c(t) = e^{−b^{−1}a t} b^{−1} ( p − a b^{−1} q ) ,

and

  φ(t) = e^{−b^{−1}a t} ( u(0) − b^{−1} q v(0) ) + ∫₀^t e^{−b^{−1}a (t−s)} b^{−1} w(s) ds .

The function k_c ∈ L₁(0, ∞) if Re(λ) > 0 for all λ ∈ σ(b^{−1}a). If, in addition, w ∈ L_p(0, ∞), then φ ∈ L_p(0, ∞).

Proof. The lemma is an immediate consequence of the well-known solution formula for the ODE (1.7), see, e.g., [15, p. 119]. □
The CSOR WR algorithm implicitly defines a classical successive iteration scheme of the form u^(ν) = 𝒦^CSOR u^(ν−1) + φ^CSOR, where φ^CSOR depends on the ODE's right-hand side f and the initial condition, and where 𝒦^CSOR denotes a linear operator, which we name the continuous-time convolution SOR waveform relaxation operator. The nature of this operator is identified in the next lemma.

Lemma 4.2.2 The continuous-time convolution SOR waveform relaxation operator is of the form

  𝒦^CSOR = K_∞^CSOR + 𝒦_c^CSOR   (4.12)

with K_∞^CSOR ∈ C^{d×d} and 𝒦_c^CSOR a linear Volterra convolution operator, whose matrix-valued kernel k_c^CSOR ∈ L₁(0, ∞) if all eigenvalues of D_B^{−1} D_A have positive real parts and if Ω is of the form (4.6) with ω_c ∈ L₁(0, ∞).

Proof. Introductory Lemma 4.2.1 can be applied to equation (4.1) to give

  û_i^(ν) = Σ_{j=1}^{i−1} ( H_ij + (h_c)_ij ∗ ) u_j^(ν) + Σ_{j=i+1}^{d_b} ( H_ij + (h_c)_ij ∗ ) u_j^(ν−1) + φ_i .   (4.13)

The (matrix) constants H_ij and (matrix) functions (h_c)_ij can be derived from the result of Lemma 4.2.1 with b = b_ii, a = a_ii, q = −b_ij and p = −a_ij. Note that (h_c)_ij ∈ L₁(0, ∞) if Re(λ) is positive for all λ ∈ σ(b_ii^{−1} a_ii), while φ_i ∈ L_p(0, ∞) if, in addition, f_i ∈ L_p(0, ∞). We shall prove the existence of constants K_ij^CSOR and L₁-functions (k_c^CSOR)_ij, i, j = 1, 2, …, d_b, such that

  u_i^(ν) = Σ_{j=1}^{d_b} K_ij^CSOR u_j^(ν−1) + Σ_{j=1}^{d_b} (k_c^CSOR)_ij ∗ u_j^(ν−1) + φ_i' .   (4.14)

The case i = 1 follows immediately from the combination of (4.13) with (4.7). More precisely, K_11^CSOR = (1 − ω) I, K_1j^CSOR = ω H_1j, (k_c^CSOR)_11 = −ω_c I, (k_c^CSOR)_1j = ω (h_c)_1j + ω_c ∗ ( H_1j + (h_c)_1j ), 2 ≤ j ≤ d_b, and φ_1' = Ω ∗ φ_1, with I the identity matrix whose dimension equals the dimension of b_11 and a_11. The general case follows by induction on i. This step involves computing u_i^(ν) from (4.13) and (4.7) and using (4.14) to substitute u_j^(ν), j < i. The result then follows from the knowledge that linear combinations and convolutions of L₁-functions are in L₁(0, ∞). □
The symbol K^CSOR(z) of the continuous-time CSOR WR operator is obtained after Laplace transforming the iterative scheme (4.1)–(4.5). This gives the equation ũ^(ν)(z) = K^CSOR(z) ũ^(ν−1)(z) + φ̃(z) in Laplace-transform space, with

  K^CSOR(z) = [ z ( (1/Ω̃(z)) D_B − L_B ) + ( (1/Ω̃(z)) D_A − L_A ) ]^{−1} [ z ( ((1−Ω̃(z))/Ω̃(z)) D_B + U_B ) + ( ((1−Ω̃(z))/Ω̃(z)) D_A + U_A ) ]   (4.15)

and Ω̃(z) = ω + ω̃_c(z). With reference to (4.12), we can also write (4.15) as K^CSOR(z) = K_∞^CSOR + K_c^CSOR(z), with K_c^CSOR(z) = ℒ( k_c^CSOR(t) ).
Convergence on finite time intervals

The following theorem follows immediately from Lemma 2.2.1, applied to 𝒦^CSOR, taking into account that ω_c is an L₁-function, and, consequently, lim_{z→∞} ω̃_c(z) = 0.

Theorem 4.2.3 Consider 𝒦^CSOR as an operator in C[0, T]. Then, 𝒦^CSOR is bounded and

  ρ(𝒦^CSOR) = ρ( K^CSOR(∞) ) = ρ( K_∞^CSOR ) .   (4.16)

If Ω is of the form (4.6) with ω_c ∈ L₁(0, T), we have

  K_∞^CSOR = ( (1/ω) D_B − L_B )^{−1} ( ((1−ω)/ω) D_B + U_B ) .   (4.17)

As a result, we have shown that the asymptotic convergence behaviour of the iteration on a finite time window is independent of the L₁-function ω_c.

Convergence on infinite time intervals

If all eigenvalues of D_B^{−1} D_A have positive real parts, Lemma 4.2.2 implies that the convolution kernel of 𝒦^CSOR belongs to L₁(0, ∞). Consequently, we can apply Lemmas 2.2.3 and 2.2.4 in order to obtain the following convergence theorem.
Theorem 4.2.4 Consider 𝒦^CSOR as an operator in L_p(0, ∞) with 1 ≤ p ≤ ∞. Assume all eigenvalues of D_B^{−1} D_A have positive real parts, and let Ω be of the form (4.6) with ω_c ∈ L₁(0, ∞). Then, 𝒦^CSOR is bounded and

  ρ(𝒦^CSOR) = sup_{Re(z)≥0} ρ( K^CSOR(z) ) = sup_{ξ∈ℝ} ρ( K^CSOR(iξ) ) .   (4.18)

Furthermore, we have

  ‖𝒦^CSOR‖_{L₂(0,∞)} = sup_{Re(z)≥0} ‖K^CSOR(z)‖ = sup_{ξ∈ℝ} ‖K^CSOR(iξ)‖ ,   (4.19)

where ‖·‖ denotes the matrix norm induced by the standard Euclidean vector norm.
Remark 4.2.1 In Theorem 4.2.4, we require Ω̃(z) to be the Laplace transform of a function of the form (4.6) with ω_c ∈ L₁(0, ∞). For this, a sufficient (but not necessary) condition is that Ω̃(z) is a bounded and analytic function in an open domain containing the closed right half of the complex plane [41, Prop. 2.3].
The Laplace transform of the optimal convolution kernel

At the heart of classical SOR theories for determining the optimal overrelaxation parameter usually lies the existence of a Young relation, i.e., a relation between the eigenvalues of the Jacobi and the SOR iteration matrices. The following lemma reveals the existence of such a relation between the eigenvalues of the CSOR symbol K^CSOR(z) and the eigenvalues of the Jacobi symbol (3.22). It uses the notion of a block-consistent ordering, for which we refer to [106, p. 445, Def. 3.2].

Lemma 4.2.5 Assume the matrices B and A are such that zB + A is a block-consistently ordered matrix with nonsingular diagonal blocks, and Ω̃(z) ≠ 0. If μ(z) is an eigenvalue of K^JAC(z) and λ(z) satisfies

  ( λ(z) + Ω̃(z) − 1 )² = λ(z) ( Ω̃(z) μ(z) )² ,   (4.20)

then λ(z) is an eigenvalue of K^CSOR(z). Conversely, if λ(z) ≠ 0 is an eigenvalue of K^CSOR(z) which satisfies (4.20), then μ(z) is an eigenvalue of K^JAC(z).

Proof. Using the shorthands D⋆ = z D_B + D_A, L⋆ = z L_B + L_A and U⋆ = z U_B + U_A, we can write (3.22) and (4.15) as

  K^JAC(z) = (D⋆)^{−1} ( L⋆ + U⋆ ) ,
  K^CSOR(z) = ( D⋆ − Ω̃(z) L⋆ )^{−1} ( (1 − Ω̃(z)) D⋆ + Ω̃(z) U⋆ ) .

Thus, if Ω̃(z) is viewed as a complex overrelaxation parameter, then K^JAC(z) and K^CSOR(z) are the standard Jacobi and SOR iteration matrices for the matrix zB + A = D⋆ − L⋆ − U⋆. The result of the lemma then follows immediately from classical SOR theory, see, e.g., [98, Thm. 4.3] or [106, p. 451, Thm. 3.4]. □
Under the assumption of a block-consistent ordering of zB + A, we have that if μ(z) is an eigenvalue of K^JAC(z) then so is −μ(z) [106, p. 451, Thm. 3.4]. Therefore, if μ(z) is an eigenvalue of K^JAC(z) and λ(z) satisfies (4.20), we can choose

  λ(z) + Ω̃(z) − 1 = √λ(z) Ω̃(z) μ(z) .   (4.21)

The next lemma determines the optimal (complex) value Ω̃(z) which minimises the spectral radius of K^CSOR(z) for a given value of z. The result follows immediately from complex SOR theory, see, e.g., [46, Thm. 4.1] or [70, Eq. (9.19)]. It was rediscovered in [82, Thm. 5.2], and presented there in a WR context for the B = I case.
Lemma 4.2.6 Assume the matrices B and A are such that zB + A is a block-consistently ordered matrix with nonsingular diagonal blocks. Assume the spectrum σ( K^JAC(z) ) lies on a line segment [−μ₁(z), μ₁(z)] with μ₁(z) ∈ ℂ ∖ { (−∞, −1] ∪ [1, ∞) }. The spectral radius of K^CSOR(z) is then minimised for a given value of z by the unique optimum Ω̃_opt(z), given by

  Ω̃_opt(z) = 2 / ( 1 + √( 1 − μ₁²(z) ) ) ,   (4.22)

where √· denotes the root with the positive real part. In particular,

  ρ( K^CSOR,opt(z) ) = | Ω̃_opt(z) − 1 | < 1 ,   (4.23)

where the superscript "opt" is used to indicate that ρ( K^CSOR,opt(z) ) equals the spectral radius of (4.15) with Ω̃(z) = Ω̃_opt(z).
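For a real spectrum (the z = 0 case), (4.22) reduces to the classical optimal SOR parameter ω_opt = 2/(1 + √(1 − μ₁²)), and (4.23) then gives ρ = ω_opt − 1. The following sketch (our own illustration, using a consistently ordered model matrix) verifies this numerically:

```python
import numpy as np

# Young-relation check at z = 0: for A = tridiag(-1, 2, -1) (consistently
# ordered), the SOR matrix built with the optimum from (4.22) has spectral
# radius omega_opt - 1, as predicted by (4.23).
n = 8
A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.diag(A))
L, U = -np.tril(A, -1), -np.triu(A, 1)
mu1 = max(abs(np.linalg.eigvals(np.linalg.inv(D) @ (L + U))))  # Jacobi radius
w = 2/(1 + np.sqrt(1 - mu1**2))                                # (4.22) at z = 0
sor = np.linalg.inv(D - w*L) @ ((1 - w)*D + w*U)
rho = max(abs(np.linalg.eigvals(sor)))
print(np.allclose(rho, w - 1, atol=1e-4))                      # -> True
```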
Remark 4.2.2 Following Kredell in [46, Lemma 4.1], the condition on the collinearity of the eigenvalues of the Jacobi symbol can be weakened. For (4.22) to hold, it is sufficient that there exists a critical eigenvalue pair ±μ₁(z) of K^JAC(z), i.e., for all values of the overrelaxation parameter the pair ±μ₁(z) corresponds to the dominant eigenvalue of K^CSOR(z). The existence of such a pair may be difficult to verify in practice, though.

Remark 4.2.3 The results of the above lemmas rely on the assumption of a block-consistent ordering of the matrix zB + A. In the case of semi-discrete parabolic PDEs in two dimensions, this assumption is, for example, satisfied if B = I, A corresponds to a five-point finite-difference stencil and the relaxation is pointwise with lexicographic, diagonal or red-black ordering of the grid points. It is also satisfied if B and A correspond to a nine-point stencil (a discretisation with linear or bilinear finite elements on a regular triangular or rectangular mesh, for example) and the relaxation is linewise.

Remark 4.2.4 The assumption that the eigenvalues of the Jacobi symbol are on a line is rather severe. Yet, it is satisfied for certain important classes of problems. For example, this is so when B = I and A is a real symmetric positive definite consistently ordered matrix with constant positive diagonal D_A = d_a I (d_a > 0). In that case the spectrum of K^JAC(z) equals (d_a/(z + d_a)) σ( K^JAC(0) ). It is the spectrum of the standard Jacobi iteration matrix, which is real and has maximum smaller than one [106, p. 147, Thm. 3.5], scaled and rotated around the origin. If the assumptions of the theorem are violated, a more
general complex SOR theory, allowing the eigenvalues of the Jacobi symbol to lie in a certain ellipse, should be used, see, e.g., [32] or Section 4.2.4. Alternatively, one could decide to continue to use (4.22). Although one then gives up optimality, this practice may lead to a good enough convergence, as is illustrated in Section 4.3.2.

Remark 4.2.5 If μ₁(z) ∈ ℂ ∖ { (−∞, −1] ∪ [1, ∞) } and μ₁(z) is analytic for Re(z) ≥ 0, then Ω̃_opt(z) is bounded and analytic for Re(z) ≥ 0. According to Remark 4.2.1, Theorem 4.2.4 may then be applied to calculate the spectral radius of the CSOR operator with optimal kernel, ρ(𝒦^CSOR,opt). In general, however, μ₁(z) is only known to be piecewise analytic, with possible discontinuities or lack of smoothness where the maximum switches from one eigenvalue branch to another. In that case, one might opt for using (4.22) with an analytic function μ₁(z) that approximates the "correct" one at certain values of z (e.g., near the values that correspond to the slowest converging error components).
Optimal pointwise overrelaxation in the B = I-case
In this section, we will study the optimal variants of the three different pointwise SOR methods for ODE systems of the form (3.1) with B = I. The resulting formulae will be applied to a model problem in Section 4.3. We first recall the analytical expression of the spectral radius of the single-splitting SOR WR operator 𝒦^SSSOR, as presented by Miekkala and Nevanlinna in [63, Thm. 4.2].

Theorem 4.2.7 Consider an ODE system (3.1) with B = I. Assume A is a consistently ordered matrix with constant positive diagonal D_A = d_a I (d_a > 0), the eigenvalues of K^JAC(0) are real with μ₁ = ρ( K^JAC(0) ) < 1, and 0 < ω < 2. Then, if we consider 𝒦^SSSOR as an operator in L_p(0, ∞) with 1 ≤ p ≤ ∞, we have

  ρ(𝒦^SSSOR) = 1 − ω + (ωμ₁)²/2 + ωμ₁ √( 1 − ω + (ωμ₁)²/4 ) ,   ω ≤ ω_d ,
  ρ(𝒦^SSSOR) = 8(ω−1)² / ( 8(ω−1) − (ωμ₁)² ) ,                  ω > ω_d ,   (4.24)

where ω_d = (4/3)( 2 − √(4 − 3μ₁²) )/μ₁². Furthermore, we have ω_opt = 4/(4 − μ₁²) > ω_d.

Based on Lemma 4.2.5 we can derive an analogous expression for the spectral radius of the double-splitting SOR WR operator 𝒦^DSSOR.

Theorem 4.2.8 Consider an ODE system (3.1) with B = I. Assume A is a consistently ordered matrix with constant positive diagonal D_A = d_a I (d_a > 0), the eigenvalues of K^JAC(0) are real with μ₁ = ρ( K^JAC(0) ) < 1, and 0 < ω < 2. Then, if we consider 𝒦^DSSOR as an operator in L_p(0, ∞) with 1 ≤ p ≤ ∞, we have

  ρ(𝒦^DSSOR) = 1 − ω + (ωμ₁)²/2 + ωμ₁ √( 1 − ω + (ωμ₁)²/4 ) ,              ω ≤ ω_d ,
  ρ(𝒦^DSSOR) = (ω−1) ( 1 + ωμ₁/(4√(ω−1)) ) / ( 1 − ωμ₁/(4√(ω−1)) ) ,       ω > ω_d ,   (4.25)

where ω_d = ( 4 − 2√(4 − 2μ₁²) )/μ₁². Furthermore, we have that ω_opt > ω_d.
Proof. Since B = I and D_A = d_a I, we have that

  σ( K^JAC(z) ) = ( d_a/(z + d_a) ) σ( K^JAC(0) ) .   (4.26)

In order to apply Lemma 4.2.5 to the double-splitting case, we note that here Ω̃(z) = ω and that λ(z) is an eigenvalue of the DSSOR symbol K^DSSOR(z), which is given by

  K^DSSOR(z) = ( (z/ω) I + (1/ω) D_A − L_A )^{−1} ( ((1−ω)z/ω) I + ((1−ω)/ω) D_A + U_A ) .

With μ(0) denoting an arbitrary eigenvalue of K^JAC(0), we can rewrite (4.21) as

  z/d_a = −1 + √λ(z) ω μ(0) / ( λ(z) + ω − 1 ) .

We get equilibrium lines for |λ(z)|,

  z(t)/d_a = −1 + √|λ(z)| ω μ(0) / ( |λ(z)| e^{it/2} + (ω−1) e^{−it/2} ) ,   (4.27)

by setting λ(z) = |λ(z)| e^{it} with t varying from 0 to 4π. The supremum of |λ(z)| along the imaginary axis is attained at a point where such an equilibrium line osculates the imaginary axis, i.e., when Re(z(t)) = 0 and Re(ż(t)) = 0. In addition, we note that the eigenvalues of K^JAC(z) are collinear, which implies that K^JAC(z) has a critical eigenvalue pair [46, Lemma 4.1]. According to Remark 4.2.2, the dominant eigenvalue of K^DSSOR(z) is then obtained by replacing μ(0) by μ₁ in (4.27). This yields for Re(z(t)) = 0 the following condition

  4|λ(z)|(ω−1) cos²(t/2) − √|λ(z)| ωμ₁ ( |λ(z)| + ω − 1 ) cos(t/2) + ( |λ(z)| − ω + 1 )² = 0 ,   (4.28)

while Re(ż(t)) = 0 gives

  ( 4|λ(z)|(ω−1) cos(t/2) − ½ √|λ(z)| ωμ₁ ( |λ(z)| + ω − 1 ) ) sin(t/2) = 0 .   (4.29)

If sin(t/2) = 0, the osculation with the imaginary axis occurs at the origin and the corresponding largest value of |λ(0)| equals the spectral radius of the static SOR method [98, 106], which, for ω not larger than the optimal parameter ω_opt^stat = 2/(1 + √(1 − μ₁²)) of the latter method, is given by

  |λ(0)| = 1 − ω + (ωμ₁)²/2 + ωμ₁ √( 1 − ω + (ωμ₁)²/4 ) .   (4.30)

If sin(t/2) ≠ 0, the osculation is at a complex point z = iξ, ξ ≠ 0 (and by symmetry at z = −iξ). The corresponding value of |λ(z)| is obtained by eliminating cos(t/2) from (4.28) and (4.29). This gives the equation

  |λ(z)|² + ( ( −32(ω−1)² − 2ω²μ₁²(ω−1) ) / ( 16(ω−1) − ω²μ₁² ) ) |λ(z)| + (ω−1)² = 0 ,
whose largest solution for ω > 1 equals
  |λ(z)| = (ω−1) ( 1 + ωμ₁/(4√(ω−1)) ) / ( 1 − ωμ₁/(4√(ω−1)) ) .   (4.31)

In order to determine the range of validity of this result, we need in (4.29) to specify the condition that −1 < cos(t/2) < 1. This is a condition on ω which, when combined with (4.31), leads to ω > ω_d with ω_d as given in the formulation of the theorem. It turns out that 1 < ω_d < ω_opt^stat and that (4.31) is larger than (4.30) for ω_d < ω ≤ ω_opt^stat. Hence, the proof is completed by combining the latter two expressions. □

Finally, we investigate the spectral radius of 𝒦^CSOR,opt, the CSOR waveform operator with optimal kernel.

Theorem 4.2.9 Consider an ODE system (3.1) with B = I. Assume A is a consistently ordered matrix with constant positive diagonal D_A = d_a I (d_a > 0) and the eigenvalues of K^JAC(0) are real with μ₁ = ρ( K^JAC(0) ) < 1. Then, if we consider 𝒦^CSOR,opt as an operator in L_p(0, ∞) with 1 ≤ p ≤ ∞, we have

  ρ(𝒦^CSOR,opt) = ρ( K^CSOR,opt(0) ) = μ₁² / ( 1 + √(1 − μ₁²) )² .   (4.32)

Proof. Under the assumptions of the theorem we have (4.26) and

  μ₁(z) = ( d_a/(z + d_a) ) μ₁ ,   (4.33)

which implies that the eigenvalues of K^JAC(z) lie on a line segment [−μ₁(z), μ₁(z)] with μ₁(z) ∈ ℂ ∖ { (−∞, −1] ∪ [1, ∞) }. Hence, we may apply Lemma 4.2.6 in order to derive the optimum complex overrelaxation parameter Ω̃_opt(z). Since the resulting function, which equals

  Ω̃_opt(z) = 2 / ( 1 + √( 1 − ( d_a μ₁/(z + d_a) )² ) ) ,   (4.34)

is bounded and analytic in the complex right-half plane, including the imaginary axis, we know from Remark 4.2.1 that it is the Laplace transform of a function of the form (4.6), with ω_c ∈ L₁(0, ∞). Thus, we may apply Theorem 4.2.4, which, when combined with (4.23), yields

  ρ(𝒦^CSOR,opt) = sup_{ξ∈ℝ} ρ( K^CSOR,opt(iξ) ) = sup_{ξ∈ℝ} | Ω̃_opt(iξ) − 1 | = sup_{ξ∈ℝ} |μ₁(iξ)|² / | 1 + √(1 − μ₁²(iξ)) |² .

Since the numerator is maximal for ξ = 0, and the denominator is minimal for ξ = 0, the latter supremum is obtained for ξ = 0. This completes the proof. □

Using formulae (4.24), (4.25) and (4.32), we can compute the spectral radii of the optimal single-splitting, double-splitting and convolution SOR WR methods as a function of μ₁. These values are presented in Table 4.1, together with the values of the optimal parameter ω_opt for the SSSOR and DSSOR methods. Observe, in addition, that the convergence behaviour of the optimal pointwise CSOR WR method is similar to that of its static counterpart, as shown in the following corollary.
  μ₁                   0.9             0.95            0.975           0.9875
  ρ(𝒦^SSSOR, ω_opt)    0.681 (1.2539)  0.822 (1.2913)  0.906 (1.3117)  0.952 (1.3224)
  ρ(𝒦^DSSOR, ω_opt)    0.750 (1.1374)  0.867 (1.1539)  0.932 (1.1626)  0.965 (1.1671)
  ρ(𝒦^CSOR,opt)        0.393           0.524           0.636           0.728

Table 4.1: Spectral radii of optimal single-splitting (SSSOR), double-splitting (DSSOR) and convolution (CSOR) SOR WR for problems (3.1) with B = I and D_A = d_a I that satisfy the assumptions of Theorems 4.2.7–4.2.9. The value of the optimal parameter ω_opt is given in parentheses.
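The first column of Table 4.1 can be recomputed directly from (4.24), (4.25) and (4.32). The sketch below (our own illustration) does so for μ₁ = 0.9, locating the DSSOR optimum by a simple grid search since no closed form for its ω_opt is stated:

```python
import math

mu1 = 0.9   # spectral radius of the static Jacobi iteration matrix

# SSSOR (Theorem 4.2.7): omega_opt = 4/(4 - mu1^2) lies above omega_d,
# so the second branch of (4.24) applies.
w = 4/(4 - mu1**2)
rho_sssor = 8*(w - 1)**2/(8*(w - 1) - (w*mu1)**2)

# DSSOR (Theorem 4.2.8): evaluate (4.25) and minimise over omega on a grid.
def rho_dssor(w):
    wd = (4 - 2*math.sqrt(4 - 2*mu1**2))/mu1**2
    if w <= wd:                                   # static-SOR branch
        s = math.sqrt(1 - w + (w*mu1)**2/4)
        return 1 - w + (w*mu1)**2/2 + w*mu1*s
    q = w*mu1/(4*math.sqrt(w - 1))                # complex-osculation branch
    return (w - 1)*(1 + q)/(1 - q)
rho_ds = min(rho_dssor(1 + k/10000) for k in range(1, 5000))

# CSOR with optimal kernel (Theorem 4.2.9, formula (4.32)).
rho_csor = mu1**2/(1 + math.sqrt(1 - mu1**2))**2

print(round(rho_sssor, 3), round(rho_ds, 3), round(rho_csor, 3))
# -> 0.681 0.75 0.393  (cf. the first column of Table 4.1)
```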
Corollary 4.2.10 Under the assumptions of Theorem 4.2.9, optimal CSOR WR for the ODE system u̇ + Au = f attains the same asymptotic convergence rate as the optimal static SOR method for the linear system Au = f.

Proof. As the maximum of ρ( K^CSOR,opt(z) ) is found at the complex origin, the statement follows from the fact that ω_opt^stat = Ω̃_opt(0) = 2/(1 + √(1 − μ₁²)). □
Moreover, we have the following explicit expression for the optimal convolution kernel.
Theorem 4.2.11 Under the assumptions of Theorem 4.2.9, we have that Ω_opt(t) = δ(t) + (ω_c)_opt(t) with (ω_c)_opt ∈ L₁(0, ∞). In particular,

  (ω_c)_opt(t) = (2/t) e^{−d_a t} I₂( μ₁ d_a t ) ,   (4.35)

with I₂(·) the second-order modified Bessel function of the first kind.

Proof. From the proof of Theorem 4.2.9, we have that Ω_opt is of the form (4.6) with ω_c ∈ L₁(0, ∞). In addition, it follows from (4.34) that ω = lim_{z→∞} Ω̃_opt(z) = 1. The correctness of the analytic expression (4.35) can be checked as an elementary exercise by using the Laplace-transform pairs ℒ( δ(t) ) = 1 and

  ℒ( (2/t) e^{−(a+b)t/2} I₂( (a−b)t/2 ) ) = (a − b)² / ( √(z+a) + √(z+b) )⁴ ,

the latter of which can be found in [1, Eq. (29.3.53)]. □
Remark 4.2.6 Observe that since ω = 1, the matrix multiplication part of K^CSOR_opt equals

  K̄^CSOR_opt = lim_{z→∞} K^CSOR_opt(z) = 0.

Hence, in this case, Lemma 4.2.2 implies that K^CSOR_opt is just a convolution operator with an L₁-kernel. The shape of the function (ω_c)_opt is characterised by the properties given in the following corollary.
CHAPTER 4. WR METHODS AND SOR
Figure 4.1: (ω_c)_opt(t) versus t for (3.47) (m = 1, finite differences). The different curves correspond to mesh size h = 1/8 (solid), h = 1/16 (dotted), h = 1/32 (dashed) and h = 1/64 (dash-dotted) respectively.
Corollary 4.2.12 Under the assumptions of Theorem 4.2.9, (ω_c)_opt, given by (4.35), satisfies the following properties:

  i)   (ω_c)_opt(0) = 0;
  ii)  0 ≤ (ω_c)_opt(t) < μ₁ d_a e^{−(1−μ₁) d_a t};
  iii) μ₁² d_a / (4e) < max_{t≥0} (ω_c)_opt(t) < μ₁ d_a;
  iv)  ∫₀^∞ (ω_c)_opt(t) dt = μ₁² / (1 + √(1 − μ₁²))².

Proof. A series expression for the modified Bessel function I₂(·) can be found in [1, Eq. (9.6.10)]. It reads

  I₂(t) = (t²/4) Σ_{k=0}^∞ (t²/4)^k / (k!(k+2)!),

from which we derive

  (ω_c)_opt(t) = μ₁ d_a e^{−d_a t} Σ_{k=0}^∞ (μ₁ d_a t)^{2k+1} / (2^{2k+1} k!(k+2)!).   (4.36)

Property i and the positivity of (ω_c)_opt result. Formula (4.36) can be written as

  (ω_c)_opt(t) = μ₁ d_a e^{−d_a t} Σ_{k=0}^∞ [ (2k+1)! / (2^{2k+1} k!(k+2)!) ] · (μ₁ d_a t)^{2k+1} / (2k+1)!.

Since the coefficients (2k+1)!/(2^{2k+1} k!(k+2)!) are (strictly) smaller than 1 for k ≥ 0, we have

  (ω_c)_opt(t) < μ₁ d_a e^{−d_a t} e^{μ₁ d_a t} = μ₁ d_a e^{−(1−μ₁) d_a t},   (4.37)

which proves the upper bound in Property ii. We now truncate series (4.36) after the first term to get

  μ₁ d_a e^{−d_a t} (μ₁ d_a t)/4 ≤ (ω_c)_opt(t),   (4.38)

with equality only for t = 0. Calculation of the maxima of the upper and lower bounds (4.37) and (4.38) over t ≥ 0 leads to Property iii. Finally, by definition of the Laplace transform, we have Θ̃_opt(0) = 1 + ∫₀^∞ (ω_c)_opt(t) dt, leading immediately to Property iv. □
When the system of ODEs is derived by spatial discretisation of a parabolic PDE, μ₁ is often close to one. The characteristics of the optimal kernel are then largely determined by the parameter d_a, whose value is often rapidly increasing with decreasing mesh spacing. In that case, (ω_c)_opt is a positive function, which starts from 0 at t = 0 and has an area that is bounded by one. Its maximum is proportional to d_a; hence it is large for small h, while the function decreases exponentially for sufficiently large t. As an example, we will illustrate these implications of Corollary 4.2.12 for the one-dimensional heat equation (3.47), discretised using finite differences. The resulting ODE system (3.1), with B = I, d_a = 2/h² and μ₁ = cos(πh), satisfies the conditions of Theorem 4.2.11. Figure 4.1 shows a logarithmic plot of (ω_c)_opt(t) for t ∈ [10⁻⁵, 1] and for several values of the mesh size h. Note that its maximum increases and is attained at a smaller t-value for decreasing h, while, for sufficiently large t, the value of the optimal kernel rapidly approaches 0. Consequently, we may expect the use of a truncated kernel Θ(t), defined by Θ_opt(t) for t ≤ T and by 0 for t > T for some large enough T, to lead to nearly optimal convergence results.
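The properties above are easy to check numerically. The following sketch (plain Python, no external libraries; the helper name csor_kernel is ours) evaluates the optimal kernel through the series form (4.36) and verifies Properties i–iv of Corollary 4.2.12 for the finite-difference model problem with h = 1/8:

```python
import math

def csor_kernel(t, da, mu1):
    """Optimal CSOR convolution kernel (4.35), evaluated via the series (4.36):
    (omega_c)_opt(t) = mu1*da*exp(-da*t) * sum_{k>=0} (mu1*da*t)^(2k+1)/(2^(2k+1) k! (k+2)!)."""
    if t == 0.0:
        return 0.0                      # Property i
    half = 0.5 * mu1 * da * t           # x/2 with x = mu1*da*t
    term = 0.5 * half                   # k = 0 term: (x/2)^1 / (0! 2!)
    total, k = term, 0
    while term > 1e-17 * total:
        term *= half * half / ((k + 1) * (k + 3))   # ratio term_{k+1}/term_k
        total += term
        k += 1
    return mu1 * da * math.exp(-da * t) * total

# model problem (3.47), finite differences, h = 1/8
h = 1.0 / 8.0
da, mu1 = 2.0 / h**2, math.cos(math.pi * h)

# Property iv: area of the kernel, by composite Simpson on [0, 2]
# (the kernel decays like exp(-(1-mu1)*da*t), so the cut-off tail is negligible)
n, T = 4000, 2.0
step = T / n
vals = [csor_kernel(i * step, da, mu1) for i in range(n + 1)]
area = step / 3.0 * (vals[0] + vals[-1]
                     + 4.0 * sum(vals[1:-1:2]) + 2.0 * sum(vals[2:-1:2]))
print(area, mu1**2 / (1.0 + math.sqrt(1.0 - mu1**2))**2)
```

The term recurrence avoids explicit factorials and floating-point overflow; the two printed numbers agree to quadrature accuracy, in line with Property iv.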
4.2.2 The discrete-time case
The convolution SOR waveform relaxation operator and its symbol

Since we do not iterate on the k starting values, we use the shifted subscript notation of (3.32) for sequences u_τ of which the initial values u[n], n < k, are known. The discrete-time version of Lemma 4.2.1 reads as follows.
Lemma 4.2.13 The solution to the difference equation

  Σ_{l=0}^{k} ( (α_l/τ) b + β_l a ) u[n+l] = Σ_{l=0}^{k} ( (α_l/τ) q + β_l p ) v[n+l] + Σ_{l=0}^{k} β_l w[n+l],   n ≥ 0,

with b, a, q, p ∈ C^{m×m} and b nonsingular, is given by u_τ = K_τ v_τ + φ_τ, with K_τ a discrete convolution operator with kernel k_τ, and φ_τ depending on w[l], l ≥ 0, and the initial values u[l], v[l], l = 0, 1, …, k−1. The sequence k_τ ∈ l₁(∞) if τσ(−b⁻¹a) ⊂ int S. If, in addition, w_τ ∈ l_p(∞), then φ_τ ∈ l_p(∞).
Proof. The proof of the lemma is based on a Z-transform argument and the use of Wiener's inversion theorem for discrete l₁-sequences. It is analogous to the proof of Lemma 3.2.12. □
The discrete-time CSOR scheme can be written as a classical successive approximation method, u_τ^{(ν)} = K_τ^{CSOR} u_τ^{(ν−1)} + φ_τ. Here, φ_τ is a sequence which depends on the discrete right-hand side f_τ and the initial conditions, while the nature of the discrete-time convolution SOR waveform relaxation operator K_τ^{CSOR} is identified below.

Lemma 4.2.14 The discrete-time convolution SOR waveform relaxation operator K_τ^{CSOR} is a discrete convolution operator, whose matrix-valued kernel k_τ^{CSOR} ∈ l₁(∞) if τσ(−D_B⁻¹D_A) ⊂ int S and Θ_τ ∈ l₁(∞).
Proof. Discretisation of (4.1) and (4.5) leads to the discrete-time CSOR scheme, given by (4.8) and (4.9). Application of Lemma 4.2.13 to (4.8) gives

  (û_i^{(ν)})_τ = Σ_{j=1}^{i−1} (h_{ij})_τ ∗ (u_j^{(ν)})_τ + Σ_{j=i+1}^{d_b} (h_{ij})_τ ∗ (u_j^{(ν−1)})_τ + (η_i)_τ,   (4.39)

with (h_{ij})_τ ∈ l₁(∞) if τσ(−b_{ii}⁻¹ a_{ii}) ⊂ int S, and (η_i)_τ an l_p-sequence. It is now easy to prove that there exist l₁-sequences (k_{ij}^{CSOR})_τ such that

  (u_i^{(ν)})_τ = Σ_{j=1}^{d_b} (k_{ij}^{CSOR})_τ ∗ (u_j^{(ν−1)})_τ + (φ_i)_τ.   (4.40)

Indeed, the combination of (4.39) and (4.9) for i = 1 gives (k_{11}^{CSOR})_τ = (δ_τ − Θ_τ) I, (k_{1j}^{CSOR})_τ = Θ_τ ∗ (h_{1j})_τ, 2 ≤ j ≤ d_b, and (φ_1)_τ = Θ_τ ∗ (η_1)_τ, with I the identity matrix of the appropriate dimension in the case of block relaxation. The general case involves the computation of (u_i^{(ν)})_τ from (4.39) and (4.9). It follows by induction on i and is based on the elimination of the sequences (u_j^{(ν)})_τ, j < i, from (4.39) using (4.40). Consequently, the resulting (k_{ij}^{CSOR})_τ, which consist of linear combinations and convolutions of l₁-sequences, belong to l₁(∞). □
Z-transformation of the discrete-time CSOR waveform iterative scheme yields ũ^{(ν)}(z) = K_τ^{CSOR}(z) ũ^{(ν−1)}(z) + φ̃_τ(z), with K_τ^{CSOR}(z) the discrete-time CSOR symbol. This symbol is given by

  K_τ^{CSOR}(z) = ( (1/τ)(a/b)(z) ( (1/Θ̃_τ(z)) D_B − L_B ) + ( (1/Θ̃_τ(z)) D_A − L_A ) )^{−1}
                 · ( (1/τ)(a/b)(z) ( ((1 − Θ̃_τ(z))/Θ̃_τ(z)) D_B + U_B ) + ( ((1 − Θ̃_τ(z))/Θ̃_τ(z)) D_A + U_A ) ),

with Θ̃_τ(z) = ω + (ω̃_c)_τ(z).
Convergence on finite time intervals

For the iteration on finite intervals we have the following result from Lemma 2.2.5.

Theorem 4.2.15 Consider K_τ^{CSOR} as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite, and assume that the discrete solvability condition (4.11) is satisfied. Then, K_τ^{CSOR} is bounded and

  ρ(K_τ^{CSOR}) = ρ(K_τ^{CSOR}(∞)).   (4.41)
Convergence on infinite time intervals

If τσ(−D_B⁻¹D_A) ⊂ int S and Θ_τ ∈ l₁(∞), Lemma 4.2.14 implies that the kernel of the CSOR operator belongs to l₁(∞). Hence, we may apply Lemmas 2.2.6 and 2.2.7 in order to derive the following infinite-interval result.

Theorem 4.2.16 Consider K_τ^{CSOR} as an operator in l_p(∞) with 1 ≤ p ≤ ∞. Assume τσ(−D_B⁻¹D_A) ⊂ int S and Θ_τ ∈ l₁(∞). Then, K_τ^{CSOR} is bounded and

  ρ(K_τ^{CSOR}) = max_{|z|≥1} ρ(K_τ^{CSOR}(z)) = max_{|z|=1} ρ(K_τ^{CSOR}(z)).   (4.42)

Furthermore, we have

  ‖K_τ^{CSOR}‖_{l₂(∞)} = max_{|z|≥1} ‖K_τ^{CSOR}(z)‖ = max_{|z|=1} ‖K_τ^{CSOR}(z)‖,   (4.43)

where ‖·‖ denotes the matrix norm induced by the standard Euclidean vector norm.

Remark 4.2.7 In Theorem 4.2.16, we require Θ̃_τ(z) to be the Z-transform of an l₁-kernel Θ_τ. For this, a sufficient (but not necessary) condition is that Θ̃_τ(z) is a bounded and analytic function in an open domain containing {z ∈ C : |z| ≥ 1}. A tighter set of conditions can be found in [27, p. 71].
The Z-transform of the optimal convolution sequence

The following lemma is the discrete-time equivalent of Lemma 4.2.6. It involves the eigenvalue distribution of the discrete-time Jacobi symbol, which is related to its continuous-time equivalent by (3.36).

Lemma 4.2.17 Assume the matrices B and A are such that (1/τ)(a/b)(z) B + A is a block-consistently ordered matrix with nonsingular diagonal blocks. Assume the spectrum σ(K_τ^{JAC}(z)) lies on a line segment [−μ_τ^{(1)}(z), μ_τ^{(1)}(z)] with μ_τ^{(1)}(z) ∈ C ∖ {(−∞,−1] ∪ [1,∞)}. The spectral radius of K_τ^{CSOR}(z) is then minimised for a given value of z by the unique optimum (Θ̃_opt)_τ(z), given by

  (Θ̃_opt)_τ(z) = 2 / ( 1 + √(1 − (μ_τ^{(1)})²(z)) ),   (4.44)

where √· denotes the root with the positive real part. In particular,

  ρ(K_τ^{CSOR_opt}(z)) = |(Θ̃_opt)_τ(z) − 1| < 1.   (4.45)

Proof. In analogy with Lemma 4.2.6, the result follows from standard complex SOR theory applied to the complex matrix (1/τ)(a/b)(z) B + A. □
4.2.3 Discrete-time versus continuous-time results
Spectral radii
Under the assumption

  Θ̃_τ(z) = Θ̃( (1/τ)(a/b)(z) ),   (4.46)

the discrete-time and continuous-time CSOR symbols are related by a formula similar to (3.36), i.e.,

  K_τ^{CSOR}(z) = K^{CSOR}( (1/τ)(a/b)(z) ).

As a result, we have the following two theorems, which provide the spectral radius/norm of the discrete-time operator in terms of the symbol of the continuous-time operator. They correspond to Theorems 3.2.10 and 3.2.13–3.2.14, so we can omit their proofs.

Theorem 4.2.18 Consider K_τ^{CSOR} as an operator in l_p(N) with 1 ≤ p ≤ ∞ and N finite. Assume the discrete solvability condition (4.11) is satisfied and (4.46) holds for z = ∞. Then,

  ρ(K_τ^{CSOR}) = ρ( K^{CSOR}( (1/τ)(α_k/β_k) ) ).   (4.47)

Theorem 4.2.19 Consider K_τ^{CSOR} as an operator in l_p(∞) with 1 ≤ p ≤ ∞. Assume τσ(−D_B⁻¹D_A) ⊂ int S, Θ_τ ∈ l₁(∞) and (4.46) holds for |z| ≥ 1. Then,

  ρ(K_τ^{CSOR}) = sup{ ρ(K^{CSOR}(z)) : z ∈ C ∖ τ⁻¹ int S } = sup_{z ∈ τ⁻¹∂S} ρ(K^{CSOR}(z)).   (4.48)

Furthermore, we have

  ‖K_τ^{CSOR}‖_{l₂(∞)} = sup{ ‖K^{CSOR}(z)‖ : z ∈ C ∖ τ⁻¹ int S } = sup_{z ∈ τ⁻¹∂S} ‖K^{CSOR}(z)‖,   (4.49)

where ‖·‖ denotes the matrix norm induced by the standard Euclidean vector norm.

Under the right assumptions (among others, (4.46) for the necessary values of z), we can also extend the validity of the results of Section 3.2.3 to the CSOR case. In particular, we have

  lim_{τ→0} ρ(K_τ^{CSOR}) = ρ(K^{CSOR}),

both for the finite-interval and infinite-interval computation, while the CSOR versions of Theorem 3.2.15 and Corollary 3.2.16 follow in a straightforward way.

Observe that equality (4.46), and, hence, the results mentioned above, are not necessarily satisfied. They do hold, however, in the two important cases explained below. The first case concerns the standard SOR waveform method with double splitting. Indeed, if ω_c(t) ≡ 0 and (ω_c)_τ[n] ≡ 0, we have Θ̃_τ(z) = Θ̃( (1/τ)(a/b)(z) ) = ω. Equality (4.46) is also satisfied for the optimal CSOR method. As the discrete-time and continuous-time Jacobi symbols are related by (3.36), a similar relation holds for their respective eigenvalues with largest modulus: μ_τ^{(1)}(z) = μ^{(1)}( (1/τ)(a/b)(z) ). Hence, by comparing the formulae for the optimal convolution kernels in Lemmas 4.2.6 and 4.2.17, we find

  (Θ̃_opt)_τ(z) = Θ̃_opt( (1/τ)(a/b)(z) ).
Optimal convolution kernels

In this section we will relate the optimal continuous-time and discrete-time convolution kernels. Therefore, we rewrite (4.7) and (4.10) as

  u^{(ν)}(t) = u^{(ν−1)}(t) + ω ( û^{(ν)}(t) − u^{(ν−1)}(t) ) + ∫₀ᵗ ω_c(t−s) ( û^{(ν)}(s) − u^{(ν−1)}(s) ) ds

and

  u^{(ν)}[n] = u^{(ν−1)}[n] + ω ( û^{(ν)}[n] − u^{(ν−1)}[n] ) + Σ_{l=0}^{n} (ω_c)_τ[n−l] ( û^{(ν)}[l] − u^{(ν−1)}[l] ),

respectively. Comparing the latter equations already suggests that (ω_c)_τ[n] should be such that (1/τ)(ω_c)_τ[n] approximates ω_c(nτ) for small τ. In that case the discrete convolution sum approximates the continuous convolution integral as a simple numerical integration rule. This intuition is confirmed and cast into a more precise mathematical form in the following theorem.

Theorem 4.2.20 Consider an ODE system (3.1) with B = I. Assume A is a consistently ordered matrix with constant positive diagonal D_A = d_a I (d_a > 0), the eigenvalues of K^{JAC}(0) are real with μ₁ = ρ(K^{JAC}(0)) < 1, and the linear multistep method is strictly stable. Then, the continuous-time optimal kernel Θ_opt(t) = δ(t) + (ω_c)_opt(t) and its discrete-time equivalent (Θ_opt)_τ = δ_τ + ((ω_c)_opt)_τ are related by

  lim_{τ→0, τ=t/n} (1/τ) ((ω_c)_opt)_τ[n] = (ω_c)_opt(t),   t ≥ 0.   (4.50)

Note that we have used a subscript τ in the notation of the optimal discrete kernel to emphasise that the function depends on the value of the time increment. An equivalent but somewhat less intuitive form of (4.50) is obtained by replacing τ by t/n:

  lim_{n→∞} (n/t) ((ω_c)_opt)_{t/n}[n] = (ω_c)_opt(t),   t > 0.   (4.51)
Proof. Under the assumptions of the theorem, Lemma 4.2.6 holds for Re(z) ≥ 0 with μ₁(z) given in (4.33). The function Θ̃_opt(z), given by (4.34), is bounded and analytic for Re(z) ≥ 0. Consequently, by the inverse Laplace-transform formula, we have

  (ω_c)_opt(t) = (1/2πi) ∫_{−i∞}^{i∞} e^{zt} ( Θ̃_opt(z) − 1 ) dz.   (4.52)

A similar expression will be derived for the discrete-time kernel by using the inverse Z-transform formula. To apply Lemma 4.2.17, we have to ensure that

  μ_τ^{(1)}(z) ∈ C ∖ {(−∞,−1] ∪ [1,∞)},   |z| ≥ 1,   (4.53)
with μ_τ^{(1)}(z) calculated from (3.36) and (4.33), i.e.,

  μ_τ^{(1)}(z) = d_a μ₁ / ( (1/τ)(a(z)/b(z)) + d_a ).   (4.54)

Because of the strict stability of the multistep method, at least a small disk of the form {μ : |μ + d| ≤ d} with d > 0 is contained in the stability region S [25, p. 259]. Consequently, we have for small enough τ that {μ : |μ + τd_a| ≤ τd_a} ⊂ S. Since by definition of the stability region (3.40) holds, we immediately obtain |(1/τ)a(z)/b(z) + d_a| ≥ d_a for |z| ≥ 1. For these values of z, (4.54) yields |μ_τ^{(1)}(z)| ≤ μ₁ < 1, and, hence, (4.53). From this we may conclude that for small enough τ, the conditions of Lemma 4.2.17 are satisfied for all z on or outside the unit disk. Therefore, for any such z the optimal (Θ̃_opt)_τ(z) is given by the combination of (4.44) and (4.54). This function is bounded and analytic for |z| ≥ 1; by using the inverse Z-transform formula [5, p. 262], we arrive at the expression

  ((ω_c)_opt)_τ[n] = (1/2πi) ∮_{|z|=1} z^{n−1} ( (Θ̃_opt)_τ(z) − 1 ) dz.   (4.55)

As we have derived the conditions for existence of the optimal kernels, we can now prove the correctness of (4.50). We start by considering the case t = 0. In that case we can use Property i of Corollary 4.2.12. Hence, we need to show that

  lim_{τ→0} (1/τ) ((ω_c)_opt)_τ[0] = (ω_c)_opt(0) = 0.   (4.56)

By the initial-value theorem for the Z-transform [75, Eq. (7.35)], we find

  ((ω_c)_opt)_τ[0] = lim_{z→∞} (Θ̃_opt)_τ(z) − 1 = 2 / ( 1 + √( 1 − lim_{z→∞} (μ_τ^{(1)})²(z) ) ) − 1.

The limit in this expression can be calculated from (4.54),

  lim_{z→∞} μ_τ^{(1)}(z) = lim_{z→∞} d_a μ₁ / ( (1/τ)(a(z)/b(z)) + d_a ) = d_a μ₁ / ( (1/τ)(α_k/β_k) + d_a ).

Equality (4.56) follows by a straightforward limit calculation. Next, we will prove (4.50) for t > 0. By a change of variables z = iξ in (4.52) and z = e^{iξτ} in (4.55), we obtain respectively

  (ω_c)_opt(t) = (1/2π) ∫_{−∞}^{∞} e^{iξt} ( Θ̃_opt(iξ) − 1 ) dξ   (4.57)

and

  (1/τ) ((ω_c)_opt)_τ[n] = (1/2π) ∫_{−π/τ}^{π/τ} e^{iξnτ} ( (Θ̃_opt)_τ(e^{iξτ}) − 1 ) dξ.   (4.58)

Consider now a fixed t > 0 with t = nτ. Switching to the notation of (4.51), expression (4.58) can be transformed into

  (n/t) ((ω_c)_opt)_{t/n}[n] = (1/2π) ∫_{−∞}^{∞} e^{iξt} ( (Θ̃_opt)_{t/n}(e^{iξt/n}) − 1 ) χ_{[−nπ/t, nπ/t]}(ξ) dξ,   (4.59)
where the characteristic function χ_{[−nπ/t, nπ/t]}(ξ) equals 1 for ξ ∈ [−nπ/t, nπ/t] and 0 elsewhere. As before, (4.59) holds only if τ is small enough. For a fixed t this is equivalent to requiring n to be large enough, say n ≥ N. The limit relation (4.51) follows immediately from (4.57) and (4.59) by the dominated convergence theorem [77, Thm. I.16], if we can prove the pointwise convergence

  lim_{n→∞} e^{iξt} ( (Θ̃_opt)_{t/n}(e^{iξt/n}) − 1 ) χ_{[−nπ/t, nπ/t]}(ξ) = e^{iξt} ( Θ̃_opt(iξ) − 1 )   (4.60)

and the uniform, n-independent bound

  | e^{iξt} ( (Θ̃_opt)_{t/n}(e^{iξt/n}) − 1 ) χ_{[−nπ/t, nπ/t]}(ξ) | ≤ g(ξ),   n ≥ N,   (4.61)

with g ∈ L₁(−∞, ∞). The equality in (4.60) follows from the consistency of the linear multistep method. Indeed, from a(1) = 0 and ȧ(1) = b(1), we derive

  lim_{n→∞} (n/t) a(e^{iξt/n}) / b(e^{iξt/n}) = iξ,

and thus,

  lim_{n→∞} (Θ̃_opt)_{t/n}(e^{iξt/n}) = lim_{n→∞} Θ̃_opt( (n/t) a(e^{iξt/n}) / b(e^{iξt/n}) ) = Θ̃_opt(iξ).

In order to prove condition (4.61) we will construct a function g explicitly. Because of the strict stability requirement, 1 is the only root of a(z) on the unit circle. Since it is also the only root of the rational function a(z)/b(z) on the unit circle and since this root is simple, there exists a finite positive constant M such that

  | a(e^{iθ}) / b(e^{iθ}) | ≥ |θ|/M,  θ ∈ [−π, π],   or   (n/t) | a(e^{iξt/n}) / b(e^{iξt/n}) | ≥ |ξ|/M,  ξ ∈ [−nπ/t, nπ/t].   (4.62)

To bound the left-hand side of (4.61) we note that

  | (Θ̃_opt)_τ(z) − 1 | = | μ_τ^{(1)}(z) |² / | 1 + √(1 − (μ_τ^{(1)})²(z)) |².

Since √· denotes the root with the positive real part, (4.54) yields

  | (Θ̃_opt)_{t/n}(e^{iξt/n}) − 1 | ≤ | μ_{t/n}^{(1)}(e^{iξt/n}) |² = (μ₁ d_a)² / | (n/t) a(e^{iξt/n})/b(e^{iξt/n}) + d_a |².

We can now use (4.62) to construct the following bound, valid for |ξ| > M d_a:

  | e^{iξt} ( (Θ̃_opt)_{t/n}(e^{iξt/n}) − 1 ) χ_{[−nπ/t, nπ/t]}(ξ) | ≤ (μ₁ d_a)² / ( |ξ|/M − d_a )².

This bound holds even if ξ ∉ [−nπ/t, nπ/t] because of the presence of the χ-function. Note finally from (4.45) that the left-hand side of (4.61) is always bounded by 1. The proof is then completed by setting g to be the L₁(−∞, ∞)-function

  g(ξ) = 1                              for ξ ∈ [−L, L],
  g(ξ) = (μ₁ d_a)² / ( |ξ|/M − d_a )²   for ξ ∉ [−L, L],

with L > M d_a. □

Remark 4.2.8 The strict stability condition is a very natural condition. In [25, p. 272] we find that it is satisfied by any multistep method of practical interest with nonempty int S. However, methods that do not satisfy the condition on the stability region do exist – Milne–Simpson methods, for example [25, p. 262]. For such methods (Θ̃_opt)_τ(z) from (4.44) is not analytic for |z| ≥ 1, and the inverse Z-transform calculation is not feasible.

Remark 4.2.9 For strictly stable multistep methods the optimal discrete kernel was only proved to exist for small enough τ. This condition on τ was required in the proof of (4.53), i.e., to guarantee the analyticity of (4.44). For A(α)-stable methods, however, condition (4.53) is satisfied irrespective of the size of the time increment. This can be explained by noting that in this case (−∞, 0) ⊂ int S, which implies that (1/τ)a(z)/b(z) + d_a with |z| ≥ 1 is either complex or real with absolute value larger than or equal to d_a. Hence, for these methods the optimal kernel exists for any τ if the other assumptions of the theorem are satisfied.
Figure 4.2: Absolute value of (4.63) versus t = nτ for (3.47) (m = 1, finite differences, h = 1/16); panels: CN and BDF(3). The different curves correspond to (from top to bottom) τ = 1/10, τ = 1/100, τ = 1/1000 and τ = 1/10000 respectively.
  τ        10⁻¹    10⁻²    10⁻³    10⁻⁴    10⁻⁵    10⁻⁶
  CN       0.702   1.231   1.008   0.176   −0.805  −1.803
  BDF(3)   0.710   1.261   1.069   0.249   −0.729  −1.727

Table 4.2: log₁₀( ((ω_c)_opt)_τ[0] / τ ) for (3.47) (m = 1, finite differences, h = 1/16).
We will illustrate Theorem 4.2.20 for the one-dimensional model heat problem (3.47), discretised using finite differences with h = 1/16. To show the convergence of the discrete-time kernel to the continuous-time one with decreasing time increment, we have plotted the absolute value of the difference

  (1/τ) ((ω_c)_opt)_τ[n] − (ω_c)_opt(nτ)   (4.63)

in Figure 4.2 for several values of τ and t = nτ ∈ [10⁻⁴, 1]. We used the CN method and the BDF(3) formula for time discretisation and approximated the discrete kernel from (4.44) by an inverse Z-transform algorithm based on the use of Fast Fourier Transforms (FFTs), as will be explained in Section 4.3.2. The downward peaks are due to the zero crossings of (4.63). To illustrate the convergence at t = 0, we report values of log₁₀( ((ω_c)_opt)_τ[0]/τ ) in Table 4.2. Note that the convergence to the limiting value −∞ is very slow.
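For illustration, the kernel computation just described can be sketched in a few lines of plain Python for the CN discretisation (the function name is ours, and a naive O(N²) inverse DFT stands in for the genuine FFT algorithm of Section 4.3.2): the symbol (4.44), with μ_τ^{(1)}(z) as in (4.54), is sampled on the unit circle and transformed back.

```python
import cmath, math

def csor_kernel_discrete(tau, da, mu1, N=256):
    """Approximate the optimal discrete-time CSOR kernel ((omega_c)_opt)_tau[n]
    by sampling its Z-transform (4.44) on the unit circle and applying an
    inverse DFT.  Time discretisation: Crank-Nicolson (trapezoidal rule),
    i.e. a(z) = z - 1, b(z) = (z + 1)/2."""
    def theta_opt(z):
        ab = (z - 1.0) / ((z + 1.0) / 2.0)              # a(z)/b(z) for CN
        mu = da * mu1 / (ab / tau + da)                 # mu_tau^(1)(z), cf. (4.54)
        return 2.0 / (1.0 + cmath.sqrt(1.0 - mu * mu))  # principal root: Re >= 0
    samples = [theta_opt(cmath.exp(2j * math.pi * m / N)) - 1.0 for m in range(N)]
    return [sum(samples[m] * cmath.exp(2j * math.pi * m * n / N)
                for m in range(N)) / N for n in range(N)]

# model problem (3.47), finite differences, h = 1/16: da = 2/h^2, mu1 = cos(pi*h)
da, mu1 = 2.0 * 16**2, math.cos(math.pi / 16)
kernel = csor_kernel_discrete(0.1, da, mu1)
```

For τ = 10⁻¹ the computed log₁₀( ((ω_c)_opt)_τ[0]/τ ) reproduces the CN entry 0.702 of Table 4.2; the aliasing error of the truncated inverse transform is negligible here because the kernel decays geometrically within N samples.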
4.2.4 An extension of the theory for more general problems
So far, the applicability of Lemmas 4.2.6 and 4.2.17 is restricted to problems whose Jacobi symbols have collinear spectra. In this section we formulate analogous results for more general problems. We will limit the discussion to the discrete-time case. The continuous-time case can be treated similarly.

The proof of Lemma 4.2.17 applies a classical SOR result for complex matrices to the linear system ( (1/τ)(a(z)/b(z)) B + A ) u = f. It was noted there that the CSOR symbol K_τ^{CSOR}(z) represents the SOR iteration matrix for the latter system, with Θ̃_τ(z) acting as the complex overrelaxation parameter. Since the coefficient matrix of the linear system is assumed to be block-consistently ordered, the eigenvalues λ_τ(z) of the SOR iteration matrix are related to the eigenvalues μ_τ(z) of K_τ^{JAC}(z) by the Young relation [106, p. 451, Thm. 3.4],

  λ_τ(z) + Θ̃_τ(z) − 1 = ±√(λ_τ(z)) Θ̃_τ(z) μ_τ(z).   (4.64)

This implies that the spectral radius ρ(K_τ^{CSOR}(z)) for a given Θ̃_τ(z) equals

  max { |λ_τ(z)| : λ_τ(z) + Θ̃_τ(z) − 1 = ±√(λ_τ(z)) Θ̃_τ(z) μ_τ(z),  μ_τ(z) ∈ σ(K_τ^{JAC}(z)) }.   (4.65)

When the eigenvalues of K_τ^{JAC}(z) are on a line segment in the complex plane, classical SOR theory provides a simple expression for Θ̃_τ(z) minimising (4.65). This optimal value is denoted by (Θ̃_opt)_τ(z) and given by (4.44). If the collinearity assumption is not satisfied, however, one cannot find an optimal Θ̃_τ(z) easily, and a more complex SOR theory may
Figure 4.3: An ellipse E(p_τ(z), q_τ(z), φ_τ(z)).

have to be used. Such a theory was recently developed by Hu, Jackson and Zhu [32]. They assume the eigenvalues of K_τ^{JAC}(z) to lie in a region R_τ(z) = R(p_τ(z), q_τ(z), φ_τ(z)), the closed interior of an ellipse centred around the origin. This ellipse is given by

  E_τ(z) = E(p_τ(z), q_τ(z), φ_τ(z)) = { λ : λ = e^{iφ_τ(z)} ( p_τ(z) cos(θ) + i q_τ(z) sin(θ) ) },

with semi-axes p_τ(z) and q_τ(z) that satisfy p_τ(z) ≥ q_τ(z) ≥ 0, angle φ_τ(z) with −π/2 ≤ φ_τ(z) < π/2, and θ varying between 0 and 2π. This is illustrated graphically in Figure 4.3. Obviously, the spectral radius ρ(K_τ^{CSOR}(z)), given by (4.65), is bounded from above by its virtual counterpart ρ_R(K_τ^{CSOR}(z)), which is defined in terms of the current Θ̃_τ(z) as

  max { |λ_τ(z)| : λ_τ(z) + Θ̃_τ(z) − 1 = ±√(λ_τ(z)) Θ̃_τ(z) μ_τ(z),  μ_τ(z) ∈ R_τ(z) }.   (4.66)

In [32], Hu, Jackson and Zhu determine a value (Θ̃_R)_τ(z) which minimises this upper bound for a given ellipse. Based on their result [32, Thm. 1], we can immediately formulate the following lemma.
Lemma 4.2.21 Assume the matrices B and A are such that (1/τ)(a(z)/b(z)) B + A is a block-consistently ordered matrix with nonsingular diagonal blocks. Assume the spectrum σ(K_τ^{JAC}(z)) lies in the region R_τ(z), which does not contain the point 1. In terms of

  (Θ̃_R)_τ(z) = 2 / ( 1 + √( 1 − (p_τ²(z) − q_τ²(z)) e^{i2φ_τ(z)} ) ),   (4.67)

where √· denotes the root with the positive real part, we have

  ρ_R( K_τ^{CSOR_R}(z) ) ≤ ρ_R( K_τ^{CSOR}(z) ),   (4.68)

where the latter may be evaluated with any admissible Θ̃_τ(z). In particular,

  ρ_R( K_τ^{CSOR_R}(z) ) = ( |(Θ̃_R)_τ(z)| (p_τ(z) + q_τ(z)) / 2 )² < 1.   (4.69)
In analogy with the former use of the superscript opt, we added the superscript R to the virtual spectral radius ρ_R(K_τ^{CSOR_R}(z)) to indicate that expression (4.66) has to be evaluated by using (Θ̃_R)_τ(z). If we use a similar notation for the spectral radius (4.65), evaluated with (Θ̃_R)_τ(z), it is straightforward to see that

  ρ( K_τ^{CSOR_opt}(z) ) ≤ ρ( K_τ^{CSOR_R}(z) ) ≤ ρ_R( K_τ^{CSOR_R}(z) )   (4.70)

for any elliptic region R_τ(z) containing σ(K_τ^{JAC}(z)). The following remarkable result from [32, §3] shows that there actually exists an ellipse for which the latter bound is attained. Uniqueness, however, of this optimal ellipse is not guaranteed by the theory in the above reference.

Lemma 4.2.22 There exists an optimal ellipse surrounding the spectrum σ(K_τ^{JAC}(z)) for which (Θ̃_R)_τ(z) = (Θ̃_opt)_τ(z), and

  ρ( K_τ^{CSOR_opt}(z) ) = ρ( K_τ^{CSOR_R}(z) ) = ρ_R( K_τ^{CSOR_R}(z) ).   (4.71)

In addition, such an optimal ellipse contains an eigenvalue of K_τ^{JAC}(z).

In order to use Lemma 4.2.21 to compute the optimal convolution sequence (Θ_opt)_τ by inverse Z-transform techniques, one would have to determine the optimal ellipse containing σ(K_τ^{JAC}(z)) for several values of z. A solution to the problem of finding this ellipse does exist when the eigenvalues of the Jacobi symbol K_τ^{JAC}(z) lie on a line segment [−μ_τ^{(1)}(z), μ_τ^{(1)}(z)]. In that case Lemma 4.2.17 shows that the optimal ellipse is degenerate and corresponds to the line segment linking the extremal eigenvalues. In particular, the parameters defining this ellipse are found by setting μ_τ^{(1)}(z) = p_τ(z) e^{iφ_τ(z)} and q_τ(z) = 0. We do not know how to find such an optimal ellipse when the eigenvalues of K_τ^{JAC}(z) are not collinear. Although an example has been given in [32, §4], even the problem of finding a "good" ellipse (which surrounds the spectrum of the Jacobi symbol and for which the associated bound (4.70) is relatively sharp) may prove to be a formidable task. In practice, one therefore often tries to determine a suitable convolution sequence without calculating these (nearly) optimal ellipses for all needed values of z, as will be illustrated in Section 4.3.2.
4.3 Model problem analysis

In this section, we shall apply the theoretical convergence results of Section 4.2 to several semi-discretised variants of our model problem (3.47). For obvious reasons (the non-normality of the SOR WR operators), we restrict ourselves to the infinite-interval case and compare these results with those from some numerical experiments. In addition, we comment on how to derive a suitable convolution kernel for the CSOR WR method in practice.
4.3.1 Theoretical results

The continuous-time case

If we discretise (3.47) using finite differences, matrix A is consistently ordered for an iteration with pointwise relaxation in a lexicographic, red-black or diagonal ordering. The
eigenvalues of the Jacobi iteration matrix corresponding to matrix A are well known. They are real and, independent of the spatial dimension, we have (3.50). Hence, the assumptions of Theorems 4.2.7, 4.2.8 and 4.2.9 are satisfied. In Figure 4.4, we illustrate formulae (4.24) and (4.25) by depicting the spectral radii of the single-splitting and double-splitting SOR WR operators, ρ(K^SSSOR) and ρ(K^DSSOR), together with ρ(K_stat^SOR), the spectral radius of the static SOR iteration matrix M_A⁻¹N_A with M_A and N_A as in (4.4), as a function of ω.
Figure 4.4: ρ(K^SSSOR) (solid), ρ(K^DSSOR) (dashed) and ρ(K_stat^SOR) (dots) versus ω for (3.47), discretised using finite differences; panels: h = 1/8 and h = 1/32. The marker symbols indicate measured values from numerical experiments (Section 4.3.2).
In [63, p. 473], Miekkala and Nevanlinna derived the following result for the optimal single-splitting SOR waveform method from Theorem 4.2.7.

Theorem 4.3.1 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider K^{SSSOR_ωopt} as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have for small h that

  ρ(K^{SSSOR_ωopt}) ≈ 1 − 2π²h²,   ω_opt ≈ 4/3 − (4/9)π²h².   (4.72)

A similar result can be proven for the optimal DSSOR WR method. The calculation is standard, but rather lengthy and tedious. It was performed by using the formula manipulator Mathematica [103]. The computation is based on first differentiating (4.25) with respect to ω and then finding the zeros of the resulting expression. The formula for ω_opt is then substituted back into (4.25). Finally, entering (3.50) and calculating a series expression for small h leads to the desired result.

Theorem 4.3.2 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider K^{DSSOR_ωopt} as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have for small h that

  ρ(K^{DSSOR_ωopt}) ≈ 1 − √2 π²h²,   ω_opt ≈ (4 − 2√2) + (3 − (9/4)√2) π²h².   (4.73)

For the CSOR WR method with optimal overrelaxation kernel we may invoke Theorem 4.2.9. The operator's spectral radius equals that of the optimal static SOR iteration matrix for the discrete Laplace operator.
  h                1/8             1/16            1/32            1/64
  ρ(K^SSSOR_ωopt)  0.745 (1.2713)  0.927 (1.3166)  0.981 (1.3291)  0.995 (1.3323)
  ρ(K^DSSOR_ωopt)  0.804 (1.1452)  0.947 (1.1647)  0.986 (1.1698)  0.997 (1.1711)
  ρ(K^CSOR_opt)    0.446           0.674           0.821           0.906

Table 4.3: Spectral radii of optimal single-splitting (SSSOR), double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47), discretised using finite differences. The value of the optimal parameter ω_opt is given in parentheses.
Theorem 4.3.3 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider K^{CSOR_opt} as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have for small h that

  ρ(K^{CSOR_opt}) ≈ 1 − 2πh.   (4.74)
Numerical values of these spectral radii, together with the corresponding ω_opt, are presented in Table 4.3 as a function of the mesh size h. They are computed from (4.24), (4.25) and (4.32).

Next, we investigate the performance of SOR WR methods for (3.47), discretised in space using finite elements. In general, such a discretisation does not lead to a matrix zB + A that is consistently ordered for point relaxation. This precludes the use of Lemma 4.2.5. An exception is the one-dimensional model problem (3.47) discretised with linear finite elements. For this problem, we derive the spectral radius of the double-splitting SOR WR operator, based on the DSSOR versions of Theorem 4.2.4 and Lemma 4.2.5. In particular, we set Θ̃(z) = ω so that the right-hand side of (4.15) becomes K^DSSOR(z).
Theorem 4.3.4 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Assume 0 < ω < 2. Then, if we consider K^DSSOR as an operator in L_p(0,∞) with 1 ≤ p ≤ ∞, we have that

  ρ(K^DSSOR) = 1 − ω + ½(ωμ₁)² + ωμ₁ √(1 − ω + ¼(ωμ₁)²)                                          for ω ≤ ω_d,
  ρ(K^DSSOR) = (ω − 1) · ( 1 + (3/8) ωμ₁ / √(ω − 1 + ⅛ω²μ₁²) ) / ( 1 − (3/8) ωμ₁ / √(ω − 1 + ⅛ω²μ₁²) )   for ω > ω_d,   (4.75)

with μ₁ = cos(πh) and ω_d = (8 − 4√(4 − μ₁²)) / μ₁². Furthermore, we have ω_opt > ω_d.

Proof. The spectrum of the Jacobi symbol is given by

  σ(K^JAC(z)) = { (−2zh² + 12)/(4zh² + 12) · μ_j : 1 ≤ j ≤ 1/h − 1 },  with μ_j = cos(jπh).   (4.76)

Since the conditions of Lemma 4.2.5 are satisfied, (4.21) can be written as

  λ(z) + ω − 1 = ±√(λ(z)) ω (−2zh² + 12)/(4zh² + 12) μ_j,
Figure 4.5: ρ(K^DSSOR) (dashed) and ρ(K_stat^SOR) (dots) versus ω for (3.47) (m = 1, linear finite elements); panels: h = 1/8 and h = 1/32. The "+"-symbols indicate measured values from numerical experiments (Section 4.3.2).
or, after setting λ(z) = |λ(z)|e^{it},

$$
z(t)=-\frac{3}{h^2}\,
\frac{|\lambda(z)|e^{it/2}+(\omega-1)e^{-it/2}-\sqrt{|\lambda(z)|}\,\omega\xi_j}
     {|\lambda(z)|e^{it/2}+(\omega-1)e^{-it/2}+\tfrac12\sqrt{|\lambda(z)|}\,\omega\xi_j}\,.
\qquad(4.77)
$$

In complete analogy with the proof of Theorem 4.2.8, we replace ξ_j by ξ₁ and impose the conditions Re(z(t)) = 0 and Re(ż(t)) = 0 on the equilibrium curve (4.77) to determine the supremum of |λ(z)| along the imaginary axis. This gives

$$
4|\lambda(z)|(\omega-1)\cos^2\tfrac t2
-\tfrac12\sqrt{|\lambda(z)|}\,\omega\xi_1\big(|\lambda(z)|+\omega-1\big)\cos\tfrac t2
+\big(|\lambda(z)|+\omega-1\big)^2-\tfrac12|\lambda(z)|\,\omega^2\xi_1^2=0
$$

and

$$
\Big(4|\lambda(z)|(\omega-1)\cos\tfrac t2-\tfrac14\sqrt{|\lambda(z)|}\,\omega\xi_1\big(|\lambda(z)|+\omega-1\big)\Big)\sin\tfrac t2=0\,.
$$

We deduce that either the supremum is attained at the origin, giving (4.30), or the supremum is found at a certain point z = iσ (σ ≠ 0), giving

$$
|\lambda(z)|=(\omega-1)\,
\frac{1+\tfrac38\sqrt{\omega}\,\xi_1\big/\sqrt{1+\tfrac18\omega^2\xi_1^2}}
     {1-\tfrac38\sqrt{\omega}\,\xi_1\big/\sqrt{1+\tfrac18\omega^2\xi_1^2}}\,.
\qquad(4.78)
$$

The proof is completed by combining (4.30) and (4.78). The value of ω_d is derived by determining the least value of ω for which the supremum is not attained at the origin. It turns out that 1 < ω_d < ω_opt^{stat}. □

Equation (4.75) is illustrated in Figure 4.5, where the theoretical values of the spectral radius are plotted against ω. The spectral radius of the CSOR WR operator with optimal kernel is calculated in the following theorem. As in the finite-difference case, it equals the spectral radius of the optimal static SOR method for the system Au = f, where A is the discrete Laplacian.
4.3. MODEL PROBLEM ANALYSIS

h                     1/8      1/16     1/32     1/64
ω_d                   1.0599   1.0687   1.0710   1.0716
ω_opt                 1.0625   1.0694   1.0712   1.0716
ρ(K^{DSSOR,ω_opt})    0.834    0.956    0.989    0.997
ρ(K^{CSOR,opt})       0.446    0.674    0.821    0.906

Table 4.4: Parameters ω_d and ω_opt, together with spectral radii of optimal double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 1, linear finite elements).
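The tabulated values can be reproduced directly from the closed-form expressions. The following sketch only implements the two formulas quoted above, the branch point ω_d from Theorem 4.3.4 and the CSOR spectral radius (4.79); it recomputes the ω_d and ρ(K^{CSOR,opt}) rows of Table 4.4.

```python
import math

def omega_d(h):
    # Branch point of (4.75): omega_d = (8 - 4*sqrt(4 - xi1^2)) / xi1^2,
    # with xi1 = cos(pi*h) (Theorem 4.3.4).
    xi1 = math.cos(math.pi * h)
    return (8.0 - 4.0 * math.sqrt(4.0 - xi1**2)) / xi1**2

def rho_csor(h):
    # Spectral radius of optimal convolution SOR WR, equation (4.79):
    # rho = cos^2(pi*h) / (1 + sqrt(1 - cos^2(pi*h)))^2  ~  1 - 2*pi*h.
    c = math.cos(math.pi * h)
    return c**2 / (1.0 + math.sqrt(1.0 - c**2))**2

# Reproduces the omega_d and rho(K^{CSOR,opt}) rows of Table 4.4.
for h in [1/8, 1/16, 1/32, 1/64]:
    print(f"h = 1/{round(1/h)}: omega_d = {omega_d(h):.4f}, "
          f"rho_CSOR = {rho_csor(h):.3f}")
```

For h = 1/8 this yields ω_d = 1.0599 and ρ = 0.446, in agreement with the table.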
Figure 4.6: ρ(K^{DSSOR,ω_opt}(iσ)) versus σ for (3.47) (m = 1, linear finite elements, h = 1/16). The bottom picture is an enlargement of the upper one for −100 ≤ σ ≤ 100.
Figure 4.7: ρ(K^{CSOR,opt}(iσ)) versus σ for (3.47) (m = 1, linear finite elements, h = 1/16).
Theorem 4.3.5 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider K^{CSOR,opt} as an operator in L_p(0, ∞) with 1 ≤ p ≤ ∞, we have (for small h) that

$$
\rho\big(K^{CSOR,opt}\big)=\rho\big(K^{CSOR,opt}(0)\big)
=\frac{\cos^2(\pi h)}{\big(1+\sqrt{1-\cos^2(\pi h)}\big)^2}\approx 1-2\pi h\,.
\qquad(4.79)
$$

Proof. Because of (4.76), the eigenvalues of K^{JAC}(z) lie on the line segment [−μ₁(z), μ₁(z)] with μ₁(z) = ((−2zh² + 12)/(4zh² + 12)) cos(πh). Therefore, the conditions of Lemma 4.2.6 are satisfied. Since μ₁(z) is a bounded and analytic function for Re(z) ≥ 0, so is Ω̃_opt(z). Hence, we may apply Theorem 4.2.4 and Lemma 4.2.6 to find

$$
\rho\big(K^{CSOR,opt}\big)=\sup_{\sigma\in\mathbb{R}}\big|\tilde\Omega_{opt}(i\sigma)-1\big|=\big|\tilde\Omega_{opt}(0)-1\big|\,,
$$

from which the result follows. □
Table 4.4 shows some values of ω_d, ω_opt and ρ(K^{DSSOR,ω_opt}) calculated by means of (4.75). The supremum of ρ(K^{DSSOR,ω_opt}(z)) over the imaginary axis is not attained at the origin, as illustrated in Figure 4.6 for h = 1/16. Moreover, the resulting spectral radii ρ(K^{DSSOR,ω_opt}) obviously satisfy a relation of the form 1 − O(h²). For comparison purposes we added in Table 4.4 the spectral radii of optimal convolution SOR WR, which are identical to the ones in Table 4.3. In this case, (4.79) implies that the supremum of ρ(K^{CSOR,opt}(iσ)) is attained at σ = 0. An illustration of this observation can be found in Figure 4.7.
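The behaviour shown in Figure 4.6 can be probed numerically: using the Jacobi spectrum (4.76) and the SOR eigenvalue relation from the proof of Theorem 4.3.4, one can evaluate the spectral radius of the DSSOR symbol on a grid along the imaginary axis and take the maximum. A sketch under those assumptions (the grid and the ω value are illustrative, not the thesis code):

```python
import numpy as np

def rho_dssor_symbol(z, omega, h):
    # Jacobi symbol eigenvalues (4.76):
    # mu_j(z) = ((-2 z h^2 + 12)/(4 z h^2 + 12)) * cos(j*pi*h)
    j = np.arange(1, round(1/h))
    mu = ((-2*z*h**2 + 12) / (4*z*h**2 + 12)) * np.cos(j * np.pi * h)
    # SOR relation (lam + omega - 1)^2 = lam * omega^2 * mu^2 is a quadratic
    # in lam; take the root of largest modulus over all j.
    b = 2*(omega - 1) - omega**2 * mu**2
    c = (omega - 1)**2
    disc = np.sqrt(b**2 - 4*c + 0j)
    return np.maximum(np.abs((-b + disc) / 2), np.abs((-b - disc) / 2)).max()

h, omega = 1/16, 1.0694            # omega_opt for h = 1/16, cf. Table 4.4
sigma = np.linspace(-600, 600, 2001)
radii = [rho_dssor_symbol(1j*s, omega, h) for s in sigma]
print(max(radii))                  # roughly 0.956, cf. Table 4.4
```

The scan confirms that the supremum along the imaginary axis reproduces the tabulated value ρ(K^{DSSOR,ω_opt}) ≈ 0.956 for h = 1/16.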
The discrete-time case
In this section, we analyse the use of the CN method and the BDF formulae of order one up to five, for the one-dimensional model problem (3.47) with linear finite-element discretisation on a mesh with mesh size h = 1/16. The results for finite differences or higher-dimensional problems are qualitatively similar. We computed ρ(K_τ^{DSSOR,ω*}) by direct numerical evaluation of (4.42), with τ = 1/100 and with Ω̃_τ ≡ ω*, where ω* equals the optimal ω for the continuous-time iteration. Since μ_τ^{(1)}(z) ∈ ℂ \ {(−∞, −1] ∪ [1, ∞)} and μ_τ^{(1)}(z) is analytic for |z| ≥ 1, we have that (Ω̃_opt)_τ(z), given by (4.44), is also analytic. Hence, we can also compute ρ(K_τ^{CSOR,opt}) by evaluation of (4.42). The results for the DSSOR and CSOR iteration are reported in Table 4.5. They can be derived from a so-called spectral picture, see Section 3.3.2, in which the scaled stability region boundaries of the linear multistep methods are plotted on top of the contour lines of the spectral radius of the continuous-time WR symbol. Two such pictures are given in Figures 4.8 and 4.9, with contour lines drawn for the values 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0 and 2.2 in the DSSOR case and 0.6, 0.7, 0.8 and 0.9 in the CSOR case. According to Theorem 4.2.19, the values of Table 4.5 can be verified visually by looking for the supremum of the symbol's spectral radius along the plotted scaled stability region boundaries. When CN time discretisation is used, the latter supremum has to be taken over the imaginary axis; we can also obtain the corresponding spectral radii from Figures 4.6 and 4.7 in this case.
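A scaled stability region boundary, as used in these spectral pictures, is simply the image of the unit circle under ζ ↦ a(ζ)/(τ b(ζ)), with a and b the characteristic polynomials of the multistep method. A small sketch (standard CN and BDF(2) polynomials; an illustration, not the thesis code) also confirms the remark that for CN the boundary lies on the imaginary axis:

```python
import numpy as np

tau = 1/100
theta = np.linspace(0.05, 2*np.pi - 0.05, 500)   # avoid the pole of CN at theta = pi
w = np.exp(1j * theta)

# Characteristic polynomials (coefficients, highest degree first, for np.polyval):
methods = {
    "CN":     ([1.0, -1.0], [0.5, 0.5]),
    "BDF(2)": ([1.5, -2.0, 0.5], [1.0, 0.0, 0.0]),
}

boundary = {}
for name, (a, b) in methods.items():
    # scaled stability region boundary: a(w) / (tau * b(w)) for |w| = 1
    boundary[name] = np.polyval(a, w) / (tau * np.polyval(b, w))

# For CN the boundary is (numerically) purely imaginary, so the supremum in
# the spectral picture is indeed taken over the imaginary axis.
print(np.abs(boundary["CN"].real).max())
```

The BDF(2) boundary, by contrast, bulges into the right half-plane, which is why its supremum in Figures 4.8 and 4.9 is taken along a different curve.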
Figure 4.8: Spectral picture of optimal DSSOR WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100); contour lines from ρ = 0.8 to ρ = 2.2.
Figure 4.9: Spectral picture of optimal CSOR WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100); contour lines from ρ = 0.6 to ρ = 0.9.
multistep method      CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
ρ(K_τ^{DSSOR,ω*})     0.956  0.956   0.956   0.991   1.236   2.113
ρ(K_τ^{CSOR,opt})     0.674  0.674   0.674   0.674   0.674   0.674

Table 4.5: Spectral radii of discrete-time optimal double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100).
4.3.2 Numerical results
Practical determination of a suitable convolution kernel
First, we comment on the practical determination of a suitable convolution sequence (Ω_num)_τ that can be used in a practical implementation of the discrete-time CSOR WR algorithm. We will start with ODE systems of the form u̇ + Au = f for which the assumptions of Theorem 4.2.11 are satisfied. For such problems, we have an explicit expression for the optimal continuous-time kernel Ω_opt, which is completely determined by the scalar ξ₁ and the diagonal value d_a. Unfortunately, a similar expression does not seem to exist for the optimal discrete-time sequence (Ω_opt)_τ, as the inverse Z-transform of (4.44) appears to be too complex to be performed analytically.

One might, at first, try to employ the continuous-time kernel in the discrete-time computations. This idea is inspired by the existence of the limit relation (4.50). More precisely, one could retain the impulse part of Ω_opt and sample its continuous part (ω_c)_opt, selecting

$$
(\omega_c)_{num}[n]:=(\omega_c)_{opt}(n\tau)\,,\qquad n=0,1,\dots,N-1\,.
\qquad(4.80)
$$

Experimental convergence factors for the finite-difference discretisation of (3.47), obtained with this discrete kernel and the CN time-discretisation method for t ∈ [0, 1], are given in Table 4.6. They are unsatisfactory, except when very small time steps are used.

Another attempt at using the continuous-time kernel could be based on the observation that one is not really interested in determining the value of the kernel, but in computing an integral. In particular, the discrete convolution sum in (4.10) can be regarded as a numerical approximation by quadrature of the convolution integral in (4.7). Hence, instead of using the first-order quadrature rule that one gets when one uses (4.80), one could try to compute that integral more accurately by using an integration rule of higher order. In a second experiment, we used the composite midpoint integration rule,
$$
\sum_{l=0}^{n-1}(\omega_c)_{num}[n-l]\left(\frac{\hat u_i^{(\nu)}[l]+\hat u_i^{(\nu)}[l+1]}{2}
-\frac{u_i^{(\nu-1)}[l]+u_i^{(\nu-1)}[l+1]}{2}\right)
\qquad(4.81)
$$

where the fractions denote linearly interpolated approximations of û_i^{(ν)}((l + 1/2)τ) and u_i^{(ν−1)}((l + 1/2)τ) respectively, and

$$
(\omega_c)_{num}[n]:=(\omega_c)_{opt}\big((n-\tfrac12)\tau\big)\,,\qquad n=1,2,\dots,N-1\,.
\qquad(4.82)
$$

The corresponding convergence factors, given in Table 4.6 in parentheses, are somewhat better than the ones obtained by using (4.80), but overall they do not convince.

τ \ h     1/8            1/16           1/32           1/64
1/100     0.543 (0.448)  0.885 (0.690)  0.982 (0.961)  0.996 (0.995)
1/500     0.461 (0.448)  0.745 (0.671)  0.952 (0.851)  0.994 (0.984)
1/1000    0.455 (0.452)  0.701 (0.667)  0.913 (0.837)  0.991 (0.948)

Table 4.6: Averaged convergence factors for (3.47) (m = 1, finite differences, T = 1, CN). We used (4.80) and (in parentheses) (4.81)–(4.82) to approximate the optimal kernel or convolution integral.

Other numerical integration rules lead to similar conclusions. This follows from the discussion in Section 4.2.3 and from the observation that the optimal discrete kernel for a particular problem and time-discretisation method can be very different from the optimal continuous kernel (multiplied by τ) unless τ is very small. Hence, we will now consider methods that derive the optimal discrete kernel directly. They are based on the expression of its Z-transform, which is given by the combination of (4.44) and (4.54),

$$
(\tilde\Omega_{opt})_\tau(z)=\frac{2}{1+\sqrt{1-\left(\dfrac{d_a\,\xi_1}{\tau^{-1}a(z)/b(z)+d_a}\right)^2}}\,,
\qquad(4.83)
$$

and which is analytic for |z| ≥ 1. The inverse Z-transform can be computed symbolically by a series expansion of (4.83) in terms of powers of z⁻¹, according to the Z-transform definition

$$
(\tilde\Omega_{opt})_\tau(z)=\sum_{n=0}^{\infty}\Omega_{opt}[n]\,z^{-n}\,.
\qquad(4.84)
$$

Although the elements of the sequence (Ω_opt)_τ can be derived easily from the latter expression, a more practical procedure is to use a numerical inverse Z-transform technique. The method we used is based on a Fourier-transform method, and is justified by the following observation. Setting θ(t) = (Ω̃_opt)_τ(e^{−it}), (4.84) becomes

$$
\theta(t)=\sum_{n=0}^{\infty}\Omega_{opt}[n]\,e^{int}\,.
$$

Thus, Ω_opt[n] is the n-th Fourier coefficient of the 2π-periodic function θ. More precisely,

$$
\Omega_{opt}[n]=\frac{1}{2\pi}\int_0^{2\pi}\theta(t)\,e^{-int}\,dt\,,
$$

which can be approximated numerically by the finite sum

$$
\Omega_{num}[n]:=\frac1M\sum_{k=0}^{M-1}\theta\Big(k\,\frac{2\pi}{M}\Big)\,e^{-ink\frac{2\pi}{M}}\,.
$$
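The finite sum above is an inverse discrete Fourier transform of length M, so it can be evaluated with a single FFT. A minimal numpy sketch (using a synthetic transform Ω̃(z) = 1/(1 − 0.5/z), chosen because its kernel 0.5ⁿ is known in closed form, so the reconstruction can be verified; the thesis applies the same recipe to (4.83)):

```python
import numpy as np

def kernel_from_transform(Omega_tilde, M):
    # Numerical inverse Z-transform: sample theta(t) = Omega_tilde(exp(-i*t))
    # at M equidistant points and take the discrete Fourier transform, i.e.
    # Omega_num[n] = (1/M) * sum_k theta(2*pi*k/M) * exp(-i*n*k*2*pi/M).
    t = 2 * np.pi * np.arange(M) / M
    theta = Omega_tilde(np.exp(-1j * t))
    return np.fft.fft(theta).real / M   # imaginary parts vanish for real kernels

# Check on a transform with a known inverse: Omega_tilde(z) = 1/(1 - 0.5/z)
# has the geometric kernel Omega[n] = 0.5**n.
Omega_num = kernel_from_transform(lambda z: 1.0 / (1.0 - 0.5 / z), M=64)
print(Omega_num[:4])   # ~ [1, 0.5, 0.25, 0.125]
```

Since the sum wraps coefficients modulo M, the aliasing error is of the order of the neglected tail, here 0.5⁶⁴, which is why M is taken "large enough" in the text.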
τ \ h     1/8    1/16   1/32   1/64
1/100     0.441  0.676  0.820  0.907
1/500     0.441  0.676  0.816  0.902
1/1000    0.441  0.676  0.816  0.902

Table 4.7: Averaged convergence factors for (3.47) (m = 1, finite differences, T = 1, CN). We used an inverse Z-transform technique to approximate the optimal kernel.

Consequently, numerical approximations to the M values {Ω_opt[n]}_{n=0}^{M−1} can be found by computing the discrete Fourier transform of the sequence {(Ω̃_opt)_τ(e^{−ik2π/M})}_{k=0}^{M−1}. This can be performed very efficiently by using the FFT algorithm. In the numerical experiments the number of time steps N is always finite. Hence, we only use the numerical approximations of {Ω_opt[n]}_{n=0}^{N−1}. To compute the discrete kernel, we took M ≥ N and large enough to anticipate possible aliasing effects. The approach is illustrated for the latter model problem in Table 4.7. We observe that the experimental convergence factors are independent of the time increment τ. More precisely, they are almost identical to the optimal continuous-time spectral radii ρ(K^{CSOR,opt}) reported in Table 4.3.

Next, we consider ODE systems (3.1) for which the analytical expression of μ_τ^{(1)}(z) cannot be computed explicitly (at the moment, we still suppose the Jacobi symbols K_τ^{JAC}(z) to have collinear spectra). For such problems, the numerical inverse Z-transform method is still applicable in theory. Yet, it will be very time-consuming, since it requires the computation of (Ω̃_opt)_τ(z), and thus of μ_τ^{(1)}(z), for M equidistant points on the unit circle. Similar remarks can be made for general ODEs with noncollinear Jacobi spectra. Indeed, as noted in Section 4.2.4, it is then already a formidable task to determine a (nearly) optimal ellipse, and hence the resulting formula (4.67), for one specific value of z. In practice, one could therefore use one of the following two strategies to obtain a suitable convolution sequence in these cases.
• A first attempt, for ODE systems with B = I and D_A = d_a I, could start from the knowledge of a (nearly) optimal ellipse/line segment surrounding σ(K_τ^{JAC}(0)). From formulae (3.36) and (4.33), it is clear that the spectrum of K_τ^{JAC}(z) is obtained by rotating and scaling the spectrum of K_τ^{JAC}(0), corresponding to the multiplication with 1 − d_a/(τ^{−1}a(z)/b(z) + d_a). A similar operation applied to the ellipse surrounding the latter spectrum is then expected to lead to a good ellipse for the current value of z.

• This approach is however not applicable when B ≠ I or D_A ≠ d_a I. Therefore, an automatic procedure was developed by Reichelt et al. for computing an analytic approximation to (Ω̃_opt)_τ(z), see [80, §5.6] and [82, §6]. The largest-magnitude eigenvalue μ_τ^{(1)}(z) is estimated for some specific values of z (e.g., z = 1, −1, …) by subspace iteration or the implicitly restarted Arnoldi method, and inserted in the right-hand side of (4.44). The resulting values are fitted by a ratio of low-order polynomials in z⁻¹,

$$
(\tilde\Omega_{opt})_\tau(z)\approx\frac{\sum_{j=0}^{K}b_j z^{-j}}{\sum_{j=0}^{L}a_j z^{-j}}
=\sum_{j=1}^{L}\frac{c_j}{1-r_j z^{-1}}\,,
$$

yielding by inverse Z-transformation that

$$
\Omega_{opt}[n]\approx\Omega_{num}[n]:=\sum_{j=1}^{L}c_j r_j^{\,n}\,.
$$
It is as yet unclear how to select the specific values of z and what other conditions are to be imposed on the rational approximation (e.g., as to the degree of numerator and denominator, the number of poles, and the pole placement). The procedure is illustrated for certain nonlinear semiconductor device problems in [82], and is shown to lead to very satisfactory results, even for systems of ODEs that do not satisfy the assumptions of Lemma 4.2.17 (i.e., systems with noncollinear Jacobi spectra). We will comment on this further on in this section, where we shall illustrate the robustness of formulae (4.22) and (4.44).
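The step from the rational fit to the kernel values Ω_num[n] = Σⱼ cⱼ rⱼⁿ is a partial-fraction expansion in z⁻¹, for which scipy's residuez can be used. A sketch with made-up coefficients bⱼ, aⱼ (illustrative only, not fitted to any actual symbol):

```python
import numpy as np
from scipy.signal import residuez

# A made-up low-order rational fit in z^{-1}:
#   Omega_tilde(z) ~ (1 - 0.2/z) / (1 - 0.9/z + 0.2/z^2)
b = [1.0, -0.2]
a = [1.0, -0.9, 0.2]

# Partial fractions: Omega_tilde(z) = sum_j c[j] / (1 - r[j]/z)
c, r, _ = residuez(b, a)

# Kernel values by inverse Z-transform of each simple pole term:
n = np.arange(10)
Omega_num = np.array([np.sum(c * r**k).real for k in n])

# Cross-check against the series coefficients of b/a (difference equation):
h = np.zeros(10)
for k in range(10):
    h[k] = (b[k] if k < len(b) else 0.0) \
           - sum(a[j] * h[k - j] for j in range(1, min(k, 2) + 1))
print(np.allclose(Omega_num, h))   # True
```

The closed form Σⱼ cⱼ rⱼⁿ avoids storing long kernel tables: only the 2L fitted numbers (cⱼ, rⱼ) are kept.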
Pointwise relaxation
Now that we know how to obtain a suitable approximation of the optimal convolution sequence (Ω_opt)_τ, we can compare the behaviour of standard and convolution-based SOR WR methods by means of numerical experiments for our model heat problem (3.47) with t ∈ [0, 1]. For (3.47), discretised using finite differences, the correctness of formulae (4.72) and (4.73) is illustrated in Figure 4.4 by the "+"- and "×"-symbols. They correspond to the measured convergence factors of respectively the single-splitting and double-splitting SOR WR method, applied to the one-dimensional variant of the model problem with CN time discretisation and τ = 1/1000. Averaged convergence factors as a function of h are given in Tables 4.8 and 4.9 for the one-dimensional and two-dimensional finite-difference discretisation of (3.47). They agree very well with the theoretical values given in Table 4.3, and they illustrate the correctness of formulae (4.72), (4.73) and (4.74). For SSSOR and DSSOR WR, we took the overrelaxation parameters ω in the numerical experiments equal to the optimal parameters ω_opt of the corresponding continuous-time iterations as in Section 4.3, while the numerical inverse Z-transform method is used to calculate a suitable convolution sequence for CSOR WR. In order to illustrate the dramatic improvement of the latter SOR method over the other SOR WR methods we included Figure 4.10. There, we depict the evolution of the l₂-norm of the error as a function of the iteration index. The results for standard Gauss–Seidel WR are also given. Observe that qualitatively similar convergence plots are obtained for certain nonlinear semiconductor device problems in [82, §7.2].
h        1/8    1/16   1/32   1/64
SSSOR    0.713  0.919  0.979  0.995
DSSOR    0.783  0.942  0.985  0.996
CSOR     0.441  0.676  0.820  0.907

Table 4.8: Averaged convergence factors of optimal single-splitting (SSSOR), double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 1, finite differences, T = 1, CN, τ = 1/100). Compare with Table 4.3.
h        1/8    1/16   1/32   1/64
SSSOR    0.718  0.921  0.980  0.995
DSSOR    0.788  0.944  0.986  0.996
CSOR     0.442  0.670  0.822  0.909

Table 4.9: Averaged convergence factors of optimal single-splitting (SSSOR), double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 2, finite differences, T = 1, CN, τ = 1/100). Compare with Table 4.3.
Figure 4.10: log‖e^{(ν)}‖_{l₂(100)} versus iteration index ν for (3.47) (m = 2, finite differences, h = 1/32, T = 1, CN, τ = 1/100) for Gauss–Seidel (dash-dotted), single-splitting SOR (solid), double-splitting SOR (dashed) and convolution SOR (dotted) WR.
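For reference, the unaccelerated Gauss–Seidel WR iteration that the SOR variants improve upon can be sketched as follows: each sweep integrates one scalar ODE per unknown over the whole time window, using already-updated waveforms for lower-numbered unknowns and previous-iterate waveforms for the others. A self-contained illustration (CN in time, 1D finite-difference Laplacian; an assumed setup, not the thesis code):

```python
import numpy as np

def gauss_seidel_wr(A, f, u0, tau, N, sweeps):
    """Gauss-Seidel waveform relaxation for u' + A u = f (f constant in time).

    Each sweep solves, for i = 0..m-1, the scalar ODE
        u_i' + a_ii u_i = f_i - sum_{j<i} a_ij u_j^(new) - sum_{j>i} a_ij u_j^(old)
    over the whole window [0, N*tau] by the trapezoidal (CN) rule."""
    m = A.shape[0]
    U = np.repeat(u0[:, None], N + 1, axis=1)   # initial guess: constant waveforms
    for _ in range(sweeps):
        Uold = U.copy()
        for i in range(m):
            g = f[i] - A[i, :i] @ U[:i] - A[i, i+1:] @ Uold[i+1:]
            aii = A[i, i]
            for n in range(N):                  # CN step for the scalar ODE
                U[i, n+1] = ((1 - tau*aii/2) * U[i, n]
                             + tau/2 * (g[n] + g[n+1])) / (1 + tau*aii/2)
    return U

# 1D Laplacian, h = 1/8 (7 interior points), T = 1, tau = 1/100
m, h, tau, N = 7, 1/8, 1/100, 100
A = (np.diag(2*np.ones(m)) - np.diag(np.ones(m-1), 1)
     - np.diag(np.ones(m-1), -1)) / h**2
f, u0 = np.ones(m), np.zeros(m)

# Reference: CN applied to the full coupled system
Uref = np.zeros((m, N + 1))
I = np.eye(m)
for n in range(N):
    Uref[:, n+1] = np.linalg.solve(I + tau/2*A, (I - tau/2*A) @ Uref[:, n] + tau*f)

err = np.abs(gauss_seidel_wr(A, f, u0, tau, N, 30) - Uref).max()
print(err)   # small: the WR iterates converge to the CN solution
```

The per-sweep error reduction observed with such a sketch is roughly the static Gauss–Seidel factor, which is exactly the slow baseline that Figure 4.10 shows convolution SOR dramatically improving.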
h        1/8    1/16   1/32   1/64
DSSOR    0.817  0.952  0.988  0.997
CSOR     0.441  0.676  0.819  0.908

Table 4.10: Averaged convergence factors of optimal double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 1, linear finite elements, T = 1, CN, τ = 1/100). Compare with Table 4.4.

multistep method  CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
DSSOR             0.952  0.952   0.952   0.953   1.219   2.124
CSOR              0.676  0.676   0.676   0.676   0.676   0.676

Table 4.11: Averaged convergence factors of optimal double-splitting (DSSOR) and convolution (CSOR) SOR WR for (3.47) (m = 1, linear finite elements, h = 1/16, T = 1, τ = 1/100). Compare with Table 4.5.
In Figure 4.5, the correctness of (4.75) is illustrated by the "+"-symbols, which correspond to observed averaged convergence factors of the double-splitting SOR WR method for (3.47) with m = 1, linear finite elements, CN time discretisation and τ = 1/1000. In Table 4.10 we present numerical results as a function of h for optimal double-splitting and CSOR WR. These values should be compared to the ones given in Table 4.4. Moreover, the CSOR results illustrate the correctness of (4.79). Finally, averaged convergence rates obtained with different time-discretisation formulae are given in Table 4.11. They match the theoretical values of Table 4.5 very well.
Linewise relaxation

A finite-element discretisation of (3.47) does not in general lead to a matrix zB + A that is consistently ordered for point relaxation. These matrices may, however, be consistently ordered for blockwise or linewise relaxation. As an illustration, we investigate the performance of linewise CSOR WR for the two-dimensional heat equation, discretised using linear or bilinear finite elements. For completeness, we also studied the method for the finite-difference discretisation of (3.47) with m = 2. The resulting matrices zB + A are block-consistently ordered, but unfortunately, the eigenvalues of K_τ^{JAC}(z) are in general not collinear (except for z = 1 and z = −1). Yet, a suitable convolution sequence (Ω_num)_τ is calculated by means of the numerical inverse Z-transform method in combination with formula (4.44). This technique only requires the computation of a single eigenvalue μ_τ^{(1)}(z) (which is the largest one in magnitude) for M values of z located equidistantly along the unit circle. As the spectra σ(K_τ^{JAC}(z)) are not collinear, the resulting kernel is not guaranteed to be the optimal one, or even a good one. Nevertheless, numerical evidence shows that this procedure yields excellent convergence rates for the problems considered. This robustness of the CSOR WR method is illustrated in Table 4.12, where we report convergence results for CN time discretisation with time step τ = 1/100. We also include the theoretical spectral radii of the optimal static, linewise SOR method for the corresponding linear systems Au = f. Observe that the latter, which can be approximated by 1 − 2√2 πh for small mesh size h [4, p. 152], agree very well with the averaged convergence factors of the linewise CSOR WR methods.

With the generalised CSOR theory, this robustness can now be explained in an intuitive manner as follows. When the eigenvalues of the Jacobi symbol are not too far from being on a line, any reasonable ellipse (most probably also the optimal one) will be very elongated, with p_τ(z)e^{iθ_τ(z)} ≈ μ_τ^{(1)}(z) and q_τ(z) small. The right-hand side of (4.44),
h                          1/8            1/16           1/32           1/64
finite differences         0.318 (0.322)  0.568 (0.572)  0.756 (0.757)  0.871 (0.870)
linear finite elements     0.320 (0.322)  0.569 (0.572)  0.757 (0.757)  0.870 (0.870)
bilinear finite elements   0.312 (0.317)  0.567 (0.571)  0.760 (0.757)  0.870 (0.870)

Table 4.12: Averaged convergence factors of linewise CSOR WR for (3.47) (m = 2, T = 1, CN, τ = 1/100). The theoretical spectral radii of the corresponding static linewise SOR method are given in parentheses.
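The parenthesised entries for the five-point finite-difference case follow from classical SOR theory: the line-Jacobi iteration for the five-point Laplacian has spectral radius μ = cos(πh)/(2 − cos(πh)), and the optimal SOR radius is (μ/(1 + √(1 − μ²)))², which behaves like 1 − 2√2 πh for small h. A quick check (standard formulas; a sketch, not thesis code):

```python
import math

def rho_line_sor(h):
    # Line-Jacobi spectral radius for the five-point Laplacian:
    mu = math.cos(math.pi * h) / (2.0 - math.cos(math.pi * h))
    # Optimal SOR: rho = omega_opt - 1 = (mu / (1 + sqrt(1 - mu**2)))**2
    return (mu / (1.0 + math.sqrt(1.0 - mu**2)))**2

for h in [1/8, 1/16, 1/32, 1/64]:
    exact = rho_line_sor(h)
    approx = 1.0 - 2.0 * math.sqrt(2.0) * math.pi * h
    print(f"h = 1/{round(1/h)}: rho = {exact:.3f}, "
          f"1 - 2*sqrt(2)*pi*h = {approx:.3f}")
```

The exact values reproduce the parenthesised finite-difference column of Table 4.12 (0.322, 0.572, 0.757, 0.870), while the 1 − 2√2 πh estimate is only accurate for the smaller mesh sizes.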
Figure 4.11: Eigenvalues ("+") of K_τ^{JAC}(z) and the optimal ellipses for several values of z = e^{iθ} for (3.47) (m = 2, linear finite elements, h = 1/8, linewise relaxation). The respective pictures for θ = 0, 3π/12, 6π/12 and 9π/12 are ordered from left to right, top to bottom.
Figure 4.12: |ε(z)| (upper picture), and ρ(K_τ^{CSOR,opt}(z)) and ρ(K_τ^{CSOR,appr}(z)) (lower picture) for several values of z = e^{iθ} for (3.47) (m = 2, linear finite elements, h = 1/8, linewise relaxation).
which we denote further on by (Ω̃_appr)_τ(z), is then a good approximation to the optimal (Ω̃_opt)_τ(z), corresponding to a formula of the form (4.67). If we set (Ω̃_appr)_τ(z) = (Ω̃_opt)_τ(z) + ε(z), then it is easy to derive from (4.64), e.g., by doing a series expansion with Mathematica, that

$$
\rho\big(K_\tau^{CSOR,opt}(z)\big)-\rho\big(K_\tau^{CSOR,appr}(z)\big)=O\big(\varepsilon(z)\big)\,,
\qquad \varepsilon(z)\to0\,.
$$

The overall spectral radius of the CSOR iteration is found by a maximisation procedure over the unit circle, see Theorem 4.2.16 and in particular formula (4.42). Hence, it is especially important for (Ω̃_appr)_τ(z) to be close to (Ω̃_opt)_τ(z) near the values of z for which ρ(K_τ^{CSOR,opt}(z)) is large. In our experiments, this always appeared to be near the value z = 1. Fortunately, this is exactly where the eigenvalues of the Jacobi symbol are collinear or nearly collinear.

The above discussion is illustrated by means of the above model problem, discretised using linear finite elements for h = 1/8. We computed ellipses surrounding σ(K_τ^{JAC}(z)) for several values of z = e^{iθ} on the unit circle, see Figure 4.11. These ellipses were obtained by choosing

$$
p_\tau(z)=\big|\mu_\tau^{(1)}(z)\big|\,,\qquad
\theta_\tau(z)=\operatorname{Arg}\big(\mu_\tau^{(1)}(z)\big)\,,
$$

and by determining q_τ(z) as the smallest value for which all eigenvalues of K_τ^{JAC}(z) lie in the closed interior of the resulting ellipse. There is no firm guarantee that these ellipses are truly optimal. Yet, numerical experiments evaluating formula (4.65) with the overrelaxation parameter from (4.67) for various neighbouring ellipses never led to a smaller value of the spectral radius. Hence, it seems reasonable to assume we have found an (at least locally) nearly optimal Ω̃_τ(z). In the upper picture of Figure 4.12, we plotted |ε(z)|, the modulus of the difference between the approximating (Ω̃_appr)_τ(z) and the one we assumed to be optimal, for several values of z = e^{iθ} on the unit circle. The difference between the corresponding spectral radii ρ(K_τ^{CSOR,opt}(z)) and ρ(K_τ^{CSOR,appr}(z)) is depicted in the lower picture of Figure 4.12. The collinearity of the spectrum of the Jacobi symbol implies that ε(e^{iθ}) = 0, and hence ρ(K_τ^{CSOR,opt}(e^{iθ})) = ρ(K_τ^{CSOR,appr}(e^{iθ})), for θ = 0 and θ = π. By noting that the maximum of the latter spectral radius over the unit circle is attained for θ in the neighbourhood of 0 (or z = e^{iθ} close to 1), we then derive that

$$
\rho\big(K_\tau^{CSOR,appr}\big)\approx\rho\big(K_\tau^{CSOR,opt}\big)\approx\rho\big(K_\tau^{CSOR,opt}(e^{i0})\big)\,.
$$

As the rightmost spectral radius of this expression equals the spectral radius of the optimal linewise SOR method for the linear system Au = f, this observation explains the robustness of the CSOR WR method, illustrated in Table 4.12.
4.4 Some concluding remarks

In this chapter, we investigated the performance of several SOR WR methods for general linear systems of ODEs. For the variants that are based on a multiplication with an
overrelaxation parameter, we were able to extend and confirm the results of [63, 64] to systems of the form (3.1) with nonsingular $B$. In particular, we showed the resulting SOR WR methods to converge only slightly faster than the Gauss–Seidel WR method. We also investigated the CSOR WR method, in which the multiplication with an overrelaxation parameter is replaced by a convolution with a time-dependent function. Model problem analyses showed the CSOR method to converge as fast as the optimal SOR method for the corresponding static problems if the optimal convolution kernel (defined in terms of its Laplace-transform expression) is used. In addition, we extended the original class of problems for which the Laplace transform of the optimal kernel is given, and commented on how to derive the optimal kernel in practice. In the following chapter, we shall examine the applicability of a similar convolution idea towards the acceleration of WR by Chebyshev techniques.
Chapter 5

Chebyshev Acceleration of Waveform Relaxation Methods

In this chapter, we investigate the possibility of accelerating WR convergence by Chebyshev techniques. In analogy with the SOR case, we demonstrate that a convolution-based approach yields much better results than a straightforward application of the static Chebyshev acceleration technique. In particular, we will show that application of the convolution idea results in WR methods that converge as fast as their static counterparts for a variety of problems.
5.1 Polynomial acceleration of waveform relaxation

Suppose we have a sequence of iterates $\{x^{(i)}\}$, generated by a classical WR method of the form (3.2),

$$M_B \dot{x}^{(\nu)} + M_A x^{(\nu)} = N_B \dot{x}^{(\nu-1)} + N_A x^{(\nu-1)} + f, \qquad x^{(\nu)}(0) = u_0, \quad \nu \ge 1, \tag{5.1}$$
with $x^{(0)}(t) \equiv u_0$. As a first way to expedite the convergence of such a sequence towards the solution of (3.1), we consider a straightforward extension of the linear acceleration techniques for static iterations, see, e.g., [24, Chap. 3] or [106, Chap. 11]. That is, we construct a new waveform sequence $\{u^{(\nu)}\}$ by setting

$$u^{(\nu)} = \sum_{i=0}^{\nu} \alpha_{\nu i}\, x^{(i)}, \quad \nu \ge 0. \tag{5.2}$$

The numbers $\alpha_{\nu i}$ are chosen such that $\sum_{i=0}^{\nu} \alpha_{\nu i} = 1$, which ensures that $u^{(\nu)} = u$ for all $\nu \ge 0$ whenever the initial waveform $x^{(0)}$ equals the exact solution $u$. By repeatedly inserting (3.9) (with $u^{(\nu)}$ replaced by $x^{(\nu)}$), we can rewrite (5.2) as

$$u^{(\nu)} = q_\nu(\mathcal{K})\, u^{(0)} + \varphi_{q_\nu},$$

with $\varphi_{q_\nu} = \sum_{i=1}^{\nu} \alpha_{\nu i}\bigl(\sum_{j=0}^{i-1} \mathcal{K}^j \varphi\bigr)$ and $q_\nu(s) = \sum_{i=0}^{\nu} \alpha_{\nu i}\, s^i$. In terms of this polynomial, the normalisation condition on the $\alpha_{\nu i}$-values becomes $q_\nu(1) = 1$.
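The error identity behind (5.2) can be checked numerically. The sketch below is an illustration only; the 2x2 contraction $K$, the source term $\varphi$ and the coefficients $\alpha_i$ are arbitrary choices, not taken from the thesis. It verifies that the accelerated error $u^{(\nu)} - u$ equals $q_\nu(K)$ applied to the initial error.

```python
# A 2x2 contraction K and source phi define the static iteration
# x^(i) = K x^(i-1) + phi, with fixed point u = (I - K)^{-1} phi.
K = [[0.5, 0.2], [0.1, 0.4]]
phi = [1.0, 2.0]

def matvec(M, v):
    return [sum(M[r][c] * v[c] for c in range(2)) for r in range(2)]

# Fixed point u solves the 2x2 system (I - K) u = phi (Cramer's rule).
a, b = 1 - K[0][0], -K[0][1]
c, d = -K[1][0], 1 - K[1][1]
det = a * d - b * c
u = [(d * phi[0] - b * phi[1]) / det, (-c * phi[0] + a * phi[1]) / det]

# Basic iterates x^(0), ..., x^(3)
x = [[0.0, 0.0]]
for i in range(3):
    x.append([xi + pi for xi, pi in zip(matvec(K, x[-1]), phi)])

# Accelerated iterate u^(3) = sum_i alpha_i x^(i), with sum alpha_i = 1
alpha = [0.1, 0.2, 0.3, 0.4]
u3 = [sum(alpha[i] * x[i][r] for i in range(4)) for r in range(2)]

# Error identity: u^(3) - u = sum_i alpha_i K^i (x^(0) - u) = q_3(K) e^(0)
e0 = [x0 - ui for x0, ui in zip(x[0], u)]
Kie0 = e0
pred = [alpha[0] * e for e in e0]
for i in range(1, 4):
    Kie0 = matvec(K, Kie0)
    pred = [p + alpha[i] * e for p, e in zip(pred, Kie0)]
print(max(abs((u3[r] - u[r]) - pred[r]) for r in range(2)))  # tiny residual
```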
The symbol of the iteration operator $q_\nu(\mathcal{K})$ can be identified by Laplace transforming (5.2),

$$\tilde{u}^{(\nu)}(z) = \sum_{i=0}^{\nu} \alpha_{\nu i}\, \tilde{x}^{(i)}(z) = q_\nu(K(z))\, \tilde{u}^{(0)}(z) + \tilde{\varphi}_{q_\nu}(z). \tag{5.3}$$

This expression corresponds to the polynomial acceleration of the iteration

$$\tilde{x}^{(\nu)}(z) = K(z)\, \tilde{x}^{(\nu-1)}(z) + \tilde{\varphi}(z), \tag{5.4}$$

obtained by Laplace transforming (5.1), towards the solution of (3.15). (In this chapter, we assume the systems (3.15) to be solvable for $\mathrm{Re}(z) \ge 0$. This assumption, which is satisfied if all eigenvalues of $B^{-1}A$ have positive real parts, turns out to be a very natural one, as it implies the boundedness of the solution $u$ to (3.1) if $f \in L_p$, see p. 28.) If all eigenvalues of $M_B^{-1}M_A$ have positive real parts, the operator $q_\nu(\mathcal{K})$, a linear combination of powers of $\mathcal{K}$, consists of a matrix multiplication and a convolution with an $L_1$-kernel. Hence, we have from Lemma 2.2.3 that

$$\rho\bigl(q_\nu(\mathcal{K})\bigr) = \sup_{\mathrm{Re}(z)\ge 0} \rho\bigl(q_\nu(K(z))\bigr) = \sup_{z\in i\mathbb{R}} \rho\bigl(q_\nu(K(z))\bigr). \tag{5.5}$$
The values $\rho(q_\nu(\mathcal{K}))$ and $\rho(q_\nu(K(z)))$ yield convergence factors over $\nu$ iterations. The corresponding asymptotic averaged spectral radii are defined by

$$\varrho_\nu\bigl(q_\nu(\mathcal{K})\bigr) := \lim_{\nu\to\infty}\Bigl(\rho\bigl(q_\nu(\mathcal{K})\bigr)\Bigr)^{1/\nu} \quad\text{and}\quad \varrho_\nu\bigl(q_\nu(K(z))\bigr) := \lim_{\nu\to\infty}\Bigl(\rho\bigl(q_\nu(K(z))\bigr)\Bigr)^{1/\nu},$$

see, e.g., [106, p. 299]. We maintain the notation of this latter reference, i.e., the left-hand sides contain a subscript $\nu$ to emphasise the use of a sequence of polynomials $\{q_\nu(s)\}$. The $\varrho_\nu$-value depends on this sequence of polynomials, and not on a particular polynomial $q_\nu(s)$, of course. Combined with (5.5), these definitions immediately lead to

$$\varrho_\nu\bigl(q_\nu(\mathcal{K})\bigr) = \sup_{\mathrm{Re}(z)\ge 0} \varrho_\nu\bigl(q_\nu(K(z))\bigr) = \sup_{z\in i\mathbb{R}} \varrho_\nu\bigl(q_\nu(K(z))\bigr). \tag{5.6}$$

The sequence of polynomials $\{q_\nu(s) : q_\nu(1) = 1\}$ that minimises (5.6) was studied in [68]. It was shown that only a marginal acceleration of the basic WR iteration is possible. Intuitively, this can be expected by the following argument. By interpreting $K(z)$ as an iteration matrix for the static linear system (3.15), we expect (from our knowledge of the static iteration theory) that $\varrho_\nu(q_\nu(K(z)))$ will usually be minimised by a sequence of Chebyshev polynomials, chosen in terms of the eigenvalue distribution of $K(z)$. As these eigenvalues depend on $z$, one should not hope that a single polynomial sequence will do well for all values of $z$ in the closed right-half complex plane. Better convergence results may be expected with a sequence of frequency-dependent polynomials, an idea which is addressed in the next section.
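The $z$-dependence can be made concrete for the Jacobi symbol of the 1-D model problem, whose spectrum at frequency $z$ is the $z = 0$ spectrum scaled by $d_a/(z + d_a)$ (an assumption matching the constant-diagonal case treated later in this chapter). The sketch below, with arbitrary degree $n = 10$ and an arbitrary sample frequency, evaluates at the extremal eigenvalue the Chebyshev polynomial tuned to the $z = 0$ segment $[-\mu_1, \mu_1]$: it damps well at $z = 0$ but amplifies at $z = i d_a$, so no single polynomial sequence works for all $z$.

```python
import math

# Jacobi symbol for u' + Au = f with A = (1/h^2) tridiag(-1, 2, -1):
# K(z) has eigenvalues (d_a/(z + d_a)) cos(i*pi*h), with d_a = 2/h^2,
# so the spectrum contracts and rotates as |z| grows along i*R.
h = 1.0 / 16
da = 2.0 / h ** 2                    # constant diagonal of A

def cheb(n, x):                      # T_n(x) for real or complex x
    t0, t1 = 1.0 + 0j, x
    for _ in range(n - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1 if n > 0 else t0

mu1 = math.cos(math.pi * h)          # extremal eigenvalue of K(0)

def damping(z, n=10):
    # |P_n| at the extremal eigenvalue of K(z), for the polynomial tuned
    # to the z = 0 segment: P_n(s) = T_n(s/mu1) / T_n(1/mu1).
    lam = (da / (z + da)) * mu1      # extremal eigenvalue of K(z)
    return abs(cheb(n, lam / mu1) / cheb(n, 1.0 / mu1))

r0 = damping(0.0)                    # tuned point: good damping
r1 = damping(1j * da)                # off-tuned frequency: amplification
print(r0, r1)
```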
5.2 Convolution-based polynomial acceleration of waveform relaxation

We first study the Chebyshev acceleration of (5.4) towards the solution of (3.15), and derive the necessary theoretical properties of the resulting methods. Next, we use these results to analyse the convergence behaviour of the convolution Chebyshev WR methods for (3.1), obtained by inverse Laplace transforming the latter iterative schemes.
5.2.1 Chebyshev acceleration in the frequency domain
The convergence of the Laplace-transformed waveform iterates $\{\tilde{x}^{(i)}(z)\}$ towards the solution of (3.15) can be improved by taking linear combinations of the latter, that is, by setting

$$\tilde{u}^{(\nu)}(z) = \sum_{i=0}^{\nu} \tilde{\beta}_{\nu i}(z)\, \tilde{x}^{(i)}(z), \quad \nu \ge 0. \tag{5.7}$$

Whereas the notation in equation (5.3) indicates that the same polynomial sequence $\{q_\nu(s)\}$ is used for all linear systems of the form (3.15), the notation in (5.7) suggests a different choice of coefficients for every specific value of $z$. By repeated insertion of (5.4), equation (5.7) can be rewritten as

$$\tilde{u}^{(\nu)}(z) = Q_\nu(z, K(z))\, \tilde{u}^{(0)}(z) + \tilde{\varphi}_{Q_\nu}(z), \tag{5.8}$$

with

$$\tilde{\varphi}_{Q_\nu}(z) = \sum_{i=1}^{\nu} \tilde{\beta}_{\nu i}(z)\Bigl(\sum_{j=0}^{i-1} K(z)^j\, \tilde{\varphi}(z)\Bigr), \qquad Q_\nu(z, s) = \sum_{i=0}^{\nu} \tilde{\beta}_{\nu i}(z)\, s^i,$$

and $Q_\nu(z, 1) = 1$, which ensures that $\tilde{u}^{(\nu)}(z) = \tilde{u}(z)$ for all $\nu \ge 0$ if $\tilde{x}^{(0)}(z) = \tilde{u}(z)$. The spectral radius of the iteration matrix in (5.8) is given by

$$\rho\bigl(Q_\nu(z, K(z))\bigr) = \max_{\lambda \in \sigma(K(z))} \bigl|Q_\nu(z, \lambda)\bigr|. \tag{5.9}$$

Since the spectrum of $K(z)$ is seldom known exactly, one will try to find polynomials $\{Q_\nu(z, s)\}$ that are small on a region containing $\sigma(K(z))$. In particular, we assume the eigenvalues of $K(z)$ to lie in a closed region $R(d(z), p(z), q(z), \psi(z))$, whose boundary equals the ellipse $E(d(z), p(z), q(z), \psi(z))$ centred around the complex point $d(z)$. This ellipse, illustrated in Figure 5.1, is given by

$$\lambda = d(z) + e^{i\psi(z)}\bigl(p(z)\cos(\phi) + i\, q(z)\sin(\phi)\bigr), \quad 0 \le \phi < 2\pi,$$

with semi-axes $p(z)$ and $q(z)$ that satisfy $p(z) \ge q(z) \ge 0$, and inclination angle $-\pi/2 \le \psi(z) < \pi/2$. When there is no confusion possible, we will denote this ellipse as $E(z)$; $R(d(z), p(z), q(z), \psi(z))$ will be abbreviated as $R(z)$. The spectral radius (5.9) is then obviously bounded by its virtual counterpart, defined as

$$\rho_R\bigl(Q_\nu(z, K(z))\bigr) := \max_{\lambda \in R(z)} \bigl|Q_\nu(z, \lambda)\bigr|.$$
Figure 5.1: An ellipse $E(d(z), p(z), q(z), \psi(z))$, with centre $d(z)$, semi-axes $p(z)$ and $q(z)$, and inclination angle $\psi(z)$.

That is, for any $Q_\nu(z, s)$ we have that

$$\rho\bigl(Q_\nu(z, K(z))\bigr) \le \rho_R\bigl(Q_\nu(z, K(z))\bigr).$$

In order to determine the averaged convergence rate per iteration, we define the virtual asymptotic averaged spectral radius as

$$\varrho_R\bigl(Q_\nu(z, K(z))\bigr) := \lim_{\nu\to\infty}\Bigl(\rho_R\bigl(Q_\nu(z, K(z))\bigr)\Bigr)^{1/\nu}.$$
This number can be minimised by choosing $\{Q_\nu(z, s)\}$ as a sequence of scaled and translated Chebyshev polynomials of the first kind (in the variable $s$). The following lemma follows in a straightforward way from the paper by Manteuffel on the Chebyshev iteration for nonsymmetric linear systems [61].

Lemma 5.2.1 Assume the spectrum $\sigma(K(z))$ lies in the region $R(z)$, which does not contain the point 1 and for which $p(z) > q(z) > 0$. In terms of

$$P_\nu^R(z, s) = \frac{T_\nu\left(\dfrac{s - d(z)}{c(z)}\right)}{T_\nu\left(\dfrac{1 - d(z)}{c(z)}\right)}, \tag{5.10}$$

with $T_\nu(\cdot)$ the $\nu$-th degree Chebyshev polynomial of the first kind and $c(z) = \sqrt{p^2(z) - q^2(z)}\, e^{i\psi(z)}$, we have

$$\varrho_R\bigl(P_\nu^R(z, K(z))\bigr) \le \varrho_R\bigl(Q_\nu(z, K(z))\bigr) \tag{5.11}$$

for all polynomial sequences $\{Q_\nu(z, s) : Q_\nu(z, 1) = 1\}$. In particular,

$$\varrho_R\bigl(P_\nu^R(z, K(z))\bigr) = \frac{p(z) + q(z)}{\bigl|\, 1 - d(z) + \sqrt{(1 - d(z))^2 - c^2(z)}\, \bigr|}, \tag{5.12}$$
where the branch of the square root is chosen such that $\sqrt{(1 - d(z))^2} = 1 - d(z)$.

Proof. In [61], Manteuffel discusses the convergence of the Chebyshev-accelerated Richardson iteration for the nonsymmetric linear system $Ax = b$. His results apply immediately to our case with the provision that $A$ and $s$ in his paper are replaced by $I - K(z)$ and $1 - s$. More precisely, equation (5.11) follows immediately from [61, Thms. 2.4 and 2.8]. Since $P_\nu^R(z, \lambda)$ is an analytic function of $\lambda$, its maximum modulus in the region $R(z)$ is attained on the boundary. Consequently, $\varrho_R(P_\nu^R(z, K(z)))$ is given by

$$\lim_{\nu\to\infty} \max_{\lambda \in E(z)} \bigl|P_\nu^R(z, \lambda)\bigr|^{1/\nu}.$$

For arbitrary $\lambda \notin [d(z) - c(z),\, d(z) + c(z)]$, that is, when $\lambda$ is not on the interval connecting the focal points of the ellipse, one has

$$\lim_{\nu\to\infty} \bigl|P_\nu^R(z, \lambda)\bigr|^{1/\nu} = \frac{\bigl|\, \lambda - d(z) + \sqrt{(\lambda - d(z))^2 - c^2(z)}\, \bigr|}{\bigl|\, 1 - d(z) + \sqrt{(1 - d(z))^2 - c^2(z)}\, \bigr|}, \tag{5.13}$$

which is shown to be constant for all $\lambda \in E(z)$ [61, Eqs. (3.1) and (2.13)]. The proof is then completed by inserting $\lambda = d(z) + p(z)e^{i\psi(z)}$ (the point on $E(z)$ which corresponds to $\phi = 0$) in the right-hand side of (5.13). □

Similar results can be proven for degenerate cases of the region $R(z)$. When this region is chosen to be a disc with midpoint $d(z)$ and radius $p(z)\ (= q(z))$, the conclusions of Lemma 5.2.1 remain valid in terms of the polynomials

$$P_\nu^R(z, s) = \left(\frac{s - d(z)}{1 - d(z)}\right)^{\nu} \tag{5.14}$$

[61, Thm. 2.5]. Moreover, the resulting iteration is equivalent to the extrapolated method based on (5.4), so that only moderate convergence acceleration can be achieved in this case [24, Chap. 12, p. 335]. For the other degenerate case, which occurs when $\sigma(K(z))$ lies on a closed line segment $R(d(z), p(z), 0, \psi(z)) = E(d(z), p(z), 0, \psi(z))$, we have the following lemma.

Lemma 5.2.2 If $\sigma(K(z))$ lies on the line segment $R(d(z), p(z), 0, \psi(z))$, which does not contain the point 1, then the results of Lemma 5.2.1 remain valid with $q(z) = 0$ and $c(z) = p(z)e^{i\psi(z)}$.

Proof. The optimality of the Chebyshev polynomials in the sense of (5.11) for a line segment follows from the equality of the polynomials (5.10), with $q(z) = 0$ and $c(z) = p(z)e^{i\psi(z)}$, and the normalised Faber polynomials for this line segment, see, e.g., [16, Ex. 1]. Moreover, $P_\nu^R(z, \lambda)$ attains its maximum modulus on $R(d(z), p(z), 0, \psi(z))$ in the endpoints $\lambda = d(z) \pm p(z)e^{i\psi(z)}$. For these $\lambda$ and the specified $q(z)$ and $c(z)$, the numerator of (5.10) evaluates to $T_\nu(\pm 1)$. Hence,

$$\varrho_R\bigl(P_\nu^R(z, K(z))\bigr) = \lim_{\nu\to\infty} \left| T_\nu\!\left(\frac{1 - d(z)}{p(z)e^{i\psi(z)}}\right) \right|^{-1/\nu}.$$
Elaborating the latter equation (by using the techniques of [61]) leads immediately to (5.12) with $q(z) = 0$. □

The problem of determining an optimal ellipse, that is, an ellipse surrounding $\sigma(K(z))$ for which (5.12) is as small as possible, is addressed next.
Lemma 5.2.3 Any optimal ellipse $E_{\mathrm{opt}}(z)$ contains an eigenvalue of $K(z)$. If $\sigma(K(z))$ is collinear, the optimal ellipse $E_{\mathrm{opt}}(z)$ equals the line segment linking the extremal eigenvalues of $K(z)$.

Proof. Suppose $\sigma(K(z))$ is not collinear and let $E_1(z)$ be a surrounding ellipse that does not contain an eigenvalue of $K(z)$. Then, there exists an elliptic region $R_2(z) \subset R_1(z)$ (with the same midpoint and inclination angle as the first one) containing $\sigma(K(z))$. More precisely, we have

$$\varrho_{R_2}\bigl(P_\nu^{R_2}(z, K(z))\bigr) \le \varrho_{R_2}\bigl(P_\nu^{R_1}(z, K(z))\bigr) < \varrho_{R_1}\bigl(P_\nu^{R_1}(z, K(z))\bigr). \tag{5.15}$$

The first inequality follows from (5.11), with $P_\nu^{R_1}(z, \lambda)$ the polynomial (5.10) corresponding to the region $R_1(z)$, while the second (strict) inequality is an immediate consequence of the maximum principle. As a result, $E_1(z)$ cannot be optimal. In the collinear case, we have, by a convexity argument, that any elliptic region $R(z)$ enclosing $\sigma(K(z))$ must necessarily contain the line segment $R_{\mathrm{opt}}(z)$. Hence, we have as in (5.15) that

$$\varrho_{R_{\mathrm{opt}}}\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K(z))\bigr) \le \varrho_{R_{\mathrm{opt}}}\bigl(P_\nu^{R}(z, K(z))\bigr) < \varrho_{R}\bigl(P_\nu^{R}(z, K(z))\bigr),$$

and the proof is completed. □

Corollary 5.2.4 For an optimal ellipse, the virtual asymptotic averaged spectral radius is actually attained, i.e., we have

$$\varrho\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K(z))\bigr) = \varrho_{R_{\mathrm{opt}}}\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K(z))\bigr). \tag{5.16}$$

Proof. If $\sigma(K(z))$ is not collinear, equation (5.16) is proved by noting that the optimal variant of (5.13) is constant for all $\lambda \in E_{\mathrm{opt}}(z)$, at least one point of which belongs to $\sigma(K(z))$ by Lemma 5.2.3. In the collinear case, (5.16) follows from the fact that $|P_\nu^{R_{\mathrm{opt}}}(z, \lambda)|$ attains its maximum on $R_{\mathrm{opt}}(z)$ in the endpoints of this interval, which are eigenvalues of $K(z)$. □
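Formula (5.12) can be checked numerically for a fixed real ellipse. The following sketch is an illustration only; the parameters $d$, $p$, $q$ are arbitrary choices satisfying the hypotheses of Lemma 5.2.1. It compares the finite-$\nu$ quantity $|T_\nu((\lambda-d)/c)/T_\nu((1-d)/c)|^{1/\nu}$, evaluated at the endpoint $\lambda = d + p$, against the closed-form limit.

```python
import math

# Parameters of a real ellipse E(d, p, q, 0) not containing the point 1
d, p, q = 0.5, 0.3, 0.1
c = math.sqrt(p * p - q * q)          # focal half-distance

def cheb(n, x):                       # T_n(x) by three-term recurrence
    t0, t1 = 1.0, x
    for _ in range(n - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

nu = 200
lam = d + p                           # endpoint of the ellipse (phi = 0)
lhs = abs(cheb(nu, (lam - d) / c) / cheb(nu, (1 - d) / c)) ** (1.0 / nu)

# Right-hand side of (5.12)
rhs = (p + q) / abs(1 - d + math.sqrt((1 - d) ** 2 - c ** 2))
print(lhs, rhs)                       # the two values agree closely
```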
In general, Lemma 5.2.3 does not provide enough information to define the optimal ellipse in the case where the eigenvalues of $K(z)$ are not collinear. Guided by one's intuition that such an ellipse should be as small as possible, and by the knowledge that it should pass through at least one eigenvalue, one can then try to find a "good" surrounding ellipse, for which (5.12) is close to its minimal value. This task turns out to be rather difficult as well.

Finally, iteration (5.7), where $\tilde{\beta}_{\nu i}(z)$ are the coefficients of the polynomials (5.10), can be transformed into a form that is more convenient for computation. Assume $q(z) < p(z)$,
so that $c(z) \ne 0$. The transformation can be done by using the recurrence relation of the Chebyshev polynomials. It is easy to derive that

$$P_0(z, s) = 1, \qquad P_1(z, s) = \tilde{\Gamma}(z)\,s - \tilde{\Gamma}(z) + 1,$$
$$P_\nu(z, s) = \tilde{\Delta}_\nu(z)\bigl(\tilde{\Gamma}(z)\,s + 1 - \tilde{\Gamma}(z)\bigr)P_{\nu-1}(z, s) + \bigl(1 - \tilde{\Delta}_\nu(z)\bigr)P_{\nu-2}(z, s), \quad \nu \ge 2,$$

where $\tilde{\Gamma}(z) = 1/(1 - d(z))$ and

$$\tilde{\Delta}_\nu(z) = \frac{2\,(1 - d(z))}{c(z)} \cdot \frac{T_{\nu-1}\left(\dfrac{1 - d(z)}{c(z)}\right)}{T_\nu\left(\dfrac{1 - d(z)}{c(z)}\right)}, \quad \nu \ge 2.$$

By applying the Chebyshev recurrence relation once more, one derives

$$\tilde{\Delta}_\nu(z) = \begin{cases} \left(1 - \tfrac{1}{2}\sigma^2(z)\right)^{-1}, & \nu = 2, \\[4pt] \left(1 - \tfrac{1}{4}\sigma^2(z)\,\tilde{\Delta}_{\nu-1}(z)\right)^{-1}, & \nu \ge 3, \end{cases}$$

with $\sigma^2(z) = c^2(z)/(1 - d(z))^2$, see also [24, Chap. 12]. It then follows from [24, Thm. 3-2.1] that (5.7) can be rewritten as

$$\tilde{u}^{(1)}(z) = \tilde{u}^{(0)}(z) + \tilde{\Gamma}(z)\bigl(\hat{\tilde{u}}^{(1)}(z) - \tilde{u}^{(0)}(z)\bigr),$$
$$\tilde{u}^{(\nu)}(z) = \tilde{u}^{(\nu-2)}(z) + \tilde{\Delta}_\nu(z)\Bigl(\tilde{\Gamma}(z)\bigl(\hat{\tilde{u}}^{(\nu)}(z) - \tilde{u}^{(\nu-1)}(z)\bigr) + \tilde{u}^{(\nu-1)}(z) - \tilde{u}^{(\nu-2)}(z)\Bigr), \quad \nu \ge 2, \tag{5.17}$$

with

$$\hat{\tilde{u}}^{(\nu)}(z) = K(z)\,\tilde{u}^{(\nu-1)}(z) + \tilde{\varphi}(z), \quad \nu \ge 1. \tag{5.18}$$

If $c(z) = 0$, it follows from (5.14) that $\tilde{\Gamma}(z) = 1/(1 - d(z))$ and $\tilde{\Delta}_\nu(z) = 1$ for $\nu \ge 2$. As a result, (5.17) then becomes

$$\tilde{u}^{(\nu)}(z) = \tilde{u}^{(\nu-1)}(z) + \tilde{\Gamma}(z)\bigl(\hat{\tilde{u}}^{(\nu)}(z) - \tilde{u}^{(\nu-1)}(z)\bigr), \quad \nu \ge 1.$$
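For each fixed real $z$, (5.17)-(5.18) is the classical Chebyshev acceleration of the static iteration $\tilde{x} = K\tilde{x} + \tilde{\varphi}$. The sketch below is an illustration only; the diagonal matrix $K$ with spectrum in $[d - c, d + c]$ and the iteration count are arbitrary choices. It runs the $\tilde{\Gamma}/\tilde{\Delta}_\nu$ recurrence and compares the resulting error after 20 sweeps with that of the unaccelerated iteration.

```python
# Chebyshev acceleration (5.17)-(5.18) at a fixed real z, i.e. for the
# static iteration x = K x + phi. K is diagonal for transparency; its
# spectrum lies in [d - c, d + c] with d = 0.7, c = 0.2.
K = [0.9, 0.5]                       # eigenvalues of a diagonal K
phi = [1.0, 1.0]
u_exact = [phi[i] / (1 - K[i]) for i in range(2)]

d, c = 0.7, 0.2
gamma = 1.0 / (1 - d)                # Gamma in (5.17)
sigma2 = c * c / (1 - d) ** 2

def step(u):                         # unaccelerated sweep (5.18)
    return [K[i] * u[i] + phi[i] for i in range(2)]

u_prev = [0.0, 0.0]                  # u^(0)
uhat = step(u_prev)
u_cur = [u_prev[i] + gamma * (uhat[i] - u_prev[i]) for i in range(2)]
delta = 1.0
for nu in range(2, 21):              # Delta_nu recurrence, nu = 2..20
    delta = 1.0 / (1 - sigma2 / 2) if nu == 2 else 1.0 / (1 - sigma2 * delta / 4)
    uhat = step(u_cur)
    u_new = [u_prev[i] + delta * (gamma * (uhat[i] - u_cur[i])
                                  + u_cur[i] - u_prev[i]) for i in range(2)]
    u_prev, u_cur = u_cur, u_new

# Compare with 20 unaccelerated sweeps from the same start
v = [0.0, 0.0]
for _ in range(20):
    v = step(v)

err_cheb = max(abs(u_cur[i] - u_exact[i]) for i in range(2))
err_basic = max(abs(v[i] - u_exact[i]) for i in range(2))
print(err_cheb, err_basic)           # accelerated error is far smaller
```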
5.2.2 Convolution-based Chebyshev acceleration in the time domain

By inverse Laplace transforming the $z$-dependent iteration scheme (5.7), we get

$$u^{(\nu)} = \sum_{i=0}^{\nu} \beta_{\nu i} * x^{(i)}, \quad \nu \ge 0. \tag{5.19}$$

The normalisation condition becomes $\sum_{i=0}^{\nu} \beta_{\nu i}(t) = \delta(t)$, with $\delta(t)$ the delta function. The functions $\beta_{\nu i}$ are the inverse Laplace transforms of the $z$-dependent coefficients
of the polynomial $Q_\nu(z, s)$. Method (5.19) is a convolution-based polynomial waveform relaxation method. If the polynomials $Q_\nu(z, s)$ are of Chebyshev type (5.10), inverse Laplace transforming (5.17)-(5.18) yields the mathematically equivalent iterative scheme

$$u^{(1)} = u^{(0)} + \Gamma * \bigl(\hat{u}^{(1)} - u^{(0)}\bigr),$$
$$u^{(\nu)} = u^{(\nu-2)} + \Delta_\nu * \Bigl(\Gamma * \bigl(\hat{u}^{(\nu)} - u^{(\nu-1)}\bigr) + u^{(\nu-1)} - u^{(\nu-2)}\Bigr), \quad \nu \ge 2. \tag{5.20}$$

Here, $\hat{u}^{(\nu)}$ is obtained by application of the unaccelerated WR method to $u^{(\nu-1)}$, i.e.,

$$\hat{u}^{(\nu)} = \mathcal{K}u^{(\nu-1)} + \varphi, \quad \nu \ge 1, \tag{5.21}$$

and the convolution kernels $\Gamma$ and $\Delta_\nu$ are the inverse Laplace transforms of the functions $\tilde{\Gamma}(z)$ and $\tilde{\Delta}_\nu(z)$. The iteration operator mapping $u^{(0)}$ into $u^{(\nu)}$ is identified in the next lemma.

Lemma 5.2.5 Assume all eigenvalues of $M_B^{-1}M_A$ have positive real parts, $\Gamma(t) = \gamma\,\delta(t) + \Gamma_c(t)$ and $\Delta_i(t) = \delta_i\,\delta(t) + (\Delta_i)_c(t)$, with $\Gamma_c$ and $(\Delta_i)_c$ ($2 \le i \le \nu$) in $L_1(0, \infty)$. Then, iteration (5.20)-(5.21) can be rewritten as

$$u^{(\nu)} = \mathcal{K}_\nu^{CH}\, u^{(0)} + \varphi_\nu^{CH}, \tag{5.22}$$

where $\mathcal{K}_\nu^{CH}$ consists of a matrix multiplication and a linear Volterra convolution with an $L_1(0, \infty)$-kernel, and $\varphi_\nu^{CH} = \sum_{i=1}^{\nu} \beta_{\nu i} * \bigl(\sum_{j=0}^{i-1} \mathcal{K}^j \varphi\bigr)$, with $\beta_{\nu i}$ the inverse Laplace transforms of the coefficients of (5.10).

Proof. The condition on the eigenvalues of $M_B^{-1}M_A$ ensures that the operator $\mathcal{K}$ of the unaccelerated method is of the form $K + \mathcal{K}_c$, where $K$ is a matrix and $\mathcal{K}_c$ is a Volterra convolution operator whose kernel belongs to $L_1(0, \infty)$, see Section 3.2.1. The nature of $\mathcal{K}_\nu^{CH}$ follows as the space of such operators is closed under addition and convolution. □

Remark 5.2.1 The assumptions on $\Gamma$ and $\Delta_i$ are satisfied when their Laplace transforms $\tilde{\Gamma}(z)$ and $\tilde{\Delta}_i(z)$ are bounded and analytic in an open domain containing the closed right half of the complex plane, see Remark 4.2.1.

Under the assumptions of Lemma 5.2.5, Lemma 2.2.3 implies that the spectral radius of the operator $\mathcal{K}_\nu^{CH}$ is given by

$$\rho\bigl(\mathcal{K}_\nu^{CH}\bigr) = \sup_{\mathrm{Re}(z)\ge 0} \rho\bigl(P_\nu^R(z, K(z))\bigr) = \sup_{z\in i\mathbb{R}} \rho\bigl(P_\nu^R(z, K(z))\bigr).$$

The corresponding asymptotic averaged spectral radius, which equals

$$\varrho\bigl(\mathcal{K}^{CH}\bigr) = \sup_{z\in i\mathbb{R}} \varrho\bigl(P_\nu^R(z, K(z))\bigr), \tag{5.23}$$

can be bounded by its virtual counterpart

$$\sup_{z\in i\mathbb{R}} \varrho_R\bigl(P_\nu^R(z, K(z))\bigr) = \sup_{z\in i\mathbb{R}} \frac{p(z) + q(z)}{\bigl|\, 1 - d(z) + \sqrt{(1 - d(z))^2 - c^2(z)}\, \bigr|}, \tag{5.24}$$

the equality of which follows from Lemma 5.2.1 and the assumption that the ellipses $R(z)$ do not contain the point 1. Also, Corollary 5.2.4 implies that (5.23) equals (5.24) if the latter ellipses are optimal. In the remainder of this chapter, we shall often omit the subscript $\nu$ in the notation of the operator $\mathcal{K}_\nu^{CH}$, which shall be referred to as the (convolution-based) Chebyshev waveform relaxation operator.
5.3 Model problem analysis

In this section, we shall discuss some specific results for the Chebyshev acceleration of several waveform variants and apply them to the semi-discretised heat equation (3.47). We will show that the convergence behaviour of the convolution Chebyshev WR methods is similar to that of the corresponding static Chebyshev iterations, and illustrate these theoretical results by means of numerical experiments.
5.3.1 The Chebyshev-Picard method

Discussion of the Picard method

The Picard method, which is only defined for ODE systems (3.1) with $B = I$, can be written as an iteration of the form (5.1) with $M_B = I$, $N_B = M_A = 0$ and $N_A = -A$. Consequently, $M_B^{-1}M_A$ equals the zero matrix and the corresponding operator $\mathcal{K}_{\mathrm{PIC}}$ can only be investigated in the weighted spaces $L_p^\sigma(0, \infty)$. For the finite-difference discretisation of equation (3.47), it is well known that the eigenvalues of $A$ are given by

$$\begin{cases} 2\bigl(1 - \cos(i\pi h)\bigr)/h^2, & i = 1, \dots, 1/h - 1, & m = 1, \\ 2\bigl(2 - \cos(i\pi h) - \cos(j\pi h)\bigr)/h^2, & i, j = 1, \dots, 1/h - 1, & m = 2, \\ 2\bigl(3 - \cos(i\pi h) - \cos(j\pi h) - \cos(k\pi h)\bigr)/h^2, & i, j, k = 1, \dots, 1/h - 1, & m = 3. \end{cases}$$

Hence, we derive

$$\sigma\bigl(K_{\mathrm{PIC}}(z)\bigr) = \sigma\left(-\frac{A}{z}\right) \subset -\frac{1}{z}\left[\frac{2m}{h^2}\bigl(1 - \cos(\pi h)\bigr),\; \frac{2m}{h^2}\bigl(1 + \cos(\pi h)\bigr)\right] \tag{5.25}$$

and

$$\rho\bigl(\mathcal{K}_{\mathrm{PIC}}\bigr) = \sup_{z\in\sigma + i\mathbb{R}} \rho\bigl(K_{\mathrm{PIC}}(z)\bigr) = \frac{2m}{\sigma h^2}\bigl(1 + \cos(\pi h)\bigr),$$

i.e., the spectral radius of $\mathcal{K}_{\mathrm{PIC}}$ equals that of $K_{\mathrm{PIC}}(\sigma)$. The latter corresponds to the iteration matrix of the Richardson iteration $\sigma u^{(\nu)} = -Au^{(\nu-1)} + f$ for the linear system $(\sigma I + A)u = f$. The Picard method appears to be divergent in $L_p^\sigma(0, \infty)$ for $\sigma < 2m(1 + \cos(\pi h))/h^2$. The window of stable convergence can therefore be approximated by $[0,\; h^2/(2m(1 + \cos(\pi h)))]$. Since this window is very small, the Picard iteration does not have any practical use for the model heat problem. These results are illustrated for (3.47) with $m = 1$ and $h = 1/16$ in Figure 5.2, where we plotted contour lines of $\rho(K_{\mathrm{PIC}}(z))$ in the complex plane. For this problem, the window of convergence equals $[0,\, 0.000986]$.
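A quick computation reproduces the quoted window of stable convergence; the script below assumes the spectral-radius expression derived above for the finite-difference discretisation.

```python
import math

# Spectral radius of the Picard symbol K(z) = -A/z on the line Re z = sigma,
# for the m-dimensional model problem: rho = 2m(1 + cos(pi h)) / (sigma h^2).
m, h = 1, 1.0 / 16
lam_max = 2 * m * (1 + math.cos(math.pi * h)) / h ** 2  # largest eigenvalue of A

sigma_crit = lam_max          # rho < 1 requires sigma > lam_max
window = 1.0 / sigma_crit     # approximate window of stable convergence
print(window)                 # about 0.000986 for m = 1, h = 1/16 (cf. the text)
```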
Chebyshev acceleration of the Picard method
The Picard method can be accelerated quite well by the convolution-based Chebyshev approach, which is perhaps somewhat surprising. One of the virtues of the Picard method is that for its acceleration we only need a good/optimal ellipse surrounding $\sigma(A)$; the multiplication of this ellipse by $-1/z$ is expected to lead to a good/optimal ellipse for the symbol $K_{\mathrm{PIC}}(z) = -A/z$ at other values of $z \ne 0$ too. In particular, Lemma 5.2.3 implies that the line segment given in (5.25) is the optimal ellipse surrounding $\sigma(K_{\mathrm{PIC}}(z))$ with
Figure 5.2: Spectral picture of the Picard iteration for (3.47) ($m = 1$, finite differences, $h = 1/16$).
Figure 5.3: Spectral picture of the Chebyshev-Picard iteration for (3.47) ($m = 1$, finite differences, $h = 1/16$).
$z \ne 0$ for the finite-difference discretisation of (3.47). We prove the following statement for the asymptotic averaged spectral radius of the resulting optimal Chebyshev-Picard operator $\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}$ in $L_p^\sigma(0, \infty)$.

Theorem 5.3.1 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider $\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}$ as an operator in $L_p^\sigma(0, \infty)$ with $1 \le p \le \infty$, we have (for small $h$) that

$$\varrho\bigl(\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}\bigr) = \frac{\cos(\pi h)}{1 + \sqrt{1 - \cos^2(\pi h)}} \approx 1 - \pi h. \tag{5.26}$$

Proof. First note that $A = K_{\mathrm{PIC}}(-1)$. By using (5.25), the parameters of the optimal line segment $E_{\mathrm{opt}}(z)$ surrounding $\sigma(K_{\mathrm{PIC}}(z))$ equal $d_{\mathrm{opt}}(z) = -d_{\mathrm{opt}}(-1)/z = -2m/(zh^2)$ and $c_{\mathrm{opt}}(z) = -c_{\mathrm{opt}}(-1)/z = -2m\cos(\pi h)/(zh^2)$ (the sign of $c$ is immaterial, since $T_\nu$ is either even or odd). Hence, Lemma 5.2.1 implies for $z \ne 0$

$$P_\nu^{R_{\mathrm{opt}}}\bigl(z, K_{\mathrm{PIC}}(z)\bigr) = \frac{T_\nu\left(\dfrac{K_{\mathrm{PIC}}(z) - d_{\mathrm{opt}}(z)I}{c_{\mathrm{opt}}(z)}\right)}{T_\nu\left(\dfrac{1 - d_{\mathrm{opt}}(z)}{c_{\mathrm{opt}}(z)}\right)} = \frac{T_\nu\left(\dfrac{A - d_{\mathrm{opt}}(-1)I}{c_{\mathrm{opt}}(-1)}\right)}{T_\nu\left(\dfrac{z + d_{\mathrm{opt}}(-1)}{c_{\mathrm{opt}}(-1)}\right)}. \tag{5.27}$$

Since the (zero) eigenvalues of $M_B^{-1}M_A$ do not have positive real parts, we cannot apply Lemma 5.2.5 in order to prove that the optimal Chebyshev-Picard operator $\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}$ consists of a matrix multiplication and an $L_1$-convolution part. Yet, the conditions of Lemma 5.2.5 are not necessary to this end, and we shall adopt a different approach here. That is, we observe that $z = 0$ is a removable singularity of $P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{PIC}}(z))$, or, more precisely, that the rightmost expression of (5.27) is bounded and analytic for $\mathrm{Re}(z) \ge 0$. As in Remark 5.2.1, its inverse Laplace transform $\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}$ then consists of a matrix multiplication and an $L_1$-convolution part. Using Corollary 5.2.4, (5.23)-(5.24) becomes

$$\varrho\bigl(\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}}\bigr) = \sup_{z\in i\mathbb{R}} \varrho\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{PIC}}(z))\bigr) = \sup_{z\in i\mathbb{R}} \varrho_{R_{\mathrm{opt}}}\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{PIC}}(z))\bigr). \tag{5.28}$$

Since the optimal line segment does not contain the point 1 for $z \ne 0$, Lemma 5.2.1 implies that

$$\varrho_{R_{\mathrm{opt}}}\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{PIC}}(z))\bigr) = \frac{2m\cos(\pi h)}{\bigl|\, zh^2 + 2m + \sqrt{(zh^2 + 2m)^2 - (2m\cos(\pi h))^2}\, \bigr|} \tag{5.29}$$

for $z = i\xi$, $\xi \ne 0$. For $z = 0$, the polynomial (5.27) can be rewritten as the optimal Chebyshev polynomial for accelerating the convergence of the static Richardson iteration $u^{(\nu)} = (I - A)u^{(\nu-1)} + f$, which equals

$$\frac{T_\nu\left(\dfrac{(I - A) - \bigl(1 - d_{\mathrm{opt}}(-1)\bigr)I}{c_{\mathrm{opt}}(-1)}\right)}{T_\nu\left(\dfrac{1 - \bigl(1 - d_{\mathrm{opt}}(-1)\bigr)}{c_{\mathrm{opt}}(-1)}\right)}. \tag{5.30}$$

Application of Lemma 5.2.1 to the latter polynomial leads exactly to the right-hand side of (5.29), evaluated at $z = 0$.
Finally, we remark that the supremum in (5.28) of the function in (5.29) is attained at the origin. Hence, the proof is completed by inserting $z = 0$ in (5.29) and calculating a series expansion for small $h$. □

In Figure 5.3, we plotted contour lines of $\varrho(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{PIC}}(z)))$ (for the one-dimensional heat equation with $h = 1/16$) in the complex plane.
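The value of (5.29) at $z = 0$, which by the theorem equals $\varrho(\mathcal{K}_{\mathrm{CH-PIC}}^{\mathrm{opt}})$, simplifies to $\cos(\pi h)/(1 + \sin(\pi h))$ and approaches $1 - \pi h$ as $h \to 0$. A small numerical check (illustration only; the grid sizes are arbitrary choices):

```python
import math

# Optimal Chebyshev-Picard asymptotic factor (5.29) at z = 0, 1-D problem
m = 1
for h in [1.0 / 16, 1.0 / 32, 1.0 / 64]:
    num = 2 * m * math.cos(math.pi * h)
    den = 2 * m + math.sqrt((2 * m) ** 2 - num ** 2)
    rho = num / den              # equals cos(pi h) / (1 + sin(pi h))
    print(h, rho, 1 - math.pi * h)   # rho approaches 1 - pi h as h shrinks
```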
Relation to previously published work by Lubich and Skeel
In [54, 85], Lubich and Skeel already discussed a related Chebyshev-Picard method. They define a shifted Picard iteration in terms of

$$\dot{u}^{(\nu)} + \eta_\nu u^{(\nu)} = f - Au^{(\nu-1)} + \eta_\nu u^{(\nu-1)}, \qquad u^{(\nu)}(0) = u_0, \tag{5.31}$$

with $\eta_\nu = d - c\,\mu_\nu$ and $\nu = 1, 2, \dots, N$. Here, $\mu_\nu = \cos\bigl((2\nu - 1)\pi/2N\bigr)$ denote the zeros of $T_N(\mu)$, while $d$ and $c$ are the usual parameters of an ellipse that was chosen to enclose the spectrum of $A$. By Laplace transformation of (5.31) we can check that the resulting waveform $u^{(N)}(t)$ corresponds to the $N$-th waveform of our scheme (5.20)-(5.21), with $d(z) = -d/z$, $c(z) = c/z$ and $\psi(z) = \mathrm{Arg}(c(z))$ for $z \ne 0$. Hence, for this particular selection of the ellipses $E(z)$, both methods give similar results (after $N$ steps).

Assume that the centre of the ellipse $E$ surrounding $\sigma(A)$ lies on the real axis, while its major semi-axis is parallel to one of the co-ordinate axes. For these choices of the ellipse, Lubich proved that in $L_p(0, T)$, the Chebyshev-Picard method for $\dot{u} + Au = f$ gives at least the same error reduction as the Chebyshev acceleration of the Richardson method $\frac{1}{T}u^{(\nu)} = -Au^{(\nu-1)} + f$ for the static linear system

$$\left(\frac{1}{T}I + A\right)u = f \tag{5.32}$$

[54, p. 537]. For the reader's convenience, we rewrite the result (for the case that the major axis of $E$ lies on the real line) in our notations. The other case is completely similar.

Lemma 5.3.2 Consider an ODE system (3.1) with $B = I$. Assume the major axis of the ellipse chosen to enclose $\sigma(A)$ lies on the real axis of the complex plane. Then, the resulting Chebyshev-Picard iteration satisfies

$$\bigl\| e^{(\nu)} \bigr\|_{L_p(0,T)} \le \left\| P_\nu^R\!\left(\frac{1}{T},\, K_{\mathrm{PIC}}\!\left(\frac{1}{T}\right)\right) \right\| \, \bigl\| e^{(0)} \bigr\|_{L_p(0,T)} \tag{5.33}$$

for $1 \le p \le \infty$ and $0 < T \le \infty$, with $\|\cdot\|$ the matrix norm induced by the chosen vector norm in $\mathbb{C}^d$. Observe that for $T = \infty$, $P_\nu^R(0, K_{\mathrm{PIC}}(0))$ has to be interpreted as in (5.30), as the scaled and translated Chebyshev polynomial corresponding to the matrix $I - A$.

As such, the latter lemma, applied to the finite-difference discretisation of (3.47) with $T = \infty$ and with the ellipse $E$ surrounding $\sigma(A)$ equal to the optimal line segment $\bigl[2m(1 - \cos(\pi h))/h^2,\; 2m(1 + \cos(\pi h))/h^2\bigr]$, can be regarded as the norm equivalent of Theorem 5.3.1.
5.3.2 Chebyshev-Jacobi waveform relaxation

The B = I case

According to Chapter 3, the Jacobi WR method for $\dot{u} + Au = f$ is defined as iteration (5.1) with $M_B = I$, $N_B = 0$, $M_A = D_A$ and $N_A = L_A + U_A$. If we suppose in addition that $D_A$ has a constant positive diagonal, $D_A = d_a I$ ($d_a > 0$), we have that equation (4.26) is satisfied. As such, we only need a good/optimal ellipse surrounding the spectrum of the static Jacobi iteration matrix $K_{\mathrm{JAC}}(0)$ to define the resulting Chebyshev-Jacobi WR method for this problem. In terms of the asymptotic averaged spectral radius, we can prove the following relation between the optimal Chebyshev-Jacobi WR method and the corresponding static iteration for $Au = f$.

Theorem 5.3.3 Consider an ODE system (3.1) with $B = I$. Assume $A$ is consistently ordered with constant positive diagonal $D_A = d_a I$ ($d_a > 0$), and the eigenvalues of $K_{\mathrm{JAC}}(0)$ are real with $\mu_1 = \rho(K_{\mathrm{JAC}}(0)) < 1$. Then, if we consider $\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}$ as an operator in $L_p(0, \infty)$ with $1 \le p \le \infty$, we have

$$\varrho\bigl(\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}\bigr) = \varrho\bigl(P_\nu^{R_{\mathrm{opt}}}(0, K_{\mathrm{JAC}}(0))\bigr) = \frac{\mu_1}{1 + \sqrt{1 - \mu_1^2}}. \tag{5.34}$$

Proof. Since $A$ is consistently ordered, the eigenvalues of $K_{\mathrm{JAC}}(0)$ occur in opposite pairs, and Lemma 5.2.3 implies that $E_{\mathrm{opt}}(z) = \frac{d_a}{z + d_a}E_{\mathrm{opt}}(0) = \frac{d_a}{z + d_a}[-\mu_1, \mu_1]$. Since this line segment does not contain the point 1 for $\mathrm{Re}(z) \ge 0$, Lemma 5.2.1 and (5.16) imply that for these $z$

$$\varrho\bigl(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{JAC}}(z))\bigr) = \frac{|\mu_1(z)|}{\bigl|\, 1 + \sqrt{1 - \mu_1^2(z)}\, \bigr|} \tag{5.35}$$

with $\mu_1(z) = \frac{d_a}{z + d_a}\mu_1$. In addition,

$$\tilde{\Gamma}_{\mathrm{opt}}(z) = 1 \quad\text{and}\quad (\tilde{\Delta}_\nu)_{\mathrm{opt}}(z) = \frac{2}{\mu_1(z)} \cdot \frac{T_{\nu-1}\left(\dfrac{1}{\mu_1(z)}\right)}{T_\nu\left(\dfrac{1}{\mu_1(z)}\right)}$$

are bounded and analytic functions for $\mathrm{Re}(z) \ge 0$, so that we may apply Lemma 5.2.5 and (5.23)-(5.24). The proof is completed by noting that the supremum of (5.35) taken over $z \in i\mathbb{R}$ is attained at $z = 0$. □

The matrices $A$ obtained by a finite-difference discretisation of (3.47) satisfy the conditions of the latter theorem with $\mu_1 = \cos(\pi h)$, see equation (3.50). As such, we immediately obtain the following theorem.

Theorem 5.3.4 Consider the heat equation (3.47), discretised in space using central finite differences. Then, if we consider $\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}$ as an operator in $L_p(0, \infty)$ with $1 \le p \le \infty$, we have (for small $h$) that

$$\varrho\bigl(\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}\bigr) = \frac{\cos(\pi h)}{1 + \sqrt{1 - \cos^2(\pi h)}} \approx 1 - \pi h. \tag{5.36}$$
Figure 5.4: Spectral picture of Chebyshev-Jacobi WR for (3.47) ($m = 2$, finite differences, $h = 1/32$).

An illustration of this result may be found in Figure 5.4, where we plotted contour lines of $\varrho(P_\nu^{R_{\mathrm{opt}}}(z, K_{\mathrm{JAC}}(z)))$ for the two-dimensional heat equation with finite differences and $h = 1/32$.

Finally, it may be verified by elementary Laplace-transform techniques that the $N$-th WR iterate from the general Chebyshev-Jacobi variant of (5.20)-(5.21) can also be obtained from a shifted Jacobi iteration, i.e.,

$$\frac{1}{d_a}\dot{u}^{(\nu)} + (1 - \eta_\nu)u^{(\nu)} = \bigl(K_{\mathrm{JAC}}(0) - \eta_\nu I\bigr)u^{(\nu-1)} + \frac{1}{d_a}f, \qquad u^{(\nu)}(0) = u_0,$$

with $\eta_\nu = d(0) + c(0)\mu_\nu$ and $\nu = 1, 2, \dots, N$. As before, $\mu_\nu$ denote the zeros of $T_N(\mu)$, while $d(0)$ and $c(0)$ are the parameters of the ellipse $E(0)$ that was chosen to enclose the spectrum of $K_{\mathrm{JAC}}(0)$. As in the Picard case, we can relate the error reduction of the Chebyshev-Jacobi WR method for $\dot{u} + Au = f$ in $L_p(0, T)$ to that of the Chebyshev-Jacobi method for (5.32) for a particular choice of the ellipse $E(0)$.
Lemma 5.3.5 Consider an ODE system (3.1) with $B = I$. Assume $A$ has a constant positive diagonal $D_A = d_a I$ ($d_a > 0$) and the major axis of the ellipse $E(0)$ chosen to enclose $\sigma(K_{\mathrm{JAC}}(0))$ lies on the real axis of the complex plane. Then, the resulting Chebyshev-Jacobi WR iteration satisfies

$$\bigl\| e^{(\nu)} \bigr\|_{L_p(0,T)} \le \left\| P_\nu^R\!\left(\frac{1}{T},\, K_{\mathrm{JAC}}\!\left(\frac{1}{T}\right)\right) \right\| \, \bigl\| e^{(0)} \bigr\|_{L_p(0,T)} \tag{5.37}$$

for $1 \le p \le \infty$ and $0 < T \le \infty$, with $\|\cdot\|$ the matrix norm induced by the chosen vector norm in $\mathbb{C}^d$.
Proof. The parameters of the ellipse $E(z)$ surrounding $\sigma(K_{\mathrm{JAC}}(z))$ can be written in terms of those of the ellipse $E(0)$, that is, $d(z) = \frac{d_a}{z + d_a}d(0)$ and $c(z) = \frac{d_a}{z + d_a}c(0)$. Hence,

$$P_\nu^R\bigl(z, K_{\mathrm{JAC}}(z)\bigr) = \frac{T_\nu\left(\dfrac{K_{\mathrm{JAC}}(z) - d(z)I}{c(z)}\right)}{T_\nu\left(\dfrac{1 - d(z)}{c(z)}\right)} = \frac{T_\nu\left(\dfrac{K_{\mathrm{JAC}}(0) - d(0)I}{c(0)}\right)}{T_\nu\left(\dfrac{\frac{z + d_a}{d_a} - d(0)}{c(0)}\right)},$$

while the polynomial of the corresponding static Chebyshev-Jacobi method for (5.32) is given by $P_\nu^R(1/T, K_{\mathrm{JAC}}(1/T))$. The rest of the proof is an immediate extension of the proof of Lemma 5.3.2, given in [54, p. 537]. □
The B != I case

The Jacobi WR method for general ODE systems (3.1) is given by iteration (5.1) with

$$M_B = D_B, \quad M_A = D_A, \quad N_B = L_B + U_B \quad\text{and}\quad N_A = L_A + U_A.$$

The symbol $K_{\mathrm{JAC}}(z)$ equals $(zD_B + D_A)^{-1}\bigl(z(L_B + U_B) + (L_A + U_A)\bigr)$, and in general we cannot explicitly relate its eigenvalue distribution to that of $K_{\mathrm{JAC}}(0)$ as in equation (4.26). For the one-dimensional heat equation (3.47), discretised using linear finite elements, we can however prove the following theorem.

Theorem 5.3.6 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider $\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}$ as an operator in $L_p(0, \infty)$ with $1 \le p \le \infty$, we have (for small $h$) that

$$\varrho\bigl(\mathcal{K}_{\mathrm{CH-JAC}}^{\mathrm{opt}}\bigr) = \varrho\bigl(P_\nu^{R_{\mathrm{opt}}}(0, K_{\mathrm{JAC}}(0))\bigr) = \frac{\cos(\pi h)}{1 + \sqrt{1 - \cos^2(\pi h)}} \approx 1 - \pi h. \tag{5.38}$$

Proof. From (4.76) and Lemma 5.2.3, we derive that $E_{\mathrm{opt}}(z) = [-\mu_1(z), \mu_1(z)]$ with

$$\mu_1(z) = \frac{12 - 2zh^2}{4zh^2 + 12}\cos(\pi h).$$

The rest of the proof is similar to the proof of Theorem 5.3.4, and a series expansion is calculated for small values of $h$. □

In Figure 5.5, we plotted the value of $\varrho(P_\nu^{R_{\mathrm{opt}}}(i\xi, K_{\mathrm{JAC}}(i\xi)))$ versus $\xi$ for $h = 1/16$, while contour lines of the latter quantity are given in Figure 5.6. (The former picture is included mainly for completeness and for comparison with Figures 3.4, 4.6 and 4.7.)
Figure 5.5: %(P Ropt (i KJAC(i)) versus for (3.47) (m = 1, linear nite elements, h = 1=16).
CHAPTER 5. CHEBYSHEV ACCELERATION OF WR METHODS

[Figure 5.6: Spectral picture of Chebyshev–Jacobi WR for (3.47) (m = 1, linear finite elements, h = 1/16). Contour lines at 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 0.95.]
Comparison to the convolution SOR waveform relaxation method
We can now compare the convergence behaviour of the optimal Chebyshev–Jacobi WR method to that of other waveform variants for ODE systems that satisfy the conditions of Theorem 5.3.3. More precisely, we have

  \rho(K_{JAC}) = \mu_1 \quad \text{and} \quad \rho(K_{CSORopt}) = \frac{\mu_1^2}{\left(1 + \sqrt{1 - \mu_1^2}\right)^2},

see Theorems 3.2.8 and 4.2.9. These results, which may be applied, e.g., to the finite-difference discretisation of the heat equation (3.47) by setting \mu_1 = \cos(\pi h), imply that the Chebyshev–Jacobi WR method is substantially faster than the unaccelerated Jacobi variant, but only half as fast as the optimal CSOR WR method.

Similar results may be obtained for the one-dimensional heat equation discretised using linear finite elements, treated in Theorem 5.3.6. Again, the optimal Chebyshev–Jacobi WR method is much faster than its unaccelerated Jacobi variant (with spectral radius \cos(\pi h), see Theorem 3.3.4), but only half as fast as the optimal CSOR WR method (with spectral radius \cos^2(\pi h)/(1 + \sqrt{1 - \cos^2(\pi h)})^2, see Theorem 4.3.5).

This observation for model problems with collinear spectra of their Jacobi symbols also holds in a more general way. We recall from Chapter 4 that the CSOR WR method for (3.1) is a straightforward extension of the static SOR iteration, except that the multiplication with an overrelaxation parameter \omega is replaced by a convolution with a time-dependent overrelaxation function \Omega. The symbol of the method is given by (4.15), i.e., it equals the SOR iteration matrix for (3.15) with complex overrelaxation parameter \tilde{\Omega}(z). If the matrices B and A are such that zB + A is consistently ordered, the continuous-time equivalents of Lemmas 4.2.21 and 4.2.22 imply that there exists an optimal ellipse E(0, p(z), q(z), c(z)) (centred around the origin, i.e., with d(z) = 0) surrounding \sigma(K_{JAC}(z)) such that \tilde{\Omega}_{opt}(z), the value of the overrelaxation parameter minimising the spectral radius of (4.15), is given by 2/(1 + \sqrt{1 - c^2(z)}), while

  \rho(K_{CSORopt}(z)) = \left( \frac{p(z) + q(z)}{\left|1 + \sqrt{1 - c^2(z)}\right|} \right)^2 .   (5.39)

In addition, if the spectrum of the latter matrix is not collinear, this ellipse contains an eigenvalue of K_{JAC}(z), while in the collinear case, this ellipse equals the line segment between the extremal eigenvalues of K_{JAC}(z). Hence, if the ellipse of the CSOR method is used also for the Chebyshev acceleration of the Jacobi relaxation method for (3.15), Lemma 5.2.1 and Corollary 5.2.4 yield

  \rho\left(P^R(z, K_{JAC}(z))\right) = \rho_R\left(P^R(z, K_{JAC}(z))\right) = \frac{p(z) + q(z)}{\left|1 + \sqrt{1 - c^2(z)}\right|} .   (5.40)

As such, the optimal SOR method for (3.15) turns out to be twice as fast as the resulting Chebyshev–Jacobi iteration for this problem. Finally, we remark that under the right assumptions, the above conclusions remain valid after taking the suprema of (5.39) and (5.40) over the right-half complex plane/imaginary axis.
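The "twice as fast" relation above can be made concrete with a few lines of arithmetic. The sketch below is illustrative only (the mesh size is an arbitrary choice; the formulas are the spectral radii quoted above, with \mu_1 = \cos(\pi h) as for the finite-difference heat equation): it confirms that the CSOR radius is exactly the square of the Chebyshev–Jacobi radius, so its asymptotic rate -\log\rho is twice as large.

```python
import math

h = 1.0 / 16
mu1 = math.cos(math.pi * h)

rho_jac = mu1                                        # unaccelerated Jacobi WR
rho_cheb = mu1 / (1 + math.sqrt(1 - mu1**2))         # optimal Chebyshev-Jacobi WR
rho_csor = mu1**2 / (1 + math.sqrt(1 - mu1**2))**2   # optimal CSOR WR

print(rho_jac, rho_cheb, rho_csor)
# ratio of asymptotic rates -log(rho); equals 2 up to rounding
print(math.log(rho_csor) / math.log(rho_cheb))
```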
5.3.3 Chebyshev–Gauss–Seidel waveform relaxation

The B = I case

The Gauss–Seidel WR method for ODE systems \dot{u} + Au = f is given by (5.1) with M_B = I, N_B = 0, M_A = D_A - L_A and N_A = U_A. The following theorem is the Gauss–Seidel equivalent of Theorem 5.3.3.

Theorem 5.3.7 Consider an ODE system (3.1) with B = I. Assume A is consistently ordered with constant positive diagonal D_A = d_a I (d_a > 0) and the eigenvalues of K_{JAC}(0) are real with \mu_1 = \rho(K_{JAC}(0)) < 1. Then, if we consider K_{CH-GSopt} as an operator in L_p(0, \infty) with 1 \le p \le \infty, we have

  \rho(K_{CH-GSopt}) = \rho\left(P^R_{opt}(0, K_{GS}(0))\right) = \frac{\mu_1^2}{\left(1 + \sqrt{1 - \mu_1^2}\right)^2} .   (5.41)

Proof. Since zI + A is consistently ordered for Re(z) \ge 0, it follows from Lemma 3.2.9 that \lambda \in \sigma(K_{GS}(z)) if and only if \lambda = 0 or \lambda = \mu^2 with \mu \in \sigma(K_{JAC}(z)). Hence, the optimal ellipse E_{opt}(z) surrounding \sigma(K_{GS}(z)) is given by [0, \mu_1^2(z)] with \mu_1(z) = \frac{d_a}{z + d_a}\mu_1, as in the proof of Theorem 5.3.3. From this it follows that d_{opt}(z) = c_{opt}(z) = \mu_1^2(z)/2 and p_{opt}(z) = |\mu_1^2(z)|/2. As this line segment does not contain the point 1, and the resulting \tilde{\Gamma}_{opt}(z) and \tilde{\Omega}_{opt}(z) are bounded and analytic for Re(z) \ge 0, we may apply Lemma 5.2.5 and (5.23)-(5.24), i.e.,

  \rho(K_{CH-GSopt}) = \sup_{z \in i\mathbb{R}} \rho\left(P^R_{opt}(z, K_{GS}(z))\right) = \sup_{z \in i\mathbb{R}} \frac{|\mu_1^2(z)|/2}{\left| 1 - \mu_1^2(z)/2 + \sqrt{\left(1 - \mu_1^2(z)/2\right)^2 - \left(\mu_1^2(z)/2\right)^2} \right|},

where the latter equality follows from (5.16) and Lemma 5.2.1. It can be checked that this function attains its supremum for z = 0, from which (5.41) follows. □

If we apply this theorem to the heat equation, discretised using finite differences, we immediately see that the resulting WR method is as fast as the optimal CSOR variant.
Theorem 5.3.8 Consider the heat equation (3.47), discretised in space using finite differences. Then, if we consider K_{CH-GSopt} as an operator in L_p(0, \infty) with 1 \le p \le \infty, we have (for small h) that

  \rho(K_{CH-GSopt}) = \frac{\cos^2(\pi h)}{\left(1 + \sqrt{1 - \cos^2(\pi h)}\right)^2} \approx 1 - 2\pi h .   (5.42)
An illustration can be found in Figure 5.7, where we plotted contour lines of \rho(P^R_{opt}(z, K_{GS}(z))) for the two-dimensional heat equation, discretised using finite differences and h = 1/32.

[Figure 5.7: Spectral picture of Chebyshev–Gauss–Seidel WR for (3.47) (m = 2, finite differences, h = 1/32). Contour lines at 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 0.95.]
The B ≠ I case

We can prove a similar result as in Theorem 5.3.8 for the one-dimensional heat equation, discretised using linear finite elements.

Theorem 5.3.9 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider K_{CH-GSopt} as an operator in L_p(0, \infty) with 1 \le p \le \infty, we have (for small h) that

  \rho(K_{CH-GSopt}) = \rho\left(P^R_{opt}(0, K_{GS}(0))\right) = \frac{\cos^2(\pi h)}{\left(1 + \sqrt{1 - \cos^2(\pi h)}\right)^2} \approx 1 - 2\pi h .   (5.43)
[Figure 5.8: \rho(P^R_{opt}(i\xi, K_{GS}(i\xi))) versus \xi for (3.47) (m = 1, linear finite elements, h = 1/16).]

[Figure 5.9: Spectral picture of Chebyshev–Gauss–Seidel WR for (3.47) (m = 1, linear finite elements, h = 1/16). Contour lines at 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 0.95.]
Proof. The proof is completely similar to the proof of Theorem 5.3.7, with

  K_{GS}(z) = \left( z(D_B - L_B) + (D_A - L_A) \right)^{-1} (zU_B + U_A) \quad \text{and} \quad \mu_1(z) = \frac{12 - 2zh^2}{4zh^2 + 12}\cos(\pi h) . \qquad □

An illustration of this theorem for h = 1/16 can be found both in Figure 5.8 (\rho(P^R_{opt}(i\xi, K_{GS}(i\xi))) versus \xi) and in Figure 5.9 (a complete spectral picture). Note that the optimal Chebyshev–Gauss–Seidel WR method is as fast as its static counterpart and the optimal CSOR WR method.

Finally, we notice that the results of Theorems 5.3.7, 5.3.8 and 5.3.9 remain valid if we use a red–black ordering of the unknowns, as the spectra of the resulting Gauss–Seidel symbols equal those of the symbols K_{GS}(z) obtained with lexicographic ordering. Moreover, in analogy with the static iteration case, see, e.g., [92] or [106, p. 374 and p. 386], we recommend the use of such a coloured ordering in order to obtain the best convergence behaviour in practice.
5.3.4 Numerical results

In this section we present the results of some numerical experiments, and we show that the observed convergence behaviour agrees very well with the continuous-time theory for small enough time steps. We do not discuss the influence of the time-discretisation method upon convergence, but refer to Sections 3.2.2 and 4.2.2 for a discrete-time analysis of WR methods.

In Table 5.1 we report the observed averaged spectral radii of the Chebyshev–Picard method for the one-dimensional heat equation, discretised in space using finite differences and with t \in [0, 1]. We used the Crank–Nicolson (CN) method with time step \tau = 1/100 for time discretisation. The discrete variants of the kernels \Gamma_{opt}(t) and \Omega_{opt}(t) are calculated (by the Fourier-transform method of Section 4.3.2) from their Z-transformed expressions (\tilde{\Gamma}_{opt})_\tau(z) and (\tilde{\Omega}_{opt})_\tau(z), which can easily be expressed as

  (\tilde{\Gamma}_{opt})_\tau(z) = \tilde{\Gamma}_{opt}\!\left(\frac{1}{\tau}\frac{a}{b}(z)\right), \qquad (\tilde{\Omega}_{opt})_\tau(z) = \tilde{\Omega}_{opt}\!\left(\frac{1}{\tau}\frac{a}{b}(z)\right) .

(In the Picard case, the latter expressions have a pole at z = 1. Yet, this singularity is removable and does not cause any problems.) Observe that the experimental results correspond very well with the theoretical spectral radii derived in Section 5.3.1.

Similar results can be found for the convolution-based Chebyshev acceleration of Jacobi and red–black Gauss–Seidel WR. In particular, we report the observed averaged convergence factors of the latter methods for the two-dimensional heat equation with finite differences and the one-dimensional heat equation with linear finite elements in Tables 5.2 and 5.3, respectively. Again, the observed convergence factors are very close to their predicted theoretical values.
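The thesis does not restate here how an averaged convergence factor is extracted from an experiment; a common definition (used as an assumption in this illustrative helper, which is not code from the thesis) is the geometric mean of the per-iteration error reductions:

```python
import numpy as np

def averaged_convergence_factor(errors):
    """Geometric-mean reduction over nu iterations: (||e_nu|| / ||e_0||)^(1/nu).
    A hypothetical helper for illustration."""
    errors = np.asarray(errors, dtype=float)
    nu = len(errors) - 1
    return (errors[-1] / errors[0]) ** (1.0 / nu)

# an artificial error history decaying by a factor 0.8 per iteration
errs = [0.8**k for k in range(11)]
print(averaged_convergence_factor(errs))  # recovers 0.8 up to rounding
```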
5.4 Some concluding remarks

We studied the possibility of accelerating WR convergence by linear/polynomial acceleration techniques. We first confirmed the negative conclusions of Nevanlinna [68] concerning the applicability of linear acceleration techniques to WR methods, and
  h                              1/8     1/16    1/32    1/64
  m = 1, finite differences      0.648   0.811   0.900   0.949
  \rho(K_{CH-PICopt})            0.668   0.821   0.906   0.952
  1 - \pi h                      0.607   0.804   0.902   0.951

Table 5.1: Averaged convergence factors of the Chebyshev–Picard method for (3.47) (T = 1, CN, \tau = 1/100).
  h                              1/8     1/16    1/32    1/64
  m = 2, finite differences      0.656   0.814   0.903   0.951
  m = 1, linear finite elements  0.648   0.811   0.901   0.951
  \rho(K_{CH-JACopt})            0.668   0.821   0.906   0.952
  1 - \pi h                      0.607   0.804   0.902   0.951

Table 5.2: Averaged convergence factors of the Chebyshev–Jacobi WR method for (3.47) (T = 1, CN, \tau = 1/100).
  h                              1/8     1/16    1/32    1/64
  m = 2, finite differences      0.430   0.661   0.816   0.904
  m = 1, linear finite elements  0.410   0.650   0.807   0.899
  \rho(K_{CH-GSopt})             0.447   0.674   0.822   0.907
  1 - 2\pi h                     0.215   0.607   0.804   0.902

Table 5.3: Averaged convergence factors of the red–black Chebyshev–Gauss–Seidel WR method for (3.47) (T = 1, CN, \tau = 1/100).
showed that much better results may be expected from a convolution-based approach. The convergence analysis of the (convolution) Chebyshev WR methods was applied to several problems, among which the semi-discrete heat equation (3.47), and verified by means of numerical experiments. We also compared our results with the ones obtained by Lubich [54] and Skeel [85] for the strongly related shifted Picard iteration.
Chapter 6

Multigrid Waveform Relaxation Methods

It is well known that for parabolic partial differential equations of the form (1.9), discretised in space using finite differences, WR methods can be successfully combined with multigrid techniques. In this chapter the analysis of the resulting multigrid WR method for ODE systems (3.1) with B = I is extended towards systems with general, nonsingular B, derived by finite-element discretisation of a linear parabolic PDE. Among other things, we derive some theoretical convergence results for our model heat problem, which are confirmed by means of extensive numerical experiments.
6.1 Description of the method

In this section, we outline the continuous-time and discrete-time two-grid WR algorithms for semi-discretised PDEs of the form (3.1), as well as their recursively defined multigrid variants.
6.1.1 The continuous-time case

Multigrid is known to be a very efficient solver for elliptic partial differential equations [22, 88, 100]. Its principle, which is based on removing the oscillatory error components by smoothing and the smooth error components by a coarse-grid correction, can easily be extended to time-dependent problems by choosing all operations in the resulting multigrid cycle as operations on functions [55]. In particular, we state below a two-grid cycle for the initial-value problem (3.1), derived by finite-element discretisation from a parabolic PDE (1.9), in which we use WR as a smoother. It is defined on two nested grids \Omega_H and \Omega_h, with \Omega_H \subset \Omega_h. It determines a new (fine-grid) iterate u_h^{(\nu)} from the previous waveform u_h^{(\nu-1)}. In the following, the subscripts h and H are used to denote fine-grid and coarse-grid quantities respectively.

Pre-smoothing. Set x_h^{(0)} = u_h^{(\nu-1)}, and perform \nu_1 WR steps: for \mu = 1, 2, \ldots, \nu_1, solve

  M_{B_h} \dot{x}_h^{(\mu)} + M_{A_h} x_h^{(\mu)} = N_{B_h} \dot{x}_h^{(\mu-1)} + N_{A_h} x_h^{(\mu-1)} + f_h, \qquad x_h^{(\mu)}(0) = u_0 .   (6.1)
Coarse-grid correction. Compute the defect

  d_h = B_h \dot{x}_h^{(\nu_1)} + A_h x_h^{(\nu_1)} - f_h = N_{B_h}\left(\dot{x}_h^{(\nu_1 - 1)} - \dot{x}_h^{(\nu_1)}\right) + N_{A_h}\left(x_h^{(\nu_1 - 1)} - x_h^{(\nu_1)}\right) .

Solve the coarse-grid equivalent of the defect equation,

  B_H \dot{v}_H + A_H v_H = r d_h, \qquad v_H(0) = 0,   (6.2)

with r : \Omega_h \to \Omega_H the restriction operator transferring fine-grid quantities to coarse-grid ones. Then, interpolate the correction v_H to \Omega_h, and correct the current approximation, \bar{x}_h = x_h^{(\nu_1)} - p v_H, with p : \Omega_H \to \Omega_h the prolongation operator. The coarse-grid matrices B_H and A_H may be obtained by discretising the parabolic PDE on \Omega_H. An attractive alternative, in which we choose B_H = r B_h p and A_H = r A_h p, is called Galerkin approximation [100, p. 9].

Post-smoothing. Perform \nu_2 iterations of type (6.1), starting with x_h^{(0)} = \bar{x}_h, and set u_h^{(\nu)} = x_h^{(\nu_2)}.

Since (6.2) is formally equal to (3.1), this two-grid cycle can be applied in a recursive way to obtain a multigrid cycle on more than two nested grids.
6.1.2 The discrete-time case

The discrete-time two-grid WR cycle can be obtained by discretising the equations of its continuous-time counterpart (equations (6.1)-(6.2) and the ODEs of the post-smoothing steps) in time using a linear multistep method with fixed time step \tau. As before, we do not iterate on the k given starting values, and we assume the multistep method to be implicit, that is, \beta_k \ne 0. Hence, it follows immediately that the resulting discrete-time two-grid cycle is well-defined if and only if

  \frac{\alpha_k}{\tau \beta_k} \notin \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \quad \text{and} \quad \frac{\alpha_k}{\tau \beta_k} \notin \sigma\left(-B_H^{-1} A_H\right) .   (6.3)

Further on, we shall refer to (6.3) as the discrete solvability conditions for the two-grid algorithm.
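As a small illustration of condition (6.3): for backward Euler one has k = 1 and \alpha_k = \beta_k = 1, so the quantity to keep away from the two spectra is simply 1/\tau. The matrices below are tiny stand-ins chosen for the example, not an actual discretisation.

```python
import numpy as np

tau = 0.01
MBh = np.diag([2.0, 3.0])
MAh = np.diag([1.0, 4.0])
BH = np.array([[2.0, 1.0], [1.0, 2.0]])
AH = np.eye(2)

# spectra of -M_Bh^{-1} M_Ah and -B_H^{-1} A_H
spectrum = np.concatenate([
    np.linalg.eigvals(-np.linalg.solve(MBh, MAh)),
    np.linalg.eigvals(-np.linalg.solve(BH, AH)),
])
solvable = bool(np.all(np.abs(spectrum - 1.0 / tau) > 1e-12))
print(solvable)  # True: 1/tau = 100 avoids both (here negative real) spectra
```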
6.2 Convergence analysis

We will analyse the convergence properties of the two-grid WR method following the framework of Chapter 2, and relate the continuous-time results to the discrete-time ones. This analysis extends the results of [55, 95] to systems (3.1) with general, nonsingular B, obtained by a finite-element discretisation of a parabolic PDE. We also comment briefly on the generalisation of our two-grid analysis to the multigrid case.
6.2.1 The continuous-time case

The two-grid waveform relaxation operator and its symbol

The two-grid cycle of Section 6.1.1 can be written as an explicit successive approximation scheme u^{(\nu)} = M u^{(\nu-1)} + \varphi, where we omitted the subscript h in order not to overload the notation. The continuous-time two-grid waveform relaxation operator M is given by

  M = K^{\nu_2} C K^{\nu_1}   (6.4)

with K the standard WR operator (3.10), and C the two-grid correction operator

  C = \left(I - p B_H^{-1} r B_h\right) + C_c .

The operator C_c is of linear Volterra convolution type. Its matrix-valued kernel equals c_c(t) = p\, e^{-B_H^{-1} A_H t} B_H^{-1}\left(A_H B_H^{-1} r B_h - r A_h\right). By rearranging terms in (6.4), we may rewrite M in the general form (2.9),

  M = \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_2} \left(I - p B_H^{-1} r B_h\right) \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_1} + M_c .

The operator M_c is a linear combination of products of linear Volterra convolution operators K_c and C_c. Therefore, it is itself of linear Volterra convolution type. We shall denote its kernel by m_c, and the Laplace transform of m_c by M_c(z). The precise expressions for m_c and M_c(z) are rather complicated and, since they are not required further on, omitted.

The error e^{(\nu)} of the \nu-th two-grid waveform iterate satisfies e^{(\nu)} = M e^{(\nu-1)}. Laplace transforming this relation yields

  \tilde{e}^{(\nu)}(z) = \left[ \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_2} \left(I - p B_H^{-1} r B_h\right) \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_1} + M_c(z) \right] \tilde{e}^{(\nu-1)}(z) = M(z)\, \tilde{e}^{(\nu-1)}(z) .

By Laplace transforming the equations of the two-grid cycle, we find the following equivalent expression for the two-grid dynamic iteration matrix or symbol M(z),

  M(z) = K^{\nu_2}(z) \left( I - p \left(z B_H + A_H\right)^{-1} r \left(z B_h + A_h\right) \right) K^{\nu_1}(z), \qquad K(z) = \left(z M_{B_h} + M_{A_h}\right)^{-1} \left(z N_{B_h} + N_{A_h}\right) .

Remark 6.2.1 In the case of a Jacobi or Gauss–Seidel splitting of A_h and B_h, K(z) and M(z) are respectively the Jacobi or Gauss–Seidel and the two-grid iteration matrix for the linear system constructed by finite-element discretisation of the elliptic problem zu + Lu = f on \Omega_h; see also Remark 3.2.1.
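The symbol is also convenient computationally: sampling \rho(M(i\xi)) along the imaginary axis gives a numerical estimate of the spectral radius result formalised later in this section. The sketch below is illustrative only (assumptions: Jacobi splitting, linear finite elements for the one-dimensional heat equation, h = 1/16, \nu_1 = \nu_2 = 1, linear interpolation and r = pᵗ).

```python
import numpy as np

def fe_matrices(n):
    h = 1.0 / n
    k = n - 1
    B = (h / 6.0) * (4 * np.eye(k) + np.eye(k, k=1) + np.eye(k, k=-1))
    A = (1.0 / h) * (2 * np.eye(k) - np.eye(k, k=1) - np.eye(k, k=-1))
    return B, A

def interpolation(n):
    P = np.zeros((n - 1, n // 2 - 1))
    for j in range(n // 2 - 1):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def twogrid_symbol(z, n, nu1=1, nu2=1):
    """M(z) = K(z)^nu2 (I - p (z B_H + A_H)^{-1} r (z B_h + A_h)) K(z)^nu1."""
    Bh, Ah = fe_matrices(n)
    BH, AH = fe_matrices(n // 2)
    p = interpolation(n)
    r = p.T
    DB, DA = np.diag(np.diag(Bh)), np.diag(np.diag(Ah))
    K = np.linalg.solve(z * DB + DA, z * (DB - Bh) + (DA - Ah))  # Jacobi symbol
    C = np.eye(n - 1) - p @ np.linalg.solve(z * BH + AH, r @ (z * Bh + Ah))
    return np.linalg.matrix_power(K, nu2) @ C @ np.linalg.matrix_power(K, nu1)

rho = lambda M: max(abs(np.linalg.eigvals(M)))
n = 16
radii = [rho(twogrid_symbol(1j * xi, n)) for xi in np.logspace(-1, 4, 60)]
print(max(radii))  # estimate of rho(M); below 1 for this model problem
```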
Convergence on finite time intervals

The spectral radius of the two-grid WR operator in the case of a finite-difference discretisation is known to be zero on finite time intervals [95, Thm. 3.4.1]. In the following theorem, which is the multigrid waveform analogue of Theorem 3.2.1, we state the equivalent formula for general nonsingular B.

Theorem 6.2.1 Consider M as an operator in C[0, T]. Then, M is bounded and

  \rho(M) = \rho\left( \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_2} \left(I - p B_H^{-1} r B_h\right) \left(M_{B_h}^{-1} N_{B_h}\right)^{\nu_1} \right) .   (6.5)

Proof. Since both k_c and c_c are continuous on [0, T], we have m_c \in C[0, T]. Consequently, the theorem follows from Lemma 2.2.1. □

Remark 6.2.2 The latter spectral radius of M equals the spectral radius of the standard two-grid operator for the trivial elliptic problem u = f, discretised on a finite-element mesh.
Convergence on infinite time intervals

In [55, p. 219-220], Lubich and Ostermann examined the multigrid WR method for the finite-difference case. We shall extend their results to more general initial-value problems, derived by finite-element discretisation. We assume that

  all eigenvalues of B_H^{-1} A_H and M_{B_h}^{-1} M_{A_h} have positive real parts.   (6.6)

Remark 6.2.3 Conditions (6.6) are satisfied if we assume the boundedness of the analytical solution of (3.1) on \Omega_H and the boundedness of the standard WR operator K.

Theorem 6.2.2 Consider M as an operator in L_p(0, \infty) with 1 \le p \le \infty, and assume conditions (6.6) are satisfied. Then, M is bounded and

  \rho(M) = \sup_{\mathrm{Re}(z) \ge 0} \rho(M(z)) = \sup_{\xi \in \mathbb{R}} \rho(M(i\xi)) .   (6.7)

Furthermore, we have

  \|M\|_{L_2(0,\infty)} = \sup_{\mathrm{Re}(z) \ge 0} \|M(z)\|_2 = \sup_{\xi \in \mathbb{R}} \|M(i\xi)\|_2,   (6.8)

where \|\cdot\|_2 denotes the matrix norm induced by the standard Euclidean vector norm.

Proof. It is easily verified that

  M_c(z) = M(z) - \lim_{z \to \infty} M(z) .

Because of (6.6), the entries of M_c(z) are rational functions of z vanishing at infinity, all of whose poles have negative real part. This implies that m_c \in L_1(0, \infty) by an inverse Laplace-transform argument. The theorem then follows from Lemmas 2.2.3 and 2.2.4. □

Remark 6.2.4 Suppose that (6.6) is not satisfied, but Re(\lambda) + \sigma > 0 for all eigenvalues \lambda of M_{B_h}^{-1} M_{A_h} and of B_H^{-1} A_H. Using the exponentially scaled norm (3.21), Theorem 6.2.2 holds when the suprema are taken over Re(z) \ge \sigma or over the line z = \sigma + i\xi.
6.2.2 The discrete-time case

The two-grid waveform relaxation operator and its symbol

The discrete-time two-grid cycle can be written in explicit form as

  u_\tau^{(\nu)} = M_\tau u_\tau^{(\nu-1)} + \varphi_\tau \quad \text{or} \quad e_\tau^{(\nu)} = M_\tau e_\tau^{(\nu-1)},   (6.9)

where e_\tau^{(\nu)} is the error of the \nu-th iterate. The notation is similar to (3.32). The linear operator M_\tau is called the discrete-time two-grid waveform relaxation operator. The second equation of (6.9) can be reformulated in a similar way as (3.34),

  E^{(\nu)} = \left(C_h^{-1} D_h\right)^{\nu_2} \left(I - P F_H^{-1} R F_h\right) \left(C_h^{-1} D_h\right)^{\nu_1} E^{(\nu-1)} .   (6.10)

Here, E^{(\nu)} = \left[\, e^{(\nu)}[k]^t \;\; e^{(\nu)}[k+1]^t \;\; \ldots \;\; e^{(\nu)}[N + k - 1]^t \,\right]^t. The matrices C_h, D_h, F_H and F_h are N \times N block-lower-triangular matrices with k + 1 constant diagonals. The blocks of the j-th diagonal equal respectively (C_h)_{k-j}, (D_h)_{k-j}, (F_H)_{k-j} and (F_h)_{k-j}, with

  (C_h)_j = \frac{1}{\tau}\alpha_j M_{B_h} + \beta_j M_{A_h}, \qquad (D_h)_j = \frac{1}{\tau}\alpha_j N_{B_h} + \beta_j N_{A_h},

and

  (F_H)_j = \frac{1}{\tau}\alpha_j B_H + \beta_j A_H, \qquad (F_h)_j = \frac{1}{\tau}\alpha_j B_h + \beta_j A_h .

The matrices P and R are block-diagonal with constant diagonal blocks respectively equal to the matrices p and r. I is the identity matrix of dimension d \cdot N.

It can be seen that the matrix pre-multiplying E^{(\nu-1)} in (6.10) is a lower-triangular block-Toeplitz matrix. This implies that M_\tau is a discrete linear convolution operator. The Z-transform of its matrix-valued kernel can be found by transforming the equations of the discrete-time two-grid cycle. It is denoted by M_\tau(z), the discrete-time two-grid dynamic iteration matrix or symbol, and equals

  M_\tau(z) = K_\tau^{\nu_2}(z)\, C_\tau(z)\, K_\tau^{\nu_1}(z)

with K_\tau(z) given by (3.35) and

  C_\tau(z) = I - p \left( \tfrac{1}{\tau}a(z) B_H + b(z) A_H \right)^{-1} r \left( \tfrac{1}{\tau}a(z) B_h + b(z) A_h \right) .

The matrix M_\tau(z) satisfies a similar relation as K_\tau(z) does in (3.36),

  M_\tau(z) = M\!\left( \frac{1}{\tau} \frac{a}{b}(z) \right) .   (6.11)
Convergence on finite time intervals

Theorem 6.2.3 Consider M_\tau as an operator in l_p(N) with 1 \le p \le \infty and N finite, and assume the discrete solvability conditions (6.3) are satisfied. Then, M_\tau is bounded and

  \rho(M_\tau) = \rho\left( M\!\left( \frac{1}{\tau} \frac{\alpha_k}{\beta_k} \right) \right) .   (6.12)

Proof. The theorem follows from Lemma 2.2.5 and (6.11),

  \rho(M_\tau) = \rho(M_\tau(\infty)) = \rho\left( M\!\left( \frac{1}{\tau} \frac{a}{b}(\infty) \right) \right) = \rho\left( M\!\left( \frac{1}{\tau} \frac{\alpha_k}{\beta_k} \right) \right) . \qquad □
Convergence on infinite time intervals

We first prove the boundedness of M_\tau, i.e., the two-grid equivalent of Lemma 3.2.12.

Lemma 6.2.4 If \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \cup \sigma\left(-B_H^{-1} A_H\right) \subset \frac{1}{\tau}\,\mathrm{int}\, S, then M_\tau is bounded in l_p(\infty) with 1 \le p \le \infty.

Proof. It is sufficient to prove that the kernel of M_\tau belongs to l_1(\infty). We shall analyse each of the factors in the formula for M_\tau(z) separately. We have, from the proof of Lemma 3.2.12, that K_\tau(z) is the Z-transform of an l_1-sequence, say q_\tau, if

  \det\left( \tfrac{1}{\tau}a(z) M_{B_h} + b(z) M_{A_h} \right) \ne 0, \quad |z| \ge 1 .   (6.13)

Consider the l_1-sequence

  \left\{ \tfrac{1}{\tau}\alpha_k B_H + \beta_k A_H,\; \tfrac{1}{\tau}\alpha_{k-1} B_H + \beta_{k-1} A_H,\; \ldots,\; \tfrac{1}{\tau}\alpha_0 B_H + \beta_0 A_H,\; 0,\; 0,\; \ldots \right\} .

Its transform is given by z^{-k}\left(\tfrac{1}{\tau}a(z) B_H + b(z) A_H\right). By Wiener's inversion theorem, \left(\tfrac{1}{\tau}a(z) B_H + b(z) A_H\right)^{-1} z^k is the transform of an l_1-sequence, say w_\tau, if

  \det\left( \tfrac{1}{\tau}a(z) B_H + b(z) A_H \right) \ne 0, \quad |z| \ge 1 .   (6.14)

Next, consider the l_1-sequences

  i_\tau = \{ I,\; 0,\; \ldots,\; 0,\; 0,\; 0,\; \ldots \}

and

  v_\tau = \left\{ \tfrac{1}{\tau}\alpha_k B_h + \beta_k A_h,\; \tfrac{1}{\tau}\alpha_{k-1} B_h + \beta_{k-1} A_h,\; \ldots,\; \tfrac{1}{\tau}\alpha_0 B_h + \beta_0 A_h,\; 0,\; 0,\; \ldots \right\}

with I the d \times d identity matrix. Their transforms are given respectively by I and z^{-k}\left(\tfrac{1}{\tau}a(z) B_h + b(z) A_h\right). Now, consider the sequence

  \underbrace{q_\tau * \cdots * q_\tau}_{\nu_2 \text{ times}} * \left( i_\tau - p\, w_\tau * r\, v_\tau \right) * \underbrace{q_\tau * \cdots * q_\tau}_{\nu_1 \text{ times}} .   (6.15)

If conditions (6.13) and (6.14) are satisfied, it follows that this sequence is in l_1(\infty). (l_1 is closed under convolution and addition. The multiplication of an l_1-sequence by a matrix is an l_1-sequence.) The Z-transform of sequence (6.15) equals M_\tau(z); hence the sequence equals the kernel of M_\tau. To conclude, M_\tau is bounded under conditions (6.13) and (6.14).

Suppose one of these conditions is violated. That is to say, there is a z with |z| \ge 1 such that \det\left(\tfrac{1}{\tau}a(z) M_{B_h} + b(z) M_{A_h}\right) = 0 or \det\left(\tfrac{1}{\tau}a(z) B_H + b(z) A_H\right) = 0. That would mean that \frac{1}{\tau}\frac{a}{b}(z) \in \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \cup \sigma\left(-B_H^{-1} A_H\right). Since |z| \ge 1, this violates the assumption of the lemma. □
Remark 6.2.5 The assumption of Lemma 6.2.4 implies the two-grid discrete solvability conditions (6.3).

Remark 6.2.6 The assumption of Lemma 6.2.4 implies that all poles of M(z) are inside the scaled stability region \frac{1}{\tau} S.

Theorem 6.2.5 Consider M_\tau as an operator in l_p(\infty) with 1 \le p \le \infty, and assume \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \cup \sigma\left(-B_H^{-1} A_H\right) \subset \frac{1}{\tau}\,\mathrm{int}\, S. Then, M_\tau is bounded and

  \rho(M_\tau) = \sup\left\{ \rho\left(M(\tfrac{z}{\tau})\right) : z \in \mathbb{C} \setminus \mathrm{int}\, S \right\} = \sup_{z \in \partial S} \rho\left(M(\tfrac{z}{\tau})\right) .   (6.16)

Furthermore, we have

  \|M_\tau\|_{l_2(\infty)} = \sup\left\{ \|M(\tfrac{z}{\tau})\|_2 : z \in \mathbb{C} \setminus \mathrm{int}\, S \right\} = \sup_{z \in \partial S} \|M(\tfrac{z}{\tau})\|_2,   (6.17)

where \|\cdot\|_2 denotes the matrix norm induced by the standard Euclidean vector norm.

Proof. The proof is a direct consequence of Lemmas 2.2.6 and 2.2.7, and is similar to the proofs of Theorems 3.2.13 and 3.2.14. □

Remark 6.2.7 If the assumption of the former theorem is violated, but the weaker condition \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \cup \sigma\left(-B_H^{-1} A_H\right) \subset \frac{1}{\tau} S holds, then we can formulate a remark analogous to Remark 3.2.11.
6.2.3 Discrete-time versus continuous-time results

The relation between the two-grid operators M_\tau and M is similar to the relation between K_\tau and K, outlined in Section 3.2.3. More precisely, we have, both for finite and infinite time intervals, that

  \lim_{\tau \to 0} \rho(M_\tau) = \rho(M) .

We also state the two-grid equivalents of Theorem 3.2.15 and Corollary 3.2.16, without proof.

Theorem 6.2.6 Consider M_\tau as an operator in l_p(\infty) and M as an operator in L_p(0, \infty) with 1 \le p \le \infty. Assume the linear multistep method is A(\alpha)-stable and \sigma\left(-M_{B_h}^{-1} M_{A_h}\right) \cup \sigma\left(-B_H^{-1} A_H\right) \subset \Phi. Then,

  \rho(M_\tau) \le \sup_{z \in \Phi^c} \rho(M(z)) = \sup_{z \in \partial \Phi^c} \rho(M(z))   (6.18)

with \Phi^c = \mathbb{C} \setminus \Phi = \{ z : |\mathrm{Arg}(z)| \ge \pi - \alpha \}.

Corollary 6.2.7 Consider M_\tau as an operator in l_p(\infty) and M as an operator in L_p(0, \infty) with 1 \le p \le \infty. Assume the linear multistep method is A-stable and conditions (6.6) are satisfied. Then, \rho(M_\tau) \le \rho(M).
6.2.4 An extension of the two-grid results to the multigrid case

We consider the case where we have a hierarchy of grids, \Omega_{h_0} \subset \Omega_{h_1} \subset \cdots \subset \Omega_{h_l}, a set of prolongation operators p_{h_i}^{h_{i+1}} : \Omega_{h_i} \to \Omega_{h_{i+1}}, 0 \le i \le l-1, a set of restriction operators r_{h_i}^{h_{i+1}} : \Omega_{h_{i+1}} \to \Omega_{h_i}, 0 \le i \le l-1, and discretisation matrices B_{h_i} and A_{h_i}, 0 \le i \le l. The multigrid algorithm differs from the two-grid cycle in that the coarse-grid defect equation (on \Omega_{h_{l-1}}) is approximately solved by an application of \gamma two-grid cycles (on \Omega_{h_{l-1}} and the coarser grid \Omega_{h_{l-2}}), an idea that is further extended recursively. The classical V-cycles and W-cycles that will be used later on correspond to \gamma = 1 and \gamma = 2, respectively. Both cycles are illustrated in Figure 6.1, where we show the order in which the grids are visited.

[Figure 6.1: Classical multigrid cycles for l = 3: V-cycle (left) and W-cycle (right), showing the order in which the grids \Omega_{h_0}, \ldots, \Omega_{h_3} are visited.]

In the continuous-time case this leads to an iteration of the form u^{(\nu)} = M_{h_l} u^{(\nu-1)} + \varphi. In the discrete-time case we end up with an iteration operator which we denote by (M_{h_l})_\tau. Both iterative schemes can be analysed in exactly the same way as the two-grid cycles have been analysed: a Laplace-transform argument is used in the continuous-time case, whereas the discrete-time case is treated by using a Z-transform method. Proceeding as before, we can derive the symbol of the continuous-time multigrid WR method, M_{h_l}(z). The latter takes a particularly simple form under the natural assumption that the semi-discretised PDE operators on \Omega_{h_i} (1 \le i \le l) are invertible. In that case we can apply the following lemma.
Lemma 6.2.8 Let an ODE system of the form (3.1) have a unique solution u, and let it be solved approximately by \nu steps of a consistent waveform method of the form u^{(k)} = M u^{(k-1)} + \varphi with u^{(0)}(t) \equiv 0. Then, the \nu-th iterate can be represented as u^{(\nu)} = (I - M^{\nu}) u.
Under the above assumption, the multigrid symbol becomes

  M_{h_l}(z) = K_{h_l}^{\nu_2}(z) \left( I - p_{h_{l-1}}^{h_l} \left( I - M_{h_{l-1}}^{\gamma}(z) \right) L_{h_{l-1}}^{-1}(z)\, r_{h_{l-1}}^{h_l} L_{h_l}(z) \right) K_{h_l}^{\nu_1}(z), \qquad l \ne 1,

and

  M_{h_1}(z) = K_{h_1}^{\nu_2}(z) \left( I - p_{h_0}^{h_1} L_{h_0}^{-1}(z)\, r_{h_0}^{h_1} L_{h_1}(z) \right) K_{h_1}^{\nu_1}(z), \qquad l = 1,

where L_{h_i}(z) = z B_{h_i} + A_{h_i} and K_{h_i}(z) = \left(z M_{B_{h_i}} + M_{A_{h_i}}\right)^{-1} \left(z N_{B_{h_i}} + N_{A_{h_i}}\right). Note that M_{h_l}(z) is technically more complicated when the assumption is violated. In that case, it does not involve the factor L_{h_{l-1}}^{-1}(z).
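The recursive definition translates directly into a routine that evaluates M_{h_l}(z) numerically. The sketch below is illustrative only (assumptions: Jacobi smoothing, linear finite elements for the one-dimensional heat equation, linear interpolation, r = pᵗ, coarse matrices by rediscretisation; not code from the thesis):

```python
import numpy as np

def fe_matrices(n):
    h = 1.0 / n
    k = n - 1
    B = (h / 6.0) * (4 * np.eye(k) + np.eye(k, k=1) + np.eye(k, k=-1))
    A = (1.0 / h) * (2 * np.eye(k) - np.eye(k, k=1) - np.eye(k, k=-1))
    return B, A

def interpolation(n):
    P = np.zeros((n - 1, n // 2 - 1))
    for j in range(n // 2 - 1):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def mg_symbol(z, n, levels, gamma=1, nu1=1, nu2=1):
    """Recursive multigrid symbol M_{h_l}(z); 'levels' coarse grids below n."""
    B, A = fe_matrices(n)
    L = z * B + A
    DB, DA = np.diag(np.diag(B)), np.diag(np.diag(A))
    K = np.linalg.solve(z * DB + DA, z * (DB - B) + (DA - A))  # Jacobi symbol
    p = interpolation(n)
    r = p.T
    BH, AH = fe_matrices(n // 2)
    LHinv = np.linalg.inv(z * BH + AH)
    if levels == 1:
        CGC = LHinv                            # exact coarse solve: two-grid case
    else:
        Mc = mg_symbol(z, n // 2, levels - 1, gamma, nu1, nu2)
        I = np.eye(n // 2 - 1)
        CGC = (I - np.linalg.matrix_power(Mc, gamma)) @ LHinv
    Kn1 = np.linalg.matrix_power(K, nu1)
    Kn2 = np.linalg.matrix_power(K, nu2)
    return Kn2 @ (np.eye(n - 1) - p @ CGC @ r @ L) @ Kn1

z = 100j
rho_tg = max(abs(np.linalg.eigvals(mg_symbol(z, 32, levels=1))))
rho_w = max(abs(np.linalg.eigvals(mg_symbol(z, 32, levels=3, gamma=2))))
rho_v = max(abs(np.linalg.eigvals(mg_symbol(z, 32, levels=3, gamma=1))))
print(rho_tg, rho_w, rho_v)  # W-cycle typically stays close to two-grid
```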
Remark 6.2.8 Let K_{h_i}(z) correspond to a Jacobi or Gauss–Seidel splitting. Then, M_{h_l}(z) is the multigrid iteration matrix for the elliptic problem (z B_{h_l} + A_{h_l}) u_{h_l} = f_{h_l}; compare [22, p. 161] and [88, p. 46].

As before, the continuous-time convergence theorems can be formulated in terms of this symbol. The ideas behind the proofs are identical to the ones behind the corresponding proofs in Section 6.2.1, and are therefore omitted.

Theorem 6.2.9 Consider M_{h_l} as an operator in C[0, T]. Then, M_{h_l} is bounded and

  \rho(M_{h_l}) = \rho(M_{h_l}(\infty)) .   (6.19)

Theorem 6.2.10 Consider M_{h_l} as an operator in L_p(0, \infty) with 1 \le p \le \infty, and assume all eigenvalues of M_{B_{h_i}}^{-1} M_{A_{h_i}} (1 \le i \le l) and B_{h_0}^{-1} A_{h_0} have positive real parts. Then, M_{h_l} is bounded and

  \rho(M_{h_l}) = \sup_{\mathrm{Re}(z) \ge 0} \rho(M_{h_l}(z)) = \sup_{\xi \in \mathbb{R}} \rho(M_{h_l}(i\xi)) .   (6.20)

Furthermore, we have

  \|M_{h_l}\|_{L_2(0,\infty)} = \sup_{\mathrm{Re}(z) \ge 0} \|M_{h_l}(z)\|_2 = \sup_{\xi \in \mathbb{R}} \|M_{h_l}(i\xi)\|_2,   (6.21)

where \|\cdot\|_2 denotes the matrix norm induced by the standard Euclidean vector norm.

Following the line of arguments of Section 6.2.2, we can derive the discrete-time symbol:

  (M_{h_l})_\tau(z) = M_{h_l}\!\left( \frac{1}{\tau} \frac{a}{b}(z) \right) .

The discrete-time convergence theorems are immediate extensions of Theorems 6.2.3 and 6.2.5. The proofs are very similar.

Theorem 6.2.11 Consider (M_{h_l})_\tau as an operator in l_p(N) with 1 \le p \le \infty and N finite, and assume \frac{\alpha_k}{\tau \beta_k} \notin \bigcup_{i=1}^{l} \sigma\left(-M_{B_{h_i}}^{-1} M_{A_{h_i}}\right) \cup \sigma\left(-B_{h_0}^{-1} A_{h_0}\right). Then, (M_{h_l})_\tau is bounded and

  \rho\left((M_{h_l})_\tau\right) = \rho\left( M_{h_l}\!\left( \frac{1}{\tau} \frac{\alpha_k}{\beta_k} \right) \right) .   (6.22)
Theorem 6.2.12 Consider (M_{h_l})_\tau as an operator in l_p(\infty) with 1 \le p \le \infty, and assume \bigcup_{i=1}^{l} \sigma\left(-M_{B_{h_i}}^{-1} M_{A_{h_i}}\right) \cup \sigma\left(-B_{h_0}^{-1} A_{h_0}\right) \subset \frac{1}{\tau}\,\mathrm{int}\, S. Then, (M_{h_l})_\tau is bounded and

  \rho\left((M_{h_l})_\tau\right) = \sup\left\{ \rho\left(M_{h_l}(\tfrac{z}{\tau})\right) : z \in \mathbb{C} \setminus \mathrm{int}\, S \right\} = \sup_{z \in \partial S} \rho\left(M_{h_l}(\tfrac{z}{\tau})\right) .   (6.23)

Furthermore, we have

  \|(M_{h_l})_\tau\|_{l_2(\infty)} = \sup\left\{ \|M_{h_l}(\tfrac{z}{\tau})\|_2 : z \in \mathbb{C} \setminus \mathrm{int}\, S \right\} = \sup_{z \in \partial S} \|M_{h_l}(\tfrac{z}{\tau})\|_2,   (6.24)

where \|\cdot\|_2 denotes the matrix norm induced by the standard Euclidean vector norm.
6.3 Model problem analysis Next, the convergence theorems of the previous section are applied to the one-dimensional variant of (3.47). In particular, we recall the results of 55, 95] for the nite-dierence discretisation of the latter problem, and derive similar conclusions for the linear niteelement case. These theoretical results will be compared with those from some numerical experiments, which are performed not only for the one-dimensional heat equation but also for its two-dimensional analogue.
6.3.1 Theoretical results The continuous-time case
Before we investigate the performance of the multigrid WR method for parabolic PDEs that are discretised using finite elements, we recall an important result for the finite-difference discretisation of the one-dimensional heat equation (3.47) [55, Prop. 5]. In particular, Lubich and Ostermann calculated the spectral radius of a specific two-grid operator M for this problem by maximising the spectral radius of the corresponding symbol M(z) over the imaginary axis. They proved that the supremum in (6.7) is attained in a point z = iξ with ξ ∼ 1/h². Although this implies that the two-grid method is slower than its static counterpart, the latter authors were able to bound ρ(M) by a constant, independent of the mesh size h. In addition, they conjectured that this bound also holds for the two-dimensional variant of this problem. They used standard coarsening (H = 2h), red-black Gauss–Seidel smoothing, piecewise linear interpolation for prolongation and full-weighting restriction. For the precise formulae of the latter operators, we refer, e.g., to [22, §3.4–3.5] or [100, Chap. 5].

Theorem 6.3.1 Consider the one-dimensional heat equation (3.47), discretised in space using central finite differences. Then, if we consider M, with red-black Gauss–Seidel smoothing and interpolation and restriction as defined above, as an operator in Lp(0,∞) with 1 ≤ p ≤ ∞, we have

$$\rho(\mathcal{M}) \le \frac{1}{2}\,\sqrt{\eta(2\nu-1)} \qquad\text{with}\qquad \eta(\nu)=\frac{\nu^{\nu}}{(\nu+1)^{\nu+1}}, \tag{6.25}$$

for ν = ν₁ + ν₂ ≥ 1. This is the best possible bound independent of h.

Next, we tried to derive a similar result for the one-dimensional variant of (3.47), discretised using linear finite elements. We used standard coarsening, a coarse-grid problem derived by discretisation, and piecewise linear interpolation. The restriction operator is defined in the standard way for finite-element multigrid methods, i.e., r := pᵗ [22, p. 66].

Theorem 6.3.2 Consider the one-dimensional heat equation (3.47), discretised in space using linear finite elements. Then, if we consider M, with red-black Gauss–Seidel smoothing and interpolation and restriction as defined above, as an operator in Lp(0,∞) with 1 ≤ p ≤ ∞, we have

$$\rho(\mathcal{M}) \le \sqrt{3}\,\sqrt{\eta(2\nu-1)} \qquad\text{with}\qquad \eta(\nu)=\frac{\nu^{\nu}}{(\nu+1)^{\nu+1}}, \tag{6.26}$$

for ν = ν₁ + ν₂ ≥ 1.
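The bound (6.26) is straightforward to tabulate. A small Python sketch (a convenience script; the function names are illustrative):

```python
import math

def eta(n):
    # eta(nu) = nu^nu / (nu + 1)^(nu + 1)
    return n**n / float((n + 1)**(n + 1))

def bound(nu):
    # upper bound (6.26) for nu = nu1 + nu2 smoothing steps
    return math.sqrt(3.0) * math.sqrt(eta(2*nu - 1))

for nu in range(1, 5):
    print(nu, "%.4f" % bound(nu))   # 0.8660, 0.5625, 0.4483, 0.3837
```

To three digits these are the values 0.866, 0.563, 0.448, 0.384 listed in Table 6.1 below.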
6.3. MODEL PROBLEM ANALYSIS
119
Proof. The proof is a generalisation of the model problem analysis in [22, p. 25] and the proof of [55, Prop. 5]. Writing

$$\mathcal{M}(z) = K^{\nu_2}(z)\,C(z)\,K^{\nu_1}(z) \qquad\text{with}\qquad C(z) = I - p\,(zB_H + A_H)^{-1}\,r\,(zB_h + A_h),$$

we have to study

$$\rho(\mathcal{M}(z)) = \rho\bigl(C(z)K^{\nu}(z)\bigr), \qquad \nu = \nu_1 + \nu_2.$$

Let $e_m$ ($m = 1, 2, \ldots, N-1$), with $N = 1/h$, denote the eigenvectors of $B_h$ and $A_h$,

$$e_m(x) = \sqrt{2h}\,\sin(m\pi x), \qquad x = kh,\ 1 \le k \le N-1.$$

Both $C(z)$ and $K(z)$ leave the subspace spanned by $(e_m, e_{N-m})$ invariant. Their restrictions with respect to this basis have the matrix representations
$$C_m(z) = \begin{bmatrix} s_m^2 & c_m^2 \\ s_m^2 & c_m^2 \end{bmatrix} + \frac{3zh^2}{zh^2\bigl(1+2(c_m^2-s_m^2)^2\bigr)+12c_m^2s_m^2}\begin{bmatrix} c_m^2s_m^2(1-2c_m^2) & -c_m^4(1-2s_m^2) \\ -s_m^4(1-2c_m^2) & c_m^2s_m^2(1-2s_m^2) \end{bmatrix},$$

$$K_m(z) = (c_m^2-s_m^2)\,\frac{1-zh^2/6}{1+zh^2/3}\left(\frac{1}{2}\begin{bmatrix} c_m^2 & c_m^2 \\ -s_m^2 & -s_m^2 \end{bmatrix} + \frac{zh^2/6}{2(1-zh^2/6)}\begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix}\right),$$

with $c_m = \cos(m\pi h/2)$, $s_m = \sin(m\pi h/2)$, and $m = 1, 2, \ldots, N/2-1$. (We may omit the degenerate case $m = N/2$.) One then has, for $\nu \ge 1$ and $Z = zh^2$,

$$\mathcal{M}_m(Z) = C_m(Z)\,K_m^{\nu}(Z).$$
The spectral radius of $\mathcal{M}_m(Z)$ equals

$$\rho(\mathcal{M}_m(Z)) = (c_m^2-s_m^2)^{2\nu}\left|\frac{1-Z/6}{1+Z/3}\right|^{2\nu}\frac{1}{4}\left|\frac{Z}{2}-\frac{3Z\,(1-4c_m^2s_m^2)}{Z\bigl(1+2(c_m^2-s_m^2)^2\bigr)+12c_m^2s_m^2}\right|.$$

Thus,

$$\rho(\mathcal{M}) = \sup_{\operatorname{Re}(Z)\ge 0}\rho(\mathcal{M}(Z)) = \sup_{\operatorname{Re}(Z)\ge 0}\,\max_m\,\rho(\mathcal{M}_m(Z)) \le \sup_{\xi\in\mathbb{R}}\ \frac{3\,|\xi|\,|6-i\xi|^{2\nu-1}}{2^{2\nu}\,|3+i\xi|^{2\nu}} = \sqrt{3}\,\sqrt{\eta(2\nu-1)}, \tag{6.27}$$

where the latter supremum is attained for $\xi = 3\sqrt{2}/\sqrt{3\nu-2}$. □
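The supremum in the last step can be checked numerically: scanning the expression in (6.27) on a grid along the imaginary axis recovers both the bound and the maximising ξ. A small sketch (illustrative names):

```python
import math

def symbol_sup(nu, xi_max=30.0, n=60000):
    """Grid scan of 3|xi| |6 - i*xi|^(2nu-1) / (2^(2nu) |3 + i*xi|^(2nu))
    over xi in [0, xi_max]; returns the maximum and its location."""
    best_val, best_xi = 0.0, 0.0
    for k in range(n + 1):
        xi = xi_max * k / n
        val = 3*xi*abs(6 - 1j*xi)**(2*nu - 1) / (2**(2*nu)*abs(3 + 1j*xi)**(2*nu))
        if val > best_val:
            best_val, best_xi = val, xi
    return best_val, best_xi

def bound(nu):
    # sqrt(3) * sqrt(eta(2 nu - 1)) with eta(k) = k^k / (k+1)^(k+1)
    k = 2*nu - 1
    return math.sqrt(3.0) * math.sqrt(k**k / float((k + 1)**(k + 1)))

val, xi = symbol_sup(1)   # ~0.8660 (= sqrt(3)/2), attained near xi = 3*sqrt(2)
```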
Figure 6.2: ρ(M(iξ)) versus ξ for (3.47) (m = 1, linear finite elements, h = 1/16).

The fact that the supremum of ρ(M(z)) over the imaginary axis is not attained at the origin is visualised in Figure 6.2 for h = 1/16. (Observe that ρ(M(0)) = 0: the corresponding static two-grid solver is exact for the one-dimensional Poisson equation, discretised using linear finite elements.) Yet, the theorem states that the spectral radius of the two-grid operator M can be bounded by a constant, independent of h. Some values of the bound (6.26) are given in Table 6.1. Since this bound is not optimal, we also computed the spectral radius of the two-grid operator by numerical evaluation of (6.27) for ν = 2 and for several values of h. These results are reported in Table 6.2.
ν                1      2      3      4
√3 √(η(2ν−1))    0.866  0.563  0.448  0.384

Table 6.1: Values of the upper bound (6.26) versus ν for (3.47) (m = 1, linear finite elements).

h       1/8    1/16   1/32   1/64
ρ(M)    0.217  0.263  0.276  0.280

Table 6.2: Numerical values of ρ(M) for (3.47) (m = 1, linear finite elements, ν = 2).
The discrete-time case
The influence of the time-discretisation method on the convergence of the two-grid WR iteration is investigated by means of the one-dimensional model problem (3.47), discretised using linear finite elements. The multigrid parameters are as above (standard coarsening, red-black Gauss–Seidel smoothing, linear interpolation, r = pᵗ). We perform one pre-smoothing and one post-smoothing step and consider the CN method and the BDF formulae of order one up to five for time integration.

The spectral radii of the finite-interval and infinite-interval operators for the two-level algorithm are reported in Table 6.3. The results were computed by direct numerical evaluation of formulae (6.12) and (6.16). In addition, it follows from the latter formulae that these spectral radii can be estimated from the spectral picture of Figure 6.3, where we plotted contour lines of ρ(M(z)) (for values 0.1, 0.3, 0.5, 0.7, 0.9, 1.1), together with
multistep method     CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
finite interval      0.050  0.050   0.052   0.051   0.049   0.047
infinite interval    0.264  0.069   0.106   0.170   0.343   1.184

Table 6.3: Spectral radii of discrete-time two-grid WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100).

Figure 6.3: Spectral picture of two-grid WR for (3.47) (m = 1, linear finite elements, h = 1/16, τ = 1/100). [Contour lines of ρ(M(z)) from 0.1 to 1.1, together with the scaled stability region boundaries of CN and BDF(1) to BDF(5).]

the scaled stability region boundaries. In particular, the infinite-interval spectral radii increase with increasing order k of the BDF method, as could be expected from Theorem 6.2.6, Corollary 6.2.7 and the knowledge that the stability regions become smaller if k gets larger.

For completeness, we shall give some similar results for the two-dimensional equation (3.47), discretised in space using linear or bilinear finite elements. The two-level WR method that will be used is characterised by one four-colour Gauss–Seidel pre-smoothing step, a similar post-smoothing step, standard coarsening and (bi)linear interpolation. The restriction is again defined by r = pᵗ, which leads to a seven-point formula in the linear
finite-element case, and a nine-point formula in the bilinear case.

It is no longer practical to use a direct numerical evaluation of ρ(M(z)) to study the convergence characteristics of the two-grid WR method for these problems. Instead, we can resort to Remark 6.2.1, which relates ρ(M(z)) to the analysis of a standard two-grid method for a simple elliptic problem. The latter can be analysed efficiently using a classical Fourier mode analysis as introduced by Brandt in [6]. Fourier analysis shows that, under certain conditions, matrix M(z) is spectrally equivalent to a block-diagonal matrix whose diagonal blocks are matrices of size at most four by four. The general form of these four-by-four matrices can be derived by studying the action of the different
Figure 6.4: Spectral picture of two-grid WR for (3.47) (m = 2, linear elements, h = 1/32).
Figure 6.5: Spectral picture of two-grid WR for (3.47) (m = 2, bilinear elements, h = 1/32).
multistep method     CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
finite interval      0.102  0.120   0.110   0.104   0.100   0.097
infinite interval    0.374  0.148   0.148   0.150   0.170   0.233

Table 6.4: Spectral radii of discrete-time two-grid WR for (3.47) (m = 2, linear elements, h = 1/32, τ = 1/100).

multistep method     CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
finite interval      0.038  0.042   0.040   0.038   0.037   0.036
infinite interval    0.356  0.052   0.058   0.068   0.087   0.132

Table 6.5: Spectral radii of discrete-time two-grid WR for (3.47) (m = 2, bilinear elements, h = 1/32, τ = 1/100).

multigrid operators on certain sets of four related exponential or sinusoidal Fourier modes. The spectral properties of M(z) are then calculated easily. We refer to the above reference, and to [88] and [100] for an in-depth discussion of the classical Fourier mode analysis. Here we have closely followed the guidelines laid out in [100, Chap. 7].

Figure 6.4 shows the spectral picture for linear finite elements with h = 1/32. In the computation we used exponential Fourier modes. They lead to an exact value of the spectral radius in the case of periodic boundary conditions. A slight modification to the standard exponential mode analysis was applied to cater for the Dirichlet boundary conditions, a modification described in [100, p. 111]. Figure 6.5 shows a similar picture for bilinear elements on a grid with h = 1/32. The nature of the stencil in the bilinear element case is such that a sinusoidal Fourier mode analysis is possible [88, §7.1]. The sinusoidal mode analysis leads automatically to the correct value of the spectral radius in the case of a problem with Dirichlet boundary conditions. The exact values of the two-grid spectral radii of the (finite-length and infinite-length) waveform operators are presented in Tables 6.4 and 6.5.
6.3.2 Numerical results

The theoretical results of the above section will be verified next. Therefore, we solve the one-dimensional variant of (3.47) using multigrid WR. In the latter method we applied standard V-cycles and W-cycles, i.e., we approximated the solution of the coarse-grid defect equation (6.2) by respectively one and two two-grid cycles (on the current coarse grid and an even coarser one). The mesh sizes are related in the standard manner (hᵢ = 2hᵢ₊₁), and the former idea is recursively applied until we arrive at a mesh with size h₀ = 1/2. The interpolation and restriction operators are chosen as in Section 6.3.1, that is, we use linear interpolation and full-weighting restriction (finite differences) or r = pᵗ (finite elements). We apply one pre-smoothing and one post-smoothing step of coloured Gauss–Seidel WR type (in the finite-difference, linear and cubic finite-element cases, it suffices to use red-black relaxation in order to decouple the unknowns in equally large portions, while a quadratic finite-element discretisation needs three colours to do so), and we use the CN time-discretisation method with a very small step size (τ = 1/1000) to
approximate the continuous-time results for t ∈ [0, 1]. The resulting observed averaged convergence factors are reported in Tables 6.6–6.9, respectively for the finite-difference, the linear, the quadratic and the cubic finite-element case.
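The recursive cycling strategy just described, approximating the coarse-grid defect equation by γ cycles on the next coarser grid (γ = 1 for a V-cycle, γ = 2 for a W-cycle), can be sketched schematically; the grid-visit counts below are purely illustrative:

```python
def cycle(level, gamma, visits):
    """Schematic multigrid cycle: gamma = 1 gives a V-cycle, gamma = 2 a
    W-cycle. The coarse-grid defect equation is approximated by gamma
    cycles on the next coarser grid, down to the coarsest grid (h0 = 1/2)."""
    visits[level] += 1              # smoothing work on this grid
    if level == 0:                  # coarsest grid: solve directly
        return
    for _ in range(gamma):          # recursive coarse-grid correction
        cycle(level - 1, gamma, visits)

levels = 4                          # e.g. h = 1/16 down to h0 = 1/2
v_visits = [0]*levels; cycle(levels - 1, 1, v_visits)
w_visits = [0]*levels; cycle(levels - 1, 2, w_visits)
```

A V-cycle visits each grid once, while the W-cycle doubles the number of visits per coarsening step, which is why it is usually the more robust (and more expensive) variant.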
h         1/8    1/16   1/32   1/64
V-cycle   0.064  0.069  0.070  0.071
W-cycle   0.069  0.069  0.059  0.052

Table 6.6: Averaged convergence factors of multigrid WR for (3.47) (m = 1, finite differences, T = 1, CN, τ = 1/1000).

h         1/8    1/16   1/32   1/64
V-cycle   0.229  0.300  0.326  0.331
W-cycle   0.210  0.254  0.265  0.267

Table 6.7: Averaged convergence factors of multigrid WR for (3.47) (m = 1, linear finite elements, T = 1, CN, τ = 1/1000).

Figure 6.6: Successive multigrid WR iterates u^(η)(1/2, t) for (3.47) (m = 1, linear finite elements; left: h = 1/8, right: h = 1/32). Further iterates can no longer be distinguished graphically from the η = 1 iterate.

The multigrid convergence factors of Tables 6.6 and 6.7 are obviously bounded above by an h-independent constant less than one, similar to the infinite-interval two-grid formulae (6.25) and (6.26). (In particular, the linear finite-element results match the theoretical spectral radii given in Table 6.2 quite well.) This mesh-size independence is illustrated in Figure 6.6, where we plotted successive iterates u^(η), evaluated at x = 1/2; compare Figure 3.6. Note that one iteration of the multigrid WR method suffices to get an approximation that can no longer be distinguished graphically from subsequent iterates. Similar, h-independent convergence results are obtained for the quadratic and cubic finite-element cases.

Two-dimensional results are given in Tables 6.10–6.12. In these experiments, we used coloured Gauss–Seidel smoothing (red-black for the finite-difference case; four colours
h         1/8    1/16   1/32   1/64
V-cycle   0.202  0.285  0.316  0.325
W-cycle   0.201  0.276  0.301  0.309

Table 6.8: Averaged convergence factors of multigrid WR for (3.47) (m = 1, quadratic finite elements, T = 1, CN, τ = 1/1000).

h         1/8    1/16   1/32   1/64
V-cycle   0.191  0.234  0.236  0.237
W-cycle   0.184  0.218  0.213  0.210

Table 6.9: Averaged convergence factors of multigrid WR for (3.47) (m = 1, cubic finite elements, T = 1, CN, τ = 1/1000).

when finite elements are used), linear/seven-point interpolation (finite differences and linear finite elements) or bilinear/nine-point interpolation (bilinear finite elements), and full-weighting restriction (finite differences) or r = pᵗ (finite elements). The other multigrid parameters are chosen as before.

Next, we discuss the influence of time discretisation upon convergence. To this end, we first report averaged convergence factors of the two-grid WR method for (3.47) in Tables 6.13 and 6.14. We used 1000 time steps of size τ = 1/100, defined the prolongation and restriction operators as above, and chose an oscillatory initial approximation in order to excite all possible error frequencies. The measured values correspond very well to the

h         1/4    1/8    1/16   1/32
V-cycle   0.059  0.094  0.099  0.097
W-cycle   0.059  0.075  0.085  0.080

Table 6.10: Averaged convergence factors of multigrid WR for (3.47) (m = 2, finite differences, T = 1, CN, τ = 1/1000).

h         1/4    1/8    1/16   1/32
V-cycle   0.135  0.335  0.437  0.470
W-cycle   0.135  0.304  0.357  0.371

Table 6.11: Averaged convergence factors of multigrid WR for (3.47) (m = 2, linear finite elements, T = 1, CN, τ = 1/1000).

h         1/4    1/8    1/16   1/32
V-cycle   0.137  0.299  0.353  0.365
W-cycle   0.137  0.294  0.344  0.355

Table 6.12: Averaged convergence factors of multigrid WR for (3.47) (m = 2, bilinear finite elements, T = 1, CN, τ = 1/1000).
CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
0.255  0.064   0.099   0.161   0.335   1.166

Table 6.13: Averaged convergence factors of two-grid WR for (3.47) (m = 1, linear finite elements, h = 1/16, T = 10, τ = 1/100).

multistep method           CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
linear finite elements     0.329  0.135   0.137   0.138   0.150   0.198
bilinear finite elements   0.313  0.039   0.044   0.049   0.067   0.118

Table 6.14: Averaged convergence factors of two-grid WR for (3.47) (m = 2, h = 1/32, T = 10, τ = 1/100).

theoretical, infinite-interval spectral radii determined in Tables 6.3–6.5.

In Tables 6.15 and 6.16, we show similar results for the true multigrid methods (i.e., with more than two levels), applied to the two-dimensional equation (3.47) with T = 1 and τ = 1/200. The dashes ("-") indicate divergence over a substantial number of iterations; we also observe a dependence of the convergence on the nature of the time-discretisation method in this case.

Finally, we report averaged convergence factors obtained with W-cycles for different values of the mesh-size parameters, and for different discretisation schemes, in Tables 6.17–6.22. We observe a dependence of the actual convergence factors on h and τ. For the CN and BDF(2) methods, these factors appear to be bounded by a constant, smaller than one, independent of the mesh size. For a constant value of h, we expect the convergence factors to converge to the continuous-time results when τ decreases, see Section 6.2.3. This behaviour is recognised clearly for the CN method, in Tables 6.17 and 6.20. Due to the shape of the stability regions of the BDF(2) and BDF(4) methods, it takes a much smaller value of τ before the discrete-time convergence factors tend to the continuous-time ones, see Tables 6.18, 6.19, 6.21 and 6.22. For a constant value of τ, we observe an initial increase of the convergence factor when h decreases. For sufficiently small h the convergence factor starts to decrease

multistep method   CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
V-cycle            0.438  0.177   0.177   0.275   0.844   -
W-cycle            0.307  0.124   0.124   0.124   0.381   -

Table 6.15: Averaged convergence factors of multigrid WR for (3.47) (m = 2, linear finite elements, h = 1/32, T = 1, τ = 1/200).

multistep method   CN     BDF(1)  BDF(2)  BDF(3)  BDF(4)  BDF(5)
V-cycle            0.446  0.046   0.132   0.332   -       -
W-cycle            0.295  0.041   0.041   0.052   0.538   -

Table 6.16: Averaged convergence factors of multigrid WR for (3.47) (m = 2, bilinear finite elements, h = 1/32, T = 1, τ = 1/200).
h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.103  0.135  0.134  0.135  0.134   0.135
1/8    0.126  0.256  0.305  0.304  0.304   0.304
1/16   0.117  0.135  0.282  0.359  0.358   0.357
1/32   0.123  0.125  0.140  0.307  0.372   0.371

Table 6.17: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, linear finite elements, T = 1, CN).

h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.051  0.070  0.111  0.128  0.133   0.134
1/8    0.086  0.086  0.108  0.194  0.247   0.291
1/16   0.118  0.118  0.118  0.118  0.124   0.266
1/32   0.124  0.124  0.124  0.124  0.124   0.125

Table 6.18: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, linear finite elements, T = 1, BDF(2)).

h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.173  0.320  0.290  0.171  0.135   0.134
1/8    0.141  0.324  0.525  0.766  0.666   0.311
1/16   0.121  0.154  0.358  0.653  0.807   0.948
1/32   0.124  0.124  0.124  0.381  0.726   1.091

Table 6.19: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, linear finite elements, T = 1, BDF(4)).

h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.102  0.133  0.136  0.137  0.137   0.137
1/8    0.150  0.231  0.285  0.293  0.294   0.294
1/16   0.080  0.179  0.268  0.330  0.343   0.344
1/32   0.042  0.086  0.184  0.295  0.343   0.355

Table 6.20: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, bilinear finite elements, T = 1, CN).

h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.049  0.072  0.085  0.088  0.106   0.125
1/8    0.063  0.088  0.130  0.178  0.224   0.241
1/16   0.045  0.046  0.047  0.104  0.167   0.246
1/32   0.041  0.041  0.041  0.041  0.042   0.132

Table 6.21: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, bilinear finite elements, T = 1, BDF(2)).
h\τ    0.04   0.02   0.01   0.005  0.0025  0.001
1/4    0.124  0.217  0.161  0.148  0.139   0.135
1/8    0.158  0.319  0.600  0.661  0.405   0.324
1/16   0.069  0.147  0.377  0.735  0.892   0.770
1/32   0.042  0.048  0.112  0.538  0.646   0.937

Table 6.22: Averaged convergence factors of multigrid WR (W-cycle) for (3.47) (m = 2, bilinear finite elements, T = 1, BDF(4)).

again. This behaviour is similar to what is observed when the multigrid WR method is used to solve the ODEs obtained by spatial finite-difference discretisation of a parabolic problem. We refer to [95, §3.5] for an intuitive explanation, and to [96] for a discussion based on an exponential Fourier mode analysis.
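The scaled stability-region boundaries appearing in the spectral pictures (Figures 6.3–6.5) can be traced with the standard root-locus method for BDF formulae; a small sketch (the scaling by 1/τ follows the convention of those pictures):

```python
import cmath

def bdf_boundary(k, n=360, tau=1.0):
    """Boundary locus of the BDF(k) stability region, scaled by 1/tau:
    z(theta) = (1/tau) * sum_{j=1}^{k} (1 - e^{-i*theta})^j / j."""
    pts = []
    for step in range(n + 1):
        theta = 2*cmath.pi*step/n
        w = cmath.exp(-1j*theta)
        pts.append(sum((1 - w)**j / j for j in range(1, k + 1)) / tau)
    return pts
```

For k = 1 this traces the familiar circle of radius 1/τ centred at 1/τ; for k ≥ 3 the boundary bulges into the left half-plane, which is the shape effect referred to in the discussion of Tables 6.18, 6.19, 6.21 and 6.22.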
6.4 Some concluding remarks

In this chapter, we extended the convergence theory for multigrid WR methods for ODE systems (3.1) with B = I to general ODEs with nonsingular B. Application of this theory to several model problems shows that the convergence behaviour of the multigrid WR method is similar to that of its static counterpart. More precisely, the convergence rate of the multigrid WR method turns out to be mesh-size independent, a fact which is confirmed by extensive numerical experiments.

For completeness, we should mention that the latter method belongs to the class of parabolic multigrid methods. These are multigrid methods for time-dependent problems, designed to operate on grids extending in space and time. Other examples of such methods are the time-parallel multigrid method [21, 28] and the space-time multigrid method [29]. These methods are highly efficient on parallel computers, possibly outperforming parallel implementations of standard time-stepping methods by orders of magnitude [30, 97]. Their convergence characteristics as iterative solvers are similar to the convergence characteristics of multigrid methods for stationary problems, although different parabolic multigrid methods may have very different robustness characteristics. The waveform method, in particular, was shown to be very robust across a wide range of time-discretisation schemes. We refer to [29, 96] for a further discussion.
Chapter 7

Concluding Remarks and Suggestions for Future Research

This chapter summarises the main results of the dissertation. We also suggest some topics for future work.
7.1 Concluding remarks

An extension of the linear waveform relaxation convergence theory

When we started the research for this thesis, our first aim was to investigate whether the classical WR convergence theory for linear systems of ODEs u̇ + Au = f [63, 64] could be extended to more general systems of the form Bu̇ + Au = f with B nonsingular. In Chapter 3, we proposed a WR scheme for such generalised problems and investigated its convergence behaviour, both for the continuous-time and discrete-time variants of the method. To this end, we formulated the latter algorithms as explicit successive approximations and identified the resulting iteration operators as consisting of a matrix multiplication and a linear convolution part. The spectral properties of such operators were studied in Chapter 2.

Our results turned out to be qualitatively analogous to those for the B = I case, although the precise expressions are substantially different. We showed that the asymptotic convergence behaviour of a standard WR method is governed by that of the associated static relaxation method for the Laplace-transformed system (zB + A)ũ(z) = f̃(z) + Bu₀, with z a complex value. The asymptotic convergence factor is then found by a maximisation process over a region in the complex plane. We were able to prove that for some semi-discretised heat flow problems, the Jacobi and Gauss–Seidel WR algorithms converge as fast as their static counterparts for Au = f. Numerical evidence was provided to support the theory.
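For the finite-difference discretised 1-D heat equation with B = I, this maximisation can be carried out directly: the spectral radius of the Jacobi symbol (zI + D)⁻¹(D − A) is largest at z = 0, where it equals the static Jacobi radius cos(πh). A model-problem sketch (illustrative names, not thesis code):

```python
import math

def jacobi_wr_factor(N, xi_max=200.0, n=2000):
    """sup over z = i*xi of the spectral radius of the Jacobi iteration
    symbol (z I + D)^(-1) (D - A) for the 1-D FD Laplacian, h = 1/N.
    The symbol's eigenvalues are 2 cos(m*pi*h) / (z*h^2 + 2)."""
    h = 1.0 / N
    best = 0.0
    for k in range(n + 1):
        z = 1j * (xi_max * k / n)
        r = max(2.0*abs(math.cos(m*math.pi*h)) / abs(z*h*h + 2.0)
                for m in range(1, N))
        best = max(best, r)
    return best
```

The scan returns cos(π/16) ≈ 0.9808 for N = 16, i.e. exactly the static Jacobi factor: Jacobi WR is as fast as static Jacobi for this model problem.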
Acceleration techniques

Next, we were determined to improve the WR convergence by acceleration techniques. Well-known acceleration techniques for stationary linear systems were the obvious candidates to be tried on time-dependent problems.
130
CHAPTER 7. CONCLUDING REMARKS AND RESEARCH SUGGESTIONS
In Chapter 4, we considered the acceleration of WR by SOR techniques. For the standard SOR WR method, defined as a straightforward extension of static SOR, the pessimistic results for the B = I case [63, 64] could be extended easily towards the more general linear ODE systems. In particular, the SOR WR method did not yield the same acceleration as its static counterpart. At that time, we got to know the results of Mark Reichelt, who did very promising numerical experiments with a so-called convolution SOR (CSOR) WR algorithm [82]. The idea was to replace every multiplication with a scalar overrelaxation parameter by a convolution with a time-dependent function. For certain problems Reichelt also derived the Laplace-transformed expression of the optimal convolution kernel by applying classical complex SOR theory. We investigated this CSOR WR method in great detail for general linear ODE systems and were able to prove that in many cases, its optimal variant (with the best possible convolution kernel) behaves exactly as the optimal static SOR method for Au = f. We extended the validity of the Laplace-transformed expression of the optimal convolution kernel towards more general problems, and we provided several results on how to determine the optimal convolution kernel in practice. We also showed the method to be quite robust with respect to the calculation of this kernel.

In Chapter 5, we applied a similar convolution idea to the Chebyshev acceleration of WR methods. We first illustrated that no substantial acceleration should be expected from taking linear combinations of the basic WR iterates, as was shown already in [68]. Instead, we tried to accelerate the WR convergence by a convolution-based polynomial approach.
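For reference, the classical static SOR theory invoked above gives, for a consistently ordered system with Jacobi spectral radius μ, the optimal parameter and the resulting SOR radius; a sketch of Young's formulas (the optimal CSOR WR method is shown to match this static behaviour):

```python
import math

def optimal_sor(mu):
    """Young's classical SOR theory (consistently ordered case):
    optimal overrelaxation parameter and resulting spectral radius,
    given the Jacobi spectral radius mu (0 < mu < 1)."""
    omega = 2.0 / (1.0 + math.sqrt(1.0 - mu*mu))
    return omega, omega - 1.0

mu = math.cos(math.pi/16)        # Jacobi radius of the 1-D model problem
omega, rho_sor = optimal_sor(mu) # omega ~ 1.673, rho_sor ~ 0.673
```

Note that rho_sor ≈ 0.673 is far smaller than the Gauss–Seidel radius μ² ≈ 0.962, which is the acceleration that the straightforward (non-convolution) SOR WR method fails to reproduce.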
Our convergence results were first applied to the convolution Chebyshev acceleration of the Picard method, and compared with those of the strongly related shifted Picard scheme, which was investigated in [54, 85] and which was, as indicated in the introductory chapter, one of the starting points for the ideas developed in this chapter. In addition, the convolution Chebyshev Jacobi and Gauss–Seidel WR methods were shown to converge as fast as their static counterparts, and this for several model problems.

Finally, the performance of multigrid WR methods for ODE systems of the form Bu̇ + Au = f, derived from a linear, parabolic PDE by spatial discretisation, is investigated in Chapter 6. We were able to extend the finite-difference results [55, 95] towards the
finite-element case. That is to say, the multigrid WR method behaves almost as its static counterpart for Au = f; the convergence behaviour is independent of the mesh size h used for the spatial discretisation of the PDE.
7.2 Suggestions for future research

New acceleration techniques

From the theory developed in this dissertation, we may conclude that "convolution is the way to go" in WR methods. That is, whenever an acceleration scheme requires a waveform iterate to be multiplied by a scalar parameter, we are probably better off replacing this multiplication by a convolution with a time-dependent kernel.

This idea can be applied to Krylov subspace WR methods. As mentioned in the introductory chapter, the straightforward extension of Krylov techniques to the waveform case did not result in an essential speed-up [56, 57, 68]. A convolution-based approach
has been shown recently to be very promising by Lumsdaine and Wu [59]. In particular, the convergence behaviour of the resulting convolution Krylov (conjugate gradients, GMRES) WR methods seems to equal that of their static counterparts. When combined with a suitable preconditioner this might make a very efficient method. Yet, the current implementations of Krylov WR algorithms suffer from an (until now) inevitable deconvolution, which is known to be a numerically unstable process. We think it will be interesting to conduct a further study of these convolution Krylov methods. One could try to develop techniques that circumvent the deconvolution problem, and also analyse the WR extension of different Krylov variants (e.g., QMR and BiCGSTAB) that have not yet been considered in [59].

Several other static iterative methods remain to be extended to the WR case. For example, we wonder whether the alternating-direction implicit (ADI) method for elliptic PDEs [106, Chap. 17] can be reformulated in a WR context (for semi-discretised parabolic PDEs). For this, and also for other methods, we are especially interested in a comparison of the performance of the WR method with that of the corresponding static methods. We also plan to study generalisations of the standard multigrid WR method in which multiple coarse grids are used, such as, e.g., frequency-decomposition multigrid [23] and multigrid with multiple semi-coarsening [65]. Such methods are known to be very robust for elliptic problems. In the WR context, they might make very efficient parabolic PDE solvers for problems with variable coefficients.
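The guiding principle above, replacing multiplication by a scalar with convolution by a time-dependent kernel, has a simple discrete form. In the sketch below (illustrative names), a scalar ω is recovered as the degenerate kernel ω/τ concentrated in the first time step:

```python
def convolve(kernel, waveform, tau):
    """Discrete causal convolution (rectangle rule, step tau):
    (k * u)(n) ~ tau * sum_{j=0}^{n} kernel[j] * waveform[n - j]."""
    return [tau * sum(kernel[j]*waveform[n - j] for j in range(n + 1))
            for n in range(len(waveform))]

# A scalar multiplication by omega is the special case of a delta-like
# kernel concentrated in the first step: kernel = [omega/tau, 0, 0, ...].
tau, omega = 0.1, 1.5
u = [1.0, 2.0, 3.0]
scaled = convolve([omega/tau, 0.0, 0.0], u, tau)   # equals omega * u
```

A genuine time-dependent kernel instead mixes the whole history of the waveform, which is exactly the extra freedom that the convolution-based accelerations exploit.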
Variable-coefficient/nonlinear applications

The theoretical analysis of the WR methods is usually performed in terms of linear equations with constant coefficients. Such an approach allows for a comparison of the theoretical convergence results with those obtained from numerical experiments. A frequently used model problem to this end (also in this dissertation) is the heat equation, spatially discretised using finite differences or finite elements on regular grids. Yet, one might wish to apply these methods to more general, nonlinear equations on more general, irregular meshes. A few examples are the initial-value problems simulating very large electrical circuits and the parabolic problems modelling the behaviour of semiconductor devices. Other possible PDEs are, e.g., equations of convection-diffusion type and the Navier–Stokes and Maxwell equations.

For such more general, nonlinear problems we expect difficulties in determining the optimal kernels in the convolution-based algorithms, as the analytical formulas for these kernels are no longer valid, or not exactly computable. In addition, one should study which adjustments should be made to the multigrid WR methods in order to make them applicable to differential equations on irregular grids. Also, it will be interesting to compare the efficiency of WR methods, both in CPU time and memory usage, with that of standard solution techniques for such nontrivial applications.
Exploitation of multirate behaviour

Real-life applications often give rise to large systems of differential equations whose components change at very different rates. They are ideal candidates for investigating the practical applicability of the multirate integration property of the WR methods. Besides the obvious technical problems that arise in an actual implementation of this idea (which data structures should be used to represent the discrete waveforms, and which manipulations should be defined on such data structures), we wonder whether the multirate behaviour can be truly exploited for the convolution-based WR methods.
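One conceivable answer to the data-structure question is sketched below (our own illustrative design, not taken from the thesis): each component stores its waveform on a private time mesh, and coupling terms are obtained by interpolation, so that a slow component never needs to be stored on the fine mesh of a fast one.

```python
import numpy as np

class Waveform:
    """A discrete waveform on its own, component-specific time mesh."""
    def __init__(self, times, values):
        self.times = np.asarray(times, dtype=float)
        self.values = np.asarray(values, dtype=float)

    def __call__(self, t):
        # piecewise-linear evaluation on this component's private mesh
        return np.interp(t, self.times, self.values)

# a slowly varying component on a coarse mesh ...
slow = Waveform(np.linspace(0, 1, 11), np.linspace(0, 1, 11) ** 2)
# ... and a rapidly varying one on a mesh ten times finer
fast = Waveform(np.linspace(0, 1, 101), np.sin(20 * np.linspace(0, 1, 101)))

# during a relaxation sweep for the fast component, the slow waveform
# is simply sampled on the fine mesh
coupling = slow(fast.times)
print(coupling[50])   # → 0.25 (slow component interpolated at t = 0.5)
```

The open question raised above is whether such independent meshes still pay off once convolution kernels, which couple all time levels, enter the algorithm.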
Other problems
The WR method is best known in the context of first-order ordinary differential equations. Yet, it would be interesting to investigate the WR approach for other problems too. Some efforts to this end have already been made [8, §7.12], but many questions remain unanswered. For example, we wonder whether WR methods can be successfully applied to integro-differential equations, and to what extent the techniques developed in this thesis carry over to hyperbolic PDE problems.
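For reference, the basic iteration from which all of these extensions start can be sketched in a few lines (a toy illustration of our own, not code from the thesis): Jacobi waveform relaxation for the linear system u' = Au, u(0) = u0, with the splitting A = D + R, D = diag(A), and backward Euler in time.

```python
import numpy as np

A = np.array([[-2.0, 1.0], [1.0, -2.0]])
u0 = np.array([1.0, 0.0])
T, n = 1.0, 200
h = T / n
D = np.diag(np.diag(A))   # Jacobi splitting: diagonal part ...
R = A - D                 # ... and off-diagonal coupling

u = np.tile(u0, (n + 1, 1))          # initial waveform: constant in time
for sweep in range(30):              # waveform relaxation sweeps
    new = np.empty_like(u)
    new[0] = u0
    for m in range(n):
        # backward Euler; the coupling term uses the previous iterate,
        # so each component integrates independently (Jacobi)
        rhs = new[m] + h * (R @ u[m + 1])
        new[m + 1] = rhs / (1.0 - h * np.diag(D))
    u = new

# reference: backward Euler applied to the full (coupled) system
ref = np.tile(u0, (n + 1, 1))
M = np.linalg.inv(np.eye(2) - h * A)
for m in range(n):
    ref[m + 1] = M @ ref[m]

print(np.max(np.abs(u - ref)))   # the WR iterates have converged to the
                                 # coupled backward-Euler solution
```

On a bounded time interval this iteration converges superlinearly, which is why a moderate number of sweeps already reproduces the coupled solution to high accuracy.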
Bibliography

[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York, 1970.
[2] A. Bellen, Z. Jackiewicz, and M. Zennaro. Contractivity of waveform relaxation Runge–Kutta iterations and related limit methods for dissipative systems in the maximum norm. SIAM J. Numer. Anal., 31(2):499–523, April 1994.
[3] A. Bellen and M. Zennaro. The use of Runge–Kutta formulae in waveform relaxation methods. Appl. Numer. Math., 11(1–3):95–114, 1993.
[4] G. Birkhoff and R. E. Lynch. Numerical Solution of Elliptic Systems, volume 6 of SIAM Studies in Applied Mathematics. SIAM, Philadelphia, 1984.
[5] R. N. Bracewell. The Fourier Transform and its Applications. McGraw-Hill Kogakusha, Ltd., Tokyo, 2nd edition, 1978.
[6] A. Brandt. Multi-level adaptive solutions to boundary-value problems. Math. Comput., 31(138):333–390, 1977.
[7] P. Brenner, V. Thomée, and L. B. Wahlbin. Besov Spaces and Applications to Difference Methods for Initial Value Problems, volume 434 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1975.
[8] K. Burrage. Parallel and Sequential Methods for Ordinary Differential Equations. Oxford University Press, Oxford, 1995.
[9] K. Burrage, Z. Jackiewicz, S. P. Nørsett, and R. A. Renaut. Preconditioning waveform relaxation iterations for differential systems. BIT, 36(1):54–76, 1996.
[10] S. R. Choudhury. Waveform relaxation techniques for linear and nonlinear diffusion equations. J. Comput. Appl. Math., 42(2):253–267, 1992.
[11] P. G. Ciarlet. Basic error estimates for elliptic problems. In P. G. Ciarlet and J. L. Lions, editors, Finite Element Methods, volume 2 of Handbook of Numerical Analysis, pages 18–351. North-Holland, Amsterdam, 1991.
[12] M. L. Crow. Waveform Relaxation Methods for the Simulation of Systems of Differential-Algebraic Equations with Applications to Electric Power Systems. Ph.D. thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, U.S.A., December 1989.
[13] M. L. Crow and D. C. Ilic. The parallel implementation of the waveform relaxation method for transient stability simulations. IEEE Trans. Power Systems, 5(3):922–932, August 1990.
[14] M. L. Crow and D. C. Ilic. The waveform relaxation method for systems of differential-algebraic equations. Math. and Comp. Modelling, 19(12):67–84, 1994.
[15] R. F. Curtain and A. J. Pritchard. Functional Analysis in Modern Applied Mathematics. Academic Press, London, 1977.
[16] M. Eiermann. On semiiterative methods generated by Faber polynomials. Numer. Math., 56(2–3):139–156, 1989.
[17] W. Fang, M. E. Morari, and D. Smart. Robust VLSI circuit simulation techniques based on overlapping waveform relaxation. IEEE Trans. CAD Integrated Circuits and Systems, 14(4):510–518, April 1995.
[18] M. J. Gander. Analysis of Parallel Algorithms for Time-Dependent Partial Differential Equations. Ph.D. thesis, Department of Scientific Computing and Computational Mathematics, Stanford University, California, U.S.A., August 1997.
[19] I. Gohberg and S. Goldberg. Basic Operator Theory. Birkhäuser, Boston, 1981.
[20] G. Gripenberg, S. O. Londen, and O. Staffans. Volterra Integral and Functional Equations, volume 34 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1990.
[21] W. Hackbusch. Parabolic multi-grid methods. In R. Glowinski and J.-L. Lions, editors, Computing Methods in Applied Sciences and Engineering VI, pages 189–197. North-Holland, Amsterdam, 1984.
[22] W. Hackbusch. Multi-Grid Methods and Applications, volume 4 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1985.
[23] W. Hackbusch. The frequency-decomposition multigrid method, part I: Application to anisotropic equations. Numer. Math., 56(2–3):229–245, 1989.
[24] L. A. Hageman and D. M. Young. Applied Iterative Methods. Academic Press, New York, 1981.
[25] E. Hairer and G. Wanner. Solving Ordinary Differential Equations II, volume 14 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1991.
[26] G. H. Hardy, J. E. Littlewood, and G. Pólya. Inequalities. Cambridge University Press, Cambridge, 2nd edition, 1978.
[27] K. Hoffman. Banach Spaces of Analytic Functions. Prentice-Hall, Englewood Cliffs, N.J., 1962.
[28] G. Horton. The time-parallel multigrid method. Comm. in Appl. Numer. Methods, 8(9):585–595, 1992.
[29] G. Horton and S. Vandewalle. A space-time multigrid method for parabolic partial differential equations. SIAM J. Sci. Comput., 16(4):848–864, July 1995.
[30] G. Horton, S. Vandewalle, and P. Worley. An algorithm with polylog parallel complexity for solving parabolic partial differential equations. SIAM J. Sci. Comput., 16(3):531–541, May 1995.
[31] M. Hu, K. Jackson, J. Janssen, and S. Vandewalle. Remarks on the optimal convolution kernel for CSOR waveform relaxation. Adv. Comput. Math., 7(1–2):135–156, 1997.
[32] M. Hu, K. Jackson, and B. Zhu. Complex optimal SOR parameters and convergence regions. Working Notes, Department of Computer Science, University of Toronto, Canada, 1995.
[33] E. Isaacson and H. B. Keller. Analysis of Numerical Methods. John Wiley and Sons, New York, 1966.
[34] J. Janssen and S. Vandewalle. Multigrid waveform relaxation on spatial finite-element meshes. In P. W. Hemker and P. Wesseling, editors, Contributions to Multigrid: a Selection of Contributions to the Fourth European Multigrid Conference, held in Amsterdam, July 6–9, 1993, volume 103 of CWI Tracts, pages 75–86. Stichting Mathematisch Centrum, Amsterdam, 1994.
[35] J. Janssen and S. Vandewalle. Convolution SOR waveform relaxation on spatial finite-element meshes. In G. Alefeld, O. Mahrenholtz, and R. Mennicken, editors, ICIAM/GAMM 95: Numerical Analysis, Scientific Computing, Computer Science, Hamburg, July 3–7, 1995, volume 76, supplement 1 of Zeitschrift für Angewandte Mathematik und Mechanik, pages 19–22. Akademie Verlag, Berlin, 1996.
[36] J. Janssen and S. Vandewalle. Multigrid waveform relaxation on spatial finite-element meshes: the discrete-time case. SIAM J. Sci. Comput., 17(1):133–155, January 1996.
[37] J. Janssen and S. Vandewalle. Multigrid waveform relaxation on spatial finite-element meshes: the continuous-time case. SIAM J. Numer. Anal., 33(2):456–474, April 1996.
[38] J. Janssen and S. Vandewalle. Convolution-based Chebyshev acceleration of waveform relaxation. Technical Report TW 256, Department of Computer Science, Katholieke Universiteit Leuven, Belgium, April 1997.
[39] J. Janssen and S. Vandewalle. On SOR waveform relaxation methods. SIAM J. Numer. Anal., 34(6):2456–2481, December 1997.
[40] R. Jeltsch and B. Pohl. Waveform relaxation with overlapping splittings. SIAM J. Sci. Comput., 16(1):40–49, January 1995.
[41] G. S. Jordan, O. J. Staffans, and R. L. Wheeler. Local analyticity in weighted L1-spaces and applications to stability problems for Volterra equations. Trans. Am. Math. Soc., 274(2):749–782, December 1982.
[42] F. Juang. Waveform Methods for Ordinary Differential Equations. Ph.D. thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, U.S.A., January 1990.
[43] L. Kantorovich and G. Akilov. Functional Analysis. Pergamon Press, Oxford, 2nd edition, 1982.
[44] T. Kato. Perturbation Theory for Linear Operators. Springer-Verlag, Berlin, 2nd edition, 1984.
[45] S. Keras. Numerical Methods for Parabolic Partial Differential Equations. Ph.D. thesis, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, England, September 1996.
[46] B. Kredell. On complex successive overrelaxation. BIT, 2:143–152, 1962.
[47] R. Kress. Linear Integral Equations, volume 82 of Applied Mathematical Sciences. Springer-Verlag, New York, 1989.
[48] J. D. Lambert. Computational Methods in Ordinary Differential Equations. John Wiley and Sons, Chichester, 1973.
[49] B. Leimkuhler. Estimating waveform relaxation convergence. SIAM J. Sci. Comput., 14(4):872–889, July 1993.
[50] E. Lelarasmee. The Waveform Relaxation Method for the Time-Domain Analysis of Large-Scale Nonlinear Systems. Ph.D. thesis, Department of Electrical Engineering and Computer Science, University of California, Berkeley, U.S.A., 1982.
[51] E. Lelarasmee, A. Ruehli, and A. Sangiovanni-Vincentelli. The waveform relaxation method for time-domain analysis of large-scale integrated circuits. IEEE Trans. CAD Integrated Circuits and Systems, 1(3):131–145, July 1982.
[52] E. Lindelöf. Sur l'application des méthodes d'approximations successives à l'étude des intégrales réelles des équations différentielles ordinaires. J. de Math. Pures et Appl., 4e Série(10):117–128, 1894.
[53] C. Lubich. On the stability of linear multistep methods for Volterra convolution equations. IMA J. Numer. Anal., 3(4):439–465, 1983.
[54] C. Lubich. Chebyshev acceleration of Picard–Lindelöf iteration. BIT, 32(3):535–538, 1992.
[55] C. Lubich and A. Ostermann. Multi-grid dynamic iteration for parabolic equations. BIT, 27(3):216–234, 1987.
[56] W.-S. Luk. Krylov's Methods in Function Space for Waveform Relaxation. Ph.D. thesis, Division of Computer Science and Engineering, Chinese University of Hong Kong, June 1996.
[57] A. Lumsdaine. Theoretical and Practical Aspects of Parallel Numerical Algorithms for Initial Values Problems, with Applications. Ph.D. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, U.S.A., January 1992.
[58] A. Lumsdaine, M. W. Reichelt, J. M. Squyres, and J. K. White. Accelerated waveform methods for parallel transient simulation of semiconductor devices. IEEE Trans. CAD Integrated Circuits and Systems, 15(7):716–726, July 1996.
[59] A. Lumsdaine and D. Wu. Krylov subspace acceleration of waveform relaxation. Technical Report CSE-TR-96-33, Department of Computer Science and Engineering, University of Notre Dame, Indiana, U.S.A., November 1996.
[60] A. Lumsdaine and D. Wu. Spectra and pseudospectra of waveform relaxation operators. SIAM J. Sci. Comput., 18(1):286–304, January 1997.
[61] T. A. Manteuffel. The Tchebychev iteration for nonsymmetric linear systems. Numer. Math., 28(3):307–327, 1977.
[62] U. Miekkala. Dynamic iteration methods applied to linear DAE systems. J. Comput. Appl. Math., 25(2):133–151, 1989.
[63] U. Miekkala and O. Nevanlinna. Convergence of dynamic iteration methods for initial value problems. SIAM J. Sci. Statist. Comput., 8(4):459–482, July 1987.
[64] U. Miekkala and O. Nevanlinna. Sets of convergence and stability regions. BIT, 27(4):554–584, 1987.
[65] W. A. Mulder. A new multigrid approach to convection problems. J. Comput. Phys., 83:303–323, 1989.
[66] O. Nevanlinna. Remarks on Picard–Lindelöf iteration, part I. BIT, 29(2):328–346, 1989.
[67] O. Nevanlinna. Remarks on Picard–Lindelöf iteration, part II. BIT, 29(3):535–562, 1989.
[68] O. Nevanlinna. Linear acceleration of Picard–Lindelöf iteration. Numer. Math., 57(2):147–156, 1990.
[69] O. Nevanlinna. Power bounded prolongations and Picard–Lindelöf iteration. Numer. Math., 58(5):479–501, 1991.
[70] W. Niethammer and R. Varga. The analysis of k-step iterative methods for linear systems from summability theory. Numer. Math., 41(2):177–206, 1983.
[71] C. Oosterlee. Robust Multigrid Methods for the Steady and Unsteady Incompressible Navier–Stokes Equations in General Coordinates. Ph.D. thesis, Department of Technical Mathematics and Informatics, Technische Universiteit Delft, The Netherlands, September 1993.
[72] C. Oosterlee and P. Wesseling. A multigrid waveform relaxation method for time-dependent incompressible Navier–Stokes equations. In P. W. Hemker and P. Wesseling, editors, Contributions to Multigrid: a Selection of Contributions to the Fourth European Multigrid Conference, held in Amsterdam, July 6–9, 1993, volume 103 of CWI Tracts, pages 169–180. Stichting Mathematisch Centrum, Amsterdam, 1994.
[73] R. E. A. C. Paley and N. Wiener. Fourier Transforms in the Complex Domain. American Mathematical Society, Providence, R.I., 1934.
[74] E. Picard. Sur l'application des méthodes d'approximations successives à l'étude de certaines équations différentielles ordinaires. J. de Math. Pures et Appl., 4e Série(9):217–271, 1893.
[75] A. D. Poularikas and S. Seely. Elements of Signals and Systems. PWS-Kent Series in Electrical Engineering. PWS-Kent Publishing Company, Boston, 1988.
[76] S. Raman, L. Patnaik, and R. Mall. Parallel implementation of circuit simulation. Int. J. High Speed Comput., 2(4):351–373, 1990.
[77] M. Reed and B. Simon. Functional Analysis, volume 1 of Methods of Modern Mathematical Physics. Academic Press, New York, 1972.
[78] M. Reed and B. Simon. Fourier Analysis, Self-Adjointness, volume 2 of Methods of Modern Mathematical Physics. Academic Press, New York, 1975.
[79] L. Reichel and L. N. Trefethen. Eigenvalues and pseudo-eigenvalues of Toeplitz matrices. Linear Algebra Appl., 162–164:153–185, 1992.
[80] M. W. Reichelt. Accelerated Waveform Relaxation Techniques for the Parallel Transient Simulation of Semiconductor Devices. Ph.D. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, U.S.A., June 1993.
[81] M. W. Reichelt, J. K. White, and J. Allen. Waveform relaxation for transient simulation of two-dimensional MOS devices. In ICCAD-89: IEEE International Conference on Computer-Aided Design, Santa Clara, California, November 5–9, 1989, pages 412–415. IEEE Computer Society Press, Washington D.C., 1989.
[82] M. W. Reichelt, J. K. White, and J. Allen. Optimal convolution SOR acceleration of waveform relaxation with application to parallel simulation of semiconductor devices. SIAM J. Sci. Comput., 16(5):1137–1158, September 1995.
[83] A. Secchi, M. Morari, and E. Biscaja. The waveform relaxation method in the concurrent dynamic process simulation. Comp. and Chemical Engineering, 17(7):683–704, 1993.
[84] J. Simoens. Het Gebruik van Snelle Directe Methoden als Preconditioner voor Golfvormrelaxatie. M.Sc. thesis, Department of Computer Science, Katholieke Universiteit Leuven, Belgium, June 1997. (in Dutch).
[85] R. Skeel. Waveform iteration and the shifted Picard splitting. SIAM J. Sci. Statist. Comput., 10(4):756–776, July 1989.
[86] A. Skjellum. Concurrent Dynamic Simulation: Multicomputer Algorithms Research Applied to Ordinary Differential-Algebraic Process Systems in Chemical Engineering. Ph.D. thesis, California Institute of Technology, Pasadena, U.S.A., May 1990.
[87] G. Strang and G. J. Fix. An Analysis of the Finite Element Method. Prentice-Hall, Englewood Cliffs, N.J., 1973.
[88] K. Stüben and U. Trottenberg. Multigrid methods: fundamental algorithms, model problem analysis and applications. In U. Trottenberg and W. Hackbusch, editors, Multigrid Methods, volume 960 of Lecture Notes in Mathematics, pages 1–176. Springer-Verlag, Berlin, 1982.
[89] G. Szegő. Orthogonal Polynomials, volume 23 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, R.I., 3rd edition, 1967.
[90] S. Ta'asan and H. Zhang. On the multigrid waveform relaxation method. SIAM J. Sci. Comput., 16(5):1092–1104, September 1995.
[91] A. E. Taylor. Introduction to Functional Analysis. John Wiley and Sons, New York, 1967.
[92] G. J. Tee. Eigenvectors of the successive overrelaxation process, and its combination with Chebyshev semi-iteration. Comp. J., 6:250–263, 1963.
[93] V. Thomée. Galerkin Finite Element Methods for Parabolic Problems, volume 1054 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1984.
[94] V. Thomée. Finite difference methods for linear parabolic equations. In P. G. Ciarlet and J. L. Lions, editors, Finite difference methods. Solution of equations in R^n, volume 1 of Handbook of Numerical Analysis, pages 5–196. North-Holland, Amsterdam, 1990.
[95] S. Vandewalle. Parallel Multigrid Waveform Relaxation for Parabolic Problems. B.G. Teubner, Stuttgart, 1993.
[96] S. Vandewalle and G. Horton. Fourier mode analysis of the multigrid waveform relaxation and time-parallel multigrid methods. Computing, 54(4):317–330, 1995.
[97] S. Vandewalle and E. Van de Velde. Space-time concurrent multigrid waveform relaxation. Annals of Numer. Math., 1(1–4):347–363, 1994.
[98] R. S. Varga. Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs, N.J., 1965.
[99] R. Wait and A. R. Mitchell. Finite Element Analysis and Applications. John Wiley and Sons, Chichester, 1985.
[100] P. Wesseling. An Introduction to Multigrid Methods. John Wiley and Sons, Chichester, 1992.
[101] J. White and A. Sangiovanni-Vincentelli. Relaxation Techniques for the Simulation of VLSI Circuits. Kluwer Academic Publishers, Boston, 1987.
[102] J. White, A. Sangiovanni-Vincentelli, F. Odeh, and A. Ruehli. Waveform relaxation: theory and practice. Trans. Soc. for Comp. Simulation, 2(1):95–133, 1985.
[103] S. Wolfram. Mathematica: a System for Doing Mathematics by Computer. Addison-Wesley, New York, 1988.
[104] E. Xia and R. Saleh. Parallel waveform-Newton algorithms for circuit simulation. IEEE Trans. CAD Integrated Circuits and Systems, 11(4):432–442, 1992.
[105] K. Yosida. Functional Analysis. Springer-Verlag, New York, 1980.
[106] D. M. Young. Iterative Solution of Large Linear Systems. Academic Press, New York, 1971.