Kun Il Park
Fundamentals of Probability and Stochastic Processes with Applications to Communications
Holmdel, New Jersey
USA
ISBN 978-3-319-68074-3 ISBN 978-3-319-68075-0 (eBook)
https://doi.org/10.1007/978-3-319-68075-0
Library of Congress Control Number: 2017953254
© Springer International Publishing AG 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with
regard to jurisdictional claims in published maps and institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To
Sylvie
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Basic Mathematical Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1 Complex Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.2 Complex Variable Operations . . . . . . . . . . . . . . . . . . . . . 4
2.1.3 Associative, Commutative, and Distributive
Laws of Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.4 Complex Conjugate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.2 Matrix Transposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.4 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2.5 Matrix Inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2.6 Matrix Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.7 Linear Combination . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.8 Nonnegative Definite Matrix . . . . . . . . . . . . . . . . . . . . . . 30
2.2.9 Complex Conjugate of a Matrix . . . . . . . . . . . . . . . . . . . 32
2.2.10 Matrix Identities for the Estimation Theory . . . . . . . . . . . 32
2.3 Set Theory Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1 Definition of Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.2 Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.3 Set Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3.4 Set Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.3.5 Cartesian Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3 Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1 Random Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1.1 Space Ω . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1.2 Event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1.3 Combined Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1.4 Probabilities and Statistics . . . . . . . . . . . . . . . . . . . . . . . 53
3.2 Axiomatic Formulation of Probability Theory . . . . . . . . . . . . . . . . 53
3.3 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.3.1 Definition of the Conditional Probability . . . . . . . . . . . . . 62
3.3.2 Total Probability Theorem . . . . . . . . . . . . . . . . . . . . . . . 63
3.3.3 Bayes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.3.4 Independence of Events . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4 Cartesian Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.1 Definition of a Random Variable . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2 Random Variables Treated Singly . . . . . . . . . . . . . . . . . . . . . . . . 76
4.2.1 Cumulative Distribution Function . . . . . . . . . . . . . . . . . . 76
4.2.2 The Probability Density Function (pdf) . . . . . . . . . . . . . . 84
4.3 Random Variables Treated Jointly . . . . . . . . . . . . . . . . . . . . . . . . 88
4.3.1 The Joint CDF of Two Random Variables . . . . . . . . . . . . 88
4.3.2 Joint pdf of X and Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.4 Conditional Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.4.1 Independence of Two Random Variables . . . . . . . . . . . . . 100
4.5 Functions of RVs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.5.1 CDFs of W and Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.5.2 pdfs of W and Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.5.3 The Joint CDF of W and Z . . . . . . . . . . . . . . . . . . . . . . . 104
5 Characterization of Random Variables . . . . . . . . . . . . . . . . . . . . . . . 109
5.1 Expected Value or Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.2 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.3 Covariance and Correlation Coefficient of Two
Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.4 Example Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.4.1 Uniform Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.4.2 Binomial Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.4.3 Exponential Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 132
6 Stochastic Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.1 Definition of Stochastic Process . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.2 Statistical Characterization of a Stochastic Process . . . . . . . . . . . . 138
6.2.1 First-Order Distributions . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.2.2 Second-Order Distributions . . . . . . . . . . . . . . . . . . . . . . . 140
6.3 Vector RVs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.3.1 Definition of Vector RVs . . . . . . . . . . . . . . . . . . . . . . . . 142
6.3.2 Multivariate Distributions . . . . . . . . . . . . . . . . . . . . . . . . 146
6.3.3 Complete Statistical Characterization . . . . . . . . . . . . . . . 147
6.4 Characteristic Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.4.1 Characteristic Function of a Scalar RV . . . . . . . . . . . . . . 148
6.4.2 Characteristic Function of a Vector RV . . . . . . . . . . . . . . 150
6.4.3 Independent Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.5 Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.5.1 nth-Order Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.5.2 Strict Sense Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.5.3 First-Order Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.5.4 Second-Order Stationarity . . . . . . . . . . . . . . . . . . . . . . . . 156
6.5.5 Wide Sense Stationarity (WSS) . . . . . . . . . . . . . . . . . . . . 158
6.5.6 (n + m)th-Order Joint Stationarity . . . . . . . . . . . . . . . . . . 159
6.5.7 Joint Second-Order Stationarity . . . . . . . . . . . . . . . . . . . . 160
6.5.8 Jointly WSS Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.6 Ergodicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.7 Parameters of a Stochastic Process . . . . . . . . . . . . . . . . . . . . . . . . 163
6.7.1 Mean and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.7.2 Autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.7.3 Autocovariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.7.4 Cross-correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.7.5 Cross-covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.8 Properties of the Autocorrelation of a WSS Process . . . . . . . . . . . 173
6.9 Parameter Vectors and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.9.1 Mean Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.9.2 Autocovariance Matrices . . . . . . . . . . . . . . . . . . . . . . . . 177
6.9.3 Cross-covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.9.4 Covariance Matrix of a Concatenated
Vector RV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.9.5 Linear Combination . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7 Gaussian Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.1 Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.2 Single Gaussian RV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
7.3 Two Jointly Gaussian RVs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
7.4 Vector Gaussian RV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.5 Characteristic Function of a Gaussian RV . . . . . . . . . . . . . . . . . . . 203
7.5.1 Characteristic Function of a Scalar
Gaussian RV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
7.5.2 Characteristic Function of a Gaussian
Vector RV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.6 Gaussian Stochastic Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.1 Analysis of Communications System . . . . . . . . . . . . . . . . . . . . . . 213
8.1.1 Linear Time-Invariant (LTI) System . . . . . . . . . . . . . . . . 213
8.1.2 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.1.3 Input-Output Relationship . . . . . . . . . . . . . . . . . . . . . . . . 218
8.1.4 White Noise Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
8.1.5 Properties of Gaussian RVs and Gaussian
Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
8.1.6 Input-Output Relations of a Stochastic Process . . . . . . . . . 225
8.2 Estimation Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
8.2.1 Estimation Problem Statement . . . . . . . . . . . . . . . . . . . . 229
8.2.2 Linear Minimum Mean Square Error
(MMSE) Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
8.3 Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
8.3.1 Kalman Filter: Scalar Case . . . . . . . . . . . . . . . . . . . . . . . 236
8.3.2 Kalman Filter: Vector Case . . . . . . . . . . . . . . . . . . . . . . . 238
8.4 Queuing Theory Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.4.1 Queueing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.4.2 General Description of Queueing . . . . . . . . . . . . . . . . . . 251
8.4.3 Point Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
8.4.4 Statistical Characterization of the Point Process
by the Counting Process . . . . . . . . . . . . . . . . . . . . . . . . . 253
8.4.5 Poisson Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
8.4.6 Poisson Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
8.4.7 Key Parameters of a Queueing System . . . . . . . . . . . . . . 259
8.4.8 Little’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
8.4.9 M/M/1 Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
About the Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Chapter 1
Introduction
Through many years of teaching probability and stochastic processes, the author
has observed that students find these subjects difficult because of difficulties with
three particular areas of mathematics—complex variables, linear algebra, and set
theory—which are used extensively as analysis tools in these subjects. This book
devotes a full chapter to explaining the essential elements of these three areas of
mathematics.
Probability and stochastic processes are essential mathematical theories
applied in communications systems analysis. In communications systems analysis,
electrical signals are often represented by complex random variables (RVs) and
complex stochastic processes. If the RVs and stochastic processes are complex
rather than real, the complexity of their analyses multiplies greatly. Analyses based
on real RVs and real stochastic processes are not adequate to deal with systems
involving complex signals. This book provides a comprehensive treatment of
complex RVs and complex stochastic processes, including extensive analysis and
derivation of parameters such as the mean, variance, autocorrelation,
autocovariance, cross-correlation, and cross-covariance of complex RVs and
complex stochastic processes, as well as the stationarity of complex stochastic
processes.
This book draws examples of applications of probability and stochastic
processes from various areas of communications, such as the modeling of the
additive white Gaussian noise (AWGN) communications channel, estimation
theory including Kalman filtering, and queueing theory. The Gaussian RV and the
Gaussian stochastic process are essential as applied to communications channels,
e.g., the AWGN model. This book devotes a full chapter to the Gaussian
distribution and its properties and presents a detailed and complete derivation of
the characteristic function of the vector Gaussian RV.
Derivation of the general form of the Kalman filter involves extensive matrix
and vector operations. Using the basic linear algebra reviewed in the prerequisite
chapter, this book derives and proves all the matrix identities required in the
derivation of the Kalman filter.
Some of the key topics and concepts discussed in this book include probability
axioms, Bernoulli trials; conditional probability; total probability; Bayes’ theorem;
independence of events; combined experiments; Cartesian product; Cumulative
Distribution Function (CDF); probability density function (pdf); mean, variance,
and autocorrelation function; stationarity; ergodicity; Gaussian process; estimation
theory; minimum mean square error (MMSE) estimation; Kalman filtering;
counting process; point process; Poisson process; queueing theory; and Little’s law.
The remainder of this book is organized as follows. Chapter 2 deals with the
prerequisite mathematical concepts of complex variables, matrix and vector
operations, and set theory, and Chap. 3 deals with probability theory with a focus
on the axiomatic approach to probability formulation. Chapters 4 and 5 deal with
RVs: Chap. 4 deals with the definition of RVs, the CDFs, the pdfs, and other
general topics on RVs; Chap. 5 deals with the parameters of RVs such as the mean,
variance, and covariance of both real and complex RVs; Chap. 6, stochastic
processes, both real and complex, with a significant portion of the chapter devoted
to the stationarity of a stochastic process, both real and complex; Chap. 7, the
Gaussian distribution; and, finally, Chap. 8, examples of applications of RVs and
stochastic processes drawn from the area of communications, such as AWGN
channel modeling, estimation theory, queueing theory, and properties of the
Gaussian distribution. Included at the end of the book are a bibliography, an index
of the terms used in this book, and a brief write-up about the author.
Chapter 2
Basic Mathematical Preliminaries
In this chapter, we review essential prerequisite concepts of complex variables,
linear algebra, and set theory required in this book. A reader familiar with these
subjects may skip this chapter without losing the logical flow of the material treated
in this book.
Complex variable operations are used in analyzing complex random variables
and complex stochastic processes. A basic understanding of linear algebra,
including vector and matrix operations such as matrix multiplication, matrix
inversion, and matrix diagonalization, is needed for understanding vector random
variables, multivariate distributions, and estimation theory. Finally, fundamental
concepts of set theory are needed for the discussion and formulation of probability
and random variables.
2.1 Complex Variables
This section defines a complex number and its complex conjugate and the four
basic algebraic operations performed on complex numbers—addition, subtraction,
multiplication, and division. This section also discusses the associative, the
commutative, and the distributive properties of the complex variable operations.
2.1.1 Definitions
A complex number x is defined by the following expression:
x = xr + jxi
where xr is called the real component and xi is called either the imaginary compo-
nent or the coefficient of the imaginary part. In this book, xi is called the imaginary
component and j the imaginary unit. The imaginary unit is also denoted by i. In this
book, j is used. The imaginary unit and the imaginary component put together, i.e.,
jxi, is called the imaginary part.
The imaginary unit j denotes
j = √(−1)
Therefore, the square of j is given by the following:
j^2 = −1
2.1.2 Complex Variable Operations
The four basic operations of complex variables—addition, subtraction,
multiplication, and division—are defined as follows.
Addition
Consider two complex numbers x1 and x2 with the real and imaginary components
denoted by the following equations, respectively:
x1 = x1r + jx1i
x2 = x2r + jx2i
The addition operation performed on the above two complex numbers produces
a new complex number, which is denoted by w with the real component wr and the
imaginary part jwi as follows:
x1 + x2 = w = wr + jwi
The addition operation is defined in such a way that the real component of w is
equal to the sum of the real components of x1 and x2 and the imaginary component
of w, the sum of the imaginary components of x1 and x2 as follows:
wr = x1r + x2r
wi = x1i + x2i
Therefore, we have the following addition rule of complex numbers:
x1 + x2 = x1r + jx1i + x2r + jx2i = (x1r + x2r) + j(x1i + x2i)   (2.1)
Subtraction
In algebraic operations, a number being subtracted is called the “subtrahend” and the
number it is subtracted from, the “minuend,” and the result of subtraction, the
“difference.” As in real algebra, a subtraction operation is defined as the inverse
operation of the addition operation. Let the minuend and the subtrahend be x1 and x2,
respectively, and the difference be denoted by w as follows:
w = x1 − x2
The subtraction operation is defined such that the sum of the difference w and the
subtrahend x2, obtained by using the addition operation already defined, produces
the minuend x1 as follows:
x2 + w = x1
By an addition operation, the left-hand side of the above equation becomes
x2 + w = x2r + jx2i + wr + jwi = (x2r + wr) + j(x2i + wi)
which should be equal to x1 as follows:
(x2r + wr) + j(x2i + wi) = x1r + jx1i
From the above equation, we derive the following conditions that the real and
imaginary components of the difference, wr and wi, must satisfy:
x2r + wr = x1r
x2i + wi = x1i
From the above equations, we obtain the following equations:
wr = x1r − x2r
wi = x1i − x2i
Therefore, a subtraction operation yields the following expression:
x1 − x2 = (x1r + jx1i) − (x2r + jx2i) = (x1r − x2r) + j(x1i − x2i)   (2.2)
Multiplication
As in real algebra, a multiplication operation is performed as follows:
x1x2 = (x1r + jx1i)(x2r + jx2i)
     = x1r x2r + j x1r x2i + j x1i x2r + j^2 x1i x2i
Noting that j^2 = −1, we rewrite the above equation as follows:
x1x2 = (x1r x2r − x1i x2i) + j(x1r x2i + x1i x2r)   (2.3)
Division
As with a real variable, a division operation is defined as the inverse operation of
the multiplication operation. Therefore, the quotient of dividing a numerator by a
denominator must be such that, if the quotient is multiplied by the denominator,
the numerator is recovered:
w = x1 / x2
or
wx2 = x1
Suppose now that a complex number x1 = x1r + jx1i is divided by a complex
number x2 = x2r + jx2i and the quotient is denoted by w = wr + jwi as follows:
w = wr + jwi = x1 / x2 = (x1r + jx1i) / (x2r + jx2i)   (2.4)
A division operation must produce the quotient w = wr + jwi such that
w multiplied by the denominator x2 produces the numerator x1:
wx2 = x1 = x1r + jx1i   (2.5)
By the multiplication rule given by (2.3), we obtain the following expression for
the left-hand side of the above equation:
wx2 = (wr + jwi)(x2r + jx2i)
    = (wr x2r − wi x2i) + j(wr x2i + wi x2r)
Equating the right-hand side of the above equation with the right-hand side of
(2.5), we obtain the following equation:
(wr x2r − wi x2i) + j(wr x2i + wi x2r) = x1r + jx1i
Equating the real and imaginary components of both sides of the above equation,
respectively, we obtain the following equations:
wr x2r − wi x2i = x1r
wr x2i + wi x2r = x1i
Solving the above two simultaneous equations with respect to the real and
imaginary components of the quotient to be defined, we obtain the following
equations:
wr = (x1r x2r + x1i x2i) / (x2r^2 + x2i^2)   (2.6a)
wi = (x2r x1i − x1r x2i) / (x2r^2 + x2i^2)   (2.6b)
so that the quotient w becomes the following:
w = x1 / x2 = (x1r x2r + x1i x2i) / (x2r^2 + x2i^2) + j (x2r x1i − x1r x2i) / (x2r^2 + x2i^2)   (2.7)
To perform a division of x1 by x2, we apply the same process used in real algebra
and confirm if the result agrees with the above definition. First, multiply the
numerator and the denominator by the complex conjugate of the denominator as
follows:
x1 / x2 = (x1r + jx1i) / (x2r + jx2i) = (x1r + jx1i)(x2r − jx2i) / ((x2r + jx2i)(x2r − jx2i))
By the multiplication rule, we obtain the following expression for the
denominator of the right-hand side of the above equation, where the imaginary
unit j has been eliminated:
(x2r + jx2i)(x2r − jx2i) = x2r^2 + x2i^2
By the multiplication rule, we obtain the following expression for the numerator:
(x1r + jx1i)(x2r − jx2i) = (x1r x2r + x1i x2i) + j(x1i x2r − x2i x1r)
and, thus, obtain the following expression for the division:
w = wr + jwi
  = ((x1r x2r + x1i x2i) + j(x1i x2r − x2i x1r)) / (x2r^2 + x2i^2)
  = (x1r x2r + x1i x2i) / (x2r^2 + x2i^2) + j (x1i x2r − x2i x1r) / (x2r^2 + x2i^2)   (2.8)
By comparing (2.8) with (2.7), we see that the result of a normal algebraic
division operation agrees with the definition of a division operation for the complex
number given by (2.7).
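As a numerical sanity check, the component formulas (2.6a) and (2.6b) can be compared with the complex division built into a language such as Python; the sketch below is an illustration under that assumption, with variable names mirroring the components used in the text.

```python
# Check the division rule (2.7) against Python's built-in complex arithmetic.
x1r, x1i = 3.0, 4.0     # x1 = 3 + j4
x2r, x2i = 1.0, -2.0    # x2 = 1 - j2

den = x2r**2 + x2i**2               # x2r^2 + x2i^2
wr = (x1r*x2r + x1i*x2i) / den      # (2.6a)
wi = (x2r*x1i - x1r*x2i) / den      # (2.6b)

w = complex(x1r, x1i) / complex(x2r, x2i)
assert abs(w - complex(wr, wi)) < 1e-12
print(wr, wi)   # -1.0 2.0
```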
2.1.3 Associative, Commutative, and Distributive Laws
of Algebra
The addition, subtraction, multiplication, and division operations of complex
numbers defined above follow the associative, commutative, and distributive laws
of algebra as shown below.
Associative
x1 + (x2 + x3) = (x1 + x2) + x3
x1(x2x3) = (x1x2)x3
Commutative
x1 + x2 = x2 + x1
x1x2 = x2x1
Distributive
x1(x2 + x3) = x1x2 + x1x3
2.1.4 Complex Conjugate
The complex conjugate or conjugate of the complex variable x = xr + jxi, which is
denoted by x*, is defined as follows:
x* = xr − jxi   (2.9)
The following equations hold true for the complex conjugate.
Theorem 2.1.1 The complex conjugate of the sum of two complex variables is equal
to the sum of the complex conjugates of the individual complex variables as follows:
(x1 + x2)* = x1* + x2*   (2.10)
Proof Let
x1 = x1r + jx1i
x2 = x2r + jx2i
Substitute the above two equations into the following operation:
(x1 + x2)* = {(x1r + jx1i) + (x2r + jx2i)}*
           = (x1r + x2r) − j(x1i + x2i) = (x1r − jx1i) + (x2r − jx2i)
           = x1* + x2*
Q.E.D.
The sum of a complex variable and its complex conjugate is equal to two times
its real component as shown below:
x + x* = xr + jxi + xr − jxi = 2xr   (2.11)
Theorem 2.1.2 The complex conjugate of the complex conjugate of a complex
variable is the original complex variable:
(x*)* = x   (2.12)
Proof
(x*)* = [(xr + jxi)*]* = [xr − jxi]* = xr + jxi = x
Q.E.D.
Theorem 2.1.3 If a complex variable is equal to its complex conjugate, the
variable is real; that is, if x = x*, then x is real.
Proof If x = x*, we have the following equation:
xr + jxi = xr − jxi
Rearranging the terms, we obtain the following equation:
2jxi = 0
or
xi = 0
Since the imaginary component is zero, the complex variable x is real.
Q.E.D.
Theorem 2.1.4
(x1x2)* = x1*x2*   (2.13a)
Proof By taking the complex conjugate of both sides of (2.3), we obtain the
following result:
(x1x2)* = {(x1r x2r − x1i x2i) + j(x1r x2i + x1i x2r)}*
        = (x1r x2r − x1i x2i) − j(x1r x2i + x1i x2r)
On the other hand, we have the following result:
x1*x2* = (x1r − jx1i)(x2r − jx2i) = (x1r x2r − x1i x2i) − j(x1r x2i + x1i x2r)
By comparing the above two results, we see that
(x1x2)* = x1*x2*
Q.E.D.
Theorem 2.1.5
(x1 / x2)* = x1* / x2*   (2.13b)
Proof By taking the complex conjugate of (2.7), we have the following:
(x1 / x2)* = (x1r x2r + x1i x2i) / (x2r^2 + x2i^2) − j (x2r x1i − x1r x2i) / (x2r^2 + x2i^2)
On the other hand, we obtain the right-hand side of the above equation by
evaluating the following equation:
x1* / x2* = (x1r − jx1i) / (x2r − jx2i) = (x1r − jx1i)(x2r + jx2i) / ((x2r − jx2i)(x2r + jx2i))
          = (x1r x2r + x1i x2i − j x2r x1i + j x1r x2i) / (x2r^2 + x2i^2)
          = (x1r x2r + x1i x2i) / (x2r^2 + x2i^2) − j (x2r x1i − x1r x2i) / (x2r^2 + x2i^2)
Q.E.D.
The absolute value of a complex variable x, which is denoted by |x|, is defined as
the square root of the sum of the squares of the real and imaginary components of
x as follows:
|x| = √(xr^2 + xi^2)   (2.14)
The absolute value of a complex variable x is called the magnitude of x.
The product of a complex variable x and its complex conjugate is a real variable
given by the following equation:
xx* = (xr + jxi)(xr − jxi) = (xr xr + xi xi) + j(xr xi − xr xi) = xr^2 + xi^2   (2.15)
By comparing (2.15) and (2.14), we obtain the following equation:
|x| = √(xx*)  or  xx* = |x|^2   (2.16)
By the Maclaurin series expansion, the following equation holds true:
e^(jω) = cos ω + j sin ω   (2.17)
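The conjugate and magnitude identities (2.11) and (2.16) and the Euler relation (2.17) admit the same kind of numerical check; a minimal Python sketch, assuming only the standard math and cmath modules:

```python
import cmath
import math

x = complex(2.0, -3.0)

assert x + x.conjugate() == 2 * x.real                 # (2.11): x + x* = 2 xr
assert abs(x * x.conjugate() - abs(x)**2) < 1e-12      # (2.16): x x* = |x|^2

w = 0.7
assert abs(cmath.exp(1j*w) - complex(math.cos(w), math.sin(w))) < 1e-12   # (2.17)
```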
2.2 Matrix Operations
In analyzing multivariate probability distributions, it is convenient to use matrix
operations. This section reviews the basics of the matrix and vector calculus that
will be used in subsequent discussions. Throughout this book, vectors are denoted
by boldface letters and matrices by double-struck letters such as 𝔸.
2.2.1 Definitions
Dimensions of a Matrix
The dimension of a matrix is denoted by (m × n), where m denotes the number of
rows of the matrix and n the number of columns of the matrix. An (m × 1) matrix is
a matrix of one column with m elements, that is, m rows, and is called an
m-dimensional column vector. A (1 × n) matrix is an n-dimensional row vector.
Sometimes, the dimension of the matrix may be shown as a subscript, as in 𝔸m×n.
A matrix may sometimes be denoted by
[aij]_{i,j=1}^{m,n}
which represents a matrix with the ijth element denoted by aij, where the row
number i runs from 1 to m and the column number j from 1 to n.
Square Matrix
If m = n, that is, (n × n), the matrix is called a square matrix. For an (n × n)
square matrix, the elements along the diagonal line of the matrix, that is, aii, are
called the diagonal elements. The elements which are not on the diagonal line, aij,
i ≠ j, are called the off-diagonal elements.
Diagonal Matrix
A matrix is called a diagonal matrix if all its off-diagonal elements are zero as
𝔸 = [ a11  ⋯   0
       ⋮   aii   ⋮
       0   ⋯  ann ]
Identity Matrix
A matrix 𝕄 is defined as the identity matrix, denoted by 𝕀, if an arbitrary matrix
𝔸 multiplied by 𝕄 results in the same matrix 𝔸: if 𝔸𝕄 = 𝕄𝔸 = 𝔸, then 𝕄 = 𝕀.
By this definition, it can be seen that the identity matrix is a diagonal matrix in
which all diagonal elements are equal to 1:
𝕀 = [ 1  ⋯  0
      ⋮   1   ⋮
      0  ⋯  1 ]
Transposed Matrix
An important special type of matrix that is used frequently in this book is the
transposed matrix. Given an (m × n) matrix 𝔸, the transposed matrix of 𝔸, denoted
by 𝔸^T, is the (n × m) matrix whose ijth element is equal to the jith element of 𝔸 for
i = 1, . . ., m and j = 1, . . ., n as follows:
𝔸 = [aij]_{i,j=1}^{m,n}   𝔸^T = [a′ij]_{i,j=1}^{n,m} = [aji]_{i,j=1}^{n,m}   a′ij = aji
The transposed matrix of a given matrix 𝔸 is obtained by taking the ith row of 𝔸
as the ith column of the new matrix, for i = 1, . . ., m.
Symmetric Matrix
A square matrix is called a symmetric matrix if all its off-diagonal elements are
equal to their mirror image elements across the diagonal line, that is, aij = aji,
i = 1, . . ., n, j = 1, . . ., n. For a symmetric matrix 𝔸, the following holds true:
𝔸 = 𝔸^T   (2.18)
Hermitian Matrix
One special type of matrix, particularly important for complex random variables
and stochastic processes, is the Hermitian matrix. A square matrix is called a
Hermitian, or self-adjoint, matrix if it satisfies the following two conditions:
1. The off-diagonal elements are the complex conjugates of their mirror image
elements across the diagonal line of the matrix:
aij = aji*,  i ≠ j,  i, j = 1, . . ., n   (2.19)
2. The diagonal elements are real:
aii = real number
In fact, the second condition is superfluous because we can simply use the first
condition without the restriction i ≠ j since, as shown by Theorem 2.1.3, if aii = aii*,
then aii is real, that is, the second condition follows.
The following matrix is an example of a Hermitian matrix.
[ 1        4 − j5    6
  4 + j5   2        j7
  6       −j7        3 ]
For a real matrix, where all its elements are real, the Hermitian matrix and the
symmetric matrix are the same because, with real numbers, the following holds
true:
aij = aji* = aji,  i ≠ j,  i, j = 1, . . ., n
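A Hermitian matrix can be recognized numerically as one that equals its own conjugate transpose. A minimal NumPy sketch, assuming NumPy is available, using the example matrix above:

```python
import numpy as np

# The example Hermitian matrix: real diagonal, and each off-diagonal element
# equal to the complex conjugate of its mirror image across the diagonal.
H = np.array([[1,      4 - 5j, 6 ],
              [4 + 5j, 2,      7j],
              [6,      -7j,    3 ]])

assert np.array_equal(H, H.conj().T)   # Hermitian test
```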
nth Power of a Matrix
A matrix 𝔸 raised to the power n, denoted by 𝔸^n, is defined as n successive
multiplications of 𝔸 as follows: 𝔸^n ≜ 𝔸 × 𝔸 × ⋯ × 𝔸.
Submatrix
Given a matrix 𝔸, a submatrix of 𝔸 is a matrix formed by striking out selected rows
and/or selected columns of 𝔸. For example, given
𝔸 = [ 1 2 3
      4 5 6
      7 8 9 ]
the submatrix formed by striking out the second row and the second column, the
submatrix formed by striking out the first row, and the submatrix formed by striking
out the first and the second rows are, respectively,
[ 1 3 ; 7 9 ]   [ 4 5 6 ; 7 8 9 ]   [ 7 8 9 ]
Partitioned Matrix or Block Matrix
A matrix may be divided into submatrices by inserting dividing lines between
selected rows and/or columns. For example, a given matrix 𝔸 can be partitioned
as shown below:
𝔸 = [ a11 a12 a13 | a14 a15
      a21 a22 a23 | a24 a25
      a31 a32 a33 | a34 a35
      −−− −−− −−− + −−− −−−
      a41 a42 a43 | a44 a45
      a51 a52 a53 | a54 a55 ]
The dashed lines partition the given matrix 𝔸 into four submatrices. The dashed
lines are inserted to show the submatrices and do not alter the given matrix 𝔸. The
submatrices may be denoted by new matrix notations, and the original matrix may
be shown as a matrix of submatrices as follows:
𝔸 = [ 𝔸11 | 𝔸12
      −−− + −−−
      𝔸21 | 𝔸22 ]
where
𝔸11 = [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ]   𝔸12 = [ a14 a15 ; a24 a25 ; a34 a35 ]
𝔸21 = [ a41 a42 a43 ; a51 a52 a53 ]   𝔸22 = [ a44 a45 ; a54 a55 ]
Inverse Matrix
For a given matrix 𝔸, if there exists a matrix 𝔹 such that the product of 𝔸 and 𝔹
produces the identity matrix 𝕀, 𝔹 is called the inverse matrix of 𝔸 and is denoted by
𝔸^−1; that is, if 𝔸𝔹 = 𝕀, then 𝔹 = 𝔸^−1.
Orthogonal Matrix
A matrix 𝔸 is called an orthogonal matrix if its inverse matrix is the same as its
transposed matrix as follows:
𝔸^−1 = 𝔸^T
2.2.2 Matrix Transposition
The following equations hold true with respect to the matrix transposition
operations.
(a) (𝔸^T)^T = 𝔸   (2.20)
(b) (α𝔸)^T = α𝔸^T
(c) (𝔸 + 𝔹)^T = 𝔸^T + 𝔹^T
(d) (𝔸𝔹)^T = 𝔹^T𝔸^T
Proof of (2.20d) Let
ℂ = 𝔸𝔹   𝔼 = 𝔹^T   𝔽 = 𝔸^T   𝔾 = ℂ^T   𝔻 = 𝔼𝔽 = 𝔹^T𝔸^T
By the definition of a transposed matrix, we have the following relationships:
eij = bji   fij = aji   gij = cji   (2.21)
By the multiplication operation defined by (2.29), we have
gik = cki = Σ_{j=1}^{n} akj bji   (2.22)
dik = Σ_{j=1}^{n} eij fjk   (2.23)
Substituting (2.21) into (2.23), we have
dik = Σ_{j=1}^{n} bji akj = Σ_{j=1}^{n} akj bji   (2.24)
By comparing (2.24) with (2.22), we obtain
gik = dik
That is,
𝔾 = 𝔻  or  (𝔸𝔹)^T = 𝔹^T𝔸^T
Q.E.D.
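Identity (2.20d) is easy to spot-check numerically; a minimal NumPy sketch with randomly generated, dimensionally compatible matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# (2.20d): the transpose of a product reverses the order of the factors.
assert np.allclose((A @ B).T, B.T @ A.T)
```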
Transposition of a Partitioned Matrix
(a) Let 𝕄 = [ 𝔸 ; 𝔹 ], that is, 𝔸 stacked on top of 𝔹. Then
𝕄^T = [ 𝔸 ; 𝔹 ]^T = [ 𝔸^T | 𝔹^T ]   (2.25)
(b) Let 𝕄 = [ 𝔸 | 0 ; 0 | 𝔹 ]. Then
𝕄^T = [ 𝔸^T | 0 ; 0 | 𝔹^T ]
Proof of (2.25a) By the definition of a transposed matrix, we have the following:
(1) 𝔸 = [aij]_{i,j=1}^{m,n}   𝔸^T = [aij^T]_{i,j=1}^{n,m} = [aji]_{i,j=1}^{n,m}   aij^T = aji
(2) 𝔹 = [bkj]_{k,j=1}^{p,n}   𝔹^T = [bkj^T]_{k,j=1}^{n,p} = [bjk]_{k,j=1}^{n,p}   bkj^T = bjk
(3) 𝕄 = [mlj]_{l,j=1}^{q,n}, q = m + p   mlj^T = mjl
As shown by the above three relations, 𝔸 has m rows and n columns, 𝔹 has p
rows and n columns, and 𝕄 has q = m + p rows and n columns. The first m rows
of 𝕄, that is, l = 1, . . ., m, coincide with those of 𝔸 and the next p rows, that is,
l = (m + 1), . . ., (m + p), with those of 𝔹, and, thus, we have the following
relations:
(4) mlj = aij   l = i = 1, . . ., m; j = 1, . . ., n
(5) mlj = bkj   l = m + k; k = 1, . . ., p; j = 1, . . ., n
Now, consider 𝕄^T. 𝕄^T has n rows and m + p columns. The first m columns of
𝕄^T coincide with those of 𝔸^T and the next p columns, with those of 𝔹^T. By the
relations (3), (4), and (5) above, we have the following relations:
(6) mlj^T = mjl = ail;   j = i = 1, . . ., m; l = 1, . . ., n
(7) mlj^T = mjl = bkl;   j = k + m; k = 1, . . ., p; l = 1, . . ., n
By comparing (6) and (7) with (1) and (2), we obtain the following relations:
(8) mlj^T = alj^T   l = 1, . . ., n; j = 1, . . ., m
(9) mlj^T = blj^T   l = 1, . . ., n; j = (m + 1), . . ., (m + p)
(8) and (9) above show that the first m columns of 𝕄^T coincide with those of 𝔸^T
and the next p columns of 𝕄^T, with those of 𝔹^T.
Q.E.D.
Proof of (2.25b) The matrix identity (b) follows from (a) as follows. Rewrite 𝕄 as
follows:
𝕄 = [ 𝔸 | 0 ; 0 | 𝔹 ] = [ ℕ ; 𝕂 ]
where
ℕ = [ 𝔸 | 0 ]   𝕂 = [ 0 | 𝔹 ]   (2.26)
Then, using (a), we have
ℕ^T = [ 𝔸^T ; 0 ]   𝕂^T = [ 0 ; 𝔹^T ]
Again, using (a), we have the following matrix identity:
𝕄^T = [ ℕ ; 𝕂 ]^T = [ ℕ^T | 𝕂^T ]
Substituting (2.26) into the above equation, we obtain the desired result as
follows:
𝕄^T = [ 𝔸^T | 0 ; 0 | 𝔹^T ]
Q.E.D.
Illustration of the Matrix Identity (2.25a)
The matrix identity (a) is illustrated below. Let
𝕄(m+n)×l = [ 𝔸m×l ; 𝔹n×l ] = [ a11 ⋯ a1j ⋯ a1l
                               ⋮       ⋮       ⋮
                               ai1 ⋯ aij ⋯ ail
                               ⋮       ⋮       ⋮
                               am1 ⋯ amj ⋯ aml
                               −−− −−− −−− −−− −−−
                               b11 ⋯ b1j ⋯ b1l
                               ⋮       ⋮       ⋮
                               bi1 ⋯ bij ⋯ bil
                               ⋮       ⋮       ⋮
                               bn1 ⋯ bnj ⋯ bnl ]
𝔸m×l = [ a11 ⋯ a1j ⋯ a1l ; ⋯ ; am1 ⋯ amj ⋯ aml ]
𝔹n×l = [ b11 ⋯ b1j ⋯ b1l ; ⋯ ; bn1 ⋯ bnj ⋯ bnl ]
Then, by the definition of the matrix transposition, we have the following
transpositions of the three matrices:
{𝕄(m+n)×l}^T = [ a11 ⋯ ai1 ⋯ am1 | b11 ⋯ bi1 ⋯ bn1
                 ⋮       ⋮       ⋮   |   ⋮       ⋮       ⋮
                 a1j ⋯ aij ⋯ amj | b1j ⋯ bij ⋯ bnj
                 ⋮       ⋮       ⋮   |   ⋮       ⋮       ⋮
                 a1l ⋯ ail ⋯ aml | b1l ⋯ bil ⋯ bnl ]
(𝔸m×l)^T = [ a11 ⋯ ai1 ⋯ am1 ; ⋯ ; a1l ⋯ ail ⋯ aml ]
(𝔹n×l)^T = [ b11 ⋯ bi1 ⋯ bn1 ; ⋯ ; b1l ⋯ bil ⋯ bnl ]   (2.27)
We see that the two submatrices of {𝕄(m+n)×l}^T are (𝔸m×l)^T and (𝔹n×l)^T.
Example 2.2.1
Consider a 5 × 3 matrix consisting of 3 × 3 and 2 × 3 submatrices 𝔸 and 𝔹 as
follows:
𝕄(3+2)×3 = [ a11 a12 a13
             a21 a22 a23
             a31 a32 a33
             −−− −−− −−−
             b11 b12 b13
             b21 b22 b23 ] = [ 𝔸3×3 ; 𝔹2×3 ]
𝔸3×3 = [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ]   𝔹2×3 = [ b11 b12 b13 ; b21 b22 b23 ]
Taking the transposition of the above three matrices, we obtain the following
matrices:
𝕄(3+2)×3^T = [ a11 a21 a31 b11 b21
               a12 a22 a32 b12 b22
               a13 a23 a33 b13 b23 ]
𝔸3×3^T = [ a11 a21 a31 ; a12 a22 a32 ; a13 a23 a33 ]
𝔹2×3^T = [ b11 b21 ; b12 b22 ; b13 b23 ]
By comparing 𝕄(3+2)×3^T, 𝔸3×3^T, and 𝔹2×3^T, we see that
𝕄(3+2)×3^T = [ 𝔸3×3^T | 𝔹2×3^T ]
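The block-transposition identity of Example 2.2.1 can be spot-checked with NumPy, which provides stacking (vstack) and side-by-side concatenation (hstack) directly; a minimal sketch with arbitrary numerical 3 × 3 and 2 × 3 blocks:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)    # plays the role of the 3 x 3 block
B = np.arange(10, 16).reshape(2, 3)   # plays the role of the 2 x 3 block

M = np.vstack([A, B])                 # M = [A; B], a (3+2) x 3 matrix
# (2.25a): the transpose of the stacked matrix is [A^T | B^T]
assert np.array_equal(M.T, np.hstack([A.T, B.T]))
```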
2.2.3 Matrix Multiplication
If the number of columns of 𝕄 is equal to the number of rows of ℕ, the matrix
multiplication 𝕄 × ℕ is defined as follows:
𝕃m×l = 𝕄m×n × ℕn×l
     = [ m11 ⋯ m1j ⋯ m1n        [ n11 ⋯ n1k ⋯ n1l
          ⋮       ⋮       ⋮          ⋮       ⋮       ⋮
         mi1 ⋯ mij ⋯ min    ×    nj1 ⋯ njk ⋯ njl
          ⋮       ⋮       ⋮          ⋮       ⋮       ⋮
         mm1 ⋯ mmj ⋯ mmn ]       nn1 ⋯ nnk ⋯ nnl ]
     = [ l11 ⋯ l1k ⋯ l1l
          ⋮      ⋮      ⋮
         li1 ⋯ lik ⋯ lil
          ⋮      ⋮      ⋮
         lm1 ⋯ lmk ⋯ lml ]   (2.28)
where
lik = mi1 n1k + ⋯ + mij njk + ⋯ + min nnk = Σ_{j=1}^{n} mij njk,   i = 1, . . ., m; k = 1, . . ., l   (2.29)
The result of the multiplication of an m × n matrix 𝕄 and an n × l matrix ℕ is an
m × l matrix 𝕃 consisting of the elements defined by (2.29). The resultant of the
multiplication, 𝕃, has m rows and l columns, which are the number of rows of the
first matrix and the number of columns of the second matrix, respectively.
Example 2.2.2
(1) [ 1 2 ; 3 4 ] [ 1 3 ; 2 4 ] = [ (1 × 1) + (2 × 2)   (1 × 3) + (2 × 4) ;
                                    (3 × 1) + (4 × 2)   (3 × 3) + (4 × 4) ] = [ 5 11 ; 11 25 ]
(2) [ 1 ; 3 ] [ 1 3 ] = [ (1 × 1)   (1 × 3) ; (3 × 1)   (3 × 3) ] = [ 1 3 ; 3 9 ]
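The element-by-element rule (2.29) can be coded directly and compared against a library matrix product. A minimal Python/NumPy sketch reproducing the first product of Example 2.2.2:

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])
N = np.array([[1, 3], [2, 4]])

# (2.29): l_ik = sum over j of m_ij * n_jk
L = np.zeros((2, 2))
for i in range(2):
    for k in range(2):
        L[i, k] = sum(M[i, j] * N[j, k] for j in range(2))

assert np.array_equal(L, M @ N)
print(L)   # [[ 5. 11.]
           #  [11. 25.]]
```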
The following equations hold true with the matrix multiplication:
(a) 𝔸(𝔹 + ℂ) = 𝔸𝔹 + 𝔸ℂ
(b) (𝔸 + 𝔹)ℂ = 𝔸ℂ + 𝔹ℂ
(c) 𝔸(𝔹ℂ) = (𝔸𝔹)ℂ
(d) 𝔸𝔹 ≠ 𝔹𝔸   (2.30)
(e) [ 𝔸 | 𝔹 ; ℂ | 𝔻 ] [ 𝔼 | 𝔽 ; 𝔾 | ℍ ] = [ 𝔸𝔼 + 𝔹𝔾 | 𝔸𝔽 + 𝔹ℍ ; ℂ𝔼 + 𝔻𝔾 | ℂ𝔽 + 𝔻ℍ ]
(f) [ 𝔸 ; 𝔹 ] [ 𝔸^T | 𝔹^T ] = [ 𝔸𝔸^T | 𝔸𝔹^T ; 𝔹𝔸^T | 𝔹𝔹^T ]
Illustration of (2.30f)
The matrix identity (f) is a special case of (2.30e). The matrix identity (f) is
illustrated below.
Let
𝕄 = [ 𝔸 ; 𝔹 ]
Then
𝕄𝕄^T = [ 𝔸 ; 𝔹 ] [ 𝔸^T | 𝔹^T ] = [ 𝔸𝔸^T | 𝔸𝔹^T ; 𝔹𝔸^T | 𝔹𝔹^T ]   (2.31)
The above equation is illustrated below.
Let
𝕄(m+n)×1 = [ 𝔸m×1 ; 𝔹n×1 ] = [ a1 ; ⋮ ; am ; b1 ; ⋮ ; bn ]
𝔸m×1 = [ a1 ; ⋮ ; am ]   𝔹n×1 = [ b1 ; ⋮ ; bn ]
Find the following five matrix products:
ℂ(m+n)×(m+n) = {𝕄(m+n)×1}{𝕄(m+n)×1}^T
 = [ a1 ; ⋮ ; am ; b1 ; ⋮ ; bn ] [ a1 ⋯ am | b1 ⋯ bn ]
 = [ a1a1 ⋯ a1am | a1b1 ⋯ a1bn
      ⋮        ⋮    |    ⋮        ⋮
     ama1 ⋯ amam | amb1 ⋯ ambn
     −−−− −−−− −−−− + −−−− −−−− −−−−
     b1a1 ⋯ b1am | b1b1 ⋯ b1bn
      ⋮        ⋮    |    ⋮        ⋮
     bna1 ⋯ bnam | bnb1 ⋯ bnbn ]
𝔸𝔸^T = [ a1 ; ⋮ ; am ] [ a1 ⋯ am ] = [ a1a1 ⋯ a1am ; ⋮ ; ama1 ⋯ amam ]
𝔹𝔹^T = [ b1 ; ⋮ ; bn ] [ b1 ⋯ bn ] = [ b1b1 ⋯ b1bn ; ⋮ ; bnb1 ⋯ bnbn ]
𝔸𝔹^T = [ a1 ; ⋮ ; am ] [ b1 ⋯ bn ] = [ a1b1 ⋯ a1bn ; ⋮ ; amb1 ⋯ ambn ]
𝔹𝔸^T = (𝔸𝔹^T)^T = [ b1a1 ⋯ b1am ; ⋮ ; bna1 ⋯ bnam ]
By comparing the last four matrix products with the four submatrices of the first
matrix product ℂ(m+n)×(m+n), we see that they are indeed the same, so that (2.31)
results.
It will be useful later in the covariance analysis to note that, by (2.77), ℂ is
symmetric. This can be confirmed by noting that the two diagonal submatrices are
symmetric and the transposition of the upper right submatrix is the same as the
lower left submatrix.
Example 2.2.3
Let
𝕄 = [ 𝔸 ; 𝔹 ]   𝔸 = [ 1 2 ; 3 4 ]   𝔹 = [ 1 3 ]
(1) 𝔸𝔸^T = [ 1 2 ; 3 4 ] [ 1 3 ; 2 4 ] = [ (1 × 1) + (2 × 2)   (1 × 3) + (2 × 4) ;
                                            (3 × 1) + (4 × 2)   (3 × 3) + (4 × 4) ] = [ 5 11 ; 11 25 ]
(2) 𝔹𝔹^T = [ 1 3 ] [ 1 ; 3 ] = [ 10 ] = 10
(3) 𝔸𝔹^T = [ 1 2 ; 3 4 ] [ 1 ; 3 ] = [ (1 × 1) + (2 × 3) ; (3 × 1) + (4 × 3) ] = [ 7 ; 15 ]
(4) 𝔹𝔸^T = [ 1 3 ] [ 1 3 ; 2 4 ] = [ (1 × 1) + (3 × 2)   (1 × 3) + (3 × 4) ] = [ 7 15 ]
We now find 𝕄𝕄^T first by direct multiplication and then by the multiplication of
the partitioned 𝕄 as follows:
𝕄𝕄^T = [ 1 2 ; 3 4 ; 1 3 ] [ 1 3 1 ; 2 4 3 ] = [ 5 11 7 ; 11 25 15 ; 7 15 10 ]
Suppose that we have already obtained the matrix products (1)–(4); then 𝕄𝕄^T
can be obtained simply by inserting the products as the submatrices as follows:
𝕄𝕄^T = [ 𝔸𝔸^T | 𝔸𝔹^T ; 𝔹𝔸^T | 𝔹𝔹^T ] = [ 5  11 |  7
                                           11 25 | 15
                                           −− −− + −−
                                           7  15 | 10 ]
Notice that 𝕄𝕄^T is a symmetric matrix in which the diagonal submatrices are
symmetric and the off-diagonal submatrices are the transpositions of one another.
Note also that 𝔸𝔹^T = (𝔹𝔸^T)^T.
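The partitioned product (2.31) can likewise be assembled block by block and compared with the direct product; a minimal NumPy sketch of Example 2.2.3, where np.block builds a matrix from submatrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[1, 3]])
M = np.vstack([A, B])          # the partitioned 3 x 2 matrix of Example 2.2.3

# (2.30f)/(2.31): M M^T assembled from the four block products
blocks = np.block([[A @ A.T, A @ B.T],
                   [B @ A.T, B @ B.T]])

assert np.array_equal(M @ M.T, blocks)
print(blocks)   # [[ 5 11  7]
                #  [11 25 15]
                #  [ 7 15 10]]
```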
2.2.4 Determinants
Theorem 2.2.1
det(𝔸^−1) = (det 𝔸)^−1   (2.32)
The determinant of a matrix 𝔸 is defined by
det 𝔸 ≜ Σ_j (−1)^t(j) a1j1 a2j2 . . . anjn   (2.33)
where the number j below identifies the jth permutation of the n! permutations of
the numbers 1 through n
j = (j1, j2, . . ., jk, . . ., jn)   j varies over all n! permutations of 1, 2, . . ., n   (2.34)
and t(j) is the total number of inversions in the permutation j. For example, for
n = 3, there are six permutations of numbers 1, 2, and 3. Then,
(j1, j2, j3) = (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)   (2.35)
The number j is used for the identification purpose, and the order in which a
permutation is identified is not significant. For example, the permutation (1, 2, 3)
may be identified by j = 1 as in 1 = (1, 2, 3). The same permutation may be
identified by 6 = (1, 2, 3), and the permutation (3, 2, 1) may be identified by
1 = (3, 2, 1). aijk is the element of the ith row and the jk th column, where jk is the
kth number in the permutation j.
For illustration purposes, identify the fourth permutation in the above equation
by j = 4 to write 4 = (2, 3, 1) and consider the corresponding term in (2.33). For
j = 4, we have
(j1, j2, j3) = (2, 3, 1)
Then, t(4) is obtained by counting the number of times the numbers in the
permutation are inversed as follows: 2 and 1 are inversed and 3 and 1 are inversed,
and so t(4) = 2:
(−1)^t(4) a1j1 a2j2 a3j3 = (−1)^2 a12 a23 a31 = a12 a23 a31
Theorem 2.2.2 If 𝔸 is a diagonal matrix with the diagonal elements aii, i = 1, . . .,
n, its inverse matrix is a diagonal matrix with the diagonal elements 1/aii,
i = 1, 2, . . ., n as follows:
𝔸^−1 = [ 1/a11  ⋯    0
           ⋮     ⋱     ⋮
           0    ⋯   1/ann ]   (2.36)
The determinants of 𝔸 and 𝔸^−1 are given below:
det 𝔸 = Π_{i=1}^{n} aii   det 𝔸^−1 = Π_{i=1}^{n} (1/aii)   (2.37)
2.2.5 Matrix Inversion
Given a square matrix 𝔸, finding its inverse matrix 𝔸^−1 involves finding the
determinant, the minors, the cofactors, and the adjoint matrix of 𝔸. The
determinant is discussed in the previous section. This section discusses the minors,
the cofactors, and the adjoint matrix of 𝔸 and shows how to find 𝔸^−1.
Minor
A minor Mij of 𝔸 is defined to be the determinant of the submatrix of 𝔸 obtained by
striking out the ith row and the jth column:
Mij ≜ det of submatrix of 𝔸   (2.38)
Cofactor
The cofactor of 𝔸, denoted by Aij, is defined as the minor of 𝔸 given by (2.38)
prepended by the sign (−1)^(i+j) as follows:
Aij = (−1)^(i+j) Mij   (2.39)
Adjoint Matrix
The adjoint matrix of a matrix 𝔸, denoted by adj 𝔸, is defined by the
following matrix, which is the transposed matrix of the matrix consisting of the
cofactors of 𝔸:
adj 𝔸 ≜ ([Aij]_{i,j=1}^{n})^T   (2.40)
Inverse Matrix
The inverse matrix of a matrix 𝔸 is given by the following equation:
𝔸^−1 = (1 / det 𝔸) adj 𝔸   (2.41)
Example 2.2.4
Consider the case of n = 3 and obtain the inverse 𝕏^−1 of
𝕏 = [ C11 C12 C13
      C21 C22 C23
      C31 C32 C33 ]   (2.42)
Identify the six permutations of (1, 2, 3) given by (2.35) using the index j = 1 ~ 6
as follows:
1 = (j1, j2, j3) = (1, 2, 3)   2 = (j1, j2, j3) = (1, 3, 2)   3 = (j1, j2, j3) = (2, 1, 3)
4 = (j1, j2, j3) = (2, 3, 1)   5 = (j1, j2, j3) = (3, 1, 2)   6 = (j1, j2, j3) = (3, 2, 1)
   (2.43)
Use the three numbers in each permutation as the second subscripts of the Cij as
follows:
C11C22C33   C11C23C32   C12C21C33   C12C23C31   C13C21C32   C13C22C31   (2.44)
The total numbers of inversions are as follows:
t(1) = 0   t(2) = 1   t(3) = 1   t(4) = 2   t(5) = 2   t(6) = 3   (2.45)
Substitute (2.44) and (2.45) into the following equation:
det 𝕏 = Σ_{j=1}^{6} (−1)^t(j) C1j1 C2j2 C3j3
      = (−1)^0 C11C22C33 + (−1)^1 C11C23C32 + (−1)^1 C12C21C33
        + (−1)^2 C12C23C31 + (−1)^2 C13C21C32 + (−1)^3 C13C22C31
      = C11C22C33 − C11C23C32 − C12C21C33 + C12C23C31 + C13C21C32 − C13C22C31
   (2.46)
To determine the adjoint matrix adj 𝕏, first determine its minors as follows:
M11 = det [ C22 C23 ; C32 C33 ] = C22C33 − C23C32
M12 = det [ C21 C23 ; C31 C33 ] = C21C33 − C23C31
M13 = det [ C21 C22 ; C31 C32 ] = C21C32 − C22C31
M21 = det [ C12 C13 ; C32 C33 ] = C12C33 − C13C32
M22 = det [ C11 C13 ; C31 C33 ] = C11C33 − C13C31
M23 = det [ C11 C12 ; C31 C32 ] = C11C32 − C12C31
M31 = det [ C12 C13 ; C22 C23 ] = C12C23 − C13C22
M32 = det [ C11 C13 ; C21 C23 ] = C11C23 − C13C21
M33 = det [ C11 C12 ; C21 C22 ] = C11C22 − C12C21   (2.47)
Aij = (−1)^(i+j) Mij   (2.48)
Substituting the above into the following equation, we obtain the adjoint matrix,
in which the minors are given by (2.47):
adj 𝕏 ≜ ([Aij]_{i,j=1}^{3})^T = [ A11 A12 A13 ; A21 A22 A23 ; A31 A32 A33 ]^T
      = [ M11 −M12 M13 ; −M21 M22 −M23 ; M31 −M32 M33 ]^T
      = [ M11 −M21 M31 ; −M12 M22 −M32 ; M13 −M23 M33 ]   (2.49)
By substituting (2.46) and (2.49) into (2.41), we obtain
𝕏^−1 = (1 / det 𝕏) adj 𝕏
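Equations (2.38) through (2.41) translate directly into a short routine. A minimal NumPy sketch, where the test matrix is an arbitrary invertible example rather than one from the text:

```python
import numpy as np

def adjugate_inverse(C):
    """Invert a square matrix via (2.41): C^-1 = adj(C) / det(C)."""
    n = C.shape[0]
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # Minor M_ij per (2.38): strike out row i and column j
            sub = np.delete(np.delete(C, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(sub)   # cofactor (2.39)
    return cof.T / np.linalg.det(C)   # adj C is the transposed cofactor matrix (2.40)

C = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
assert np.allclose(adjugate_inverse(C), np.linalg.inv(C))
```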
Example 2.2.5
For n = 5, determine the total number of inversions for the permutation j = (2, 4,
3, 5, 1): 2 before 1 is one inversion; 4 before 3 and 1 is two inversions; 3 before 1 is
one inversion; 5 before 1 is one inversion. Thus, t(j) = 5.
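Both the inversion count t(j) and the permutation sum (2.33) can be coded in a few lines and checked against a library determinant; a minimal Python sketch (the 3 × 3 test matrix is an arbitrary example):

```python
import itertools
import numpy as np

def inversions(perm):
    # t(j): the number of pairs appearing out of order in the permutation
    return sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
               if perm[a] > perm[b])

assert inversions((2, 4, 3, 5, 1)) == 5     # Example 2.2.5

def det_by_permutations(A):
    # Determinant via (2.33): a sum over all n! permutations
    n = A.shape[0]
    return sum((-1) ** inversions(p) * np.prod([A[i, p[i]] for i in range(n)])
               for p in itertools.permutations(range(n)))

A = np.array([[1., 2., 3.], [0., 4., 5.], [1., 0., 6.]])
assert np.isclose(det_by_permutations(A), np.linalg.det(A))
```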
Theorem 2.2.3 If 𝔸 is symmetric, its inverse matrix 𝔸^−1 is also symmetric.
Proof By the definition of an inverse matrix, we have
𝔸𝔸^−1 = 𝕀   (2.50)
By taking the transposition of both sides of the above, we have
(𝔸𝔸^−1)^T = (𝔸^−1)^T 𝔸^T = 𝕀^T = 𝕀   (2.51)
Since 𝔸 is symmetric, substituting 𝔸 = 𝔸^T into the above, we have
(𝔸^−1)^T 𝔸 = 𝕀   (2.52)
Multiplying both sides of the above from the right with 𝔸^−1 yields
(𝔸^−1)^T 𝔸𝔸^−1 = 𝔸^−1
which yields
(𝔸^−1)^T = 𝔸^−1   (2.53)
By the definition of the symmetric matrix given by (2.18), 𝔸^−1 is symmetric.
Q.E.D.
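Theorem 2.2.3 can also be spot-checked numerically; a minimal NumPy sketch with an arbitrary symmetric matrix:

```python
import numpy as np

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])     # symmetric: A equals A.T

Ainv = np.linalg.inv(A)
assert np.allclose(Ainv, Ainv.T)   # the inverse is symmetric as well
```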
2.2.6 Matrix Diagonalization
A matrix 𝔸 can be diagonalized by obtaining a diagonalizing matrix ℙ and
performing the following matrix operation:
ℙ^−1 𝔸 ℙ = Λ,   det ℙ ≠ 0
where Λ is a diagonal matrix with the eigenvalues of 𝔸 as the diagonal elements as
follows:
Λ = [ λ1 ⋯  0
      ⋮   ⋱   ⋮
      0  ⋯ λn ],   λi = eigenvalue, i = 1, . . ., n
To find the diagonalizing matrix ℙ, find the n eigenvalues of 𝔸 by solving the
following equation:
det(𝔸 − λ𝕀) = 0
Then, obtain the eigenvectors corresponding to the eigenvalues from the
following n linearly independent equations:
𝔸bi = λi bi,   i = 1, . . ., n
where the components of the eigenvectors are denoted as follows:
bi = [ b1i ; b2i ; . . . ; bki ; . . . ; bni ],   i = 1, . . ., n
If 𝔸 is symmetric, ℙ is an orthogonal matrix, that is, ℙ^−1 = ℙ^T. In addition, ℙ is
orthonormal. The eigenvectors are orthogonal to one another and their norms are
unity. Therefore, the inner products, ⟨·, ·⟩, of the eigenvectors are given by
⟨bi, bj⟩ = Σ_{k=1}^{n} bki bkj = δij
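A library eigendecomposition carries out exactly this diagonalization. A minimal NumPy sketch using the matrix of Example 2.2.6 below (note that this matrix is not symmetric, so the diagonalizing matrix returned here is not orthogonal):

```python
import numpy as np

A = np.array([[7., -3.], [10., -4.]])   # the matrix of Example 2.2.6

lam, P = np.linalg.eig(A)               # eigenvalues and eigenvector matrix P
D = np.linalg.inv(P) @ A @ P            # P^-1 A P should be diagonal

assert np.allclose(D, np.diag(lam))
print(np.sort(lam))                     # [1. 2.], as found in Example 2.2.6
```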
Example 2.2.6
Diagonalize
𝔸 = [ 7 −3 ; 10 −4 ]
Solution
To find the eigenvalues, solve
det(𝔸 − λ𝕀) = det [ 7 − λ   −3 ; 10   −4 − λ ] = (λ − 7)(λ + 4) + 30
            = λ^2 − 3λ + 2 = (λ − 2)(λ − 1) = 0
Solving the above, we have the eigenvalues λ1 = 2, λ2 = 1. Find the eigenvectors
by solving the following equations corresponding to the two eigenvalues:
𝔸b1 = λ1 b1   𝔸b2 = λ2 b2
For λ1 = 2:
[ 7 −3 ; 10 −4 ] [ b11 ; b21 ] = 2 [ b11 ; b21 ]
7b11 − 3b21 = 2b11, which gives b11 = (3/5) b21
10b11 − 4b21 = 2b21, which also gives b11 = (3/5) b21
The first eigenvector is given by the following, where α1 is an arbitrary constant: