Pure Mathematics For Beginners

Published by Rafa, 2022-03-01 01:11:32


Example 14.9:

1. The discrete topology on any set X is a T_4-space. Indeed, if A and B are disjoint closed subsets
of X, then A and B are also disjoint open subsets of X (because all subsets of X are both open
and closed).

Every T_4-space is a T_3-space. This follows easily from the fact that a T_4-space is a T_1-space and
Theorem 14.2. It follows that except for the discrete topology, every other topology on a finite
set is not a T_4-space.

2. The standard topologies on ℝ and ℂ are both T_4. This follows immediately from Problem 14
below.

3. In Problem 15 below, you will see a T_3-space that is not a T_4-space.

The definitions of T_0, T_1, T_2, T_3, and T_4 are called separation axioms because they all involve
“separating” points and/or closed sets from each other by open sets.

We will now look at two more types of topological spaces that appear frequently in mathematics.

A metric space is a pair (X, d), where X is a set and d is a function d : X × X → ℝ with the following
properties:

1. For all x, y ∈ X, d(x, y) = 0 if and only if x = y.
2. For all x, y ∈ X, d(x, y) = d(y, x).
3. For all x, y, z ∈ X, d(x, z) ≤ d(x, y) + d(y, z).

The function d is called a metric or distance function. It is a consequence of the definition that for all
x, y ∈ X, d(x, y) ≥ 0. You will be asked to prove this in Problem 2 below.

If (X, d) is a metric space, a ∈ X, and r ∈ ℝ+, then the open ball centered at a with radius r, written
B_r(a) (or B_r(a; d) if we need to distinguish this metric from other metrics), is the set of all elements
of X whose distance to a is less than r. That is,

B_r(a) = {x ∈ X | d(a, x) < r}.
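The three metric properties and the open-ball definition can be tested mechanically on a finite sample of points. The sketch below is illustrative only: the names `is_metric` and `open_ball`, the sample set, and the two candidate distance functions are assumptions made here. Checking finitely many points can refute the metric properties, but it cannot prove them for an infinite set.

```python
from itertools import product

def is_metric(points, d):
    """Check the three metric properties on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if (d(x, y) == 0) != (x == y):      # property 1
            return False
        if d(x, y) != d(y, x):              # property 2 (symmetry)
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):     # property 3 (triangle inequality)
            return False
    return True

def open_ball(points, d, center, radius):
    """Elements of the sample whose distance to `center` is less than `radius`."""
    return {x for x in points if d(center, x) < radius}

sample = [-2, -1, 0, 1, 2, 3]

def usual(x, y):
    return abs(x - y)

def squared(x, y):
    # Hypothetical non-example: symmetric, zero only on the diagonal,
    # but it fails the triangle inequality (take x = 0, y = 1, z = 2).
    return (x - y) ** 2

print(is_metric(sample, usual))
print(is_metric(sample, squared))
print(open_ball(sample, usual, 0, 2))
```

Integers are used in the sample so that every comparison is exact.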

The collection ℬ = {B_r(a) | a ∈ X ∧ r ∈ ℝ+} covers X. Indeed, if a ∈ X, then d(a, a) = 0 < 1, and so,
a ∈ B_1(a).

Also, the collection ℬ = {B_r(a) | a ∈ X ∧ r ∈ ℝ+} has the intersection containment property. To see
this, let x ∈ B_r(a) ∩ B_s(b) and k = min{r − d(a, x), s − d(b, x)}. Since d(a, x) < r and d(b, x) < s,
we have k > 0. We have x ∈ B_k(x) because d(x, x) = 0 < k. Now, let y ∈ B_k(x). Then d(x, y) < k.
So, we have

d(a, y) ≤ d(a, x) + d(x, y) < d(a, x) + k ≤ d(a, x) + r − d(a, x) = r.

So, y ∈ B_r(a). A similar argument shows that y ∈ B_s(b). So, y ∈ B_r(a) ∩ B_s(b). It follows that
B_k(x) ⊆ B_r(a) ∩ B_s(b).

[Figure: the open balls B_r(a) and B_s(b) drawn as intersecting open disks, with the open ball B_k(x)
inside the intersection, where k is the smaller of r − d(a, x) and s − d(b, x).]

This verifies that ℬ has the intersection containment property.

Since the collection of open balls covers X and has the intersection containment property, it follows
that this collection is a basis for a topology on X.
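The choice of radius in the intersection containment argument can be spot-checked numerically. The following is a minimal sketch in ℂ with the usual distance; the centers, radii, point, and helper names are all hypothetical values picked for illustration. Random sampling gives evidence for the inclusion, not a proof of it.

```python
import random

def d(z, w):
    """Usual distance between two complex numbers."""
    return abs(z - w)

# Two overlapping open balls, one centered at a with radius r and one
# centered at b with radius s, and a point x lying in both of them.
a, r = 0 + 0j, 2.0
b, s = 1.5 + 0.5j, 1.5
x = 1 + 0.25j
assert d(a, x) < r and d(b, x) < s

# Radius chosen as in the argument: the smaller of the two remaining gaps.
k = min(r - d(a, x), s - d(b, x))

def random_point_in_ball(center, radius, rng):
    """Rejection-sample a point whose distance to `center` is less than `radius`."""
    while True:
        offset = complex(rng.uniform(-radius, radius), rng.uniform(-radius, radius))
        if abs(offset) < radius:
            return center + offset

rng = random.Random(0)
samples = [random_point_in_ball(x, k, rng) for _ in range(10_000)]

# Every sampled point of the small ball should lie in both original balls.
all_inside = all(d(a, y) < r and d(b, y) < s for y in samples)
print(k > 0, all_inside)
```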

Note: Open balls can be visualized as open intervals on the real line ℝ, open disks in the Complex Plane
ℂ (or ℝ²), or open balls in three-dimensional space ℝ³.

When proving theorems about metric spaces, it’s usually most useful to visualize open balls as open
disks in ℂ. This does not mean that all metric spaces look like ℂ. The visualization should be used as
evidence that a theorem might be true. Of course, a detailed proof still needs to be written.

This is exactly what we did when we drew the picture above. That picture represents the open balls
B_r(a) and B_s(b) as intersecting open disks. Inside this intersection, we can see the open ball B_k(x).
The reader may also want to draw another picture to help visualize the triangle inequality. A picture
similar to this is drawn to the right of Note 1 following the proof of Theorem 7.4 in Lesson 7.

A topological space (X, τ) is metrizable if there is a metric d : X × X → ℝ such that τ is generated from
the open balls in (X, d). We also say that the metric d induces the topology τ.

Example 14.10:
1. (ℂ, d) is a metric space, where d : ℂ × ℂ → ℝ is defined by d(z, w) = |z − w|. Let’s check that
the 3 properties of a metric space are satisfied. Property 3 is the Triangle Inequality (Theorem
7.3 and Problem 4 in Problem Set 7). Let’s verify the other two properties. Let z = x + yi and
w = u + vi. Then d(z, w) = |z − w| = √((x − u)² + (y − v)²). So, d(z, w) = 0 if and only if
√((x − u)² + (y − v)²) = 0 if and only if (x − u)² + (y − v)² = 0 if and only if x − u = 0 and
y − v = 0 if and only if x = u and y = v if and only if z = w. So, property 1 holds. We have

d(z, w) = |z − w| = |−(w − z)| = |−1(w − z)| = |−1||w − z| = 1|w − z| = d(w, z).

Therefore, property 2 holds.
If a ∈ ℂ and r ∈ ℝ+, then the open ball B_r(a) is the set B_r(a) = {z ∈ ℂ | |z − a| < r}. This is
just an open disk in the complex plane, as we defined in Lesson 7.

Since the collection of open disks in the complex plane generates the standard topology on ℂ,
we see that ℂ with the standard topology is a metrizable space.
2. Similarly, (ℝ, d) is a metric space, where d : ℝ × ℝ → ℝ is defined by d(x, y) = |x − y|. The
proof is similar to the proof above for (ℂ, d).
In this case, the open ball B_r(a) is the open interval (a − r, a + r). To see this, observe that we
have

B_r(a) = {x ∈ ℝ | |x − a| < r} = {x ∈ ℝ | −r < x − a < r}
= {x ∈ ℝ | a − r < x < a + r} = (a − r, a + r).

Since the collection of bounded open intervals of real numbers generates the standard topology
on ℝ, we see that ℝ with the standard topology is a metrizable space.

3. Define the functions d_1 and d_2 from ℂ × ℂ to ℝ by d_1(z, w) = |Re z − Re w| + |Im z − Im w|
and d_2(z, w) = max{|Re z − Re w|, |Im z − Im w|}. In Problem 7 below, you will be asked to
verify that (ℂ, d_1) and (ℂ, d_2) are metric spaces that induce the standard topology on ℂ.

So, we see that the topology of a metrizable space can be induced by many different metrics.

The open balls B_r(a; d_1) and B_r(a; d_2) are both interiors of squares. For example, the unit open
ball in the metric d_1 is B_1(0; d_1) = {z ∈ ℂ | d_1(0, z) < 1} = {z ∈ ℂ | |Re z| + |Im z| < 1},
which is the interior of a square with vertices 1, i, −1, and −i. Similarly, the unit open ball in the
metric d_2 is B_1(0; d_2) = {z ∈ ℂ | d_2(0, z) < 1} = {z ∈ ℂ | max{|Re z|, |Im z|} < 1}, which
is the interior of a square with vertices 1 + i, −1 + i, −1 − i, and 1 − i.

[Figure: the unit open balls B_1(0; d_1) and B_1(0; d_2). The labeled boundary points are 1/2 + (1/2)i
on the first square, and 7/8 + i and 1 + (3/4)i on the second square, where max{7/8, 1} = 1 and
max{1, 3/4} = 1.]
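To make the two square-shaped unit balls concrete, the sketch below implements the taxicab-style metric and the maximum metric on ℂ next to the usual one and tests membership in each unit ball. The function names `d`, `d1`, `d2`, and `in_unit_ball` and the test points are labels assumed here for illustration. The final grid check illustrates the chain of inequalities max ≤ usual ≤ sum ≤ 2 · max, which is consistent with all three metrics inducing the same topology but does not replace the proof requested in Problem 7.

```python
def d(z, w):
    """Usual metric on the complex plane."""
    return abs(z - w)

def d1(z, w):
    """Sum of the distances between real parts and between imaginary parts."""
    return abs(z.real - w.real) + abs(z.imag - w.imag)

def d2(z, w):
    """Larger of the distances between real parts and between imaginary parts."""
    return max(abs(z.real - w.real), abs(z.imag - w.imag))

def in_unit_ball(metric, z):
    """True when z lies in the open ball of radius 1 about 0 for `metric`."""
    return metric(0j, z) < 1

z = 0.6 + 0.6j
membership = [in_unit_ball(m, z) for m in (d1, d, d2)]
print(membership)  # z is outside the d1 ball but inside the other two

# Compare the three distances from 0 on a grid of points (small tolerance
# for floating-point rounding in abs()).
grid = [complex(p / 4, q / 4) for p in range(-8, 9) for q in range(-8, 9)]
tol = 1e-12
nested = all(
    d2(0j, w) <= d(0j, w) + tol
    and d(0j, w) <= d1(0j, w) + tol
    and d1(0j, w) <= 2 * d2(0j, w) + tol
    for w in grid
)
print(nested)
```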

4. We can turn any nonempty set X into a metric space by defining d : X × X → ℝ by

d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y.

Properties 1 and 2 are obvious. For Property 3, let x, y, z ∈ X. If x = z, then d(x, z) = 0, and
so, d(x, z) = 0 ≤ d(x, y) + d(y, z). If x ≠ z, then d(x, z) = 1. Also, y cannot be equal to both x
and z (otherwise y = x ∧ y = z → x = z). So, d(x, y) = 1 or d(y, z) = 1 (or both).
Therefore, d(x, y) + d(y, z) ≥ 1 = d(x, z).

If r > 1, then B_r(x) = X and if 0 < r ≤ 1, then B_r(x) = {x}. It follows that every singleton set
{x} is open and therefore, (X, d) induces the discrete topology on X.
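The behavior of open balls under the 0/1 metric is easy to see on a small set. This is a minimal sketch: the four-element set `X` and the helper names are hypothetical choices made for the illustration.

```python
from itertools import product

X = {"a", "b", "c", "d"}

def discrete(x, y):
    """The 0/1 metric: distance 0 to itself, distance 1 to anything else."""
    return 0 if x == y else 1

def open_ball(center, radius):
    """Open ball in X for the 0/1 metric."""
    return {x for x in X if discrete(center, x) < radius}

# Radii at most 1 give singletons; radii above 1 give the whole set.
print(open_ball("a", 0.5), open_ball("a", 1))
print(open_ball("a", 1.5) == X)

# Triangle inequality over every triple of points of X.
triangle_ok = all(
    discrete(x, z) <= discrete(x, y) + discrete(y, z)
    for x, y, z in product(X, repeat=3)
)
print(triangle_ok)
```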

Let (X, τ) be a topological space. A collection C of subsets of X is a covering of X (or we can say that
C covers X) if ⋃C = X. If C consists of only open sets, then we will say that C is an open covering of X.

A topological space (X, τ) is compact if every open covering of X contains a finite subcollection that
covers X.

Example 14.11:

1. If X is a finite set, then for any topology τ on X, (X, τ) is compact. After all, any open covering
of X is already finite.

2. If X is an infinite set and τ is the discrete topology on X, then (X, τ) is not compact. Indeed,
{{x} | x ∈ X} is an open covering of X with no finite subcollection covering X.

3. (ℝ, τ), where τ is the standard topology on ℝ, is not compact. Indeed, {(n, n + 2) | n ∈ ℤ} is
an open covering of ℝ with no finite subcollection covering ℝ.

4. The topological space (ℝ, τ), where τ is the cofinite topology on ℝ (see part 3 of Example 14.6)
is compact. To see this, let C be an open covering of ℝ, and let U_0 be any nonempty set in C. Then
ℝ ∖ U_0 is finite, say ℝ ∖ U_0 = {x_1, x_2, … , x_n}. For each i = 1, 2, … , n, let U_i ∈ C with
x_i ∈ U_i. Then the collection {U_0, U_1, U_2, … , U_n} is a finite subcollection from C that covers ℝ.
There is actually nothing special about ℝ in this example. If X is any set, we can define the
cofinite topology τ on X to be the topology generated from the basis {U ⊆ X | X ∖ U is finite}.
If we replace ℝ by X in the argument above, we see that the topological space (X, τ) is compact.
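The argument for the cofinite topology is constructive, so it can be written as a short procedure. In the sketch below, each nonempty open set is represented by its finite complement (an assumption of this illustration, as are the function name and the sample covering). A family of such sets covers the whole space exactly when the complements have no point in common.

```python
def finite_subcover(cover):
    """Extract a finite subcover from a covering by cofinite sets.

    `cover` is a list of finite sets; each one is the complement of a
    nonempty open set of the cofinite topology.
    """
    first = cover[0]           # complement of the first chosen open set
    chosen = [first]
    for p in first:            # the finitely many points the first set misses
        # p belongs to an open set exactly when p is not in its complement.
        chosen.append(next(c for c in cover if p not in c))
    return chosen

# Hypothetical covering: the open sets are the complements of these finite sets.
cover = [
    frozenset({1, 2, 3}),
    frozenset({2, 5}),
    frozenset({1, 7}),
    frozenset({3, 4}),
    frozenset({9}),
]
sub = finite_subcover(cover)

# The chosen open sets cover everything iff no point lies in all complements.
covers_everything = not frozenset.intersection(*sub)
print(len(sub), covers_everything)
```

The subcover has at most one more member than the number of points missed by the first set, which mirrors the count in the argument above.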

Continuous Functions and Homeomorphisms

If f : X → Y and A ⊆ X, then the image of A under f is the set f[A] = {f(x) | x ∈ A}. Similarly, if
B ⊆ Y, then the inverse image of B under f is the set f⁻¹[B] = {x ∈ X | f(x) ∈ B}.

Let (X, τ) and (Y, σ) be topological spaces. A function f : X → Y is continuous if for each V ∈ σ, we
have f⁻¹[V] ∈ τ.

Notes: (1) In words, a function from one topological space to another is continuous if the inverse image
of each open set is open.

(2) Continuity of a function f may depend just as much on the two given topologies as it does on the
function f.

(3) As an example of Note 2, if X is given the discrete topology, then any function f : X → Y is
continuous. After all, every subset of X is open in X, and therefore every subset of X of the form
f⁻¹[V], where V is an open set in Y, is open in X.

(4) As another example, if X = {a, b} is given the trivial topology, and Y = X = {a, b} is given the
discrete topology, then the identity function i : X → Y is not continuous. To see this, just note that {a}
is open in Y (because every subset of Y is open), but i⁻¹[{a}] = {a} is not open in X (because {a} ≠ ∅
and {a} ≠ X).

(5) Constant functions are always continuous. Indeed, let b ∈ Y and suppose that f : X → Y is defined
by f(x) = b for all x ∈ X. Let V ⊆ Y. If b ∈ V, then f⁻¹[V] = X and if b ∉ V, then f⁻¹[V] = ∅. Since
X and ∅ are open in any topology on X, f is continuous.

(6) If ℬ is a basis for σ, then to determine if f is continuous, we need only check that for each B ∈ ℬ,
we have f⁻¹[B] ∈ τ. To see this, assume that for each B ∈ ℬ, we have f⁻¹[B] ∈ τ, and let V ∈ σ.
Since ℬ is a basis for σ, V = ⋃C, for some subset C of ℬ. So, f⁻¹[V] = f⁻¹[⋃C] = ⋃{f⁻¹[B] | B ∈ C}
(by part (ii) of Problem 1 below). Since τ is a topology, it is closed under taking arbitrary unions, and
therefore, ⋃{f⁻¹[B] | B ∈ C} ∈ τ.

Similarly, if S is a subbasis for σ, then to determine if f is continuous, we need only check that for each
V ∈ S, we have f⁻¹[V] ∈ τ. To see this, let’s assume that for each V ∈ S, we have f⁻¹[V] ∈ τ and let
ℬ be the collection of all finite intersections of sets in S. Then ℬ is a basis for σ. Let B ∈ ℬ. Then
B = ⋂F for some finite subset F of S. So, f⁻¹[B] = f⁻¹[⋂F] = ⋂{f⁻¹[V] | V ∈ F} (Check this!).
Since τ is a topology, it is closed under taking finite intersections, and so, ⋂{f⁻¹[V] | V ∈ F} ∈ τ.

Example 14.12:

1. Let (X, τ) and (Y, σ) be the topological spaces with sets X = {a, b} and Y = {1, 2, 3} and
topologies τ = {∅, {a}, {a, b}} and σ = {∅, {1, 2}, {1, 2, 3}}. The function f : X → Y defined by
f(a) = 1 and f(b) = 3 is continuous because f⁻¹[{1, 2}] = {a}, which is open in (X, τ). On
the other hand, the function g : X → Y defined by g(a) = 3 and g(b) = 1 is not continuous
because g⁻¹[{1, 2}] = {b}, which is not open in (X, τ). We can visualize these two functions as
follows:

[Figure: diagrams of the functions f and g.]

2. Consider (ℝ, τ) and (ℝ, σ), where τ is the standard topology on ℝ and σ is the topology
generated by the basis {(a, ∞) | a ∈ ℝ}. To avoid confusion, let’s use the notation ℝ_τ and ℝ_σ
to indicate that we are considering ℝ with the topologies τ and σ, respectively. The identity
function i_1 : ℝ_τ → ℝ_σ is continuous because i_1⁻¹[(a, ∞)] = (a, ∞) is open in (ℝ, τ) for every
a ∈ ℝ. However, the identity function i_2 : ℝ_σ → ℝ_τ is not continuous because (0, 1) is open in
(ℝ, τ), but i_2⁻¹[(0, 1)] = (0, 1) is not open in (ℝ, σ).

3. Consider (ℝ, τ) and (S, σ), where τ is the standard topology on ℝ, S = {a, b, c}, and σ is the
topology {∅, {a}, {a, b}, {a, b, c}}. The function f : ℝ → S defined by

f(x) = b if x < 0, and f(x) = c if x ≥ 0

is continuous because f⁻¹[{a}] = ∅ and f⁻¹[{a, b}] = (−∞, 0) are both open in (ℝ, τ).

If we replace the topology σ by the topology σ′ = {∅, {c}, {a, b, c}}, then the same function f is
not continuous because f⁻¹[{c}] = [0, ∞), which is not open in (ℝ, τ).
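For finite spaces, continuity can be decided by listing inverse images. The sketch below encodes the two-point and three-point spaces of part 1 of the example, with the point names a, b and the function names `f`, `g` assumed here for illustration, and checks the definition directly.

```python
def preimage(func, domain, target_set):
    """Inverse image of `target_set` under `func` (a dict from points to points)."""
    return frozenset(x for x in domain if func[x] in target_set)

def is_continuous(func, domain, domain_topology, codomain_topology):
    """Continuous means: the inverse image of every open set is open."""
    return all(
        preimage(func, domain, open_set) in domain_topology
        for open_set in codomain_topology
    )

X = {"a", "b"}
tau = {frozenset(), frozenset({"a"}), frozenset({"a", "b"})}
sigma = {frozenset(), frozenset({1, 2}), frozenset({1, 2, 3})}

f = {"a": 1, "b": 3}   # inverse image of {1, 2} is {a}, which is open
g = {"a": 3, "b": 1}   # inverse image of {1, 2} is {b}, which is not open

print(is_continuous(f, X, tau, sigma))
print(is_continuous(g, X, tau, sigma))
```

Representing open sets as `frozenset` values lets a topology be an ordinary Python set, so membership tests stay one-liners.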

Let (X, τ) and (Y, σ) be topological spaces. A function f : X → Y is continuous at x ∈ X if for each
V ∈ σ with f(x) ∈ V, there is U ∈ τ with x ∈ U such that f[U] ⊆ V.

Example 14.13:
1. Consider the functions f and g from part 1 of Example 14.12. They are pictured below.

[Figure: diagrams of the functions f and g.]

Let’s check that f is continuous at a. There are two open sets containing f(a) = 1. The first
one is {1, 2}. The set {a} is open and f[{a}] = {1} ⊆ {1, 2}. The second open set containing 1
is {1, 2, 3}. We can use the open set {a} again because f[{a}] = {1} ⊆ {1, 2, 3}. Alternatively,
we can use the open set {a, b} because f[{a, b}] = {1, 3} ⊆ {1, 2, 3}.

Let’s also check that f is continuous at b. The only open set containing f(b) = 3 is {1, 2, 3}. We
have b ∈ {a, b} and f[{a, b}] = {1, 3} ⊆ {1, 2, 3}.

The function g is continuous at a because the only open set containing g(a) = 3 is {1, 2, 3} and
we have a ∈ {a} and g[{a}] = {3} ⊆ {1, 2, 3}.

The function g is not continuous at b. The open set {1, 2} contains g(b) = 1. However, the only
open set containing b is {a, b} and g[{a, b}] = {1, 3} ⊈ {1, 2}.

2. Define f : ℝ → ℝ by

f(x) = x if x < 0, and f(x) = x + 1 if x ≥ 0.

Then f is not continuous at 0. To see this, note that f(0) = 1 ∈ (0, 2) and if 0 ∈ (a, b), then
f[(a, b)] = (a, 0) ∪ [1, b + 1) ⊈ (0, 2) because a/2 ∈ (a, 0), so that a/2 < 0, and therefore,
a/2 ∉ (0, 2).

If a > 0, then f is continuous at a. To see this, let (c, d) be an open interval containing
f(a) = a + 1. Then c < a + 1 < d, and so, c − 1 < a < d − 1. Let e = max{0, c − 1}. Then
we have e < a < d − 1. So, a ∈ (e, d − 1). Since e ≥ 0, every element of (e, d − 1) is positive,
and so, f[(e, d − 1)] = (e + 1, d). We now show that (e + 1, d) ⊆ (c, d). Let y ∈ (e + 1, d).
Then e + 1 < y < d. Since e ≥ c − 1, e + 1 ≥ c. Thus, c < y < d, and therefore, y ∈ (c, d).
It follows that f[(e, d − 1)] ⊆ (c, d).

Also, if a < 0, then f is continuous at a. To see this, let (c, d) be an open interval containing
f(a) = a. Then c < a < d. Let e = min{0, d}. Then we have c < a < e. So, a ∈ (c, e). Finally,
note that f[(c, e)] = (c, e) ⊆ (c, d).

We will see in Theorem 14.4 below that if f : ℝ → ℝ, where ℝ is given the standard topology,
then the topological definition of continuity here agrees with all the equivalent definitions of
continuity from Lesson 13.
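Continuity at a point can also be checked by brute force on finite spaces. The sketch below reuses the two-point and three-point spaces from part 1 of Example 14.12; the point names a, b, the function names `f`, `g`, and the helper names are assumptions of this illustration.

```python
def image(func, subset):
    """Image of `subset` under `func` (a dict from points to points)."""
    return frozenset(func[x] for x in subset)

def continuous_at(func, x, domain_topology, codomain_topology):
    """For each open V containing func[x], look for an open U containing x
    whose image is a subset of V."""
    return all(
        any(x in U and image(func, U) <= V for U in domain_topology)
        for V in codomain_topology
        if func[x] in V
    )

tau = {frozenset(), frozenset({"a"}), frozenset({"a", "b"})}
sigma = {frozenset(), frozenset({1, 2}), frozenset({1, 2, 3})}
f = {"a": 1, "b": 3}
g = {"a": 3, "b": 1}

results = {
    (name, x): continuous_at(func, x, tau, sigma)
    for name, func in (("f", f), ("g", g))
    for x in ("a", "b")
}
print(results)
```

Only the pair (g, b) fails, because the only open set containing b is the whole two-point space and its image under g is not contained in {1, 2}.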

Theorem 14.3: Let (X, τ) and (Y, σ) be topological spaces and let f : X → Y. Then f is continuous if
and only if f is continuous at each x ∈ X.

Proof: Let (X, τ) and (Y, σ) be topological spaces and let f : X → Y. First, suppose that f is continuous.
Let x ∈ X and let V ∈ σ with f(x) ∈ V. Since f is continuous, f⁻¹[V] ∈ τ. If we let U = f⁻¹[V], then
x ∈ U and, by part (i) of Problem 1 below, we have f[U] = f[f⁻¹[V]] ⊆ V.

Conversely, suppose that f is continuous at each x ∈ X. Let V ∈ σ. If f⁻¹[V] = ∅, then f⁻¹[V] ∈ τ
because every topology contains the empty set. If f⁻¹[V] ≠ ∅, let x ∈ f⁻¹[V]. Then f(x) ∈ V. So,
there is U_x ∈ τ with x ∈ U_x such that f[U_x] ⊆ V. Let U = ⋃{U_x | x ∈ f⁻¹[V]}. Since U is a union of
open sets, U ∈ τ. We will show that U = f⁻¹[V]. Let z ∈ U. Then there is x ∈ f⁻¹[V] with z ∈ U_x. So, we
have f(z) ∈ f[U_x]. Since f[U_x] ⊆ V, f(z) ∈ V. Thus, z ∈ f⁻¹[V]. Since z ∈ U was arbitrary, we have
shown that U ⊆ f⁻¹[V]. Now, let z ∈ f⁻¹[V]. Then f(z) ∈ V. So, z ∈ U_z. Since U_z ⊆ U, we have
z ∈ U. Since z ∈ f⁻¹[V] was arbitrary, we have shown that f⁻¹[V] ⊆ U. Since U ⊆ f⁻¹[V] and
f⁻¹[V] ⊆ U, we have U = f⁻¹[V]. □

We now give an ε–δ definition of continuity for metrizable topological spaces.

Theorem 14.4: Let (X, τ) and (Y, σ) be metrizable topological spaces where τ and σ are induced by
the metrics d_X and d_Y, respectively. f : X → Y is continuous at x ∈ X if and only if for all ε > 0 there is
δ > 0 such that for all y ∈ X, d_X(x, y) < δ implies d_Y(f(x), f(y)) < ε.

Proof: Let (X, τ) and (Y, σ) be topological spaces with corresponding metrics d_X and d_Y and let x ∈ X.

First, suppose that f : X → Y is continuous at x ∈ X and let ε > 0. f(x) ∈ B_ε(f(x)) and B_ε(f(x)) is
open in σ. Since f is continuous at x, there is U ∈ τ with x ∈ U such that f[U] ⊆ B_ε(f(x)). Since the
open balls form a basis for τ, we can find δ > 0 such that B_δ(x) ⊆ U (Why?). It follows that
f[B_δ(x)] ⊆ f[U] and so, f[B_δ(x)] ⊆ B_ε(f(x)). Now, if d_X(x, y) < δ, then y ∈ B_δ(x). So,
f(y) ∈ f[B_δ(x)]. Since f[B_δ(x)] ⊆ B_ε(f(x)), we have f(y) ∈ B_ε(f(x)). So, d_Y(f(x), f(y)) < ε.

Conversely, suppose that for all ε > 0 there is δ > 0 such that d_X(x, y) < δ implies d_Y(f(x), f(y)) < ε.
Let V ∈ σ with f(x) ∈ V. Since the open balls form a basis for σ, there is ε > 0 such that
f(x) ∈ B_ε(f(x)) and B_ε(f(x)) ⊆ V (Why?). Choose δ > 0 such that d_X(x, y) < δ implies
d_Y(f(x), f(y)) < ε. Let U = B_δ(x). Then x ∈ U and U ∈ τ. We show that f[U] ⊆ V. Let z ∈ f[U].
Then there is y ∈ U with z = f(y). Since y ∈ U = B_δ(x), d_X(x, y) < δ. Therefore, d_Y(f(x), f(y)) < ε.
So, f(y) ∈ B_ε(f(x)). Since B_ε(f(x)) ⊆ V, f(y) ∈ V. Since z = f(y), we have z ∈ V, as desired. □

Note: If we consider a function f : ℝ → ℝ with the metric d(x, y) = |x − y|, Theorem 14.4 shows that
all our definitions of continuity given in Lesson 13 are equivalent to the topological definitions given
here.
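The metric form of continuity in Theorem 14.4 can be illustrated with a simple function on ℝ. The sketch below uses the hypothetical example f(x) = 2x + 3 with the usual distance, for which halving the output tolerance gives a suitable input tolerance; the names `f` and `delta_for` and the sampling ranges are assumptions of this illustration. Random testing supports the choice but is not a proof.

```python
import random

def f(x):
    """Hypothetical example function on the real line."""
    return 2 * x + 3

def delta_for(eps):
    """Candidate input tolerance: |x - y| < eps / 2 forces |f(x) - f(y)| < eps."""
    return eps / 2

rng = random.Random(1)
ok = True
for _ in range(1000):
    x = rng.uniform(-10, 10)
    eps = rng.uniform(0.01, 5)
    delta = delta_for(eps)
    # Pick y strictly inside the open ball of radius delta about x.
    y = x + 0.999 * rng.uniform(-delta, delta)
    if abs(x - y) < delta and not abs(f(x) - f(y)) < eps:
        ok = False
print(ok)
```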

Let (X, τ) and (Y, σ) be topological spaces. A function f : X → Y is a homeomorphism if f is a bijection
such that U ∈ τ if and only if f[U] ∈ σ.

Notes: (1) If f : X → Y is a bijection, then every subset V ⊆ Y can be written as f[U] for exactly one
subset U ⊆ X. If f is also continuous, then given U ⊆ X with f[U] ∈ σ, we have U = f⁻¹[f[U]] ∈ τ.
Conversely, suppose that f is a bijection such that for every subset U of X, f[U] ∈ σ implies U ∈ τ.
Then, given V ∈ σ, since there is U ⊆ X with V = f[U], by our assumption, we have
f⁻¹[V] = f⁻¹[f[U]] = U ∈ τ, showing that f is continuous. It follows that f is a continuous bijection
if and only if f is a bijection such that ∀U ⊆ X (f[U] ∈ σ → U ∈ τ).

(2) Similarly, f : X → Y is a bijective function with continuous inverse f⁻¹ : Y → X if and only if f is a
bijection such that ∀U ⊆ X (U ∈ τ → f[U] ∈ σ).

(3) Notes 1 and 2 tell us that f : X → Y is a homeomorphism if and only if f is a continuous bijective
function with a continuous inverse.

(4) Since a homeomorphism is bijective, it provides a one to one correspondence between the elements
of X and the elements of Y. However, a homeomorphism does much more than this. It also provides a
one to one correspondence between the sets in τ and the sets in σ.

(5) A homeomorphism between two topological spaces is analogous to an isomorphism between two
algebraic structures (see Lesson 11). From the topologist’s point of view, if there is a homeomorphism
from one space to another, the two topological spaces are indistinguishable.

We say that two topological spaces (X, τ) and (Y, σ) are homeomorphic or topologically equivalent
if there is a homeomorphism f : X → Y.

Example 14.14:
1. Let X = {a, b}, τ = {∅, {a}, {a, b}}, and σ = {∅, {b}, {a, b}}. The map f : X → X defined by
f(a) = b and f(b) = a is a homeomorphism from (X, τ) to (X, σ). Notice that the inverse
image of the open set {b} ∈ σ is the open set {a} ∈ τ. This shows that f is continuous.
Conversely, the image of the open set {a} ∈ τ is the open set {b} ∈ σ. This shows that f⁻¹ is
continuous. Since f is also a bijection, we have shown that f is a homeomorphism. On the other
hand, the identity function g : X → X defined by g(a) = a and g(b) = b is not a
homeomorphism because it is not continuous. For example, the inverse image of the open set
{b} ∈ σ is the set {b} which is not in the topology τ. We can visualize these two functions as
follows:

[Figure: diagrams of the functions f and g.]

Notice that f and g are both bijections from X to X, but only the function f also gives a one to
one correspondence between the open sets of the topology τ and the open sets of the
topology σ.
The homeomorphism f shows that (X, τ) and (X, σ) are topologically equivalent. So, up to
topological equivalence, there are only three topologies on a set with two elements: the trivial
topology, the discrete topology, and the topology with exactly three open sets.
2. Let X = {a, b, c}, τ = {∅, {a}, {a, b}, {a, b, c}}, and σ = {∅, {a, b}, {a, b, c}}. Then the identity
function i : X → X is a continuous bijection from (X, τ) to (X, σ). Indeed, the inverse image of
the open set {a, b} ∈ σ is the open set {a, b} ∈ τ. However, i is not a homeomorphism
because i⁻¹ is not continuous. The set {a} is open in τ, but its image i[{a}] = {a} is not open
in σ.
3. We saw in part 3 of Example 14.1 that there are 29 topologies on a set with three elements.
However, up to topological equivalence, there are only 9. Below is a visual representation of
the 9 distinct topologies on the set X = {a, b, c}, up to topological equivalence.

[Figure: the 9 topologies on {a, b, c}, up to topological equivalence.]

The dedicated reader should verify that each of the other 20 topologies is topologically
equivalent to one of these and that no two topologies displayed here are topologically
equivalent.

4. Consider ℝ together with the standard topology. Define f : ℝ → ℝ by f(x) = 2x + 3. Let’s
check that f is a homeomorphism. If x ≠ y, then 2x ≠ 2y, and so, 2x + 3 ≠ 2y + 3. Therefore,
∀x, y ∈ ℝ (x ≠ y → f(x) ≠ f(y)). That is, f is injective. Next, if y ∈ ℝ, let x = (y − 3)/2. Then

f(x) = f((y − 3)/2) = 2 ⋅ (y − 3)/2 + 3 = (y − 3) + 3 = y.

So, ∀y ∈ ℝ ∃x ∈ ℝ (f(x) = y). That is, f is surjective. Now, let (a, b) be a bounded open interval.
f⁻¹[(a, b)] = ((a − 3)/2, (b − 3)/2), which is open. So, f is continuous. Also,
f[(a, b)] = (2a + 3, 2b + 3), which is open. So, f⁻¹ is continuous.

Since f is a continuous bijection with a continuous inverse, f is a homeomorphism.

5. Consider (ℝ, τ) and (ℝ, σ), where τ is the standard topology on ℝ and σ is the topology
generated by the basis {(a, ∞) | a ∈ ℝ}. We saw in part 2 of Example 14.12 that the identity
function i : ℝ_τ → ℝ_σ is continuous because i⁻¹[(a, ∞)] = (a, ∞) is open in (ℝ, τ) for every
a ∈ ℝ. However, this function is not a homeomorphism because i⁻¹ is not continuous. For
example, (0, 1) is open in (ℝ, τ), but i[(0, 1)] = (0, 1) is not open in (ℝ, σ).
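The counts quoted in part 3 of the example (29 topologies on a three-element set, 9 up to topological equivalence) are small enough to confirm by exhaustive search. The sketch below is one way to do it: the points are labeled 0, 1, 2 for convenience, and all names are choices made for this illustration. Two topologies on the same finite set are equivalent exactly when some permutation of the points carries one onto the other, so equivalence classes are orbits under relabeling.

```python
from itertools import combinations, permutations

X = (0, 1, 2)
subsets = [frozenset(c) for size in range(len(X) + 1) for c in combinations(X, size)]

def is_topology(family):
    """On a finite set it is enough to check pairwise unions and intersections."""
    if frozenset() not in family or frozenset(X) not in family:
        return False
    return all(a | b in family and a & b in family for a in family for b in family)

# Test every family of subsets of X (2**8 = 256 candidates).
topologies = [
    frozenset(family)
    for size in range(len(subsets) + 1)
    for family in combinations(subsets, size)
    if is_topology(frozenset(family))
]

def relabel(family, perm):
    """Apply the permutation `perm` of the points to every set in `family`."""
    return frozenset(frozenset(perm[x] for x in s) for s in family)

# Each class is the orbit of a topology under all permutations of the points.
classes = {
    frozenset(relabel(t, perm) for perm in permutations(X))
    for t in topologies
}

print(len(topologies), len(classes))
```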

A topological property or topological invariant is a property that is preserved under homeomorphisms.
More specifically, we say that property P is a topological property if whenever the topological space
(X, τ) has property P and (Y, σ) is topologically equivalent to (X, τ), then (Y, σ) also has property P.

In Problem 5 below, you will be asked to show that compactness is a topological property. As another
example, let’s show that the property of being a T_2-space is a topological property.

Theorem 14.5: Let (X, τ) be a T_2-space and let (Y, σ) be topologically equivalent to (X, τ). Then (Y, σ)
is a T_2-space.

Proof: Let (X, τ) be a T_2-space and let f : X → Y be a homeomorphism. Let y, z ∈ Y with y ≠ z. Since
f is bijective, there are x, w ∈ X with x ≠ w such that f(x) = y and f(w) = z. Since (X, τ) is a
T_2-space, there are open sets U, V ∈ τ with x ∈ U, w ∈ V, and U ∩ V = ∅. Since f is a
homeomorphism, f[U], f[V] ∈ σ. We also have y = f(x) ∈ f[U] and z = f(w) ∈ f[V]. We show
that f[U] ∩ f[V] = ∅. If not, there is v ∈ f[U] ∩ f[V]. So, there are u ∈ U and u′ ∈ V with f(u) = v
and f(u′) = v. So, f(u) = f(u′). Since f is injective, u = u′. But then u ∈ U ∩ V, contradicting that
U ∩ V = ∅. It follows that f[U] ∩ f[V] = ∅. Therefore, (Y, σ) is a T_2-space. □

The dedicated reader might want to show that each of the other separation axioms (T_0 through T_4) is
a topological property and that metrizability is a topological property.


Problem Set 14

Full solutions to these problems are available for free download here:

www.SATPrepGet800.com/PMFBXSG

LEVEL 1

1. Let f : A → B and let C be a nonempty collection of subsets of B. Prove the following:
(i) For any V ∈ C, f[f⁻¹[V]] ⊆ V.
(ii) f⁻¹[⋃C] = ⋃{f⁻¹[V] | V ∈ C}.

2. Let (X, d) be a metric space. Prove that for all x, y ∈ X, d(x, y) ≥ 0.

LEVEL 2

3. Prove that ℬ = {X ⊆ ℝ | ℝ ∖ X is finite} generates a topology τ on ℝ that is strictly coarser than
the standard topology. τ is called the cofinite topology on ℝ.

4. Let K = {1/n | n ∈ ℤ+}, ℬ = {(a, b) | a, b ∈ ℝ ∧ a < b} ∪ {(a, b) ∖ K | a, b ∈ ℝ ∧ a < b}. Prove
that ℬ is a basis for a topology on ℝ that is strictly finer than the standard topology on ℝ.

LEVEL 3

5. Let (X, τ) and (Y, σ) be topological spaces with (X, τ) compact and let f : X → Y be a
homeomorphism. Prove that (Y, σ) is compact.

6. Let X be a nonempty set and let ℬ be a collection of subsets of X. Prove that the set generated by
ℬ, {⋃C | C ⊆ ℬ}, is equal to {A ⊆ X | ∀x ∈ A ∃B ∈ ℬ (x ∈ B ∧ B ⊆ A)}.

7. Define the functions d_1 and d_2 from ℂ × ℂ to ℝ by d_1(z, w) = |Re z − Re w| + |Im z − Im w|
and d_2(z, w) = max{|Re z − Re w|, |Im z − Im w|}. Prove that (ℂ, d_1) and (ℂ, d_2) are metric
spaces such that d_1 and d_2 induce the standard topology on ℂ.

8. Let (X, τ) be a topological space and let A ⊆ X. Prove that τ_A = {A ∩ U | U ∈ τ} is a topology
on A. Then prove that if ℬ is a basis for τ, then ℬ_A = {A ∩ B | B ∈ ℬ} is a basis for τ_A. τ_A is
called the subspace topology on A.

LEVEL 4

9. Let ℬ′ = {(a, b) | a, b ∈ ℚ ∧ a < b}. Prove that ℬ′ is countable and that ℬ′ is a basis for a
topology on ℝ. Then show that the topology generated by ℬ′ is the standard topology on ℝ.

10. Let (X, τ) be a T_2-space and A ⊆ X. Prove that (A, τ_A) is a T_2-space (see Problem 8 for the
definition of τ_A). Determine if the analogous statement is true for T_3-spaces.

11. Let (X_1, τ_1) and (X_2, τ_2) be topological spaces. Let ℬ = {U × V | U ∈ τ_1 ∧ V ∈ τ_2}. Prove that ℬ
is a basis for a topology τ on X_1 × X_2, but in general, ℬ itself is not a topology on X_1 × X_2. Then
prove that if ℬ_1 is a basis for τ_1 and ℬ_2 is a basis for τ_2, then C = {U × V | U ∈ ℬ_1 ∧ V ∈ ℬ_2} is
a basis for τ. The topology τ is called the product topology on X_1 × X_2.

LEVEL 5

12. Let (X_1, τ_1) and (X_2, τ_2) be T_2-spaces. Prove that X_1 × X_2 with the product topology (as defined
in Problem 11) is also a T_2-space. Determine if the analogous statement is true for T_3-spaces.

13. Let τ_L be the set generated by the half open intervals of the form [a, b) with a, b ∈ ℝ. Show that
τ_L is a topology on ℝ that is strictly finer than the standard topology on ℝ and incomparable with
the topology τ_K generated by the basis ℬ from Problem 4.

14. Prove that every metrizable space is T_4.

15. Consider the topological space (ℝ, τ_L). Prove that ℝ² with the corresponding product topology
(as defined in Problem 11) is a T_3-space, but not a T_4-space.

16. Let (X_1, τ_1) and (X_2, τ_2) be metrizable spaces. Prove that X_1 × X_2 with the product topology is
metrizable. Use this to show that (ℝ, τ_L) is not metrizable.

LESSON 15 – COMPLEX ANALYSIS
COMPLEX VALUED FUNCTIONS

The Unit Circle

Recall from Lesson 7 that a circle in the Complex Plane is the set of all points that are at a fixed distance
(called the radius of the circle) from a fixed point (called the center of the circle).

The circumference of a circle is the distance around the circle.

If C and C′ are the circumferences of two circles with radii r and r′, respectively, then it turns out that
C/(2r) = C′/(2r′). In other words, the value of the ratio Circumference/(2(radius)) is independent of the
circle that we use to form this ratio. We leave the proof of this fact for the interested reader to
investigate themselves. We call the common value of this ratio π (pronounced “pi”). So, we have
C/(2r) = π, or equivalently, C = 2πr.

Example 15.1: The unit circle is the circle with radius 1 and center
(0, 0). The equation of this circle is |z| = 1. If we write z in the
standard form z = x + yi, we see that |z| = √(x² + y²), and so,
the equation of the unit circle can also be written x² + y² = 1.
To the right is a picture of the unit circle in the Complex Plane.

[Figure: the unit circle |z| = 1.]

The circumference of the unit circle is 2π ⋅ 1 = 2π.

An angle in standard position consists of two rays, both of which
have their initial point at the origin, and one of which is the
positive x-axis. We call the positive x-axis the initial ray and we
call the second ray the terminal ray. The radian measure of the
angle is the part of the circumference of the unit circle beginning
at the point (1, 0) on the positive x-axis and eventually ending at the point on the unit circle intercepted
by the second ray. If the motion is in the counterclockwise direction, the radian measure is positive and
if the motion is in the clockwise direction, the radian measure is negative.

Example 15.2: Let’s draw a few angles where the terminal ray lies along the line y = x.

[Figure: three angles in standard position whose terminal ray lies along the line y = x.]

Observe that in the leftmost picture, the arc intercepted by the angle has a length that is one-eighth of
the circumference of the circle. Since the circumference of the unit circle is 2π and the motion is in the
counterclockwise direction, the angle has a radian measure of 2π/8 = π/4.

Similarly, in the center picture, the arc intercepted by the angle has a length that is seven-eighths of
the circumference of the circle. This time the motion is in the clockwise direction, and so, the radian
measure of the angle is −(7/8) ⋅ 2π = −7π/4.

In the rightmost picture, the angle consists of a complete rotation, tracing out the entire circumference
of the circle, followed by tracing out an additional length that is one-eighth the circumference of the
circle. Since the motion is in the counterclockwise direction, the radian measure of the angle is
2π + 2π/8 = 8π/4 + π/4 = 9π/4.
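The three radian measures in this example come from multiplying a signed fraction of a full turn by the circumference of the unit circle. A minimal sketch with exact fractions follows; the helper name `radian_measure_in_pi` and the convention of reporting the measure as a multiple of pi are assumptions made here.

```python
from fractions import Fraction

def radian_measure_in_pi(turns):
    """Radian measure, as a multiple of pi, of a signed number of full turns.

    Counterclockwise turns are positive and clockwise turns are negative.
    One full turn corresponds to 2 * pi, so the multiple of pi is 2 * turns.
    """
    return 2 * Fraction(turns)

left = radian_measure_in_pi(Fraction(1, 8))        # one-eighth, counterclockwise
center = radian_measure_in_pi(Fraction(-7, 8))     # seven-eighths, clockwise
right = radian_measure_in_pi(1 + Fraction(1, 8))   # a full turn plus one-eighth

print(left, center, right)
```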

Let’s find the point of intersection of the unit circle with the terminal ray of the angle π/4 that lies along
the line with equation y = x (as shown in the leftmost figure from Example 15.2 above). If we call this
point (x, y), then we have y = x (because (x, y) is on the line y = x) and x² + y² = 1 (because (x, y)
is on the unit circle). Replacing y by x in the second equation gives us x² + x² = 1, or equivalently,
2x² = 1. So, x² = 1/2. The two solutions to this equation are x = ±√(1/2) = ±√1/√2 = ±1/√2. From the
picture, it should be clear that we are looking for the positive solution, so that x = 1/√2. Since y = x,
we also have y = 1/√2. Therefore, the point of intersection is (1/√2, 1/√2).

Notes: (1) The number 1/√2 can also be written in the form √2/2. To see that these two numbers are equal,
observe that we have

1/√2 = 1/√2 ⋅ 1 = 1/√2 ⋅ √2/√2 = (1 ⋅ √2)/(√2 ⋅ √2) = √2/2.

(2) In the figure below on the left, we see a visual representation of the circle, the given angle, and the
desired point of intersection.

[Figure: on the left, the unit circle with the angle π/4 and the point (1/√2, 1/√2); on the right, the unit
circle with the four labeled points (1/√2, 1/√2), (−1/√2, 1/√2), (−1/√2, −1/√2), and (1/√2, −1/√2).]

(3) In the figure above on the right, we have divided the Complex Plane into eight regions using the
lines with equations y = x and y = −x (together with the x- and y-axes). We then used the symmetry
of the circle to label the four points of intersection of the unit circle with each of these two lines.

If θ (pronounced “theta”) is the radian measure of an angle in standard position such that the terminal
ray intersects the unit circle at the point (x, y), then we will say that W(θ) = (x, y). This expression
defines a function W : ℝ → ℝ × ℝ called the wrapping function. Observe that the inputs of the
wrapping function are real numbers, which we think of as the radian measure of angles in standard
position. The outputs of the wrapping function are pairs of real numbers, which we think of as points
in the Complex Plane. Also, observe that the range of the wrapping function is the unit circle.

We now define the cosine and sine of the angle θ by cos θ = x and sin θ = y, where W(θ) = (x, y).

For convenience, we also define the tangent of the angle by tan θ = sin θ / cos θ = y/x (when x ≠ 0).

Notes: (1) The wrapping function is not one to one. For example, W(π/2) = (0, 1) and W(5π/2) = (0, 1).
However, π/2 ≠ 5π/2. There are actually infinitely many real numbers that map to (0, 1) under the
wrapping function. Specifically, W(π/2 + 2kπ) = (0, 1) for every k ∈ ℤ.

In general, each point on the unit circle is the image of infinitely many real numbers. Indeed, if
W(θ) = (x, y), then W(θ + 2kπ) = (x, y) for all k ∈ ℤ.

(2) The wrapping function gives us a convenient way to associate an angle θ in standard position with
the corresponding point (x, y) on the unit circle. It is mostly used only as a notational convenience. We
will usually be more interested in the expressions cos θ = x and sin θ = y.
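The wrapping function can be explored numerically. The sketch below takes a shortcut that should be stated openly: it uses the library functions `math.cos` and `math.sin` as a stand-in for the geometric definition, and the names `W` and `close` are chosen here for illustration. Floating-point values are compared with a small tolerance.

```python
import math

def W(theta):
    """Wrapping function: the point of the unit circle for the angle theta (radians)."""
    return (math.cos(theta), math.sin(theta))

def close(p, q, tol=1e-9):
    """True when two points agree up to a small floating-point tolerance."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

# Two different inputs with the same output, so W is not one to one.
same_point = close(W(math.pi / 2), W(5 * math.pi / 2))

# Adding integer multiples of 2*pi does not change the output.
periodic = all(close(W(1.0 + 2 * k * math.pi), W(1.0)) for k in range(-5, 6))

# Every output lies on the unit circle.
on_circle = all(
    abs(x * x + y * y - 1) < 1e-12
    for x, y in (W(t / 10) for t in range(-100, 100))
)

print(same_point, periodic, on_circle)
```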

Example 15.3: Using the rightmost figure above, we can make the following computations:

W(π/4) = (1/√2, 1/√2)    W(3π/4) = (−1/√2, 1/√2)    W(5π/4) = (−1/√2, −1/√2)    W(7π/4) = (1/√2, −1/√2)

cos π/4 = 1/√2     sin π/4 = 1/√2      cos 3π/4 = −1/√2    sin 3π/4 = 1/√2
cos 5π/4 = −1/√2   sin 5π/4 = −1/√2    cos 7π/4 = 1/√2     sin 7π/4 = −1/√2

It’s also easy to compute the cosine and sine of the four quadrantal angles 0, π/2, π, and 3π/2. Here we use
the fact that the points (1, 0), (0, 1), (−1, 0), and (0, −1) lie on the unit circle.

W(0) = (1, 0)    W(π/2) = (0, 1)    W(π) = (−1, 0)    W(3π/2) = (0, −1)

cos 0 = 1     sin 0 = 0     cos π/2 = 0     sin π/2 = 1
cos π = −1    sin π = 0     cos 3π/2 = 0    sin 3π/2 = −1

Also, if we add any integer multiple of 2π to an angle, the cosine and sine of the new angle have the
same values as the old angle. For example, cos 9π/4 = cos(π/4 + 8π/4) = cos(π/4 + 2π) = cos π/4 = 1/√2.
This is a direct consequence of the fact that W(θ + 2kπ) = W(θ) for all k ∈ ℤ.

We can also compute the tangent of each angle by dividing the sine of the angle by the cosine of the
angle. For example, we have

tan π/4 = (sin π/4)/(cos π/4) = (1/√2)/(1/√2) = 1.

Similarly, we have

tan 3π/4 = −1    tan 5π/4 = 1    tan 7π/4 = −1    tan 0 = 0    tan π = 0

When θ = π/2 or 3π/2, tan θ is undefined.
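The tangent values above follow from dividing the second coordinate of each point on the unit circle by the first. The sketch below stores the eight points discussed in this lesson in a lookup table keyed by multiples of pi; the table layout and the name `tangent` are assumptions of this illustration, and `None` is used here to mark an undefined tangent.

```python
import math
from fractions import Fraction

s = 1 / math.sqrt(2)

# Keys are multiples of pi in [0, 2); values are the points (x, y) on the unit circle.
points = {
    Fraction(0): (1.0, 0.0),
    Fraction(1, 4): (s, s),
    Fraction(1, 2): (0.0, 1.0),
    Fraction(3, 4): (-s, s),
    Fraction(1): (-1.0, 0.0),
    Fraction(5, 4): (-s, -s),
    Fraction(3, 2): (0.0, -1.0),
    Fraction(7, 4): (s, -s),
}

def tangent(multiple_of_pi):
    """Tangent of (multiple_of_pi * pi), or None when the cosine is 0."""
    # Reduce modulo a full turn, which is 2 when measured in multiples of pi.
    x, y = points[Fraction(multiple_of_pi) % 2]
    return None if x == 0 else y / x

values = {m: tangent(m) for m in points}
print(values[Fraction(1, 4)], values[Fraction(3, 4)], values[Fraction(1, 2)])
print(tangent(Fraction(9, 4)))  # same angle as pi/4 after removing a full turn
```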

Notes: (1) If = + is any complex number, then the point ( , ) lies on a circle of radius centered

at the origin, where = | | = √ 2 + 2. If is the radian measure of an angle in standard position

such that the terminal ray intersects this circle at the point ( , ), then it can be proved that the cosine
.
and sine of the angle are equal to = and =

(2) It is standard to use the abbreviations $\cos^2\theta$ and $\sin^2\theta$ for $(\cos\theta)^2$ and $(\sin\theta)^2$, respectively.

From the definition of cosine and sine, we have the following formula called the Pythagorean Identity:
$$\cos^2\theta + \sin^2\theta = 1$$

(3) Also, from the definition of cosine and sine, we have the following two formulas called the Negative
Identities:

$$\cos(-\theta) = \cos\theta \qquad \sin(-\theta) = -\sin\theta.$$

Theorem 15.1: Let $\theta$ and $\phi$ be the radian measures of angles $A$ and $B$, respectively. Then we have
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$$
$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi.$$

Notes: (1) The two formulas appearing in Theorem 15.1 are called the Sum Identities. You will be asked
to prove Theorem 15.1 in Problem 14 below (parts (i) and (v)).

(2) Theorem 15.1 will be used to prove De Moivre’s Theorem (Theorem 15.2) below. De Moivre’s
Theorem provides a fast method for performing exponentiation of complex numbers.

(3) $\theta$ and $\phi$ are Greek letters pronounced “theta” and “phi,” respectively. These letters are often used
to represent angle measures. We may sometimes also use the capital versions of these letters, Θ and
Φ, especially when insisting that the radian measures of the given angles are between $-\pi$ and $\pi$.

Exponential Form of a Complex Number

The standard form (or rectangular form) of a complex number $z$ is
$z = x + yi$, where $x$ and $y$ are real numbers. Recall from Lesson 7 that
we can visualize the complex number $z = x + yi$ as the point $(x, y)$ in
the Complex Plane.

[Figure: a point in the Complex Plane labeled $(x, y)$ or $(r, \theta)$.]

If for $z \neq 0$, we let $r = |z| = |x + yi| = \sqrt{x^2 + y^2}$ and we let $\theta$ be the
radian measure of an angle in standard position such that the terminal
ray passes through the point $(x, y)$, then we see that $r$ and $\theta$ determine
this point. So, we can also write this point as $(r, \theta)$.

In Note 1 following Example 15.3, we saw that $\cos\theta = \frac{x}{r}$ and $\sin\theta = \frac{y}{r}$. By multiplying each side of the
last two equations by $r$, we get $x = r\cos\theta$ and $y = r\sin\theta$. These equations allow us to rewrite the
complex number $z = x + yi$ in the polar form $z = r\cos\theta + ri\sin\theta = r(\cos\theta + i\sin\theta)$.

If we also make the definition $e^{i\theta} = \cos\theta + i\sin\theta$, we can write the complex number $z = x + yi$ in
the exponential form $z = re^{i\theta}$.

Recall from Lesson 7 that $r = |z|$ is called the absolute value or modulus of the complex number. We
will call the angle $\theta$ an argument of the complex number and we may sometimes write $\theta = \arg z$.

Note that although $r = |z|$ and $\theta = \arg z$ uniquely determine a point $(r, \theta)$, there are infinitely many
other values for $\arg z$ that represent the same point. Indeed, $(r, \theta + 2k\pi)$ represents the same point
for each $k \in \mathbb{Z}$. However, there is a unique such value Θ for $\arg z$ such that $-\pi < \Theta \le \pi$. We call this
value Θ the principal argument of $z$, and we write $\Theta = \operatorname{Arg} z$.

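Notes: (1) The definition $e^{i\theta} = \cos\theta + i\sin\theta$ is known as Euler’s formula.

(2) When written in exponential form, two nonzero complex numbers $z = re^{i\theta}$ and $w = se^{i\phi}$ are equal if and
only if $r = s$ and $\theta = \phi + 2k\pi$ for some $k \in \mathbb{Z}$.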

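Example 15.4: Let’s convert the complex number $z = 1 + i$ to exponential form. To do this, we need
to find $r$ and $\theta$. We have $r = |z| = \sqrt{1^2 + 1^2} = \sqrt{1 + 1} = \sqrt2$. Next, we have $\tan\theta = \frac{1}{1} = 1$. Since
the point $(1, 1)$ lies in the first quadrant, it follows that $\theta = \frac{\pi}{4}$. So, in exponential form, we have $z = \sqrt2\,e^{\frac{\pi}{4}i}$.

Note: $\frac{\pi}{4}$ is the principal argument of $z = 1 + i$ because $-\pi < \frac{\pi}{4} \le \pi$. When we write a complex number
in exponential form, we will usually use the principal argument.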
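As a quick numerical companion to Example 15.4, Python's standard `cmath.polar` returns the modulus and an argument of a complex number. This is only a sketch; the wrapper name `to_exponential_form` is our own choice, and the results are floating-point approximations.

```python
import cmath
import math

def to_exponential_form(z: complex) -> tuple[float, float]:
    """Hypothetical helper: return (r, theta) with z = r * e^(i*theta).

    cmath.polar returns (abs(z), phase(z)), and the phase lies in [-pi, pi].
    """
    r, theta = cmath.polar(z)
    return r, theta

r, theta = to_exponential_form(1 + 1j)
print(r, math.sqrt(2))      # modulus, compared with sqrt(2)
print(theta, math.pi / 4)   # argument, compared with pi/4
```

The two printed pairs should agree up to rounding, matching $r = \sqrt2$ and $\theta = \frac{\pi}{4}$.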

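If $z \in \mathbb{C}$, we define $z^2$ to be the complex number $z \cdot z$. Similarly, $z^3 = z \cdot z \cdot z = z^2 \cdot z$. More generally,
for $z \in \mathbb{C}$ and $n \in \mathbb{Z}$ we define $z^n$ as follows:

• For $n = 0$, $z^n = z^0 = 1$.

• For $n \in \mathbb{Z}^+$, $z^{n+1} = z^n \cdot z$.

• For $n \in \mathbb{Z}^-$, $z^n = (z^{-n})^{-1} = \frac{1}{z^{-n}}$.

Due to the following theorem, it’s often easier to compute $z^n$ when $z$ is written in exponential form.
Theorem 15.2 (De Moivre’s Theorem): For all $n \in \mathbb{Z}$, $(e^{i\theta})^n = e^{i(n\theta)}$.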

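Proof: For $n = 0$, we have $(e^{i\theta})^0 = (\cos\theta + i\sin\theta)^0 = 1 = e^0 = e^{i(0\theta)}$.

We prove De Moivre’s Theorem for $n \in \mathbb{Z}^+$ by induction on $n$.

Base Case ($k = 1$): $(e^{i\theta})^1 = e^{i\theta} = e^{i(1\theta)}$.
Inductive Step: Assume that $k \ge 1$ and $(e^{i\theta})^k = e^{i(k\theta)}$. We then have

$$(e^{i\theta})^{k+1} = (\cos\theta + i\sin\theta)^{k+1} = (\cos\theta + i\sin\theta)^k(\cos\theta + i\sin\theta) = (e^{i\theta})^k(\cos\theta + i\sin\theta)$$
$$= e^{i(k\theta)}(\cos\theta + i\sin\theta) = (\cos k\theta + i\sin k\theta)(\cos\theta + i\sin\theta)$$
$$= [(\cos k\theta)(\cos\theta) - (\sin k\theta)(\sin\theta)] + [(\sin k\theta)(\cos\theta) + (\cos k\theta)(\sin\theta)]i$$
$$= \cos((k+1)\theta) + i\sin((k+1)\theta) \text{ (by Theorem 15.1)} = e^{i((k+1)\theta)}.$$

By the Principle of Mathematical Induction, $(e^{i\theta})^n = e^{i(n\theta)}$ for all $n \in \mathbb{Z}^+$.

If $n < 0$, then $-n > 0$, so by the case already proved,

$$(e^{i\theta})^n = \frac{1}{(e^{i\theta})^{-n}} = \frac{1}{e^{i(-n\theta)}} = \frac{1}{\cos(-n\theta) + i\sin(-n\theta)}$$
$$= \frac{1}{\cos(n\theta) - i\sin(n\theta)} \text{ (by the Negative Identities)}$$
$$= \frac{1}{\cos(n\theta) - i\sin(n\theta)} \cdot \frac{\cos(n\theta) + i\sin(n\theta)}{\cos(n\theta) + i\sin(n\theta)} = \frac{\cos(n\theta) + i\sin(n\theta)}{\cos^2(n\theta) + \sin^2(n\theta)}$$
$$= \cos(n\theta) + i\sin(n\theta) \text{ (by the Pythagorean Identity)} = e^{i(n\theta)}. \qquad \square$$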
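De Moivre's Theorem can also be sanity-checked numerically for a few integer exponents. The sketch below is not a proof; the function name `de_moivre_gap` and the sample angle `0.7` are arbitrary choices of ours.

```python
import cmath

def de_moivre_gap(theta: float, n: int) -> float:
    """Return |(e^{i*theta})^n - e^{i*(n*theta)}| for an integer n."""
    lhs = cmath.exp(1j * theta) ** n      # raise e^{i*theta} to the nth power
    rhs = cmath.exp(1j * n * theta)       # multiply the angle by n instead
    return abs(lhs - rhs)

# Try negative, zero, and positive exponents at one sample angle.
gaps = [de_moivre_gap(0.7, n) for n in range(-5, 6)]
print(max(gaps))
```

The largest gap should be on the order of rounding error, consistent with the theorem for $n \in \mathbb{Z}$.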

Note: De Moivre’s Theorem generalizes to all $n \in \mathbb{C}$ with a small “twist.” In general, the expression
$(e^{i\theta})^n$ may have multiple values, whereas $e^{i(n\theta)}$ takes on just one value. However, for all $n \in \mathbb{C}$,
$(e^{i\theta})^n = e^{i(n\theta)}$ in the sense that $e^{i(n\theta)}$ is equal to one of the possible values of $(e^{i\theta})^n$.

As a very simple example, let $\theta = 0$ and $n = \frac12$. Then $e^{i(n\theta)} = e^0 = 1$ and $(e^{i\theta})^n = 1^{\frac12}$, which has two
values: $1$ and $-1$ (because $1^2 = 1$ and $(-1)^2 = 1$). Observe that $e^{i(n\theta)}$ is equal to one of the two
possible values of $(e^{i\theta})^n$.

We will not prove this more general result here.

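Example 15.5: Let’s compute $(2 - 2i)^6$. If we let $z = 2 - 2i$, we have $\tan\theta = \frac{-2}{2} = -1$, so that $\theta = \frac{7\pi}{4}$
(Why?). Also, $r = |z| = \sqrt{2^2 + (-2)^2} = \sqrt{2^2(1 + 1)} = \sqrt{2^2 \cdot 2} = \sqrt{2^2} \cdot \sqrt2 = 2\sqrt2$. So, in exponential
form, $z = 2\sqrt2\,e^{\frac{7\pi}{4}i}$, and therefore,

$$z^6 = \left(2\sqrt2\,e^{\frac{7\pi}{4}i}\right)^6 = 2^6\left(\sqrt2\right)^6\left(e^{\frac{7\pi}{4}i}\right)^6 = 64 \cdot 8\,e^{6\left(\frac{7\pi}{4}\right)i} = 512\,e^{\frac{21\pi}{2}i} = 512\,e^{\left(\frac{\pi}{2} + 10\pi\right)i}$$
$$= 512\,e^{\frac{\pi}{2}i} = 512\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right) = 512(0 + i \cdot 1) = 512i.$$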
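The computation in Example 15.5 can be mirrored in a few lines of Python. This is a sketch under the assumption that floating-point rounding is acceptable; note that `cmath.polar` reports the argument $-\frac{\pi}{4}$ rather than $\frac{7\pi}{4}$, which differs from it by $2\pi$ and so represents the same point.

```python
import cmath

z = 2 - 2j
r, theta = cmath.polar(z)              # r = 2*sqrt(2), theta = -pi/4

# Exponential-form route: raise the modulus to the 6th power and
# multiply the argument by 6 (De Moivre's Theorem).
power = cmath.rect(r ** 6, 6 * theta)

# Direct route for comparison: repeated multiplication.
direct = z ** 6

print(power, direct)
```

Both values should be close to $512i$, up to rounding in the real part.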

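Recall that a square root of a complex number $z$ is a complex number $w$ such that $z = w^2$ (see Lesson
7). More generally, if $z \in \mathbb{C}$ and $n \in \mathbb{Z}^+$, we say that $w \in \mathbb{C}$ is an $n$th root of $z$ if $z = w^n$.

Suppose that $z = re^{i\theta}$ and $w = se^{i\phi}$ are exponential forms of $z, w \in \mathbb{C}$ and that $w$ is an $n$th root of $z$.
Let’s derive a formula for $w$ in terms of $r$ and $\theta$.

We have $w^n = (se^{i\phi})^n = s^n e^{i(n\phi)}$. Since $w^n = z$, $s^n e^{i(n\phi)} = re^{i\theta}$. So, $s^n = r$ and $n\phi = \theta + 2k\pi$,
where $k \in \mathbb{Z}$. Therefore, $s = \sqrt[n]{r}$ and $\phi = \frac{\theta + 2k\pi}{n} = \frac{\theta}{n} + \frac{2k\pi}{n}$ for $k \in \mathbb{Z}$. Thus, $w = \sqrt[n]{r}\,e^{i\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)}$, $k \in \mathbb{Z}$.

If $k \ge n$, then $\frac{\theta}{n} + \frac{2k\pi}{n} = \frac{\theta}{n} + \frac{2(n + k - n)\pi}{n} = \frac{\theta}{n} + \frac{2n\pi + 2(k - n)\pi}{n} = \frac{\theta}{n} + \frac{2n\pi}{n} + \frac{2(k - n)\pi}{n} = \frac{\theta}{n} + \frac{2(k - n)\pi}{n} + 2\pi$,
and therefore, $e^{i\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)} = e^{i\left(\frac{\theta}{n} + \frac{2(k - n)\pi}{n}\right)}$.

It follows that there are exactly $n$ distinct $n$th roots of $z$ given by $w = \sqrt[n]{r}\,e^{i\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)}$, $k = 0, 1, \ldots, n - 1$.
The principal root of $z$, written $\sqrt[n]{z}$, is $\sqrt[n]{r}\,e^{i\frac{\theta}{n}}$, where $-\pi < \theta \le \pi$.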

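Example 15.6: Let’s compute all the eighth roots of 1 (also called the 8th roots of unity). If $1 = w^8$,
then $w = \sqrt[8]{1}\,e^{i\left(\frac{0}{8} + \frac{2k\pi}{8}\right)} = e^{\frac{k\pi}{4}i}$ for $k = 0, 1, 2, 3, 4, 5, 6, 7$. Substituting each of these values for $k$ into
the expression $e^{\frac{k\pi}{4}i}$ gives us the following 8 eighth roots of unity.

$$1, \quad \tfrac{1}{\sqrt2} + \tfrac{1}{\sqrt2}i, \quad i, \quad -\tfrac{1}{\sqrt2} + \tfrac{1}{\sqrt2}i, \quad -1, \quad -\tfrac{1}{\sqrt2} - \tfrac{1}{\sqrt2}i, \quad -i, \quad \tfrac{1}{\sqrt2} - \tfrac{1}{\sqrt2}i$$

Note: Notice how the eight 8th roots of unity are uniformly distributed on the unit circle.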
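The root formula derived above translates directly into a short Python sketch. The function name `nth_roots` is hypothetical, and the roots are computed only approximately in floating point.

```python
import cmath
import math

def nth_roots(z: complex, n: int) -> list[complex]:
    """Return the n values of (nth root of r) * e^{i(theta/n + 2*k*pi/n)}, k = 0..n-1."""
    r, theta = cmath.polar(z)            # modulus and an argument of z
    root_r = r ** (1.0 / n)              # the positive real nth root of the modulus
    return [cmath.rect(root_r, theta / n + 2 * math.pi * k / n) for k in range(n)]

# The eighth roots of unity from Example 15.6.
roots = nth_roots(1, 8)
for w in roots:
    print(f"{w.real:+.4f} {w.imag:+.4f}i")
```

Each printed value should have modulus 1, and raising it to the 8th power should give a number very close to 1.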

Functions of a Complex Variable

We will be considering functions $f: A \to \mathbb{C}$, where $A \subseteq \mathbb{C}$. If $z \in A$, then $f(z) = w$ for some $w \in \mathbb{C}$.

If we write both $z$ and $w$ in standard form, then we have $z = x + yi$ and $w = u + vi$ for some real
numbers $x$, $y$, $u$, and $v$. Note that the values of $u$ and $v$ depend upon the values of $x$ and $y$. It follows
that the complex function $f$ is equivalent to a pair of real functions $u, v: \mathbb{R}^2 \to \mathbb{R}$. That is, we have
$f(z) = f(x + yi) = u(x, y) + iv(x, y)$.

If we write $z$ in the exponential form $z = re^{i\theta}$, we have $f(z) = f(re^{i\theta}) = u(r, \theta) + iv(r, \theta)$.

Notes: (1) If $f: A \to \mathbb{C}$, $z = x + yi$ and $f(z) = u + vi$, then the function $f$ takes the point $(x, y)$ in the
Complex Plane to the point $(u, v)$ in the Complex Plane.

Compare this to a real-valued function, where a point on the real line is taken to a point on the real
line. The usual treatment here is to draw two real lines perpendicular to each other, label one of them
the $x$-axis and the other the $y$-axis. This forms a plane and we can plot points $(x, f(x))$ in the usual
way.

With complex-valued functions, we cannot visualize the situation in an analogous manner. The
problem is that a visualization using this method would require us to plot points of the form $(x, y, u, v)$.
So, we would need a four-dimensional version of the two-dimensional plane, but humans are capable
of perceiving only three dimensions. Therefore, we will need to come up with other methods for
visualizing complex-valued functions.

(2) One way to visualize a complex-valued function is to simply stay in
the same plane and to analyze how a typical point moves or how a
certain set is transformed. For example, let $f: \mathbb{C} \to \mathbb{C}$ be defined by
$f(z) = z - 1$. Then the function $f$ takes the point $(x, y)$ to the point
$(x - 1, y)$. That is, each point is shifted one unit to the left. Similarly, if
$S \subseteq \mathbb{C}$, then each point of the set $S$ is shifted one unit to the left by the
function $f$. Both these situations are demonstrated in the figure to the
right.

This method may work well for very simple functions, but for more complicated functions, the method
in Note 3 below will usually be preferable.

(3) A second way to visualize a complex-valued function is to draw two separate planes: an $xy$-plane
and a $uv$-plane. We can then draw a point or a set in the $xy$-plane and its image under $f$ in the
$uv$-plane. Let’s see how this works for the function $f$ defined by $f(z) = z - 1$ (the same function we
used in Note 2).


Example 15.7:

1. Let $f(z) = z + i$.

If we write $z = x + yi$, then we have $f(x + yi) = x + yi + i = x + (y + 1)i$.

So, $u(x, y) = x$ and $v(x, y) = y + 1$.

Geometrically, $f$ is a translation. It takes any point $(x, y)$ in the
Complex Plane and translates it up one unit. For example, the
point $(1, 2)$ is translated to $(1, 3)$ under the function $f$ because
$f(1 + 2i) = (1 + 2i) + i = 1 + 3i$. We can see this in the
figure to the right.

Observe that any vertical line is mapped to itself under the
function $f$. We can see this geometrically because given a
vertical line in the Complex Plane, each point is just moved up
one unit along that same vertical line. The vertical line in the
figure on the right has equation $x = 1$. If we let $S$ be the set of
points on the line $x = 1$, then we see that $f[S] = S$. In fact, the function $f$ maps $S$ bijectively
onto $S$. It might be more precise to say that $f$ maps the vertical line $x = 1$ in the $xy$-plane to
the vertical line $u = 1$ in the $uv$-plane.

If a subset $S$ of $\mathbb{C}$ satisfies $f[S] \subseteq S$, we will say that $S$ is invariant under the function $f$. If
$f[S] = S$, then we will say that $S$ is surjectively invariant under $f$. So, in this example, we see
that any vertical line is surjectively invariant under $f$.

A horizontal line, however, is not invariant under the function $f$. For example, the horizontal
line $y = 1$ in the $xy$-plane is mapped bijectively to the horizontal line $v = 2$ in the $uv$-plane.
We can visualize this mapping as follows:

[Figure: the horizontal line $y = 1$ in the $xy$-plane and its image, the horizontal line $v = 2$, in the $uv$-plane.]

In fact, for any “shape” in the $xy$-plane, after applying the function $f$, we wind up with the same
shape shifted up 1 unit in the $uv$-plane. We can even think of this function as shifting the whole
plane up 1 unit. More specifically, the image of the $xy$-plane under $f$ is the entire $uv$-plane,
where each point in the $xy$-plane is mapped to the point in the $uv$-plane that is shifted up 1
unit from the original point. So, $\mathbb{C}$ is surjectively invariant under $f$.
2. Let $g(z) = \bar{z}$.

If we write $z = x + yi$, then we have $g(x + yi) = x - yi$.

So, $u(x, y) = x$ and $v(x, y) = -y$.

Geometrically, $g$ is a reflection in the $x$-axis (or real axis). It
takes any point $(x, y)$ in the Complex Plane and reflects it
through the $x$-axis to the point $(x, -y)$. For example, the point
$(1, 2)$ is reflected through the $x$-axis to the point $(1, -2)$ under
the function $g$ because $g(1 + 2i) = 1 - 2i$. We can see this in
the figure to the right.

Observe that the $x$-axis is invariant under $g$. To see this, note
that any point on the $x$-axis has the form $(a, 0)$ for some $a \in \mathbb{R}$
and $g(a + 0i) = a - 0i = a = a + 0i$. Notice that $g$ actually
maps each point on the $x$-axis to itself. Therefore, we call each
point on the $x$-axis a fixed point of $g$.

It’s not hard to see that the subsets of $\mathbb{C}$ that are invariant under
$g$ are precisely the subsets that are symmetric with respect to
the $x$-axis. However, points above and below the $x$-axis are not
fixed points of $g$, as they are reflected across the $x$-axis. The
figure below should help to visualize this. Note that in this
example, invariant is equivalent to surjectively invariant.

In the figure, the rectangle displayed is invariant under $g$. The fixed points of $g$ in the rectangle
are the points on the $x$-axis. We see that points below the $x$-axis in the $xy$-plane are mapped
to points above the $u$-axis in the $uv$-plane. A typical point below the $x$-axis and its image under
$g$ above the $u$-axis are shown. Similarly, points above the $x$-axis in the $xy$-plane are mapped to
points below the $u$-axis in the $uv$-plane.
3. Let $h(z) = iz$.

If we write $z = x + yi$, then we have $h(x + yi) = i(x + yi) = xi + yi^2 = xi - y = -y + xi$.
So, the function $h$ takes any point $(x, y)$ to the point $(-y, x)$. To understand what this means
geometrically, it is useful to analyze what the image looks like in exponential form.

If we write $z = re^{i\theta}$, then we have $h(re^{i\theta}) = i(re^{i\theta}) = e^{i\frac{\pi}{2}}(re^{i\theta}) = re^{i\theta}e^{i\frac{\pi}{2}} = re^{i\left(\theta + \frac{\pi}{2}\right)}$.

Notice that $r$ remains unchanged under this transformation. So, $h(z)$ is the same distance
from the origin as $z$. However, the angle changes from $\theta$ to $\theta + \frac{\pi}{2}$. Geometrically, $h$ is a
rotation about the origin by $\frac{\pi}{2}$ radians, or equivalently, 90°. As an example, the point
$(1, 1)$ is rotated 90° about the origin to the point $(-1, 1)$ (see the figure to the right). We
can see this in one of two ways. If we use the standard form of $1 + i$, then we have
$h(1 + i) = -1 + i$. If we use exponential form, then by Example 15.4, $1 + i = \sqrt2\,e^{\frac{\pi}{4}i}$. So,
$h\left(\sqrt2\,e^{\frac{\pi}{4}i}\right) = \sqrt2\,e^{\left(\frac{\pi}{4} + \frac{\pi}{2}\right)i} = \sqrt2\,e^{\frac{3\pi}{4}i}$. Therefore,
we have $x = \sqrt2\cos\frac{3\pi}{4} = \sqrt2\left(-\frac{1}{\sqrt2}\right) = -1$ and
$y = \sqrt2\sin\frac{3\pi}{4} = \sqrt2\left(\frac{1}{\sqrt2}\right) = 1$. So, once again, $h(1 + i) = -1 + 1i = -1 + i$.

Observe that any circle centered at the origin is surjectively invariant under $h$ and the only fixed
point of $h$ is the origin.

4. Let $k(z) = z^2$.

If we write $z = re^{i\theta}$, then we have $k(z) = (re^{i\theta})^2 = r^2(e^{i\theta})^2 = r^2e^{i(2\theta)}$ by De Moivre’s
Theorem.

Under this function, the modulus of the complex number is squared and the argument is
doubled. As an example, let’s see what happens to the point $(1, 1)$ under this function.
Changing to exponential form, by Example 15.4, we have $1 + i = \sqrt2\,e^{\frac{\pi}{4}i}$. So, $k(1 + i) = 2e^{\frac{\pi}{2}i}$.
We see that the modulus of $k(1 + i)$ is 2 and the argument of $k(1 + i)$ is $\frac{\pi}{2}$. So, in the Complex
Plane, this is the point that is 2 units from the origin on the positive $y$-axis (because
$W\left(\frac{\pi}{2}\right) = (0, 1)$ and $(0, 1)$ lies on the positive $y$-axis). In standard form, we have $k(1 + i) = 2i$.

The only fixed points of $k$ are $z = 0$ and $z = 1$. To see this, note that if $r^2e^{i(2\theta)} = re^{i\theta}$, then
$r^2 = r$ and $2\theta = \theta + 2n\pi$ for some $n \in \mathbb{Z}$. The equation $r^2 = r$ is equivalent to $r^2 - r = 0$ or
$r(r - 1) = 0$. So, $r = 0$ or $r = 1$. If $r = 0$, then $z = 0$. So, assume $r = 1$. We see that
$2\theta = \theta + 2n\pi$ is equivalent to $\theta = 2n\pi$. So, $z = 1 \cdot e^{i(2n\pi)} = e^0 = 1$.

Observe that the unit circle is surjectively invariant under $k$. To see this, first note that if
$z = e^{i\theta}$ lies on the unit circle, then $r = 1$ and $k(z) = e^{i(2\theta)}$, which also has modulus 1.
Furthermore, every point on the unit circle has the form $z = e^{i\theta}$ and $k\left(e^{i\frac{\theta}{2}}\right) = \left(e^{i\frac{\theta}{2}}\right)^2 = e^{i\theta}$
by De Moivre’s Theorem.

What other subsets of $\mathbb{C}$ are surjectively invariant under $k$? Here are a few:

• The positive real axis: $\{z \in \mathbb{C} \mid \operatorname{Re} z > 0 \wedge \operatorname{Im} z = 0\}$

• The open unit disk: $\{z \in \mathbb{C} \mid |z| < 1\}$

• The complement of the open unit disk: $\{z \in \mathbb{C} \mid |z| \ge 1\}$

The dedicated reader should prove that these sets are surjectively invariant under $k$. Are there
any other sets that are surjectively invariant under $k$? What about sets that are invariant, but
not surjectively invariant?
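The four maps of Example 15.7 (translation up by $i$, conjugation, multiplication by $i$, and squaring) are easy to experiment with using Python's built-in complex numbers. This is only an illustrative sketch; the Python function names below are our own descriptive labels, not notation from the text.

```python
import cmath

def translate_up(z: complex) -> complex:
    return z + 1j                 # shifts every point up one unit

def reflect(z: complex) -> complex:
    return z.conjugate()          # reflection in the real axis

def quarter_turn(z: complex) -> complex:
    return 1j * z                 # rotation by pi/2 about the origin

def square(z: complex) -> complex:
    return z ** 2                 # squares the modulus, doubles the argument

print(translate_up(1 + 2j))       # the point (1, 2) moves to (1, 3)
print(reflect(1 + 2j))            # the point (1, 2) moves to (1, -2)
print(quarter_turn(1 + 1j))       # the point (1, 1) moves to (-1, 1)
print(square(1 + 1j))             # the point (1, 1) moves to (0, 2)

# A point on the unit circle stays on the unit circle under squaring.
on_circle = cmath.exp(0.3j)
modulus_after = abs(square(on_circle))
```

Changing the sample points is a quick way to test guesses about which sets are invariant under each map before trying to prove them.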

Limits and Continuity

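Let $A \subseteq \mathbb{C}$, let $f: A \to \mathbb{C}$, let $L \in \mathbb{C}$, and let $a \in \mathbb{C}$ be a point such that $A$ contains some deleted
neighborhood of $a$. We say that the limit of $f$ as $z$ approaches $a$ is $L$, written $\lim_{z \to a} f(z) = L$, if for every
positive number $\varepsilon$, there is a positive number $\delta$ such that $0 < |z - a| < \delta \to |f(z) - L| < \varepsilon$.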

Notes: (1) The statement of this definition of limit is essentially the same as the statement of the $\varepsilon$-$\delta$
definition of a limit of a real-valued function (see Lesson 13). However, the geometry looks very
different.

For a real-valued function, a deleted neighborhood of $a$ has the form $N_\varepsilon^{\odot}(a) = (a - \varepsilon, a) \cup (a, a + \varepsilon)$
and we can visualize this neighborhood as follows:

[Figure: the real line with the interval from $a - \varepsilon$ to $a + \varepsilon$ marked and the point $a$ removed.]

For a complex-valued function, a deleted neighborhood of $a$, say
$N_\varepsilon^{\odot}(a) = \{z \in \mathbb{C} \mid 0 < |z - a| < \varepsilon\}$, is a punctured disk with center $a$.
We can see a visualization of such a neighborhood to the right.

(2) In ℝ, there is a simple one-to-one correspondence between
neighborhoods (open intervals) and (vertical or horizontal) strips.

In ℂ there is no such correspondence. Therefore, for complex-valued
functions, we start right away with the $\varepsilon$-$\delta$ definition.

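(3) Recall that in ℝ, the expression $|x - a| < \delta$ is equivalent to
$a - \delta < x < a + \delta$, or $x \in (a - \delta, a + \delta)$.

Also, the expression $0 < |x - a|$ is equivalent to $x - a \neq 0$, or $x \neq a$.

Therefore, $0 < |x - a| < \delta$ is equivalent to $x \in (a - \delta, a) \cup (a, a + \delta)$.

In ℂ, if we let $z = x + yi$ and $a = b + ci$, then

$$|z - a| = |(x + yi) - (b + ci)| = |(x - b) + (y - c)i| = \sqrt{(x - b)^2 + (y - c)^2}.$$

So, $|z - a| < \delta$ is equivalent to $(x - b)^2 + (y - c)^2 < \delta^2$. In other words, $(x, y)$ is inside the disk with
center $(b, c)$ and radius $\delta$.

Also, we have

$$0 < |z - a| \Leftrightarrow (x - b)^2 + (y - c)^2 \neq 0 \Leftrightarrow x - b \neq 0 \text{ or } y - c \neq 0 \Leftrightarrow x \neq b \text{ or } y \neq c \Leftrightarrow z \neq a.$$

Therefore, $0 < |z - a| < \delta$ is equivalent to “$z$ is in the punctured disk with center $a$ and radius $\delta$.”

(4) Similarly, in ℝ, we have that $|f(x) - L| < \varepsilon$ is equivalent to $f(x) \in (L - \varepsilon, L + \varepsilon)$, while in ℂ, we
have $|f(z) - L| < \varepsilon$ is equivalent to “$f(z)$ is in the disk with center $L$ and radius $\varepsilon$.”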

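(5) Just like for real-valued functions, we can think of determining if $\lim_{z \to a} f(z) = L$ as the result of an
$\varepsilon$-$\delta$ game. Player 1 “attacks” by choosing a positive number $\varepsilon$. This is equivalent to Player 1 choosing
the disk $N_\varepsilon(L) = \{w \in \mathbb{C} \mid |w - L| < \varepsilon\}$.

[Figure: the disk $N_\varepsilon(L)$.]

Player 2 then tries to “defend” by finding a positive number $\delta$. This is equivalent to Player 2 choosing
the punctured disk $N_\delta^{\odot}(a) = \{z \in \mathbb{C} \mid 0 < |z - a| < \delta\}$.

[Figure: the punctured disk $N_\delta^{\odot}(a)$ and the disk $N_\varepsilon(L)$.]

The defense is successful if $z \in N_\delta^{\odot}(a)$ implies $f(z) \in N_\varepsilon(L)$, or equivalently, $f[N_\delta^{\odot}(a)] \subseteq N_\varepsilon(L)$.

[Figure: the punctured disk $N_\delta^{\odot}(a)$ mapped by $f$ into the disk $N_\varepsilon(L)$.]

If Player 2 defends successfully, then Player 1 chooses a new positive number $\varepsilon'$, or equivalently, a new
neighborhood $N_{\varepsilon'}(L) = \{w \in \mathbb{C} \mid |w - L| < \varepsilon'\}$. If Player 1 is smart, then he/she will choose $\varepsilon'$ to be
less than $\varepsilon$ (otherwise, Player 2 can use the same $\delta$). The smaller the value of $\varepsilon'$, the smaller the
neighborhood $N_{\varepsilon'}(L)$, and the harder it will be for Player 2 to defend. Player 2 once again tries to
choose a positive number $\delta'$ so that $f[N_{\delta'}^{\odot}(a)] \subseteq N_{\varepsilon'}(L)$. This process continues indefinitely. Player 1
wins the $\varepsilon$-$\delta$ game if at some stage, Player 2 cannot defend successfully. Player 2 wins the $\varepsilon$-$\delta$
game if he or she defends successfully at every stage.

(6) If for a given $\varepsilon > 0$, we have found a $\delta > 0$ such that $f[N_\delta^{\odot}(a)] \subseteq N_\varepsilon(L)$, then any positive number
smaller than $\delta$ works as well. Indeed, if $0 < \delta' < \delta$, then $N_{\delta'}^{\odot}(a) \subseteq N_\delta^{\odot}(a)$. It then follows that
$f[N_{\delta'}^{\odot}(a)] \subseteq f[N_\delta^{\odot}(a)] \subseteq N_\varepsilon(L)$.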

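Example 15.8: Let’s use the $\varepsilon$-$\delta$ definition of limit to prove that $\lim_{z \to 3+6i}\left(\frac{iz}{3} + 2\right) = i$.

Analysis: Given $\varepsilon > 0$, we will find $\delta > 0$ so that $0 < |z - (3 + 6i)| < \delta$ implies $\left|\left(\frac{iz}{3} + 2\right) - i\right| < \varepsilon$.

First note that

$$\left|\left(\tfrac{iz}{3} + 2\right) - i\right| = \left|\tfrac13(iz + 6) - \tfrac13(3i)\right| = \left|\tfrac13 i(z - 6i - 3)\right| = \left|\tfrac13 i\right||z - 3 - 6i| = \tfrac13|z - (3 + 6i)|.$$

So, $\left|\left(\frac{iz}{3} + 2\right) - i\right| < \varepsilon$ is equivalent to $|z - (3 + 6i)| < 3\varepsilon$. Therefore, $\delta = 3\varepsilon$ should work.

Proof: Let $\varepsilon > 0$ and let $\delta = 3\varepsilon$. Suppose that $0 < |z - (3 + 6i)| < \delta$. Then we have

$$\left|\left(\tfrac{iz}{3} + 2\right) - i\right| = \tfrac13|z - (3 + 6i)| < \tfrac13\delta = \tfrac13(3\varepsilon) = \varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, we have $\forall\varepsilon > 0\ \exists\delta > 0\ \left(0 < |z - (3 + 6i)| < \delta \to \left|\left(\frac{iz}{3} + 2\right) - i\right| < \varepsilon\right)$.

Therefore, $\lim_{z \to 3+6i}\left(\frac{iz}{3} + 2\right) = i$. □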

Example 15.9: Let’s use the $\varepsilon$-$\delta$ definition of limit to prove that $\lim_{z \to i} z^2 = -1$.

Analysis: Given $\varepsilon > 0$, we need to find $\delta > 0$ so that $0 < |z - i| < \delta$ implies $|z^2 - (-1)| < \varepsilon$. First
note that $|z^2 - (-1)| = |z^2 + 1| = |(z - i)(z + i)| = |z - i||z + i|$. Therefore, $|z^2 - (-1)| < \varepsilon$ is
equivalent to $|z - i||z + i| < \varepsilon$.

As in Example 13.9 from Lesson 13, $|z - i|$ is not an issue because we’re going to be choosing $\delta$ so that
this expression is small enough. But to make the argument work we also need to keep $|z + i|$ from getting too large.
Remember from Note 6 above that if we find a value for $\delta$ that works, then any smaller positive number
will work too. This allows us to start by assuming that $\delta$ is smaller than any positive number we choose.
So, let’s just assume that $\delta \le 1$ and see what effect that has on $|z + i|$.

Well, if $\delta \le 1$ and $0 < |z - i| < \delta$, then $|z + i| = |(z - i) + 2i| \le |z - i| + |2i| < 1 + 2 = 3$. Here
we used the Standard Advanced Calculus Trick (SACT) from Note 7 after Example 4.5 in Lesson 4,
followed by the Triangle Inequality (Theorem 7.3), and then the computation $|2i| = |2||i| = 2 \cdot 1 = 2$.

So, if we assume that $\delta \le 1$, then $|z^2 - (-1)| = |z - i||z + i| < \delta \cdot 3 = 3\delta$. Therefore, if we want to
make sure that $|z^2 - (-1)| < \varepsilon$, then it suffices to choose $\delta$ so that $3\delta \le \varepsilon$, as long as we also have
$\delta \le 1$. So, we will let $\delta = \min\left\{1, \frac{\varepsilon}{3}\right\}$.

Proof: Let $\varepsilon > 0$ and let $\delta = \min\left\{1, \frac{\varepsilon}{3}\right\}$. Suppose that $0 < |z - i| < \delta$. Then since $\delta \le 1$, we have
$|z + i| = |(z - i) + 2i| \le |z - i| + |2i| = |z - i| + |2||i| = |z - i| + 2 < 1 + 2 = 3$, and therefore,

$$|z^2 - (-1)| = |z^2 + 1| = |(z - i)(z + i)| = |z - i||z + i| < \delta \cdot 3 \le \tfrac{\varepsilon}{3} \cdot 3 = \varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, we have $\forall\varepsilon > 0\ \exists\delta > 0\ (0 < |z - i| < \delta \to |z^2 - (-1)| < \varepsilon)$. Therefore,
$\lim_{z \to i} z^2 = -1$. □
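The choices of $\delta$ in Examples 15.8 and 15.9 can be spot-checked on sample points of the punctured disk. The sketch below is not a proof (it only tests finitely many points), and the function name `delta_works`, the sample count, and the way directions are spread out are all our own assumptions.

```python
import cmath
import math

def delta_works(f, a, L, eps, delta, samples=2000):
    """Spot-check that 0 < |z - a| < delta implies |f(z) - L| < eps on sample points."""
    for j in range(samples):
        radius = delta * (j + 1) / (samples + 1)                 # strictly between 0 and delta
        angle = 2 * math.pi * ((j * 0.618033988749895) % 1.0)    # spread out the directions
        z = a + cmath.rect(radius, angle)
        if not abs(f(z) - L) < eps:
            return False
    return True

def affine(z):
    return 1j * z / 3 + 2      # the function from Example 15.8

def square(z):
    return z ** 2              # the function from Example 15.9

# Example 15.8 uses delta = 3*eps; Example 15.9 uses delta = min(1, eps/3).
ok_affine = all(delta_works(affine, 3 + 6j, 1j, eps, 3 * eps) for eps in (1.0, 0.1, 0.001))
ok_square = all(delta_works(square, 1j, -1, eps, min(1, eps / 3)) for eps in (5.0, 1.0, 0.001))
print(ok_affine, ok_square)
```

Trying a larger $\delta$ for the squaring function, such as $\delta = \varepsilon = 1$, is a good way to see why the extra factor of 3 is needed.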

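Theorem 15.3: If $\lim_{z \to a} f(z)$ exists, then it is unique.

Proof: Suppose that $\lim_{z \to a} f(z) = L$ and $\lim_{z \to a} f(z) = K$. Let $\varepsilon > 0$. Since $\lim_{z \to a} f(z) = L$, we can find
$\delta_1 > 0$ such that $0 < |z - a| < \delta_1 \to |f(z) - L| < \frac{\varepsilon}{2}$. Since $\lim_{z \to a} f(z) = K$, we can find $\delta_2 > 0$ such that
$0 < |z - a| < \delta_2 \to |f(z) - K| < \frac{\varepsilon}{2}$. Let $\delta = \min\{\delta_1, \delta_2\}$. Suppose that $0 < |z - a| < \delta$. Then

$$|L - K| = |(f(z) - K) - (f(z) - L)| \text{ (SACT)} \le |f(z) - K| + |f(z) - L| \text{ (TI)} < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon.$$

Since $\varepsilon$ was an arbitrary positive real number, by Problem 8 from Lesson 5, we have $|L - K| = 0$. So,
$L - K = 0$, and therefore, $L = K$. □

Note: SACT stands for the Standard Advanced Calculus Trick and TI stands for the Triangle Inequality.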

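Example 15.10: Let’s show that $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ does not exist.

Proof: If we consider complex numbers of the form $x + 0i$ (with $x \neq 0$), $\left(\frac{z}{\bar{z}}\right)^2 = \left(\frac{x + 0i}{x - 0i}\right)^2 = \left(\frac{x}{x}\right)^2 = 1^2 = 1$. Since
every deleted neighborhood of 0 contains points of the form $x + 0i$, we see that if $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ exists, it
must be equal to 1.

Next, let’s consider complex numbers of the form $x + xi$ (with $x \neq 0$). In this case, $\left(\frac{z}{\bar{z}}\right)^2 = \left(\frac{x + xi}{x - xi}\right)^2 = \frac{2x^2i}{-2x^2i} = -1$.
Since every deleted neighborhood of 0 contains points of the form $x + xi$, we see that if $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ exists,
it must be equal to $-1$.

By Theorem 15.3, the limit does not exist. □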
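The two-path argument in Example 15.10 can be illustrated numerically, assuming the function in question is $\left(\frac{z}{\bar{z}}\right)^2$. This is only a sketch; the name `q` and the sample values of `t` are arbitrary choices of ours.

```python
def q(z: complex) -> complex:
    """The quotient (z / conjugate(z)) squared, defined for z != 0."""
    return (z / z.conjugate()) ** 2

# Approach 0 along the real axis (points t + 0i) and along the line y = x (points t + ti).
along_real_axis = [q(complex(t, 0)) for t in (0.1, 0.01, 0.001)]
along_diagonal = [q(complex(t, t)) for t in (0.1, 0.01, 0.001)]

print(along_real_axis)
print(along_diagonal)
```

The first list should stay near $1$ and the second near $-1$ no matter how small `t` becomes, which is consistent with the limit failing to exist.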

Define $d: \mathbb{C} \times \mathbb{C} \to \mathbb{R}$ by $d(z, w) = |z - w|$. By Example 14.10 (part 1), $(\mathbb{C}, d)$ is a metric space. So, by
Theorem 14.4, we have the following definition of continuity for complex-valued functions:

Let $A \subseteq \mathbb{C}$, let $f: A \to \mathbb{C}$, and let $a \in A$ be a point such that $A$ contains some neighborhood of $a$. $f$ is
continuous at $a$ if and only if for every positive number $\varepsilon$, there is a positive number $\delta$ such that

$$|z - a| < \delta \to |f(z) - f(a)| < \varepsilon.$$

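Example 15.11: Let $f: \mathbb{C} \to \mathbb{C}$ be defined by $f(z) = \frac{iz}{3} + 2$. In Example 15.8, we showed that
$\lim_{z \to 3+6i} f(z) = i$. Since $f(3 + 6i) = \frac{i(3 + 6i)}{3} + 2 = \frac{3i - 6}{3} + 2 = \frac{3(i - 2)}{3} + 2 = i - 2 + 2 = i$, we see from
the proof in Example 15.8 that if $|z - (3 + 6i)| < \delta$, then $|f(z) - f(3 + 6i)| = \left|\left(\frac{iz}{3} + 2\right) - i\right| < \varepsilon$. It
follows that $f$ is continuous at $3 + 6i$.

More generally, let’s show that for all $a \in \mathbb{C}$, $f$ is continuous at $a$.

Proof: Let $a \in \mathbb{C}$, let $\varepsilon > 0$ and let $\delta = 3\varepsilon$. Suppose that $|z - a| < \delta$. Then we have

$$|f(z) - f(a)| = \left|\left(\tfrac{iz}{3} + 2\right) - \left(\tfrac{ia}{3} + 2\right)\right| = \left|\tfrac{i}{3}(z - a)\right| = \left|\tfrac{i}{3}\right||z - a| < \tfrac13\delta = \tfrac13(3\varepsilon) = \varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, we have $\forall\varepsilon > 0\ \exists\delta > 0\ (|z - a| < \delta \to |f(z) - f(a)| < \varepsilon)$.

Therefore, $f$ is continuous at $a$. □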

Notes: (1) We proved $\forall a \in \mathbb{C}\ \forall\varepsilon > 0\ \exists\delta > 0\ \forall z \in \mathbb{C}\ (|z - a| < \delta \to |f(z) - f(a)| < \varepsilon)$. In words, we
proved that for every complex number $a$, given a positive real number $\varepsilon$, we can find a positive real
number $\delta$ such that whenever the distance between $z$ and $a$ is less than $\delta$, the distance between $f(z)$
and $f(a)$ is less than $\varepsilon$. And of course, a simpler way to say this is “for every complex number $a$, $f$ is
continuous at $a$,” or “$\forall a \in \mathbb{C}$ ($f$ is continuous at $a$).”

(2) If we move the expression $\forall a \in \mathbb{C}$ next to $\forall z \in \mathbb{C}$, we get a concept that is stronger than continuity.
We say that a function $f: A \to \mathbb{C}$ is uniformly continuous on $A$ if

$$\forall\varepsilon > 0\ \exists\delta > 0\ \forall a, z \in A\ (|z - a| < \delta \to |f(z) - f(a)| < \varepsilon).$$

(3) As a quick example of uniform continuity, let’s prove that the function $f: \mathbb{C} \to \mathbb{C}$ defined by
$f(z) = \frac{iz}{3} + 2$ is uniformly continuous on $\mathbb{C}$.

New proof: Let $\varepsilon > 0$ and let $\delta = 3\varepsilon$. Let $a, z \in \mathbb{C}$ and suppose that $|z - a| < \delta$. Then we have

$$|f(z) - f(a)| = \left|\left(\tfrac{iz}{3} + 2\right) - \left(\tfrac{ia}{3} + 2\right)\right| = \left|\tfrac{i}{3}(z - a)\right| = \left|\tfrac{i}{3}\right||z - a| < \tfrac13\delta = \tfrac13 \cdot 3\varepsilon = \varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, we have $\forall\varepsilon > 0\ \exists\delta > 0\ \forall a, z \in \mathbb{C}\ (|z - a| < \delta \to |f(z) - f(a)| < \varepsilon)$.
Therefore, $f$ is uniformly continuous.

(4) The difference between continuity and uniform continuity on a set $A$ can be described as follows:
In both cases, an $\varepsilon$ is given and then a $\delta$ is chosen. For continuity, for each value of $a$, we can choose a
different $\delta$. For uniform continuity, once we choose a $\delta$ for some value of $a$, we need to be able to use
the same $\delta$ for every other value of $a$ in $A$.

In terms of disks, once a disk of radius $\varepsilon$ is given, we need to be more careful how we choose our disk
of radius $\delta$. As we check different values of $a$, we can translate our chosen disk as much as we like around
the $xy$-plane. However, we are not allowed to decrease the radius of the disk.

The Riemann Sphere

We have used the symbols – ∞ and ∞ (or +∞) to describe unbounded intervals of real numbers, as
well as certain limits of real-valued functions. These symbols are used to express a notion of “infinity.”
If we pretend for a moment that we are standing on the real line at 0, and we begin walking to the
right, continuing indefinitely, then we might say we are walking toward ∞. Similarly, if we begin walking
to the left instead, continuing indefinitely, then we might say we are walking toward – ∞.

–∞ ∞

We would like to come up with a coherent notion of infinity with respect to the Complex Plane. There
is certainly more than one way to do this. A method that is most analogous to the picture described
above would be to define a set of infinities $\{\infty_\theta \mid 0 \le \theta < 2\pi\}$, the idea being that for each angle $\theta$ in
standard position, we have an infinity, $\infty_\theta$, describing where we would be headed if we were to start
at the origin and then begin walking along the terminal ray of $\theta$, continuing indefinitely.

The method in the previous paragraph, although acceptable, has the disadvantage of having to deal
with uncountably many “infinities.” Instead, we will explore a different notion that involves just a single
point at infinity. The idea is relatively simple. Pretend you have a large sheet of paper balancing on the
palm of your hand. The sheet of paper represents the Complex plane with the origin right at the center
of your palm. The palm of your hand itself represents the unit circle together with its interior.

Now, imagine using the pointer finger on your other hand to press down on the origin of that sheet of
paper (the Complex Plane), forcing your hand to form a unit sphere (reshaping the Complex Plane into
a unit sphere as well). Notice that the origin becomes the “south pole” of the sphere, while all the
“infinities” described in the last paragraph are forced together at the “north pole” of the sphere. Also,
notice that the unit circle stays fixed, the points interior to the unit circle form the lower half of the
sphere, and the points exterior to the unit circle form the upper half of the sphere with the exception
of the “north pole.”

When we visualize the unit sphere in this way, we refer to it as the Riemann Sphere.

Let’s let $\mathbb{S}^2$ be the Riemann Sphere and let’s officially define the north pole and south pole of $\mathbb{S}^2$ to be
the points $N = (0, 0, 1)$ and $S = (0, 0, -1)$, respectively.

Also, since $\mathbb{S}^2$ is a subset of three-dimensional space (formally
known as $\mathbb{R}^3$), while $\mathbb{C}$ is only two dimensional, let’s identify $\mathbb{C}$
with $\mathbb{C} \times \{0\}$ so that we write points in the Complex Plane as
$(x, y, 0)$ instead of $(x, y)$. We can then visualize the Complex
Plane as intersecting the Riemann Sphere in the unit circle. To
the right we have a picture of the Riemann Sphere together with
the Complex Plane.

[Figure: the Riemann Sphere $\mathbb{S}^2$ with north pole $N = (0, 0, 1)$ and south pole $S = (0, 0, -1)$, intersected by the Complex Plane $\mathbb{C}$.]

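For each point $z$ in the Complex Plane, consider the line passing through the points $N$ and $z$. This line
intersects $\mathbb{S}^2$ in exactly one point $P_z$ other than $N$. This observation allows us to define a bijection $f: \mathbb{C} \to \mathbb{S}^2 \setminus \{N\}$
defined by $f(z) = P_z$. An explicit definition of $f$ can be given by

$$f(z) = \left(\frac{z + \bar{z}}{1 + |z|^2},\ \frac{z - \bar{z}}{(1 + |z|^2)i},\ \frac{|z|^2 - 1}{|z|^2 + 1}\right)$$

Below is a picture of a point $z$ in the Complex Plane and its image $f(z) = P_z$ on the Riemann Sphere.

[Figure: a point $z$ in the Complex Plane and its image $f(z) = P_z$ on the Riemann Sphere.]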
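The explicit formula above can be explored numerically. The sketch below assumes the three coordinates are $\frac{z + \bar{z}}{1 + |z|^2}$, $\frac{z - \bar{z}}{(1 + |z|^2)i}$, and $\frac{|z|^2 - 1}{|z|^2 + 1}$; the function name `to_sphere` is our own, and all values are floating-point approximations.

```python
def to_sphere(z: complex) -> tuple[float, float, float]:
    """Hypothetical helper: image of z on the unit sphere under the formula above."""
    m = abs(z) ** 2                                      # |z|^2
    x1 = ((z + z.conjugate()) / (1 + m)).real            # equals 2x / (1 + |z|^2)
    x2 = ((z - z.conjugate()) / (1j * (1 + m))).real     # equals 2y / (1 + |z|^2)
    x3 = (m - 1) / (m + 1)
    return (x1, x2, x3)

p = to_sphere(1 + 2j)
norm_squared = sum(c * c for c in p)    # should be 1 for a point on the unit sphere

south = to_sphere(0j)                   # the origin should land on the south pole
on_equator = to_sphere(1 + 0j)          # points of the unit circle should stay fixed

print(p, norm_squared, south, on_equator)
```

This matches the informal picture from earlier in the section: the origin goes to the south pole, the unit circle stays where it is, and points far from the origin land close to the north pole.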

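In Challenge Problem 21 below, you will be asked to verify that $f$ is a homeomorphism. If we let
$\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, then we can extend $f$ to a function $\bar{f}: \overline{\mathbb{C}} \to \mathbb{S}^2$ by defining $\bar{f}(\infty) = N$. $\overline{\mathbb{C}}$ is called the
Extended Complex Plane. If we let $\mathcal{T}$ consist of all sets $U \subseteq \overline{\mathbb{C}}$ that are either open in $\mathbb{C}$ or have the
form $U = V \cup \{\infty\}$, where $V$ is the complement of a closed and bounded set in $\mathbb{C}$, then $\mathcal{T}$ defines a
topology on $\overline{\mathbb{C}}$, and $\bar{f}$ is a homeomorphism from $(\overline{\mathbb{C}}, \mathcal{T})$ to $(\mathbb{S}^2, \mathcal{U}_{\mathbb{S}^2})$, where $\mathcal{U}$ is the product topology
on $\mathbb{R}^3$ with respect to the standard topology on $\mathbb{R}$.

Note: Subspace and product topologies were defined in Problems 8 and 11 in Lesson 14.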

If $\varepsilon$ is a small positive number, then $\frac{1}{\varepsilon}$ is a large positive number. We see that the set $N_{\frac{1}{\varepsilon}} = \left\{z \mid |z| > \frac{1}{\varepsilon}\right\}$
is a neighborhood of ∞ in the following sense. Notice that $N_{\frac{1}{\varepsilon}}$ consists of all points outside of the circle
of radius $\frac{1}{\varepsilon}$ centered at the origin. The image of this set under $f$ is a deleted neighborhood of $N$.

We can now extend our definition of limit to include various infinite cases. We will do one example
here and you will look at others in Problem 18 below.

$\lim_{z \to a} f(z) = \infty$ if and only if $\forall\varepsilon > 0\ \exists\delta > 0\ \left(0 < |z - a| < \delta \to |f(z)| > \frac{1}{\varepsilon}\right)$.

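Theorem 15.4: $\lim_{z \to a} f(z) = \infty$ if and only if $\lim_{z \to a} \frac{1}{f(z)} = 0$.

Proof: Suppose $\lim_{z \to a} f(z) = \infty$ and let $\varepsilon > 0$. There is $\delta > 0$ so that $0 < |z - a| < \delta \to |f(z)| > \frac{1}{\varepsilon}$. But,
$|f(z)| > \frac{1}{\varepsilon}$ is equivalent to $\left|\frac{1}{f(z)} - 0\right| < \varepsilon$. So, $\lim_{z \to a} \frac{1}{f(z)} = 0$.

Now, suppose $\lim_{z \to a} \frac{1}{f(z)} = 0$ and let $\varepsilon > 0$. There is $\delta > 0$ so that $0 < |z - a| < \delta \to \left|\frac{1}{f(z)} - 0\right| < \varepsilon$. But,
$\left|\frac{1}{f(z)} - 0\right| < \varepsilon$ is equivalent to $|f(z)| > \frac{1}{\varepsilon}$. So, $\lim_{z \to a} f(z) = \infty$. □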

Problem Set 15

Full solutions to these problems are available for free download here:

www.SATPrepGet800.com/PMFBXSG

LEVEL 1

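1. In Problems 11 and 12 below, you will be asked to show that $W\left(\frac{\pi}{3}\right) = \left(\frac12, \frac{\sqrt3}{2}\right)$ and
$W\left(\frac{\pi}{6}\right) = \left(\frac{\sqrt3}{2}, \frac12\right)$. Use this information to compute the sine, cosine, and tangent of each of the
following angles:
(i) $\frac{\pi}{6}$
(ii) $\frac{\pi}{3}$
(iii) $\frac{2\pi}{3}$
(iv) $\frac{5\pi}{6}$
(v) $\frac{7\pi}{6}$
(vi) $\frac{4\pi}{3}$
(vii) $\frac{5\pi}{3}$
(viii) $\frac{11\pi}{6}$

2. Use the sum identities (Theorem 15.1) to compute the cosine, sine, and tangent of each of the
following angles:
(i) $\frac{5\pi}{12}$
(ii) $\frac{\pi}{12}$
(iii) $\frac{11\pi}{12}$
(iv) $\frac{19\pi}{12}$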

LEVEL 2

3. Each of the following complex numbers is written in exponential form. Rewrite each complex
number in standard form:

(i)
(ii) –52
(iii) 3 4
(iv) 2 3
(v) √2 76
(vi) –54

19

(vii) 12

4. Each of the following complex numbers is written in standard form. Rewrite each complex
number in exponential form:

(i) $-1 - i$

(ii) $\sqrt3 + i$

(iii) $1 - \sqrt3\,i$

(iv) $\left(\frac{\sqrt6 + \sqrt2}{4}\right) + \left(\frac{\sqrt6 - \sqrt2}{4}\right)i$

5. Write the following complex numbers in standard form:

(i) $\left(\frac{\sqrt2}{2} + \frac{\sqrt2}{2}i\right)^4$

(ii) $(1 + \sqrt3\,i)^5$

LEVEL 3

6. Use De Moivre’s Theorem to prove the following identities:
(i) $\cos 2\theta = \cos^2\theta - \sin^2\theta$
(ii) $\sin 2\theta = 2\sin\theta\cos\theta$
(iii) $\cos 3\theta = \cos^3\theta - 3\cos\theta\sin^2\theta$

7. Suppose that = and = are complex numbers written in exponential form. Express
each of the following in exponential form. Provide a proof in each case:
(i)
(ii)


8. Write each function in the form $f(z) = u(x, y) + iv(x, y)$ and $f(z) = u(r, \theta) + iv(r, \theta)$:
(i) $f(z) = 2z^2 - 5$
(ii) $f(z) = \frac{1}{z}$
(iii) $f(z) = z^3 + z^2 + z + 1$

9. Let $f(z) = x^2 - y^2 - 2y + 2x(y + 1)i$. Rewrite $f(z)$ in terms of $z$.

10. Find all complex numbers that satisfy the given equation:
(i) $z^6 - 1 = 0$
(ii) $z^4 + 4 = 0$

LEVEL 4

11. Consider triangle $AOP$, where $O = (0, 0)$, $A = (1, 0)$, and $P$ is the point on the unit circle so that
angle $POA$ has radian measure $\frac{\pi}{3}$. Prove that triangle $AOP$ is equilateral, and then use this to prove
that $W\left(\frac{\pi}{3}\right) = \left(\frac12, \frac{\sqrt3}{2}\right)$. You may use the following facts about triangles: (i) The interior angle
measures of a triangle sum to $\pi$ radians; (ii) Two sides of a triangle have the same length if and
only if the interior angles of the triangle opposite these sides have the same measure; (iii) If two
sides of a triangle have the same length, then the line segment beginning at the point of
intersection of those two sides and terminating on the opposite base midway between the
endpoints of that base is perpendicular to that base.

12. Prove that $W\left(\frac{\pi}{6}\right) = \left(\frac{\sqrt3}{2}, \frac12\right)$. You can use facts (i), (ii), and (iii) described in Problem 11.

13. Let $\theta$ and $\phi$ be the radian measure of angles $A$ and $B$, respectively. Prove the following identity:
$$\cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi$$

14. Let and be the radian measure of angles and , respectively. Prove the following identities:
(i) cos( + ) = cos cos − sin sin
(ii) cos( − ) = – cos
(iii) cos ( − ) = sin

2

(iv) sin ( − ) = cos

2

(v) sin( + ) = sin cos + cos sin
(vi) sin( − ) = – sin

15. Let , ∈ ℂ. Prove that arg = arg + arg in the sense that if two of the three terms in the
equation are specified, then there is a value for the third term so that the equation holds. Similarly,
prove that arg = arg − arg . Finally, provide examples to show that the corresponding

equations are false if we replace “arg” by “Arg.”


LEVEL 5

16. Define the function : ℂ → ℂ by ( ) = 2. Determine the images under of each of the
following sets:
(i) = { + | 2 − 2 = 1}
(ii) = { + | > 0 ∧ > 0 ∧ < 1}
(iii) = { + | ≥ 0 ∧ ≥ 0}
(iv) = { + | ≥ 0}

17. Let ⊆ ℂ, let : → ℂ, let = + ∈ ℂ, and let = + ∈ ℂ be a point such that

contains some deleted neighborhood of . Suppose that ( + ) = ( , ) + ( , ). Prove

that lim ( ) = if and only if lim ( , ) = and lim ( , ) = .
→ ( , )→( , ) ( , )→( , )

18. Give a reasonable definition for each of the following limits (like what was done right before
Theorem 15.4). is a finite real number.

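(i) $\lim_{z \to \infty} f(z) = L$

(ii) $\lim_{z \to \infty} f(z) = \infty$

19. Prove each of the following:

(i) $\lim_{z \to \infty} f(z) = L$ if and only if $\lim_{z \to 0} f\left(\frac{1}{z}\right) = L$.

(ii) $\lim_{z \to \infty} f(z) = \infty$ if and only if $\lim_{z \to 0} \frac{1}{f(1/z)} = 0$.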

20. Let , : ℝ → ℝ be defined by ( ) = cos and ( ) = sin . Prove that and are uniformly
continuous on ℝ. Hint: Use the fact that the least distance between two points is a straight line.

CHALLENGE PROBLEM

21. Consider ℂ with the standard topology and 2 with its subspace topology, where 2 is being
considered as a subspace of ℝ3. Let : ℂ → 2 ∖ be defined as follows:
+ − | |2 − 1
( ) = (1 + | |2 , (1 + | |2) , | |2 + 1)

Prove that is a homeomorphism.


LESSON 16 – LINEAR ALGEBRA
LINEAR TRANSFORMATIONS

Linear Transformations

Recall from Lesson 8 that a vector space over a field F is a set V together with a binary operation + on V
(called addition) and an operation called scalar multiplication satisfying the following properties:

(1) (Closure under addition) For all v, w ∈ V, v + w ∈ V.
(2) (Associativity of addition) For all v, w, u ∈ V, (v + w) + u = v + (w + u).
(3) (Commutativity of addition) For all v, w ∈ V, v + w = w + v.
(4) (Additive identity) There exists an element 0 ∈ V such that for all v ∈ V, 0 + v = v + 0 = v.
(5) (Additive inverse) For each v ∈ V, there is –v ∈ V such that v + (–v) = (–v) + v = 0.
(6) (Closure under scalar multiplication) For all k ∈ F and v ∈ V, kv ∈ V.
(7) (Scalar multiplication identity) If 1 is the multiplicative identity of F and v ∈ V, then 1v = v.
(8) (Associativity of scalar multiplication) For all j, k ∈ F and v ∈ V, (jk)v = j(kv).
(9) (Distributivity of 1 scalar over 2 vectors) For all k ∈ F and v, w ∈ V, k(v + w) = kv + kw.
(10) (Distributivity of 2 scalars over 1 vector) For all j, k ∈ F and v ∈ V, (j + k)v = jv + kv.

The simplest examples of vector spaces are ℚ , ℝ , and ℂ , the vector spaces consisting of -tuples of
rational numbers, real numbers, and complex numbers, respectively. As a specific example, we have
ℝ3 = {( , , ) | , , ∈ ℝ} with addition defined by ( , , ) + ( , , ) = ( + , + , + ) and
scalar multiplication defined by ( , , ) = ( , , ). Note that unless specified otherwise, we
would usually consider ℝ3 as a vector space over ℝ, so that the scalars are all real numbers.

Let V and W be vector spaces over a field F, and let T: V → W be a function from V to W.

We say that T is additive if for all v, w ∈ V, T(v + w) = T(v) + T(w).

We say that T is homogeneous if for all k ∈ F and all v ∈ V, T(kv) = kT(v).

T is a linear transformation if it is additive and homogeneous.

Example 16.1:
1. Let = = ℂ be vector spaces over ℝ and define : ℂ → ℂ by ( ) = 5 . We see that
( + ) = 5( + ) = 5 + 5 = ( ) + ( ). So, is additive. Furthermore, we have
( ) = 5( ) = (5 ) = ( ). So, is homogenous. Therefore, is a linear
transformation.
More generally, for any vector space over ℝ and any ∈ ℝ, the function : → defined
by ( ) = is a linear transformation. The verification is nearly identical to what we did in
the last paragraph. This type of linear transformation is called a dilation.


Note that if , ∈ ℝ with ≠ 0, then the function : → defined by ( ) = + is not
a linear transformation. To see this, observe that (2 ) = (2 ) + = 2 + and
2 ( ) = 2( + ) = 2 + 2 . If (2 ) = 2 ( ), then 2 + = 2 + 2 , or
equivalently, = 2 . Subtracting from each side of this equation yields = 0, contrary to
our assumption that ≠ 0. So, the linear functions that we learned about in high school are
usually not linear transformations. The only linear functions that are linear transformations are
the ones that pass through the origin (in other words, must be 0).
2. Let = ℝ4 and = ℝ3 be vector spaces over ℝ and define : ℝ4 → ℝ3 by

(( , , , )) = ( + , 2 − 3 , 5 − 2 ).

We have
(( , , , ) + ( , , , )) = (( + , + , + , + ))

= (( + ) + ( + ), 2( + ) − 3( + ), 5( + ) − 2( + ))
= (( + ) + ( + ), (2 − 3 ) + (2 − 3 ), (5 − 2 ) + (5 − 2 ))

= ( + , 2 − 3 , 5 − 2 ) + ( + , 2 − 3 , 5 − 2 )
= (( , , , )) + (( , , , )).

So, is additive. Also, we have
( ( , , , )) = (( , , , ))

= ( + , 2( ) − 3( ), 5( ) − 2( ))
= ( ( + ), (2 − 3 ), (5 − 2 ))

= ( + , 2 − 3 , 5 − 2 ) = (( , , , )).

So, is homogenous. Therefore, is a linear transformation.
3. Let V = ℝ2 and W = ℝ be vector spaces over ℝ and define T: ℝ2 → ℝ by T((x, y)) = xy. Then
T is not a linear transformation. Indeed, consider (1, 0), (0, 1) ∈ ℝ2. We have

T((1, 0) + (0, 1)) = T((1, 1)) = 1 ⋅ 1 = 1.

T((1, 0)) + T((0, 1)) = 1 ⋅ 0 + 0 ⋅ 1 = 0 + 0 = 0.

So, T((1, 0) + (0, 1)) ≠ T((1, 0)) + T((0, 1)). This shows that T is not additive, and therefore,
T is not a linear transformation.
Observe that T is also not homogeneous. To see this, consider (1, 1) ∈ ℝ2 and 2 ∈ ℝ. We have
T(2(1, 1)) = T((2, 2)) = 2 ⋅ 2 = 4, but 2T((1, 1)) = 2(1 ⋅ 1) = 2 ⋅ 1 = 2.
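
The two defining conditions can be spot-checked numerically. The sketch below is an illustration only, not a proof: it tests additivity and homogeneity on a few sample vectors. The helper names `add` and `scale` are made up for this sketch, `T` is the map from part 2 of the example written with coordinates named x, y, z, w (an assumed naming), and `P` is the product map from part 3.

```python
def add(u, v):
    """Componentwise sum of two tuples of the same length."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, v):
    """Scalar multiple of a tuple."""
    return tuple(k * a for a in v)

def T(v):
    """The map R^4 -> R^3 from part 2 (coordinate names are an assumption)."""
    x, y, z, w = v
    return (x + z, 2 * x - 3 * y, 5 * y - 2 * w)

def P(v):
    """The product map R^2 -> R from part 3."""
    x, y = v
    return x * y

u, v, k = (1, 2, 3, 4), (-2, 0, 5, 7), 3

# T passes both checks on these samples.
print(T(add(u, v)) == add(T(u), T(v)))
print(T(scale(k, u)) == scale(k, T(u)))

# P fails both checks on the vectors used in part 3.
print(P(add((1, 0), (0, 1))), P((1, 0)) + P((0, 1)))
print(P(scale(2, (1, 1))), 2 * P((1, 1)))
```

A passing check on samples never establishes linearity, but a single failing sample, as with `P`, is enough to rule it out.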

In Problem 3 below, you will be asked to show that neither additivity nor homogeneity alone is enough
to guarantee that a function is a linear transformation.

Recall from Lesson 8 that if , ∈ and , ∈ , then + is called a linear combination of the
vectors and with weights and . The next theorem says that a function is a linear transformation
if and only if it “behaves well” with respect to linear combinations.


Theorem 16.1: Let V and W be vector spaces over a field F. A function T: V → W is a linear
transformation if and only if for all v, w ∈ V and all a, b ∈ F, T(av + bw) = aT(v) + bT(w).

Proof: Suppose that T: V → W is a linear transformation, let v, w ∈ V, and let a, b ∈ F. Since T is
additive, T(av + bw) = T(av) + T(bw). Since T is homogeneous, T(av) = aT(v) and
T(bw) = bT(w). Therefore, T(av + bw) = T(av) + T(bw) = aT(v) + bT(w), as desired.

Conversely, suppose that for all v, w ∈ V and all a, b ∈ F, T(av + bw) = aT(v) + bT(w). Let v, w ∈ V
and let a = b = 1. Then T(v + w) = T(1v + 1w) = 1T(v) + 1T(w) = T(v) + T(w). Therefore, T is
additive. Now, let v ∈ V and k ∈ F. Then T(kv) = T(kv + 0v) = kT(v) + 0T(v) = kT(v).
Therefore, T is homogeneous. It follows that T is a linear transformation. □

We can use induction to extend Theorem 16.1 to arbitrary linear combinations. If ∈ can be written
as a linear combination of vectors 1, 2, … , ∈ , then ( ) is determined by ( 1), ( 2),…, ( ).
Specifically, if = 1 1 + 2 2 + ⋯ + , then we have

( ) = ( 1 1 + 2 2 + ⋯ + ) = 1 ( 1) + 2 ( 2) + ⋯ + ( ).

In particular, if = { 1, 2, … , } is a basis of , then is completely determined by the values of
( 1), ( 2),…, ( ).

Notes: (1) Recall from Lesson 8 that the vectors 1, 2, … , ∈ are linearly independent if whenever
1 1 + 2 2 + ⋯ + = 0, it follows that all the weights 1, 2, … , are 0.

(2) Also, recall that the set of all linear combinations of 1, 2, … , ∈ is called the span of
1, 2, … , , written span{ 1, 2, … , }.

(3) The set of vectors { 1, 2, … , } is a basis of if 1, 2, … , are linearly independent and
span{ 1, 2, … , } = .

In particular, if { 1, 2, … , } is a basis of , then every vector in can be written as a linear
combination of 1, 2, … , .

So, if we know the values of ( 1), ( 2), … , ( ), then we know the value of ( ) for any ∈ , as
shown above.

In other words, given a basis B of V, any function f: B → W extends uniquely to a linear transformation
T: V → W.

Let and be vector spaces over a field . We define ℒ( , ) to be the set of all linear
transformations from to . Symbolically, ℒ( , ) = { : → | is a linear transformation}.

Theorem 16.2: Let and be vector spaces over a field . Then ℒ( , ) is a vector space over ,
where addition and scalar multiplication are defined as follows:

+ ∈ ℒ( , ) is defined by ( + )( ) = ( ) + ( ) for , ∈ ℒ( , ).

∈ ℒ( , ) is defined by ( )( ) = ( ) for ∈ ℒ( , ) and ∈ .


The reader will be asked to prove Theorem 16.2 in Problem 8 below.

If , , and are vector spaces over , and : → , : → are linear transformations, then the
composition ∘ : → is a linear transformation, where ∘ is defined by ( ∘ )( ) = ( ( ))
for all ∈ . To see this, let , ∈ and , ∈ . Then we have

( ∘ )( + ) = ( ( + )) = ( ( ) + ( ))
= ( ( ( ))) + ( ( ( ))) = ( ∘ )( ) + ( ∘ )( ).

Example 16.2: Let : ℝ2 → ℝ3 be the linear transformation defined by (( , )) = ( , + , ) and
let : ℝ3 → ℝ2 be the linear transformation defined by (( , , )) = ( − , − ). Then
∘ : ℝ2 → ℝ2 is a linear transformation and we have

( ∘ )(( , )) = ( (( , ))) = (( , + , )) = (– , − ).

Notes: (1) In Example 16.2, the composition ∘ : ℝ3 → ℝ3 is also a linear transformation and we have

( ∘ )(( , , )) = ( (( , , ))) = (( − , − )) = ( − , − , − ).

(2) In general, if T: U → V and S: X → W are linear transformations, then S ∘ T is defined if and only if
V = X. So, just because S ∘ T is defined, it does not mean that T ∘ S is also defined. For example, if
T: ℝ → ℝ2 and S: ℝ2 → ℝ3, then S ∘ T is defined and S ∘ T: ℝ → ℝ3. However, T ∘ S is not defined.
The “outputs” of the linear transformation S are ordered triples of real numbers, while the “inputs” of
the linear transformation T are real numbers. They just don’t “match up.”

(3) If and are both linear transformations from a vector space to itself (that is , : → ), then
the compositions ∘ and ∘ will both also be linear transformations from to itself.

By Note 3 above, in the vector space ℒ( , ), we can define a multiplication by = ∘ . This
definition of multiplication gives ℒ( , ) a ring structure. In fact, with addition, scalar multiplication,
and composition as previously defined, ℒ( , ) is a structure called a linear algebra.

A linear algebra over a field is a triple ( , +, ⋅), where ( , +) is a vector space over , ( , +, ⋅) is a
ring, and for all , ∈ and ∈ , ( ) = ( ) = ( ).

We will call the last property “compatibility of scalar and vector multiplication.”

Notes: (1) There are two multiplications defined in a linear algebra. As for a vector space, we have
scalar multiplication. We will refer to the ring multiplication as vector multiplication.

(2) Recall from Lesson 4 that a ring ( , +, ⋅) satisfies the first 5 properties of a vector space listed above
(with in place of ) together with the following three additional properties of vector multiplication:

• (Closure) For all , ∈ , ⋅ ∈ .
• (Associativity) For all , , ∈ , ( ⋅ ) ⋅ = ⋅ ( ⋅ ).
• (Identity) There exists an element 1 ∈ such that for all ∈ , 1 ⋅ = ⋅ 1 = .


Example 16.3:

1. (ℝ, +, ⋅) is a linear algebra over ℝ, where addition and multiplication are defined in the usual
way. In this example, scalar and vector multiplication are the same.

2. Similarly, (ℂ, +, ⋅) is a linear algebra over ℂ, where addition and multiplication are defined in
the usual way (see Lesson 7). Again, in this example, scalar and vector multiplication are the
same.

3. If is a vector space over a field , then ℒ( , ) is a linear algebra over , where addition and
scalar multiplication are defined as in Theorem 16.2, and vector multiplication is given by
composition of linear transformations. You will be asked to verify this in Problem 9 below.

Recall from Lesson 10 that a function : → is injective if , ∈ and ≠ implies ( ) ≠ ( ).
Also, is surjective if for all ∈ , there is ∈ with ( ) = . A bijective function is one that is both
injective and surjective.

Also recall that a bijective function is invertible. The inverse of is then the function −1: →
defined by −1( ) = “the unique ∈ such that ( ) = .”

By Theorem 10.6 from Lesson 10, f⁻¹ ∘ f = i_A and f ∘ f⁻¹ = i_B, where i_A and i_B are the identity
functions on A and B, respectively. Furthermore, f⁻¹ is the only function that satisfies these two
equations. Indeed, if h: B → A also satisfies h ∘ f = i_A and f ∘ h = i_B, then

h = h ∘ i_B = h ∘ (f ∘ f⁻¹) = (h ∘ f) ∘ f⁻¹ = i_A ∘ f⁻¹ = f⁻¹.

A bijection : → that is also a linear transformation is called an isomorphism. If an isomorphism
: → exists, we say that and are isomorphic. As is always the case with algebraic structures,
isomorphic vector spaces are essentially identical. The only difference between them are the “names”
of the elements. Isomorphisms were covered in more generality in Lesson 11.

If a bijective function happens to be a linear transformation between two vector spaces, it’s nice to
know that the inverse function is also a linear transformation. We prove this now.

Theorem 16.3: Let T: V → W be an invertible linear transformation. Then T⁻¹: W → V is also a linear
transformation.

Proof: Let T: V → W be an invertible linear transformation, let u, v ∈ W, and let a, b ∈ F. Then by the
linearity of T, we have

T(aT⁻¹(u) + bT⁻¹(v)) = aT(T⁻¹(u)) + bT(T⁻¹(v)) = au + bv.

Since T is injective, aT⁻¹(u) + bT⁻¹(v) is the unique element of V whose image under T is au + bv.

By the definition of T⁻¹, T⁻¹(au + bv) = aT⁻¹(u) + bT⁻¹(v). □


Example 16.4:

1. Let V = W = ℂ be vector spaces over ℝ and define T: ℂ → ℂ by T(z) = 5z, as we did in part 1
of Example 16.1. If z ≠ w, then 5z ≠ 5w, and so T is injective. Also, if w ∈ ℂ, then we have
T((1/5)w) = 5((1/5)w) = w. So, T is surjective. It follows that T is invertible and that the inverse of
T is defined by T⁻¹(z) = (1/5)z. By Theorem 16.3, T⁻¹: ℂ → ℂ is also a linear transformation. In
the terminology of Lesson 11, T is an automorphism. In other words, T is an isomorphism from
ℂ to itself.

2. Let be a vector space over a field with basis { 1, 2, 3}. Then let : → 3 be the unique
linear transformation such that ( 1) = (1, 0, 0), ( 2) = (0, 1, 0), and ( 3) = (0, 0, 1). In
other words, if ∈ , since { 1, 2, 3} is a basis of , we can write = 1 1 + 2 2 + 3 3,
and is defined by ( ) = 1 ( 1) + 2 ( 2) + 3 ( 3) = ( 1, 2, 3).

To see that is injective, suppose that ( 1 1 + 2 2 + 3 3) = ( 1 1 + 2 2 + 3 3).
Then ( 1, 2, 3) = ( 1, 2, 3). It follows that 1 = 1, 2 = 2, and 3 = 3. Therefore,
1 1 + 2 2 + 3 3 = 1 1 + 2 2 + 3 3 and so, is injective.

Now, if ( , , ) ∈ 3, then ( 1 + 2 + 3) = ( , , ) and so, is surjective. From this
computation, we also see that −1: 3 → is defined by −1(( , , )) = 1 + 2 + 3.

It follows that : → 3 is an isomorphism, so that is isomorphic to 3.

Essentially the same argument as above can be used to show that if is a vector space over a
field with a basis consisting of vectors, then is isomorphic to .

Matrices

Recall from Lesson 8 that for m, n ∈ ℤ+, an m × n matrix over a field F is a rectangular array with
m rows and n columns, and entries in F. For example, the matrix

$$H = \begin{bmatrix} i & 2 - 5i & \frac{1}{5} \\ -1 & \sqrt{3} & 7 + i \end{bmatrix}$$

is a 2 × 3 matrix over ℂ. We will generally use a capital letter to represent a matrix, and the corresponding
lowercase letter with double subscripts to represent the entries of the matrix. We use the first subscript
for the row and the second subscript for the column. Using the matrix H above as an example, we see
that h11 = i, h12 = 2 − 5i, h13 = 1/5, h21 = –1, h22 = √3, and h23 = 7 + i.

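If A is an m × n matrix, then we can visualize A as follows:

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}$$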

We let be the set of all × matrices over the field . Recall that we add two matrices
, ∈ to get + ∈ using the rule ( + ) = + . We multiply a matrix ∈
by a scalar ∈ using the rule ( ) = . We can visualize these computations as follows:

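$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} + \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$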


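$$k \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} k a_{11} & \cdots & k a_{1n} \\ \vdots & & \vdots \\ k a_{m1} & \cdots & k a_{mn} \end{bmatrix}$$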

With these operations of addition and scalar multiplication, is a vector space over .

We would now like to turn into a linear algebra over by defining a vector multiplication in .
Notice that we will not be turning all vector spaces into linear algebras. We will be able to do this
only when = . That is, the linear algebra will consist only of square matrices of a specific size.

We first define the product of an × matrix with an × matrix, where , , are positive
integers. Notice that to take the product we first insist that the number of columns of be equal
to the number of rows of (these are the “inner” two numbers in the expressions “ × ” and
“ × ”).

So, how do we actually multiply two matrices? This is a bit complicated and requires just a little practice.
Let’s begin by walking through an example while informally describing the procedure, so that we can
get a feel for how matrix multiplication works before getting caught up in the “messy looking”
definition.

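Let $A = \begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 6 \end{bmatrix}$. Notice that A is a 2 × 2 matrix and B is a 2 × 3 matrix. Since
A has 2 columns and B has 2 rows, we will be able to multiply the two matrices.

For each row of the first matrix and each column of the second matrix, we add up the products entry
by entry. Let’s compute the product AB as an example.

$$AB = \begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 6 \end{bmatrix} = \begin{bmatrix} x & y & z \\ u & v & w \end{bmatrix}$$

Since x is in the first row and first column, we use the first row of A and the first column of B to get
$x = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 0 \cdot 1 + 1 \cdot 0 = 0 + 0 = 0$.

Since u is in the second row and first column, we use the second row of A and the first column of B to
get $u = \begin{bmatrix} 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 3 \cdot 1 + 2 \cdot 0 = 3$.

The reader should attempt to follow this procedure to compute the values of the remaining entries.
The final product is

$$AB = \begin{bmatrix} 0 & 3 & 6 \\ 3 & 12 & 12 \end{bmatrix}$$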

Notes: (1) The product of a 2 × 2 matrix and a 2 × 3 matrix is a 2 × 3 matrix.

(2) More generally, the product of an × matrix and an × matrix is an × matrix. Observe
that the inner most numbers (both ) must agree, and the resulting product has dimensions given by
the outermost numbers ( and ).


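We formally define matrix multiplication as follows. Let A be the m × n matrix

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}$$

and let B be the n × p matrix

$$B = \begin{bmatrix} b_{11} & \cdots & b_{1p} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{np} \end{bmatrix}.$$

We define the product AB to be the m × p matrix

$$C = \begin{bmatrix} c_{11} & \cdots & c_{1p} \\ \vdots & & \vdots \\ c_{m1} & \cdots & c_{mp} \end{bmatrix}$$

such that

$$c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$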

Notes: (1) The symbol Σ is the Greek letter Sigma. In mathematics, this symbol is often used to denote
a sum. Σ is generally used to abbreviate a very large sum or a sum of unknown length by specifying
what a typical term of the sum looks like. Let’s look at a simpler example first before we analyze the
more complicated one above:

$$\sum_{k=1}^{5} k^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 = 1 + 4 + 9 + 16 + 25 = 55.$$

The expression “ = 1” written underneath the symbol indicates that we get the first term of the sum
by replacing by 1 in the given expression. When we replace by 1 in the expression 2, we get 12.

For the second term, we simply increase by 1 to get = 2. So, we replace by 2 to get 2 = 22.

We continue in this fashion, increasing by 1 each time until we reach the number written above the
symbol. In this case, that is = 5.

(2) Let’s now get back to the expression that we’re interested in.

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj}$$

Once again, the expression “ = 1” written underneath the symbol indicates that we get the first term
of the sum by replacing by 1 in the given expression. When we replace by 1 in the expression
, we get 1 1 . Notice that this is the first term of .

For the second term, we simply increase by 1 to get = 2. So, we replace by 2 to get 2 2 .

We continue in this fashion, increasing by 1 each time until we reach the number written above the
symbol. In this case, that is = . So, the last term is .

(3) In general, we get the entry in the th row and th column of = by “multiplying” the th
row of with the th column of . We can think of the computation like this:


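$$\begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{bmatrix} \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj}$$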

Notice how we multiply the leftmost entry 1 by the topmost entry 1 . Then we move one step to
the right to 2 and one step down to 2 to form the next product, … and so on.
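
The row-by-column rule translates directly into a short program. The following is a minimal sketch, assuming matrices are stored as Python lists of rows; the function name `matmul` is chosen here for illustration. It recomputes the product of the 2 × 2 and 2 × 3 matrices from the worked example, with entries as read off from the row and column computations shown there.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])
    # Entry (i, j) is the sum over k of A[i][k] * B[k][j].
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[0, 1], [3, 2]]
B = [[1, 2, 0], [0, 3, 6]]
print(matmul(A, B))
```

The size check mirrors the requirement that the inner dimensions agree: calling `matmul(B, A)` with these matrices raises an error, because B has 3 columns while A has only 2 rows.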

It is fairly straightforward to verify that with our definitions of addition, scalar multiplication, and matrix
multiplication, for each ∈ ℤ+, is a linear algebra over . I leave this as an exercise for the reader.
Note that it is important that the number of rows and columns of our matrices are the same. Otherwise,

the matrix products will not be defined.

Example 16.5:

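1. $\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} = [1 \cdot 5 + 2 \cdot 1 + 3(-2) + 4 \cdot 3] = [5 + 2 - 6 + 12] = [13]$.

We generally identify a 1 × 1 matrix with its only entry. So, $\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} = 13$.

2. $\begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 10 & 15 & 20 \\ 1 & 2 & 3 & 4 \\ -2 & -4 & -6 & -8 \\ 3 & 6 & 9 & 12 \end{bmatrix}$.

Notice that $\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \neq \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}$, and in fact, the two products do not even
have the same size. This shows that if AB and BA are both defined, then they do not need to
be equal.

3. $\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 2 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} 0 + 6 & 2 + 4 \\ 0 + 3 & 0 + 2 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ 3 & 2 \end{bmatrix}$.

$\begin{bmatrix} 0 & 2 \\ 3 & 2 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 + 0 & 0 + 2 \\ 3 + 0 & 6 + 2 \end{bmatrix} = \begin{bmatrix} 0 & 2 \\ 3 & 8 \end{bmatrix}$.

Notice that $\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 2 \\ 3 & 2 \end{bmatrix} \neq \begin{bmatrix} 0 & 2 \\ 3 & 2 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$.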

This shows that even if and are square matrices of the same size, in general ≠ . So,
matrix multiplication is not commutative. is a noncommutative linear algebra.
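
The failure of commutativity is easy to confirm by machine. The sketch below is illustrative only: `matmul` is a helper name chosen here, and the entries of the two square matrices are an assumption, reconstructed from the entry-by-entry sums shown in part 3 of the example.

```python
def matmul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Parts 1 and 2: a 1 x 4 row and a 4 x 1 column multiply in both orders,
# but the two products have different sizes.
row = [[1, 2, 3, 4]]
col = [[5], [1], [-2], [3]]
row_col = matmul(row, col)   # 1 x 1
col_row = matmul(col, row)   # 4 x 4

# Part 3: square matrices of the same size (entries assumed as described above).
A = [[1, 2], [0, 1]]
B = [[0, 2], [3, 2]]
AB = matmul(A, B)
BA = matmul(B, A)

print(row_col, len(col_row), len(col_row[0]))
print(AB, BA, AB == BA)
```

Here `zip(*B)` iterates over the columns of B, so each entry of the product is a row of the first matrix paired with a column of the second.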

The Matrix of a Linear Transformation

Let ∈ ℒ( , ) and let = { 1, 2, … , } and = { 1, 2, … , } be bases of and ,
respectively. Recall that is completely determined by the values of ( 1), ( 2), … , ( ).
Furthermore, since ( 1), ( 2),…, ( ) ∈ and is a basis for , each of ( 1), ( 2), … , ( )
can be written as a linear combination of the vectors in . So, we have


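$$\begin{aligned} T(v_1) &= a_{11} w_1 + a_{21} w_2 + \cdots + a_{m1} w_m \\ T(v_2) &= a_{12} w_1 + a_{22} w_2 + \cdots + a_{m2} w_m \\ &\;\;\vdots \\ T(v_j) &= a_{1j} w_1 + a_{2j} w_2 + \cdots + a_{mj} w_m \\ &\;\;\vdots \\ T(v_n) &= a_{1n} w_1 + a_{2n} w_2 + \cdots + a_{mn} w_m \end{aligned}$$

Here, we have a_ij ∈ F for each i = 1, 2, … , m and j = 1, 2, … , n. We form the following matrix:

$$\mathcal{M}_T(B, C) = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}$$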

ℳ ( , ) is called the matrix of the linear transformation with respect to the bases and .

Note: The coefficients in the expression ( ) = 1 1 + 2 2 + ⋯ + become the th
column of ℳ ( , ). Your first instinct might be to form the row [ 1 2 ⋯ ], but this is incorrect.
Pay careful attention to how we form ℳ ( , ) in part 2 of Example 16.6 below to make sure that you
avoid this error.

Example 16.6:

1. Consider the linear transformation T: ℂ → ℂ from part 1 of Example 16.1. We are considering
ℂ as a vector space over ℝ and T is defined by T(z) = 5z. Let’s use the standard basis for ℂ, so
that B = C = {1 + 0i, 0 + 1i} = {1, i}. We have

T(1) = 5 = 5 ⋅ 1 + 0 ⋅ i

T(i) = 5i = 0 ⋅ 1 + 5 ⋅ i

The matrix of T with respect to the standard basis is $\mathcal{M}_T(\{1, i\}, \{1, i\}) = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}$.

In this case, since T maps a vector space to itself and we are using the same
basis for both “copies” of ℂ, we can abbreviate ℳ_T({1, i}, {1, i}) as ℳ_T({1, i}). Furthermore,
since we are using the standard basis, we can abbreviate ℳ_T({1, i}, {1, i}) even further as ℳ_T.
So, we can simply write $\mathcal{M}_T = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}$.
Now, let z = a + bi ∈ ℂ and write z as the column vector $z = \begin{bmatrix} a \\ b \end{bmatrix}$. We have

$$\mathcal{M}_T \cdot z = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 5a \\ 5b \end{bmatrix} = 5 \begin{bmatrix} a \\ b \end{bmatrix} = 5z = T(z).$$

So, multiplication on the left by ℳ_T gives the same result as applying the transformation T.

2. Consider the linear transformation T: ℝ4 → ℝ3 from part 2 of Example 16.1. We are considering
ℝ4 and ℝ3 as vector spaces over ℝ and T is defined by

T((x, y, z, w)) = (x + z, 2x − 3y, 5y − 2w).

Let’s use the standard bases for ℝ4 and ℝ3, so that
B = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

We have

T((1, 0, 0, 0)) = (1, 2, 0)
T((0, 1, 0, 0)) = (0, –3, 5)
T((0, 0, 1, 0)) = (1, 0, 0)
T((0, 0, 0, 1)) = (0, 0, –2)

The matrix of T with respect to the standard bases is

$$\mathcal{M}_T = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 2 & -3 & 0 & 0 \\ 0 & 5 & 0 & -2 \end{bmatrix}$$

Once again, we abbreviate ℳ_T(B, C) as ℳ_T because we are using the standard bases.

Now, let v = (x, y, z, w) ∈ ℝ4 and write v as the column vector $v = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$. We have

$$\mathcal{M}_T \cdot v = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 2 & -3 & 0 & 0 \\ 0 & 5 & 0 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} x + z \\ 2x - 3y \\ 5y - 2w \end{bmatrix} = T(v).$$

So, once again, multiplication on the left by ℳ_T gives the same result as applying the
transformation T.
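
The construction of the matrix can be carried out mechanically: apply the transformation to each standard basis vector and record the results as columns. The sketch below assumes the standard bases and uses made-up helper names (`standard_basis`, `matrix_of`, `apply`); `T` is the map from part 2, with coordinates named x, y, z, w as an assumption.

```python
def T(v):
    """The map R^4 -> R^3 from part 2 (coordinate names are an assumption)."""
    x, y, z, w = v
    return (x + z, 2 * x - 3 * y, 5 * y - 2 * w)

def standard_basis(n):
    """The n standard basis vectors of R^n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def matrix_of(T, n):
    """Matrix of T with respect to the standard bases: column j is T(e_j)."""
    images = [T(e) for e in standard_basis(n)]
    m = len(images[0])
    return [[images[j][i] for j in range(n)] for i in range(m)]

def apply(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in M)

M = matrix_of(T, 4)
v = (1, 2, 3, 4)
print(M)
print(apply(M, v), T(v))
```

Note that the images of the basis vectors are stored as columns, not rows, which is exactly the point of the warning in the Note before the example.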

Let be a vector space over with a finite basis. Then we say that is finite-dimensional. If

= { 1, 2, … , }, then by Problem 12 from Lesson 8, all bases of have elements. In this case, we
say that is -dimensional, and we write dim = .

Theorem 16.4: Let be an -dimensional vector space over a field . Then there is a linear algebra
isomorphism : ℒ( , ) →

You will be asked to prove Theorem 16.4 in Problem 15 below.

Images and Kernels

Let : → be a linear transformation. The image (or range) of is the set [ ] = { ( ) | ∈ }
and the kernel (or null space) of is the set ker( ) = { ∈ | ( ) = 0}.

Example 16.7: Let T: ℝ4 → ℝ3 be defined by T((x, y, z, w)) = (x + y, x − z, x + 2w). Let’s compute
T[ℝ4] and ker(T). First, T[ℝ4] consists of all vectors of the form

(x + y, x − z, x + 2w) = (x + y)(1, 0, 0) + (x − z)(0, 1, 0) + (x + 2w)(0, 0, 1)

So, if (v1, v2, v3) ∈ ℝ3, let x = 0, y = v1, z = –v2, and w = (1/2)v3. Then we see that

(x + y)(1, 0, 0) + (x − z)(0, 1, 0) + (x + 2w)(0, 0, 1)

= v1(1, 0, 0) + v2(0, 1, 0) + v3(0, 0, 1) = (v1, v2, v3)

Therefore, ℝ3 ⊆ T[ℝ4]. Since it is clear that T[ℝ4] ⊆ ℝ3, we have T[ℝ4] = ℝ3.

Now, (x, y, z, w) ∈ ker(T) if and only if (x + y, x − z, x + 2w) = (0, 0, 0) if and only if x + y = 0,
x − z = 0, and x + 2w = 0 if and only if y = –x, z = x, and w = –x/2 if and only if

(x, y, z, w) = (x, –x, x, –x/2) = x(1, –1, 1, –1/2).

So, every element of ker(T) is a scalar multiple of (1, –1, 1, –1/2). Thus, ker(T) ⊆ span{(1, –1, 1, –1/2)}.

Conversely, an element of span{(1, –1, 1, –1/2)} has the form (x, –x, x, –x/2), and we have

T((x, –x, x, –x/2)) = (x − x, x − x, x + 2(–x/2)) = (0, 0, 0). So, span{(1, –1, 1, –1/2)} ⊆ ker(T).

Therefore, ker(T) = span{(1, –1, 1, –1/2)}.

Notice that [ℝ4] is a subspace of ℝ3 (in fact, [ℝ4] = ℝ3) and ker( ) is a subspace of ℝ4. Also, the
sum of the dimensions of [ℝ4] and ker( ) is 3 + 1 = 4, which is the dimension of ℝ4. None of this is
a coincidence, as we will see in the next few theorems.
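
Both halves of Example 16.7 can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions: the coordinates are named x, y, z, w, the helper name `preimage` is made up here, and `Fraction` is used so that the halves are represented exactly.

```python
from fractions import Fraction

def T(v):
    """The map R^4 -> R^3 of Example 16.7 (coordinate names are an assumption)."""
    x, y, z, w = v
    return (x + y, x - z, x + 2 * w)

def preimage(target):
    """A vector that T sends to target, using the choice made in the example."""
    v1, v2, v3 = target
    return (0, v1, -v2, Fraction(v3, 2))

kernel_vector = (1, -1, 1, Fraction(-1, 2))

# Surjectivity on a sample target: T(preimage(t)) recovers t.
t = (7, -3, 5)
print(T(preimage(t)) == t)

# Every scalar multiple of the kernel vector is sent to zero.
multiple = tuple(4 * c for c in kernel_vector)
print(T(multiple) == (0, 0, 0))

# Rank 3 plus nullity 1 equals the dimension of the domain.
print(3 + 1 == len(kernel_vector))
```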

Theorem 16.5: Let and be vector spaces over a field and let : → be a linear transformation.
Then [ ] ≤ .

Proof: We have (0) = (0 + 0) = (0) + (0). Therefore, (0) = 0. It follows that 0 ∈ [ ].

Let , ∈ [ ]. Then there are , ∈ with ( ) = and ( ) = . It then follows that
( + ) = ( ) + ( ) = + . So, + ∈ [ ].

Let ∈ [ ] and ∈ . Then there is ∈ with ( ) = . We have ( ) = ( ) = .
Therefore, ∈ [ ].

By Theorem 8.1 from Lesson 8, [ ] ≤ . □

Theorem 16.6: Let and be vector spaces over a field and let : → be a linear transformation.
Then ker( ) ≤ .

Proof: As in the proof of Theorem 16.5, we have (0) = 0. So, 0 ∈ ker( ).

Let , ∈ ker( ). Then ( + ) = ( ) + ( ) = 0 + 0 = 0. So, + ∈ ker( ).

Let ∈ ker( ) and ∈ . Then ( ) = ( ) = ⋅ 0 = 0. Therefore, ∈ ker( ).

By Theorem 8.1 from Lesson 8, ker( ) ≤ . □

Theorem 16.7: Let and be vector spaces over a field and let : → be a linear transformation.
Then ker( ) = {0} if and only if is injective.


Proof: Suppose that ker( ) = {0}, let , ∈ , and let ( ) = ( ). Then ( ) − ( ) = 0. It follows
that ( − ) = ( ) − ( ) = 0. So, − ∈ ker( ). Since ker( ) = {0}, − = 0. Therefore,
= . Since , ∈ were arbitrary, is injective.

Conversely, suppose that is injective, and let ∈ ker( ). Then ( ) = 0. But also, by the proof of

Theorem 16.5, (0) = 0. So, ( ) = (0). Since is injective, = 0. Since ∈ was arbitrary,

ker( ) ⊆ {0}. By the proof of Theorem 16.5, (0) = 0, so that 0 ∈ ker( ), and so, {0} ⊆ ker( ). It

follows that ker( ) = {0}. □

If and are vector spaces over a field , and : → is a linear transformation, then the rank of
is the dimension of [ ] and the nullity of is the dimension of ker( ).

Theorem 16.8: Let and be vector spaces over a field with dim = and let : → be a
linear transformation. Then rank + nullity = .

Note: Before proving the theorem, let’s observe that in a finite-dimensional vector space V, any vectors
that are linearly independent can be extended to a basis of V.

To see this, let v1, v2, … , vk be linearly independent and let u1, u2, … , um be any vectors such that
span{u1, u2, … , um} = V. We will decide one by one if we should throw in or exclude each uj.

Specifically, we start by first letting B0 = {v1, v2, … , vk} and then

$$B_1 = \begin{cases} B_0 & \text{if } u_1 \in \operatorname{span} B_0 \\ B_0 \cup \{u_1\} & \text{if } u_1 \notin \operatorname{span} B_0 \end{cases}$$

In general, for each j = 1, 2, … , m, we let

$$B_j = \begin{cases} B_{j-1} & \text{if } u_j \in \operatorname{span} B_{j-1} \\ B_{j-1} \cup \{u_j\} & \text{if } u_j \notin \operatorname{span} B_{j-1} \end{cases}$$

By Problem 6 from Lesson 8, for each j, Bj is linearly independent. Since for each j, uj ∈ span Bm and Bm ⊆ V,
V = span{u1, u2, … , um} = span Bm. Therefore, Bm is a basis of V.

Proof of Theorem 16.8: Suppose nullity = , where 0 ≤ ≤ . Then there is a basis { 1, 2, … , }
of ker( ) (note that if = 0, this basis is the empty set). In particular, the vectors 1, 2, … , are
linearly independent. By the note above, we can extend these vectors to a basis of , let’s say

= { 1, 2, … , , 1, 2, … , }. So, we have = + . Let’s show that { ( 1), ( 2), … , ( )}
is a basis of [ ].

For linear independence of ( 1), ( 2), … , ( ), note that since is a linear transformation,
1 ( 1) + 2 ( 2) + ⋯ + ( ) = 0 is equivalent to ( 1 1 + 2 2 + ⋯ + ) = 0, which is
equivalent to 1 1 + 2 2 + ⋯ + ∈ ker( ). Since { 1, 2, … , } is a basis of ker( ), we can
find weights 1, 2, … , such that 1 1 + 2 2 + ⋯ + = 1 1 + 2 2 + ⋯ + . Since is
a basis of , all weights (the ’s and ’s) are 0. So, ( 1), ( 2), … , ( ) are linearly independent.

To see that [ ] = span{ ( 1), ( 2), … , ( )}, let ∈ . Since is a basis of , we can write as
a linear combination = 1 1 + 2 2 + ⋯ + 1 1 + 2 2 + ⋯ + . Applying the linear
transformation gives us

( ) = ( 1 1 + 2 2 + ⋯ + + 1 1 + 2 2 + ⋯ + )
= 1 ( 1) + 2 ( 2) + ⋯ + ( ) + 1 ( 1) + 2 ( 2) + ⋯ + ( )

= 1 ( 1) + 2 ( 2) + ⋯ + ( ).


Note that ( 1), ( 2), … , ( ) are all 0 because 1, 2, … , ∈ ker( ).

Since each vector of the form ( ) can be written as a linear combination of ( 1), ( 2), … , ( ),
we have shown that [ ] = span{ ( 1), ( 2), … , ( )}.

Since ( 1), ( 2), … , ( ) are linearly independent and [ ] = span{ ( 1), ( 2), … , ( )}, it

follows that { ( 1), ( 2), … , ( )} is a basis of [ ]. Therefore, rank = . □

Eigenvalues and Eigenvectors

We now restrict our attention to linear transformations from a vector space to itself. For a vector space
, we will abbreviate the linear algebra ℒ( , ) by ℒ( ).

If ≤ , we say that is invariant under ∈ ℒ( ) if [ ] ⊆ .

Example 16.8: Let be a vector space and let ∈ ℒ( ).
1. {0} is invariant under . Indeed, (0) = 0 by the proof of Theorem 16.5.
2. is invariant under . Indeed, if ∈ , then ( ) ∈ .
3. ker( ) is invariant under . To see this, let ∈ ker( ). Then ( ) = 0 ∈ ker( ).
4. [ ] is invariant under . To see this, let ∈ [ ]. Then ( ) is clearly also in [ ].

Let be a vector space over a field . We call a subspace ≤ a simple subspace if it consists of all
scalar multiples of a single vector. In other words, is simple if there is a ∈ such that
= { | ∈ }.

Theorem 16.9: Let be a vector space over a field , let = { | ∈ } be a simple subspace of ,
and let ∈ ℒ( ). Then is invariant under if and only if there is ∈ such that ( ) = .

Proof: Suppose that = { | ∈ } is invariant under . Then ( ) ∈ . It follows that ( ) =
for some ∈ .

Conversely, suppose there is ∈ such that ( ) = . Let ∈ . Then there is ∈ such that

= . Then ( ) = ( ) = ( ) = ( ) = ( ) ∈ . Since ∈ was arbitrary, [ ] ⊆ .

Therefore, is invariant under . □

Let be a vector space over a field and let ∈ ℒ( ). A scalar ∈ is called an eigenvalue of if
there is a nonzero vector ∈ such that ( ) = . The vector is called an eigenvector of .

Notes: (1) If v is the zero vector, then T(v) = T(0) = 0 = λ ⋅ 0 for every scalar λ. This is why we
exclude the zero vector from being an eigenvector. An eigenvector must be nonzero.

(2) If we let : → be the identity linear transformation defined by ( ) = for all ∈ , then we
can write as ( ). So, the equation ( ) = is equivalent to the equation ( − )( ) = 0.

(3) It follows from Note 2 that is an eigenvalue of if and only if ker( − ) ≠ {0}. By Theorem
16.7, is an eigenvalue of if and only if − is not injective.


(4) By Note 2, v is an eigenvector of T corresponding to eigenvalue λ if and only if v is a nonzero vector
such that (T − λI)(v) = 0. So, the eigenvectors of T corresponding to λ are exactly the nonzero vectors
in ker(T − λI). By Theorem 16.6, ker(T − λI) is a subspace of V. We call this subspace the eigenspace
of T corresponding to the eigenvalue λ.
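
The definition of an eigenvalue can be tested directly on a small example. The sketch below is purely illustrative and uses a map that does not appear in the text: the swap map on ℝ2 sending (x, y) to (y, x). The function name `is_eigenpair` is made up here; it checks that the vector is nonzero and that applying the map gives the same result as scaling by the candidate eigenvalue.

```python
def T(v):
    """Hypothetical example map on R^2: swap the two coordinates."""
    x, y = v
    return (y, x)

def is_eigenpair(T, lam, v):
    """True if v is nonzero and T(v) equals lam times v."""
    if all(c == 0 for c in v):
        return False  # the zero vector is never an eigenvector
    return T(v) == tuple(lam * c for c in v)

print(is_eigenpair(T, 1, (1, 1)))     # (1, 1) is fixed by the swap
print(is_eigenpair(T, -1, (1, -1)))   # (1, -1) is sent to its negative
print(is_eigenpair(T, 2, (1, 1)))     # 2 does not work for this vector
print(is_eigenpair(T, 1, (0, 0)))     # excluded by definition
```

The last call shows why the zero vector has to be excluded: the equation holds for it with every scalar, so it would carry no information about the map.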

Example 16.9:

1. Let be any vector space over a field and let : → be the identity linear transformation.
Then for any ∈ , ( ) = = 1 . So, we see that 1 is the only eigenvalue of and every
nonzero vector ∈ is an eigenvector of for the eigenvalue 1.

2. More generally, if ∈ , then the linear transformation satisfies ( )( ) = ( ) = for
all ∈ . So, we see that is the only eigenvalue of and every nonzero vector ∈ is an
eigenvector of for the eigenvalue .

3. Consider ℂ² as a vector space over ℂ and define T: ℂ² → ℂ² by T((z, w)) = (–w, z). Observe
that λ = i is an eigenvalue of T with corresponding eigenvector (1, –i). Indeed, we have
T((1, –i)) = (i, 1) and i(1, –i) = (i, –i²) = (i, 1). So, T((1, –i)) = i(1, –i).

Let’s find all the eigenvalues of this linear transformation. We need to solve the equation
T((z, w)) = λ(z, w), or equivalently, (–w, z) = (λz, λw). Equating the first components and
second components gives us the two equations –w = λz and z = λw. Solving the first equation
for w yields w = –λz. Substituting into the second equation gives us z = λ(–λz) = –λ²z. So,
z + λ²z = 0. Using distributivity on the left-hand side of this equation gives z(1 + λ²) = 0. So,
z = 0 or 1 + λ² = 0. If z = 0, then w = –λ ⋅ 0 = 0. So, (z, w) = (0, 0). Since an eigenvector
must be nonzero, we reject z = 0. The equation 1 + λ² = 0 has the two solutions λ = i and
λ = –i. These are the two eigenvalues of T.

Next, let’s find the eigenvectors corresponding to the eigenvalue λ = i. In this case, we have
T((z, w)) = i(z, w), or equivalently, (–w, z) = (iz, iw). So, –w = iz and z = iw. These two
equations are actually equivalent. Indeed, if we multiply each side of the second equation by i,
we get iz = i²w, or equivalently, iz = –w or –w = iz.

So, we use only one of the equations, say –w = iz, or equivalently, w = –iz. So, the
eigenvectors of T corresponding to the eigenvalue λ = i are all nonzero vectors of the form
(z, –iz). For example, letting z = 1, we see that (1, –i) is an eigenvector corresponding to the
eigenvalue λ = i.

Let’s also find the eigenvectors corresponding to the eigenvalue λ = –i. In this case, we have
T((z, w)) = –i(z, w), or equivalently, (–w, z) = (–iz, –iw). So, –w = –iz and z = –iw. Once
again, these two equations are equivalent. Indeed, if we multiply each side of the second
equation by –i, we get –iz = i²w, or equivalently, –iz = –w or –w = –iz.

So, we use only one of the equations, say –w = –iz, or equivalently, w = iz. So, the
eigenvectors of T corresponding to the eigenvalue λ = –i are all nonzero vectors of the form
(z, iz). For example, letting z = 1, we see that (1, i) is an eigenvector corresponding to the
eigenvalue λ = –i.

Note that if we consider the vector space ℝ² over the field ℝ instead of ℂ² over ℂ, then the
linear transformation T: ℝ² → ℝ² defined by T((x, y)) = (–y, x) has no eigenvalues (and
therefore, no eigenvectors). Algebraically, this follows from the fact that 1 + λ² = 0 has no
real solutions.

It is also easy to see geometrically that this transformation has no eigenvalues. The given
transformation rotates any nonzero point (x, y) ∈ ℝ² counterclockwise by 90°. Since no real
scalar multiple of (x, y) results in such a rotation, we see that there is no eigenvalue. The
figure to the right shows how T rotates the point (1, 1) counterclockwise 90° to the point (–1, 1).
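The computations in part 3 of this example can be confirmed with Python's built-in complex numbers. This is only a sketch: the function names T and scale are chosen here for illustration, and 1j is Python's notation for the imaginary unit.

```python
def T(v):
    """The transformation sending (z, w) to (-w, z), as in part 3 above."""
    z, w = v
    return (-w, z)

def scale(lam, v):
    """Scalar multiple lam * (z, w)."""
    z, w = v
    return (lam * z, lam * w)

i = 1j  # the imaginary unit

# Over the complex numbers: check the two eigenpairs found above.
print(T((1, -i)) == scale(i, (1, -i)))  # True: eigenvalue i, eigenvector (1, -i)
print(T((1, i)) == scale(-i, (1, i)))   # True: eigenvalue -i, eigenvector (1, i)

# Over the real numbers: the point (1, 1) is rotated by 90 degrees.
print(T((1, 1)))  # (-1, 1)
```

The last line shows the rotated point (–1, 1), which is not a real scalar multiple of (1, 1).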

Let V be a vector space over a field F, let v₁, v₂, … , vₙ ∈ V, and c₁, c₂, … , cₙ ∈ F. Recall from Lesson 8
that the expression c₁v₁ + c₂v₂ + ⋯ + cₙvₙ is called a linear combination of the vectors v₁, v₂, … , vₙ
with weights c₁, c₂, … , cₙ.

Also recall once more that v₁, v₂, … , vₙ are linearly dependent if there exist weights c₁, c₂, … , cₙ ∈ F,
with at least one weight nonzero, such that c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = 0. Otherwise, we say that
v₁, v₂, … , vₙ are linearly independent.

In Problem 6 from Lesson 8, you were asked to prove that if a finite set of at least two vectors is linearly
dependent, then one of the vectors in the set can be written as a linear combination of the other
vectors in the set. To prove the next theorem (Theorem 16.11), we will need the following slightly
stronger result.

Lemma 16.10: Let V be a vector space over a field F and let v₁, v₂, … , vₙ ∈ V be linearly dependent
with n ≥ 2. Also assume that v₁ ≠ 0. Then there is k with 2 ≤ k ≤ n such that vₖ can be written as a
linear combination of v₁, v₂, … , vₖ₋₁.

Proof: Suppose that v₁, v₂, … , vₙ are linearly dependent and v₁ ≠ 0. Let c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = 0
be a nontrivial dependence relation (in other words, not all the cᵢ are 0). Since v₁ ≠ 0, we must have
cᵢ ≠ 0 for some i ≠ 1 (otherwise c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = 0 implies c₁v₁ = 0, which implies that
c₁ = 0, contradicting that the dependence relation is nontrivial). Let k be the largest value such that
cₖ ≠ 0. Then k ≥ 2 and we have c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = c₁v₁ + c₂v₂ + ⋯ + cₖvₖ + 0vₖ₊₁ + ⋯ + 0vₙ,
and so, c₁v₁ + c₂v₂ + ⋯ + cₖvₖ = 0. Since cₖ ≠ 0, we can solve for vₖ to get

vₖ = –(c₁/cₖ)v₁ − ⋯ − (cₖ₋₁/cₖ)vₖ₋₁.

So, vₖ can be written as a linear combination of v₁, v₂, … , vₖ₋₁. □
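The proof of Lemma 16.10 is constructive, and its steps can be sketched in code. The function name solve_for_last and the sample vectors below are hypothetical, introduced only for illustration. Given the weights of a nontrivial dependence relation, the function finds the largest index k with nonzero weight and returns the coefficients that express the kth vector in terms of the earlier ones.

```python
from fractions import Fraction

def solve_for_last(weights):
    """Follow the proof of Lemma 16.10.

    weights: the list [c_1, ..., c_n] of a nontrivial dependence relation
             c_1 v_1 + ... + c_n v_n = 0 (so not all weights are 0).
    Returns (k, coeffs), where k is the largest 1-based index with c_k != 0
    and coeffs = [-c_1/c_k, ..., -c_(k-1)/c_k], so that
    v_k = coeffs[0] v_1 + ... + coeffs[k-2] v_(k-1).
    """
    k = max(idx for idx, c in enumerate(weights, start=1) if c != 0)
    c_k = Fraction(weights[k - 1])
    coeffs = [-Fraction(c) / c_k for c in weights[:k - 1]]
    return k, coeffs

# Hypothetical vectors v1 = (1, 0), v2 = (2, 0), v3 = (0, 1) in Q^2
# satisfy the dependence relation 2*v1 - 1*v2 + 0*v3 = 0.
k, coeffs = solve_for_last([2, -1, 0])
print(k, coeffs)  # k is 2 and the single coefficient is 2, that is, v2 = 2*v1
```

When the first vector is nonzero, as the lemma assumes, the index k found this way is at least 2, so the list of coefficients is never empty.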

Note: A lemma is a theorem whose primary purpose is to prove a more important theorem. Although
Lemma 16.10 is an important result in Linear Algebra, the main reason we are mentioning it now is to
help us prove the next theorem (Theorem 16.11).

Theorem 16.11: Let V be a vector space over a field F, let T ∈ ℒ(V), and let λ₁, λ₂, … , λₙ be distinct
eigenvalues of T with corresponding eigenvectors v₁, v₂, … , vₙ. Then v₁, v₂, … , vₙ are linearly
independent.

Proof: Suppose toward contradiction that v₁, v₂, … , vₙ are linearly dependent. Let k be the least integer
such that vₖ can be written as a linear combination of v₁, v₂, … , vₖ₋₁ (we can find such a k by Lemma
16.10). Then there are weights c₁, c₂, … , cₖ₋₁ such that vₖ = c₁v₁ + c₂v₂ + ⋯ + cₖ₋₁vₖ₋₁. Apply the
linear transformation T to each side of this last equation to get the equation
T(vₖ) = T(c₁v₁ + c₂v₂ + ⋯ + cₖ₋₁vₖ₋₁) = c₁T(v₁) + c₂T(v₂) + ⋯ + cₖ₋₁T(vₖ₋₁). Since each vᵢ is
an eigenvector corresponding to eigenvalue λᵢ, we have λₖvₖ = c₁λ₁v₁ + c₂λ₂v₂ + ⋯ + cₖ₋₁λₖ₋₁vₖ₋₁.
We can also multiply each side of the equation vₖ = c₁v₁ + c₂v₂ + ⋯ + cₖ₋₁vₖ₋₁ by λₖ to get the
equation λₖvₖ = c₁λₖv₁ + c₂λₖv₂ + ⋯ + cₖ₋₁λₖvₖ₋₁. We now subtract:

λₖvₖ = c₁λₖv₁ + c₂λₖv₂ + ⋯ + cₖ₋₁λₖvₖ₋₁
λₖvₖ = c₁λ₁v₁ + c₂λ₂v₂ + ⋯ + cₖ₋₁λₖ₋₁vₖ₋₁

0 = c₁(λₖ − λ₁)v₁ + c₂(λₖ − λ₂)v₂ + ⋯ + cₖ₋₁(λₖ − λₖ₋₁)vₖ₋₁

Since we chose k to be the least integer such that vₖ can be written as a linear combination of
v₁, v₂, … , vₖ₋₁, it follows that v₁, v₂, … , vₖ₋₁ are linearly independent. Therefore, the constants
c₁(λₖ − λ₁), c₂(λₖ − λ₂), … , cₖ₋₁(λₖ − λₖ₋₁) are all 0. Since the eigenvalues are all distinct, we must
have c₁ = c₂ = ⋯ = cₖ₋₁ = 0. Then vₖ = c₁v₁ + c₂v₂ + ⋯ + cₖ₋₁vₖ₋₁ = 0, contradicting our
assumption that vₖ is an eigenvector. Therefore, v₁, v₂, … , vₙ cannot be linearly dependent. So,
v₁, v₂, … , vₙ are linearly independent. □
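Theorem 16.11 can be illustrated with the transformation from Example 16.9, part 3, whose distinct eigenvalues i and –i have eigenvectors (1, –i) and (1, i). The sketch below relies on a standard fact not proved in this passage: two vectors in a 2-dimensional space are linearly independent exactly when the determinant of the matrix having them as columns is nonzero. The helper name det2 is hypothetical.

```python
def det2(u, v):
    """Determinant of the 2 x 2 matrix whose columns are u and v."""
    return u[0] * v[1] - u[1] * v[0]

v1 = (1, -1j)  # eigenvector for the eigenvalue i
v2 = (1, 1j)   # eigenvector for the eigenvalue -i

print(det2(v1, v2))       # 2j
print(det2(v1, v2) != 0)  # True: v1 and v2 are linearly independent
```

The determinant is 2i, which is nonzero, so the two eigenvectors are linearly independent, as the theorem predicts.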

Let A be a square matrix, say

    [a₁₁ ⋯ a₁ₙ]
A = [ ⋮     ⋮ ]
    [aₙ₁ ⋯ aₙₙ]

The diagonal entries of A are the entries a₁₁, a₂₂, … , aₙₙ. All other entries of A are nondiagonal entries.

Example 16.10: The diagonal entries of the matrix

    [1 5 2]
A = [3 6 0]
    [2 9 8]

are a₁₁ = 1, a₂₂ = 6, and a₃₃ = 8. The nondiagonal entries of A are a₁₂ = 5, a₁₃ = 2, a₂₁ = 3,
a₂₃ = 0, a₃₁ = 2, and a₃₂ = 9.

A diagonal matrix is a square matrix that has every nondiagonal entry equal to 0.

Example 16.11: The matrix A from Example 16.10 is not a diagonal matrix, while the matrices

    [1 0 0]           [5  0 0]
B = [0 6 0]  and  C = [0 –2 0]
    [0 0 8]           [0  0 0]

are diagonal matrices.
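The definitions of diagonal entries and diagonal matrices translate directly into code. In the sketch below, matrices are stored as lists of rows, and the labels A, B, and C are used here for the three matrices of Examples 16.10 and 16.11. The function names diagonal_entries and is_diagonal are chosen for illustration.

```python
def diagonal_entries(matrix):
    """Return the diagonal entries a_11, a_22, ..., a_nn of a square matrix."""
    return [matrix[i][i] for i in range(len(matrix))]

def is_diagonal(matrix):
    """Return True if every nondiagonal entry of the square matrix is 0."""
    n = len(matrix)
    return all(matrix[i][j] == 0
               for i in range(n) for j in range(n) if i != j)

A = [[1, 5, 2], [3, 6, 0], [2, 9, 8]]   # matrix of Example 16.10
B = [[1, 0, 0], [0, 6, 0], [0, 0, 8]]   # first matrix of Example 16.11
C = [[5, 0, 0], [0, -2, 0], [0, 0, 0]]  # second matrix of Example 16.11

print(diagonal_entries(A))  # [1, 6, 8]
print(is_diagonal(A))       # False
print(is_diagonal(B), is_diagonal(C))  # True True
```

Note that C counts as diagonal even though one of its diagonal entries is 0. The definition restricts only the nondiagonal entries.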

Let V be a vector space. A linear transformation T ∈ ℒ(V) is said to be diagonalizable if there is a basis
ℬ of V for which ℳ_ℬ(T) is a diagonal matrix.
