
Algorithm

The algorithm to find the transitive closure of a graph G is given in Fig. 13.8. In order to determine the transitive closure of a graph, we define a matrix T, where for i, j, k = 1, 2, 3, ..., n:

When k = 0:   T0[i][j] = 1 if (i, j) is in E, and 0 if (i, j) is not in E
When k >= 1:  Tk[i][j] = Tk-1[i][j] ∨ (Tk-1[i][k] ∧ Tk-1[k][j])

That is, Tk[i][j] = 1 if there exists a path in G from vertex i to vertex j whose intermediate vertices all lie in the set {1, 2, 3, ..., k}, and 0 otherwise. G* is constructed by adding an edge (i, j) into E* if and only if Tn[i][j] = 1. Look at Fig. 13.7, which shows the relation between k and Tk[i][j].

Figure 13.7  Relation between k and Tk[i][j]

Transitive_Closure(A, t, n)

Step 1: SET i=1, j=1, k=1

Step 2: Repeat Steps 3 and 4 while i<=n

Step 3: Repeat Step 4 while j<=n

Step 4: IF (A[i][j] = 1)

SET t[i][j] = 1

ELSE

SET t[i][j] = 0

INCREMENT j

[END OF LOOP]

INCREMENT i

[END OF LOOP]

Step 5: Repeat Steps 6 to 11 while k<=n

Step 6: Repeat Steps 7 to 10 while i<=n

Step 7: Repeat Steps 8 and 9 while j<=n

Step 8: SET t[i][j] = t[i][j] ∨ (t[i][k] ∧ t[k][j])

Step 9: INCREMENT j

[END OF LOOP]

Step 10: INCREMENT i

[END OF LOOP]

Step 11: INCREMENT k

[END OF LOOP]

Step 12: END

Figure 13.8  Algorithm to find the transitive closure of a graph G
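The stepwise algorithm of Fig. 13.8 maps almost directly onto C. The following is a minimal sketch (the function name and the fixed bound MAX are assumptions for illustration, not part of the text); it follows the same two phases as the algorithm, using 0-based indices.

#include <stdio.h>

#define MAX 10   /* assumed upper limit on the number of vertices */

/* Computes the transitive closure t[][] of the adjacency matrix a[][]
   for a graph with n vertices. */
void transitive_closure(int a[MAX][MAX], int t[MAX][MAX], int n)
{
    int i, j, k;

    /* Phase 1: initialize t with the adjacency matrix */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            t[i][j] = (a[i][j] == 1) ? 1 : 0;

    /* Phase 2: t[i][j] = t[i][j] OR (t[i][k] AND t[k][j]) */
    for (k = 0; k < n; k++)
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                t[i][j] = t[i][j] || (t[i][k] && t[k][j]);
}

After the call, t[i][j] is 1 exactly when vertex j is reachable from vertex i.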

13.4  BI-CONNECTED COMPONENTS

A vertex v of G is called an articulation point if removing v along with the edges incident on v
results in a graph that has at least two connected components.
 A bi-connected graph (shown in Fig. 13.10) is defined as a connected graph that has no
articulation vertices. That is, a bi-connected graph is connected and non-separable in the sense that
even if we remove any vertex from the graph, the resultant graph is still connected. By definition,

∑ A bi-connected undirected graph is a connected graph that cannot be broken into disconnected pieces by deleting any single vertex.
∑ In a bi-connected directed graph, for any two vertices v and w, there are two directed paths from v to w which have no vertices in common other than v and w.

Note that the graph shown in Fig. 13.9(a) is not a bi-connected graph, as deleting vertex C from the graph results in two disconnected components of the original graph (Fig. 13.9(b)).

Figure 13.9  Non bi-connected graph


As for vertices, there is a related concept for edges. An edge in a graph is called a bridge if removing that edge results in a disconnected graph. Also, an edge in a graph that does not lie on a cycle is a bridge. This means that a bridge has at least one articulation point at its end, although it is not necessary that the articulation point is linked to a bridge. Look at the graph shown in Fig. 13.11.

Figure 13.10  Bi-connected graph
Figure 13.11  Graph with bridges

In the graph, CD and DE are bridges. Consider some more examples shown in Fig. 13.12.

Figure 13.12  Graphs with bridges

13.5  REPRESENTATION OF GRAPHS

There are three common ways of storing graphs in the computer's memory. They are:

∑ Sequential representation by using an adjacency matrix.

∑ Linked representation by using an adjacency list that stores the neighbours of a node using

a linked list.

∑ Adjacency multi-list which is an extension of linked representation.

In this section, we will discuss these three schemes in detail.

13.5.1 Adjacency Matrix Representation

An adjacency matrix is used to represent which nodes are adjacent to one another. By definition,

two nodes are said to be adjacent if there is an edge connecting them.

In a directed graph G, if node v is adjacent to node u, then there is definitely an edge from u

to v. That is, if v is adjacent to u, we can get from u to v by traversing one edge. For any graph G

having n nodes, the adjacency matrix will have the dimension of n × n.

  In an adjacency matrix, the rows and columns are labelled by graph vertices. An entry aij in the

adjacency matrix will contain 1, if vertices vi and vj are adjacent to each other. However, if the nodes are

not adjacent, aij will be set to zero. This is summarized in Fig. 13.13.

aij = 1 if vi is adjacent to vj, that is, there is an edge (vi, vj); aij = 0 otherwise.

Figure 13.13  Adjacency matrix entry

Since an adjacency matrix contains only 0s and 1s, it is called a bit matrix or a Boolean matrix. The entries in the matrix depend on the ordering of the nodes in G. Therefore, a change in the order of nodes will result in a different adjacency matrix. Figure 13.14 shows some graphs and their corresponding adjacency matrices.

Figure 13.14  Graphs and their corresponding adjacency matrices: (a) directed graph, (b) directed graph with loop, (c) undirected graph, and (d) weighted graph


From the above examples, we can draw the following conclusions:

∑ For a simple graph (that has no loops), the adjacency matrix has 0s on the diagonal.
∑ The adjacency matrix of an undirected graph is symmetric.
∑ The memory use of an adjacency matrix is O(n²), where n is the number of nodes in the graph.
∑ The number of 1s (or non-zero entries) in an adjacency matrix is equal to the number of edges in the graph.
∑ The adjacency matrix for a weighted graph contains the weights of the edges connecting the nodes.
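As an illustration of how such a matrix is stored, the short C sketch below builds the adjacency matrix of a small directed graph from a list of edges (the particular graph, the bound MAX, and the variable names are assumptions made for this example, not taken from the text):

#include <stdio.h>

#define MAX 4    /* assumed number of vertices: A, B, C, D mapped to 0..3 */

int main()
{
    int adj[MAX][MAX] = {0};           /* all entries start at 0 (no edges) */
    int edges[][2] = { {0,1}, {0,2},   /* A->B, A->C                        */
                       {1,2}, {1,3},   /* B->C, B->D                        */
                       {2,3}, {3,0} }; /* C->D, D->A                        */
    int n_edges = sizeof(edges) / sizeof(edges[0]);
    int i, j;

    for (i = 0; i < n_edges; i++)      /* mark each directed edge */
        adj[edges[i][0]][edges[i][1]] = 1;

    for (i = 0; i < MAX; i++)          /* print the matrix */
    {
        for (j = 0; j < MAX; j++)
            printf("%d ", adj[i][j]);
        printf("\n");
    }
    return 0;
}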

Now let us discuss the powers of an adjacency matrix. From adjacency matrix A1, we can conclude

that an entry 1 in the ith row and jth column means that there exists a path of length 1 from vi
to vj. Now consider, A2, A3, and A4.

(aij)2 = Σk aik akj

The entry (aij)2 is non-zero if aik = akj = 1 for some k. That is, if there is an edge (vi, vk) and an edge (vk, vj), then there is a path from vi to vj of length 2.

Similarly, every entry in the ith row and jth column of A3 gives the number of paths of length 3 from node vi to vj. In general terms, we can conclude that every entry in the ith row and jth column of An (where n is the number of nodes in the graph) gives the number of paths of length n from node vi to vj. Consider the directed graph given in Fig. 13.15. Given its adjacency matrix A, let us calculate A2, A3, and A4.

        A B C D
    A   0 1 1 0
    B   0 0 1 1
    C   0 0 0 1
    D   1 1 0 0

Figure 13.15  Directed graph with its adjacency matrix

A2 = A × A

        [0 1 1 0]   [0 1 1 0]   [0 0 1 2]
   A2 = [0 0 1 1] × [0 0 1 1] = [1 1 0 1]
        [0 0 0 1]   [0 0 0 1]   [1 1 0 0]
        [1 1 0 0]   [1 1 0 0]   [0 1 2 1]

        [0 0 1 2]   [0 1 1 0]   [2 2 0 1]
   A3 = [1 1 0 1] × [0 0 1 1] = [1 2 2 1]
        [1 1 0 0]   [0 0 0 1]   [0 1 2 1]
        [0 1 2 1]   [1 1 0 0]   [1 1 1 3]

        [2 2 0 1]   [0 1 1 0]   [1 3 4 2]
   A4 = [1 2 2 1] × [0 0 1 1] = [1 2 3 4]
        [0 1 2 1]   [0 0 0 1]   [1 1 1 3]
        [1 1 1 3]   [1 1 0 0]   [3 4 2 2]

Now, based on the above calculations, we define matrix B as:

Br = A1 + A2 + A3 + ... + Ar

An entry in the ith row and jth column of matrix Br gives the number of paths of length r or less
than r from vertex vi to vj. The main goal to define matrix B is to obtain the path matrix P. The
path matrix P can be calculated from B by setting an entry Pij = 1 if Bij is non-zero, and Pij = 0


otherwise. The path matrix is used to show whether there exists a simple path from node vi to vj or not. This is shown in Fig. 13.16.

Pij = 1 if there is a path from vi to vj; Pij = 0 otherwise.

Figure 13.16  Path matrix entry

Let us now calculate matrix B and matrix P using the above discussion.

B = A + A2 + A3 + A4

    [0 1 1 0]   [0 0 1 2]   [2 2 0 1]   [1 3 4 2]   [3 6 6 5]
    [0 0 1 1] + [1 1 0 1] + [1 2 2 1] + [1 2 3 4] = [3 5 6 7]
    [0 0 0 1]   [1 1 0 0]   [0 1 2 1]   [1 1 1 3]   [2 3 3 5]
    [1 1 0 0]   [0 1 2 1]   [1 1 1 3]   [3 4 2 2]   [5 7 5 6]

Now the path matrix P can be given as:

        [1 1 1 1]
    P = [1 1 1 1]
        [1 1 1 1]
        [1 1 1 1]
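The powers of A used above can be computed with an ordinary matrix multiplication routine. A minimal sketch is given below (the function name and the fixed bound MAX are assumptions made for illustration):

#define MAX 10   /* assumed upper limit on the number of vertices */

/* result = x * y for two n x n matrices; result[i][j] counts the paths
   obtained by combining one step taken from x with one step taken from y. */
void matrix_multiply(int x[MAX][MAX], int y[MAX][MAX],
                     int result[MAX][MAX], int n)
{
    int i, j, k;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
        {
            result[i][j] = 0;
            for (k = 0; k < n; k++)
                result[i][j] += x[i][k] * y[k][j];
        }
}

Calling it as matrix_multiply(A, A, A2, 4) would produce the matrix A2 shown above, and repeating the call gives the higher powers.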

13.5.2  Adjacency List Representation

An adjacency list is another way in which graphs can be represented in the computer’s memory.
This structure consists of a list of all nodes in G. Furthermore, every node is in turn linked to its
own list that contains the names of all other nodes that are adjacent to it.
The key advantages of using an adjacency list are:

∑ It is easy to follow and clearly shows the adjacent nodes of a particular node.

∑ It is often used for storing graphs that have a small-to-moderate number of edges. That is,

an adjacency list is preferred for representing sparse graphs in the computer’s memory;

otherwise, an adjacency matrix is a good choice.

∑ Adding new nodes in G is easy and straightforward when G is represented using an adjacency list. Adding new nodes in an adjacency matrix is a difficult task, as the size of the matrix needs to be changed and existing nodes may have to be reordered.

Consider the graph given in Fig. 13.17 and see how its adjacency list is stored in the memory.

Figure 13.17  Graph G and its adjacency list

For a directed graph, the sum of the lengths of all adjacency lists is equal to the number of edges in G. However, for an undirected graph, the sum of the lengths of all adjacency lists is equal to twice the number of edges in G because an edge (u, v) means an edge from node u to v as well as an edge from v to u. Adjacency lists can also be modified to store weighted graphs. Let us now see an adjacency list for an undirected graph as well as a weighted graph. This is shown in Fig. 13.18.

Figure 13.18  Adjacency lists for an undirected graph and a weighted graph


13.5.3  Adjacency Multi-list Representation

Graphs can also be represented using multi-lists which can be said to be a modified version of
adjacency lists. Adjacency multi-list is an edge-based rather than a vertex-based representation of
graphs. A multi-list representation basically consists of two parts—a directory of nodes' information
and a set of linked lists storing information about edges. While there is a single entry for each node
in the node directory, every edge, on the other hand, appears in two adjacency lists (one for the
node at each end of the edge). For example, the directory entry for node i points to the adjacency
list for node i. This means that the nodes are shared among several lists.
In a multi-list representation, the information about an edge (vi, vj) of an undirected graph can
be stored using the following attributes:
M: A single bit field to indicate whether the edge has been examined or not.

vi: A vertex in the graph that is connected to vertex vj by an edge.
vj: A vertex in the graph that is connected to vertex vi by an edge.
Link i for vi: A link that points to another node that has an edge incident on vi.
Link j for vj: A link that points to another node that has an edge incident on vj.

Figure 13.19  Undirected graph

Consider the undirected graph given in Fig. 13.19. The adjacency multi-list for the graph can be given as:

Edge      vi   vj   Link for vi   Link for vj
Edge 1    0    1    Edge 2        Edge 3
Edge 2    0    2    NULL          Edge 4
Edge 3    1    3    NULL          Edge 4
Edge 4    2    3    NULL          Edge 5
Edge 5    3    4    NULL          Edge 6
Edge 6    4    5    Edge 7        NULL
Edge 7    4    6    NULL          NULL

Using the adjacency multi-list given above, the adjacency list for vertices can be constructed as
shown below:

VERTEX LIST OF EDGES
0 Edge 1, Edge 2
1 Edge 1, Edge 3
2 Edge 2, Edge 4
3 Edge 3, Edge 4, Edge 5
4 Edge 5, Edge 6, Edge 7
5 Edge 6
6 Edge 7
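The edge-node layout described above can be captured in C with a structure like the following (a minimal sketch; the structure and field names are illustrative assumptions, not taken from the text):

/* One node per edge (vi, vj) of the undirected graph. */
struct edge_node
{
    int  mark;                    /* M: whether the edge has been examined */
    int  vi, vj;                  /* the two endpoints of the edge         */
    struct edge_node *link_i;     /* next edge incident on vertex vi       */
    struct edge_node *link_j;     /* next edge incident on vertex vj       */
};

/* The node directory: entry k points to the first edge incident on vertex k. */
struct edge_node *directory[7];   /* 7 vertices (0..6), as in Fig. 13.19    */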


Programming Example

1. Write a program to create a graph of n vertices using an adjacency list. Also write the
code to read and print its information and finally to delete the graph.

#include <stdio.h>

#include <conio.h>

#include <alloc.h>

struct node

{

char vertex;

struct node *next;

};

struct node *gnode;

void displayGraph(struct node *adj[], int no_of_nodes);

void deleteGraph(struct node *adj[], int no_of_nodes);

void createGraph(struct node *adj[], int no_of_nodes);

int main()

{

struct node *Adj[10];

int i, no_of_nodes;

clrscr();

printf("\n Enter the number of nodes in G: ");

scanf("%d", &no_of_nodes);

for(i = 0; i < no_of_nodes; i++)

Adj[i] = NULL;

createGraph(Adj, no_of_nodes);

printf("\n The graph is: ");

displayGraph(Adj, no_of_nodes);

deleteGraph(Adj, no_of_nodes);

getch();

return 0;

}

void createGraph(struct node *Adj[], int no_of_nodes)

{

struct node *new_node, *last;

int i, j, n, val;

for(i = 0; i < no_of_nodes; i++)

{

last = NULL;

printf("\n Enter the number of neighbours of %d: ", i);

scanf("%d", &n);

for(j = 1; j <= n; j++)

{

printf("\n Enter the neighbour %d of %d: ", j, i);

scanf("%d", &val);

new_node = (struct node *) malloc(sizeof(struct node));

new_node -> vertex = val;

new_node -> next = NULL;

if (Adj[i] == NULL)

Adj[i] = new_node;

else

last -> next = new_node;

last = new_node;

}

}

}

void displayGraph (struct node *Adj[], int no_of_nodes)


{

struct node *ptr;

int i;

for(i = 0; i < no_of_nodes; i++)

{

ptr = Adj[i];

printf("\n The neighbours of node %d are:", i);

while(ptr != NULL)

{

printf("\t%d", ptr –> vertex);

ptr = ptr –> next;

}

}

}

void deleteGraph (struct node *Adj[], int no_of_nodes)

{

int i;

struct node *temp, *ptr;

for(i = 0; i < no_of_nodes; i++)

{

ptr = Adj[i];

while(ptr != NULL)

{

temp = ptr;

ptr = ptr -> next;

free(temp);

}

Adj[i] = NULL;

}

}

Output

Enter the number of nodes in G: 3
Enter the number of neighbours of 0: 1
Enter the neighbour 1 of 0: 2
Enter the number of neighbours of 1: 2
Enter the neighbour 1 of 1: 0
Enter the neighbour 2 of 1: 2
Enter the number of neighbours of 2: 1
Enter the neighbour 1 of 2: 1
The neighbours of node 0 are: 2
The neighbours of node 1 are: 0 2
The neighbours of node 2 are: 1

Note If the graph in the above program had been a weighted graph, then the structure of the node would

have been:
struct node
{
    int vertex;
    int weight;
    struct node *next;
};
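Building on the weighted node shown in the note above, a small helper for inserting a weighted edge at the front of a vertex's adjacency list might look as follows (the function name and parameters are illustrative assumptions):

#include <stdlib.h>

struct node
{
    int vertex;
    int weight;
    struct node *next;
};

/* Adds the weighted edge u -> v at the front of u's adjacency list.
   Adj[] is the array of list heads, as in the program above. */
void add_edge(struct node *Adj[], int u, int v, int w)
{
    struct node *new_node = (struct node *) malloc(sizeof(struct node));
    new_node->vertex = v;
    new_node->weight = w;
    new_node->next   = Adj[u];    /* push at the front of the list */
    Adj[u] = new_node;
}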

13.6  GRAPH TRAVERSAL ALGORITHMS

In this section, we will discuss how to traverse graphs. By traversing a graph, we mean the method
of examining the nodes and edges of the graph. There are two standard methods of graph traversal
which we will discuss in this section. These two methods are:


1. Breadth-first search
2. Depth-first search

While breadth-first search uses a queue as an auxiliary data structure to store nodes for further
processing, the depth-first search scheme uses a stack. But both these algorithms make use of
a variable STATUS. During the execution of the algorithm, every node in the graph will have the
variable STATUS set to 1, 2, or 3, depending on its current state. Table 13.1 shows the value of STATUS
and its significance.

Table 13.1  Value of status and its significance

Status    State of the node    Description
1         Ready                The initial state of the node N
2         Waiting              Node N is placed on the queue or stack and is waiting to be processed
3         Processed            Node N has been completely processed

13.6.1  Breadth-First Search Algorithm

Breadth-first search (BFS) is a graph search algorithm that begins at the root node and explores all

the neighbouring nodes. Then for each of those nearest nodes, the algorithm (Fig. 13.20) explores

their unexplored neighbour nodes, and so on, until it finds the goal.

That is, we start examining the node A and then all the neighbours of A are examined. In the next step, we examine the neighbours of the neighbours of A, and so on. This means that we need to track the neighbours of each node and guarantee that every node in the graph is processed and no node is processed more than once. This is accomplished by using a queue that will hold the nodes that are waiting for further processing and a variable STATUS to represent the current state of the node.

Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
        [END OF LOOP]
Step 6: EXIT

Figure 13.20  Algorithm for breadth-first search

Example 13.1  Consider the graph G given in Fig. 13.21. The adjacency list of G is also given. Assume that G represents the daily flights between different cities and we want to fly from city A to I with minimum stops. That is, find the minimum path P from A to I given that every edge has a length of 1.
Solution
The minimum path P can be found by applying the breadth-first search algorithm that begins at city A and ends when I is encountered. During the execution of the algorithm, we use two arrays: QUEUE and ORIG. While QUEUE is used to hold the nodes that have to be processed, ORIG is used to keep track of the origin of each edge. Initially, FRONT = REAR = -1.

Adjacency lists
A: B, C, D
B: E
C: B, G
D: C, G
E: C, F
F: C, H
G: F, H, I
H: E, I
I: F

Figure 13.21  Graph G and its adjacency list

The algorithm proceeds as follows:
(a) Add A to QUEUE and add NULL to ORIG.

FRONT = 0    QUEUE = A
REAR = 0     ORIG = \0


(b) Dequeue a node by setting FRONT = FRONT + 1 (remove the FRONT element of QUEUE) and enqueue
the neighbours of A. Also, add A as the ORIG of its neighbours.

FRONT = 1 QUEUE = A B C D

REAR = 3 ORIG = \0 A A A

(c) Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbours of B. Also, add B as
the ORIG of its neighbours.

FRONT = 2 QUEUE = A B C D E

REAR = 4 ORIG = \0 A A A B

(d) Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbours of C. Also, add C as
the ORIG of its neighbours. Note that C has two neighbours B and G. Since B has already been
added to the queue and it is not in the Ready state, we will not add B and only add G.

FRONT = 3 QUEUE = A B C D E G

REAR = 5 ORIG = \0 A A A B C

(e) Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbours of D. Also, add D as
the ORIG of its neighbours. Note that D has two neighbours C and G. Since both of them have
already been added to the queue and they are not in the Ready state, we will not add them
again.

FRONT = 4 QUEUE = A B C D E G

REAR = 5 ORIG = \0 A A A B C

(f) Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbours of E. Also, add E as
the ORIG of its neighbours. Note that E has two neighbours C and F. Since C has already been
added to the queue and it is not in the Ready state, we will not add C and add only F.

FRONT = 5 QUEUE = A B C D E G F

REAR = 6 ORIG = \0 A A A B C E

(g) Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbours of G. Also, add G as
the ORIG of its neighbours. Note that G has three neighbours F, H, and I.

FRONT = 6 QUEUE = A B C D E G F H I

REAR = 9 ORIG = \0 A A A B C E G G

Since F has already been added to the queue, we will only add H and I. As I is our final
destination, we stop the execution of this algorithm as soon as it is encountered and added
to the QUEUE. Now, backtrack from I using ORIG to find the minimum path P. Thus, we have
P as A -> C -> G -> I.
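The ORIG bookkeeping used in the example above can be carried in code as a predecessor array. The sketch below shows one possible way to record it during a breadth-first search over an adjacency matrix (the array sizes and names are assumptions; it is not the textbook's program):

#define MAX 10            /* assumed upper limit on the number of vertices */

/* Breadth-first search from 'start' over an n x n adjacency matrix.
   orig[v] records the vertex from which v was first reached, so the
   minimum path can be recovered by walking orig[] back from the goal. */
void bfs_with_origin(int adj[MAX][MAX], int n, int start, int orig[MAX])
{
    int queue[MAX], front = 0, rear = 0;
    int status[MAX], i, v;

    for (i = 0; i < n; i++) { status[i] = 1; orig[i] = -1; }  /* ready state */

    queue[rear++] = start;
    status[start] = 2;                                        /* waiting     */

    while (front < rear)
    {
        v = queue[front++];
        status[v] = 3;                                        /* processed   */
        for (i = 0; i < n; i++)
            if (adj[v][i] == 1 && status[i] == 1)
            {
                queue[rear++] = i;
                status[i] = 2;
                orig[i] = v;                                  /* i reached from v */
            }
    }
}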

Features of Breadth-First Search Algorithm

Space complexity In the breadth-first search algorithm, all the nodes at a particular level must
be saved until their child nodes in the next level have been generated. The space complexity is
therefore proportional to the number of nodes at the deepest level of the graph. Given a graph with
branching factor b (number of children at each node) and depth d, the asymptotic space complexity
is the number of nodes at the deepest level, O(b^d).


If the number of vertices and edges in the graph are known ahead of time, the space complexity
can also be expressed as O ( | E | + | V | ), where | E | is the total number of edges in G and
| V | is the number of nodes or vertices.

Time complexity  In the worst case, breadth-first search has to traverse through all paths to
all possible nodes, thus the time complexity of this algorithm asymptotically approaches O(b^d).
However, the time complexity can also be expressed as O( | E | + | V | ), since every vertex and
every edge will be explored in the worst case.

Completeness  Breadth-first search is said to be a complete algorithm because if there is a
solution, breadth-first search will find it regardless of the kind of graph. But in case of an infinite
graph where there is no possible solution, it will diverge.

Optimality  Breadth-first search is optimal for a graph that has edges of equal length, since it
always returns the result with the fewest edges between the start node and the goal node. But
generally, in real-world applications, we have weighted graphs that have costs associated with
each edge, so the goal next to the start does not have to be the cheapest goal available.

Applications of Breadth-First Search Algorithm

Breadth-first search can be used to solve many problems such as:
∑ Finding all connected components in a graph G.
∑ Finding all nodes within an individual connected component.
∑ Finding the shortest path between two nodes, u and v, of an unweighted graph.
∑ Finding a path between two nodes, u and v, of a weighted graph (BFS ignores the edge weights, so the path found is shortest in terms of the number of edges, not the total weight).

Programming Example

2. Write a program to implement the breadth-first search algorithm.

#include <stdio.h>

#define MAX 5

void breadth_first_search(int adj[][MAX],int visited[],int start)

{

int queue[MAX], rear = -1, front = -1, i;

queue[++rear] = start;

visited[start] = 1;

while(rear != front)

{

start = queue[++front];

printf("%c \t", start + 65);

for(i = 0; i < MAX; i++)

{

if(adj[start][i] == 1 && visited[i] == 0)

{

queue[++rear] = i;

visited[i] = 1;

}

}

}

}

int main()


{
int visited[MAX] = {0};
int adj[MAX][MAX], i, j;
printf("\n Enter the adjacency matrix: ");
for(i = 0; i < MAX; i++)
for(j = 0; j < MAX; j++)
scanf("%d", &adj[i][j]);
breadth_first_search(adj,visited,0);
return 0;
}

Output

Enter the adjacency matrix:
01010
10110
01001
11001
00110
ABDCE

13.6.2  Depth-first Search Algorithm

The depth-first search algorithm (Fig. 13.22) progresses by expanding the starting node of G and

then going deeper and deeper until the goal node is found, or until a node that has no children is

encountered. When a dead-end is reached, the algorithm backtracks, returning to the most recent

node that has not been completely explored.

In other words, depth-first search begins at a starting node A which becomes the current node.

Then, it examines each node N along a path P which begins at A. That is, we process a neighbour

of A, then a neighbour of neighbour of A, and so on. During the execution of the algorithm, if we

reach a path that has a node N that has already been processed, then we backtrack to the current

node. Otherwise, the unvisited (unprocessed) node becomes the current node.

The algorithm (Fig. 13.22) proceeds like this until we reach a dead-end (end of path P). On reaching the dead-end, we backtrack to find another path P'. The algorithm terminates when backtracking leads back to the starting node A. In this algorithm, edges that lead to a new vertex are called discovery edges and edges that lead to an already visited vertex are called back edges.

Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until STACK is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
        [END OF LOOP]
Step 6: EXIT

Figure 13.22  Algorithm for depth-first search

Observe that this algorithm is similar to the preorder traversal of a binary tree. Its implementation is similar to that of the breadth-first search algorithm but here we

use a stack instead of a queue. Again, we use a variable STATUS to represent the current state of

the node.

Example 13.2  Consider the graph G given in Fig. 13.23. The adjacency list of G is also given.
Suppose we want to print all the nodes that can be reached from the node H (including H itself).
One alternative is to use a depth-first search of G starting at node H. The procedure can be
explained here.


Adjacency lists
A: B, C, D
B: E
C: B, G
D: C, G
E: C, F
F: C, H
G: F, H, I
H: E, I
I: F

Figure 13.23  Graph G and its adjacency list

Solution
(a) Push H onto the stack.

STACK: H

(b) Pop and print the top element of the STACK, that is, H. Push all the neighbours of H onto the
stack that are in the ready state. The STACK now becomes

PRINT: H STACK: E, I

(c) Pop and print the top element of the STACK, that is, I. Push all the neighbours of I onto the
stack that are in the ready state. The STACK now becomes

PRINT: I STACK: E, F

(d) Pop and print the top element of the STACK, that is, F. Push all the neighbours of F onto the
stack that are in the ready state. (Note F has two neighbours, C and H. But only C will be added,
as H is not in the ready state.) The STACK now becomes

PRINT: F STACK: E, C

(e) Pop and print the top element of the STACK, that is, C. Push all the neighbours of C onto the
stack that are in the ready state. The STACK now becomes

PRINT: C STACK: E, B, G

(f) Pop and print the top element of the STACK, that is, G. Push all the neighbours of G onto the
stack that are in the ready state. Since there are no neighbours of G that are in the ready state,
no push operation is performed. The STACK now becomes

PRINT: G STACK: E, B

(g) Pop and print the top element of the STACK, that is, B. Push all the neighbours of B onto the
stack that are in the ready state. Since there are no neighbours of B that are in the ready state,
no push operation is performed. The STACK now becomes

PRINT: B STACK: E

(h) Pop and print the top element of the STACK, that is, E. Push all the neighbours of E onto the
stack that are in the ready state. Since there are no neighbours of E that are in the ready state,
no push operation is performed. The STACK now becomes empty.

PRINT: E STACK:

Since the STACK is now empty, the depth-first search of G starting at node H is complete and the
nodes which were printed are:


H, I, F, C, G, B, E

These are the nodes which are reachable from the node H.

Features of Depth-First Search Algorithm

Space complexity The space complexity of a depth-first search is lower than that of a breadth-
first search.

Time complexity The time complexity of a depth-first search is proportional to the number of
vertices plus the number of edges in the graphs that are traversed. The time complexity can be
given as (O(|V| + |E|)).
Completeness Depth-first search is said to be a complete algorithm. If there is a solution, depth-
first search will find it regardless of the kind of graph. But in case of an infinite graph, where there
is no possible solution, it will diverge.

Applications of Depth-First Search Algorithm

Depth-first search is useful for:

∑ Finding a path between two specified nodes, u and v, of an unweighted graph.
∑ Finding a path between two specified nodes, u and v, of a weighted graph.
∑ Finding whether a graph is connected or not.
∑ Computing the spanning tree of a connected graph.
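For the connectivity check listed above, depth-first search is often written recursively instead of with an explicit stack. The following is a minimal sketch (the function names and the fixed bound MAX are assumptions for illustration): the graph is connected exactly when a single DFS from vertex 0 marks every vertex as visited.

#define MAX 10            /* assumed upper limit on the number of vertices */

/* Recursive depth-first search over an n x n adjacency matrix. */
void dfs(int adj[MAX][MAX], int visited[MAX], int n, int v)
{
    int i;
    visited[v] = 1;
    for (i = 0; i < n; i++)
        if (adj[v][i] && !visited[i])
            dfs(adj, visited, n, i);
}

/* Returns 1 if the undirected graph is connected, 0 otherwise. */
int is_connected(int adj[MAX][MAX], int n)
{
    int visited[MAX] = {0}, i;
    dfs(adj, visited, n, 0);
    for (i = 0; i < n; i++)
        if (!visited[i])
            return 0;
    return 1;
}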

Programming Example

3. Write a program to implement the depth-first search algorithm.

#include <stdio.h>

#define MAX 5

void depth_first_search(int adj[][MAX],int visited[],int start)

{

int stack[MAX];

int top = -1, i;

printf("%c–",start + 65);

visited[start] = 1;

stack[++top] = start;

while(top != -1)

{

start = stack[top];

for(i = 0; i < MAX; i++)

{

if(adj[start][i] && visited[i] == 0)

{

stack[++top] = i;

printf("%c–", i + 65);

visited[i] = 1;

break;

}

}

if(i == MAX)

top--;

}

}

int main()

{

int adj[MAX][MAX];

int visited[MAX] = {0}, i, j;


printf("\n Enter the adjacency matrix: ");
for(i = 0; i < MAX; i++)
for(j = 0; j < MAX; j++)
scanf("%d", &adj[i][j]);
printf("DFS Traversal: ");
depth_first_search(adj,visited,0);
printf("\n");
return 0;
}

Output

Enter the adjacency matrix:
01010
10110
01001
11001
00110
DFS Traversal: A-B-C-E-D-

13.7  TOPOLOGICAL SORTING

Topological sort of a directed acyclic graph (DAG) G is defined as a linear ordering of its nodes

in which each node comes before all nodes to which it has outbound edges. Every DAG has one

or more topological sorts.

A topological sort of a DAG G is an ordering of the vertices of G such that if G contains an edge (u, v), then u appears before v in the ordering. Note that topological sort is possible only on directed acyclic graphs that do not have any cycles. For a DAG that contains cycles, no linear ordering of its vertices is possible. In simple words, a topological ordering of a DAG G is an ordering of its vertices such that any directed path in G traverses the vertices in increasing order.

Topological sorting is widely used in scheduling applications, jobs, or tasks. The jobs that have to be completed are represented by nodes, and there is an edge from node u to v if job u must be completed before job v can be started. A topological sort of such a graph gives an order in which the given jobs must be performed.

One main property of a DAG is that the more the number of edges in a DAG, the fewer the number of topological orders it has. This is because each edge (u, v) forces node u to occur before v, which restricts the number of valid permutations of the nodes.

Example 13.3  Consider the three DAGs shown in Fig. 13.24 and their possible topological sorts.

Figure 13.24  Topological sorts. Possible topological sorts of the three DAGs include:
∑ First DAG: A, B, C, D, E; A, B, C, E, D; A, C, B, D, E; A, C, B, E, D; ...
∑ Second DAG: A, B, D, C, E, F; A, B, D, C, F, E; A, B, C, D, E, F; A, B, C, D, F, E; ...
∑ Third DAG: A, B, C, D, E, F, G; A, B, C, D, F, E, G; A, B, D, C, E, F, G; A, B, D, C, F, E, G; ...

Algorithm

The algorithm for the topological sort of a graph (Fig. 13.25) that has no cycles focuses on selecting a node N with zero in-degree, that is, a node that has no predecessor. The two main steps involved in the topological sort algorithm include:

∑ Selecting a node N with zero in-degree
∑ Deleting N from the graph along with its edges


Step 1: Find the in-degree INDEG(N) of every node
in the graph

Step 2: Enqueue all the nodes with a zero in-degree
Step 3: Repeat Steps 4 and 5 until the QUEUE is empty
Step 4: Remove the front node N of the QUEUE by setting

FRONT = FRONT + 1
Step 5: Repeat for each neighbour M of node N:

a) Delete the edge from N to M by setting
INDEG(M) = INDEG(M) - 1

b) IF INDEG(M) = 0, then Enqueue M, that is,
add M to the rear of the queue

[END OF INNER LOOP]
[END OF LOOP]
Step 6: Exit

Figure 13.25  Algorithm for topological sort

We will use a QUEUE to hold the nodes with zero in-degree. The order in which the nodes will be
deleted from the graph will depend on the sequence in which the nodes are inserted in the QUEUE.
Then, we will use a variable INDEG, where INDEG(N) will represent the in-degree of node N.

Notes 1. The in-degree can be calculated in two ways—either by counting the incoming edges from the graph
or traversing through the adjacency list.
2. The running time of the algorithm for topological sorting can be given linearly as the number of

nodes plus the number of edges O(|V|+|E|).

Example 13.4  Consider a directed acyclic graph G given in Fig. 13.26. We use the algorithm
given above to find a topological sort T of G. The steps are given as below:

Adjacency lists
A: B
B: C, D, E
C: E
D: E
E: F
G: D

Figure 13.26  Graph G

Step 1: Find the in-degree INDEG(N) of every node in the graph

INDEG(A) = 0 INDEG(B) = 1 INDEG(C) = 1 INDEG(D) = 2

INDEG(E) = 3 INDEG(F) = 1 INDEG(G) = 0

Step 2: Enqueue all the nodes with a zero in-degree

FRONT = 1 REAR = 2 QUEUE = A, G

Step 3: Remove the front element A from the queue by setting FRONT = FRONT + 1, so

FRONT = 2 REAR = 2 QUEUE = A, G

Step 4: Set INDEG(B) = INDEG(B) – 1, since B is the neighbour of A. Note that INDEG(B) is 0, so

add it on the queue. The queue now becomes

FRONT = 2 REAR = 3 QUEUE = A, G, B

Delete the edge from A to B. The graph now becomes as shown in the figure below



Step 5: Remove the front element G from the queue by setting FRONT = FRONT + 1, so

FRONT = 3 REAR = 3 QUEUE = A, G, B

Step 6: Set INDEG(D) = INDEG(D) – 1, since D is the neighbour of G. Now,

INDEG(C) = 1 INDEG(D) = 1 INDEG(E) = 3 INDEG(F) = 1

Delete the edge from G to D. The graph now becomes as shown in the figure below


Step 7: Remove the front element B from the queue by setting FRONT = FRONT + 1, so

FRONT = 4 REAR = 3 QUEUE = A, G, B

Step 8: Set INDEG(C) = INDEG(C) - 1, INDEG(D) = INDEG(D) - 1, INDEG(E) = INDEG(E) - 1, since C, D, and E are the neighbours of B. Now,
INDEG(C) = 0, INDEG(D) = 0 and INDEG(E) = 2
Step 9: Since the in-degrees of nodes C and D are zero, add C and D at the rear of the queue. The queue

can be given as below:

FRONT = 4 REAR = 5 QUEUE = A, G, B, C, D

The graph now becomes as shown in the figure below


Step 10: Remove the front element C from the queue by setting FRONT = FRONT + 1, so

FRONT = 5 REAR = 5 QUEUE = A, G, B, C, D

Step 11: Set INDEG(E) = INDEG(E) – 1, since E is the neighbour of C. Now, INDEG(E) = 1

The graph now becomes as shown in the figure below


Step 12: Remove the front element D from the queue by setting FRONT = FRONT + 1, so

FRONT = 6 REAR = 5 QUEUE = A, G, B, C, D

Step 13: Set INDEG(E) = INDEG(E) – 1, since E is the neighbour of D. Now, INDEG(E) = 0, so

add E to the queue. The queue now becomes.

FRONT = 6 REAR = 6 QUEUE = A, G, B, C, D, E

Step 14: Delete the edge between D and E. The graph now becomes as shown in the figure below



Step 15: Remove the front element E from the queue by setting FRONT = FRONT + 1, so

FRONT = 7 REAR = 6 QUEUE = A, G, B, C, D, E

Step 16: Set INDEG(F) = INDEG(F) - 1, since F is the neighbour of E. Now INDEG(F) = 0, so

add F to the queue. The queue now becomes,

FRONT = 7 REAR = 7 QUEUE = A, G, B, C, D, E, F

Step 17: Delete the edge between E and F. The graph now becomes as shown in the figure below


There are no more edges in the graph and all the nodes have been added to the queue, so the
topological sort T of G can be given as: A, G, B, C, D, E, F. When we arrange these nodes in a
sequence, we find that if there is an edge from u to v, then u appears before v.

A   G   B   C   D   E   F

Figure 13.27  Topological sort of G

Programming Example

4. Write a program to implement topological sorting.

#include <stdio.h>
#include <conio.h>
#define MAX 10
int n,adj[MAX][MAX];
int front = -1, rear = -1, queue[MAX];
void create_graph(void);
void display();
void insert_queue(int);
int delete_queue(void);
int find_indegree(int);
void main()
{
int node, j = 0, del_node;
int topsort[MAX],indeg[MAX];
create_graph();
printf("\n The adjacency matrix is:");
display();
/*Find the in–degree of each node*/
for(node = 1; node <= n; node++)
{
indeg[node] = find_indegree(node);
if( indeg[node] == 0 )


insert_queue(node);

}

while(front <= rear) /*Continue loop until queue is empty */

{

del_node = delete_queue();

topsort[j] = del_node; /*Add the deleted node to topsort*/

j++;

/*Delete the del_node edges */

for(node = 1; node <= n; node++)

{

if(adj[del_node][node] == 1 )

{

adj[del_node][node] = 0;

indeg[node] = indeg[node] - 1;

if(indeg[node] == 0)

insert_queue(node);

}

  }

}

printf("The topological sorting can be given as :\n");

for(node = 0; node < j; node++)

printf("%d ",topsort[node]);

}

void create_graph()

{

int i,max_edges,org,dest;

printf("\n Enter the number of vertices: ");

scanf("%d",&n);

max_edges = n*(n - 1);

for(i = 1; i <= max_edges; i++)

{

printf("\n Enter edge %d(0 to quit): ",i);

scanf("%d %d",&org,&dest);

if((org == 0) && (dest == 0))

break;

if( org > n || dest > n || org <= 0 || dest <= 0)

{

printf("\n Invalid edge");

i--;

}

else

adj[org][dest] = 1;

}

}

void display()

{

int i,j;

for(i=1;i<=n;i++)

{

printf("\n");

for(j=1;j<=n;j++)

printf("%3d",adj[i][j]);

}

}

void insert_queue(int node)

{

if (rear == MAX-1)

printf("\n OVERFLOW ");

else

{

if (front == -1) /*If queue is initially empty */


front=0;

queue[++rear] = node ;

}

}

int delete_queue()

{

int del_node;

if (front == -1 || front > rear)

{

printf("\n UNDERFLOW ");

return 0;

}

else

{

del_node = queue[front++];

return del_node;

}

}

int find_indegree(int node)

{

int i,in_deg = 0;

for(i = 1; i <= n; i++)

{

if( adj[i][node] == 1 )

in_deg++;

}

return in_deg;

}

Output

Enter number of vertices: 7
Enter edge 1(0 to quit): 1 2
Enter edge 2(0 to quit): 2 3
Enter edge 3(0 to quit): 2 5
Enter edge 4(0 to quit): 2 4
Enter edge 5(0 to quit): 3 5
Enter edge 6(0 to quit): 4 5
Enter edge 7(0 to quit): 5 6
Enter edge 8(0 to quit): 7 4
The topological sorting can be given as:

1 7 2 3 4 5 6

13.8  SHORTEST PATH ALGORITHMS

In this section, we will discuss three different algorithms to calculate the shortest path between
the vertices of a graph G. These algorithms include:

∑ Minimum spanning tree
∑ Dijkstra’s algorithm
∑ Warshall’s algorithm
While the first two use an adjacency list to find the shortest path, Warshall’s algorithm uses an
adjacency matrix to do the same.

13.8.1  Minimum Spanning Trees

A spanning tree of a connected, undirected graph G is a sub-graph of G which is a tree that connects
all the vertices together. A graph G can have many different spanning trees. We can assign weights
to each edge (which is a number that represents how unfavourable the edge is), and use it to assign
a weight to a spanning tree by calculating the sum of the weights of the edges in that spanning


tree. A minimum spanning tree (MST) is defined as a spanning tree with weight less than or equal
to the weight of every other spanning tree. In other words, a minimum spanning tree is a spanning
tree that has weights associated with its edges, and the total weight of the tree (the sum of the
weights of its edges) is at a minimum.

An Analogy

Take an analogy of a cable TV company laying cable in a new neighbourhood. If it is restricted
to bury the cable only along particular paths, then we can make a graph that represents the points
that are connected by those paths. Some paths may be more expensive (due to their length or the
depth at which the cable should be buried) than the others. We can represent these paths by edges
with larger weights.
Therefore, a spanning tree for such a graph would be a subset of those paths that has no cycles
but still connects to every house. Many distinct spanning trees can be obtained from this graph,
but a minimum spanning tree would be the one with the lowest total cost.

Properties

Possible multiplicity  There can be multiple minimum spanning trees of the same weight.
Particularly, if all the weights are the same, then every spanning tree will be minimum.

Uniqueness  When each edge in the graph is assigned a different weight, then there will be only
one unique minimum spanning tree.

Minimum-cost subgraph  If the edges of a graph are assigned non-negative weights, then a
minimum spanning tree is in fact the minimum-cost subgraph or a tree that connects all vertices.

Cycle property  If there exists a cycle C in the graph G and the weight of an edge e of C is larger than
the weights of all the other edges of C, then the edge e cannot belong to an MST.

Usefulness  Minimum spanning trees can be computed quickly and easily to provide optimal
solutions. These trees create a sparse subgraph that reflects a lot about the original graph.

Simplicity  The minimum spanning tree of a weighted graph is nothing but a spanning tree of
the graph which comprises of n–1 edges of minimum total weight. Note that for an unweighted
graph, any spanning tree is a minimum spanning tree.

Example 13.5   Consider an unweighted graph G given below (Fig. 13.28). From G, we can draw
many distinct spanning trees. Eight of them are given here. For an unweighted graph, every
spanning tree is a minimum spanning tree.


Figure 13.28  Unweighted graph and its spanning trees

Example 13.6  Consider a weighted graph G shown in Fig. 13.29. From G, we can draw several distinct spanning trees, but only a single minimum spanning tree can be obtained, that is, the one that has the minimum weight (cost) associated with it.


Of all the spanning trees given in Fig. 13.29, the one that is highlighted is called the minimum
spanning tree, as it has the lowest cost associated with it.

(The spanning trees shown have total costs 12, 9, 15, 10, 11, 11, 14, 15, and 14; the one with cost 9 is the minimum spanning tree.)

Figure 13.29  Weighted graph and its spanning trees

Applications of Minimum Spanning Trees

1. MSTs are widely used for designing networks. For instance, people separated by varying
distances wish to be connected together through a telephone network. A minimum spanning
tree is used to determine the least costly paths with no cycles in this network, thereby providing
a connection that has the minimum cost involved.

2. MSTs are used to find airline routes. While the vertices in the graph denote cities, edges
represent the routes between these cities. No doubt, the greater the distance between the cities,
the higher will be the amount charged. Therefore, MSTs are used to optimize airline routes by
finding the least costly path with no cycles.

3. MSTs are also used to find the cheapest way to connect terminals, such as cities, electronic
components or computers via roads, airlines, railways, wires or telephone lines.

4. MSTs are applied in routing algorithms for finding the most efficient path.

13.8.2  Prim’s Algorithm

Prim’s algorithm is a greedy algorithm that is used to form a minimum spanning tree for
a connected weighted undirected graph. In other words, the algorithm builds a tree that includes
every vertex and a subset of the edges in such a way that the total weight of all the edges in the tree
is minimized. For this, the algorithm maintains three sets of vertices which can be given as below:

∑ Tree vertices  Vertices that are a part of the minimum spanning tree T.
∑ Fringe vertices Vertices that are currently not a part of T, but are adjacent to some tree vertex.

∑ Unseen vertices Vertices that are neither tree vertices nor fringe vertices fall under this
category.

The steps involved in Prim's algorithm are shown in Fig. 13.30.

∑ Choose a starting vertex.
∑ Branch out from the starting vertex and, during each iteration, select a new vertex and an edge. Basically, during each iteration of the algorithm, we have to select a vertex from the fringe vertices in such a way that the edge connecting the tree vertex and the new vertex has the minimum weight assigned to it.

Step 1: Select a starting vertex
Step 2: Repeat Steps 3 and 4 until there are fringe vertices
Step 3: Select an edge e connecting the tree vertex and fringe vertex that has minimum weight
Step 4: Add the selected edge and the vertex to the minimum spanning tree T
        [END OF LOOP]
Step 5: EXIT

Figure 13.30  Prim's algorithm


The running time of Prim's algorithm can be given as O(E log V), where E is the number of edges and V is the number of vertices in the graph.

Figure 13.31  Graph G

Example 13.7  Construct a minimum spanning tree of the graph given in Fig. 13.31.

Step 1: Choose a starting vertex A.

Step 2: Add the fringe vertices (that are adjacent to A). The edges connecting the vertex and fringe
vertices are shown with dotted lines.

Step 3: Select an edge connecting the tree vertex and the fringe vertex that has the minimum
weight and add the selected edge and the vertex to the minimum spanning tree T. Since the edge
connecting A and C has less weight, add C to the tree. Now C is not a fringe vertex but a tree vertex.

Step 4: Add the fringe vertices (that are adjacent to C).

Step 5: Select an edge connecting the tree vertex and the fringe vertex that has the minimum
weight and add the selected edge and the vertex to the minimum spanning tree T. Since the edge
connecting C and B has less weight, add B to the tree. Now B is not a fringe vertex but a tree vertex.

Step 6: Add the fringe vertices (that are adjacent to B).

Step 7: Select an edge connecting the tree vertex and the fringe vertex that has the minimum
weight and add the selected edge and the vertex to the minimum spanning tree T. Since the
edge connecting B and D has less weight, add D to the tree. Now D is not a fringe vertex but a
tree vertex.

Step 8: Note, now node E is not connected, so we will add it in the tree because a minimum
spanning tree is one in which all the n nodes are connected with n–1 edges that have minimum
weight. With E added, all the n nodes are connected and the minimum spanning tree is complete.


Example 13.8  Construct a minimum spanning tree of the graph given in Fig. 13.32. Start the
Prim’s algorithm from vertex D.


Figure 13.32 Graph G



13.8.3  Kruskal’s Algorithm

Kruskal’s algorithm is used to find the minimum spanning tree for a connected weighted graph.
The algorithm aims to find a subset of the edges that forms a tree that includes every vertex. The
total weight of all the edges in the tree is minimized. However, if the graph is not connected, then
it finds a minimum spanning forest. Note that a forest is a collection of trees. Similarly, a minimum
spanning forest is a collection of minimum spanning trees.
Kruskal’s algorithm is an example of a greedy algorithm, as it makes the locally optimal
choice at each stage with the hope of finding the global optimum. The algorithm is shown in
Fig. 13.33.

Step 1: Create a forest in such a way that each vertex of the graph is a separate

tree.

Step 2: Create a priority queue Q that contains all the edges of the

graph.

Step 3: Repeat Steps 4 and 5 while Q is NOT EMPTY

Step 4: Remove an edge from Q

Step 5: IF the edge obtained in Step 4 connects two different trees,

then Add it to the forest (for combining two trees into one

tree).

ELSE

Discard the edge

Step 6: END

Figure 13.33  Kruskal’s algorithm


In the algorithm, we use a priority queue Q in which edges that have minimum weight take
priority over any other edge in the graph. When the Kruskal's algorithm terminates, the forest has
only one component and forms a minimum spanning tree of the graph. The running time of
Kruskal’s algorithm can be given as O(E log V), where E is the number of edges and V is the number
of vertices in the graph.
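Kruskal's algorithm needs a way of telling whether an edge joins two different trees of the forest. The sketch below uses a simple union-find (disjoint-set) array for that test; the edge structure, array bound, and function names are assumptions made for illustration and are not from the text.

#define MAX_V 10                  /* assumed upper limit on vertices */

struct edge { int u, v, w; };     /* one weighted edge (u, v) */

int parent[MAX_V];

int find_set(int x)               /* returns the representative of x's tree */
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* edges[] must already be sorted by increasing weight (the priority queue
   of Fig. 13.33). Returns the total weight of the minimum spanning tree. */
int kruskal(struct edge edges[], int n_edges, int n_vertices)
{
    int i, cost = 0, ru, rv;
    for (i = 0; i < n_vertices; i++)
        parent[i] = i;            /* each vertex starts as its own tree     */

    for (i = 0; i < n_edges; i++)
    {
        ru = find_set(edges[i].u);
        rv = find_set(edges[i].v);
        if (ru != rv)             /* edge connects two different trees      */
        {
            parent[ru] = rv;      /* combine the two trees                  */
            cost += edges[i].w;
        }                         /* otherwise the edge is discarded        */
    }
    return cost;
}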

Example 13.9  Apply Kruskal's algorithm on the graph given in Fig. 13.34.

Figure 13.34  Graph G

Initially, we have
F = {{A}, {B}, {C}, {D}, {E}, {F}}
MST = {}
Q = {(A, D), (E, F), (C, E), (E, D), (C, D), (D, F), (A, C), (A, B), (B, C)}

Step 1: Remove the edge (A, D) from Q and make the following changes:

F = {{A, D}, {B}, {C}, {E}, {F}}
MST = {(A, D)}
Q = {(E, F), (C, E), (E, D), (C, D), (D, F), (A, C), (A, B), (B, C)}

Step 2: Remove the edge (E, F) from Q and make the following changes:

F = {{A, D}, {B}, {C}, {E, F}}
MST = {(A, D), (E, F)}
Q = {(C, E), (E, D), (C, D), (D, F), (A, C), (A, B), (B, C)}

Step 3: Remove the edge (C, E) from Q and make the following changes:

F = {{A, D}, {B}, {C, E, F}}
MST = {(A, D), (C, E), (E, F)}
Q = {(E, D), (C, D), (D, F), (A, C), (A, B), (B, C)}

Step 4: Remove the edge (E, D) from Q and make the following changes:

F = {{A, C, D, E, F}, {B}}
MST = {(A, D), (C, E), (E, F), (E, D)}
Q = {(C, D), (D, F), (A, C), (A, B), (B, C)}

Step 5: Remove the edge (C, D) from Q. Note that this edge does not connect different trees, so
simply discard this edge. Only an edge connecting (A, D, C, E, F) to B will be added to the MST.
Therefore,


F = {{A, C, D, E, F}, {B}}
MST = {(A, D), (C, E), (E, F), (E, D)}
Q = {(D, F), (A, C), (A, B), (B, C)}

Step 6: Remove the edge (D, F) from Q. Note that this edge does not connect different trees, so
simply discard this edge. Only an edge connecting (A, D, C, E, F) to B will be added to the MST.
F = {{A, C, D, E, F}, {B}}
MST = {(A, D), (C, E), (E, F), (E, D)}
Q = {(A, C), (A, B), (B, C)}

Step 7: Remove the edge (A, C) from Q. Note that this edge does not connect different trees, so
simply discard this edge. Only an edge connecting (A, D, C, E, F) to B will be added to the MST.

F = {{A, C, D, E, F}, {B}}
MST = {(A, D), (C, E), (E, F), (E, D)}
Q = {(A, B), (B, C)}

Step 8: Remove the edge (A, B) from Q and make the following changes:

F = {A, B, C, D, E, F}
MST = {(A, D), (C, E), (E, F), (E, D), (A, B)}
Q = {(B, C)}

Step 9: The algorithm continues until Q is empty. Since the entire forest has become one tree,
all the remaining edges will simply be discarded. The resultant MST can be given as shown below.

F = {A, B, C, D, E, F}
MST = {(A, D), (C, E), (E, F), (E, D), (A, B)}
Q = {}

Programming Example

5. Write a program which finds the cost of a minimum spanning tree.

#include<stdio.h>

#include<conio.h>

#define MAX 10

int adj[MAX][MAX], tree[MAX][MAX], n;

void readmatrix()

{

int i, j;

printf("\n Enter the number of nodes in the Graph : ");

scanf("%d", &n);

printf("\n Enter the adjacency matrix of the Graph");

for (i = 1; i <= n; i++)

for (j = 1; j <= n; j++)

scanf("%d", &adj[i][j]);

}

int spanningtree(int src)


{

int visited[MAX], d[MAX], parent[MAX];

int i, j, k, min, u, v, cost;

for (i = 1; i <= n; i++)

{

d[i] = adj[src][i];

visited[i] = 0;

parent[i] = src;

}

visited[src] = 1;

cost = 0;

k = 1;

for (i = 1; i < n; i++)

{

min = 9999;

for (j = 1; j <= n; j++)

{

if (visited[j]==0 && d[j] < min)

{

min = d[j];

u = j;

}

}

visited[u] = 1;

cost = cost + d[u];   /* add the weight of the edge selected in this iteration */

tree[k][1] = parent[u];

tree[k++][2] = u;

for (v = 1; v <= n; v++)

if (visited[v]==0 && (adj[u][v] < d[v]))

{

d[v] = adj[u][v];

parent[v] = u;

}

}

return cost;

}

void display(int cost)

{

int i;

printf("\n The Edges of the Minimum Spanning Tree are");

for (i = 1; i < n; i++)

printf(" %d %d \n", tree[i][1], tree[i][2]);

printf("\n The Total cost of the Minimum Spanning Tree is : %d", cost);

}

int main()

{

int source, treecost;

readmatrix();

printf("\n Enter the Source : ");

scanf("%d", &source);

treecost = spanningtree(source);

display(treecost);

return 0;

}

Output
Enter the number of nodes in the Graph : 4

Enter the adjacency matrix : 0 1 1 0

0 0 0 1

0 1 0 0


1 0 1 0
Enter the source : 1
The edges of the Minimum Spanning Tree are   1   4
4 2
2 3
The total cost of the Minimum Spanning Tree is : 0

13.8.4  Dijkstra’s Algorithm

Dijkstra's algorithm, given by the Dutch computer scientist Edsger Dijkstra in 1959, is used to find the shortest
path tree. This algorithm is widely used in network routing protocols, most notably IS-IS and
OSPF (Open Shortest Path First).
Given a graph G and a source node A, the algorithm is used to find the shortest path (one having
the lowest cost) between A (source node) and every other node. Moreover, Dijkstra’s algorithm
is also used for finding the costs of the shortest paths from a source node to a destination node.
For example, if we draw a graph in which nodes represent the cities and weighted edges
represent the driving distances between pairs of cities connected by a direct road, then Dijkstra’s
algorithm when applied gives the shortest route between one city and all other cities.

Algorithm

Dijkstra’s algorithm is used to find the length of an optimal path between two nodes in a graph.
The term optimal can mean anything, shortest, cheapest, or fastest. If we start the algorithm with
an initial node, then the distance of a node Y can be given as the distance from the initial node to
that node. Figure 13.35 explains the Dijkstra’s algorithm.

1. Select the source node also called the initial node
2. Define an empty set N that will be used to hold nodes to which a shortest

path has been found.
3. Label the initial node with 0, and insert it into N.
4. Repeat Steps 5 to 7 until the destination node is in N or there are no more

labelled nodes in N.
5. Consider each node that is not in N and is connected by an edge from

the newly inserted node.
6. (a) If the node that is not in N has no label then SET the label of the

node = the label of the newly inserted node + the length of the edge.
(b) Else if the node that is not in N was already labelled, then SET its new

label = minimum (label of newly inserted vertex + length of edge, old
label)
7. Pick a node not in N that has the smallest label assigned to it and add it
to N.

Figure 13.35  Dijkstra’s algorithm

Dijkstra’s algorithm labels every node in the graph where the labels represent the distance
(cost) from the source node to that node. There are two kinds of labels: temporary and permanent.
Temporary labels are assigned to nodes that have not been reached, while permanent labels are
given to nodes that have been reached and their distance (cost) to the source node is known. A
node has either a permanent label or a temporary label, but not both.
The execution of this algorithm will produce either of the following two results:
1. If the destination node is labelled, then the label will in turn represent the distance from the

source node to the destination node.
2. If the destination node is not labelled, then there is no path from the source to the destination

node.


Example 13.10  Consider the graph G given in Fig. 13.36. Taking D as the initial node, execute the Dijkstra's algorithm on it.

Figure 13.36  Graph G

Step 1: Set the label of D = 0 and N = {D}.
Step 2: Label of D = 0, B = 15, G = 23, and F = 5. Therefore, N = {D, F}.
Step 3: Label of D = 0, B = 15, G has been re-labelled 18 because minimum (5 + 13, 23) = 18, and C has been re-labelled 14 (5 + 9). Therefore, N = {D, F, C}.
Step 4: Label of D = 0, B = 15, G = 18. Therefore, N = {D, F, C, B}.
Step 5: Label of D = 0, B = 15, G = 18 and A = 19 (15 + 4). Therefore, N = {D, F, C, B, G}.
Step 6: Label of D = 0 and A = 19. Therefore, N = {D, F, C, B, G, A}

  Note that we have no labels for node E; this means that E is not reachable from D. Only the

nodes that are in N are reachable from D.
The running time of Dijkstra's algorithm can be given as O(|V|² + |E|) = O(|V|²), where V is the
set of vertices and E is the set of edges in the graph.
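A common array-based realization of Dijkstra's algorithm is sketched below. It keeps a distance label for every vertex and repeatedly makes the smallest temporary label permanent, as described above (the INFINITY value, the bound MAX, and the function name are illustrative assumptions; a 0 entry in adj[][] is taken to mean "no edge"):

#define MAX 10
#define INFINITY 9999

/* Computes dist[v] = length of the shortest path from 'start' to v
   over an n x n weighted adjacency matrix (0 means no edge). */
void dijkstra(int adj[MAX][MAX], int n, int start, int dist[MAX])
{
    int permanent[MAX] = {0};     /* 1 once a label becomes permanent */
    int i, v, u, min;

    for (i = 0; i < n; i++)
        dist[i] = INFINITY;
    dist[start] = 0;

    for (i = 0; i < n; i++)
    {
        /* pick the vertex with the smallest temporary label */
        u = -1; min = INFINITY;
        for (v = 0; v < n; v++)
            if (!permanent[v] && dist[v] < min) { min = dist[v]; u = v; }
        if (u == -1) break;       /* remaining vertices are unreachable */
        permanent[u] = 1;

        /* relax the edges leaving u */
        for (v = 0; v < n; v++)
            if (adj[u][v] != 0 && dist[u] + adj[u][v] < dist[v])
                dist[v] = dist[u] + adj[u][v];
    }
}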

Difference between Dijkstra’s Algorithm and Minimum Spanning Tree

Minimum spanning tree algorithm is used to traverse a graph in the most efficient manner, but
Dijkstra’s algorithm calculates the distance from a given vertex to every other vertex in the graph.
Dijkstra’s algorithm is very similar to Prim’s algorithm. Both the algorithms begin at a specific
node and extend outward within the graph, until all other nodes in the graph have been reached.
The point where these algorithms differ is the quantity stored at each node: Prim’s algorithm
stores, for each node, at most one minimum cost edge connecting it to the tree, whereas Dijkstra’s
algorithm stores the total cost from the source node to the current node, that is, the summation
of the edge costs along the path from the source.

13.8.5  Warshall’s Algorithm

If a graph G is given as G = (V, E), where V is the set of vertices and E is the set of edges, the path
matrix of G can be found as P = A + A² + A³ + ... + Aⁿ. This is a lengthy process, so Warshall has given
a very efficient algorithm to calculate the path matrix. Warshall’s algorithm defines matrices
P0, P1, P2, ..., Pn as given in Fig. 13.37.

Pk[i][j] = 1   [if there is a path from vi to vj; the path should not use any other nodes except v1, v2, ..., vk]
         = 0   [otherwise]

Figure 13.37  Path matrix entry

  This means that if P0[i][j] = 1, then there exists an edge from node vi to vj.
  If P1[i][j] = 1, then there exists a path from vi to vj that does not use any other vertex except v1.
  If P2[i][j] = 1, then there exists a path from vi to vj that does not use any other vertex except v1 and v2.
  Note that P0 is equal to the adjacency matrix of G. If G contains n nodes, then Pn = P, which is the
path matrix of the graph G.

From the above discussion, we can conclude that Pk[i][j] is equal to 1 only when either of the
following two cases occurs:

∑ There is a path from vi to vj that does not use any other node except v1, v2, ..., vk–1. Therefore,
Pk–1[i][j] = 1.

∑ There is a path from vi to vk and a path from vk to vj, where both paths use only the nodes
v1, v2, ..., vk–1 as intermediate nodes. Therefore,

Pk–1[i][k] = 1 AND Pk–1[k][j] = 1

Hence, the path matrix Pn can be calculated with the formula given as:

Pk[i][j] = Pk–1[i][j] V (Pk–1[i][k] L Pk–1[k][j])

where V indicates logical OR operation and L indicates logical AND operation.
  Figure 13.38 shows the Warshall’s algorithm to find the path matrix P using the adjacency
matrix A.

Step 1: [INITIALIZE the Path Matrix] Repeat Step 2 for I = 0 to n-1,
        where n is the number of nodes in the graph
Step 2:     Repeat Step 3 for J = 0 to n-1
Step 3:         IF A[I][J] = 0, then SET P[I][J] = 0
                ELSE P[I][J] = 1
            [END OF LOOP]
        [END OF LOOP]
Step 4: [Calculate the path matrix P] Repeat Step 5 for K = 0 to n-1
Step 5:     Repeat Step 6 for I = 0 to n-1
Step 6:         Repeat Step 7 for J = 0 to n-1
Step 7:             SET PK[I][J] = PK-1[I][J] V (PK-1[I][K] L PK-1[K][J])
Step 8: EXIT

Example 13.11  Consider the graph in Fig. 13.39 and its adjacency matrix A. We can straightaway
calculate the path matrix P using Warshall’s algorithm.

The adjacency matrix A of the graph (a directed graph on the vertices A, B, C, D) is:

        A B C D
    A   0 1 1 0
    B   0 0 1 1
    C   0 0 0 1
    D   1 1 0 0

  The path matrix P can be given in a single step as:

        A B C D
    A   1 1 1 1
    B   1 1 1 1
    C   1 1 1 1
    D   1 1 1 1

Figure 13.39  Graph G and its path matrix P

Thus, we see that calculating A, A², A³, ..., Aⁿ to calculate P is a very slow and inefficient
technique as compared to Warshall’s technique.

Programming Example

6. Write a program to implement Warshall’s algorithm to find the path matrix.

#include <stdio.h>
#include <conio.h>
void read (int mat[5][5], int n);
void display (int mat[5][5], int n);
void mul(int mat[5][5], int n);
int main()
{
int adj[5][5], P[5][5], n, i, j, k;
clrscr();


printf("\n Enter the number of nodes in the graph : ");

scanf("%d", &n);

printf("\n Enter the adjacency matrix : ");

read(adj, n);

clrscr();

printf("\n The adjacency matrix is : ");

display(adj, n);

for(i=0;i<n;i++)

{

for(j=0;j<n;j++)

{

if(adj[i][j] == 0)

P[i][j] = 0;

else

P[i][j] = 1;

}

}

for(k=0; k<n;k++)

{

for(i=0;i<n;i++)

{

for(j=0;j<n;j++)

P[i][j] = P[i][j] | ( P[i][k] & P[k][j]);

}

}

printf("\n The Path Matrix is :");

display (P, n);

getch();

return 0;

}

void read(int mat[5][5], int n)

{

int i, j;

for(i=0;i<n;i++)

{

for(j=0;j<n;j++)

{

printf("\n mat[%d][%d] = ", i, j);

scanf("%d", &mat[i][j]);

}

}

}

void display(int mat[5][5], int n)
{
    int i, j;
    for(i=0;i<n;i++)
    {
        printf("\n");
        for(j=0;j<n;j++)
            printf("%d\t", mat[i][j]);
    }
}

Output

The adjacency matrix is
0 1 1 0
0 0 1 1
0 0 0 1
1 1 0 0


The Path Matrix is
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1

13.8.6  Modified Warshall's Algorithm

Warshall’s algorithm can be modified to obtain a matrix that gives the shortest paths between the
nodes in a graph G. As an input to the algorithm, we take the adjacency matrix A of G and replace
all the values of A which are zero by infinity (∞). Infinity (∞) denotes a very large number and
indicates that there is no path between the vertices. In the modified Warshall’s algorithm, we obtain
a set of matrices Q0, Q1, Q2, ..., Qn using the formula given below.

Qk[i][j] = Minimum(Qk–1[i][j], Qk–1[i][k] + Qk–1[k][j])

Q0 is exactly the same as A, with the difference that every element having a zero value in A is replaced
by ∞ in Q0. Using the given formula, the matrix Qn gives the shortest path between every pair of
vertices of the graph. The modified Warshall’s algorithm is shown in Fig. 13.40.

Step 1: [Initialize the Shortest Path Matrix, Q] Repeat Step 2 for I = 0 to n-1,
        where n is the number of nodes in the graph
Step 2:     Repeat Step 3 for J = 0 to n-1
Step 3:         IF A[I][J] = 0, then SET Q[I][J] = Infinity (or 9999)
                ELSE Q[I][J] = A[I][J]
            [END OF LOOP]
        [END OF LOOP]
Step 4: [Calculate the shortest path matrix Q] Repeat Step 5 for K = 0 to n-1
Step 5:     Repeat Step 6 for I = 0 to n-1
Step 6:         Repeat Step 7 for J = 0 to n-1
Step 7:             IF Q[I][J] <= Q[I][K] + Q[K][J]
                        SET Q[I][J] = Q[I][J]
                    ELSE SET Q[I][J] = Q[I][K] + Q[K][J]
                    [END OF IF]
                [END OF LOOP]
            [END OF LOOP]
        [END OF LOOP]
Step 8: EXIT

Figure 13.40  Modified Warshall’s algorithm

Example 13.12  Consider the unweighted graph G given in Fig. 13.41 and apply the modified
Warshall’s (shortest path) algorithm to it.

The adjacency matrix of G is:

    0 1 1 0
    0 0 0 1
    0 1 0 0
    1 0 1 0

Figure 13.41  Graph G (vertices A, B, C, D)

Q0 =
    9999    1    1 9999
    9999 9999 9999    1
    9999    1 9999 9999
       1 9999    1 9999

Q1 =
    9999    1    1 9999
    9999 9999 9999    1
    9999    1 9999 9999
       1    2    1 9999

Q2 =
    9999    1    1    2
    9999 9999 9999    1
    9999    1 9999    2
       1    2    1    3

Q3 = Q2 (no entry changes)

Q4 = Q =
       3    1    1    2
       2    3    2    1
       3    1    3    2
       1    2    1    3


Example 13.13  Consider a weighted graph G given in Fig. 13.42 and apply Warshall’s shortest
path algorithm to it.

The adjacency matrix of G is:

    0 3 3 0
    0 0 0 1
    0 3 0 0
    2 0 5 0

Figure 13.42  Graph G (vertices A, B, C, D with weighted edges)

Q0 =
    9999    3    3 9999
    9999 9999 9999    1
    9999    3 9999 9999
       2 9999    5 9999

Q1 =
    9999    3    3 9999
    9999 9999 9999    1
    9999    3 9999 9999
       2    5    5 9999

Q2 =
    9999    3    3    4
    9999 9999 9999    1
    9999    3 9999    4
       2    5    5    6

Q3 = Q2 (no entry changes)

Q4 = Q =
       6    3    3    4
       3    6    6    1
       6    3    9    4
       2    5    5    6

Programming Example

7. Write a program to implement Warshall’s modified algorithm to find the shortest path.

#include <stdio.h>
#include <conio.h>
#define INFINITY 9999
void read (int mat[5][5], int n);
void display(int mat[5][5], int n);
int main()
{
int adj[5][5], Q[5][5], n, i, j, k;
clrscr();
printf("\n Enter the number of nodes in the graph : ");
scanf("%d", &n);
printf("\n Enter the adjacency matrix : ");
read(adj, n);
clrscr();
printf("\n The adjacency matrix is : ");
display(adj, n);
for(i=0;i<n;i++)
{
for(j=0;j<n;j++)
{
if(adj[i][j] == 0)
Q[i][j] = INFINITY;
else
Q[i][j] = adj[i][j];
}
}
for(k=0; k<n;k++)
{
for(i=0;i<n;i++)
{


for(j=0;j<n;j++)

{

if(Q[i][j] <= Q[i][k] + Q[k][j])

Q[i][j] = Q[i][j];

else

Q[i][j] = Q[i][k] + Q[k][j];

}

}

printf("\n\n");

display(Q, n);

}

getch();

return 0;

}

void read(int mat[5][5], int n)

{

int i, j;

for(i=0;i<n;i++)

{

for(j=0;j<n;j++)

{

printf("\n mat[%d][%d] = ", i, j);

scanf("%d", &mat[i][j]);

}

}

}

void display(int mat[5][5], int n)

{

int i, j;

for(i=0;i<n;i++)

{printf("\n");

for(j=0;j<n;j++)

printf("%d\t", mat[i][j]);

}

}

Output

6 3 3 4

3 6 6 1

6 3 9 4

2 5 5 6

13.9 APPLICATIONS OF GRAPHS

Graphs are constructed for various types of applications such as:

∑ In circuit networks where points of connection are drawn as vertices and component wires become the edges of the graph.
∑ In transport networks where stations are drawn as vertices and routes become the edges of the graph.
∑ In maps that draw cities/states/regions as vertices and adjacency relations as edges.
∑ In program flow analysis where procedures or modules are treated as vertices and calls to these procedures are drawn as edges of the graph.
∑ Once we have a graph of a particular concept, it can be easily used for finding shortest paths, project planning, etc.
∑ In flowcharts or control-flow graphs, the statements and conditions in a program are represented as nodes and the flow of control is represented by the edges.


∑ In state transition diagrams, the nodes are used to represent states and the edges represent
legal moves from one state to the other.

∑ Graphs are also used to draw activity network diagrams. These diagrams are extensively used
as a project management tool to represent the interdependent relationships between groups,
steps, and tasks that have a significant impact on the project.

An Activity Network Diagram (AND), also known as an Arrow Diagram or a PERT (Program
Evaluation Review Technique) chart, is used to identify time sequences of events which are pivotal
to objectives. It is also helpful when a project has multiple activities which need simultaneous
management. ANDs help the project development team to create a realistic project schedule by
drawing graphs that exhibit:
∑ the total amount of time needed to complete the project
∑ the sequence in which activities must be performed
∑ the activities that can be performed simultaneously
∑ the critical activities that must be monitored on a regular basis.
A sample AND is shown in Fig. 13.43.

(Figure 13.43 shows a sample AND: boxes denote activities and arrows denote dependency among
activities. START is the starting point and END is the end-point of the project. A1 must be completed
before A2; A2 and A3 may be performed simultaneously; A4 must be completed before A5 but after A3.)

Figure 13.43  Activity network diagram

Points to Remember

∑ A graph is basically a collection of vertices (also called nodes) and edges that connect these vertices.
∑ Degree of a node u is the total number of edges containing the node u. When the degree of a node is zero, it is also called an isolated node. A path P is known as a closed path if the edge has the same end-points. A closed simple path with length 3 or more is known as a cycle.
∑ A graph in which there exists a path between any two of its nodes is called a connected graph. An edge that has identical end-points is called a loop. The size of a graph is the total number of edges in it.
∑ The out-degree of a node is the number of edges that originate at u.
∑ The in-degree of a node is the number of edges that terminate at u. A node u is known as a sink if it has a positive in-degree but a zero out-degree.
∑ A transitive closure of a graph is constructed to answer reachability questions.
∑ Since an adjacency matrix contains only 0s and 1s, it is called a bit matrix or a Boolean matrix. The memory use of an adjacency matrix is O(n²), where n is the number of nodes in the graph.
∑ Topological sort of a directed acyclic graph G is defined as a linear ordering of its nodes in which each node comes before all the nodes to which it has outbound edges. Every DAG has one or more number of topological sorts.
∑ A vertex v of G is called an articulation point if removing v along with the edges incident to v results in a graph that has at least two connected components.
∑ A biconnected graph is defined as a connected graph that has no articulation vertices.
∑ Breadth-first search is a graph search algorithm that begins at the root node and explores all the neighbouring nodes. Then for each of those nearest nodes, the algorithm explores their unexplored neighbour nodes, and so on, until it finds the goal.
∑ The depth-first search algorithm progresses by expanding the starting node of G and thus going deeper and deeper until a goal node is found, or until a node that has no children is encountered.
∑ A spanning tree of a connected, undirected graph G is a sub-graph of G which is a tree that connects all the vertices together.
∑ Kruskal’s algorithm is an example of a greedy algorithm, as it makes the locally optimal choice at each stage with the hope of finding the global optimum.
∑ Dijkstra’s algorithm is used to find the length of an optimal path between two nodes in a graph.

Exercises

Review Questions

1. Explain the relationship between a linked list structure and a digraph.
2. What is a graph? Explain its key terms.
3. How are graphs represented inside a computer’s memory? Which method do you prefer and why?
4. Consider the graph given below.
   (a) Write the adjacency matrix of G.
   (b) Write the path matrix of G.
   (c) Is the graph biconnected?
   (d) Is the graph complete?
   (e) Find the shortest path matrix using Warshall’s algorithm.
   (graph figure)
5. Explain the graph traversal algorithms in detail with example.
6. Draw a complete undirected graph having five nodes.
7. Consider the graph given below and find out the degree of each node.
   (graph figure on the vertices A, B, C, D, and E)
8. Consider the graph given below. State all the simple paths from A to D, B to D, and C to D. Also, find out the in-degree and out-degree of each node. Is there any source or sink in the graph?
   (graph figure on the vertices A, B, C, and D)
9. Consider the graph given below. Find out its depth-first and breadth-first traversal scheme.
   (graph figure)
10. Differentiate between depth-first search and breadth-first search traversal of a graph.
11. Explain the topological sorting of a graph G.
12. Define spanning tree.
13. When is a spanning tree called a minimum spanning tree? Take a weighted graph of your choice and find out its minimum spanning tree.
14. Explain Prim’s algorithm.
15. Write a brief note on Kruskal’s algorithm.
16. Write a short note on Dijkstra’s algorithm.


17. Differentiate between Dijkstra’s algorithm and minimum spanning tree algorithm.
18. Consider the graph given below. Find the minimum spanning tree of this graph using (a) Prim’s algorithm, (b) Kruskal’s algorithm, and (c) Dijkstra’s algorithm.
    (weighted graph figure on the vertices A to F)
19. Briefly discuss Warshall’s algorithm. Also, discuss its modified version.
20. Show the working of Floyd–Warshall’s algorithm to find the shortest paths between all pairs of nodes in the following graph.
    (graph figure on the vertices A, B, C, and D)
21. Write a short note on transitive closure of a graph.
22. Given the adjacency matrix of a graph, write a program to calculate the degree of a node N in the graph.
23. Given the adjacency matrix of a graph, write a program to calculate the in-degree and the out-degree of a node N in the graph.
24. Given the adjacency matrix of a graph, write a function isFullConnectedGraph which returns 1 if the graph is fully connected and 0 otherwise.
25. In which kind of graph do we use topological sorting?
26. Consider the graph given below and show its adjacency list in the memory.
    (graph figure on the vertices A, B, C, and D)
27. Consider the graph given in Question 26 and show the changes in the graph as well as its adjacency list when node E and edges (A, E) and (C, E) are added to it. Also, delete edge (B, D) from the graph.
28. Given the following adjacency matrix, draw the weighted graph.

    0 4 0 2 0
    0 0 0 7 0
    0 5 0 0 0
    0 0 0 0 3
    0 0 1 0 0

29. Consider five cities: (1) New Delhi, (2) Mumbai, (3) Chennai, (4) Bangalore, and (5) Kolkata, and a list of flights that connect these cities as shown in the following table. Use the given information to construct a graph.

    Flight No.   Origin   Destination
    101          2        3
    102          3        2
    103          5        3
    104          3        4
    105          2        5
    106          5        2
    107          5        1
    108          1        4
    109          5        4
    110          4        5

Programming Exercises

1. Write a program to create and print a graph.
2. Write a program to determine whether there is at least one path from the source to the destination.

Multiple-choice Questions

1. An edge that has identical end-points is called a
   (a) Multi-path  (b) Loop  (c) Cycle  (d) Multi-edge
2. The total number of edges containing the node u is called
   (a) In-degree  (b) Out-degree  (c) Degree  (d) None of these
3. A graph in which there exists a path between any two of its nodes is called
   (a) Complete graph  (b) Connected graph  (c) Digraph  (d) In-directed graph
4. The number of edges that originate at u are called
   (a) In-degree  (b) Out-degree  (c) Degree  (d) Source


5. The memory use of an adjacency matrix is
   (a) O(n)  (b) O(n²)  (c) O(n³)  (d) O(log n)
6. The term optimal can mean
   (a) Shortest  (b) Cheapest  (c) Fastest  (d) All of these
7. How many articulation vertices does a biconnected graph contain?
   (a) 0  (b) 1  (c) 2  (d) 3

True or False

1. Graph is a linear data structure.
2. In-degree of a node is the number of edges leaving that node.
3. The size of a graph is the total number of vertices in it.
4. A sink has a zero in-degree but a positive out-degree.
5. The space complexity of depth-first search is lower than that of breadth-first search.
6. A node is known as a sink if it has a positive out-degree but the in-degree = 0.
7. A directed graph that has no cycles is called a directed acyclic graph.
8. A graph G can have many different spanning trees.
9. Fringe vertices are not a part of T, but are adjacent to some tree vertex.
10. Kruskal’s algorithm is an example of a greedy algorithm.

Fill in the Blanks

1. ______ has a zero degree.
2. In-degree of a node is the number of edges that ______ at u.
3. Adjacency matrix is also known as a ______.
4. A path P is known as a ______ path if the edge has the same end-points.
5. A graph with multiple edges and/or a loop is called a ______.
6. Vertices that are a part of the minimum spanning tree T are called ______.
7. A ______ of a graph is constructed to answer reachability questions.
8. An ______ is a vertex v of G if removing v along with the edges incident to v results in a graph that has at least two connected components.
9. A ______ graph is a connected graph that is not broken into disconnected pieces by deleting any single vertex.
10. An edge is called a ______ if removing that edge results in a disconnected graph.

Chapter 14

Searching and Sorting

Learning Objective

Searching and sorting are two of the most common operations in computer science.
Searching refers to finding the position of a value in a collection of values. Sorting
refers to arranging data in a certain order. The two commonly used orders are
numerical order and alphabetical order. In this chapter, we will discuss the different
techniques of searching and sorting arrays of numbers or characters.

14.1  Introduction to  SEARCHING
Searching means to find whether a particular value is present in an array or not. If the value is
present in the array, then searching is said to be successful and the searching process gives the
location of that value in the array. However, if the value is not present in the array, the searching
process displays an appropriate message and in this case searching is said to be unsuccessful.
There are two popular methods for searching the array elements: linear search and binary search.
The algorithm that should be used depends entirely on how the values are organized in the array.
For example, if the elements of the array are arranged in ascending order, then binary search
should be used, as it is more efficient for sorted lists in terms of complexity. We will discuss all
these methods in detail in this section.

14.2  Linear Search
Linear search, also called sequential search, is a very simple method used for searching an array
for a particular value. It works by comparing the value to be searched with every element of the
array one by one in a sequence until a match is found. Linear search is mostly used to search an
unordered list of elements (array in which data elements are not sorted). For example, if an array
A[] is declared and initialized as,

int A[] = {10, 8, 2, 7, 3, 4, 9, 1, 6, 5};


and the value to be searched is VAL = 7, then searching means to find whether the value ‘7’ is
present in the array or not. If yes, then it returns the position of its occurrence. Here, POS = 3
(index starting from 0).
  Figure 14.1 shows the algorithm for linear search.
  In Steps 1 and 2 of the algorithm, we initialize the value of POS and I. In Step 3, a while loop is
executed that would be executed till I is less than N (total number of elements in the array). In
Step 4, a check is made to see if a match is found between the current array element and VAL. If a
match is found, then the position of the array element is printed, else the value of I is incremented
to match the next element with VAL. However, if all the array elements have been compared with
VAL and no match is found, then it means that VAL is not present in the array.

LINEAR_SEARCH(A, N, VAL)
Step 1: [INITIALIZE] SET POS = -1
Step 2: [INITIALIZE] SET I = 1
Step 3: Repeat Step 4 while I <= N
Step 4:     IF A[I] = VAL
                SET POS = I
                PRINT POS
                Go to Step 6
            [END OF IF]
            SET I = I + 1
        [END OF LOOP]
Step 5: IF POS = -1
            PRINT "VALUE IS NOT PRESENT IN THE ARRAY"
        [END OF IF]
Step 6: EXIT

Figure 14.1  Algorithm for linear search

Complexity of Linear Search Algorithm

Linear search executes in O(n) time where n is the number of elements in the array. Obviously,
the best case of linear search is when VAL is equal to the first element of the array. In this case,
only one comparison will be made. Likewise, the worst case will happen when either VAL is not
present in the array or it is equal to the last element of the array. In both the cases, n comparisons
will have to be made. However, the performance of the linear search algorithm can be improved
by using a sorted array.

Programming Example

1. Write a program to search an element in an array using the linear search technique.

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#define size 20 // Added so the size of the array can be altered more easily
int main(int argc, char *argv[]) {
int arr[size], num, i, n, found = 0, pos = -1;
printf("\n Enter the number of elements in the array : ");
scanf("%d", &n);
printf("\n Enter the elements: ");
for(i=0;i<n;i++)
{
scanf("%d", &arr[i]);
}
printf("\n Enter the number that has to be searched : ");
scanf("%d", &num);
for(i=0;i<n;i++)
{
if(arr[i] == num)
{
found =1;
pos=i;
printf("\n %d is found in the array at position= %d", num,i+1);
/* +1 added so that the element stored at index 0 is
   displayed as position 1 instead of 0 */
break;
}
}
if (found == 0)


printf("\n %d does not exist in the array", num);
return 0;
}

14.3  Binary Search

Binary search is a searching algorithm that works efficiently with a sorted list. The mechanism
of binary search can be better understood by an analogy of a telephone directory. When we are
searching for a particular name in a directory, we first open the directory from the middle and
then decide whether to look for the name in the first part of the directory or in the second part of
the directory. Again, we open some page in the middle and the whole process is repeated until
we finally find the right name.
Take another analogy. How do we find words in a dictionary? We first open the dictionary
somewhere in the middle. Then, we compare the first word on that page with the desired word
whose meaning we are looking for. If the desired word comes before the word on the page, we
look in the first half of the dictionary, else we look in the second half. Again, we open a page in
the first half of the dictionary and compare the first word on that page with the desired word and
repeat the same procedure until we finally get the word. The same mechanism is applied in the
binary search.
Now, let us consider how this mechanism is applied to search for a value in a sorted array.
Consider an array A[] that is declared and initialized as

int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

and the value to be searched is VAL = 9. The algorithm will proceed in the following manner.

BEG = 0, END = 10, MID = (0 + 10)/2 = 5

Now, VAL = 9 and A[MID] = A[5] = 5
A[5] is less than VAL, therefore, we now search for the value in the second half of the array. So,
we change the values of BEG and MID.
Now, BEG = MID + 1 = 6, END = 10, MID = (6 + 10)/2 =16/2 = 8

VAL = 9 and A[MID] = A[8] = 8

A[8] is less than VAL, therefore, we now search for the value in the second half of the segment.
So, again we change the values of BEG and MID.
Now, BEG = MID + 1 = 9, END = 10, MID = (9 + 10)/2 = 9
Now, VAL = 9 and A[MID] = 9.
In this algorithm, we see that BEG and END are the beginning and ending positions of the segment
that we are looking to search for the element. MID is calculated as (BEG + END)/2. Initially, BEG =
lower_bound and END = upper_bound. The algorithm will terminate when A[MID] = VAL. When the
algorithm ends, we will set POS = MID. POS is the position at which the value is present in the array.
However, if VAL is not equal to A[MID], then the values of BEG, END, and MID will be changed
depending on whether VAL is smaller or greater than A[MID].
(a) If VAL < A[MID], then VAL will be present in the left segment of the array. So, the value of END

will be changed as END = MID – 1.
(b) If VAL > A[MID], then VAL will be present in the right segment of the array. So, the value of BEG

will be changed as BEG = MID + 1.
  Finally, if VAL is not present in the array, then eventually, END will be less than BEG. When this
happens, the algorithm will terminate and the search will be unsuccessful.
  Figure 14.2 shows the algorithm for binary search.
  In Step 1, we initialize the value of variables, BEG, END, and POS. In Step 2, a while loop is
executed until BEG is less than or equal to END. In Step 3, the value of MID is calculated. In Step 4,
we check if the array value at MID is equal to VAL (item to be searched in the array). If a match is


found, then the value of POS is printed and the algorithm exits. However, if a match is not found
and the value of A[MID] is greater than VAL, the value of END is modified; otherwise, if A[MID] is less
than VAL, then the value of BEG is altered. In Step 5, if the value of POS = –1, then VAL is not present
in the array and an appropriate message is printed on the screen before the algorithm exits.

BINARY_SEARCH(A, lower_bound, upper_bound, VAL)
Step 1: [INITIALIZE] SET BEG = lower_bound, END = upper_bound, POS = -1
Step 2: Repeat Steps 3 and 4 while BEG <= END
Step 3:     SET MID = (BEG + END)/2
Step 4:     IF A[MID] = VAL
                SET POS = MID
                PRINT POS
                Go to Step 6
            ELSE IF A[MID] > VAL
                SET END = MID - 1
            ELSE
                SET BEG = MID + 1
            [END OF IF]
        [END OF LOOP]
Step 5: IF POS = -1
            PRINT "VALUE IS NOT PRESENT IN THE ARRAY"
        [END OF IF]
Step 6: EXIT

Figure 14.2  Algorithm for binary search

Complexity of Binary Search Algorithm

The complexity of the binary search algorithm can be expressed as f(n), where n is the number of
elements in the array. The complexity of the algorithm is calculated depending on the number of
comparisons that are made. In the binary search algorithm, we see that with each comparison, the
size of the segment where the search has to be made is reduced to half. Thus, we can say that, in
order to locate a particular value in the array, the total number of comparisons that will be made
is given as

2^f(n) > n  or  f(n) = log₂n

Programming Example

2. Write a program to search an element in an array using binary search.

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#define size 10 // Added to make changing size of array easier
int smallest(int arr[], int k, int n); // Added to sort array
void selection_sort(int arr[], int n); // Added to sort array
int main(int argc, char *argv[]) {
int arr[size], num, i, n, beg, end, mid, found=0;
printf("\n Enter the number of elements in the array: ");
scanf("%d", &n);
printf("\n Enter the elements: ");
for(i=0;i<n;i++)
{
scanf("%d", &arr[i]);
}
selection_sort(arr, n); // Added to sort the array
printf("\n The sorted array is: \n");
for(i=0;i<n;i++)
printf(" %d\t", arr[i]);
printf("\n\n Enter the number that has to be searched: ");
scanf("%d", &num);
beg = 0, end = n-1;
while(beg<=end)
{
mid = (beg + end)/2;
if (arr[mid] == num)
{
printf("\n %d is present in the array at position %d", num, mid+1);
found =1;
break;
}


else if (arr[mid]>num)
end = mid-1;
else
beg = mid+1;
}
if (beg > end && found == 0)
printf("\n %d does not exist in the array", num);
return 0;
}

int smallest(int arr[], int k, int n)
{
int pos = k, small=arr[k], i;
for(i=k+1;i<n;i++)
{
if(arr[i]< small)
{
small = arr[i];
pos = i;
}
}
return pos;
}
void selection_sort(int arr[],int n)
{
int k, pos, temp;
for(k=0;k<n;k++)
{
pos = smallest(arr, k, n);
temp = arr[k];
arr[k] = arr[pos];
arr[pos] = temp;
}

}

14.4  INTERPOLATION SEARCH

Interpolation search, also known as extrapolation search, is a searching technique that finds a
specified value in a sorted array. The concept of interpolation search is similar to how we search
for names in a telephone book or for keys by which a book’s entries are ordered. For example,
when looking for a name “Bharat” in a telephone directory, we know that it will be near the
extreme left, so applying a binary search technique by dividing the list in two halves each time
is not a good idea. We must start scanning the extreme left in the first pass itself.
  In each step of interpolation search, the remaining search space for the value to be found is
calculated. The calculation is done based on the values at the bounds of the search space and the
value to be searched. The value found at this estimated position is then compared with the value
being searched for. If the two values are equal, then the search is complete.
  However, in case the values are not equal then depending on the comparison, the remaining
search space is reduced to the part before or after the estimated position. Thus, we see that
interpolation search is similar to the binary search technique. However, the important difference
between the two techniques is that binary search always selects the middle value of the remaining
search space. It discards half of the values based on the comparison between the value found at the
estimated position and the value to be searched. But in interpolation search, interpolation is used to
find an item near the one being searched for, and then linear search is used to find the exact item.

The algorithm for interpolation search is given in Fig. 14.3.

INTERPOLATION_SEARCH (A, lower_bound, upper_bound, VAL)
Step 1: [INITIALIZE] SET LOW = lower_bound, HIGH = upper_bound, POS = -1
Step 2: Repeat Steps 3 to 4 while LOW <= HIGH
Step 3:     SET MID = LOW + (HIGH - LOW) × ((VAL - A[LOW]) / (A[HIGH] - A[LOW]))
Step 4:     IF VAL = A[MID]
                POS = MID
                PRINT POS
                Go to Step 6
            ELSE IF VAL < A[MID]
                SET HIGH = MID - 1
            ELSE
                SET LOW = MID + 1
            [END OF IF]
        [END OF LOOP]
Step 5: IF POS = -1
            PRINT "VALUE IS NOT PRESENT IN THE ARRAY"
        [END OF IF]
Step 6: EXIT

Figure 14.3  Algorithm for interpolation search

Figure 14.4 helps us to visualize how the search space is divided in case of binary search and

interpolation search.

(a) Binary search probes the middle of the remaining list: Middle = (low + high)/2
(b) Interpolation search probes an estimated position in the remaining list:
    Middle = low + (high - low) × ((key - a[low]) / (a[high] - a[low]))

Figure 14.4  Difference between binary search and interpolation search

Complexity of Interpolation Search Algorithm
When the n elements of the list to be searched are uniformly distributed (average case), interpolation
search makes about log(log n) comparisons. However, in the worst case, that is, when the element
values increase exponentially, the algorithm can make up to O(n) comparisons.

Example 14.1  Given a list of numbers a[] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21}. Search

for value 19 using interpolation search technique.

Solution

Low = 0, High = 10, VAL = 19, a[Low] = 1, a[High] = 21
Middle = Low + (High – Low)×((VAL – a[Low]) /(a[High] – a[Low] ))

= 0 +(10 – 0) × ((19 – 1) / (21 – 1) )
= 0 + 10 × 0.9 = 9
a[middle] = a[9] = 19, which is equal to the value to be searched.

Programming Example

3. Write a program to search an element in an array using interpolation search.

#include <stdio.h>
#include <conio.h>
#define MAX 20
int interpolation_search(int a[], int low, int high, int val)
{
    int mid;
    while(low <= high)
    {
        /* multiply before dividing so that the integer division
           does not truncate the fraction to zero */
        mid = low + ((high - low) * (val - a[low])) / (a[high] - a[low]);
        if(val == a[mid])
            return mid;
        if(val < a[mid])
            high = mid - 1;
        else
            low = mid + 1;
    }
    return -1;
}
int main()
{
int arr[MAX], i, n, val, pos;
clrscr();
printf("\n Enter the number of elements in the array : ");
scanf("%d", &n);
printf("\n Enter the elements : ");
for(i = 0; i <n; i++)
scanf("%d", &arr[i]);
printf("\n Enter the value to be searched : ");
scanf("%d", &val);
pos = interpolation_search(arr, 0, n-1, val);
if(pos == -1)
printf("\n %d is not found in the array", val);
else
printf("\n %d is found at position %d", val, pos);
getche();
return 0;

}

14.5  JUMP SEARCH

When we have an already sorted list, then the other efficient algorithm to search for a value is jump
search or block search. In jump search, it is not necessary to scan all the elements in the list to
find the desired value. We just check an element and if it is less than the desired value, then some
of the elements following it are skipped by jumping ahead. After moving a little forward again,
the element is checked. If the checked element is greater than the desired value, then we have
a boundary and we are sure that the desired value lies between the previously checked element
and the currently checked element. However, if the checked element is less than the value being
searched for, then we again make a small jump and repeat the process.
Once the boundary of the value is determined, a linear search is done to find the value and its
position in the array. For example, consider an array a[] = {1,2,3,4,5,6,7,8,9}. The length of the
array is 9. If we have to find value 8 then following steps are performed using the jump search
technique.

Step 1: First three elements are checked. Since 3 is smaller than 8, we will have to make a jump ahead
        [1 2 3] 4 5 6 7 8 9
Step 2: Next three elements are checked. Since 6 is smaller than 8, we will have to make a jump ahead
        1 2 3 [4 5 6] 7 8 9
Step 3: Next three elements are checked. Since 9 is greater than 8, the desired value lies within the current boundary
        1 2 3 4 5 6 [7 8 9]

Step 4: A linear search is now done to find the value in the array.

The algorithm for jump search is given in Fig. 14.5.


JUMP_SEARCH (A, lower_bound, upper_bound, VAL, N)

Step 1: [INITIALIZE] SET STEP = sqrt(N), I = 0, LOW = lower_bound, HIGH = upper_bound, POS = –1

Step 2: Repeat Step 3 while I < STEP

Step 3: IF VAL < A[STEP]

SET HIGH = STEP – 1

ELSE

SET LOW = STEP + 1

[END OF IF]

SET I = I + 1

[END OF LOOP]

Step 4: SET I = LOW

Step 5: Repeat Step 6 while I <= HIGH

Step 6: IF A[I] = Val

POS = I

PRINT POS

Go to Step 8

[END OF IF]

SET I = I + 1

[END OF LOOP]

Step 7: IF POS = –1

PRINT "VALUE IS NOT PRESENT IN THE ARRAY"

[END OF IF]

Step 8: EXIT

Figure 14.5  Algorithm for jump search

Advantage of Jump Search over Linear Search

Suppose we have a sorted list of 1000 elements where the elements have values 0, 1, 2, 3, 4,
…, 999, then sequential search will find the value 674 in exactly 674 iterations. But with jump
search, the same value can be found in 44 iterations. Hence, jump search performs far better than
a linear search on a sorted list of elements.

Advantage of Jump Search over Binary Search

No doubt, binary search is very easy to implement and has a complexity of O(log n), but in case of
a list having very large number of elements, jumping to the middle of the list to make comparisons
is not a good idea because if the value being searched is at the beginning of the list then one (or
even more) large step(s) in the backward direction would have to be taken. In such cases, jump
search performs better as we have to move only a little backward, and that too only once. Hence, when
jumping back is slower than jumping forward, the jump search algorithm always performs better.

How to Choose the Step Length?

For the jump search algorithm to work efficiently, we must define a fixed size for the step. If the
step size is 1, then the algorithm is the same as linear search. Now, in order to find an appropriate step
size, we must first try to figure out the relation between the size of the list (n) and the size of the
step (k). Usually, k is calculated as √n.

Further Optimization of Jump Search

Till now, we were dealing with lists having small number of elements. But in real-world
applications, lists can be very large. In such large lists searching the value from the beginning
of the list may not be a good idea. A better option is to start the search from the k–th element as
shown in the figure below.


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 ….

Searching can start from somewhere middle in the list rather than from the beginning to optimize
performance.

We can also improve the performance of jump search algorithm by repeatedly applying jump
search. For example, if the size of the list is 10,00,000 (n). The jump interval would then be √n =
√1000000 =1000. Now, even the identified interval has 1000 elements and is again a large list. So,
jump search can be applied again with a new step size of √1000 ≈ 31. Thus, every time we have
a desired interval with a large number of values, the jump search algorithm can be applied again
but with a smaller step. However, in this case, the complexity of the algorithm will no longer be
O(√n) but will approach a logarithmic value.

Complexity of Jump Search Algorithm

Jump search works by jumping through the array with a step size (optimally chosen to be √n) to
find the interval of the value. Once this interval is identified, the value is searched using the linear
search technique. Therefore, the complexity of the jump search algorithm can be given as O(√n).

Programming Example

4. Write a program to search an element in an array using jump search.

#include <stdio.h>

#include <math.h>

#include <conio.h>

#define MAX 20

int jump_search(int a[], int low, int high, int val, int n)
{
    int step, i;
    step = sqrt(n);
    for(i=0;i<step;i++)
    {
        if(val < a[step])
            high = step - 1;
        else
            low = step + 1;
    }
    for(i=low;i<=high;i++)
    {
        if(a[i]==val)
            return i;
    }
    return -1;
}

int main()

{

int arr[MAX], i, n, val, pos;

clrscr();

printf("\n Enter the number of elements in the array : ");

scanf("%d", &n);

printf("\n Enter the elements : ");

for(i = 0; i <n; i++)

scanf("%d", &arr[i]);

printf("\n Enter the value to be searched : ");

scanf("%d", &val);

pos = jump_search(arr, 0, n-1, val, n);

if(pos == -1)

printf("\n %d is not found in the array", val);

else

printf("\n %d is found at position %d", val, pos);

getche();

return 0;

}

Fibonacci Search

We are all well aware of the Fibonacci series in which the first two terms are 0 and 1 and then
each successive term is the sum of previous two terms. In the Fibonacci series given below,
each number is called a Fibonacci number.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
The same series and concept can be used to search for a given value in a list of numbers. Such
a search algorithm which is based on Fibonacci numbers is called Fibonacci search and was
developed by Kiefer in 1953. The search follows a divide-and-conquer technique and narrows
down possible locations with the help of Fibonacci numbers.
Fibonacci search is similar to binary search. It also works on a sorted list and has a run time
complexity of O(log n). However, unlike the binary search algorithm, Fibonacci search does
not divide the list into two equal halves rather it subtracts a Fibonacci number from the index to
reduce the size of the list to be searched. So, the key advantage of Fibonacci search over binary
search is that comparison dispersion is low.
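
The note above describes Fibonacci search only in words. The routine below is one common way of coding the idea in C and is given purely as a sketch; the function name, variable names, and loop structure are our own choices, not taken from the text.

#include <stdio.h>

/* Sketch of Fibonacci search on a sorted array a[0..n-1].
   Instead of halving the list, the search space is narrowed using
   three consecutive Fibonacci numbers fibM2 < fibM1 < fibM. */
int fibonacci_search(int a[], int n, int val)
{
    int fibM2 = 0, fibM1 = 1, fibM = fibM2 + fibM1;
    int offset = -1, i;
    while(fibM < n)              /* smallest Fibonacci number >= n */
    {
        fibM2 = fibM1;
        fibM1 = fibM;
        fibM  = fibM2 + fibM1;
    }
    while(fibM > 1)
    {
        i = (offset + fibM2 < n - 1) ? offset + fibM2 : n - 1;
        if(a[i] < val)           /* discard the front part of the list */
        {
            fibM  = fibM1;
            fibM1 = fibM2;
            fibM2 = fibM - fibM1;
            offset = i;
        }
        else if(a[i] > val)      /* discard the back part of the list */
        {
            fibM  = fibM2;
            fibM1 = fibM1 - fibM2;
            fibM2 = fibM - fibM1;
        }
        else
            return i;            /* value found at index i */
    }
    if(fibM1 && offset + 1 < n && a[offset + 1] == val)
        return offset + 1;
    return -1;                   /* value not present */
}

int main()
{
    int a[] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21};
    printf("%d\n", fibonacci_search(a, 11, 21));   /* prints 10 */
    return 0;
}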

14.6  INTRODUCTION TO SORTING

Sorting means arranging the elements of an array so that they are placed in some relevant order
which may be either ascending or descending. That is, if A is an array, then the elements of A are
arranged in a sorted order (ascending order) in such a way that A[0] < A[1] < A[2] < ... < A[N–1].
For example, if we have an array that is declared and initialized as

int A[] = {21, 34, 11, 9, 1, 0, 22};

Then the sorted array (ascending order) can be given as:

A[] = {0, 1, 9, 11, 21, 22, 34};

A sorting algorithm is defined as an algorithm that puts the elements of a list in a certain order,
which can be either numerical order, lexicographical order, or any user-defined order. Efficient
sorting algorithms are widely used to optimize the use of other algorithms like search and merge
algorithms which require sorted lists to work correctly. There are two types of sorting:
∑ Internal sorting which deals with sorting the data stored in the computer’s memory
∑ External sorting which deals with sorting the data stored in files. External sorting is applied
when there is voluminous data that cannot be stored in the memory.

14.6.1  Sorting on Multiple Keys

Many a times, when performing real-world applications, it is desired to sort arrays of records
using multiple keys. This situation usually occurs when a single key is not sufficient to uniquely
identify a record. For example, in a big organization we may want to sort a list of employees on
the basis of their departments first and then according to their names in alphabetical order.
Other examples of sorting on multiple keys can be
∑ Telephone directories in which names are sorted by location, category (business or residential),

and then in an alphabetical order.


∑ In a library, the information about books can be sorted alphabetically based on titles and then
by authors’ names.

∑ Customers’ addresses can be sorted based on the name of the city and then the street.

Note Data records can be sorted based on a property. Such a component or property is called a sort key. A

sort key can be defined using two or more sort keys. In such a case, the first key is called the primary sort key,
the second is known as the secondary sort key, etc.

Consider the data records given below:

Name     Department           Salary    Phone Number
Janak    Telecommunications   1000000   9812345678
Raj      Computer Science     890000    9910023456
Aditya   Electronics          900000    7838987654
Huma     Telecommunications   1100000   9654123456
Divya    Computer Science     750000    9350123455

Now if we take department as the primary key and name as the secondary key, then the sorted
order of records can be given as:

Name     Department           Salary    Phone Number
Divya    Computer Science     750000    9350123455
Raj      Computer Science     890000    9910023456
Aditya   Electronics          900000    7838987654
Huma     Telecommunications   1100000   9654123456
Janak    Telecommunications   1000000   9812345678

Observe that the records are sorted based on department. However, within each department the
records are sorted alphabetically based on the names of the employees.
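
To see how such a multi-key sort can be coded, the sketch below sorts the employee records of the table above on department (primary key) and then on name (secondary key) using the standard library function qsort(). The struct layout, the omission of the salary and phone fields, and the comparator name are illustrative choices and are not part of the text.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* A record with two sort keys: department (primary) and name (secondary). */
struct employee {
    char name[20];
    char dept[30];
};

/* Comparator for qsort(): compare departments first; only when the
   departments are equal, fall back to comparing the names. */
int cmp_dept_then_name(const void *p, const void *q)
{
    const struct employee *a = (const struct employee *)p;
    const struct employee *b = (const struct employee *)q;
    int d = strcmp(a->dept, b->dept);
    if(d != 0)
        return d;                        /* primary key decides */
    return strcmp(a->name, b->name);     /* secondary key breaks ties */
}

int main()
{
    struct employee e[] = {
        {"Janak",  "Telecommunications"},
        {"Raj",    "Computer Science"},
        {"Aditya", "Electronics"},
        {"Huma",   "Telecommunications"},
        {"Divya",  "Computer Science"}
    };
    int i, n = sizeof(e)/sizeof(e[0]);
    qsort(e, n, sizeof(e[0]), cmp_dept_then_name);
    for(i = 0; i < n; i++)
        printf("%-10s %s\n", e[i].name, e[i].dept);
    return 0;
}

Running this sketch reproduces the order of the second table: Divya, Raj, Aditya, Huma, Janak.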

14.6.2  Practical Considerations for Internal Sorting

As mentioned above, records can be sorted either in ascending or descending order based on a field
often called the sort key. The list of records can be either stored in a contiguous and randomly
accessible data structure (array) or may be stored in a dispersed and only sequentially accessible
data structure like a linked list. But irrespective of the underlying data structure used to store the
records, the logic to sort the records will be same and only the implementation details will differ.
When analysing the performance of different sorting algorithms, the practical considerations
would be the following:

∑ Number of sort key comparisons that will be performed
∑ Number of times the records in the list will be moved
∑ Best case performance
∑ Worst case performance
∑ Average case performance
∑ Stability of the sorting algorithm where stability means that equivalent elements or records

retain their relative positions even after sorting is done

14.7  BUBBLE SORT

Bubble sort is a very simple method that sorts the array elements by repeatedly moving the
largest element to the highest index position of the array segment (in case of arranging elements


in ascending order). In bubble sorting, consecutive adjacent pairs of elements in the array are
compared with each other. If the element at the lower index is greater than the element at the
higher index, the two elements are interchanged so that the smaller element is placed before the
bigger one. This process continues till the list of unsorted elements is exhausted.
This procedure of sorting is called bubble sorting because elements ‘bubble’ to the top of the
list. Note that at the end of the first pass, the largest element in the list will be placed at its proper
position (i.e., at the end of the list).

Note If the elements are to be sorted in descending order, then in first pass the smallest element is moved

to the highest index of the array.

Technique

The basic methodology of the working of bubble sort is given as follows:
(a) In Pass 1, A[0] and A[1] are compared, then A[1] is compared with A[2], A[2] is compared with

A[3], and so on. Finally, A[N–2] is compared with A[N–1]. Pass 1 involves n–1 comparisons
and places the biggest element at the highest index of the array.
(b) In Pass 2, A[0] and A[1] are compared, then A[1] is compared with A[2], A[2] is compared with
A[3], and so on. Finally, A[N–3] is compared with A[N–2]. Pass 2 involves n–2 comparisons
and places the second biggest element at the second highest index of the array.
(c) In Pass 3, A[0] and A[1] are compared, then A[1] is compared with A[2], A[2] is compared with
A[3], and so on. Finally, A[N–4] is compared with A[N–3]. Pass 3 involves n–3 comparisons
and places the third biggest element at the third highest index of the array.
(d) In Pass n–1, A[0] and A[1] are compared so that A[0]<A[1]. After this step, all the elements of
the array are arranged in ascending order.

Example 14.2  To discuss bubble sort in detail, let us consider an array A[] that has the following
elements:

A[] = {30, 52, 29, 87, 63, 27, 19, 54}

Pass 1:

(a) Compare 30 and 52. Since 30 < 52, no swapping is done.
(b) Compare 52 and 29. Since 52 > 29, swapping is done.
30, 29, 52, 87, 63, 27, 19, 54
(c) Compare 52 and 87. Since 52 < 87, no swapping is done.
(d) Compare 87 and 63. Since 87 > 63, swapping is done.
30, 29, 52, 63, 87, 27, 19, 54
(e) Compare 87 and 27. Since 87 > 27, swapping is done.
30, 29, 52, 63, 27, 87, 19, 54
(f) Compare 87 and 19. Since 87 > 19, swapping is done.
30, 29, 52, 63, 27, 19, 87, 54
(g) Compare 87 and 54. Since 87 > 54, swapping is done.
30, 29, 52, 63, 27, 19, 54, 87

Observe that after the end of the first pass, the largest element is placed at the highest index of
the array. All the other elements are still unsorted.
Pass 2:

(a) Compare 30 and 29. Since 30 > 29, swapping is done.
29, 30, 52, 63, 27, 19, 54, 87
(b) Compare 30 and 52. Since 30 < 52, no swapping is done.
(c) Compare 52 and 63. Since 52 < 63, no swapping is done.
(d) Compare 63 and 27. Since 63 > 27, swapping is done.
29, 30, 52, 27, 63, 19, 54, 87
(e) Compare 63 and 19. Since 63 > 19, swapping is done.


29, 30, 52, 27, 19, 63, 54, 87
(f) Compare 63 and 54. Since 63 > 54, swapping is done.
29, 30, 52, 27, 19, 54, 63, 87

Observe that after the end of the second pass, the second largest element is placed at the second
highest index of the array. All the other elements are still unsorted.
Pass 3:

(a) Compare 29 and 30. Since 29 < 30, no swapping is done.
(b) Compare 30 and 52. Since 30 < 52, no swapping is done.
(c) Compare 52 and 27. Since 52 > 27, swapping is done.
29, 30, 27, 52, 19, 54, 63, 87
(d) Compare 52 and 19. Since 52 > 19, swapping is done.
29, 30, 27, 19, 52, 54, 63, 87
(e) Compare 52 and 54. Since 52 < 54, no swapping is done.

Observe that after the end of the third pass, the third largest element is placed at the third highest
index of the array. All the other elements are still unsorted.
Pass 4:

(a) Compare 29 and 30. Since 29 < 30, no swapping is done.
(b) Compare 30 and 27. Since 30 > 27, swapping is done.
29, 27, 30, 19, 52, 54, 63, 87
(c) Compare 30 and 19. Since 30 > 19, swapping is done.
29, 27, 19, 30, 52, 54, 63, 87
(d) Compare 30 and 52. Since 30 < 52, no swapping is done.

Observe that after the end of the fourth pass, the fourth largest element is placed at the fourth
highest index of the array. All the other elements are still unsorted.
Pass 5:

(a) Compare 29 and 27. Since 29 > 27, swapping is done.
27, 29, 19, 30, 52, 54, 63, 87
(b) Compare 29 and 19. Since 29 > 19, swapping is done.
27, 19, 29, 30, 52, 54, 63, 87
(c) Compare 29 and 30. Since 29 < 30, no swapping is done.

Observe that after the end of the fifth pass, the fifth largest element is placed at the fifth highest
index of the array. All the other elements are still unsorted.
Pass 6:

(a) Compare 27 and 19. Since 27 > 19, swapping is done.
19, 27, 29, 30, 52, 54, 63, 87
(b) Compare 27 and 29. Since 27 < 29, no swapping is done.

Observe that after the end of the sixth pass, the sixth largest element is placed at the sixth largest
index of the array. All the other elements are still unsorted.
Pass 7:

(a) Compare 19 and 27. Since 19 < 27, no swapping is done.

Observe that the entire list is sorted now.

  Figure 14.6 shows the algorithm for bubble sort. In this algorithm, the outer loop is for the total
number of passes which is N–1. The inner loop will be executed for every pass. However, the frequency
of the inner loop will decrease with every pass because after every pass, one element will be in
its correct position. Therefore, for every pass, the inner loop will be executed N–I times, where N is
the number of elements in the array and I is the count of the pass.

BUBBLE_SORT(A, N)
Step 1: Repeat Step 2 For I = 0 to N-1
Step 2:     Repeat For J = 0 to N - I
Step 3:         IF A[J] > A[J + 1]
                    SWAP A[J] and A[J+1]
            [END OF INNER LOOP]
        [END OF OUTER LOOP]
Step 4: EXIT

Figure 14.6  Algorithm for bubble sort
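
A direct C sketch of this technique is given below; it simply codes the two nested loops described above. The function and variable names are our own, and the chapter’s own programming example may organize the code differently.

#include <stdio.h>

/* Bubble sort of the first n elements of arr[]: the outer loop counts
   the passes, the inner loop compares adjacent pairs and swaps them
   when they are out of order. */
void bubble_sort(int arr[], int n)
{
    int i, j, temp;
    for(i = 0; i < n - 1; i++)              /* N-1 passes */
    {
        for(j = 0; j < n - i - 1; j++)      /* fewer comparisons each pass */
        {
            if(arr[j] > arr[j + 1])
            {
                temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
}

int main()
{
    int a[] = {30, 52, 29, 87, 63, 27, 19, 54};   /* array of Example 14.2 */
    int i, n = sizeof(a)/sizeof(a[0]);
    bubble_sort(a, n);
    for(i = 0; i < n; i++)
        printf("%d ", a[i]);    /* prints 19 27 29 30 52 54 63 87 */
    return 0;
}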

