214 Part Two Information Technology Infrastructure
Let’s look at how a relational database organizes data about suppliers and
parts (see Figure 6-4). The database has a separate table for the entity
SUPPLIER and a table for the entity PART. Each table consists of a grid of
columns and rows of data. Each individual element of data for each entity is
stored as a separate field, and each field represents an attribute for that entity.
Fields in a relational database are also called columns. For the entity
SUPPLIER, the supplier identification number, name, street, city, state, and ZIP
code are stored as separate fields within the SUPPLIER table and each field rep-
resents an attribute for the entity SUPPLIER.
The actual information about a single supplier that resides in a table is called
a row. Rows are commonly referred to as records, or in very technical terms, as
tuples. Data for the entity PART have their own separate table.
The field for Supplier_Number in the SUPPLIER table uniquely identifies
each record so that the record can be retrieved, updated, or sorted and it is
called a key field. Each table in a relational database has one field that is
designated as its primary key. This key field is the unique identifier for all the
information in any row of the table and this primary key cannot be duplicated.
FIGURE 6-4 RELATIONAL DATABASE TABLES
A relational database organizes data in the form of two-dimensional tables. Illustrated here are tables for the entities SUPPLIER and PART
showing how they represent each entity and its attributes. Supplier_Number is a primary key for the SUPPLIER table and a foreign key for
the PART table.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 215
Supplier_Number is the primary key for the SUPPLIER table and Part_Number
is the primary key for the PART table. Note that Supplier_Number appears in
both the SUPPLIER and PART tables. In the SUPPLIER table, Supplier_Number
is the primary key. When the field Supplier_Number appears in the PART table
it is called a foreign key and is essentially a lookup field to look up data about
the supplier of a specific part.
Operations of a Relational DBMS
Relational database tables can be combined easily to deliver data required by
users, provided that any two tables share a common data element. Suppose we
wanted to find in this database the names of suppliers who could provide us
with part number 137 or part number 150. We would need information from
two tables: the SUPPLIER table and the PART table. Note that these two files
have a shared data element: Supplier_Number.
In a relational database, three basic operations, as shown in Figure 6-5,
are used to develop useful sets of data: select, join, and project. The select
operation creates a subset consisting of all records in the file that meet stated
criteria. Select creates, in other words, a subset of rows that meet certain
criteria. In our example, we want to select records (rows) from the PART table
where the Part_Number equals 137 or 150. The join operation combines
relational tables to provide the user with more information than is available in
individual tables. In our example, we want to join the now-shortened PART
table (only parts 137 or 150 will be presented) and the SUPPLIER table into a
single new table.
The project operation creates a subset consisting of columns in a table,
permitting the user to create new tables that contain only the information
required. In our example, we want to extract from the new table only the following
columns: Part_Number, Part_Name, Supplier_Number, and Supplier_Name.
Object-Oriented DBMS
Many applications today and in the future require databases that can store and
retrieve not only structured numbers and characters but also drawings, images,
photographs, voice, and full-motion video. DBMS designed for organizing
structured data into rows and columns are not well suited to handling graphics-
based or multimedia applications. Object-oriented databases are better suited
for this purpose.
An object-oriented DBMS stores the data and procedures that act on those
data as objects that can be automatically retrieved and shared. Object-oriented
database management systems (OODBMS) are becoming popular because they
can be used to manage the various multimedia components or Java applets used
in Web applications, which typically integrate pieces of information from a variety
of sources.
Although object-oriented databases can store more complex types of informa-
tion than relational DBMS, they are relatively slow compared with relational
DBMS for processing large numbers of transactions. Hybrid object-relational
DBMS systems are now available to provide capabilities of both object-oriented
and relational DBMS.
Databases in the Cloud
Suppose your company wants to use cloud computing services. Is there a way
to manage data in the cloud? The answer is a qualified “Yes.” Cloud computing
providers offer database management services, but these services typically
have less functionality than their on-premises counterparts. At the moment,
FIGURE 6-5 THE THREE BASIC OPERATIONS OF A RELATIONAL DBMS 216 Part Two Information Technology Infrastructure
The select, join, and project operations enable data from two different tables to be combined and only selected attributes to be displayed.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 217
the primary customer base for cloud-based data management consists of Web-
focused start-ups or small to medium-sized businesses looking for database
capabilities at a lower price than a standard relational DBMS.
Amazon Web Services has both a simple non-relational database called
SimpleDB and a Relational Database Service, which is based on an online
implementation of the MySQL open source DBMS. Amazon Relational Database
Service (Amazon RDS) offers the full range of capabilities of MySQL. Pricing is
based on usage. (Charges run from 11 cents per hour for a small database using
1.7 GB of server memory to $3.10 per hour for a large database using 68 GB of
server memory.) There are also charges for the volume of data stored, the num-
ber of input-output requests, the amount of data written to the database, and
the amount of data read from the database.
Amazon Web Services additionally offers Oracle customers the option to
license Oracle Database 11g, Oracle Enterprise Manager, and Oracle Fusion
Middleware to run on the Amazon EC2 (Elastic Cloud Compute) platform.
Microsoft SQL Azure Database is a cloud-based relational database service
based on Microsoft’s SQL Server DBMS. It provides a highly available, scalable
database service hosted by Microsoft in the cloud. SQL Azure Database helps
reduce costs by integrating with existing software tools and providing symme-
try with on-premises and cloud databases.
TicketDirect, which sells tickets to concerts, sporting events, theater perfor-
mances, and movies in Australia and New Zealand, adopted the SQL Azure
Database cloud platform in order to improve management of peak system loads
during major ticket sales. It migrated its data to the SQL Azure database. By
moving to a cloud solution, TicketDirect is able to scale its computing resources
in response to real-time demand while keeping costs low.
CAPABILITIES OF DATABASE MANAGEMENT SYSTEMS
A DBMS includes capabilities and tools for organizing, managing, and accessing
the data in the database. The most important are its data definition language,
data dictionary, and data manipulation language.
DBMS have a data definition capability to specify the structure of the content
of the database. It would be used to create database tables and to define the
characteristics of the fields in each table. This information about the database
would be documented in a data dictionary. A data dictionary is an automated or
manual file that stores definitions of data elements and their characteristics.
Microsoft Access has a rudimentary data dictionary capability that displays
information about the name, description, size, type, format, and other proper-
ties of each field in a table (see Figure 6-6). Data dictionaries for large corporate
databases may capture additional information, such as usage, ownership (who
in the organization is responsible for maintaining the data), authorization;
security, and the individuals, business functions, programs, and reports that
use each data element.
Querying and Reporting
DBMS includes tools for accessing and manipulating information in databases.
Most DBMS have a specialized language called a data manipulation language
that is used to add, change, delete, and retrieve the data in the database.
This language contains commands that permit end users and programming
specialists to extract data from the database to satisfy information requests and
develop applications. The most prominent data manipulation language today is
Structured Query Language, or SQL. Figure 6-7 illustrates the SQL query that
218 Part Two Information Technology Infrastructure
FIGURE 6-6 MICROSOFT ACCESS DATA DICTIONARY FEATURES
Microsoft Access has a rudimentary data dictionary capability that displays information about the size,
format, and other characteristics of each field in a database. Displayed here is the information maintained
in the SUPPLIER table. The small key icon to the left of Supplier_Number indicates that it is a key field.
would produce the new resultant table in Figure 6-5. You can find out more
about how to perform SQL queries in our Learning Tracks for this chapter.
Users of DBMS for large and midrange computers, such as DB2, Oracle, or
SQL Server, would employ SQL to retrieve information they needed from the
database. Microsoft Access also uses SQL, but it provides its own set of
user-friendly tools for querying databases and for organizing data from
databases into more polished reports.
In Microsoft Access, you will find features that enable users to create
queries by identifying the tables and fields they want and the results, and
then selecting the rows from the database that meet particular criteria. These
actions in turn are translated into SQL commands. Figure 6-8 illustrates how
FIGURE 6-7 EXAMPLE OF AN SQL QUERY
Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150.
They produce a list with the same results as Figure 6-5.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 219
FIGURE 6-8 AN ACCESS QUERY
Illustrated here is how the query in Figure 6-7 would be constructed using Microsoft Access query-
building tools. It shows the tables, fields, and selection criteria used for the query.
the same query as the SQL query to select parts and suppliers would be
constructed using the Microsoft query-building tools.
Microsoft Access and other DBMS include capabilities for report generation
so that the data of interest can be displayed in a more structured and polished
format than would be possible just by querying. Crystal Reports is a popular
report generator for large corporate DBMS, although it can also be used with
Access. Access also has capabilities for developing desktop system applications.
These include tools for creating data entry screens, reports, and developing the
logic for processing transactions.
DESIGNING DATABASES
To create a database, you must understand the relationships among the data,
the type of data that will be maintained in the database, how the data will be
used, and how the organization will need to change to manage data from a
company-wide perspective. The database requires both a conceptual design and
a physical design. The conceptual, or logical, design of a database is an abstract
model of the database from a business perspective, whereas the physical design
shows how the database is actually arranged on direct-access storage devices.
Normalization and Entity-Relationship Diagrams
The conceptual database design describes how the data elements in the
database are to be grouped. The design process identifies relationships among
data elements and the most efficient way of grouping data elements together to
meet business information requirements. The process also identifies redundant
data elements and the groupings of data elements required for specific
220 Part Two Information Technology Infrastructure
FIGURE 6-9 AN UNNORMALIZED RELATION FOR ORDER
An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each order. There is
only a one-to-one correspondence between Order_Number and Order_Date.
application programs. Groups of data are organized, refined, and streamlined
until an overall logical view of the relationships among all the data in the
database emerges.
To use a relational database model effectively, complex groupings of data
must be streamlined to minimize redundant data elements and awkward many-
to-many relationships. The process of creating small, stable, yet flexible and
adaptive data structures from complex groups of data is called normalization.
Figures 6-9 and 6-10 illustrate this process.
In the particular business modeled here, an order can have more than one
part but each part is provided by only one supplier. If we build a relation called
ORDER with all the fields included here, we would have to repeat the name and
address of the supplier for every part on the order, even though the order is for
parts from a single supplier. This relationship contains what are called repeating
data groups because there can be many parts on a single order to a given
supplier. A more efficient way to arrange the data is to break down ORDER into
smaller relations, each of which describes a single entity. If we go step by step
and normalize the relation ORDER, we emerge with the relations illustrated in
Figure 6-10. You can find out more about normalization, entity-relationship
diagramming, and database design in the Learning Tracks for this chapter.
Relational database systems try to enforce referential integrity rules to
ensure that relationships between coupled tables remain consistent. When one
table has a foreign key that points to another table, you may not add a record to
the table with the foreign key unless there is a corresponding record in the linked
table. In the database we examined earlier in this chapter, the foreign key
FIGURE 6-10 NORMALIZED TABLES CREATED FROM ORDER
After normalization, the original relation ORDER has been broken down into four smaller relations. The relation ORDER is left with only two
attributes and the relation LINE_ITEM has a combined, or concatenated, key consisting of Order_Number and Part_Number.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 221
Supplier_Number links the PART table to the SUPPLIER table. We may not add a
new record to the PART table for a part with Supplier_Number 8266 unless there
is a corresponding record in the SUPPLIER table for Supplier_Number 8266.
We must also delete the corresponding record in the PART table if we delete the
record in the SUPPLIER table for Supplier_Number 8266. In other words, we
shouldn’t have parts from nonexistent suppliers!
Database designers document their data model with an entity-relationship
diagram, illustrated in Figure 6-11. This diagram illustrates the relationship
between the entities SUPPLIER, PART, LINE_ITEM, and ORDER. The boxes
represent entities. The lines connecting the boxes represent relationships. A line
connecting two entities that ends in two short marks designates a one-to-one
relationship. A line connecting two entities that ends with a crow’s foot topped by
a short mark indicates a one-to-many relationship. Figure 6-11 shows that one
ORDER can contain many LINE_ITEMs. (A PART can be ordered many times and
appear many times as a line item in a single order.) Each PART can have only one
SUPPLIER, but many PARTs can be provided by the same SUPPLIER.
It can’t be emphasized enough: If the business doesn’t get its data model
right, the system won’t be able to serve the business well. The company’s
systems will not be as effective as they could be because they’ll have to
work with data that may be inaccurate, incomplete, or difficult to retrieve.
Understanding the organization’s data and how they should be represented in a
database is perhaps the most important lesson you can learn from this course.
For example, Famous Footwear, a shoe store chain with more than 800
locations in 49 states, could not achieve its goal of having “the right style of shoe
in the right store for sale at the right price” because its database was not
properly designed for rapidly adjusting store inventory. The company had an
Oracle relational database running on an IBM AS/400 midrange computer, but
the database was designed primarily for producing standard reports for
management rather than for reacting to marketplace changes. Management
could not obtain precise data on specific items in inventory in each of its stores.
The company had to work around this problem by building a new database
where the sales and inventory data could be better organized for analysis and
inventory management.
6.3 USING DATABASES TO IMPROVE BUSINESS
PERFORMANCE AND DECISION MAKING
Businesses use their databases to keep track of basic transactions, such as paying
suppliers, processing orders, keeping track of customers, and paying employees.
But they also need databases to provide information that will help the company
FIGURE 6-11 AN ENTITY-RELATIONSHIP DIAGRAM
This diagram shows the relationships between the entities SUPPLIER, PART, LINE_ITEM, and ORDER that might be used to model the
database in Figure 6-10.
222 Part Two Information Technology Infrastructure
run the business more efficiently, and help managers and employees make bet-
ter decisions. If a company wants to know which product is the most popular or
who is its most profitable customer, the answer lies in the data.
For example, by analyzing data from customer credit card purchases,
Louise’s Trattoria, a Los Angeles restaurant chain, learned that quality was
more important than price for most of its customers, who were college-
educated and liked fine wine. Acting on this information, the chain introduced
vegetarian dishes, more seafood selections, and more expensive wines, raising
sales by more than 10 percent.
In a large company, with large databases or large systems for separate
functions, such as manufacturing, sales, and accounting, special capabilities
and tools are required for analyzing vast quantities of data and for accessing
data from multiple systems. These capabilities include data warehousing, data
mining, and tools for accessing internal databases through the Web.
DATA WAREHOUSES
Suppose you want concise, reliable information about current operations,
trends, and changes across the entire company If you worked in a large
company, obtaining this might be difficult because data are often maintained in
separate systems, such as sales, manufacturing, or accounting. Some of the data
you need might be found in the sales system, and other pieces in the
manufacturing system. Many of these systems are older legacy systems that
use outdated data management technologies or file systems where information
is difficult for users to access.
You might have to spend an inordinate amount of time locating and
gathering the data you need, or you would be forced to make your decision
based on incomplete knowledge. If you want information about trends, you
might also have trouble finding data about past events because most firms only
make their current data immediately available. Data warehousing addresses
these problems.
What Is a Data Warehouse?
A data warehouse is a database that stores current and historical data of
potential interest to decision makers throughout the company. The data
originate in many core operational transaction systems, such as systems
for sales, customer accounts, and manufacturing, and may include data
from Web site transactions. The data warehouse consolidates and standardizes
information from different operational databases so that the information can be
used across the enterprise for management analysis and decision making.
Figure 6-12 illustrates how a data warehouse works. The data warehouse
makes the data available for anyone to access as needed, but it cannot be
altered. A data warehouse system also provides a range of ad hoc and
standardized query tools, analytical tools, and graphical reporting facilities.
Many firms use intranet portals to make the data warehouse information
widely available throughout the firm.
Catalina Marketing, a global marketing firm for major consumer packaged
goods companies and retailers, operates a gigantic data warehouse that includes
three years of purchase history for 195 million U.S. customer loyalty program
members at supermarkets, pharmacies, and other retailers. It is the largest
loyalty database in the world. Catalina’s retail store customers analyze this
database of customer purchase histories to determine individual customers’
buying preferences. When a shopper checks out at the cash register of one of
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 223
FIGURE 6-12 COMPONENTS OF A DATA WAREHOUSE
The data warehouse extracts current and historical data from multiple operational systems inside the
organization. These data are combined with data from external sources and reorganized into a central
database designed for management reporting and analysis. The information directory provides users
with information about the data available in the warehouse.
Catalina’s retail customers, the purchase is instantly analyzed along with that
customer’s buying history in the data warehouse to determine what coupons
that customer will receive at checkout along with a receipt.
The U.S. Internal Revenue Service (IRS) maintains a Compliance Data
Warehouse that consolidates taxpayer data that had been fragmented among
many different legacy systems, including personal information about taxpay-
ers and archived tax returns. These systems had been designed to process tax
return forms efficiently but their data were very difficult to query and
analyze. The Compliance Data Warehouse integrates taxpayer data from
many disparate sources into a relational structure, which makes querying
and analysis much easier. With a complete and comprehensive picture of
taxpayers, the warehouse helps IRS analysts and staff identify people who
are most likely to cheat on their income tax payments and respond rapidly to
taxpayer queries.
Data Marts
Companies often build enterprise-wide data warehouses, where a central data
warehouse serves the entire organization, or they create smaller, decentral-
ized warehouses called data marts. A data mart is a subset of a data ware-
house in which a summarized or highly focused portion of the organization’s
data is placed in a separate database for a specific population of users. For
example, a company might develop marketing and sales data marts to deal
with customer information. Before implementing an enterprise-wide data
warehouse, bookseller Barnes & Noble maintained a series of data marts—one
for point-of-sale data in retail stores, another for college bookstore sales, and a
third for online sales. A data mart typically focuses on a single subject area or
line of business, so it usually can be constructed more rapidly and at lower
cost than an enterprise-wide data warehouse.
224 Part Two Information Technology Infrastructure
TOOLS FOR BUSINESS INTELLIGENCE:
MULTIDIMENSIONAL DATA ANALYSIS AND DATA
MINING
Once data have been captured and organized in data warehouses and data marts,
they are available for further analysis using tools for business intelligence,
which we introduced briefly in Chapter 2. Business intelligence tools enable
users to analyze data to see new patterns, relationships, and insights that are
useful for guiding decision making.
Principal tools for business intelligence include software for database querying
and reporting, tools for multidimensional data analysis (online analytical
processing), and tools for data mining. This section will introduce you to these
tools, with more detail about business intelligence analytics and applications in
the Chapter 12 discussion of decision making.
Online Analytical Processing (OLAP)
Suppose your company sells four different products—nuts, bolts, washers, and
screws—in the East, West, and Central regions. If you wanted to ask a fairly
straightforward question, such as how many washers were sold during the past
quarter, you could easily find the answer by querying your sales database.
But what if you wanted to know how many washers sold in each of your sales
regions and compare actual results with projected sales?
To obtain the answer, you would need online analytical processing
(OLAP). OLAP supports multidimensional data analysis, enabling users to
view the same data in different ways using multiple dimensions. Each aspect
of information—product, pricing, cost, region, or time period—represents a
different dimension. So, a product manager could use a multidimensional data
analysis tool to learn how many washers were sold in the East in June, how that
compares with the previous month and the previous June, and how it compares
with the sales forecast. OLAP enables users to obtain online answers to ad hoc
questions such as these in a fairly rapid amount of time, even when the data are
stored in very large databases, such as sales figures for multiple years.
Figure 6-13 shows a multidimensional model that could be created to
represent products, regions, actual sales, and projected sales. A matrix of
actual sales can be stacked on top of a matrix of projected sales to form a cube
with six faces. If you rotate the cube 90 degrees one way, the face showing will
be product versus actual and projected sales. If you rotate the cube 90 degrees
again, you will see region versus actual and projected sales. If you rotate 180
degrees from the original view, you will see projected sales and product versus
region. Cubes can be nested within cubes to build complex views of data. A
company would use either a specialized multidimensional database or a tool
that creates multidimensional views of data in relational databases.
Data Mining
Traditional database queries answer such questions as, “How many units of
product number 403 were shipped in February 2010?” OLAP, or multidimen-
sional analysis, supports much more complex requests for information, such as
“Compare sales of product 403 relative to plan by quarter and sales region for
the past two years.” With OLAP and query-oriented data analysis, users need to
have a good idea about the information for which they are looking.
Data mining is more discovery-driven. Data mining provides insights into
corporate data that cannot be obtained with OLAP by finding hidden patterns and
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 225
FIGURE 6-13 MULTIDIMENSIONAL DATA MODEL
The view that is showing is product versus region. If you rotate the cube 90 degrees, the face will
show product versus actual and projected sales. If you rotate the cube 90 degrees again, you will see
region versus actual and projected sales. Other views are possible.
relationships in large databases and inferring rules from them to predict future
behavior. The patterns and rules are used to guide decision making and forecast
the effect of those decisions. The types of information obtainable from data
mining include associations, sequences, classifications, clusters, and forecasts.
• Associations are occurrences linked to a single event. For instance, a study of
supermarket purchasing patterns might reveal that, when corn chips are
purchased, a cola drink is purchased 65 percent of the time, but when there
is a promotion, cola is purchased 85 percent of the time. This information
helps managers make better decisions because they have learned the
profitability of a promotion.
• In sequences, events are linked over time. We might find, for example, that if
a house is purchased, a new refrigerator will be purchased within two weeks
65 percent of the time, and an oven will be bought within one month of the
home purchase 45 percent of the time.
• Classification recognizes patterns that describe the group to which an item
belongs by examining existing items that have been classified and by
inferring a set of rules. For example, businesses such as credit card or
telephone companies worry about the loss of steady customers. Classification
helps discover the characteristics of customers who are likely to leave and
can provide a model to help managers predict who those customers are so
that the managers can devise special campaigns to retain such customers.
• Clustering works in a manner similar to classification when no groups have yet
been defined. A data mining tool can discover different groupings within data,
such as finding affinity groups for bank cards or partitioning a database into
groups of customers based on demographics and types of personal investments.
• Although these applications involve predictions, forecasting uses predictions in
a different way. It uses a series of existing values to forecast what other values
will be. For example, forecasting might find patterns in data to help managers
estimate the future value of continuous variables, such as sales figures.
These systems perform high-level analyses of patterns or trends, but they
can also drill down to provide more detail when needed. There are data mining
226 Part Two Information Technology Infrastructure
applications for all the functional areas of business, and for government and
scientific work. One popular use for data mining is to provide detailed analyses
of patterns in customer data for one-to-one marketing campaigns or for
identifying profitable customers.
For example, Harrah’s Entertainment, the second-largest gambling company
in its industry, uses data mining to identify its most profitable customers and
generate more revenue from them. The company continually analyzes data
about its customers gathered when people play its slot machines or use
Harrah’s casinos and hotels. Harrah’s marketing department uses this informa-
tion to build a detailed gambling profile, based on a particular customer’s
ongoing value to the company. For instance, data mining lets Harrah’s know
the favorite gaming experience of a regular customer at one of its Midwest
riverboat casinos, along with that person’s preferences for room accomodations,
restaurants, and entertainment. This information guides management deci-
sions about how to cultivate the most profitable customers, encourage those
customers to spend more, and attract more customers with high revenue-gen-
erating potential. Business intelligence has improved Harrah’s profits so much
that it has become the centerpiece of the firm’s business strategy.
Predictive analytics use data mining techniques, historical data, and assump-
tions about future conditions to predict outcomes of events, such as the probabil-
ity a customer will respond to an offer or purchase a specific product. For exam-
ple, the U.S. division of The Body Shop International plc used predictive analytics
with its database of catalog, Web, and retail store customers to identify customers
who were more likely to make catalog purchases. That information helped the
company build a more precise and targeted mailing list for its catalogs, improving
the response rate for catalog mailings and catalog revenues.
Text Mining and Web Mining
Business intelligence tools deal primarily with data that have been structured in
databases and files. However, unstructured data, most in the form of text files,
is believed to account for over 80 percent of an organization’s useful informa-
tion. E-mail, memos, call center transcripts, survey responses, legal cases,
patent descriptions, and service reports are all valuable for finding patterns and
trends that will help employees make better business decisions. Text mining
tools are now available to help businesses analyze these data. These tools are
able to extract key elements from large unstructured data sets, discover
patterns and relationships, and summarize the information. Businesses might
turn to text mining to analyze transcripts of calls to customer service centers to
identify major service and repair issues.
Text mining is a relatively new technology, but what’s really new are the
myriad ways in which unstructured data are being generated by consumers and
the business uses for these data. The Interactive Session on Technology
explores some of these business applications of text mining.
The Web is another rich source of valuable information, some of which can
now be mined for patterns, trends, and insights into customer behavior.
The discovery and analysis of useful patterns and information from the World
Wide Web is called Web mining. Businesses might turn to Web mining to help
them understand customer behavior, evaluate the effectiveness of a particular
Web site, or quantify the success of a marketing campaign. For instance,
marketers use Google Trends and Google Insights for Search services, which
track the popularity of various words and phrases used in Google search
queries, to learn what people are interested in and what they are interested in
buying.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 227
INTERACTIVE SESSION: TECHNOLOGY
WHAT CAN BUSINESSES LEARN FROM TEXT MINING?
Text mining is the discovery of patterns and relation- tions pinpoint key events and relationships, such as
ships from large sets of unstructured data—the kind intent to buy, intent to leave, or customer “wish”
of data we generate in e-mails, phone conversations, events. It can reveal specific product and service
blog postings, online customer surveys, and tweets. issues, reactions to marketing and public relations
The mobile digital platform has amplified the efforts, and even buying signals.
explosion in digital information, with hundreds of
millions of people calling, texting, searching, Attensity’s software integrated with JetBlue’s other
“apping” (using applications), buying goods, and customer analysis tools, such as Satmetrix’s Net
writing billions of e-mails on the go. Promoter metrics, which classifies customers into
groups that are generating positive, negative, or no
Consumers today are more than just consumers: feedback about the company. Using Attensity’s text
they have more ways to collaborate, share informa- analytics in tandem with these tools, JetBlue devel-
tion, and influence the opinions of their friends and oped a customer bill of rights that addressed the
peers, and the data they create in doing so have major issues customers had with the company.
significant value to businesses. Unlike structured
data, which are generated from events such as Hotel chains like Gaylord Hotels and Choice
completing a purchase transaction, unstructured data Hotels are using text mining software to glean
have no distinct form. Nevertheless, managers insights from thousands of customer satisfaction
believe such data may offer unique insights into cus- surveys provided by their guests. Gaylord Hotels is
tomer behavior and attitudes that were much more using Clarabridge’s text analytics solution delivered
difficult to determine years ago. via the Internet as a hosted software service to
gather and analyze customer feedback from surveys,
For example, in 2007, JetBlue experienced e-mail, chat messaging, staffed call centers, and
unprecedented levels of customer discontent in the online forums associated with guests’ and meeting
wake of a February ice storm that resulted in planners’ experiences at the company’s convention
widespread flight cancellations and planes stranded resorts. The Clarabridge software sorts through the
on Kennedy Airport runways. The airline received hotel chain’s customer surveys and gathers positive
15,000 e-mails per day from customers during the and negative comments, organizing them into a
storm and immediately afterwards, up from its usual variety of categories to reveal less obvious insights.
daily volume of 400. The volume was so much larger For example, guests complained about many things
than usual that JetBlue had no simple way to read more frequently than noisy rooms, but complaints of
everything its customers were saying. noisy rooms were most frequently correlated with
surveys indicating an unwillingness to return to the
Fortunately, the company had recently contracted hotel for another stay.
with Attensity, a leading vendor of text analytics
software, and was able to use the software to analyze Analyzing customer surveys used to take weeks,
all of the e-mail it had received within two days. but now takes only days, thanks to the Clarabridge
According to JetBlue research analyst Bryan software. Location managers and corporate execu-
Jeppsen, Attensity Analyze for Voice of the Customer tives have also used findings from text mining to
(VoC) enabled JetBlue to rapidly extract customer influence decisions on building improvements.
sentiments, preferences, and requests it couldn’t find
any other way. This tool uses a proprietary technol- Wendy’s International adopted Clarabridge software
ogy to automatically identify facts, opinions, to analyze nearly 500,000 messages it collects each
requests, trends, and trouble spots from the unstruc- year from its Web-based feedback forum, call center
tured text of survey responses, service notes, e-mail notes, e-mail messages, receipt-based surveys, and
messages, Web forums, blog entries, news articles, social media. The chain’s customer satisfaction team
and other customer communications. The technol- had previously used spreadsheets and keyword
ogy is able to accurately and automatically identify searches to review customer comments, a very slow
the many different “voices” customers use to express manual approach. Wendy’s management was looking
their feedback (such as a negative voice, positive for a better tool to speed analysis, detect emerging
voice, or conditional voice), which helps organiza- issues, and pinpoint troubled areas of the business at
the store, regional, or corporate level.
228 Part Two Information Technology Infrastructure determine how consumers feel about their brand and
take steps to respond to negative sentiment.
The Clarabridge technology enables Wendy’s to
track customer experiences down to the store level Structured data analysis won’t be rendered obso-
within minutes. This timely information helps store, lete by text analytics, but companies that are able to
regional, and corporate managers spot and address use both methods to develop a clearer picture of
problems related to meal quality, cleanliness, and their customers’ attitudes will have an easier time
speed of service. establishing and building their brand and gleaning
insights that will enhance profitability.
Text analytics software caught on first with
government agencies and larger companies with Sources: Doug Henschen, “Wendy’s Taps Text Analytics to Mine
information systems departments that had the Customer Feedback,” Information Week, March 23, 2010; David
means to properly use the complicated software, but Stodder,” How Text Analytics Drive Customer Insight” Information
Clarabridge is now offering a version of its product Week, February 1, 2010; Nancy David Kho, “Customer Experience
geared towards small businesses. The technology has and Sentiment Analysis,” KMWorld, February 1, 2010; Siobhan
already caught on with law enforcement, search tool Gorman, “Details of Einstein Cyber-Shield Disclosed by White
interfaces, and “listening platforms” like Nielsen House,” The Wall Street Journal, March 2, 2010; www.attensity.com,
Online. Listening platforms are text mining tools that accessed June 16, 2010; and www.clarabridge.com, accessed
focus on brand management, allowing companies to June 17, 2010.
CASE STUDY QUESTIONS MIS IN ACTION
1. What challenges does the increase in unstruc- Visit a Web site such as QVC.com or TripAdvisor.com
tured data present for businesses? detailing products or services that have customer
reviews. Pick a product, hotel, or other service with
2. How does text-mining improve decision-making? at least 15 customer reviews and read those reviews,
both positive and negative. How could Web content
3. What kinds of companies are most likely to benefit mining help the offering company improve or better
from text mining software? Explain your answer. market this product or service? What pieces of
information should highlighted?
4. In what ways could text mining potentially lead to
the erosion of personal information privacy?
Explain.
Web mining looks for patterns in data through content mining, structure
mining, and usage mining. Web content mining is the process of extracting
knowledge from the content of Web pages, which may include text, image, audio,
and video data. Web structure mining extracts useful information from the links
embedded in Web documents. For example, links pointing to a document indicate
the popularity of the document, while links coming out of a document indicate
the richness or perhaps the variety of topics covered in the document. Web usage
mining examines user interaction data recorded by a Web server whenever
requests for a Web site’s resources are received. The usage data records the user’s
behavior when the user browses or makes transactions on the Web site and col-
lects the data in a server log. Analyzing such data can help companies determine
the value of particular customers, cross marketing strategies across products, and
the effectiveness of promotional campaigns.
DATABASES AND THE WEB
Have you ever tried to use the Web to place an order or view a product catalog?
If so, you were probably using a Web site linked to an internal corporate
database. Many companies now use the Web to make some of the information
in their internal databases available to customers and business partners.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 229
Suppose, for example, a customer with a Web browser wants to search an online
retailer’s database for pricing information. Figure 6-14 illustrates how that cus-
tomer might access the retailer’s internal database over the Web. The user
accesses the retailer’s Web site over the Internet using Web browser software on
his or her client PC. The user’s Web browser software requests data from the orga-
nization’s database, using HTML commands to communicate with the Web server.
Because many back-end databases cannot interpret commands written in
HTML, the Web server passes these requests for data to software that translates
HTML commands into SQL so that they can be processed by the DBMS working
with the database. In a client/server environment, the DBMS resides on a
dedicated computer called a database server. The DBMS receives the SQL
requests and provides the required data. The middleware transfers information
from the organization’s internal database back to the Web server for delivery in
the form of a Web page to the user.
Figure 6-14 shows that the middleware working between the Web server and
the DBMS is an application server running on its own dedicated computer (see
Chapter 5). The application server software handles all application operations,
including transaction processing and data access, between browser-based
computers and a company’s back-end business applications or databases. The
application server takes requests from the Web server, runs the business logic to
process transactions based on those requests, and provides connectivity to the
organization’s back-end systems or databases. Alternatively, the software for
handling these operations could be a custom program or a CGI script. A CGI
script is a compact program using the Common Gateway Interface (CGI) specifi-
cation for processing data on a Web server.
There are a number of advantages to using the Web to access an organiza-
tion’s internal databases. First, Web browser software is much easier to use than
proprietary query tools. Second, the Web interface requires few or no changes
to the internal database. It costs much less to add a Web interface in front of a
legacy system than to redesign and rebuild the system to improve user access.
Accessing corporate databases through the Web is creating new efficiencies,
opportunities, and business models. ThomasNet.com provides an up-to-date
online directory of more than 600,000 suppliers of industrial products, such as
chemicals, metals, plastics, rubber, and automotive equipment. Formerly called
Thomas Register, the company used to send out huge paper catalogs with this
information. Now it provides this information to users online via its Web site
and has become a smaller, leaner company.
Other companies have created entirely new businesses based on access to
large databases through the Web. One is the social networking site MySpace,
which helps users stay connected with each other or meet new people.
FIGURE 6-14 LINKING INTERNAL DATABASES TO THE WEB
Users access an organization’s internal database through the Web using their desktop PCs and Web
browser software.
230 Part Two Information Technology Infrastructure
MySpace features music, comedy, videos, and “profiles” with information
supplied by 122 million users about their age, hometown, dating preferences,
marital status, and interests. It maintains a massive database to house and
manage all of this content. Facebook uses a similar database.
6.4 MANAGING DATA RESOURCES
Setting up a database is only a start. In order to make sure that the data for your
business remain accurate, reliable, and readily available to those who need it,
your business will need special policies and procedures for data management.
ESTABLISHING AN INFORMATION POLICY
Every business, large and small, needs an information policy. Your firm’s data
are an important resource, and you don’t want people doing whatever they
want with them. You need to have rules on how the data are to be organized and
maintained, and who is allowed to view the data or change them.
An information policy specifies the organization’s rules for sharing,
disseminating, acquiring, standardizing, classifying, and inventorying informa-
tion. Information policy lays out specific procedures and accountabilities,
identifying which users and organizational units can share information, where
information can be distributed, and who is responsible for updating and
maintaining the information. For example, a typical information policy would
specify that only selected members of the payroll and human resources
department would have the right to change and view sensitive employee data,
such as an employee’s salary or social security number, and that these depart-
ments are responsible for making sure that such employee data are accurate.
If you are in a small business, the information policy would be established and
implemented by the owners or managers. In a large organization, managing and
planning for information as a corporate resource often requires a formal data
administration function. Data administration is responsible for the specific
policies and procedures through which data can be managed as an organizational
resource. These responsibilities include developing information policy, planning
for data, overseeing logical database design and data dictionary development, and
monitoring how information systems specialists and end-user groups use data.
You may hear the term data governance used to describe many of these
activities. Promoted by IBM, data governance deals with the policies and
processes for managing the availability, usability, integrity, and security of the
data employed in an enterprise, with special emphasis on promoting privacy,
security, data quality, and compliance with government regulations.
A large organization will also have a database design and management group
within the corporate information systems division that is responsible for
defining and organizing the structure and content of the database, and
maintaining the database. In close cooperation with users, the design group
establishes the physical database, the logical relations among elements, and the
access rules and security procedures. The functions it performs are called
database administration.
ENSURING DATA QUALITY
A well-designed database and information policy will go a long way toward
ensuring that the business has the information it needs. However, additional
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 231
steps must be taken to ensure that the data in organizational databases are
accurate and remain reliable.
What would happen if a customer’s telephone number or account balance
were incorrect? What would be the impact if the database had the wrong price
for the product you sold or your sales system and inventory system showed
different prices for the same product? Data that are inaccurate, untimely, or
inconsistent with other sources of information lead to incorrect decisions, prod-
uct recalls, and financial losses. Inaccurate data in criminal justice and national
security databases might even subject you to unnecessarily surveillance or
detention, as described in the chapter-ending case study.
According to Forrester Research, 20 percent of U.S. mail and commercial
package deliveries were returned because of incorrect names or addresses.
Gartner Inc. reported that more than 25 percent of the critical data in large
Fortune 1000 companies’ databases is inaccurate or incomplete, including bad
product codes and product descriptions, faulty inventory descriptions,
erroneous financial data, incorrect supplier information, and incorrect
employee data. (Gartner, 2007).
Think of all the times you’ve received several pieces of the same direct mail
advertising on the same day. This is very likely the result of having your name
maintained multiple times in a database. Your name may have been misspelled
or you used your middle initial on one occasion and not on another or the infor-
mation was initially entered onto a paper form and not scanned properly into
the system. Because of these inconsistencies, the database would treat you as
different people! We often receive redundant mail addressed to Laudon,
Lavdon, Lauden, or Landon.
If a database is properly designed and enterprise-wide data standards
established, duplicate or inconsistent data elements should be minimal. Most
data quality problems, however, such as misspelled names, transposed
numbers, or incorrect or missing codes, stem from errors during data input.
The incidence of such errors is rising as companies move their businesses to
the Web and allow customers and suppliers to enter data into their Web sites
that directly update internal systems.
Before a new database is in place, organizations need to identify and correct
their faulty data and establish better routines for editing data once their
database is in operation. Analysis of data quality often begins with a data
quality audit, which is a structured survey of the accuracy and level of
completeness of the data in an information system. Data quality audits can be
performed by surveying entire data files, surveying samples from data files, or
surveying end users for their perceptions of data quality.
Data cleansing, also known as data scrubbing, consists of activities for
detecting and correcting data in a database that are incorrect, incomplete,
improperly formatted, or redundant. Data cleansing not only corrects errors
but also enforces consistency among different sets of data that originated in
separate information systems. Specialized data-cleansing software is available
to automatically survey data files, correct errors in the data, and integrate the
data in a consistent company-wide format.
Data quality problems are not just business problems. They also pose serious
problems for individuals, affecting their financial condition and even their jobs.
The Interactive Session on Organizations describes some of these impacts, as it
details the data quality problems found in the companies that collect and report
consumer credit data. As you read this case, look for the management, organi-
zation, and technology factors behind this problem, and whether existing
solutions are adequate.
232 Part Two Information Technology Infrastructure
INTERACTIVE SESSION: ORGANIZATIONS
CREDIT BUREAU ERRORS—BIG PEOPLE PROBLEMS
You’ve found the car of your dreams. You have a The sheer volume of information being
good job and enough money for a down payment. transmitted from creditors to credit bureaus
All you need is an auto loan for $14,000. You have increases the likelihood of mistakes. Experian, for
a few credit card bills, which you diligently pay off example, updates 30 million credit reports each
each month. But when you apply for the loan day and roughly 2 billion credit reports each
you’re turned down. When you ask why, you’re month. It matches the identifying personal
told you have an overdue loan from a bank you’ve information in a credit application or credit
never heard of. You’ve just become one of the account with the identifying personal information
millions of people who have been victimized by in a consumer credit file. Identifying personal
inaccurate or outdated data in credit bureaus’ information includes items such as name (first
information systems. name, last name and middle initial), full current
address and ZIP code, full previous address and
Most data on U.S. consumers’ credit histories are ZIP code, and social security number. The new
collected and maintained by three national credit credit information goes into the consumer credit
reporting agencies: Experian, Equifax, and file that it best matches.
TransUnion. These organizations collect data from
various sources to create a detailed dossier of an The credit bureaus rarely receive information
individual’s borrowing and bill paying habits. This that matches in all the fields in credit files, so they
information helps lenders assess a person’s credit have to determine how much variation to allow
worthiness, the ability to pay back a loan, and can and still call it a match. Imperfect data lead to
affect the interest rate and other terms of a loan, imperfect matches. A consumer might provide
including whether a loan will be granted in the incomplete or inaccurate information on a credit
first place. It can even affect the chances of finding application. A creditor might submit incomplete or
or keeping a job: At least one-third of employers inaccurate information to the credit bureaus. If the
check credit reports when making hiring, firing, or wrong person matches better than anyone else, the
promotion decisions. data could unfortunately go into the wrong
account.
U.S. credit bureaus collect personal information
and financial data from a variety of sources, Perhaps the consumer didn’t write clearly on
including creditors, lenders, utilities, debt the account application. Name variations on
collection agencies, and the courts. These data are different credit accounts can also result in less-
aggregated and stored in massive databases than-perfect matches. Take the name Edward
maintained by the credit bureaus. The credit Jeffrey Johnson. One account may say Edward
bureaus then sell this information to other Johnson. Another may say Ed Johnson. Another
companies to use for credit assessment. might say Edward J. Johnson. Suppose the last two
digits of Edward’s social security number get
The credit bureaus claim they know which transposed—more chance for mismatches.
credit cards are in each consumer’s wallet, how
much is due on the mortgage, and whether the If the name or social security number on
electric bill is paid on time. But if the wrong another person’s account partially matches the
information gets into their systems, whether data in your file, the computer might attach that
through identity theft or errors transmitted by person’s data to your record. Your record might
creditors, watch out! Untangling the mess can be likewise be corrupted if workers in companies
almost impossible. supplying tax and bankruptcy data from court and
government records accidentally transpose a digit
The bureaus understand the importance of or misread a document.
providing accurate information to both lenders and
consumers. But they also recognize that their own The credit bureaus claim it is impossible for
systems are responsible for many credit-report them to monitor the accuracy of the 3.5 billion
errors. Some mistakes occur because of the pieces of credit account information they receive
procedures for matching loans to individual credit each month. They must continually contend with
reports. bogus claims from consumers who falsify lender
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 233
information or use shady credit-repair companies to include the background banks need to under-
that challenge all the negative information on a stand a complaint.
credit report regardless of its validity. To separate
the good from the bad, the credit bureaus use an Although this system fixes large numbers of
automated e-OSCAR (Electronic Online Solution for errors (data are updated or corrected for 72 percent
Complete and Accurate Reporting) system to of disputes), consumers have few options if the
forward consumer disputes to lenders for system fails. Consumers who file a second dispute
verification. without providing new information might have their
dispute dismissed as “frivolous.” If the consumer
If your credit report showed an error, the tries to contact the lender that made the error on
bureaus usually do not contact the lender directly their own, banks have no obligation to investigate
to correct the information. To save money, the the dispute—unless it’s sent by a credit bureau.
bureaus send consumer protests and evidence to a
data processing center run by a third-party contrac- Sources: Dennis McCafferty, “Bad Credit Could Cost You a Job,”
tor. These contractors rapidly summarize every Baseline, June 7, 2010; Kristen McNamara, “Bad Credit Derails Job
complaint with a short comment and 2-digit code Seekers,” The Wall Street Journal, March 16, 2010; Anne Kadet, Lucy
from a menu of 26 options. For example, the code Lazarony, “Your Name Can Mess Up Your Credit Report,
A3 designates “belongs to another individual with a Bankrate.com, accessed July 1, 2009; “Credit Report Fix a
similar name.” These summaries are often too brief Headache,” Atlanta Journal-Constitution, June 14, 2009; and “Why
Credit Bureaus Can’t Get It Right,” Smart Money, March 2009.
CASE STUDY QUESTIONS MIS IN ACTION
1. Assess the business impact of credit bureaus’ data Go to the Experian Web site (www.experian.com)
quality problems for the credit bureaus, for lenders, and explore the site, with special attention to its
for individuals. services for businesses and small businesses.
Then answer the following questions:
2. Are any ethical issues raised by credit bureaus’
data quality problems? Explain your answer. 1. List and describe five services for businesses and
explain how each uses consumer data. Describe
3. Analyze the management, organization, and tech- the kinds of businesses that would use these
nology factors responsible for credit bureaus’ data services.
quality problems.
2. Explain how each of these services is affected by
4. What can be done to solve these problems? inaccurate consumer data.
234 Part Two Information Technology Infrastructure
PRODUCT _ID 6.5 HANDS-ON MIS PROJECTS
The projects in this section give you hands-on experience in analyzing data
quality problems, establishing company-wide data standards, creating a
database for inventory management, and using the Web to search online
databases for overseas business resources.
Management Decision Problems
1. Emerson Process Management, a global supplier of measurement, analytical,
and monitoring instruments and services based in Austin, Texas, had a new
data warehouse designed for analyzing customer activity to improve service
and marketing that was full of inaccurate and redundant data. The data in the
warehouse came from numerous transaction processing systems in Europe,
Asia, and other locations around the world. The team that designed the
warehouse had assumed that sales groups in all these areas would enter
customer names and addresses the same way, regardless of their location.
In fact, cultural differences combined with complications from absorbing
companies that Emerson had acquired led to multiple ways of entering
quotes, billing, shipping, and other data. Assess the potential business impact
of these data quality problems. What decisions have to be made and steps
taken to reach a solution?
2. Your industrial supply company wants to create a data warehouse where
management can obtain a single corporate-wide view of critical sales
information to identify best-selling products in specific geographic areas, key
customers, and sales trends. Your sales and product information are stored in
several different systems: a divisional sales system running on a Unix server
and a corporate sales system running on an IBM mainframe. You would like to
create a single standard format that consolidates these data from both systems.
The following format has been proposed.
PRODUCT_ DESCRIPTION COST_PER_ UNIT UNITS_ SOLD SALES_REGION DIVISION CUSTOMER_ID
The following are sample files from the two systems that would supply the
data for the data warehouse:
CORPORATE SALES SYSTEM
PRODUCT_ID PRODUCT_ DESCRIPTION UNIT_COST UNITS_SOLD SALES_TERRITORY DIVISION
60231 Bearing, 4” 5.28 900,245 Northeast Parts
85773 SS assembly unit 12.45 992,111 Midwest Parts
MECHANICAL PARTS DIVISION SALES SYSTEM
PROD_NO PRODUCT_ DESCRIPTION COST_PER_UNIT UNITS_SOLD SALES_REGION CUSTOMER_ID
60231 4” Steel bearing 5.28 900,245 N.E. Anderson
85773 SS assembly unit 12.45 992,111 M.W. Kelly Industries
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 235
• What business problems are created by not having these data in a single
standard format?
• How easy would it be to create a database with a single standard format that
could store the data from both systems? Identify the problems that would have
to be addressed.
• Should the problems be solved by database specialists or general business
managers? Explain.
• Who should have the authority to finalize a single company-wide format for this
information in the data warehouse?
Achieving Operational Excellence: Building a
Relational Database for Inventory Management
Software skills: Database design, querying, and reporting
Business skills: Inventory management
Businesses today depend on databases to provide reliable information about
items in inventory, items that need restocking, and inventory costs. In this
exercise, you’ll use database software to design a database for managing inven-
tory for a small business.
Sylvester’s Bike Shop, located in San Francisco, California, sells road, moun-
tain, hybrid, leisure, and children’s bicycles. Currently, Sylvester’s purchases
bikes from three suppliers but plans to add new suppliers in the near
future. This rapidly growing business needs a database system to manage this
information.
Initially, the database should house information about suppliers and
products. The database will contain two tables: a supplier table and a product
table. The reorder level refers to the number of items in inventory that
triggers a decision to order more items to prevent a stockout. (In other words,
if the number of units of a particular item in inventory falls below the reorder
level, the item should be reordered.) The user should be able to perform
several queries and produce several managerial reports based on the data
contained in the two tables.
Using the information found in the tables in MyMISLab, build a simple rela-
tional database for Sylvester’s. Once you have built the database, perform the
following activities:
• Prepare a report that identifies the five most expensive bicycles. The report
should list the bicycles in descending order from most expensive to least
expensive, the quantity on hand for each, and the markup percentage for
each.
• Prepare a report that lists each supplier, its products, the quantities on hand,
and associated reorder levels. The report should be sorted alphabetically
by supplier. Within each supplier category, the products should be sorted
alphabetically.
• Prepare a report listing only the bicycles that are low in stock and need to be
reordered. The report should provide supplier information for the items
identified.
• Write a brief description of how the database could be enhanced to further
improve management of the business. What tables or fields should be added?
What additional reports would be useful?
236 Part Two Information Technology Infrastructure
Improving Decision Making: Searching Online
Databases for Overseas Business Resources
Software skills: Online databases
Business skills: Researching services for overseas operations
Internet users have access to many thousands of Web-enabled databases with
information on services and products in faraway locations. This project
develops skills in searching these online databases.
Your company is located in Greensboro, North Carolina, and manufactures
office furniture of various types. You have recently acquired several new
customers in Australia, and a study you commissioned indicates that, with a
presence there, you could greatly increase your sales. Moreover, your study
indicates that you could do even better if you actually manufactured many of
your products locally (in Australia). First, you need to set up an office in
Melbourne to establish a presence, and then you need to begin importing from
the United States. You then can plan to start producing locally.
You will soon be traveling to the area to make plans to actually set up an
office, and you want to meet with organizations that can help you with your
operation. You will need to engage people or organizations that offer many
services necessary for you to open your office, including lawyers, accountants,
import-export experts, telecommunications equipment and support, and even
trainers who can help you to prepare your future employees to work for you.
Start by searching for U.S. Department of Commerce advice on doing business
in Australia. Then try the following online databases to locate companies that
you would like to meet with during your coming trip: Australian Business
Register (abr.business.gov.au/), Australia Trade Now (australiatradenow.com/),
and the Nationwide Business Directory of Australia (www.nationwide.com.au).
If necessary, you could also try search engines such as Yahoo and Google. Then
perform the following activities:
• List the companies you would contact to interview on your trip to determine
whether they can help you with these and any other functions you think
vital to establishing your office.
• Rate the databases you used for accuracy of name, completeness, ease of use,
and general helpfulness.
• What does this exercise tell you about the design of databases?
LEARNING TRACK MODULES
The following Learning Tracks provide content relevant to topics covered in
this chapter:
1. Database Design, Normalization, and Entity-Relationship Diagramming
2. Introduction to SQL
3. Hierarchical and Network Data Models
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 237
Review Summary
1. What are the problems of managing data resources in a traditional file environment and how are
they solved by a database management system?
Traditional file management techniques make it difficult for organizations to keep track of all of
the pieces of data they use in a systematic way and to organize these data so that they can be easily
accessed. Different functional areas and groups were allowed to develop their own files indepen-
dently. Over time, this traditional file management environment creates problems such as data
redundancy and inconsistency, program-data dependence, inflexibility, poor security, and lack of
data sharing and availability. A database management system (DBMS) solves these problems with
software that permits centralization of data and data management so that businesses have a single
consistent source for all their data needs. Using a DBMS minimizes redundant and inconsistent
files.
2. What are the major capabilities of DBMS and why is a relational DBMS so powerful?
The principal capabilities of a DBMS includes a data definition capability, a data dictionary
capability, and a data manipulation language. The data definition capability specifies the structure and
content of the database. The data dictionary is an automated or manual file that stores information
about the data in the database, including names, definitions, formats, and descriptions of data ele-
ments. The data manipulation language, such as SQL, is a specialized language for accessing and
manipulating the data in the database.
The relational database is the primary method for organizing and maintaining data today in
information systems because it is so flexible and accessible. It organizes data in two-dimensional tables
called relations with rows and columns. Each table contains data about an entity and its attributes. Each
row represents a record and each column represents an attribute or field. Each table also contains a key
field to uniquely identify each record for retrieval or manipulation. Relational database tables can be
combined easily to deliver data required by users, provided that any two tables share a common data
element.
3. What are some important database design principles?
Designing a database requires both a logical design and a physical design. The logical design
models the database from a business perspective. The organization’s data model should reflect its key
business processes and decision-making requirements. The process of creating small, stable, flexible,
and adaptive data structures from complex groups of data when designing a relational database is
termed normalization. A well-designed relational database will not have many-to-many relation-
ships, and all attributes for a specific entity will only apply to that entity. It will try to enforce
referential integrity rules to ensure that relationships between coupled tables remain consistent. An
entity-relationship diagram graphically depicts the relationship between entities (tables) in a
relational database.
4. What are the principal tools and technologies for accessing information from databases to
improve business performance and decision making?
Powerful tools are available to analyze and access the information in databases. A data warehouse
consolidates current and historical data from many different operational systems in a central database
designed for reporting and analysis. Data warehouses support multidimensional data analysis, also
known as online analytical processing (OLAP). OLAP represents relationships among data as a
multidimensional structure, which can be visualized as cubes of data and cubes within cubes of data,
enabling more sophisticated data analysis. Data mining analyzes large pools of data, including the
contents of data warehouses, to find patterns and rules that can be used to predict future behavior and
guide decision making. Text mining tools help businesses analyze large unstructured data sets consist-
ing of text. Web mining tools focus on analysis of useful patterns and information from the World Wide
Web, examining the structure of Web sites and activities of Web site users as well as the contents of Web
pages. Conventional databases can be linked via middleware to the Web or a Web interface to facilitate
user access to an organization’s internal data.
238 Part Two Information Technology Infrastructure
5. Why are information policy, data administration, and data quality assurance essential for managing
the firm’s data resources?
Developing a database environment requires policies and procedures for managing organizational data as
well as a good data model and database technology. A formal information policy governs the maintenance,
distribution, and use of information in the organization. In large corporations, a formal data administration
function is responsible for information policy, as well as for data planning, data dictionary development, and
monitoring data usage in the firm.
Data that are inaccurate, incomplete, or inconsistent create serious operational and financial problems for
businesses because they may create inaccuracies in product pricing, customer accounts, and inventory data,
and lead to inaccurate decisions about the actions that should be taken by the firm. Firms must take special
steps to make sure they have a high level of data quality. These include using enterprise-wide data standards,
databases designed to minimize inconsistent and redundant data, data quality audits, and data cleansing
software.
Key Terms Field, 210
File, 210
Attribute, 210 Foreign key, 215
Data administration, 230 Information policy, 230
Data cleansing, 231 Key field, 214
Data definition, 217 Normalization, 219
Data dictionary, 217 Object-oriented DBMS, 215
Data governance, 230 Object-relational DBMS, 215
Data inconsistency, 211 Online analytical processing
Data manipulation language, 217
Data mart, 223 (OLAP), 224
Data mining, 224 Predictive analytics, 226
Data quality audit, 231 Primary key, 210
Data redundancy, 211 Program-data dependence, 211
Data warehouse, 222 Record, 214
Database, 210 Referential integrity, 220
Database (rigorous definition), 212 Relational DBMS, 213
Database administration, 230 Structured Query Language
Database management system (DBMS), 212
Database server, 229 (SQL), 217
Entity, 210 Text mining, 226
Entity-relationship diagram, 221 Tuple, 214
Web mining, 226
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 239
Review Questions 4. What are the principal tools and technologies for
accessing information from databases to
1. What are the problems of managing data resources improve business performance and decision
in a traditional file environment and how are they making?
solved by a database management system?
• Define a data warehouse, explaining how it
• List and describe each of the components in works and how it benefits organizations.
the data hierarchy.
• Define business intelligence and explain how
• Define and explain the significance of it is related to database technology.
entities, attributes, and key fields.
• Describe the capabilities of online analytical
• List and describe the problems of the processing (OLAP).
traditional file environment.
• Define data mining, describing how it differs
• Define a database and a database manage- from OLAP and the types of information it
ment system and describe how it solves the provides.
problems of a traditional file environment.
• Explain how text mining and Web mining
2. What are the major capabilities of DBMS and why differ from conventional data mining.
is a relational DBMS so powerful?
• Describe how users can access information
• Name and briefly describe the capabilities of from a company’s internal databases through
a DBMS. the Web.
• Define a relational DBMS and explain how it 5. Why are information policy, data administration,
organizes data. and data quality assurance essential for managing
the firm’s data resources?
• List and describe the three operations of a
relational DBMS. • Describe the roles of information policy and
data administration in information manage-
3. What are some important database design princi- ment.
ples?
• Explain why data quality audits and data
• Define and describe normalization and referen- cleansing are essential.
tial integrity and explain how they contribute
to a well-designed relational database.
• Define and describe an entity-relationship dia-
gram and explain its role in database design.
Discussion Questions Video Cases
1. It has been said that you do not need database Video Cases and Instructional Videos illustrating
management software to create a database some of the concepts in this chapter are available.
environment. Discuss. Contact your instructor to access these videos.
2. To what extent should end users be involved in
the selection of a database management system
and database design?
3. What are the consequences of an organization not
having an information policy?
Collaboration and Teamwork: Identifying Entities and Attributes in an
Online Database
With your team of three or four students, select an between the entities you have identified. If possible,
online database to explore, such as AOL Music, use Google Sites to post links to Web pages,
iGo.com, or the Internet Movie Database (IMDb). team communication announcements, and work
Explore one of these Web sites to see what informa- assignments; to brainstorm; and to work collabora-
tion it provides. Then list the entities and attributes tively on project documents. Try to use Google Docs
that the company running the Web site must keep to develop a presentation of your findings for the
track of in its databases. Diagram the relationship class.
240 Part Two Information Technology Infrastructure
The Terror Watch List Database’s Troubles Continue
CASE STUDY
I n the aftermath of the 9-11 attacks, the FBI’s Protection, Secret Service, U.S. Marshals Service, and
Terrorist Screening Center, or TSC, was the White House. Airlines use data supplied by the
established to consolidate information about TSA system in their NoFly and Selectee lists for
suspected terrorists from multiple government prescreening passengers, while the U.S. Customs and
agencies into a single list to enhance inter-agency Border Protection system uses the watchlist data to
communication. A database of suspected terrorists help screen travelers entering the United States. The
known as the terrorist watch list was created. State Department system screens applicants for visas
Multiple U.S. government agencies had been to enter the United States and U.S. residents applying
maintaining separate lists and these agencies lacked for passports, while state and local law enforcement
a consistent process to share relevant information. agencies use the FBI system to help with arrests,
detentions, and other criminal justice activities. Each
Records in the TSC database contain sensitive but of these agencies receives the subset of data in the
unclassified information on terrorist identities, such watch list that pertains to its specific mission.
as name and date of birth, that can be shared with
other screening agencies. Classified information When an individual makes an airline reservation,
about the people in the watch list is maintained in arrives at a U.S. port of entry, applies for a U.S. visa,
other law enforcement and intelligence agency data- or is stopped by state or local police within the
bases. Records for the watchlist database are pro- United States, the frontline screening agency or
vided by two sources: The National Counterterrorism airline conducts a name-based search of the individ-
Center (NCTC) managed by the Office of the ual against the records from the terrorist watch list
Director of National Intelligence provides identifying database. When the computerized name-matching
information on individuals with ties to international system generates a “hit” (a potential name match)
terrorism. The FBI provides identifying information against a watch list record, the airline or agency will
on individuals with ties to purely domestic terrorism. review each potential match. Matches that are
clearly positive or exact matches that are inconclu-
These agencies collect and maintain terrorist sive (uncertain or difficult to verify) are referred to
information and nominate individuals for inclusion the applicable screening agency’s intelligence or
in the TSC’s consolidated watch list. They are operations center and to the TSC for closer examina-
required to follow strict procedures established by tion. In turn, TSC checks its databases and other
the head of the agency concerned and approved by sources, including classified databases maintained by
the U.S. Attorney General. TSC staff must review the NCTC and FBI to confirm whether the individual
each record submitted before it is added to the is a positive, negative, or inconclusive match to the
database. An individual will remain on the watch list watch list record. TSC creates a daily report summa-
until the respective department or agency that rizing all positive matches to the watch list and
nominated that person to the list determines that the distributes them to numerous federal agencies.
person should be removed from the list and deleted
from the database The process of consolidating information from
disparate agencies has been a slow and painstaking
The TSC watch list database is updated daily with one, requiring the integration of at least 12 different
new nominations, modifications to existing records, databases. Two years after the process of integration
and deletions. Since its creation, the list has took place, 10 of the 12 databases had been
ballooned to 400,000 people, recorded as 1.1 million processed. The remaining two databases (the U.S.
names and aliases, and is continuing to grow at a rate Immigration and Customs Enforcement’s Automatic
of 200,000 records each year. Information on the list Biometric Identification System and the FBI’s
is distributed to a wide range of government agency Integrated Automated Fingerprint Identification
systems for use in efforts to deter or detect the System) are both fingerprint databases. There is still
movements of known or suspected terrorists. more work to be done to optimize the list’s useful-
ness.
Recipient agencies include the FBI, CIA, National
Security Agency (NSA), Transportation Security Reports from both the Government Accountability
Administration (TSA), Department of Homeland Office and the Office of the Inspector General assert
Security, State Department, Customs and Border
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 241
that the list contains inaccuracies and that govern- quick redress process for innocents that hope to
ment departmental policies for nomination and remove themselves from it.
removal from the lists are not uniform. There has also
been public outcry resulting from the size of the list The number of requests for removal from the
and well-publicized incidents of obvious non-terrorists watch list continues to mount, with over 24,000
finding that they are included on the list. requests recorded (about 2,000 each month) and only
54 percent of them resolved. The average time to
Information about the process for inclusion on the process a request in 2008 was 40 days, which was not
list must necessarily be carefully protected if the list (and still is not) fast enough to keep pace with the
is to be effective against terrorists. The specific number of requests for removal coming in. As a
criteria for inclusion are not public knowledge. We result, law-abiding travelers that inexplicably find
do know, however, that government agencies popu- themselves on the watch list are left with no easy
late their watch lists by performing wide sweeps of way to remove themselves from it.
information gathered on travelers, using many
misspellings and alternate variations of the names of In February 2007, the Department of Homeland
suspected terrorists. This often leads to the inclusion Security instituted its Traveler Redress Inquiry
of people who do not belong on watch lists, known as Program (TRIP) to help people that have been
“false positives.” It also results in some people being erroneously added to terrorist watch lists remove
listed multiple times under different spellings of themselves and avoid extra screening and question-
their names. ing. John Anderson’s mother claimed that despite
her best efforts, she was unable to remove her son
While these selection criteria may be effective for from the watch lists. Senator Kennedy reportedly
tracking as many potential terrorists as possible, they was only able to remove himself from the list by
also lead to many more erroneous entries on the list personally bringing up the matter to Tom Ridge, then
than if the process required more finely tuned infor- the Director of the Department of Homeland
mation to add new entries. Notable examples of ‘false Security.
positives’ include Michael Hicks, an 8-year-old New
Jersey Cub Scout who is continually stopped at the Security officials say that mistakes such as the one
airport for additional screening and the late senator that led to Anderson and Kennedy’s inclusion on
Ted Kennedy, who had been repeatedly delayed in no-fly and consolidated watch lists occur due to the
the past because his name resembles an alias once matching of imperfect data in airline reservation
used by a suspected terrorist. Like Kennedy, Hicks systems with imperfect data on the watch lists. Many
may have been added because his name is the same airlines don’t include gender, middle name, or date
or similar to a different suspected terrorist. of birth in their reservations records, which increases
the likelihood of false matches.
These incidents call attention to the quality and
accuracy of the data in the TSC consolidated One way to improve screening and help reduce
terrorist watch list. In June 2005, a report by the the number of people erroneously marked for
Department of Justice’s Office of the Inspector additional investigation would be to use a more
General found inconsistent record counts, duplicate sophisticated system involving more personal data
records, and records that lacked data fields or had about individuals on the list. The TSA is developing
unclear sources for their data. Although TSC just such a system, called “Secure Flight,” but it has
subsequently enhanced its efforts to identify and been continually delayed due to privacy concerns
correct incomplete or inaccurate watch list records, regarding the sensitivity and safety of the data it
the Inspector General noted in September 2007 that would collect. Other similar surveillance programs
TSC management of the watch list still showed some and watch lists, such as the NSA’s attempts to gather
weaknesses. information about suspected terrorists, have drawn
criticism for potential privacy violations.
Given the option between a list that tracks every
potential terrorist at the cost of unnecessarily track- Additionally, the watch list has drawn criticism
ing some innocents, and a list that fails to track because of its potential to promote racial profiling
many terrorists in an effort to avoid tracking inno- and discrimination. Some allege that they were
cents, many would choose the list that tracked every included by virtue of their race and ethnic descent,
terrorist despite the drawbacks. But to make matters such as David Fathi, an attorney for the ACLU of
worse for those already inconvenienced by wrongful Iranian descent, and Asif Iqbal, a U.S. citizen of
inclusion on the list, there is currently no simple and Pakistani decent with the same name as a
Guantanamo detainee. Outspoken critics of U.S.
foreign policy, such as some elected officials and
242 Part Two Information Technology Infrastructure Detroit on Christmas Day 2009, had not made it onto
the no-fly list. Although Abdulmutallab’s father had
university professors, have also found themselves on reported concern over his son’s radicalization to the
the list. U.S. State Department, the Department did not
revoke Adbulmutallab’s visa because his name was
A report released in May 2009 by Department of misspelled in the visa database, so he was allowed to
Justice Inspector General Glenn A. Fine found that enter the United States. Faisal Shahzad, the Times
the FBI had incorrectly kept nearly 24,000 people on Square car bomber, was apprehended on May 3,
its own watch list that supplies data to the terrorist 2010, only moments before his Emirates airline flight
watch list on the basis of outdated or irrelevant infor- to Dubai and Pakistan was about to take off. The
mation. Examining nearly 69,000 referrals to the FBI airline had failed to check a last-minute update to the
list, the report found that 35 percent of those people no-fly list that had added Shahzad’s name.
remained on the list despite inadequate justification.
Even more worrisome, the list did not contain the Sources: Scott Shane, “Lapses Allowed Suspect to Board Plane,”
names of people who should have been listed The New York Times, May 4, 2010; Mike McIntire, “Ensnared by
because of their terrorist ties. Error on Growing U.S. Watch List,” The New York Times, April 6,
2010; Eric Lipton, Eric Schmitt, and Mark Mazzetti, “Review of Jet
FBI officials claim that the bureau has made Bomb Plot Shows More Missed Clues,” The New York Times,
improvements, including better training, faster January 18, 2010; Lizette Alvarez, “Meet Mikey, 8: U.S. Has Him
processing of referrals, and requiring field office on Watch List,” The New York Times, January 14, 2010; Eric
supervisors to review watch-list nominations for Lichtblau, “Justice Dept. Finds Flaws in F.B.I. Terror List,” The New
accuracy and completeness. But this watch list and York Times, May 7, 2009; Bob Egelko, “Watch-list Name Confusion
the others remain imperfect tools. In early 2008, it Causes Hardship,” San Francisco Chronicle, March 20, 2008;
was revealed that 20 known terrorists were not cor- “Reports Cite Lack of Uniform Policy for Terrorist Watch List,” The
rectly listed on the consolidated watch list. (Whether Washington Post, March 18, 2008; Siobhan Gorman, “NSA’s
these individuals were able to enter the U.S. as a Domestic Spying Grows as Agency Sweeps Up Data,” The Wall
result is unclear.) Street Journal, March 10, 2008; Ellen Nakashima, and Scott
McCartney, “When Your Name is Mud at the Airport,” The Wall
Umar Farouk Abdulmutallab, the Nigerian who Street Journal, January 29, 2008.
unsuccessfully tried to detonate plastic explosives on
the Northwest Airlines flight from Amsterdam to
CASE STUDY QUESTIONS 4. How effective is the system of watch lists
described in this case study? Explain your answer.
1. What concepts in this chapter are illustrated in
this case? 5. If you were responsible for the management of
the TSC watch list database, what steps would you
2. Why was the consolidated terror watch list take to correct some of these weaknesses?
created? What are the benefits of the list?
6. Do you believe that the terror watch list
3. Describe some of the weaknesses of the watch represents a significant threat to individuals’
list. What management, organization, and privacy or Constitutional rights? Why or why not?
technology factors are responsible for these
weaknesses?
This page intentionally left blank
Chapter 7
Telecommunications, the Internet,
and Wireless Technology
LEARNING OBJECTIVESS CHAPTER OUTLINE
After reading this chapter, you 7.1 TELECOMMUNICATIONS AND NETWORKING
will be able to answer the IN TODAY’S BUSINESS WORLD
following questions: Networking and Communication Trends
What Is a Computer Network?
1. What are the principal components Key Digital Networking Technologies
of telecommunications networks and
key networking technologies? 7.2 COMMUNICATIONS NETWORKS
Signals: Digital vs. Analog
2. What are the main telecommunica- Types of Networks
tions transmission media and types Physical Transmission Media
of networks?
7.3 THE GLOBAL INTERNET
3. How do the Internet and Internet What Is the Internet?
technology work and how do they Internet Addressing and Architecture
support communication and Internet Services and Communications Tools
e-business? The Web
4. What are the principal technologies 7.4 THE WIRELESS REVOLUTION
and standards for wireless network- Cellular Systems
ing, communication, and Internet Wireless Computer Networks and Internet Access
access? RFID and Wireless Sensor Networks
5. Why are radio frequency identifica- 7.5 HANDS-ON MIS PROJECTS
tion (RFID) and wireless sensor Management Decision Problems
networks valuable for business? Improving Decision Making: Using Spreadsheet
Software to Evaluate Wireless Services
Interactive Sessions: Achieving Operational Excellence: Using Web Search
Engines for Business Research
The Battle Over Net Neutrality
Monitoring Employees on LEARNING TRACK MODULES
Computing and Communications Services Provided
Networks: Unethical or by Commercial Communications Vendors
Good Business? Broadband Network Services and Technologies
Cellular System Generations
WAP and I-Mode: Wireless Cellular Standards for
Web Access
Wireless Applications for Customer Relationship
Management, Supply Chain Management, and
Health care
Web 2.0
HYUNDAI HEAVY INDUSTRIES CREATES
A WIRELESS SHIPYARD
W hat’s it like to be the world’s largest shipbuilder? Ask Hyundai Heavy Industries
(HHI), headquartered in Ulsan, South Korea, which produces 10 percent of the
world’s ships. HHI produces tankers, bulk carriers, containerships, gas and
chemical carriers, ship engines, offshore oil and gas drilling platforms, and under-
sea pipelines.
Coordinating and optimizing the production of so many different products, is obviously a
daunting task. The company has already invested nearly $50 million in factory planning
software to help manage this effort. But HHI’s “factory” encompasses 11 square kilometers
(4.2 square miles) stretching over land and sea, including nine drydocks, the largest of
which spans more than seven football fields to support construction of four vessels simul-
taneously. Over 12,000 workers build up to 30 ships at one time, using millions of parts
ranging in size from small rivets to five-story buildings.
This production environment proved too large and complex to easily track the movement of
parts and inventory in real time as these events were taking place. Without up-to-the-minute
data, the efficiencies from enterprise resource planning software are very limited. To make
matters worse, the recent economic downturn hit HHI especially hard, as world trading and
shipping plummeted. Orders for new ships in 2009 plunged to 7.9 million compensated gross
tons (CGT, a measurement of vessel size), down from 150 million CGT the previous year. In
this economic environment, Hyundai Heavy was looking for new ways to reduce expenses
and streamline production.
HHI’s solution was a high-speed wireless network across the entire shipyard, which was
built by KT Corp., South Korea’s largest telecommunications firm. It is able to transmit data at
a rate of 4 megabits per second, about four times faster than the typical cable modem deliver-
ing high-speed Internet service to U.S. households. The company uses radio sensors to track
the movement of parts as they move from fabrication shop to the side of a drydock and then
onto a ship under construction. Workers on the ship use notebook computers or handheld
mobile phones to access plans and engage in two-way video conversations with ship designers
in the office, more than a mile away.
In the past, workers who were inside a vessel below ground or below sea level had to climb
topside to use a phone or walkie-talkie when they had to talk to someone about a problem. The
new wireless network is connected to the electric lines in the ship, which convey digital data
to Wi-Fi wireless transmitters
placed around the hull during
construction. Workers’ Internet
phones, webcams, and PCs are
linked to the Wi-Fi system, so
workers can use Skype VoIP to
call their colleagues on the sur-
face. Designers in an office
building a mile from the con-
struction site use the webcams
to investigate problems.
On the shipyard roads, 30 trans-
porter trucks fitted to receivers
connected to the wireless net-
work update their location
every 20 seconds to a control
room. This helps dispatchers
245
246 Part Two Information Technology Infrastructure
match the location of transporters with orders for parts, shortening the trips
each truck makes. All of the day’s movements are finished by 6 P.M. instead of
8 P.M. By making operations more efficient and reducing labor costs, the wire-
less technology is expected to save Hyundai Heavy $40 million annually.
Sources: Evan Ramstad, “High-Speed Wireless Transforms a Shipyard,” The Wall Street
Journal, March 15, 2010 and “Hyundai Heavy Plans Wireless Shipyard,” The Korea Herald,
March 30, 2010.
Hyundai Heavy Industries’s experience illustrates some of the powerful
capabilities and opportunities provided by contemporary networking
technology. The company used wireless networking technology to connect
designers, laborers, ships under construction, and transportation vehicles to
accelerate communication and coordination, and cut down on the time, dis-
tance, or number of steps required to perform a task.
The chapter-opening diagram calls attention to important points raised by
this case and this chapter. Hyundai Heavy Industries produces ships and other
products that are very labor-intensive and sensitive to changes in global
economic conditions. Its production environment is large, complex, and
extremely difficult to coordinate and manage. The company needs to keep
operating costs as low as possible. HHI’s shipyard extends over a vast area, and
it was extremely difficult to monitor and coordinate different projects and work
teams.
Management decided that wireless technology provided a solution and
arranged for the deployment of a wireless network throughout the entire
shipyard. The network also links the yard to designers in HHI’s office a mile
away. The network made it much easier to track parts and production activities
and to optimize the movements of transporter trucks. HHI had to redesign its
production and other work processes to take advantage of the new technology.
Chapter 7 Telecommunications, the Internet, and Wireless Technology 247
7.1 TELECOMMUNICATIONS AND NETWORKING IN
TODAY’S BUSINESS WORLD
I f you run or work in a business, you can’t do without networks. You
need to communicate rapidly with your customers, suppliers, and
employees. Until about 1990, businesses used the postal system or
telephone system with voice or fax for communication. Today, however,
you and your employees use computers and e-mail, the Internet, cell
phones, and mobile computers connected to wireless networks for this
purpose. Networking and the Internet are now nearly synonymous with
doing business.
NETWORKING AND COMMUNICATION TRENDS
Firms in the past used two fundamentally different types of networks:
telephone networks and computer networks. Telephone networks historically
handled voice communication, and computer networks handled data traffic.
Telephone networks were built by telephone companies throughout the twenti-
eth century using voice transmission technologies (hardware and software),
and these companies almost always operated as regulated monopolies through-
out the world. Computer networks were originally built by computer
companies seeking to transmit data between computers in different locations.
Thanks to continuing telecommunications deregulation and information
technology innovation, telephone and computer networks are converging into
a single digital network using shared Internet-based standards and equipment.
Telecommunications providers today, such as AT&T and Verizon, offer data
transmission, Internet access, cellular telephone service, and television
programming as well as voice service. (See the Chapter 3 opening case.) Cable
companies, such as Cablevision and Comcast, now offer voice service and
Internet access. Computer networks have expanded to include Internet
telephone and limited video services. Increasingly, all of these voice, video, and
data communications are based on Internet technology.
Both voice and data communication networks have also become more power-
ful (faster), more portable (smaller and mobile), and less expensive. For instance,
the typical Internet connection speed in 2000 was 56 kilobits per second, but
today more than 60 percent of U.S. Internet users have high-speed broadband
connections provided by telephone and cable TV companies running at 1 to 15
million bits per second. The cost for this service has fallen exponentially, from 25
cents per kilobit in 2000 to a tiny fraction of a cent today.
Increasingly, voice and data communication, as well as Internet access, are
taking place over broadband wireless platforms, such as cell phones, mobile
handheld devices, and PCs in wireless networks. In a few years, more than half
the Internet users in the United States will use smartphones and mobile
netbooks to access the Internet. In 2010, 84 million Americans accessed the
Internet through mobile devices, and this number is expected to double by 2014
(eMarketer, 2010).
WHAT IS A COMPUTER NETWORK?
If you had to connect the computers for two or more employees together in the
same office, you would need a computer network. Exactly what is a network?
In its simplest form, a network consists of two or more connected computers.
248 Part Two Information Technology Infrastructure
Figure 7-1 illustrates the major hardware, software, and transmission compo-
nents used in a simple network: a client computer and a dedicated server
computer, network interfaces, a connection medium, network operating
system software, and either a hub or a switch.
Each computer on the network contains a network interface device called a
network interface card (NIC). Most personal computers today have this card
built into the motherboard. The connection medium for linking network
components can be a telephone wire, coaxial cable, or radio signal in the case of
cell phone and wireless local area networks (Wi-Fi networks).
The network operating system (NOS) routes and manages communica-
tions on the network and coordinates network resources. It can reside on every
computer in the network, or it can reside primarily on a dedicated server
computer for all the applications on the network. A server computer is a
computer on a network that performs important network functions for client
computers, such as serving up Web pages, storing data, and storing the network
operating system (and hence controlling the network). Server software such as
Microsoft Windows Server, Linux, and Novell Open Enterprise Server are the
most widely used network operating systems.
Most networks also contain a switch or a hub acting as a connection point
between the computers. Hubs are very simple devices that connect network
components, sending a packet of data to all other connected devices. A switch
has more intelligence than a hub and can filter and forward data to a specified
destination on the network.
What if you want to communicate with another network, such as the
Internet? You would need a router. A router is a communications processor
used to route packets of data through different networks, ensuring that the data
sent gets to the correct address.
FIGURE 7-1 COMPONENTS OF A SIMPLE COMPUTER NETWORK
Illustrated here is a very simple computer network, consisting of computers, a network operating system residing on a dedicated server
computer, cable (wiring) connecting the devices, network interface cards (NICs), switches, and a router.
Chapter 7 Telecommunications, the Internet, and Wireless Technology 249
Networks in Large Companies
The network we’ve just described might be suitable for a small business. But what
about large companies with many different locations and thousands of employ-
ees? As a firm grows, and collects hundreds of small local area networks, these
networks can be tied together into a corporate-wide networking infrastructure.
The network infrastructure for a large corporation consists of a large number of
these small local area networks linked to other local area networks and to
firmwide corporate networks. A number of powerful servers support a corporate
Web site, a corporate intranet, and perhaps an extranet. Some of these servers
link to other large computers supporting back-end systems.
Figure 7-2 provides an illustration of these more complex, larger scale corpo-
rate-wide networks. Here you can see that the corporate network infrastructure
supports a mobile sales force using cell phones and smartphones, mobile
employees linking to the company Web site, internal company networks using
mobile wireless local area networks (Wi-Fi networks), and a videoconferencing
system to support managers across the world. In addition to these computer
networks, the firm’s infrastructure usually includes a separate telephone
network that handles most voice data. Many firms are dispensing with their
traditional telephone networks and using Internet telephones that run on their
existing data networks (described later).
As you can see from this figure, a large corporate network infrastructure uses
a wide variety of technologies—everything from ordinary telephone service and
corporate data networks to Internet service, wireless Internet, and cell phones.
FIGURE 7-2 CORPORATE NETWORK INFRASTRUCTURE
Today’s corporate network infrastructure is a collection of many different networks from the public
switched telephone network, to the Internet, to corporate local area networks linking workgroups,
departments, or office floors.
250 Part Two Information Technology Infrastructure
One of the major problems facing corporations today is how to integrate all the
different communication networks and channels into a coherent system that
enables information to flow from one part of the corporation to another, and from
one system to another. As more and more communication networks become dig-
ital, and based on Internet technologies, it will become easier to integrate them.
KEY DIGITAL NETWORKING TECHNOLOGIES
Contemporary digital networks and the Internet are based on three key
technologies: client/server computing, the use of packet switching, and the
development of widely used communications standards (the most important of
which is Transmission Control Protocol/Internet Protocol, or TCP/IP) for
linking disparate networks and computers.
Client/Server Computing
We introduced client/server computing in Chapter 5. Client/server computing
is a distributed computing model in which some of the processing power is
located within small, inexpensive client computers, and resides literally on
desktops, laptops, or in handheld devices. These powerful clients are linked to
one another through a network that is controlled by a network server computer.
The server sets the rules of communication for the network and provides every
client with an address so others can find it on the network.
Client/server computing has largely replaced centralized mainframe
computing in which nearly all of the processing takes place on a central large
mainframe computer. Client/server computing has extended computing to
departments, workgroups, factory floors, and other parts of the business that
could not be served by a centralized architecture. The Internet is the largest
implementation of client/server computing.
Packet Switching
Packet switching is a method of slicing digital messages into parcels called
packets, sending the packets along different communication paths as they
become available, and then reassembling the packets once they arrive at their
destinations (see Figure 7-3). Prior to the development of packet switching,
computer networks used leased, dedicated telephone circuits to communicate
with other computers in remote locations. In circuit-switched networks, such as
the telephone system, a complete point-to-point circuit is assembled, and then
communication can proceed. These dedicated circuit-switching techniques
were expensive and wasted available communications capacity—the circuit was
maintained regardless of whether any data were being sent.
Packet switching makes much more efficient use of the communications
capacity of a network. In packet-switched networks, messages are first broken
down into small fixed bundles of data called packets. The packets include
information for directing the packet to the right address and for checking trans-
mission errors along with the data. The packets are transmitted over various
communications channels using routers, each packet traveling independently.
Packets of data originating at one source will be routed through many different
paths and networks before being reassembled into the original message when
they reach their destinations.
TCP/IP and Connectivity
In a typical telecommunications network, diverse hardware and software
components need to work together to transmit information. Different com-
Chapter 7 Telecommunications, the Internet, and Wireless Technology 251
FIGURE 7-3 PACKED-SWITCHED NETWORKS AND PACKET COMMUNICATIONS
Data are grouped into small packets, which are transmitted independently over various communica-
tions channels and reassembled at their final destination.
ponents in a network communicate with each other only by adhering to a
common set of rules called protocols. A protocol is a set of rules and proce-
dures governing transmission of information between two points in a
network.
In the past, many diverse proprietary and incompatible protocols often forced
business firms to purchase computing and communications equipment from a
single vendor. But today, corporate networks are increasingly using a single,
common, worldwide standard called Transmission Control Protocol/
Internet Protocol (TCP/IP). TCP/IP was developed during the early 1970s to
support U.S. Department of Defense Advanced Research Projects Agency
(DARPA) efforts to help scientists transmit data among different types of
computers over long distances.
TCP/IP uses a suite of protocols, the main ones being TCP and IP. TCP refers
to the Transmission Control Protocol (TCP), which handles the movement of
data between computers. TCP establishes a connection between the computers,
sequences the transfer of packets, and acknowledges the packets sent. IP refers
to the Internet Protocol (IP), which is responsible for the delivery of packets and
includes the disassembling and reassembling of packets during transmission.
Figure 7-4 illustrates the four-layered Department of Defense reference model
for TCP/IP.
1. Application layer. The Application layer enables client application programs to
access the other layers and defines the protocols that applications use to
exchange data. One of these application protocols is the Hypertext Transfer
Protocol (HTTP), which is used to transfer Web page files.
2. Transport layer. The Transport layer is responsible for providing the Application
layer with communication and packet services. This layer includes TCP and
other protocols.
3. Internet layer. The Internet layer is responsible for addressing, routing, and
packaging data packets called IP datagrams. The Internet Protocol is one of the
protocols used in this layer.
252 Part Two Information Technology Infrastructure
FIGURE 7-4 THE TRANSMISSION CONTROL PROTOCOL/INTERNET PROTOCOL
(TCP/IP) REFERENCE MODEL
This figure illustrates the four layers of the TCP/IP reference model for communications
4. Network Interface layer. At the bottom of the reference model, the Network
Interface layer is responsible for placing packets on and receiving them from the
network medium, which could be any networking technology.
Two computers using TCP/IP are able to communicate even if they are based
on different hardware and software platforms. Data sent from one computer to
the other passes downward through all four layers, starting with the sending
computer’s Application layer and passing through the Network Interface layer.
After the data reach the recipient host computer, they travel up the layers and
are reassembled into a format the receiving computer can use. If the receiving
computer finds a damaged packet, it asks the sending computer to retransmit it.
This process is reversed when the receiving computer responds.
7.2 COMMUNICATIONS NETWORKS
Let’s look more closely at alternative networking technologies available to
businesses.
SIGNALS: DIGITAL VS. ANALOG
There are two ways to communicate a message in a network: either using an
analog signal or a digital signal. An analog signal is represented by a continuous
waveform that passes through a communications medium and has been used
for voice communication. The most common analog devices are the telephone
handset, the speaker on your computer, or your iPod earphone, all of which
create analog wave forms that your ear can hear.
A digital signal is a discrete, binary waveform, rather than a continuous
waveform. Digital signals communicate information as strings of two discrete
states: one bit and zero bits, which are represented as on-off electrical pulses.
Computers use digital signals and require a modem to convert these digital signals
into analog signals that can be sent over (or received from) telephone lines, cable
lines, or wireless media that use analog signals (see Figure 7-5). Modem stands for
Chapter 7 Telecommunications, the Internet, and Wireless Technology 253
FIGURE 7-5 FUNCTIONS OF THE MODEM
A modem is a device that translates digital signals into analog form (and vice versa) so that computers
can transmit data over analog networks such as telephone and cable networks.
modulator-demodulator. Cable modems connect your computer to the Internet
using a cable network. DSL modems connect your computer to the Internet using
a telephone company’s land line network. Wireless modems perform the same
function as traditional modems, connecting your computer to a wireless network
that could be a cell phone network, or a Wi-Fi network. Without modems,
computers could not communicate with one another using analog networks
(which include the telephone system and cable networks).
TYPES OF NETWORKS
There are many different kinds of networks and ways of classifying them. One
way of looking at networks is in terms of their geographic scope (see Table 7-1).
Local Area Networks
If you work in a business that uses networking, you are probably connecting to
other employees and groups via a local area network. A local area network
(LAN) is designed to connect personal computers and other digital devices
within a half-mile or 500-meter radius. LANs typically connect a few computers
in a small office, all the computers in one building, or all the computers in
several buildings in close proximity. LANs also are used to link to long-distance
wide area networks (WANs, described later in this section) and other networks
around the world using the Internet.
Review Figure 7-1, which could serve as a model for a small LAN that might
be used in an office. One computer is a dedicated network file server, providing
users with access to shared computing resources in the network, including
software programs and data files.
The server determines who gets access to what and in which sequence. The
router connects the LAN to other networks, which could be the Internet or
another corporate network, so that the LAN can exchange information with
networks external to it. The most common LAN operating systems are
Windows, Linux, and Novell. Each of these network operating systems supports
TCP/IP as their default networking protocol.
TABLE 7-1 TYPES OF NETWORKS
TYPE AREA
Local area network (LAN) Up to 500 meters (half a mile); an office or floor of a building
Campus area network (CAN) Up to 1,000 meters (a mile); a college campus or corporate facility
Metropolitan area network (MAN) A city or metropolitan area
Wide area network (WAN) A transcontinental or global area
254 Part Two Information Technology Infrastructure
Ethernet is the dominant LAN standard at the physical network level,
specifying the physical medium to carry signals between computers, access
control rules, and a standardized set of bits used to carry data over the system.
Originally, Ethernet supported a data transfer rate of 10 megabits per second
(Mbps). Newer versions, such as Fast Ethernet and Gigabit Ethernet, support
data transfer rates of 100 Mbps and 1 gigabits per second (Gbps), respectively,
and are used in network backbones.
The LAN illustrated in Figure 7-1 uses a client/server architecture where the
network operating system resides primarily on a single file server, and the
server provides much of the control and resources for the network.
Alternatively, LANs may use a peer-to-peer architecture. A peer-to-peer
network treats all processors equally and is used primarily in small networks
with 10 or fewer users. The various computers on the network can exchange
data by direct access and can share peripheral devices without going through a
separate server.
In LANs using the Windows Server family of operating systems, the peer-
to-peer architecture is called the workgroup network model, in which a small
group of computers can share resources, such as files, folders, and printers,
over the network without a dedicated server. The Windows domain network
model, in contrast, uses a dedicated server to manage the computers in the
network.
Larger LANs have many clients and multiple servers, with separate servers
for specific services, such as storing and managing files and databases (file
servers or database servers), managing printers (print servers), storing and
managing e-mail (mail servers), or storing and managing Web pages (Web
servers).
Sometimes LANs are described in terms of the way their components are
connected together, or their topology. There are three major LAN topologies:
star, bus, and ring (see Figure 7-6).
In a star topology, all devices on the network connect to a single hub.
Figure 7-6 illustrates a simple star topology in which all network traffic flows
through the hub. In an extended star network, multiple layers of hubs are
organized into a hierarchy.
In a bus topology, one station transmits signals, which travel in both
directions along a single transmission segment. All of the signals are broadcast
in both directions to the entire network. All machines on the network receive
the same signals, and software installed on the client computers enables each
client to listen for messages addressed specifically to it. The bus topology is the
most common Ethernet topology.
A ring topology connects network components in a closed loop. Messages
pass from computer to computer in only one direction around the loop, and
only one station at a time may transmit. The ring topology is primarily found in
older LANs using Token Ring networking software.
Metropolitan and Wide Area Networks
Wide area networks (WANs) span broad geographical distances—entire
regions, states, continents, or the entire globe. The most universal and powerful
WAN is the Internet. Computers connect to a WAN through public networks,
such as the telephone system or private cable systems, or through leased lines
or satellites. A metropolitan area network (MAN) is a network that spans a
metropolitan area, usually a city and its major suburbs. Its geographic scope
falls between a WAN and a LAN.
Chapter 7 Telecommunications, the Internet, and Wireless Technology 255
FIGURE 7-6 NETWORK TOPOLOGIES
The three basic network topologies are the star, bus, and ring.
PHYSICAL TRANSMISSION MEDIA
Networks use different kinds of physical transmission media, including twisted
wire, coaxial cable, fiber optics, and media for wireless transmission. Each has
advantages and limitations. A wide range of speeds is possible for any given
medium depending on the software and hardware configuration.
Twisted Wire
Twisted wire consists of strands of copper wire twisted in pairs and is an older
type of transmission medium. Many of the telephone systems in buildings had
twisted wires installed for analog communication, but they can be used for
digital communication as well. Although an older physical transmission
medium, the twisted wires used in today’s LANs, such as CAT5, can obtain
speeds up to 1 Gbps. Twisted-pair cabling is limited to a maximum recom-
mended run of 100 meters (328 feet).
Coaxial Cable
Coaxial cable, similar to that used for cable television, consists of thickly
insulated copper wire that can transmit a larger volume of data than twisted
wire. Cable was used in early LANs and is still used today for longer (more than
100 meters) runs in large buildings. Coaxial has speeds up to 1 Gbps.
Fiber Optics and Optical Networks
Fiber-optic cable consists of bound strands of clear glass fiber, each the thick-
ness of a human hair. Data are transformed into pulses of light, which are sent
through the fiber-optic cable by a laser device at rates varying from 500 kilobits to
several trillion bits per second in experimental settings. Fiber-optic cable is
considerably faster, lighter, and more durable than wire media, and is well suited
to systems requiring transfers of large volumes of data. However, fiber-optic cable
is more expensive than other physical transmission media and harder to install.
256 Part Two Information Technology Infrastructure
Until recently, fiber-optic cable had been used primarily for the high-speed
network backbone, which handles the major traffic. Now cellular phone
companies such as Verizon are starting to bring fiber lines into the home for
new types of services, such as Verizon’s Fiber Optic Services (FiOS) Internet
service that provides up 50 Mbps download speeds.
Wireless Transmission Media
Wireless transmission is based on radio signals of various frequencies. There
are three kinds of wireless networks used by computers: microwave, cellular,
and Wi-Fi. Microwave systems, both terrestrial and celestial, transmit high-
frequency radio signals through the atmosphere and are widely used for
high-volume, long-distance, point-to-point communication. Microwave sig-
nals follow a straight line and do not bend with the curvature of the earth.
Therefore, long-distance terrestrial transmission systems require that trans-
mission stations be positioned about 37 miles apart. Long-distance transmis-
sion is also possible by using communication satellites as relay stations for
microwave signals transmitted from terrestrial stations.
Communication satellites use microwave transmission and are typically
used for transmission in large, geographically dispersed organizations that
would be difficult to network using cabling media or terrestrial microwave, as
well as for home Internet service, especially in rural areas. For instance, the
global energy company BP p.l.c. uses satellites for real-time data transfer of oil
field exploration data gathered from searches of the ocean floor. Using geosyn-
chronous satellites, exploration ships transfer these data to central computing
centers in the United States for use by researchers in Houston, Tulsa, and
suburban Chicago. Figure 7-7 illustrates how this system works. Satellites are
also used for home television and Internet service. The two major satellite
Internet providers (Dish Network and DirectTV) have about 30 million
subscribers, and about 17 percent of all U.S. households access the Internet
using satellite services (eMarketer, 2010).
FIGURE 7-7 BP’S SATELLITE TRANSMISSION SYSTEM
Communication satellites help BP transfer seismic data between oil exploration ships and research
centers in the United States.
Chapter 7 Telecommunications, the Internet, and Wireless Technology 257
Cellular systems also use radio waves and a variety of different protocols to
communicate with radio antennas (towers) placed within adjacent geographic
areas called cells. Communications transmitted from a cell phone to a local
cell pass from antenna to antenna—cell to cell—until they reach their final
destination.
Wireless networks are supplanting traditional wired networks for many
applications and creating new applications, services, and business models.
In Section 7.4, we provide a detailed description of the applications and
technology standards driving the “wireless revolution.”
Transmission Speed
The total amount of digital information that can be transmitted through any
telecommunications medium is measured in bits per second (bps). One signal
change, or cycle, is required to transmit one or several bits; therefore, the
transmission capacity of each type of telecommunications medium is a func-
tion of its frequency. The number of cycles per second that can be sent
through that medium is measured in hertz—one hertz is equal to one cycle of
the medium.
The range of frequencies that can be accommodated on a particular telecom-
munications channel is called its bandwidth. The bandwidth is the difference
between the highest and lowest frequencies that can be accommodated on a
single channel. The greater the range of frequencies, the greater the bandwidth
and the greater the channel’s transmission capacity.
7.3 THE GLOBAL INTERNET
We all use the Internet, and many of us can’t do without it. It’s become an
indispensable personal and business tool. But what exactly is the Internet? How
does it work, and what does Internet technology have to offer for business?
Let’s look at the most important Internet features.
WHAT IS THE INTERNET?
The Internet has become the world’s most extensive, public communication
system that now rivals the global telephone system in reach and range. It’s also
the world’s largest implementation of client/server computing and internet-
working, linking millions of individual networks all over the world. This global
network of networks began in the early 1970s as a U.S. Department of Defense
network to link scientists and university professors around the world.
Most homes and small businesses connect to the Internet by subscribing to an
Internet service provider. An Internet service provider (ISP) is a commercial
organization with a permanent connection to the Internet that sells temporary
connections to retail subscribers. EarthLink, NetZero, AT&T, and Time Warner
are ISPs. Individuals also connect to the Internet through their business firms,
universities, or research centers that have designated Internet domains.
There are a variety of services for ISP Internet connections. Connecting via a
traditional telephone line and modem, at a speed of 56.6 kilobits per second
(Kbps) used to be the most common form of connection worldwide, but it has
been largely replaced by broadband connections. Digital subscriber line (DSL),
cable, satellite Internet connections, and T lines provide these broadband
services.
258 Part Two Information Technology Infrastructure
Digital subscriber line (DSL) technologies operate over existing telephone
lines to carry voice, data, and video at transmission rates ranging from 385 Kbps
all the way up to 9 Mbps. Cable Internet connections provided by cable
television vendors use digital cable coaxial lines to deliver high-speed Internet
access to homes and businesses. They can provide high-speed access to the
Internet of up to 15 Mbps. In areas where DSL and cable services are unavail-
able, it is possible to access the Internet via satellite, although some satellite
Internet connections have slower upload speeds than other broadband services.
T1 and T3 are international telephone standards for digital communication.
They are leased, dedicated lines suitable for businesses or government
agencies requiring high-speed guaranteed service levels. T1 lines offer guaran-
teed delivery at 1.54 Mbps, and T3 lines offer delivery at 45 Mbps. The Internet
does not provide similar guaranteed service levels, but simply “best effort.”
INTERNET ADDRESSING AND ARCHITECTURE
The Internet is based on the TCP/IP networking protocol suite described
earlier in this chapter. Every computer on the Internet is assigned a unique
Internet Protocol (IP) address, which currently is a 32-bit number repre-
sented by four strings of numbers ranging from 0 to 255 separated by periods.
For instance, the IP address of www.microsoft.com is 207.46.250.119.
When a user sends a message to another user on the Internet, the message is
first decomposed into packets using the TCP protocol. Each packet contains its
destination address. The packets are then sent from the client to the network
server and from there on to as many other servers as necessary to arrive at a
specific computer with a known address. At the destination address, the packets
are reassembled into the original message.
The Domain Name System
Because it would be incredibly difficult for Internet users to remember strings
of 12 numbers, the Domain Name System (DNS) converts domain names to IP
addresses. The domain name is the English-like name that corresponds to the
unique 32-bit numeric IP address for each computer connected to the Internet.
DNS servers maintain a database containing IP addresses mapped to their
corresponding domain names. To access a computer on the Internet, users
need only specify its domain name.
DNS has a hierarchical structure (see Figure 7-8). At the top of the DNS
hierarchy is the root domain. The child domain of the root is called a top-level
domain, and the child domain of a top-level domain is called is a second-level
domain. Top-level domains are two- and three-character names you are famil-
iar with from surfing the Web, for example, .com, .edu, .gov, and the various
country codes such as .ca for Canada or .it for Italy. Second-level domains
have two parts, designating a top-level name and a second-level name—such
as buy.com, nyu.edu, or amazon.ca. A host name at the bottom of the hierar-
chy designates a specific computer on either the Internet or a private
network.
The most common domain extensions currently available and officially
approved are shown in the following list. Countries also have domain names
such as .uk, .au, and .fr (United Kingdom, Australia, and France, respectively),
and there is a new class of “internationalized” top level domains that use
non-English characters (ICANN, 2010). In the future, this list will expand to
include many more types of organizations and industries.
Chapter 7 Telecommunications, the Internet, and Wireless Technology 259
FIGURE 7-8 THE DOMAIN NAME SYSTEM
Domain Name System is a hierarchical system with a root domain, top-level domains, second-level
domains, and host computers at the third level.
.com Commercial organizations/businesses
.edu Educational institutions
.gov U.S. government agencies
.mil U.S. military
.net Network computers
.org Nonprofit organizations and foundations
.biz Business firms
.info Information providers
Internet Architecture and Governance
Internet data traffic is carried over transcontinental high-speed backbone
networks that generally operate today in the range of 45 Mbps to 2.5 Gbps (see
Figure 7-9). These trunk lines are typically owned by long-distance telephone
companies (called network service providers) or by national governments. Local
connection lines are owned by regional telephone and cable television compa-
nies in the United States that connect retail users in homes and businesses to
the Internet. The regional networks lease access to ISPs, private companies,
and government institutions.
Each organization pays for its own networks and its own local Internet
connection services, a part of which is paid to the long-distance trunk line
owners. Individual Internet users pay ISPs for using their service, and they
generally pay a flat subscription fee, no matter how much or how little they use
the Internet. A debate is now raging on whether this arrangement should
continue or whether heavy Internet users who download large video and music
files should pay more for the bandwidth they consume. The Interactive Session
on Organizations explores this topic, as it examines the pros and cons of
network neutrality.
260 Part Two Information Technology Infrastructure
FIGURE 7-9 INTERNET NETWORK ARCHITECTURE
The Internet backbone connects to regional networks, which in turn provide access to Internet service
providers, large firms, and government institutions. Network access points (NAPs) and metropolitan-
area exchanges (MAEs) are hubs where the backbone intersects regional and local networks and
where backbone owners connect with one another.
No one “owns” the Internet, and it has no formal management. However,
worldwide Internet policies are established by a number of professional organi-
zations and government bodies, including the Internet Architecture Board
(IAB), which helps define the overall structure of the Internet; the Internet
Corporation for Assigned Names and Numbers (ICANN), which assigns IP
addresses; and the World Wide Web Consortium (W3C), which sets Hypertext
Markup Language and other programming standards for the Web.
These organizations influence government agencies, network owners, ISPs,
and software developers with the goal of keeping the Internet operating as
efficiently as possible. The Internet must also conform to the laws of the
sovereign nation-states in which it operates, as well as the technical infra-
structures that exist within the nation-states. Although in the early years of
the Internet and the Web there was very little legislative or executive interfer-
ence, this situation is changing as the Internet plays a growing role in the
distribution of information and knowledge, including content that some find
objectionable.
The Future Internet: IPv6 and Internet2
The Internet was not originally designed to handle the transmission of massive
quantities of data and billions of users. Because many corporations and govern-
ments have been given large blocks of millions of IP addresses to accommodate
current and future workforces, and because of sheer Internet population
growth, the world will run out of available IP addresses using the existing
Chapter 7 Telecommunications, the Internet, and Wireless Technology 261
addressing convention by 2012 or 2013. Under development is a new version of
the IP addressing schema called Internet Protocol version 6 (IPv6), which contains
128-bit addresses (2 to the power of 128), or more than a quadrillion possible
unique addresses.
Internet2 and Next-Generation Internet (NGI) are consortia representing
200 universities, private businesses, and government agencies in the United
States that are working on a new, robust, high-bandwidth version of the
Internet. They have established several new high-performance backbone net-
works with bandwidths reaching as much as 100 Gbps. Internet2 research
groups are developing and implementing new technologies for more effective
routing practices; different levels of service, depending on the type and impor-
tance of the data being transmitted; and advanced applications for distributed
computation, virtual laboratories, digital libraries, distributed learning, and tele-
immersion. These networks do not replace the public Internet, but they do pro-
vide test beds for leading-edge technology that may eventually migrate to the
public Internet.
INTERNET SERVICES AND COMMUNICATION TOOLS
The Internet is based on client/server technology. Individuals using the
Internet control what they do through client applications on their computers,
such as Web browser software. The data, including e-mail messages and Web
pages, are stored on servers. A client uses the Internet to request information
from a particular Web server on a distant computer, and the server sends the
requested information back to the client over the Internet. Chapters 5 and 6
describe how Web servers work with application servers and database servers
to access information from an organization’s internal information systems
applications and their associated databases. Client platforms today include
not only PCs and other computers but also cell phones, small handheld digital
devices, and other information appliances.
Internet Services
A client computer connecting to the Internet has access to a variety of services.
These services include e-mail, electronic discussion groups, chatting and
instant messaging, Telnet, File Transfer Protocol (FTP), and the Web.
Table 7-2 provides a brief description of these services.
Each Internet service is implemented by one or more software programs. All
of the services may run on a single server computer, or different services may
TABLE 7-2 MAJOR INTERNET SERVICES
CAPABILITY FUNCTIONS SUPPORTED
E-mail
Chatting and instant messaging Person-to-person messaging; document sharing
Newsgroups Interactive conversations
Telnet Discussion groups on electronic bulletin boards
File Transfer Protocol (FTP) Logging on to one computer system and doing work on another
World Wide Web Transferring files from computer to computer
Retrieving, formatting, and displaying information (including text,
audio, graphics, and video) using hypertext links
262 Part Two Information Technology Infrastructure
INTERACTIVE SESSION: ORGANIZATIONS
THE BATTLE OVER NET NEUTRALITY
What kind of Internet user are you? Do you primar- 50 percent of its network capacity. In 2008, the
ily use the Net to do a little e-mail and look up company slowed down transmission of BitTorrent
phone numbers? Or are you online all day, watching files, used extensively for piracy and illegal sharing
YouTube videos, downloading music files, or playing of copyrighted materials, including video. The
massively multiplayer online games? If you’re the Federal Communications Commission (FCC) ruled
latter, you are consuming a great deal of bandwidth, that Comcast had to stop slowing peer-to-peer traffic
and hundreds of millions of people like you might in the name of network management. Comcast then
start to slow the Internet down. YouTube consumed filed a lawsuit challenging the FCC’s authority to
as much bandwidth in 2007 as the entire Internet enforce network neutrality. In April 2010, a federal
did in 2000. That’s one of the arguments being made appeals court ruled in favor of Comcast that the
today for charging Internet users based on the FCC did not have the authority to regulate how an
amount of transmission capacity they use. Internet provider manages its network.
If user demand for the Internet overwhelms Advocates of net neutrality are pushing Congress
network capacity, the Internet might not come to a to find ways to regulate the industry to prevent
screeching halt, but users would be faced with very network providers from adopting Comcast-like
sluggish download speeds and slow performance of practices. The strange alliance of net neutrality
YouTube, Facebook, and other data-heavy services. advocates includes MoveOn.org, the Christian
(Heavy use of iPhones in urban areas such as New Coalition, the American Library Association, every
York and San Francisco has already degraded service major consumer group, many bloggers and small
on the AT&T wireless network. AT&T reports that 3 businesses, and some large Internet companies like
percent of its subscriber base accounts for 40 Google and Amazon.
percent of its data traffic. )
Net neutrality advocates argue that the risk of
Other researchers believe that as digital traffic on censorship increases when network operators can
the Internet grows, even at a rate of 50 percent per selectively block or slow access to certain content
year, the technology for handling all this traffic is such as Hulu videos or access to competing low-cost
advancing at an equally rapid pace. services such as Skype and Vonage. There are
already many examples of Internet providers
In addition to these technical issues, the debate restricting access to sensitive materials (such as
about metering Internet use centers around the Pakistan’s government blocking access to anti-
concept of network neutrality. Network neutrality is Muslim sites and YouTube as a whole in response to
the idea that Internet service providers must allow content it deemed defamatory to Islam.)
customers equal access to content and applications,
regardless of the source or nature of the content. Proponents of net neutrality also argue that a
Presently, the Internet is indeed neutral: all Internet neutral Internet encourages everyone to innovate
traffic is treated equally on a first-come, first-served without permission from the phone and cable
basis by Internet backbone owners. companies or other authorities, and this level
playing field has spawned countless new busi-
However, telecommunications and cable compa- nesses. Allowing unrestricted information flow
nies are unhappy with this arrangement. They want becomes essential to free markets and democracy
to be able to charge differentiated prices based on as commerce and society increasingly move online.
the amount of bandwidth consumed by content
being delivered over the Internet. These companies Network owners believe regulation to enforce net
believe that differentiated pricing is “the fairest way” neutrality will impede U.S. competitiveness by stifling
to finance necessary investments in their network innovation, discouraging capital expenditures for new
infrastructures. networks, and curbing their networks’ ability to cope
with the exploding demand for Internet and wireless
Internet service providers point to the upsurge in traffic. U.S. Internet service lags behind many other
piracy of copyrighted materials over the Internet. nations in overall speed, cost, and quality of service,
Comcast, the second largest Internet service adding credibility to this argument.
provider in the United States, reported that illegal
file sharing of copyrighted material was consuming
Chapter 7 Telecommunications, the Internet, and Wireless Technology 263
And with enough options for Internet access, neutrality on wired connections, but not on wireless
regulation would not be essential for promoting net networks, which are becoming the dominant Internet
neutrality. Dissatisfied consumers could simply platform. The proposal was an effort to define some
switch to providers who enforce net neutrality and sort of middle ground that would safeguard net neu-
allow unlimited Internet use. trality while giving carriers the flexibility they needed
to manage their networks and generate revenue from
Since the Comcast ruling was overturned, FCC them. None of the major players in the net neutrality
efforts to support net neutrality have been in a debate showed support and both sides remain dug in.
holding pattern as it searches for some means of
regulating broadband Internet service within the Sources: Joe Nocera, “The Struggle for What We Already Have,”
constraints of current law and current court rulings. The New York Times, September 4, 2010; Claire Cain Miller, “Web
One proposal is to reclassify broadband Internet Plan is Dividing Companies,” The New York Times, August 11,
transmission as a telecommunications service so the 2010; Wayne Rash, “Net Neutrality Looks Dead in the Clutches of
FCC could apply decades-old regulations for Congress,” eWeek, June 13 2010; Amy Schatz and Spencer E. Ante,
traditional telephone networks. “FCC Web Rules Create Pushback,” The Wall Street Journal, May 6,
2010; Amy Schatz, “New U.S. Push to Regulate Internet Access,”
In August 2010, Verizon and Google issued a policy The Wall Street Journal, May 5, 2010; and Joanie Wexler: “Net
statement proposing that regulators enforce net Neutrality: Can We Find Common Ground?” Network World, April
1, 2009.
CASE STUDY QUESTIONS MIS IN ACTION
1. What is network neutrality? Why has the Internet 1. Visit the Web site of the Open Internet Coalition
operated under net neutrality up to this point in and select five member organizations. Then visit
time? the Web site of each of these organizations or surf
the Web to find out more information about each.
2. Who’s in favor of net neutrality? Who’s opposed? Write a short essay explaining why each organiza-
Why? tion is in favor of network neutrality.
3. What would be the impact on individual users, 2. Calculate how much bandwidth you consume
businesses, and government if Internet providers when using the Internet every day. How many
switched to a tiered service model? e-mails do you send daily and what is the size of
each? (Your e-mail program may have e-mail file
4. Are you in favor of legislation enforcing network size information.) How many music and video
neutrality? Why or why not? clips do you download daily and what is the size of
each? If you view YouTube often, surf the Web to
find out the size of a typical YouTube file. Add up
the number of e-mail, audio, and video files you
transmit or receive on a typical day.
be allocated to different machines. Figure 7-10 illustrates one way that these
services can be arranged in a multitiered client/server architecture.
E-mail enables messages to be exchanged from computer to computer, with
capabilities for routing messages to multiple recipients, forwarding messages,
and attaching text documents or multimedia files to messages. Although some
organizations operate their own internal electronic mail systems, most e-mail
today is sent through the Internet. The costs of e-mail is far lower than equiva-
lent voice, postal, or overnight delivery costs, making the Internet a very inex-
pensive and rapid communications medium. Most e-mail messages arrive
anywhere in the world in a matter of seconds.
Nearly 90 percent of U.S. workplaces have employees communicating
interactively using chat or instant messaging tools. Chatting enables two or