DPB 30063 / MEASURES OF CENTRAL TENDENCY

3.2.3 Mode
Example 9

Compute the mode from the table below.

Class        Frequency
1.0 – 1.4    1
1.5 – 1.9    7
2.0 – 2.4    8
2.5 – 2.9    6
3.0 – 3.4    4
3.5 – 3.9    18
4.0 – 4.4    7
4.5 – 4.9    11
5.0 – 5.4    9
Solution

Example 10

The table below shows the distribution of test scores obtained by 42 students in a Business Mathematics class. Find the mode.

Scores obtained    Number of students
80 – 90            1
90 – 100           2
100 – 110          5
110 – 120          10
120 – 130          15
130 – 140          7
140 – 150          2
Solution
3.2.4 Relationship between mean, median and mode

• The concept of skewness helps us to understand the relationship between the three measures: mean, median and mode.
• The mode is the highest point of the curve and the median is the middle value.
• The mean is usually located somewhere towards the tail of the distribution because the mean is affected by all values, including the extreme ones.
• A bell-shaped or normal distribution has no skewness: the mean, median and mode are all at the centre of the distribution.

i. Positively skewed or skewed to the right
In a set of data, if the mode is less than the median and the median is less than the mean, the distribution is skewed to the right or positively skewed.
Mode < median < mean
[Figure: positively skewed curve with the mode, median and mean marked from left to right]
ii. Negatively skewed or skewed to the left
In a set of data, if the mode is more than the median and the median is more than the mean, the distribution is skewed to the left or negatively skewed.
Mode > median > mean
[Figure: negatively skewed curve with the mean, median and mode marked from left to right]

iii. Symmetrical or bell-shaped
If the mean, median and mode are equal, the distribution is called symmetrical or bell-shaped.
mean = median = mode
[Figure: symmetrical curve with the mean, median and mode at the centre]
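The three cases above can be sketched as a small Python function. This is only an illustration: the function name `classify_shape` and the sample values are made up for the sketch, and any ordering that does not match one of the three stated rules is reported as "undetermined".

```python
def classify_shape(mean, median, mode):
    """Classify a distribution using the three rules stated in this section."""
    if mean == median == mode:
        return "symmetrical"
    if mode < median < mean:
        return "positively skewed"
    if mode > median > mean:
        return "negatively skewed"
    # The three measures do not follow any of the stated orderings.
    return "undetermined"


# Hypothetical values, not taken from the workbook examples.
print(classify_shape(mean=60, median=55, mode=50))
print(classify_shape(mean=40, median=45, mode=50))
print(classify_shape(mean=30, median=30, mode=30))
```

The function compares the values exactly, so rounded figures should be rounded consistently before they are passed in.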
Example 11

Identify the shape of each distribution below.
a. If mean = 56, median = 68, mode = 50 (Answer : )
b. If median = 125, mode = 145, mean = 115 (Answer : )
c. If mode = 255, median = 255, mean = 255 (Answer : )
SELF REVISION

QUESTION 1
The cost per load (in cents) of laundry detergents tested by a consumer organization is shown below:

Class Limits    Frequency
13 – 19         2
20 – 26         7
27 – 33         12
34 – 40         5
41 – 47         6
48 – 54         1
55 – 61         0
62 – 68         2

Calculate:
a. Mean
b. Median
c. Mode

QUESTION 2
a. Affan has been working on programming and updating a website for his company for the past 7 months. The following data represent the number of hours that Affan worked each month:
24, 25, 31, 50, 53, 66, 78
Based on the above data, you are required to find:
i. Mean
ii. Median
iii. Mode
b. A salesman keeps a record of the number of shops he visits each day.

Shops visited    Frequency
0 – 9            3
10 – 19          8
20 – 29          21
30 – 39          60
40 – 49          21

Based on the frequency table above, you are required to calculate:
i. Mean
ii. Median
iii. Mode
c. Determine the form of the distribution.

QUESTION 3
a. Give the definition of mean.
b. List the THREE steps to calculate the median for ungrouped data.
c. Below is the data collected in a study regarding the orders of baju kurung at Seri Indah Boutique during the Aidilfitri festival in the year 2004. Calculate the mean, mode and median (using the empirical rule).

Class      Number of orders
10 – 19    85
20 – 29    120
30 – 39    225
40 – 49    135
50 – 59    105
60 – 69    30
Total      700
d. Give TWO advantages and TWO disadvantages of using the mean as a measure of central tendency.

QUESTION 4
The following is the number of books borrowed by students per semester, as recorded by the library management of a polytechnic.

Number of books borrowed    Frequency
30 – 39                     7
40 – 49                     11
50 – 59                     19
60 – 69                     25
70 – 79                     18
80 – 89                     11
90 – 99                     9

a. From the table, compute:
i. Mean
ii. Median
iii. Mode
b. State the conclusions that can be drawn from the answers to a) i, ii and iii.
DPB 30063 / MEASURES OF DISPERSION AND SKEWNESS

CHAPTER 4 : MEASURES OF DISPERSION AND SKEWNESS
Learning Objectives:
At the end of this chapter, students should be able to:
1. Explain the measures of dispersion.
2. Calculate the measures of dispersion for ungrouped data.
3. Calculate the measures of dispersion for grouped data.
4. Calculate the coefficient of variation.
5. Calculate the measures of skewness.

Introduction
• Measures of dispersion help us to understand the spread or variability of a set of data.
• They give additional information for judging the reliability of the measure of central tendency and help us to compare the dispersion present in various samples.
• When a distribution is widely spread, its measure of central tendency is less reliable and should not be relied on alone for decision making.
• For example, a financial analyst knows that widely dispersed earnings indicate a high risk to stockholders and creditors, whereas a small dispersion of earnings indicates stable earnings and therefore a lower level of risk.
• Measures of dispersion can be calculated for two types of data: ungrouped data and grouped data.
• The measures of dispersion are the range, mean deviation, variance, standard deviation and coefficient of variation.

4.1 Measure of Dispersion for Ungrouped Data

4.1.1 Range
The range is the simplest measure of dispersion. It is the difference between the highest value and the lowest value in a set of data.
FORMULA: $\text{Range} = \text{highest value} - \text{lowest value}$

4.1.2 Mean Deviation
Mean deviation is calculated by summing the absolute differences between each observation and the mean. This sum is then divided by the number of observations.
FORMULA: $\text{Mean deviation} = \dfrac{\sum |x - \bar{x}|}{n}$

4.1.3 Variance and Standard Deviation
• The variance and standard deviation are measures of the average scatter around the mean.
• They measure the fluctuations of the data values above and below the mean.
• A large standard deviation means large variability within the data set.
• When comparing distributions with different means and variances, a useful measure is the coefficient of variation (CV).
• The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.

FORMULA:
$\text{Variance, } s^2 = \dfrac{1}{n-1}\left[\sum x^2 - \dfrac{(\sum x)^2}{n}\right]$
$\text{Standard deviation, } s = \sqrt{s^2}$
$\text{Coefficient of variation, } CV = \dfrac{s}{\bar{x}} \times 100\%$

Example 1
Consider the following sample:
41, 55, 30, 38, 50, 42, 39, 25
Find:
i. Range
ii. Mean deviation
iii. Variance and standard deviation
iv. Coefficient of variation
Solution
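As a way of checking a hand calculation for Example 1, the four measures can be sketched in Python. The sketch assumes the sample form of the variance (divisor n − 1) and the mean deviation taken as the average absolute deviation from the mean; the variable names are illustrative only.

```python
import math

data = [41, 55, 30, 38, 50, 42, 39, 25]  # sample from Example 1

n = len(data)
mean = sum(data) / n

# Range: highest value minus lowest value.
value_range = max(data) - min(data)

# Mean deviation: average absolute distance from the mean.
mean_deviation = sum(abs(x - mean) for x in data) / n

# Sample variance in shortcut form (assumes divisor n - 1).
variance = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

# Coefficient of variation as a percentage of the mean.
cv = std_dev / mean * 100

print("Range:", value_range)
print("Mean deviation:", mean_deviation)
print("Variance:", round(variance, 4))
print("Standard deviation:", round(std_dev, 4))
print("Coefficient of variation (%):", round(cv, 2))
```

The same script can be reused for other ungrouped data by replacing the list `data`.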
Example 2
Ali's monthly earnings for a year (in RM) are as follows:
139 150 151 151 157 158 160 161 162 162 173 175
Find:
i. Range
ii. Mean deviation
iii. Variance and standard deviation
iv. Coefficient of variation

Solution
$\text{Variance} = \dfrac{1}{n-1}\left[\sum x^2 - \dfrac{(\sum x)^2}{n}\right]$

4.2 Measure of Dispersion for Grouped Data

4.2.1 Range
FORMULA:

4.2.2 Mean Deviation
FORMULA:
4.2.3 Variance and Standard Deviation
FORMULA:

Example 3
For the following data find:
i. Range
ii. Mean deviation
iii. Variance and standard deviation
iv. Coefficient of variation

Price      f
30 – 39    8
40 – 49    20
50 – 59    32
60 – 69    28
70 – 79    23
80 – 89    9

Solution
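A grouped-data calculation such as Example 3 can be sketched in Python as follows. The sketch rests on assumptions that are not stated on this page: each class is represented by its midpoint, the variance uses the sample form with divisor Σf − 1, and the mean deviation is the frequency-weighted average absolute deviation. The range is left out because conventions for grouped data differ.

```python
import math

# (lower limit, upper limit, frequency) for each class in Example 3
classes = [
    (30, 39, 8),
    (40, 49, 20),
    (50, 59, 32),
    (60, 69, 28),
    (70, 79, 23),
    (80, 89, 9),
]

# Assumption: the class midpoint stands in for every value in the class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n
mean_deviation = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / n

# Assumption: sample variance with divisor (n - 1).
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean * 100

print("Total frequency:", n)
print("Mean:", round(mean, 4))
print("Mean deviation:", round(mean_deviation, 4))
print("Variance:", round(variance, 4))
print("Standard deviation:", round(std_dev, 4))
print("Coefficient of variation (%):", round(cv, 2))
```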
Example 4
Find the range, mean deviation, variance, standard deviation and coefficient of variation for the following data:

Total Asset    Number
10 – 14        8
14 – 18        10
18 – 22        15
22 – 26        12
26 – 30        6
30 – 34        2

Solution
Example 5
Typist Deeja can type 40 words per minute with a standard deviation of 5, while typist Aisah can type 160 words per minute with a standard deviation of 10. Which typist is more consistent in her work?

Solution

4.3 Measure of Skewness
A measure of skewness describes the difference between the mean and the mode of a distribution. It can be summarized as follows:

If mode < median < mean           Distribution is positively skewed (skewed to the right)
If mode > median > mean           Distribution is negatively skewed (skewed to the left)
If mode = median = mean           Distribution is symmetrical (bell-shaped), skewness = 0
4.3.1 Pearson's Coefficient of Skewness 1 (PCS 1)
Pearson's coefficient of skewness is usually used to measure the skewness of a distribution.
Formula: $PCS_1 = \dfrac{\text{mean} - \text{mode}}{\text{standard deviation}}$

4.3.2 Pearson's Coefficient of Skewness 2 (PCS 2)
Formula: $PCS_2 = \dfrac{3(\text{mean} - \text{median})}{\text{standard deviation}}$

Example 6
Given that the mean, mode and standard deviation of a set of data are 4, 5 and 0.5 respectively, find Pearson's coefficient of skewness and explain the distribution.

Solution:

Example 7
The table below gives the distribution of the prices of 136 bags. Calculate Pearson's coefficient of skewness 1 and 2. Determine the type of distribution in terms of skewness.

Price                      Frequency
100 and less than 200      16
200 and less than 300      23
300 and less than 400      29
400 and less than 500      36
500 and less than 600      32
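Pearson's two coefficients are commonly defined as (mean − mode) ÷ standard deviation and 3 × (mean − median) ÷ standard deviation. A minimal Python sketch of both, assuming those definitions, is given here. The function names are illustrative; the first call uses the figures from Example 6 and the second uses made-up figures.

```python
def pcs1(mean, mode, std_dev):
    """Pearson's first coefficient of skewness: (mean - mode) / s."""
    return (mean - mode) / std_dev


def pcs2(mean, median, std_dev):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


def describe(coefficient):
    """Interpret the sign of a skewness coefficient."""
    if coefficient > 0:
        return "positively skewed"
    if coefficient < 0:
        return "negatively skewed"
    return "symmetrical"


# Figures from Example 6: mean = 4, mode = 5, standard deviation = 0.5.
example_6 = pcs1(mean=4, mode=5, std_dev=0.5)
print(example_6, describe(example_6))

# Hypothetical figures, used only to show the second coefficient.
hypothetical = pcs2(mean=50, median=48, std_dev=6)
print(hypothetical, describe(hypothetical))
```

A negative coefficient means the mean lies below the mode (or median), so the longer tail is on the left.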
Solution

Price        Freq
100 – 200    16
200 – 300    23
300 – 400    29
400 – 500    36
500 – 600    32
SELF REVISION

Question 1
The data below refer to the age and number of customers of a grocery store in one week.

Age (years)    Customers
5 – 9          3
10 – 14        7
15 – 19        11
20 – 24        18
25 – 29        28
30 – 34        17
35 – 39        14
40 – 44        8
45 – 49        4

Based on the information above, you are required to calculate:
a. Mean deviation
b. Standard deviation
c. The coefficient of variation

Question 2
a. The table below shows the weight of 100 honeydews produced by Farm X.

Weight ('00 grams)    Frequency
4 – 6                 4
6 – 8                 9
8 – 10                34
10 – 12               25
12 – 14               28

i. Calculate the mean of the weight.
ii. Calculate the standard deviation of the weight.
Farm Z produced honeydews with a mean weight of 1350 grams and a standard deviation of 250 grams. Which farm has the more consistent weight?
Question 3
a. i. Define range.
ii. Identify the range of the following data:

Height (cm)    Frequency
150 – 155      5
155 – 160      10
160 – 165      8
165 – 170      6

b. The following table shows the time (in minutes) taken by 40 students to complete a test.

Time (minutes)    Number of students
70.1 – 80.0       2
80.1 – 90.0       4
90.1 – 100.0      15
100.1 – 110.0     8
110.1 – 120.0     11

Calculate:
i. variance
ii. standard deviation

c. Around 60 college students were randomly chosen and their scores for a particular Statistics quiz are summarized below:
Mean = 58.4
Median = 56.8
Mode = 54.2
Variance = 21.16
i. Calculate Pearson's Coefficient of Skewness 2.
ii. Determine the skewness of the distribution by sketching the graph.
DPB 30063 / CORRELATION AND REGRESSION

CHAPTER 5 : CORRELATION AND REGRESSION
Learning Objectives:
At the end of this chapter, students should be able to:
1. Explain the concept of correlation.
2. Construct a scatter diagram.
3. Calculate the linear coefficient of correlation.
4. Show the concept of regression.

Introduction
• Sometimes two variables are found to be related to each other in some way.
• A change in one variable might cause another variable to change, owing to the influence of the first variable on the second.
• For example, an increase in the price of sugar may cause the price of certain foods to increase.
• Higher sugar prices increase the production cost of these foods, so the food manufacturers have to increase their selling prices.
• The same scenario applies when the price of petroleum increases.
• Correlation analysis is a statistical method used to measure the strength of the relationship between two variables.
• Regression analysis is a statistical technique that can be used to obtain the equation relating the two variables.

5.1 Scatter Diagram
• A scatter diagram can be used to determine the strength of the relationship between two variables.
• A scatter diagram is a plotted graph.
• Normally the independent variable is plotted on the horizontal axis and the dependent variable on the vertical axis.
• The points on the scatter diagram form patterns that show the strength of the relationship between the two variables.
[Figures: scatter diagrams showing perfect positive correlation and perfect negative correlation, followed by three further patterns labelled (a), (b) and (c)]
Example 1
The manager of ABC System randomly selected 10 sales representatives and determined the number of sales calls each one made last month and the number of units of the product he or she sold last month. Plot a scatter diagram and determine the strength of the correlation.

Sales Representative    Number of Sales Calls    Number of Units Sold
Mohd Haziq              14                       28
Siti Nabilah            35                       66
Amirul Aiman            22                       38
Atia Aisyah             29                       70
Puteri Afiqah           6                        22
Nor Batrisya            15                       27
Yusof Hafiz             17                       28
Siti Nabihah            20                       47
Amir Luqman             12                       14
Mohd Afiq               29                       68

Solution
Example 2
The following sample observations were selected. Draw a scatter diagram and determine the strength of the relationship between the two variables.

x    5    3    6    3    4    4    6    8    7
y    13   15   7    13   13   11   9    5    10

Solution

5.2 Correlation
• The linear correlation coefficient provides a measure for evaluating the strength of the relationship.
• Two methods are commonly used for this purpose:
i. Pearson's product moment correlation coefficient
ii. Spearman's rank correlation coefficient

5.2.1 Pearson's product moment correlation coefficient (r)
• It tells us two aspects of the relationship between two variables.
• The sign (negative or positive) of r identifies the kind of relationship between the two quantitative variables, and the magnitude of r describes the strength of the relationship.
• The value of r lies between −1.0 and +1.0.
• A correlation coefficient close to −1.0 indicates that the two variables have a strong negative relationship.
• A negative relationship means that as one variable increases, the other tends to decrease.
• A value close to +1.0 indicates that the two variables have a strong positive relationship.
• A positive relationship means that as one variable increases, the other tends to increase, and vice versa.

Formula: $r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$

• There are six types of strength of correlation:
i. Strong positive correlation (0.66 < r < 0.99)
ii. Moderate positive correlation (0.33 < r < 0.65)
iii. Weak positive correlation (0.01 < r < 0.32)
iv. Strong negative correlation (−0.99 < r < −0.66)
v. Moderate negative correlation (−0.65 < r < −0.33)
vi. Weak negative correlation (−0.32 < r < −0.01)

Example 3
ABZ Enterprise has collected data on its production and total costs. The results are given below. Calculate the coefficient of correlation for the data.

Production    65   63   76   46   68   72   68   57   36   39
Total Cost    68   66   86   48   65   66   71   57   42   87
Solution

Production    Total Cost
65            68
63            66
76            86
46            48
68            65
72            66
68            71
57            57
36            42
96            87

Formula
Example 4
The data for the problem involving the number of sales calls and the number of units sold are given below. Determine the coefficient of correlation.

Representative    Sales Calls    Units Sold
Mohd Haziq        14             28
Siti Nabilah      35             66
Amirul Aiman      22             38
Atia Aisyah       29             70
Puteri Afiqah     6              22
Nor Batrisya      15             27
Isa Dzarif        17             28
Siti Nabihah      20             47
Amir Luqman       12             14
Mohd Afiq         29             68

Solution
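A hand calculation for Example 4 can be checked with a short Python sketch. It assumes the usual computational form of Pearson's coefficient, r = (nΣxy − ΣxΣy) ÷ √([nΣx² − (Σx)²][nΣy² − (Σy)²]); the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient (computational form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Data from Example 4.
sales_calls = [14, 35, 22, 29, 6, 15, 17, 20, 12, 29]
units_sold = [28, 66, 38, 70, 22, 27, 28, 47, 14, 68]

r = pearson_r(sales_calls, units_sold)
print("r =", round(r, 4))
```

The sign of `r` gives the direction of the relationship and its magnitude gives the strength, which can then be compared with the ranges listed in section 5.2.1.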
5.2.2 Spearman's Rank Correlation Coefficient (ρ)
• Spearman's rank correlation allows us to study the relationship between sets of ranked data.

Formula: $\rho = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$, where d is the difference between the two ranks of each observation and n is the number of pairs.

• Like the coefficient of correlation, the coefficient of rank correlation can assume any value from −1.00 up to +1.00.
• A value of −1.00 indicates perfect negative correlation among the ranks.
• A value of +1.00 indicates perfect positive correlation among the ranks.
• A rank correlation of 0 indicates that there is no association among the ranks.

Example 5
The following sample observations were selected.

x    5    3    6    3    4    4    6    8
y    13   15   7    12   13   11   9    5

i. Identify the ranking using ascending order.
ii. Identify the ranking using descending order.
iii. Calculate Spearman's rank correlation.
Solution

X    Y
3    13
5    15
6    7
3    12
4    13
4    11
6    9
8    5

X    Y
3    13
5    15
6    7
3    12
4    13
4    11
6    9
8    5
Example 6
A department at a faculty offers both full-time and part-time classes for the Master in Quantitative Sciences. At the end of the program, each graduating student was asked to rank the eight courses in the program according to preference: "1" indicates the most favorite and "8" the least favorite. The average ranks for each course given by the full-time and part-time students were recorded as follows.

Courses                    Ranking by full-time students    Ranking by part-time students
Operation Management       1                                7
Forecasting                4                                8
Economics                  3                                5
Research Methodology       8                                1
Organizational Behavior    6                                3
Stochastics Models         7                                2
Quantitative Marketing     2                                6
Final Project              5                                4
Solution

Courses                    Ranking by full-time students (x)    Ranking by part-time students (y)
Operation Management       1                                    7
Forecasting                4                                    8
Economics                  3                                    5
Research Methodology       8                                    1
Organizational Behavior    6                                    3
Stochastics Models         7                                    2
Quantitative Marketing     2                                    6
Final Project              5                                    4
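Because the data in Example 6 are already ranks with no ties, the calculation can be sketched directly in Python. The sketch assumes the standard formula ρ = 1 − 6Σd² ÷ (n(n² − 1)), which is valid when there are no tied ranks; the function name `spearman_rho` is illustrative.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation for two lists of ranks without ties."""
    n = len(rank_x)
    # d is the difference between the two ranks given to the same item.
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Ranks from Example 6, in the order the courses are listed.
full_time = [1, 4, 3, 8, 6, 7, 2, 5]
part_time = [7, 8, 5, 1, 3, 2, 6, 4]

sum_d2 = sum((x - y) ** 2 for x, y in zip(full_time, part_time))
rho = spearman_rho(full_time, part_time)

print("Sum of squared rank differences:", sum_d2)
print("rho =", round(rho, 4))
```

When the raw data contain tied values, as in Example 5, the tied observations are normally given the average of the ranks they occupy before the formula is applied.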
5.3 Regression
• Regression analysis is a statistical technique that can be used to obtain the equation relating the two variables.
• A more accurate regression line is obtained using the least squares method.
• A line with a positive slope indicates that there is a direct relationship between the two variables.
• This means that if x increases, y will also increase, and vice versa.
• A negative slope indicates an inverse relationship between the two variables.
• This means that if x increases, y will decrease, and if x decreases, y will increase.
• Thus x and y always move in opposite directions.
• The linear regression for sample data can be written in the form y = a + bx.
• x is the independent variable and y is the dependent variable, while a and b are constants.
• The regression line can be used to forecast the value of y for a given value of x in the domain.
• The accuracy of the forecast depends on the strength of the relationship between the two variables.
• The stronger the relationship between x and y, the more accurate the forecast.

Formula:
$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
$a = \bar{y} - b\bar{x}$
Example 7
Consider the data below. What is the regression equation? Find the estimated monthly sales if the number of sales calls is 6 and 29.

Sales Representative    Sales Calls    Units Sold
Mohd Ali                14             28
Budiyanto               35             66
Chong Wei               22             38
Misbun                  29             70
Safawi                  6              22
Kamarul                 15             27
Rajagopal               17             28
Roslina                 20             47
Swee Lee                12             14
Siti Rahimah            29             68

Solution
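The least squares line for Example 7 can be sketched in Python to check a hand calculation. The sketch assumes the standard least squares formulas b = (nΣxy − ΣxΣy) ÷ (nΣx² − (Σx)²) and a = ȳ − b·x̄ for the line y = a + bx; the function name `least_squares` is illustrative.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n  # a = mean of y minus b times mean of x
    return a, b


# Data from Example 7.
sales_calls = [14, 35, 22, 29, 6, 15, 17, 20, 12, 29]
units_sold = [28, 66, 38, 70, 22, 27, 28, 47, 14, 68]

a, b = least_squares(sales_calls, units_sold)
print(f"y = {a:.4f} + {b:.4f}x")

# Estimated units sold for 6 and 29 sales calls.
for calls in (6, 29):
    print(calls, "calls ->", round(a + b * calls, 2), "units")
```

Both forecast values lie inside the observed range of sales calls (6 to 35), which is where the fitted line is most trustworthy.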
Example 8
A production manager collected data on production cost and the quantity produced for 10 consecutive days. These data are given below.

Day                      1    2    3    4    5    6    7    8    9    10
Quantity ('000 units)    10   13   20   18   17   15   16   14   11   12
Cost (RM '000)           20   28   38   35   33   30   34   29   23   25

a. Find the regression equation for production cost, y, on the production quantity, x, using the least squares method.
b. Explain the meaning of the constants a and b in the equation.
c. Estimate the production cost when the production quantity is 25,000 units. How much was the fixed cost? Explain.

Solution

Quantity    Cost
10          20
13          28
20          38
18          35
17          33
15          30
16          34
14          29
11          23
12          25
SELF REVISION

QUESTION 1
a. The survey results for the average monthly rents (in RM) of one-bedroom and two-bedroom apartments in randomly selected metropolitan areas are shown in Table 3. Determine whether there is a relationship between the rents by using Spearman's rank correlation coefficient.

One-bedroom, x    Two-bedroom, y
782               1223
486               902
451               739
529               954
618               1055
520               875
845               1455

b. A doctor wishes to know whether there is a relationship between a mother's weight (in kg) and her newborn baby's weight (in kg).

Mother's weight, x    Baby's weight, y
79.8                  3.0
72.6                  3.7
84.8                  4.2
95.3                  3.2
88.9                  4.0
64.4                  4.2
93.0                  3.4
97.5                  3.9

From the table above, you are required to:
i. Draw a scatter plot.
ii. Identify the regression equation, y = a + bx, using the least squares method.
QUESTION 2
(a) Seasons Delight wants to identify whether its advertising expenses have had a positive impact on sales over the last 7 months. The results are given in the table below:

Month       Advertisement Expenses (RM'000)    Sales (RM Million)
January     2                                  8
February    4                                  12
March       7                                  17
April       8                                  20
May         10                                 22
June        13                                 28
July        14                                 32

i. Relate the advertising expenses and sales by using Pearson's product moment correlation coefficient.
ii. Identify the regression line using the least squares method.
(b) i. Based on the table in (a), sketch an appropriate scatter diagram to show the relationship between advertising expenses and sales.
ii. Based on the answer in (a) (ii), calculate the sales if the advertising expenses are RM 20,000.00.

QUESTION 3
(a) The table below shows the Mathematics and Science test scores for five students:

Students    Mathematics    Science
AA          96             98
BB          89             75
CC          88             90
DD          66             50
EE          72             60

i. Draw a scatter diagram for the above data and determine the type of relationship between the Mathematics test scores and the Science test scores.
ii. Calculate Spearman's rank correlation.
DPB 30063 / PROBABILITY

CHAPTER 6 : ELEMENTARY OF PROBABILITY CONCEPTS
Learning Objectives:
At the end of this chapter, students should be able to:
1. Understand the basic concept of probability
2. Identify sample spaces, events and probability
3. Apply the addition rules for probability
4. Apply the multiplication rules for probability
5. Calculate probabilities using a tree diagram

Introduction
• Probability gives a measurement of the likelihood that a certain outcome will occur.
• It acts as a link between descriptive and inferential statistics.
• Probability is used to make statements about the occurrence or non-occurrence of an event under uncertain conditions.

6.1 Definition of Probability
• Probability is a value between 0 and 1, inclusive.
• It describes the chance or likelihood that an event will happen.
• A value near zero means the event is not likely to happen.
• A value near one means it is likely to happen.
• It is also said that a probability is a measure of the likelihood that an event will happen in the future.
• The probability of event A is denoted by P(A).

6.2 Use of probability theory
• Probability theory is a quantitative measure of uncertainty. A probability is a number that conveys the strength of our belief that a certain event will occur.
• It allows decision makers with only limited information to analyze the risks and minimize the gamble inherent in, for example, marketing a new product or accepting an incoming shipment that may contain defective parts.
• Probability theory is the basis for inferential statistics. In inferential statistics, we make decisions under conditions of uncertainty. Probability theory helps us to make decisions under such conditions of imperfect information and uncertainty. Combining probability and probability distributions with descriptive statistics helps us to make decisions about a population based on information obtained from samples.

6.3 Basic definition of terms used

Experiment
- A process that, when performed, results in one and only one of several possible observations.
- Example:
  Tossing a coin
  Rolling a fair die
  Measuring daily rainfall
  Recording a test grade

Outcome
- The result of a single trial of an experiment.
- Example: Getting a head or a tail when tossing a fair coin

Sample Space
- A sample space is the set of all possible outcomes of an experiment.
- Example: Consider the experiment of flipping two coins.
  Possible outcomes:
  Therefore the sample space is S = { }

Event
- A subset of the sample space that consists of one or more outcomes.
- Example: Consider an experiment of tossing a die and recording the number on the top face.
  If E is the event that an even number occurs, then E = { }.
  If F is the event that the number on the top face is more than 3, then F = { }
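The ideas of sample space and event map directly onto sets, which can be sketched in Python. This is only an illustration of the definitions using the two experiments mentioned above (flipping two coins and rolling one die); the variable names are made up for the sketch, and "H" and "T" stand for head and tail.

```python
from itertools import product

# Sample space for flipping two coins: every ordered pair of H and T.
coin_space = set(product("HT", repeat=2))

# Sample space for rolling one die: the numbers 1 to 6.
die_space = set(range(1, 7))

# Events are subsets of the sample space.
even_number = {face for face in die_space if face % 2 == 0}
more_than_3 = {face for face in die_space if face > 3}

print("Two coins:", sorted(coin_space))
print("Even number:", sorted(even_number))
print("More than 3:", sorted(more_than_3))

# With equally likely outcomes, P(event) = outcomes in event / outcomes in S.
print("P(even number) =", len(even_number) / len(die_space))
```

The last line relies on the assumption that every outcome of a fair die is equally likely.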
6.4 Independent Events
• Two events are independent if the occurrence of one does not affect the probability of the occurrence of the other.
• An example would be rolling a die and flipping a coin. Rolling the die does not affect the probability of getting a head when flipping the coin.

6.5 Dependent Events
• If the occurrence of one event affects the probability of the occurrence of another event, the two events are said to be dependent events.
• An example is picking two balls from a box one after another, where the first ball picked is not replaced before the second ball is picked.

Example 1
State briefly whether the following pairs of events are independent or dependent:
a. Getting two successive 6s when rolling a fair die twice.
Answer:
b. Earning a high level of income and paying a high amount of income tax.
Answer:
c. Being a doctor and having blue eyes.
Answer:
d. Having a high absenteeism rate and failing an examination.
Answer:

6.6 Venn Diagram
• The objective of using a Venn diagram is to represent sets, subsets, unions and intersections of sets, and the differences between two sets.
• A Venn diagram is an illustration in which shapes, most commonly circles, represent groups of items that usually share common characteristics. It is used for visualizing logical relationships. Most importantly, it can be used in solving problems that are often encountered.
• An event can also be illustrated using a Venn diagram. A Venn diagram is a rectangle representing the whole space (universal set) with circles inside representing various subspaces (other subsets of the universal set). It provides a convenient way to represent a sample space. There are two situations of events.

a. Union event
Let A and B be two events defined in a sample space. The union of events A and B is the event that occurs when either A or B or both occur. It is denoted by A ∪ B.
[Figure: Venn diagram of sample space S with circles A and B, union shaded]

b. Intersection event
Let A and B be two events defined in a sample space. The intersection of events A and B is the event that occurs when both A and B occur. It is denoted by either A ∩ B or AB.
[Figure: Venn diagram of sample space S with circles A and B, intersection shaded]

Example 2
A statistical experiment has 10 equally likely outcomes that are denoted by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Let event A = { 5, 7, 9 }, B = { 1, 2, 5, 10 } and C = { 1, 4, 6, 8 }. Draw a Venn diagram. List the outcomes of:
i. A ∪ B
ii. A ∩ B

Solution :
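Union and intersection correspond to the `|` and `&` operators on Python sets, so the listing part of Example 2 can be sketched as follows. The variable names are illustrative, and the probability line assumes the 10 outcomes are equally likely, as the example states.

```python
# Sample space and events from Example 2.
S = set(range(1, 11))
A = {5, 7, 9}
B = {1, 2, 5, 10}
C = {1, 4, 6, 8}

union_ab = A | B          # outcomes in A or B or both
intersection_ab = A & B   # outcomes in both A and B

print("A union B:", sorted(union_ab))
print("A intersection B:", sorted(intersection_ab))

# Equally likely outcomes: probability = size of event / size of S.
print("P(A union B) =", len(union_ab) / len(S))
```

Events with no outcome in common, such as A and C here, have an empty intersection.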
6.7 Two Way Table
A two way table organizes two sets of categorical data (which can sometimes be numeric, such as years) using rows and columns. When reading a two way table, many different sets of information can be considered. Every column, every row and every intersection of a column and a row can support a different conclusion about the data set.

Calculate percentages based on the two way table below.
a. What percentage of people responded that they do not like skateboards and do not like snowmobiles?
b. What percentage of the people who like skateboards do not like snowmobiles?

                           Like Skateboard    Do Not Like Skateboard    Totals
Like Snowmobiles           80                 25                        105
Do Not Like Snowmobiles    45                 10                        55
Totals                     125                35                        160
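The difference between the two questions lies in the denominator: the first uses the grand total, the second uses only the people who like skateboards. A minimal Python sketch of that reading of the table is shown here; the dictionary layout and key names are illustrative choices, not part of the workbook.

```python
# Counts from the two way table, keyed by (snowmobile answer, skateboard answer).
counts = {
    ("like snow", "like skate"): 80,
    ("like snow", "not skate"): 25,
    ("not snow", "like skate"): 45,
    ("not snow", "not skate"): 10,
}

grand_total = sum(counts.values())

# a. Share of all respondents who like neither.
pct_neither = counts[("not snow", "not skate")] / grand_total * 100

# b. Among those who like skateboards, the share who do not like snowmobiles.
like_skate_total = counts[("like snow", "like skate")] + counts[("not snow", "like skate")]
pct_not_snow_given_skate = counts[("not snow", "like skate")] / like_skate_total * 100

print("Grand total:", grand_total)
print("a.", pct_neither, "%")
print("b.", pct_not_snow_given_skate, "%")
```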
Finishing a partially completed table
Use what you know to calculate the missing values using addition and subtraction.

Example 3
A survey is carried out on students and teachers, asking them whether they prefer football or hockey. Complete the following two way table and calculate the probabilities.

            Football    Hockey    Total
Student     33          b         c
Teachers    a           1         69
Total       e           55        d

a. A student is picked at random. What is the probability that they prefer football?
b. Someone who prefers football is picked at random. What is the probability that they are a teacher?
c. Someone from the survey is picked at random. What is the probability that they are a teacher who prefers hockey?

After School Activity

          Yes    No    Total
Male      d      40    e
Female    c      b     95
Total     102    a     187