# UMUC STAT200 full course latest 2015 spring december [all homework, quizzes and final exam]

Question

UMUC STAT-200 Homework Assignments Week #1

Textbook #1: Lane et al. Introduction to Statistics, David M. Lane et al., 2013.

(http://onlinestatbook.com/Online_Statistics_Education.pdf)

Textbook #2: Illowsky et al. Introductory Statistics, Barbara Illowsky et al., 2013. (http://openstaxcollege.org/files/textbook_version/hi_res_pdf/15/col11562-op.pdf)

Lane – Chapter 1: 3,10,17,18

3. If you are told only that you scored in the 80th percentile, do you know from that description exactly how it was calculated? Explain.

10. For the numbers 1, 2, 4, 16, compute the following: ΣX, ΣX², and (ΣX)²
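The three quantities in exercise 10 differ only in where the squaring happens. A minimal Python sketch using the four numbers from the exercise (the variable names are illustrative, not from the textbook):

```python
# Scores from exercise 10.
x = [1, 2, 4, 16]

sum_x = sum(x)                           # ΣX: add the scores
sum_x_squared = sum(v ** 2 for v in x)   # ΣX²: square each score, then add
squared_sum_x = sum(x) ** 2              # (ΣX)²: add the scores, then square the total

print(sum_x, sum_x_squared, squared_sum_x)
```

Squaring before summing and summing before squaring generally give different results, which is the point of the exercise.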

For #17 and #18 see the ADHD Treatment Case Study

(Page 624, http://onlinestatbook.com/2/case_studies/adhd.html)

17. What is the independent variable of this experiment? How many levels does it have?

18. What is the dependent variable? On what scale (nominal, ordinal, interval, ratio) was it measured?

Illowsky – Chapter 1: 50,52,72,80,84,90

Use the following information to answer #50 and #52:

A Lake Tahoe Community College instructor is interested in the mean number of days Lake Tahoe Community College math students are absent from class during a quarter.

50. What is the population she is interested in?

a. all Lake Tahoe Community College students

b. all Lake Tahoe Community College English students

c. all Lake Tahoe Community College students in her classes

d. all Lake Tahoe Community College math students

52. The instructor’s sample produces a mean number of days absent of 3.5 days. This value is an example of a:

a. parameter.

b. data.

c. statistic.

d. variable.

72. A study was done to determine the age, number of times per week, and the duration (amount of time) of residents using a local park in San Jose. The first house in the neighborhood around the park was selected randomly and then every eighth house in the neighborhood around the park was interviewed. The sampling method was:

a. simple random

b. systematic

c. stratified

d. cluster

80. Fifty part-time students were asked how many courses they were taking this term. The (incomplete) results are shown below:

Table: Part-time Student Course Loads

a. Fill in the blanks in the table.

b. What percent of students take exactly two courses?

c. What percent of students take one or two courses?

84. Forbes magazine published data on the best small firms in 2012. These were firms which had been publicly traded for at least a year, have a stock price of at least $5 per share, and have reported annual revenue between $5 million and $1 billion. The table shows the ages of the chief executive officers for the first 60 ranked firms.

a. What is the frequency for CEO ages between 54 and 65?

b. What percentage of CEOs are 65 years or older?

c. What is the relative frequency of ages under 50?

d. What is the cumulative relative frequency for CEOs younger than 55?

e. Which graph shows the relative frequency and which shows the cumulative relative frequency?

90. Seven hundred and seventy-one distance learning students at Long Beach City College responded to surveys in the 2010-11 academic year. Highlights of the summary report are listed in Table 1.39.

a. What percent of the students surveyed do not have a computer at home?

b. About how many students in the survey live at least 16 miles from campus?

c. If the same survey were done at Great Basin College in Elko, Nevada, do you think the percentages would be the same? Why?

UMUC STAT-200 Homework Assignments Week #8

Dr. Brian Killough

Lane – Chapter 15: (1,5,8,10)

1. What is the null hypothesis tested by analysis of variance?

5. What is the difference between “N” and “n”?

8. What kind of skew does the F distribution have?

10. Assume an experiment is conducted with 5 conditions and 6 subjects in each condition. What are df(numerator) and df(denominator)?

Illowsky – Chapter 13 (61,63,69,71,77,81)

HINT: To create an ANOVA table, follow Examples 13.1 and 13.2 in your book. Another detailed example appears at the end of this homework assignment. DO NOT calculate SSwithin by hand. Use EXCEL (or calculate by hand) to find SStotal, and then calculate SSbetween. The value of SSwithin = SStotal – SSbetween. Use the equations in the summary table at the top of page 699 to find the F-statistic, and then find the P-value.
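The bookkeeping described in the hint can be sketched in Python. This is only an illustration under stated assumptions: the data below are hypothetical (not taken from any exercise), the function name `anova_table` is made up for this sketch, and the formulas are the standard one-way ANOVA definitions, which are assumed to match the summary table the hint refers to.

```python
from statistics import mean

# Hypothetical data: three groups of three observations each (NOT textbook data).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

def anova_table(groups):
    """Return sums of squares, degrees of freedom, mean squares, and F."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    n_total, k = len(all_values), len(groups)

    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = ss_total - ss_between      # SSwithin = SStotal - SSbetween

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total, "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

table = anova_table(groups)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

The F-statistic is the ratio of the between-group mean square to the within-group mean square; the P-value is then the upper-tail area of the F distribution with (df_between, df_within) degrees of freedom.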

Use the following information to answer #61 and #63. Suppose a group is interested in determining whether teenagers obtain their drivers licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their drivers licenses.

Null-Hypothesis is H0: μ1 = μ2 = μ3 = μ4 = μ5

Alternate Hypothesis is Ha: At least two of the group means μ1, μ2, …, μ5 are not equal.

61. Find the degrees of freedom (numerator) = df(num)

63. Find the F-statistic using an ANOVA table.

69. A researcher wants to know if the mean times (in minutes) that people watch their favorite news station are the same. The table (on right) shows the results of a study. Assume that all distributions are normal, the population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.

71. Are the mean number of times a month a person eats out the same for whites, blacks, Hispanics and Asians? The table (on right) shows the results of a study. Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.

77. A grassroots group opposed to a proposed increase in the gas tax claimed that the increase would hurt working-class people the most, since they commute the farthest to work. Suppose that the group randomly surveyed 24 individuals and asked them their daily one-way commuting mileage. The results are as follows (see table). Determine whether or not the variance in mileage driven is statistically the same among the groups. Use a 5% significance level.

81. Is the variance for the amount of money, in dollars, that shoppers spend on Saturdays at the mall the same as the variance for the amount of money that shoppers spend on Sundays at the mall? The table (on right) shows the results of a study. Assume a 10% significance level.

HINT: See Section 13.4 (Test of Two Variances) to find the F-statistic. Assume the null hypothesis is that the variances are the same and assume the alternate hypothesis is [formula image missing].

Example: Calculations in the Analysis of Variance (ANOVA)

Example Solution Table

P-value using F-distribution (use EXCEL formula)

FDIST (F, df1, df2) = FDIST(9.085,4,45) = 0.00001815
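For readers without Excel, the same upper-tail probability can be approximated with the Python standard library alone. The sketch below is an assumption-laden illustration, not Excel's algorithm: the helper name `f_upper_tail` is invented here, it uses the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1·f), and it integrates the Beta density numerically with Simpson's rule. It assumes f > 0, df1 ≥ 2, and df2 > 2.

```python
import math

def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) distribution.

    Assumes f_value > 0, df1 >= 2, df2 > 2, and an even number of steps.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)          # upper limit of the Beta integral
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Beta(a, b) density; it vanishes at t = 0 because a > 1 is assumed.
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    # Composite Simpson's rule on [0, x].
    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

# Values from the example above: F = 9.085 with 4 and 45 degrees of freedom.
p_value = f_upper_tail(9.085, 4, 45)
print(f"{p_value:.8f}")
```

The result should agree with the FDIST value quoted above to the digits shown there.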

UMUC STAT-200 Homework Assignments Week #2

Lane – Chapter 2: 7,9 and Chapter 3: 6,8,30,31

7. For the data from the 1977 Stat. and Biom. 200 class for eye color, construct:

a. pie graph

b. horizontal bar graph

c. vertical bar graph

d. a frequency table with the relative frequency of each eye color

9. Which of the box plots on the graph has a large positive skew? Which has a large negative skew?

6. You recorded the time in seconds it took for 8 participants to solve a puzzle. These times appear in the table on the right. However, when the data was entered into the statistical program, the score that was supposed to be 22.1 was entered as 21.2. You had calculated the following measures of central tendency: the mean, the median, and the mean trimmed 25%. Which of these measures of central tendency will change when you correct the recording error?

8. You know the minimum, the maximum, and the 25th, 50th, and 75th percentiles of a distribution. Which of the following measures of central tendency or variability can you determine? Mean, Median, Mode, Trimean, Geometric Mean, Range, Interquartile Range, Variance, Standard Deviation

For #30 and #31 see the ADHD Treatment Case Study

(Page 624, http://onlinestatbook.com/2/case_studies/adhd.html)

30. What is the mean number of correct responses of the participants after taking the placebo (0 mg/kg)?

31. What are the standard deviation and the interquartile range of the d0 condition?

Illowsky – Chapter 2: 78, 80, 84, 88

78. Twenty-five randomly selected students were asked the number of movies they watched the previous week. The results are as follows.

a. Construct a histogram of the data.

b. Complete the columns of the chart.

80. Use the following information: Suppose one hundred eleven people who shopped in a special t-shirt store were asked the number of t-shirts they own costing more than $19 each.

If the data were collected by asking the first 111 people who entered the store, then the type of sampling is:

a. cluster

b. simple random

c. stratified

d. convenience

84. Given the following box plot:

a. which quarter has the smallest spread of data? What is that spread?

b. which quarter has the largest spread of data? What is that spread?

c. find the interquartile range (IQR).

d. are there more data in the interval 5–10 or in the interval 10–13? How do you know this?

e. which interval has the fewest data in it? How do you know this?

i. 0–2

ii. 2–4

iii. 10–12

iv. 12–13

v. need more information

88. Given the following box plots, answer the questions.

a. In complete sentences, explain why each statement is false.

i. Data 1 has more data values above two than Data 2 has above two.

ii. The data sets cannot have the same mode.

iii. For Data 1, there are more data values below four than there are above four.

b. For which group, Data 1 or Data 2, is the value of “7” more likely to be an outlier? Explain why in complete sentences.

UMUC STAT-200 Homework Assignments Week #3

Lane – Chapter 5: 7,25,27

7. You flip a coin three times.

(a) What is the probability of getting heads on only one of your flips?

(b) What is the probability of getting heads on at least one flip?

25. You are to participate in an exam for which you had no chance to study, and for that reason you cannot do anything but guess on each question. All questions are multiple choice, so the chance of guessing the correct answer to each question is 1/d, where d is the number of options (distractors) per question; for a 4-choice question, your chance of guessing correctly is ¼ = 0.25. Your instructor offers you the opportunity to choose among the following exam formats:

I. 6 questions of the 4-choice type; you pass when 5 or more answers are correct;

II. 5 questions of the 5-choice type; you pass when 4 or more answers are correct;

III. 4 questions of the 10-choice type; you pass when 3 or more answers are correct.

Rank the three exam formats according to their attractiveness. It should be clear that the format with the highest probability to pass is the most attractive format. Which would you choose and why?

HINT: Use the Binomial Probability function
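The binomial probability function mentioned in the hint can be sketched in Python as follows. The helper name `prob_at_least` and the dictionary layout are illustrative choices, and the sketch assumes each question is guessed independently with success probability 1/d.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# (minimum correct answers to pass, number of questions, chance per guess)
formats = {
    "I": (5, 6, 1 / 4),
    "II": (4, 5, 1 / 5),
    "III": (3, 4, 1 / 10),
}

pass_prob = {name: prob_at_least(*args) for name, args in formats.items()}
ranking = sorted(pass_prob, key=pass_prob.get, reverse=True)

for name in ranking:
    print(f"Format {name}: P(pass) = {pass_prob[name]:.6f}")
```

Sorting the formats by pass probability gives the ranking the exercise asks for; the same function works for any "at least k successes" question.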

27. A refrigerator contains 6 apples, 5 oranges, 10 bananas, 3 pears, 7 peaches, 11 plums, and 2 mangos.

a. Imagine you stick your hand in this refrigerator and pull out a piece of fruit at random. What is the probability that you will pull out a pear?

b. Imagine now that you put your hand in the refrigerator and pull out a piece of fruit. You decide you do not want to eat that fruit so you put it back into the refrigerator and pull out another piece of fruit. What is the probability that the first piece of fruit you pull out is a banana and the second piece you pull out is an apple?

c. What is the probability that you stick your hand in the refrigerator one time and pull out a mango or an orange?

Illowsky – Chapter 3 (86,98,112,124) and Chapter 4 (72,80,88)

86. Roll two fair dice. Each die has six faces.

a. List the sample space.

b. Let A be the event that either a three or four is rolled first, followed by an even number. Find P(A).

c. Let B be the event that the sum of the two rolls is at most seven. Find P(B).

d. In words, explain what “P(A|B)” represents. Find P(A|B).

e. Are A and B mutually exclusive events? Explain your answer in one to three complete sentences, including numerical justification.

f. Are A and B independent events? Explain your answer in one to three complete sentences, including numerical justification.

98. At a college, 72% of courses have final exams and 46% of courses require research papers. Suppose that 32% of courses have a research paper and a final exam. Let F be the event that a course has a final exam. Let R be the event that a course requires a research paper.

a. Find the probability that a course has a final exam or a research project.

b. Find the probability that a course has NEITHER of these two requirements.

112. The table identifies a group of children by one of four hair colors, and by type of hair.

a. Complete the table.

b. What is the probability that a randomly selected child will have wavy hair?

c. What is the probability that a randomly selected child will have either brown or blond hair?

d. What is the probability that a randomly selected child will have wavy brown hair?

e. What is the probability that a randomly selected child will have red hair, given that he or she has straight hair?

f. If B is the event of a child having brown hair, find the probability of the complement of B.

g. In words, what does the complement of B represent?

124. Suppose that 10,000 U.S. licensed drivers are randomly selected.

a. How many would you expect to be male?

b. Using the table or tree diagram, construct a contingency table of gender versus age group.

c. Using the contingency table, find the probability that out of the age 20–64 group, a randomly selected driver is female.

72. You buy a lottery ticket to a lottery that costs $10 per ticket. There are only 100 tickets available to be sold in this lottery. In this lottery there are one $500 prize, two $100 prizes, and four $25 prizes. Find your expected gain or loss.
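An expected gain is the probability-weighted sum of the prizes minus the ticket price. A short Python sketch using the figures from exercise 72, assuming every one of the 100 tickets is equally likely to win each prize (variable names are illustrative):

```python
ticket_cost = 10
tickets = 100
prizes = {500: 1, 100: 2, 25: 4}   # prize amount -> number of such prizes

# Expected winnings per ticket: sum of amount * P(winning that amount).
expected_winnings = sum(amount * count / tickets for amount, count in prizes.items())
expected_gain = expected_winnings - ticket_cost

print(f"Expected winnings: ${expected_winnings:.2f}")
print(f"Expected gain:     ${expected_gain:.2f}")
```

A negative expected gain is an expected loss per ticket.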

80. Florida State University has 14 statistics classes scheduled for its Summer 2013 term. One class has space available for 30 students, eight classes have space for 60 students, one class has space for 70 students, and four classes have space for 100 students.

a. What is the average class size assuming each class is filled to capacity?

b. Space is available for 980 students. Suppose that each class is filled to capacity and select a statistics student at random. Let the random variable X equal the size of the student’s class. Define the PDF for X.

c. Find the mean of X.

d. Find the standard deviation of X.

88. A school newspaper reporter decides to randomly survey 12 students to see if they will attend Tet (Vietnamese New Year) festivities this year. Based on past years, she knows that 18% of students attend Tet festivities. We are interested in the number of students who will attend the festivities.

a. In words, define the random variable X.

b. List the values that X may take on.

c. Give the distribution of X. X ~ _____(_____,_____)

d. How many of the 12 students do we expect to attend the festivities?

e. Find the probability that at most four students will attend.

f. Find the probability that more than two students will attend.

UMUC STAT-200 Homework Assignments Week #4

Z-Tables (Normal Distribution) are at the end of this document

Lane – Chapter 7: 8,11,12

8. Assume the speed of vehicles along a stretch of I-10 has an approximately normal distribution with a mean of 71 mph and a standard deviation of 8 mph.

a. The current speed limit is 65 mph. What is the proportion of vehicles less than or equal to the speed limit?

b. What proportion of the vehicles would be going less than 50 mph?

c. A new speed limit will be initiated such that approximately 10% of vehicles will be over the speed limit. What is the new speed limit based on this criterion?

d. In what way do you think the actual distribution of speeds differs from a normal distribution?
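Parts a through c of exercise 8 are normal-distribution lookups, which can be done without a Z-table using Python's `statistics.NormalDist` (available in Python 3.8 and later). This is a sketch with illustrative variable names, not the course's prescribed method:

```python
from statistics import NormalDist

# Speeds assumed normal with mean 71 mph and standard deviation 8 mph.
speed = NormalDist(mu=71, sigma=8)

p_at_or_below_65 = speed.cdf(65)         # part a: proportion at or under 65 mph
p_below_50 = speed.cdf(50)               # part b: proportion under 50 mph
limit_10pct_over = speed.inv_cdf(0.90)   # part c: speed exceeded by 10% of vehicles

print(f"{p_at_or_below_65:.4f} {p_below_50:.4f} {limit_10pct_over:.2f}")
```

`cdf` gives the area to the left of a value, and `inv_cdf` goes the other way, from an area to a value, which is what a percentile question needs.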

11. A group of students at a school takes a history test. The distribution is normal with a mean of 25, and a standard deviation of 4.

(a) Everyone who scores in the top 30% of the distribution gets a certificate. What is the lowest score someone can get and still earn a certificate?

(b) The top 5% of the scores get to compete in a statewide history contest. What is the lowest score someone can get and still go on to compete with the rest of the state?

12. Use the normal distribution to approximate the binomial distribution and find the probability of getting 15 to 18 heads out of 25 flips. Compare this to what you get when you calculate the probability using the binomial distribution. Write your answers out to four decimal places.
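The comparison in exercise 12 can be sketched in Python. The approximation below assumes a fair coin and applies a continuity correction (integrating from 14.5 to 18.5), which is a common convention but an assumption of this sketch:

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5        # 25 flips of a fair coin (assumed)
lo, hi = 15, 18       # 15 to 18 heads, inclusive

# Exact binomial probability.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

# Normal approximation with mean n*p and standard deviation sqrt(n*p*(1-p)),
# widened by 0.5 on each side as a continuity correction.
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(hi + 0.5) - approx_dist.cdf(lo - 0.5)

print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```

Printing both values to four decimal places makes the size of the approximation error visible.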

Illowsky – Chapter 6 (60,66,76,88)

60. The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days. What is the median recovery time?

a. 2.7 b. 5.3 c. 7.4 d. 2.1

66. Height and weight are two measurements used to track a child’s development. The World Health Organization measures child development by comparing the weights of children who are the same height and the same gender. In 2009, weights for all 80 cm girls in the reference population had a mean μ = 10.2 kg and standard deviation σ = 0.8 kg. Weights are normally distributed. X ~ N(10.2, 0.8). Calculate the z-scores that correspond to the following weights and interpret them.

a. 11 kg b. 7.9 kg c. 12.2 kg

76. Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet.

a. If X = distance in feet for a fly ball, then X ~ _____(_____,_____)

b. If one fly ball is randomly chosen from this distribution, what is the probability that this ball traveled fewer than 220 feet?

c. Find the 80th percentile of the distribution of fly balls. Sketch the graph, and write the probability statement.

88. Facebook provides a variety of statistics on its Web site that detail the growth and popularity of the site. On average, 28 percent of 18 to 34 year olds check their Facebook profiles before getting out of bed in the morning. Suppose this percentage follows a normal distribution with a standard deviation of five percent.

a. Find the probability that the percent of 18 to 34-year-olds who check Facebook before getting out of bed in the morning is at least 30.

b. Find the 95th percentile, and express it in a sentence.

Illowsky – Chapter 7 (62,70,96)

62. Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet. We randomly sample 49 fly balls.

Assume X-bar is the “Random Variable” X and is defined by the Central Limit Theorem

a. If X-bar = average distance in feet for 49 fly balls, then X-bar ~ _______(_______,_______)

b. What is the probability that the 49 balls traveled an average of less than 240 feet?

c. Find the 80th percentile of the distribution of the average of 49 fly balls.
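Under the Central Limit Theorem, the mean of 49 fly balls is treated as normal with the same mean and a standard deviation of 50/√49. A Python sketch of parts b and c under that assumption (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 250, 50, 49

# Sampling distribution of the mean: N(mu, sigma / sqrt(n)).
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_mean_below_240 = xbar.cdf(240)   # part b
pct80 = xbar.inv_cdf(0.80)         # part c

print(f"{p_mean_below_240:.4f} {pct80:.2f}")
```

Compared with a single fly ball, the only change is the smaller standard deviation, so averages cluster more tightly around 250 feet.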

70. Which of the following is NOT TRUE about the distribution for averages?

a. The mean, median, and mode are equal.

b. The area under the curve is one.

c. The curve never touches the x-axis.

d. The curve is skewed to the right.

96. A typical adult has an average IQ score of 105 with a standard deviation of 20. If 20 randomly selected adults are given an IQ test, what is the probability that the sample mean scores will be between 85 and 125 points?

UMUC STAT-200 Homework Assignments Week #5

See the file named “UMUC_STAT200_EXCEL_Tips” at the “Course Materials” menu link to find functions for calculating the Normal Distribution and Student’s T-distribution values needed for this assignment.

Lane – Chapter 10: 4,12,15,18

4. Why is a 99% confidence interval wider than a 95% confidence interval?

12. A person claims to be able to predict the outcome of flipping a coin. This person is correct 16/25 times. Compute the 95% confidence interval on the proportion of times this person can predict coin flips correctly. What conclusion can you draw about this test of his ability to predict the future?
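A confidence interval on a proportion can be sketched in Python as below. The sketch uses the plain normal-approximation interval p ± z·SE; some texts also widen each limit by 0.5/N as a continuity correction, so both versions are computed. Which version the course expects is not stated here, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

correct, trials = 16, 25
p_hat = correct / trials

z = NormalDist().inv_cdf(0.975)                # two-sided 95% critical value
se = sqrt(p_hat * (1 - p_hat) / trials)        # standard error of the proportion

lower, upper = p_hat - z * se, p_hat + z * se

# Optional continuity correction of 0.5/N on each side (an assumption).
lower_cc, upper_cc = lower - 0.5 / trials, upper + 0.5 / trials

print(f"plain:     ({lower:.4f}, {upper:.4f})")
print(f"corrected: ({lower_cc:.4f}, {upper_cc:.4f})")
```

Checking whether 0.50 (pure guessing) falls inside the interval is one way to approach the conclusion the exercise asks about.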

15. You take a sample of 22 from a population of test scores, and the mean of your sample is 60. (a) You know the standard deviation of the population is 10. What is the 99% confidence interval on the population mean? HINT: Use the Z-score from the Normal Distribution. (b) Now assume that you do not know the population standard deviation, but the standard deviation in your sample is 10. What is the 99% confidence interval on the mean now? HINT: Use the Student’s T-table.

18. You were interested in how long the average psychology major at your college studies per night, so you asked 10 psychology majors to tell you the amount they study. They told you the following times: 2, 1.5, 3, 2, 3.5, 1, 0.5, 3, 2, 4. (a) Find the 95% confidence interval on the population mean. (b) Find the 90% confidence interval on the population mean. HINT: Be sure to use the Student’s T-table for each solution since you have a limited set of samples.

Illowsky – Chapter 8 (100,106,112,116,120,130)

100. Multiple choice … What is meant by the term “90% confident” when constructing a confidence interval for a mean?

a. If we took repeated samples, approximately 90% of the samples would produce the same confidence interval.

b. If we took repeated samples, approximately 90% of the confidence intervals calculated from those samples would contain the sample mean.

c. If we took repeated samples, approximately 90% of the confidence intervals calculated from those samples would contain the true value of the population mean.

d. If we took repeated samples, the sample mean would equal the population mean in approximately 90% of the samples.

106. Suppose that a committee is studying whether or not there is waste of time in our judicial system. It is interested in the mean amount of time individuals waste at the courthouse waiting to be called for jury duty. The committee randomly surveyed 81 people who recently served as jurors. The sample mean wait time was 8 hours with a sample standard deviation of 4 hours.

a. Find: X-bar, Sx, N, N-1

b. Define the random variables X and X-bar in words.

c. Which distribution should you use for this problem? Explain your choice.

d. Find the 95% confidence interval for the population mean time wasted.

e. Explain in a complete sentence what the confidence interval means.

112. In a recent sample of 84 used car sales costs, the sample mean was $6,425 with a standard deviation of $3,156. Assume the underlying distribution is approximately normal.

a. Which distribution should you use for this problem? Explain your choice.

b. Define the random variable X-bar in words.

c. Find the 95% confidence interval

d. Explain what a “95% confidence interval” means for this study.

116. Use the following information to answer the next two exercises: A quality control specialist for a restaurant chain takes a random sample of size 12 to check the amount of soda served in the 16 oz. serving size. The sample mean is 13.30 with a sample standard deviation of 1.55. Assume the underlying population is normally distributed. What is the error bound of the 95% confidence interval?

a. 0.87 b. 1.98 c. 0.99 d. 1.74

120. An article regarding interracial dating and marriage recently appeared in the Washington Post. Of the 1,709 randomly selected adults, 315 identified themselves as Latinos, 323 identified themselves as blacks, 254 identified themselves as Asians, and 779 identified themselves as whites. In this survey, 86% of blacks said that they would welcome a white person into their families. Among Asians, 77% would welcome a white person into their families, 71% would welcome a Latino, and 66% would welcome a black person. HINT: See Section 8.3 (A Population Proportion)

a. We are interested in finding the 95% confidence interval for the percent of all black adults who would welcome a white person into their families. Define the random variables X and P′, in words.

b. Which distribution should you use for this problem? Explain your choice.

c. Find the 95% confidence interval and error bound.

130. On May 23, 2013, Gallup reported that of the 1,005 people surveyed, 76% of U.S. workers believe that they will continue working past retirement age. The confidence level for this study was reported at 95% with a +/- 3% margin of error.

a. Determine the estimated proportion from the sample.

b. Determine the sample size.

c. Identify CL and α.

d. Calculate the error bound based on the information provided.

e. Compare the error bound in part d to the margin of error reported by Gallup. Are they the same?

f. Create a confidence interval for the results of this study.

UMUC STAT-200 Homework Assignments Week #6

See the file named “UMUC_STAT200_EXCEL_Tips” at the “Course Materials” menu link to find functions for calculating the Normal Distribution and Student’s T-distribution values needed for this assignment.

Lane – Chapter 11: (18)

18. You choose an alpha level of .01 and then analyze your data.

a. What is the probability that you will make a Type I error given that the null hypothesis is true?

b. What is the probability that you will make a Type I error given that the null hypothesis is false?

Lane – Chapter 12: (7,13)

7. Below are data showing the results of six subjects on a memory test. The three scores per subject are their scores on three trials (a, b, and c) of a memory task. Are the subjects getting better each trial? Test the linear effect of trial for the data in the table below.

a. Compute L (linear effect of trial) for each subject using the contrast weights -1, 0, and 1. That is, compute (-1)(a) + (0)(b) + (1)(c) for each subject. Make a new column in your table with this result.

b. Compute a one-sample t-test on this column (with the L values for each subject) you created.

HINT: See the example in the “Specific Comparisons” section of Chapter-12. Find the “t-value” and the “two-tailed probability” using the EXCEL “TDIST” function. Assume the statistic for this problem is “L” and use the following formula for the t-value. Also assume the hypothesized value is 0, since the contrast weighting (-1,0,+1) for a “perfect” set of data would make “L” be 0 in all cases.

t = (statistic – hypothesized value) / (standard error of the statistic)

t = (Mean of L – 0) / (Standard Error of L)

t-value = (X – 0) / (S / √N)

X = Sample Mean

S = Sample Standard Deviation

N = Number of L-Samples
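The contrast-score procedure can be sketched in Python. The scores below are hypothetical stand-ins (the exercise's data table is not reproduced here), and the variable names are illustrative:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores (trial a, trial b, trial c) for six subjects; NOT the textbook data.
scores = [(3, 5, 6), (2, 4, 7), (4, 4, 5), (1, 3, 3), (5, 6, 9), (2, 2, 4)]
weights = (-1, 0, 1)   # linear contrast weights from part a

# Part a: L = (-1)(a) + (0)(b) + (1)(c) for each subject.
L = [sum(w * s for w, s in zip(weights, row)) for row in scores]

# Part b: one-sample t-test of L against a hypothesized value of 0.
n = len(L)
std_error = stdev(L) / sqrt(n)     # S / sqrt(N), with S the sample standard deviation
t_value = (mean(L) - 0) / std_error
df = n - 1

print(L, round(t_value, 3), df)
```

The two-tailed probability for the resulting t and N – 1 degrees of freedom would then come from a t-table or the Excel TDIST function mentioned in the hint.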

13. You are conducting a study to see if students do better when they study all at once or in intervals. One group of 12 participants took a test after studying for one hour continuously. The other group of 12 participants took a test after studying for three twenty minute sessions. The first group had a mean score of 75 and a variance of 120. The second group had a mean score of 86 and a variance of 100.

HINT: See Chapter-12 section on “Differences between two means (independent groups)”.

a. What is the calculated t value? Are the mean test scores of these two groups significantly different at the .05 level?

b. What would the t value be if there were only 6 participants in each group? Would the scores be significant at the .05 level?
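For two independent groups of equal size, the t-statistic can be sketched as follows. The sketch assumes the equal-n pooled-variance formula, with MSE as the average of the two variances and a standard error of √(2·MSE/n); the function name `pooled_t` is invented for this illustration.

```python
from math import sqrt

def pooled_t(mean1, var1, mean2, var2, n):
    """t-statistic and degrees of freedom for two independent groups of size n."""
    mse = (var1 + var2) / 2          # pooled variance estimate for equal n
    std_error = sqrt(2 * mse / n)    # standard error of the difference in means
    return (mean1 - mean2) / std_error, 2 * (n - 1)

# Interval-study group (mean 86, variance 100) versus continuous-study group (mean 75, variance 120).
t12, df12 = pooled_t(86, 100, 75, 120, n=12)
t6, df6 = pooled_t(86, 100, 75, 120, n=6)

print(f"n=12: t = {t12:.3f}, df = {df12}")
print(f"n=6:  t = {t6:.3f}, df = {df6}")
```

Each t is then compared with the two-tailed critical value for its degrees of freedom at the .05 level; halving the group size shrinks t and raises the critical value at the same time.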

Lane – Chapter 13: (4)

4. Rank order the following in terms of power.

| | Population-1 Mean | n | Population-2 Mean | Standard Deviation |
|---|---|---|---|---|
| A | 29 | 20 | 43 | 12 |
| B | 34 | 15 | 40 | 6 |
| C | 105 | 24 | 50 | 27 |
| D | 170 | 2 | 120 | 10 |

Illowsky – Chapter 9 (65,71,77)

Background for problems 65 and 71: Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test.

65. The null and alternative hypotheses are:

a. Ho: x̄ = 4.5, Ha: x̄ > 4.5

b. Ho: μ ≥ 4.5, Ha: μ < 4.5

c. Ho: μ = 4.75, Ha: μ > 4.75

d. Ho: μ = 4.5, Ha: μ > 4.5

71. The Type I error is:

a. to conclude that the current mean hours per week is higher than 4.5, when in fact, it is higher

b. to conclude that the current mean hours per week is higher than 4.5, when in fact, it is the same

c. to conclude that the mean hours per week currently is 4.5, when in fact, it is higher

d. to conclude that the mean hours per week currently is no higher than 4.5, when in fact, it is not higher

77. An article in the San Jose Mercury News stated that students in the California state university system take 4.5 years, on average, to finish their undergraduate degrees. Suppose you believe that the mean time is longer. You conduct a survey of 49 students and obtain a sample mean of 5.1 with a sample standard deviation of 1.2. Do the data support your claim at the 1% level? Solve using the following steps, similar to Appendix-E (Hypothesis Testing with One Sample Mean):

a. State the Null Hypothesis (Ho) and Alternate Hypothesis (Ha)

b. Find the random variable X

c. State the distribution you will use and why?

d. What is the test statistic (t-value)?

e. What is the P-value (probability)?

f. Will you reject or not reject the Null Hypothesis and why?

Illowsky – Chapter 10 (79,91,120)

79. A student at a four-year college claims that mean enrollment at four–year colleges is higher than at two–year colleges in the United States. Two surveys are conducted. Of the 35 two–year colleges surveyed, the mean enrollment was 5,068 with a standard deviation of 4,777. Of the 35 four-year colleges surveyed, the mean enrollment was 5,466 with a standard deviation of 8,191. Test the hypothesis, assuming a 5% significance level. Solve using the following steps, similar to Appendix-E (Hypothesios Testing with Two Sample Means):

a. State the Null Hypothesis (Ho) and Alternate Hypothesis (Ha)

b. Find the random variable X (remember that X is the difference between the two sample means)

c. State the distribution you will use and why?

d. What is the test statistic (t-value)?

e. What is the P-value (probability)?

f. Will you reject or not reject the Null Hypothesis and why?

91. A powder diet is tested on 49 people, and a liquid diet is tested on 36 different people. Of interest is whether the liquid diet yields a higher mean weight loss than the powder diet. The powder diet group had a mean weight loss of 42 pounds with a standard deviation of 12 pounds. The liquid diet group had a mean weight loss of 45 pounds with a standard deviation of 14 pounds. Test the hypothesis, assuming a 5% significance level. Solve using the following steps, similar to Appendix-E (Hypothesis Testing with Two Sample Means):

a. State the Null Hypothesis (Ho) and Alternate Hypothesis (Ha)

b. Find the random variable X (remember that X is the difference between the two sample means)

c. State the distribution you will use and why?

d. What is the test statistic (t-value)?

e. What is the P-value (probability)? HINT: Since the size of each sample set is different, the formula for the “degrees of freedom” is far more complex. See the formula on Page-554.

f. Will you reject or not reject the Null Hypothesis and why?
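The "far more complex" degrees-of-freedom formula mentioned in the hint is assumed here to be the Welch-Satterthwaite approximation for two independent samples with unequal variances. The sketch below applies it to the summary values of problem 91; `welch_t` is a hypothetical helper name, and the same function could be reused for problem 79.

```python
import math

def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Return (t statistic, approximate df) for two independent samples."""
    var_term_1 = sd_1 ** 2 / n_1
    var_term_2 = sd_2 ** 2 / n_2
    t_value = (mean_1 - mean_2) / math.sqrt(var_term_1 + var_term_2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_term_1 + var_term_2) ** 2 / (
        var_term_1 ** 2 / (n_1 - 1) + var_term_2 ** 2 / (n_2 - 1)
    )
    return t_value, df

# Problem 91: liquid diet (45, 14, n = 36) minus powder diet (42, 12, n = 49)
t_value, df = welch_t(45, 14, 36, 42, 12, 49)
print(round(t_value, 3), round(df, 2))
```

Putting the liquid diet first makes a positive t-value correspond to the alternative that the liquid diet yields the higher mean weight loss.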

120. A golf instructor is interested in determining if her new technique for improving players’ golf scores is effective. She takes four new students. She records their 18-hole scores before learning the technique and then after having taken her class. She conducts a hypothesis test. The data are as follows. Is the correct decision to Reject the Null Hypothesis or Not Reject the Null Hypothesis? Test the hypothesis, assuming a 5% significance level. Solve using the following steps:

a. State the Null Hypothesis (Ho) and Alternate Hypothesis (Ha)

b. Find the random variable X. Consider we are testing “paired samples”.

c. State the distribution you will use and why?

d. What is the test statistic (t-value)? Assume the population mean of the differences = 0.

e. What is the P-value (probability)?

f. Will you reject or not reject the Null Hypothesis and why?

| | Player 1 | Player 2 | Player 3 | Player 4 |
|---|---|---|---|---|
| Mean score before class | 83 | 78 | 93 | 87 |
| Mean score after class | 80 | 80 | 86 | 86 |
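For the paired-samples setup in problem 120, a minimal sketch of the test statistic works on the per-player differences. Taking the difference as "after minus before" is an assumption made here for illustration; the opposite sign convention only flips the sign of t.

```python
import math
import statistics

before = [83, 78, 93, 87]
after = [80, 80, 86, 86]

# Paired differences, assuming the convention "after minus before"
differences = [a - b for a, b in zip(after, before)]

mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)          # sample standard deviation
t_value = mean_diff / (sd_diff / math.sqrt(len(differences)))
df = len(differences) - 1

print(differences, mean_diff, round(sd_diff, 3), round(t_value, 3), df)
```

A lower golf score is better, so under this convention the alternative hypothesis is that the mean difference is negative.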

UMUC STAT-200 Homework Assignments Week #7

Dr. Brian Killough

Textbook #1: Lane et al. Introduction to Statistics, David M. Lane et al., 2013. (http://onlinestatbook.com/Online_Statistics_Education.pdf)

Textbook #2:Illowsky et al. Introductory Statistics, Barbara Illowsky et al., 2013.

See the file named “UMUC_STAT200_EXCEL_Tips” at the “Course Materials” menu link to find functions for calculating Linear Regression and Chi Square Distribution statistics.

Lane – Chapter 14: (2,6)

2. The formula for a regression equation is Y’ = 2X + 9.

a. What would be the predicted score for a person scoring 6 on X?

b. If someone’s predicted score was 14, what was this person’s score on X?

6. For the (X,Y) data points below, compute:

Data Points: (4,6), (3,7), (5,12), (11,17), (10,9), (14,21)

a. The correlation (r) and determine if it is significantly different from a hypothesized slope of 0 (null hypothesis). HINT: Use the significance test for correlation on Page-482 and assume a 95% confidence.

b. The slope and intercept of the linear regression line
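The quantities asked for in problem 6 can be sketched from sums of squares and cross-products. The t statistic below uses the usual significance test for a correlation, t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom, which is assumed to be the test the hint refers to.

```python
import math

points = [(4, 6), (3, 7), (5, 12), (11, 17), (10, 9), (14, 21)]
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x, _ in points)
syy = sum((y - mean_y) ** 2 for _, y in points)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

r = sxy / math.sqrt(sxx * syy)            # Pearson correlation
slope = sxy / sxx                         # regression slope
intercept = mean_y - slope * mean_x       # regression intercept
t_value = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(r, 4), round(slope, 4), round(intercept, 4), round(t_value, 3))
```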

Lane – Chapter 17: (5,14)

5. At a school pep rally, a group of sophomore students organized a free raffle for prizes. They claim that they put the names of all of the students in the school in the basket and that they randomly drew 36 names out of this basket. Of the prize winners, 6 were freshmen, 14 were sophomores, 9 were juniors, and 7 were seniors. The results do not seem that random to you. You think it is a little fishy that sophomores organized the raffle and also won the most prizes. Your school is composed of 30% freshmen, 25% sophomores, 25% juniors, and 20% seniors.

a. What are the expected frequencies of winners from each class?

b. Conduct a significance test to determine whether the winners of the prizes were distributed throughout the classes as would be expected based on the percentage of students in each group. Report your Chi Square and p values.

c. What do you conclude about the null hypothesis (the observed and expected data are the same) assuming a 95% confidence?
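Parts (a) and (b) of problem 5 follow the goodness-of-fit pattern: expected count = total × proportion, then χ² = Σ (observed − expected)² / expected. A minimal sketch, with the P-value left to a chi-square table or EXCEL:

```python
observed = {"freshmen": 6, "sophomores": 14, "juniors": 9, "seniors": 7}
proportions = {"freshmen": 0.30, "sophomores": 0.25, "juniors": 0.25, "seniors": 0.20}

total = sum(observed.values())
expected = {group: total * p for group, p in proportions.items()}

# Chi-square goodness-of-fit statistic
chi_square = sum(
    (observed[group] - expected[group]) ** 2 / expected[group]
    for group in observed
)
df = len(observed) - 1

print(expected, round(chi_square, 3), df)
```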

14. A geologist collects hand-specimen sized pieces of limestone from a particular area. A qualitative assessment of both texture and color is made with the following results. Is there evidence of association between color and texture for these limestones? Explain your answer by testing your null hypothesis assuming a 95% confidence level.

Illowsky – Chapter 11 (70,102,113,117)

70. TRUE or FALSE … The standard deviation of the chi-square distribution is twice the mean.

102. Do men and women select different breakfasts? The breakfasts ordered by randomly selected men and women at a popular breakfast place are shown in the table. Conduct a test for homogeneity at a 5% level of significance (α = 0.05).

Suppose an airline claims that its flights are consistently on time with an average delay of at most 15 minutes. It claims that the average delay is so consistent that the variance is no more than 150 minutes. Doubting the consistency part of the claim, a disgruntled traveler calculates the delays for his next 25 flights. The average delay for those 25 flights is 22 minutes with a standard deviation of 15 minutes.

113. Find df

117. Let α = 0.05. What is your decision regarding the hypothesis? Write your conclusion in a sentence and discuss whether there is sufficient data to support your decision. HINT: See Section 11.6 on “Test of a Single Variance”.
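For a test of a single variance, the statistic is χ² = (n − 1)s² / σ₀² with n − 1 degrees of freedom. A minimal sketch using the airline values above (n = 25, s = 15, claimed variance 150):

```python
n = 25                   # number of flights sampled
sample_sd = 15           # sample standard deviation, in minutes
claimed_variance = 150   # variance claimed by the airline

df = n - 1
chi_square = df * sample_sd ** 2 / claimed_variance

print(df, chi_square)
```

The decision then depends on the right-tail probability of this statistic under a chi-square distribution with `df` degrees of freedom.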

Illowsky – Chapter 12 (66,82)

66. Can a coefficient of determination be negative? Why or why not?

82. The cost of a leading liquid laundry detergent in different sizes is given in the table.

a. Using “size” as the independent variable (x) and “cost” as the dependent variable (y), draw a scatter plot using EXCEL.

b. Calculate the least-squares line. Put the equation in the form of: ŷ = a + bx. See the instructor’s “EXCEL Tips” for finding a linear regression.

c. Find the correlation coefficient. Is it significant?

d. If the laundry detergent were sold in a 40-ounce size, find the estimated cost.

e. Is the least-squares line valid for predicting what a 300-ounce size of the laundry detergent would cost? Why or why not?

f. What is the slope of the least-squares (best-fit) line? Interpret the slope.

UMUC STAT-200 Homework Assignments Week #8

Dr. Brian Killough

Textbook #1: Lane et al. Introduction to Statistics, David M. Lane et al., 2013. (http://onlinestatbook.com/Online_Statistics_Education.pdf)

Textbook #2:Illowsky et al. Introductory Statistics, Barbara Illowsky et al., 2013.

Lane – Chapter 15: (1,5,8,10)

1. What is the null hypothesis tested by analysis of variance?

5. What is the difference between “N” and “n”?

8. What kind of skew does the F distribution have?

10. Assume an experiment is conducted with 5 conditions and 6 subjects in each condition. What are df(numerator) and df(denominator)?

Illowsky – Chapter 13 (61,63,69,71,77,81)

HINT: In order to create an ANOVA table, you will need to follow examples 13.1 and 13.2 in your book. At the end of this homework assignment, there is also another detailed example. DO NOT calculate the SSwithin by hand. Use EXCEL (or calculate by hand) to find the value for SStotal and then calculate the value for SSbetween. The value of SSwithin = SStotal – SSbetween. Use the equations in the summary table at the top of page 699 to find the value of the F-statistic and then find the P-value.
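The procedure in the hint can be sketched as follows. The three groups below are hypothetical numbers chosen only to illustrate the arithmetic, not data from any problem in this assignment, and the P-value step is omitted.

```python
# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_values = [value for group in groups for value in group]
n_total = len(all_values)
grand_mean = sum(all_values) / n_total

# SStotal: squared deviations of every observation from the grand mean
ss_total = sum((value - grand_mean) ** 2 for value in all_values)

# SSbetween: group size times squared deviation of each group mean
ss_between = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in groups
)

# SSwithin is obtained by subtraction, as the hint suggests
ss_within = ss_total - ss_between

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_statistic = (ss_between / df_between) / (ss_within / df_within)

print(ss_total, ss_between, ss_within, df_between, df_within, f_statistic)
```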


Use the following information to answer #61 and #63. Suppose a group is interested in determining whether teenagers obtain their drivers licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their drivers licenses.

Null-Hypothesis is H0: μ1 = μ2 = μ3 = μ4 = μ5

Alternate Hypothesis is Ha: At least any two of the group means μ1, μ2, …, μ5 are not equal.

61. Find the degrees of freedom (numerator) = df(num)

63. Find the F-statistic using an ANOVA table.

69. A researcher wants to know if the mean times (in minutes) that people watch their favorite news station are the same. The table (on right) shows the results of a study. Assume that all distributions are normal, the population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.

71. Are the mean number of times a month a person eats out the same for whites, blacks, Hispanics and Asians? The table (on right) shows the results of a study. Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.

77. A grassroots group opposed to a proposed increase in the gas tax claimed that the increase would hurt working-class people the most, since they commute the farthest to work. Suppose that the group randomly surveyed 24 individuals and asked them their daily one-way commuting mileage. The results are as follows (see table). Determine whether or not the variance in mileage driven is statistically the same among the groups. Use a 5% significance level.


81. Is the variance for the amount of money, in dollars, that shoppers spend on Saturdays at the mall the same as the variance for the amount of money that shoppers spend on Sundays at the mall? The table (on right) shows the results of a study. Assume a 10% significance level.

HINT: See Section 13.4 (Test of Two Variances) to find the F-statistic. Assume the null hypothesis is that the variances are the same and assume the alternate hypothesis is [formula not recovered].

UMUC STAT-200 Test #1

Dr. Brian Killough

Instructions: Students must complete the quiz on their own, with no help from other students, though help from the instructor is allowed. Students may use their books, computers and any other online resources to complete the quiz. Students must submit their answers and detailed work in a WORD or PDF attachment in the assignments area before the deadline on Sunday at 12 midnight. All work must be shown to receive full credit for a solution. An answer submitted without any supporting information or explanation will not receive credit. In some cases no supporting information or explanation is needed, but in many cases, an explanation of how the answer was obtained is needed. Please use your judgment in providing supporting information for your solutions. It is strongly suggested that students DO NOT wait until Sunday afternoon to start their quiz. Late tests will be penalized 10% per day.

Course Material: Covers material from weeks 1,2,3

Scoring: Each problem is worth 10 points. Some problems contain multiple parts, but the total value of the problem will be 10 points. The total test score will be a maximum of 100 points.

==========================================================================

(Problem 1) A random sample of 12 customers was chosen in a supermarket. The (incomplete) results for their checkout times are shown in the table below.

| Checkout Time (minutes) | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 4.0 – 5.9 | 2 | | |
| 6.0 – 7.9 | | 0.25 | |
| 8.0 – 9.9 | | | |
| 10.0 – 11.9 | 1 | | |
| 12.0 – 13.9 | 2 | | |
| TOTALS | 12 | | |

(a – 4 points) Complete the frequency table

(b – 2 points) What percent of the checkout times are at least 10 minutes?

(c – 2 points) What percent of the checkout times are between 8 and 10 minutes?

(d – 2 points) What percent of the checkout times are less than 12 minutes?

(Problem 2) Using the data from Problem #1 …

(a – 4 points) Construct a histogram

(b – 2 points) In what class interval must the median lie?

Assume the largest recorded checkout time was 13.2 minutes. Suppose that data point was incorrect and the actual checkout time was 13.8 minutes.

(c – 2 points) Will the mean of the dataset increase, decrease or remain the same and why?

(d – 2 points) Will the median of the dataset increase, decrease or remain the same and why?

(Problem 3) A fitness center is interested in the mean amount of time the clients exercise each week. A survey will be conducted of the clients. Answer the following questions (2 points each).

(a) What is the population?

(b) What is the sample?

(c) What is the parameter?

(d) What is the statistic?

(e) What is the variable?

(Problem 4) A random sample of starting salaries for an engineer are: $38000, $42000, $44000, $48000, and $68000. Find the following and show all work (2 points each). Include equations, a table or EXCEL work, to show how you found your solution.

(a) Mean

(b) Median

(c) Mode

(d) Standard Deviation

(e) If a recent graduate is considering a career in engineering, which statistic (mean or median) should they consider when determining the starting salary they are likely to make? Explain your answer.
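As a way to check hand or EXCEL work for Problem 4, the summary statistics can be sketched with Python's standard `statistics` module. The sample standard deviation (n − 1 denominator) is assumed here because the problem describes the salaries as a random sample.

```python
import statistics

salaries = [38000, 42000, 44000, 48000, 68000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
# multimode returns every value tied for most frequent; if each salary
# appears once, all of them are returned, i.e. there is no single mode
modes = statistics.multimode(salaries)
sample_sd = statistics.stdev(salaries)    # n - 1 in the denominator

print(mean_salary, median_salary, modes, round(sample_sd, 2))
```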

(Problem 5) The checkout times (in minutes) for 12 randomly selected customers at a large supermarket during the store’s busiest time are as follows: 4.6, 8.5, 6.1, 7.8, 10.7, 9.3, 12.4, 5.8, 9.7, 8.8, 6.7, 13.2

(a – 2 points) What is the mean checkout time?

(b – 2 points) What is the value for the 25th percentile (first quartile) Q1?

(c – 2 points) What is the value for the 50th percentile (median)?

(d – 2 points) What is the value for the 75th percentile (third quartile) Q3?

(e – 2 points) Construct a boxplot of the dataset.
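Quartile conventions differ between textbooks and software, so the sketch below fixes one assumption: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data. EXCEL and other tools may interpolate differently and return slightly different quartiles.

```python
import statistics

times = [4.6, 8.5, 6.1, 7.8, 10.7, 9.3, 12.4, 5.8, 9.7, 8.8, 6.7, 13.2]

ordered = sorted(times)
half = len(ordered) // 2
lower_half = ordered[:half]     # values below the median position
upper_half = ordered[-half:]    # values above the median position

mean_time = statistics.mean(times)
q1 = statistics.median(lower_half)
median_time = statistics.median(ordered)
q3 = statistics.median(upper_half)

# Five-number summary used to draw the boxplot
five_number_summary = (ordered[0], q1, median_time, q3, ordered[-1])
print(round(mean_time, 3), five_number_summary)
```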

(Problem 6) Roll two fair dice. Each die has six faces.

(a – 2 points) List the number of outcomes in the sample space

(b – 2 points) What is the probability of rolling a 2 or a 5 on the first roll?

(c – 2 points) What is the probability of rolling a 2 or 5 and then an ODD number?

(d – 2 points) What is the probability the sum of the rolls is less than 4?

(e – 2 points) What is the probability that the second roll is greater than 4, given that the first roll is an even number?
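The dice questions can be checked by enumerating the sample space directly. The sketch assumes "first roll" and "second roll" mean the first and second die, and that all 36 ordered pairs are equally likely; `probability` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

# Every ordered (first, second) outcome for two six-sided dice
sample_space = list(product(range(1, 7), repeat=2))

def probability(event, given=lambda roll: True):
    """P(event | given), counting equally likely outcomes."""
    conditioned = [roll for roll in sample_space if given(roll)]
    favorable = [roll for roll in conditioned if event(roll)]
    return Fraction(len(favorable), len(conditioned))

p_first_2_or_5 = probability(lambda r: r[0] in (2, 5))
p_then_odd = probability(lambda r: r[0] in (2, 5) and r[1] % 2 == 1)
p_sum_below_4 = probability(lambda r: sum(r) < 4)
p_second_gt_4_given_even = probability(
    lambda r: r[1] > 4, given=lambda r: r[0] % 2 == 0
)

print(len(sample_space), p_first_2_or_5, p_then_odd,
      p_sum_below_4, p_second_gt_4_given_even)
```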

(Problem 7) In a box of 100 cookies, 36 contain chocolate and 12 contain nuts. Of those, 8 cookies contain both chocolate and nuts.

(a – 3 points) Draw a Venn diagram representing the sample space and label all regions

(b – 1 point) What is the probability that a randomly selected cookie contains chocolate?

(c – 3 points) What is the probability that a randomly selected cookie contains chocolate OR nuts? Note, it cannot contain both chocolate and nuts, but must have either chocolate OR nuts.

(d – 3 points) What is the probability that a randomly selected cookie contains nuts, given that it contains chocolate?

(Problem 8) Assume a baseball team has a lineup of 9 batters.

(a – 4 points) How many different batting orders are possible with these 9 players?

(b – 4 points) How many different ways can I select the first 3 batters?

(c – 2 points) Is a “Combination Lock” really a permutation or combination of numbers? Explain your answer.
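The distinction between permutations and combinations in Problem 8 can be illustrated with the standard `math` module. Treating part (b) as an ordered selection is an assumption based on batting order mattering; the unordered count is shown for comparison.

```python
import math

players = 9

batting_orders = math.factorial(players)   # all 9 players in order
first_three_ordered = math.perm(players, 3)    # order matters (permutation)
first_three_unordered = math.comb(players, 3)  # order ignored (combination)

print(batting_orders, first_three_ordered, first_three_unordered)
```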

(Problem 9) You are playing a game with 3 prizes hidden behind 5 doors. One prize is worth $100, another is worth $20 and another $10. You have to pay $20 if you choose a door with no prize.

(a – 4 points) Construct a probability table. See your homework for Illowsky, Chapter 4, #72 and #80.

(b – 3 points) What is your expected winning?
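A minimal sketch of the probability table and expected value for Problem 9, assuming you pick exactly one door, each of the 5 doors is equally likely, and the two doors without a prize each cost $20:

```python
from fractions import Fraction

# Payoff behind each of the 5 doors (assumed: three prizes, two losing doors)
payoffs = [100, 20, 10, -20, -20]
door_probability = Fraction(1, len(payoffs))

# Probability table: payoff -> probability of receiving it
probability_table = {}
for payoff in payoffs:
    probability_table[payoff] = probability_table.get(payoff, 0) + door_probability

expected_winning = sum(payoff * p for payoff, p in probability_table.items())

print(probability_table, expected_winning)
```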
