# 1) You own a bakery and decide to compare your weekly flour consumption in pounds

August 30, 2017

Name:
Each question is worth 1 point for a total of 20 points. Partial credit will be given where appropriate.

1) You own a bakery and decide to compare your weekly flour consumption in pounds (x-variable) with the sales you make in dollars (y-variable) each week. You enter your raw data into Excel and run a simple linear regression. Below are your summary output results.

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.7529 |
| R Square | 0.5669 |
| Adjusted R Square | 0.5128 |
| Standard Error | 24.1940 |
| Observations | 10 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 6129.716 | 6129.716 | 10.472 | 0.012 |
| Residual | 8 | 4682.784 | 585.348 | | |
| Total | 9 | 10812.500 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | -214.45 | 113.65 | -1.89 | 0.10 | -476.54 | 47.63 | -476.54 | 47.63 |
| X Variable 1 | 5.53 | 1.71 | 3.24 | 0.01 | 1.59 | 9.48 | 1.59 | 9.48 |

a) Using the above data, identify the simple linear regression model (equation) that could be used to predict sales based on flour consumption.
ŷ =
b) Identify the percent of variability in sales (y) that is explained by y’s relationship with flour consumption (x).
percent variability =
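As a minimal sketch of how the pieces of an Excel summary output fit together (not an official answer key), the snippet below copies the coefficients and R Square from the output above. The function name `predict_sales` and the sample flour value of 60 pounds are illustrative assumptions, not part of the original problem.

```python
# Values copied from the summary output above.
intercept = -214.45   # "Intercept" coefficient
slope = 5.53          # "X Variable 1" coefficient
r_square = 0.5669     # "R Square"

def predict_sales(flour_lb: float) -> float:
    """Predicted weekly sales in dollars for a given flour consumption in pounds."""
    return intercept + slope * flour_lb

# R Square expressed as a percentage of variability in y explained by x.
percent_explained = r_square * 100

# Cross-check: R Square should equal SS(Regression) / SS(Total) from the ANOVA table.
r_square_from_anova = 6129.716 / 10812.500

# 60 lb is a hypothetical input chosen only to show the call.
print(f"Predicted sales at 60 lb of flour: {predict_sales(60.0):.2f}")
print(f"Percent of variability explained: {percent_explained:.2f}")
print(f"R Square recomputed from ANOVA: {r_square_from_anova:.4f}")
```

The ANOVA cross-check is a useful habit: if the recomputed ratio does not match the reported R Square, a value was probably mistyped.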

2) Assume you are in the analysis phase of a project. Name three statistical tools that you could apply to your data in order to drive your next steps in the DMAIC process.
a)
b)
c)

3) Describe the following data five different ways (include numbers in your answer):
Data: 13, 10, 15, 11, 12, 12, 7, 8, 16, 14
a)
b)
c)
d)
e)
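One possible way to produce numeric descriptions of the data in question 3 is with Python's standard `statistics` module. The choice of which measures to report here (count, mean, median, mode, range, sample variance, sample standard deviation) is an assumption; other descriptions, such as quartiles or a histogram, would be equally valid.

```python
import statistics

data = [13, 10, 15, 11, 12, 12, 7, 8, 16, 14]

# A handful of common descriptive measures; the sample (n - 1) forms
# of variance and standard deviation are used here by assumption.
summary = {
    "count": len(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "range": max(data) - min(data),
    "sample_variance": statistics.variance(data),
    "sample_std_dev": statistics.stdev(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```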

4) In a normal distribution of measurements having a mean of 500 feet and a standard deviation of 80 feet, what percent of the distribution falls between 300 and 450 feet?
%
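The calculation in question 4 can be sketched with `statistics.NormalDist` from the Python standard library: convert each bound to a z-score, then take the difference of the cumulative probabilities. The variable names are illustrative only.

```python
from statistics import NormalDist

mean_ft = 500
std_dev_ft = 80
dist = NormalDist(mu=mean_ft, sigma=std_dev_ft)

# z-scores for the two bounds, shown for comparison with a z-table.
z_low = (300 - mean_ft) / std_dev_ft
z_high = (450 - mean_ft) / std_dev_ft

# P(300 < X < 450) is the difference of the two cumulative probabilities.
probability = dist.cdf(450) - dist.cdf(300)
percent = probability * 100

print(f"z-scores: {z_low}, {z_high}")
print(f"Percent between 300 and 450 feet: {percent:.2f}")
```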

5) You have just performed a linear regression analysis on successive values of a time series and you see autocorrelation. What might your r² be equal to?
r² =

6) The null hypothesis is Ho: µ = 10, and the alternative hypothesis is Ha: µ ≠ 10. Assume alpha = 0.01. If the null hypothesis was rejected, what would the 99% confidence interval for µ look like?

a) (12.1, 15.3) b) (8.5, 12.1) c) (5.3, 15.5) d) (9.8, 10.5)

7) Given the above range chart, what can you conclude?
a) Process variation is unstable and unpredictable.
b) Measurement variation is declining.
c) Within-subgroup variation is stable and predictable.
d) Discrimination is a problem.

8) Describe two ways to determine whether your measurement system is repeatable and reproducible:
a)

b)

9) You are interested in developing a control chart. Your dimension of concern is the diameter of a cylindrical part. Every hour you take two measurements. Choose the most appropriate chart.
a) np chart b) IMR chart c) c chart d) x-bar/R chart

10) You are interested in developing a second control chart. However, you have collected data on the number of visible scratches on the part. You take a small constant subgroup size of 2 every day. Choose the most appropriate chart.
a) np chart b) IMR chart c) c chart d) x-bar/R chart

11) A hypothesis is being tested at alpha = 0.05. At which of the following p-values would the null hypothesis be rejected?
a) 0.150 b) 0.005 c) 0.055 d) 0.350

12) It was reported in USA Today that from 1999 through 2003 the number of daily spam messages sent worldwide was:

| Year Number (x) | Year | Spam Messages Sent (billions) (y) |
|---|---|---|
| 1 | 1999 | 1.0 |
| 2 | 2000 | 2.3 |
| 3 | 2001 | 4.0 |
| 4 | 2002 | 5.6 |
| 5 | 2003 | 7.3 |

The regression equation was determined to be: y = –0.73 + 1.59 x
where y is the number of spam messages sent in billions and x is the year number. Using the model, what is the predicted number of spam messages sent in 2004?
a) 9.00 billion b) 7.85 billion c) 3,185 billion d) 8.81 billion
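A minimal sketch of the arithmetic behind question 12: it refits the least-squares line from the five data points to confirm the stated equation, then extrapolates one year. Treating 2004 as year number 6 is an assumption that follows from the numbering in the table.

```python
year_number = [1, 2, 3, 4, 5]
spam_billions = [1.0, 2.3, 4.0, 5.6, 7.3]

n = len(year_number)
mean_x = sum(year_number) / n
mean_y = sum(spam_billions) / n

# Ordinary least squares for a single predictor:
# slope = Sxy / Sxx, intercept = mean_y - slope * mean_x
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(year_number, spam_billions))
sxx = sum((x - mean_x) ** 2 for x in year_number)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Assumption: 2004 continues the sequence as year number 6.
x_2004 = 6
prediction = intercept + slope * x_2004

print(f"Fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"Predicted spam in 2004: {prediction:.2f} billion")
```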

13) If we want to detect a ______ change in the process, we increase our sample size.
a) continuous b) larger c) smaller d) normal

14) Specific models have been developed to aid in the analysis of time series data when usual regression methods are not appropriate. What model uses the average of the last several values of a time series to forecast the next value?
Name of the model:

15) A strong correlation does not mean a cause-and-effect relationship. Causation is only one explanation of an observed association. What else could produce a strong correlation?
a) Confounding factor
b) Coincidence
c) Common cause
d) All of the above

16) A correlation coefficient r = –0.72 would indicate:
a) There is a strong positive correlation between two factors
b) There is a moderate negative correlation between two factors
c) There is no correlation between four factors
d) There is a moderate positive correlation between four factors

17) When the variability in x decreases (for example outliers are removed from the data), the correlation coefficient gets closer to ______.

18) You enter your data into Excel and run a multiple regression. Below are your summary output results. What variables are significant and should be included in your model?
Name the variables:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.94898 |
| R Square | 0.90056 |
| Adjusted R Square | 0.82101 |
| Standard Error | 6.24921 |
| Observations | 10 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 1768.3366 | 442.0841 | 11.3202 | 0.0101 |
| Residual | 5 | 195.2634 | 39.0527 | | |
| Total | 9 | 1963.6000 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
|---|---|---|---|---|---|
| Intercept | 48.3628 | 14.1772 | 3.4113 | 0.0190 | 11.9192 |
| weight | -21.3770 | 9.7575 | -2.1908 | 0.0300 | -46.4595 |
| height | -12.4787 | 7.3110 | -1.7068 | 0.1486 | -31.2723 |
| power | -4.2240 | 8.8810 | -0.4756 | 0.6544 | -27.0534 |
| speed | 0.2849 | 0.3917 | 0.7272 | 0.4997 | -0.7221 |

19) Using the above Excel summary output results, answer the following questions.
a) How many samples were collected to generate this data?
b) What is the correlation for this multiple regression, and what does it indicate?
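One common way to screen predictors in a multiple regression output is to compare each P-value with a chosen significance level. The sketch below does that with the values from the output above and also cross-checks the sample size and Multiple R. The significance level of 0.05 is an assumption, since questions 18 and 19 do not state one, and the intercept is left out because it is not a candidate variable.

```python
import math

alpha = 0.05  # assumed significance level; not stated in the question

# P-values copied from the coefficients table above (intercept omitted).
p_values = {
    "weight": 0.0300,
    "height": 0.1486,
    "power": 0.6544,
    "speed": 0.4997,
}

# Keep only the predictors whose P-value falls below alpha.
significant = [name for name, p in p_values.items() if p < alpha]

# The total degrees of freedom in the ANOVA table equal n - 1.
total_df = 9
observations = total_df + 1

# Multiple R is the square root of R Square.
r_square = 0.90056
multiple_r = math.sqrt(r_square)

print(f"Significant at alpha = {alpha}: {significant}")
print(f"Observations: {observations}")
print(f"Multiple R: {multiple_r:.5f}")
```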

20) Certain data is inappropriate for a regression analysis such as:
a) Residuals that form a pattern when plotted
b) There aren’t any outliers
c) The correlation coefficient is less than 1
d) All of the above
