Statistics 512 Divisions 1 and 4, Spring 2014 Sample Midterm I

November 24, 2016

1. In a simple linear regression problem with n = 28, the following estimates were obtained:
b0 = 4.30, b1 = 0.56, s{b0} = 0.012, s{b1} = 0.034, s = 0.1385, X̄ = −0.0773,
SSX = 14.598. Values of X were in the range −1.43 to 0.683.
(a) Write the simple linear regression model. Include the distributional assumption.

(b) Write the estimated regression line and the estimated error variance.

(c) If appropriate, estimate the mean value of Y for the given X. If not appropriate,
say why.
i. X = 0.

ii. X = 2.

(d) Give a 95% confidence interval for the slope of the regression line.

(e) Give a 99% prediction interval for the next value of Y when X = 0.5.

2. Short answer questions. Unless stated otherwise, each part is unrelated.
(a) If the design matrix has dimensions 20 × 4, what is the dimension of the error
vector?

(b) A particular linear model has parameter values β0 = 20, β1 = 0.5, and σ² = 4.
Assuming all the linear model assumptions are true, what is the probability that
an observation Y would be greater than 19.4, given that X = 6?

(c) If a quarter of the data fall outside the prediction region, what is the approximate
confidence level for the prediction intervals?

(d) For a model with 5 predictors and 40 observations, if R² is 0.3, what is the test
statistic for the ANOVA F test?

(e) The α levels of 2 confidence intervals are 0.05 and 0.01. Find a lower bound
on the overall coverage rate for the two confidence intervals. (In other words, the
probability that the confidence intervals cover both their values is at least what?)

(f) You run a least squares regression on SAS and get an SSE value of 20 m². Your
officemate claims to be able to find a line (using the same data) with an SSE of
15 m². Should you believe your officemate? Why or why not?

(g) For a particular X value, the prediction interval has length 10; the confidence
interval has length 5. If SSX is 3, what is the length of the confidence interval
for the slope?

(h) For a given set of data, the optimal Box-Cox transformation (of the response)
is at λ = 0.25. You decide to use the “suggested” transformation and take the
square root of the response. What is the optimal Box-Cox transformation for the
transformed data?

(i) Using the matrices from least squares estimation, what are the entries of the
vector X′e?

3. Refer to the SAS output on the last pages (marked OUTPUT FOR PROBLEM 3).
The data are from a study of 78 seventh grade students. The goal is to predict GRADE
(average school grade on a scale of 0 to 11) from variables which include IQ (score on
an I.Q. test) and GENDER (0 = female, 1 = male).
(a) Using the output for the simple linear regression, does there appear to be a linear
relationship between GRADE and IQ? Give a test statistic with degrees of freedom
and p-value to support your answer (you may use other evidence as well).

(b) Individual 51 has GRADE = 0.53 and IQ = 103. What value of GRADE is
predicted for this individual by the estimated simple linear regression model?

(c) The variable IQGEN is the product of IQ and GENDER. Examine the output for
the model involving these three variables. Write down the estimated regression
equation for this model. Also write down the two separate fitted lines for female
and male students.

(d) Examine the results of the t-tests for the three regression coefficients as well as
the result of the (general linear) F-test labeled “SAMELINE”. The results of
this general linear test were produced with the SAS input line “test gender,
iqgen;”.
State the null hypotheses tested by each of these four tests and whether that
hypothesis is rejected. What apparent conflict do you see between the results of
these tests?

OUTPUT FOR PROBLEM 3
The REG Procedure
Model: MODEL1
Dependent Variable: grade

Analysis of Variance

                          Sum of         Mean
Source             DF    Squares       Square    F Value    Pr > F
Model               1  136.31881    136.31881      51.01    <.0001
Error              76  203.10809      2.67247
Corrected Total    77  339.42689

Root MSE            1.63477    R-Square    0.4016
Dependent Mean      7.44654    Adj R-Sq    0.3937
Coeff Var          21.95343

Parameter Estimates

                  Parameter    Standard
Variable     DF    Estimate       Error    t Value    Pr > |t|    95% Confidence Limits
Intercept     1    -3.55706     1.55176      -2.29      0.0247    -6.64766    -0.46645
iq            1     0.10102     0.01414       7.14      <.0001     0.07285     0.12919
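
The entries of this output are linked by a few standard identities (MSE = SSE/df, F = MSR/MSE, R-Square = SSM/SSTO, t = estimate/standard error). The following is a minimal Python sketch of those checks; it is not part of the original exam, the variable names are illustrative, and the numbers are copied from the simple linear regression output above.

```python
# Values copied from the simple linear regression output (grade on iq).
ss_model, ss_error, ss_total = 136.31881, 203.10809, 339.42689
df_model, df_error = 1, 76
b1, se_b1 = 0.10102, 0.01414   # slope estimate for iq and its standard error

ms_model = ss_model / df_model   # Mean Square for Model
ms_error = ss_error / df_error   # Mean Square for Error (MSE)
f_value = ms_model / ms_error    # ANOVA F statistic
root_mse = ms_error ** 0.5       # Root MSE
r_square = ss_model / ss_total   # R-Square
t_iq = b1 / se_b1                # t statistic for the slope

print(round(ms_error, 5), round(f_value, 2), round(root_mse, 5))
print(round(r_square, 4), round(t_iq, 2))
# With a single predictor, t squared should agree with F up to rounding
# of the printed estimate and standard error.
print(round(t_iq ** 2, 2))
```

Small differences between the squared t value and the F value come only from the rounding of the printed estimates.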

The REG Procedure
Model: MODEL1
Dependent Variable: grade

Analysis of Variance

                          Sum of         Mean
Source             DF    Squares       Square    F Value    Pr > F
Model               3  155.42484     51.80828      20.84    <.0001
Error              74  184.00205      2.48651
Corrected Total    77  339.42689

Root MSE            1.57687    R-Square    0.4579
Dependent Mean      7.44654    Adj R-Sq    0.4359
Coeff Var          21.17586

Parameter Estimates

                  Parameter    Standard
Variable     DF    Estimate       Error    t Value    Pr > |t|
Intercept     1    -2.25235     2.15377      -1.05      0.2991
iq            1     0.09400     0.02017       4.66      <.0001
gender        1    -3.84266     3.03670      -1.27      0.2097
iqgen         1     0.02656     0.02784       0.95      0.3432

Test sameline Results for Dependent Variable grade

                         Mean
Source           DF    Square    F Value    Pr > F
Numerator         2   9.55302       3.84    0.0259
Denominator      74   2.48651
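
The numerator and denominator mean squares in this test can be traced back to the two ANOVA tables through the general linear test, F = [(SSE(R) − SSE(F)) / (df(R) − df(F))] / [SSE(F) / df(F)]. The following is a minimal Python sketch of that calculation; it is not part of the original exam and the variable names are illustrative. It assumes the reduced model is the simple regression of grade on iq and the full model adds gender and iqgen.

```python
# Error sums of squares and degrees of freedom copied from the two ANOVA tables.
sse_reduced, df_reduced = 203.10809, 76   # reduced model: grade = iq
sse_full, df_full = 184.00205, 74         # full model: grade = iq gender iqgen

num_df = df_reduced - df_full                 # numerator degrees of freedom
num_ms = (sse_reduced - sse_full) / num_df    # numerator mean square
den_ms = sse_full / df_full                   # denominator mean square (full-model MSE)
f_sameline = num_ms / den_ms                  # general linear test statistic

print(num_df, round(num_ms, 5), round(den_ms, 5), round(f_sameline, 2))
```

The printed values can be compared directly with the Numerator and Denominator rows of the test results.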
