# Statistics 512 Divisions 1 and 4, Spring 2014 Sample Midterm I


1. In a simple linear regression problem with n = 28, the following estimates were obtained: b0 = 4.30, b1 = 0.56, s{b0} = 0.012, s{b1} = 0.034, s = 0.1385, X̄ = −0.0773, SSX = 14.598. Values of X were in the range −1.43 to 0.683.

(a) Write the simple linear regression model. Include the distributional assumption.

(b) Write the estimated regression line and the estimated error variance.

(c) If appropriate, estimate the mean value of Y for the given X. If not appropriate, say why.

i. X = 0.

ii. X = 2.

(d) Give a 95% confidence interval for the slope of the regression line.


(e) Give a 99% prediction interval for the next value of Y when X = 0.5.
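The interval calculations in parts (d) and (e) can be sketched in Python as follows. This is only an illustrative sketch built from the summary statistics quoted in the problem, not an official solution. The function names `slope_ci` and `prediction_interval` are hypothetical, and the two t critical values for n − 2 = 26 degrees of freedom are assumed to be read from a standard t table.

```python
import math

# Summary statistics quoted in Problem 1.
n = 28
b0, b1 = 4.30, 0.56
s_b1 = 0.034
s = 0.1385
x_bar = -0.0773
ssx = 14.598

# Assumption: t critical values for n - 2 = 26 degrees of freedom,
# copied from a standard t table rather than computed here.
T_975_DF26 = 2.056  # t(0.975; 26), used for a 95% interval
T_995_DF26 = 2.779  # t(0.995; 26), used for a 99% interval


def slope_ci(estimate, std_error, t_crit):
    """Two-sided confidence interval: estimate +/- t * standard error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


def prediction_interval(x_h, t_crit):
    """Prediction interval for a new observation of Y at X = x_h."""
    y_hat = b0 + b1 * x_h
    # Standard error of prediction: s * sqrt(1 + 1/n + (x_h - x_bar)^2 / SSX)
    s_pred = s * math.sqrt(1 + 1 / n + (x_h - x_bar) ** 2 / ssx)
    half_width = t_crit * s_pred
    return y_hat - half_width, y_hat + half_width


print("95% CI for slope:", slope_ci(b1, s_b1, T_975_DF26))
print("99% PI at X = 0.5:", prediction_interval(0.5, T_995_DF26))
```

The same `prediction_interval` helper should only be applied to X values inside the observed range (−1.43 to 0.683), which is the point of part (c).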

2. Short answer questions. Unless stated otherwise, each part is unrelated.

(a) If the design matrix has dimensions 20 × 4, what is the dimension of the error vector?

(b) A particular linear model has parameter values β0 = 20, β1 = 0.5, and σ² = 4. Assuming all the linear model assumptions are true, what is the probability that an observation Y would be greater than 19.4, given that X = 6?

(c) If a quarter of the data fall outside the prediction region, what is the approximate confidence level for the prediction intervals?

(d) For a model with 5 predictors and 40 observations, if R² is 0.3, what is the test statistic for the ANOVA F test?


(e) The α levels of two confidence intervals are 0.05 and 0.01. Find a lower bound on the overall coverage rate for the two confidence intervals. (In other words, the probability that both confidence intervals cover their true values is at least what?)

(f) You run a least squares regression in SAS and get an SSE value of 20 m². Your officemate claims to be able to find a line (using the same data) with an SSE of 15 m². Should you believe your officemate? Why or why not?

(g) For a particular X value, the prediction interval has length 10; the confidence interval has length 5. If SSX is 3, what is the length of the confidence interval for the slope?

(h) For a given set of data, the optimal Box-Cox transformation (of the response) is at λ = 0.25. You decide to use the “suggested” transformation and take the square root of the response. What is the optimal Box-Cox transformation for the transformed data?


(i) Using the matrices from least squares estimation, what are the entries of the vector X′e?
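Several of the short-answer parts above reduce to one-line formulas. The Python sketch below shows one possible way to evaluate three of them (parts (b), (d), and (e)) with the numbers given in the questions. It is an illustration under the usual normal-error regression assumptions, not an official answer key, and the function names are hypothetical.

```python
from statistics import NormalDist


def prob_y_exceeds(y, x, beta0, beta1, sigma2):
    """P(Y > y | X = x) when Y ~ Normal(beta0 + beta1 * x, sigma2)."""
    mean = beta0 + beta1 * x
    return 1 - NormalDist(mu=mean, sigma=sigma2 ** 0.5).cdf(y)


def f_from_r2(r2, n_obs, n_pred):
    """Overall ANOVA F statistic from R^2 (model with an intercept).

    F = (R^2 / p) / ((1 - R^2) / (n - p - 1)), with p predictors.
    """
    return (r2 / n_pred) / ((1 - r2) / (n_obs - n_pred - 1))


def bonferroni_lower_bound(alphas):
    """Bonferroni lower bound on the joint coverage of several intervals."""
    return 1 - sum(alphas)


# Part (b): beta0 = 20, beta1 = 0.5, sigma^2 = 4, X = 6, threshold 19.4.
print(prob_y_exceeds(19.4, 6, 20, 0.5, 4))
# Part (d): R^2 = 0.3, 40 observations, 5 predictors.
print(f_from_r2(0.3, 40, 5))
# Part (e): alpha levels 0.05 and 0.01.
print(bonferroni_lower_bound([0.05, 0.01]))
```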

3. Refer to the SAS output on the last pages (marked OUTPUT FOR PROBLEM 3).

The data are from a study of 78 seventh grade students. The goal is to predict GRADE (average school grade on a scale of 0 to 11) from variables which include IQ (score on an I.Q. test) and GENDER (0 = female, 1 = male).

(a) Using the output for the simple linear regression, does there appear to be a linear relationship between GRADE and IQ? Give a test statistic with degrees of freedom and p-value to support your answer (you may use other evidence as well).

(b) Individual 51 has GRADE = 0.53 and IQ = 103. What value of GRADE is predicted for this individual by the estimated simple linear regression model?

(c) The variable IQGEN is the product of IQ and GENDER. Examine the output for the model involving these three variables. Write down the estimated regression equation for this model. Also write down the two separate fitted lines for female and male students.

(d) Examine the results of the t-tests for the three regression coefficients as well as the result of the (general linear) F-test labeled “SAMELINE”. The results of this general linear test were produced with the SAS input line “test gender, iqgen;”.

State the null hypotheses tested by each of these four tests and whether that hypothesis is rejected. What apparent conflict do you see between the results of these tests?


OUTPUT FOR PROBLEM 3

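The REG Procedure

Model: MODEL1

Dependent Variable: grade

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 1 | 136.31881 | 136.31881 | 51.01 | <.0001 |
| Error | 76 | 203.10809 | 2.67247 | | |
| Corrected Total | 77 | 339.42689 | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Root MSE | 1.63477 | R-Square | 0.4016 |
| Dependent Mean | 7.44654 | Adj R-Sq | 0.3937 |
| Coeff Var | 21.95343 | | |

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| | 95% Confidence Limits (lower) | 95% Confidence Limits (upper) |
|---|---|---|---|---|---|---|---|
| Intercept | 1 | -3.55706 | 1.55176 | -2.29 | 0.0247 | -6.64766 | -0.46645 |
| iq | 1 | 0.10102 | 0.01414 | 7.14 | <.0001 | 0.07285 | 0.12919 |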

The REG Procedure

Model: MODEL1

Dependent Variable: grade

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 3 | 155.42484 | 51.80828 | 20.84 | <.0001 |
| Error | 74 | 184.00205 | 2.48651 | | |
| Corrected Total | 77 | 339.42689 | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Root MSE | 1.57687 | R-Square | 0.4579 |
| Dependent Mean | 7.44654 | Adj R-Sq | 0.4359 |
| Coeff Var | 21.17586 | | |

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| Intercept | 1 | -2.25235 | 2.15377 | -1.05 | 0.2991 |
| iq | 1 | 0.09400 | 0.02017 | 4.66 | <.0001 |
| gender | 1 | -3.84266 | 3.03670 | -1.27 | 0.2097 |
| iqgen | 1 | 0.02656 | 0.02784 | 0.95 | 0.3432 |


Test sameline Results for Dependent Variable grade

| Source | DF | Mean Square | F Value | Pr > F |
|---|---|---|---|---|
| Numerator | 2 | 9.55302 | 3.84 | 0.0259 |
| Denominator | 74 | 2.48651 | | |
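The quantities in the SAS output are linked by standard identities, so the numbers can be cross-checked by hand. The Python sketch below recomputes R² for both models and the general linear test statistic reported under “sameline”, using only sums of squares and degrees of freedom copied from the two Analysis of Variance tables. It treats the simple linear regression as the reduced model and the three-variable model as the full model. The helper names are hypothetical, and the sketch is a consistency check rather than a reproduction of what SAS computes internally.

```python
# Values copied from the two Analysis of Variance tables.
sst = 339.42689                            # corrected total sum of squares
sse_reduced, df_reduced = 203.10809, 76    # model with iq only
sse_full, df_full = 184.00205, 74          # model with iq, gender, iqgen


def r_squared(sse, total_ss):
    """R^2 = 1 - SSE / SSTO."""
    return 1 - sse / total_ss


def general_linear_f(sse_r, df_r, sse_f, df_f):
    """General linear test: F = [(SSE_R - SSE_F) / (df_R - df_F)] / (SSE_F / df_F)."""
    numerator_ms = (sse_r - sse_f) / (df_r - df_f)
    denominator_ms = sse_f / df_f
    return numerator_ms / denominator_ms


print("R^2, iq only:   ", round(r_squared(sse_reduced, sst), 4))
print("R^2, full model:", round(r_squared(sse_full, sst), 4))
print("sameline F:     ", round(
    general_linear_f(sse_reduced, df_reduced, sse_full, df_full), 2))
```

The recomputed values should agree with the R-Square entries (0.4016 and 0.4579) and the F Value of 3.84 in the output, up to rounding.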
