Assignment 2

Due by: 9am Friday 5 February

This assignment must be submitted by 9am on the due date above. Any assignment submitted after the due date and time will be given a mark of zero.

The purpose of this assignment is to give you practice in using regression analysis: building and estimating regression models, and testing certain hypotheses taken from “theory”. Lectures 5-9 provide the information needed to complete the tasks set out in this assignment.

This is a group assignment. A group of up to three students may work together and submit one assignment for the group. However, all members of the group MUST be enrolled in the same tutorial. Provided they are all enrolled in the same tutorial, all students in a group will receive the same mark for the assignment. Any student who attempts to submit an assignment with a group that is not in their own tutorial, or with a group of more than three members, will not receive any credit for that assignment. Students should form their own groups. Individuals may work alone and submit their own answers if they wish, but I would urge students to work in groups.

Your assignment answers must be no more than ten pages in total, including all graphs, tables, and written responses. Any assignment in excess of this length will be ignored during marking. Answer the questions directly. Do not undertake inappropriate tests or discuss irrelevant matters.

Students should form groups via the LMS groups sign-up link, and one member should submit the assignment on behalf of the group. More instructions will be provided on the LMS.

Students must copy and paste the template provided with the assignment into the top of the first page of their assignment answers, and complete the template before submitting their answers. It is essential that you include the name of your tutor and your allocated tutorial day and time at the top of your assignment answers so that your assignment can be graded in a timely manner.

Dr Wasana Karunarathne

Department of Economics

The University of Melbourne

Question 1 (50 Marks)

Background:

Research findings on the impact of university students' class attendance on their exam performance are ambiguous. Some research suggests a strong positive association between students attending lectures and their performance in exams. However, other research suggests that the positive impact of lecture attendance reported in the literature may be overestimated, as it reflects the impact of unobservable factors on exam marks. (Feel free to search the web for journal articles written in this area of research.)

Suppose that we are also interested in estimating the impact of lecture attendance on the final exam marks of university students. We have been given a data set collected for a sample of 680 university students. Our data set includes the variables described below.

Variable      Description

attend        Number of classes attended out of 32 classes during the semester

attend_rate   Percentage of classes attended during the semester

prior_GPA     Cumulative GPA prior to the semester of interest

UAE           Score on a university admissions examination consisting of 4 subject areas (range of marks: 1-36)

finalscore    Final exam score for the semester

skipped       Number of classes skipped

std_score     Standardised exam score, calculated as (final exam score for each student – mean of final exam score) / standard deviation of final exam score

gender        Takes the value 1 if the student is male
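
As a rough illustration of how the two derived variables relate to the raw ones, the sketch below computes attend_rate and std_score for a handful of made-up records. The numbers are hypothetical and are not taken from the assignment data set. The use of the population standard deviation is also an assumption, since the description above does not say whether the population or sample version was used.

```python
from statistics import mean, pstdev

TOTAL_CLASSES = 32  # from the description of `attend`

# Hypothetical records for illustration only; NOT the assignment data.
attend = [32, 28, 16, 24]
finalscore = [30.0, 26.0, 18.0, 22.0]

# attend_rate: percentage of the 32 classes that were attended.
attend_rate = [100 * a / TOTAL_CLASSES for a in attend]

# std_score: (score - mean of scores) / standard deviation of scores.
# Assumption: population standard deviation (pstdev). The data set may
# have been built with the sample version (statistics.stdev) instead.
mu = mean(finalscore)
sigma = pstdev(finalscore)
std_score = [(s - mu) / sigma for s in finalscore]

print(attend_rate)
print([round(z, 3) for z in std_score])
```

By construction, a standardised score has mean zero and unit standard deviation, so a value of 1 means the student scored one standard deviation above the class average.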

Notes:

1. It is sensible to use the standardised exam score in your regression, since it makes it easier to interpret a student’s performance relative to the rest of the class. For a similar reason, we would prefer to include the attendance rate in the regression rather than the number of classes attended.

2. There is evidence to suggest that UAE and prior_GPA have a quadratic relationship with the final scores, and that there is an interaction effect of prior_GPA and attendance rate on scores. Make sure that you include only relevant variables in your regression.

3. Students must follow all six steps when conducting hypothesis tests. Where necessary, students must provide EViews output as evidence (including the regression output).
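
To make the terminology in Note 2 concrete, the sketch below shows what "quadratic" and "interaction" terms look like as regressors, and why an interaction makes the effect of attendance depend on prior GPA. The function names, the example student, and the coefficient values are all hypothetical and chosen only for illustration. It is not the required model; deciding which variables are relevant remains part of the assignment.

```python
def extra_terms(attend_rate, prior_gpa, uae):
    """Squared and interaction terms named in Note 2, for one student."""
    return {
        "prior_GPA_sq": prior_gpa ** 2,                   # quadratic in prior_GPA
        "UAE_sq": uae ** 2,                               # quadratic in UAE
        "prior_GPA_x_attend_rate": prior_gpa * attend_rate,  # interaction
    }


def attendance_effect(b_attend, b_interaction, prior_gpa):
    """Change in the predicted score per one-point rise in attend_rate.

    If the model contains b_attend * attend_rate and
    b_interaction * prior_GPA * attend_rate, the effect of attendance
    is b_attend + b_interaction * prior_GPA, so it varies with prior_GPA.
    """
    return b_attend + b_interaction * prior_gpa


# Hypothetical student and hypothetical coefficients (not estimates).
terms = extra_terms(attend_rate=75.0, prior_gpa=3.0, uae=24)
effect_low_gpa = attendance_effect(b_attend=-0.05, b_interaction=0.02, prior_gpa=2.0)
effect_high_gpa = attendance_effect(b_attend=-0.05, b_interaction=0.02, prior_gpa=3.5)

print(terms)
print(effect_low_gpa, effect_high_gpa)
```

With these made-up coefficients the effect of attendance is larger for the student with the higher prior GPA, which is the kind of pattern an interaction term is meant to capture.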

Questions:

(a) Provide descriptive statistics for all the variables given in the data set. Discuss these statistics.

[4 Marks]

(b) Before estimating the regression, we are interested in looking at the impact of a range of factors (including class attendance, given by the attendance rate) on the standardised final score for the term. Using appropriate graphs, depict the relationship between each of the independent variables and the dependent variable. Also, using appropriate statistics to measure the linear relationships, discuss the nature of the relationship between these variables.

[2 Marks]

(c) From the graphs in part (b), identify whether any of the six OLS assumptions discussed in class is violated. Clearly explain why you think the assumption(s) might be violated.

[3 Marks]

(d) Write down a population regression model for the final score (not the standardised score) using the variables in the data set. Use all the relevant variables, even if you believe they may not have an impact on the final score, and use the attendance rate to estimate the impact of attendance on the final score.

[2 Marks]

(e) Estimate the population regression model in part (d) above. Report the results.

[2 Marks]

(f) Conduct any statistical tests needed to determine whether the OLS assumptions (for the error variable) are violated. If any of the assumptions is violated, take the necessary action and re-estimate the regression (otherwise you can leave it as it is). If you have a new regression model, report the results and provide the EViews output as well.

[4 Marks]

(g) Tom believes that gender explains the variation in exam scores. Interpret the coefficient on gender and test his claim.

[4 Marks]

(h) Using the appropriate model, write an equation to show the impact of attendance on the final exam score.

[2 Marks]

(i) What is the impact of lecture attendance on final scores for students who have the average prior cumulative GPA?

[2 Marks]

(j) Is there any statistical evidence to suggest that attendance at lectures affects exam scores?

[4 Marks]

(k) Using the p-value approach, test for the overall utility of the model you estimated.

[3 Marks]

(l) Is this regression a reasonable fit?

[2 Marks]

(m) Suppose that you are now interested in estimating the impact of the given variables on the standardised exam scores (rather than raw scores). Rewrite the population regression model with the new dependent variable. If you estimate this regression model, what changes do you expect in the estimated regression coefficients?

[3 Marks]

(n) Estimate the regression for part (m) without the variable “gender”. Report the results.

[4 Marks]

(o) Interpret the goodness of fit of the new model. Based on the goodness-of-fit measure, can we say that model 1 (estimated in part (e)) is better at explaining the variation in test scores than model 2 (estimated in part (n))?

[4 Marks]

(p) Do you agree with the concern of some researchers that the estimated effect of attendance on test scores is overestimated? Using the regression models estimated above, clearly explain your reasoning.

[5 Marks]

END OF ASSIGNMENT

