# Your company would like you to run a controlled experiment

(This assignment is worth 45 points total. Each part (for example, 1 b. or 3 a.) is worth 3 points.)

1. You work for a company that has developed a computer-based individual tutoring program for 2nd graders, in which students use recess or after-school time to do computer activities designed to improve their reading skills. The cost of buying the technology for a school building is fairly high. You want to know whether using this new tutoring program will really help improve student learning. Suppose that you have the cooperation of a large school district in which there are over 2000 2nd graders in thirty school buildings.

a. Your company would like you to run a controlled experiment to determine the effect of the new program on scores obtained by the second graders on a reading test administered by the district at the end of the school year. Explain how you would design the experiment to estimate the causal effect of the new program on 2nd grade reading scores.

b. Now suppose that the company has already released the new program, and many schools are already having some of their second graders use it (while many others are not). You do not have any information on the test scores of individual students, but you do have a sample that tells you, for 200 school districts, the average 2nd grade reading score for all second graders in the district last year (avgscore), and the percentage of 2nd graders in the district who used the new tutoring program last year (pcttutor). Write the population model as

$$avgscore = \beta_0 + \beta_1 \, pcttutor + u,$$

where $u$ is the population error term. Suppose that the value of $\beta_1$ in this population model is positive. Does this mean that a district with a higher percentage of students using the new program will always have a higher reading score than a district with a lower percentage of students using the new program? Explain.

c. Are there any important differences between the following two statements about the relationship between avgscore and pcttutor?
Explain.

(i) In districts where fewer 2nd graders use the program, average reading scores are lower.

(ii) If a district assigns fewer 2nd graders to use this new tutoring program, then its average test score will go down.

d. Do you think it is likely that the percentage of second graders in a district who are using this new tutoring program is independent of the other factors that affect the average 2nd grade reading score in the district? Explain.

e. Does the sample described in part b of this question consist of cross-sectional data, time series data, or panel data?

2. Suppose a garbage incinerator was built in a community. You are interested in estimating the effect of a house's distance from the incinerator in miles (dist) on the price at which that house can be sold, measured in tens of thousands of dollars (price). Write the population model relating distance to price as follows:

$$\log(price) = \beta_0 + \beta_1 \, dist + u.$$

a. Explain what $u$ represents in the population model above. Give an example of a factor that is likely to be included in $u$.

b. Do you think there is likely to be any correlation between $u$ and dist? Explain. (Think about how the city might have decided where to build the new incinerator.)

Suppose that using a sample of 300 houses sold in the community the year after the incinerator was built, the above population model is estimated by ordinary least squares (OLS) with the following result:

$$\log(price) = 0.084 + 0.0212 \, dist + \hat{u},$$

in which $R^2 = .122$.

c. Explain the difference between $u$ in the population regression model and $\hat{u}$ in the sample regression equation.

d. Interpret the estimated coefficient on dist. Is the sign of this estimate what you would expect it to be? Explain and round your answers to two decimal places.

e. What does the $R^2$ of this sample regression measure? Interpret this $R^2$.

3. Using the data in the Stata data set attend.dta, estimate the following population model by OLS:

$$priGPA = \beta_0 + \beta_1 \, ACT + u,$$

in which ACT is the student's score on the ACT test, an aptitude test taken by American high school students, and priGPA is the student's GPA on a 4 point scale at the time the class begins.

a. In this sample, what is the standard deviation of ACT, and what is the mean of priGPA? (Use the sum command in Stata to find this out and round your answers to four decimal places.)

b. Using the reg command in Stata, find the OLS estimates of $\beta_0$ and $\beta_1$ and round your answers to four decimal places. Give a verbal interpretation of the estimated coefficient of ACT. (Note: Whenever you are asked for a verbal interpretation of the coefficient in a regression, you should be sure that your answer is expressed in terms of the units in which the relevant variables are measured (in this case, ACT is measured in points, and priGPA is measured in points), and you should remember to specify "other things equal" or "holding other things constant." So, in this case, your answer should take the form "Other things equal, a 1 point increase in the student's ACT score is associated with a ___________ point ___________ in the student's GPA at the time the class begins.")

c. What would the predicted difference in GPA at the time the class started be between a student who scored a 20 on the ACT and a student who scored 30 on the ACT? (Round your answers to four decimal places.)

d. The list command in Stata displays the values of variables in a dataset. For example, typing list X in 5 would give you the value of the variable X for the 5th observation in the dataset. Using the list command in Stata, find the residual for the 100th student in the data set. That is, what is the actual value of priGPA minus the predicted value of priGPA for this student? (Round your answers to four decimal places.)

e. Estimate a regression in which final (score out of 40 on the final exam for the class) is the dependent variable and termGPA (GPA for the current semester) is the independent variable. Then estimate a regression in which final is the dependent variable and ACT is the independent variable. Which regression does a better job of predicting final exam performance? Explain.
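The Stata commands named in question 3 could be strung together roughly as follows. This is only a sketch of one possible workflow, not a required approach. It assumes attend.dta sits in the current working directory. The predict command and the variable names priGPA_hat and uhat do not appear in the assignment and are introduced here purely for illustration.

```stata
* Sketch only: assumes attend.dta is in the current working directory
use attend.dta, clear

* Part a: summary statistics (sum is an abbreviation of summarize)
summarize ACT priGPA

* Part b: OLS regression of priGPA on ACT (reg is an abbreviation of regress)
regress priGPA ACT

* Part c: predicted GPA difference for a 10-point ACT gap,
* using the slope stored by the regression just run
display _b[ACT] * (30 - 20)

* Part d: predict is an assumption (not named in the assignment);
* priGPA_hat and uhat are hypothetical variable names
predict priGPA_hat, xb
predict uhat, residuals
list priGPA priGPA_hat uhat in 100

* Part e: the two regressions to compare
regress final termGPA
regress final ACT
```

The predict commands are placed directly after the priGPA regression because predict uses the most recent estimation results. Run after the part e regressions, they would instead return fitted values and residuals for final.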

