7 7th Turorial
7.1 Recap
Previously we have been discussing multivariate linear regression that have the formula \(\textbf{Y} = \textbf{X}\beta + \epsilon\). Now assume we have two linear model \(M_1\) and \(M_2\) where \(M_2\) is a simplification of \(M_1\) such as (\(M_1: y_i = \beta_0 + \beta_1~x_{1_{i}} + \beta_2~x_{2_{i}}+ \beta_3~x_{3_{i}}\) and \(M_2: y_i = \beta_0 + \beta_1~x_{1_{i}}\)) A question arises here is which model is preferred and that is equivalent to test
\[H_0 = \beta_2 = \beta_3 =0 ~~~ vs ~~~ H_1 = \beta_2 \neq 0 , \beta_3 \neq0 \]
The decision rule is to reject \(H_0\) if \[F = \frac{(D_2-D_1)/q}{D_1/(n-p)} > F_{q,n−p,\alpha}\]
where \(n\) is the number of observations, \(p\) is the number of parameters in \(M_1\) and \(q\) is the number of parameters fixed to reduce \(M_1\) to \(M_2\). For the example above, \(p = 4\) and \(q = 2\), and \(D_1\) and \(D_2\) are the SSR of \(M_1\) and \(M_2\) respectively. Equivalently, we could use the notation \(SSE(...)\). For instance \(SSE(X_1,X_2,X_3)\) denotes the sum of squared error for a multiple linear regression that includes \(X_1\), \(X_2\) and \(X_3\) to draw the model.
7.2 Exercises
Exercise 7.1 Download the csv file for the dataset here
- Estimate the liner model for the given data and interpret its coefficients.
- Discuss the efficiency of the model by two different approaches.
- Write the ANOVA table that factorize the sum square regression \(X_1\) and \(X_2\) given \(X_1\).
- Use partial F to test whether you can remove \(X_2\) from model.
- Calculate \(R^2\) , \(r^2_{Y,2.1}\) , \(r_{Y,1.2}\) and \(r^2_{Y,2}\)
- Estimate the corresponding standard model and discuss its coefficient.