The data are n = 30 observations on driver age and the maximum distance (feet) at which individuals can read a highway sign (Sign Distance data). (Data source: Mind On Statistics, 3rd edition, Utts and Heckard)

The plot below gives a scatterplot of the highway sign data along with the least squares regression line.

Here is the accompanying Minitab output, which is found by performing Stat >> Regression >> Regression on the highway sign data.

Regression Analysis: Distance, Age

Coefficients
Regression Equation

Distance = 577 - 3.01 Age

Hypothesis Test for the Intercept (\(\beta_{0}\))

This test is rarely of interest, but it does come up when one is considering a regression through the origin (which we touched on earlier in this lesson). In the Minitab output above, the row labeled Constant gives the information used to make inferences about the intercept. The null and alternative hypotheses for a hypothesis test about the intercept are written as:

\(H_{0} \colon \beta_{0} = 0\)
\(H_{A} \colon \beta_{0} \ne 0\)

In other words, the null hypothesis is that the population intercept is equal to 0, and the alternative hypothesis is that the population intercept is not equal to 0. In most problems, we are not particularly interested in hypotheses about the intercept. For instance, in our example, the intercept is the mean distance when the age is 0, a meaningless age. Also, the intercept does not give information about how the value of y changes when the value of x changes. Nevertheless, to test whether the population intercept is 0, the information from the Minitab output is used as follows:
So how exactly is the p-value found? For simple regression, the p-value is determined using a t distribution with n − 2 degrees of freedom (df), which is written as \(t_{n-2}\), and is calculated as 2 × area past |t| under a \(t_{n-2}\) curve. In this example, df = 30 − 2 = 28. The p-value region is the type of region shown in the figure below. The negative and positive versions of the calculated t provide the interior boundaries of the two shaded regions. As the absolute value of t increases, the p-value (area in the shaded regions) decreases.

[Figure: t distribution curve with both tails shaded, beyond −|t| and |t|; the p-value is 2 × the area to the right of \(\mid t \mid\)]

Hypothesis Test for the Slope (\(\beta_{1}\))

This test can be used to test whether or not x and y are linearly related. The row pertaining to the variable Age in the Minitab output from earlier gives information used to make inferences about the slope. The slope directly tells us about the link between the mean y and x. When the true population slope does not equal 0, the variables y and x are linearly related. When the slope is 0, there is no linear relationship, because the mean y does not change when the value of x is changed. The null and alternative hypotheses for a hypothesis test about the slope are written as:

\(H_{0} \colon \beta_{1} = 0\)
\(H_{A} \colon \beta_{1} \ne 0\)

In other words, the null hypothesis is that the population slope is equal to 0, and the alternative hypothesis is that the population slope is not equal to 0. To test whether the population slope is 0, the information from the Minitab output is used as follows:
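The slope test can be reproduced from the estimate and standard error quoted later in this lesson (3.0068 and 0.4243). The Python sketch below is only an illustration, not the Minitab procedure. It assumes the slope is negative, as the regression equation indicates, and it uses two hypothetical helper functions, `t_pdf` and `t_upper_tail`, that approximate the tail area of a t distribution by numerical integration. In practice a statistics library would normally supply this.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t_abs, df, steps=20000):
    """Area to the right of t_abs (t_abs >= 0) under a t curve.

    Uses Simpson's rule on [0, t_abs] (steps must be even) and subtracts
    the result from 0.5, the total area to the right of 0.
    """
    if t_abs == 0:
        return 0.5
    h = t_abs / steps
    total = t_pdf(0.0, df) + t_pdf(t_abs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n = 30            # number of observations
b1 = -3.0068      # sample slope (sign assumed from the regression equation)
se_b1 = 0.4243    # standard error of the slope

df = n - 2                                    # 28 degrees of freedom
t_stat = b1 / se_b1                           # test statistic for H0: beta_1 = 0
p_value = 2 * t_upper_tail(abs(t_stat), df)   # 2 x area past |t|

print(f"t = {t_stat:.2f} on {df} df")
print(f"two-sided p-value = {p_value:.2e}")
```

The test statistic is about −7.09, so the two-sided p-value is far below 0.001, and the null hypothesis of a zero slope is rejected at any conventional significance level.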
As before, the p-value is the region illustrated in the figure above.

Confidence Interval for the Slope (\(\beta_{1}\))

A confidence interval for the unknown value of the population slope \(\beta_{1}\) can be computed as

sample statistic ± multiplier × standard error of statistic → \(b_{1}\) ± t* × se(\(b_{1}\))

To find the t* multiplier, you can do one of the following:
95% Confidence Interval

In our example, n = 30 and df = n − 2 = 28. For 95% confidence, t* = 2.05. A 95% confidence interval for \(\beta_{1}\), the true population slope, is:

−3.0068 ± (2.05 × 0.4243) = −3.0068 ± 0.870, or about −3.88 to −2.14

Interpretation: With 95% confidence, we can say the mean sign reading distance decreases somewhere between 2.14 and 3.88 feet for each one-year increase in age. It is incorrect to say that with 95% probability the mean sign reading distance decreases somewhere between 2.14 and 3.88 feet for each one-year increase in age. Make sure you understand why!

99% Confidence Interval

For 99% confidence, t* = 2.76. A 99% confidence interval for \(\beta_{1}\), the true population slope, is:

−3.0068 ± (2.76 × 0.4243) = −3.0068 ± 1.171, or about −4.18 to −1.84

Interpretation: With 99% confidence, we can say the mean sign reading distance decreases somewhere between 1.84 and 4.18 feet for each one-year increase in age.

Notice that as we increase our confidence, the interval becomes wider. As we approach 100% confidence, the interval grows toward the whole real line.

As a final note, the above procedures can be used to calculate a confidence interval for the population intercept. Just use \(b_{0}\) (and its standard error) rather than \(b_{1}\).
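The two intervals can be checked with a few lines of Python. This is a minimal sketch that plugs in the values quoted in this lesson, with the slope entered as negative to match the regression equation. The function name `slope_interval` is a hypothetical helper, not part of any library.

```python
def slope_interval(b1, se_b1, t_star):
    """Return (lower, upper) for b1 ± t* × se(b1)."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin

b1 = -3.0068      # sample slope (sign assumed from the regression equation)
se_b1 = 0.4243    # standard error of the slope

# t* multipliers for df = 28, as quoted in the lesson
multipliers = {95: 2.05, 99: 2.76}

intervals = {}
for level, t_star in multipliers.items():
    intervals[level] = slope_interval(b1, se_b1, t_star)
    lower, upper = intervals[level]
    print(f"{level}% CI for the slope: ({lower:.2f}, {upper:.2f})")

# prints:
# 95% CI for the slope: (-3.88, -2.14)
# 99% CI for the slope: (-4.18, -1.84)
```

The same function gives an interval for the intercept if \(b_{0}\) and its standard error are passed in place of the slope values.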