Here are my answers to the Week 3 Quiz Activity of the course Inferential Statistics with R, offered on Coursera and taught by Mine Çetinkaya-Rundel.

This is an R Markdown file. For better viewing, it can be forked and knitted in RStudio to an HTML file, or it can be viewed directly as an RPubs publication.

We will use the devtools package to install the statsr package associated with this course. First, install and load devtools by typing the following commands in the Console:

install.packages("devtools") library(devtools)

Now we can install the rest of the packages we will use during the course. Type the following commands in the Console as well:

install.packages("dplyr") install.packages("ggplot2") install.packages("shiny") install_github("StatsWithR/statsr")

  1. People of different ages were asked to stand on a “force platform” and maintain a stable upright position. The “wiggle” of the board in the forward-backward direction is recorded; more wiggle corresponds to less balance. The participants are divided into two age groups: young and elderly. The average wiggle among elderly people was 26.33 mm, and the average among young people was 18.125 mm. The bootstrap distribution for the difference in means is shown below, based on 100 bootstrap samples. Of the following choices, which is the most accurate 90% bootstrap confidence interval for the true difference in means?
    1. (3.75 mm, 15 mm)
    2. (2.5 mm, 18 mm)
    3. (5 mm, 15 mm)
    4. (3 mm, 17 mm)
    5. The most accurate 90% bootstrap confidence interval for the true difference in means is (3.75 mm, 15 mm).

Construct bootstrap confidence intervals using one of the following methods:

  • Percentile method: the XX% confidence interval is the middle XX% of the bootstrap distribution.
  • Standard error method: If the standard error of the bootstrap distribution is known, and the distribution is nearly normal, the bootstrap interval can also be calculated as \(\bar{x}_{boot} \pm z^{*} {SE}_{boot}\).

Recognize that when the bootstrap distribution is extremely skewed and sparse, the bootstrap confidence interval may not be appropriate.

For a 90% confidence interval we want to leave out 10% of the bootstrap statistics, i.e. 5% in each tail. With 100 bootstrap samples, that means counting off the 5 bootstrap sample statistics at each end of the bootstrap distribution to determine the endpoints of the confidence interval.
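As a rough sketch of both methods in R (the `wiggle_diffs` vector below is made up; it just stands in for the 100 bootstrap difference-in-means statistics):

```r
# Hypothetical bootstrap statistics: 100 differences in means (made-up numbers)
set.seed(42)
wiggle_diffs <- rnorm(100, mean = 8.2, sd = 3.5)

# Percentile method: the middle 90% of the bootstrap distribution
quantile(wiggle_diffs, probs = c(0.05, 0.95))

# Standard error method (only if the bootstrap distribution is nearly normal)
mean(wiggle_diffs) + c(-1, 1) * qnorm(0.95) * sd(wiggle_diffs)
```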

  1. Which of the following is false regarding paired data?
    1. Two data sets of different sizes cannot be analyzed as paired data.
    2. In a paired analysis we first subtract the paired observations from each other, and then do inference on the differences.
    3. Each observation in one data set has a natural correspondence with exactly one observation from the other data set.
    4. Each observation in one data set is subtracted from the average of the other data set’s observations.
    5. It is false that each observation in one data set is subtracted from the average of the other data set’s observations.

This question refers to the following:

  • Define observations as paired if each observation in one dataset has a special correspondence or connection with exactly one observation in the other data set.
  • Carry out inference for paired data by first subtracting the paired observations from each other, and then treating the set of differences as a new numerical variable on which to do inference (such as a confidence interval or hypothesis test for the average difference).

It doesn’t make sense to subtract each observation in one data set from the average of the other data set’s observations; instead, we subtract the paired observations from each other.
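A minimal sketch of that workflow in R, using hypothetical paired measurements (`before` and `after` are made-up vectors):

```r
# Hypothetical paired measurements (made-up numbers)
before <- c(12, 15, 11, 14, 13, 16, 12, 15)
after  <- c(10, 14, 10, 13, 12, 14, 11, 13)

# Step 1: subtract the paired observations from each other
diffs <- after - before

# Step 2: do one-sample inference (CI and test) on the differences
t.test(diffs)
```

The same result can be obtained directly with `t.test(after, before, paired = TRUE)`.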

  1. Which of the following is false about bootstrap and sampling distributions?
    1. Bootstrap distributions are centered at the sample statistic, sampling distributions are centered at the population parameter.
    2. Both distributions are created by sampling with replacement from the population.
    3. Both distributions are comprised of sample statistics.
    4. Both distributions get narrower as the standard deviation decreases.
    5. The bootstrap distribution is not created by sampling with replacement from the population.

Bootstrap distributions are constructed by resampling, with replacement, from the sample, while sampling distributions are constructed by sampling from the population.
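A small simulation can make the distinction concrete (the `population` vector here is entirely made up):

```r
set.seed(123)
population <- rnorm(10000, mean = 50, sd = 10)   # hypothetical population
observed_sample <- sample(population, size = 30) # the one sample we actually have

# Sampling distribution: statistics from many samples drawn from the population
sampling_dist <- replicate(1000, mean(sample(population, size = 30)))

# Bootstrap distribution: statistics from resampling the sample, with replacement
bootstrap_dist <- replicate(1000, mean(sample(observed_sample, size = 30, replace = TRUE)))

c(sampling = mean(sampling_dist),    # centered near the population mean
  bootstrap = mean(bootstrap_dist))  # centered near the sample mean
```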

  1. A study examining the relationship between the weight of school children (4th to 6th graders) and the number of school days they miss found a 95% confidence interval for the difference between the average number of school days missed by overweight and normal weight children (\(\mu_{overweight} - \mu_{normal}\)) to be 1.3 to 2.8 days. Which of the following is true based on this confidence interval?
    1. We are 95% confident that overweight children on average miss 1.3 to 2.8 days more than children with normal weight.
    2. At 5% significance level, the data do not provide convincing evidence of a difference between the average number of missed days by overweight and normal weight children.
    3. We are 95% confident that overweight children on average miss 1.3 to 2.8 days fewer than children with normal weight.
    4. At 10% significance level, the data do not provide convincing evidence of a difference between the average number of missed days by overweight and normal weight children.
    5. We are 95% confident that overweight children on average miss 1.3 to 2.8 days more than children with normal weight.

This question refers to the following:

  • Recognize that a good interpretation of a confidence interval for the difference between two parameters includes a comparative statement (mentioning which group has the larger parameter).
  • Recognize that a confidence interval for the difference between two parameters that doesn’t include 0 is in agreement with a hypothesis test where the null hypothesis that sets the two parameters equal to each other is rejected.
  1. When doing inference on a single mean, which of the following is the correct justification for using the t-distribution rather than the normal distribution?
    1. Because the standard error estimate may not be accurate.
    2. Because the t-distribution is not symmetric.
    3. All of the above.
    4. None of the above.
    5. That is because the standard error estimate may not be accurate.

With a small sample size, our estimate of the standard error, \(s/\sqrt{n}\), is not reliable, since the sample standard deviation \(s\) may not be a reliable estimate of the population standard deviation \(\sigma\) when the sample size is small. We make up for this by using the t-distribution instead of the normal distribution.
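For example, with a hypothetical small sample (n = 10, sample standard deviation 4), the t-based margin of error is noticeably wider than the z-based one:

```r
# Hypothetical small-sample summary: n = 10 observations, sample sd of 4
n  <- 10
s  <- 4
se <- s / sqrt(n)

qt(0.975, df = n - 1) * se   # 95% margin of error using t (wider, more conservative)
qnorm(0.975) * se            # 95% margin of error using z (narrower)
```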

  1. The figure below shows three t-distribution curves. Which curve has the highest degree of freedom?
    1. Dotted.
    2. Solid.
    3. Dashed.
    4. It is the solid one.

This question refers to the following:

  • Describe how the t-distribution is different from the normal distribution, and what “heavy tail” means in this context.
  • Note that the t-distribution has a single parameter, degrees of freedom, and as the degrees of freedom increases this distribution approaches the normal distribution.

As the degrees of freedom increases, the t distribution approaches the normal distribution, and t distributions with lower degrees of freedom have heavier tails than t distributions with higher degrees of freedom.
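One way to see this numerically in R is to compare upper-tail cutoffs as the degrees of freedom grow; they approach the normal cutoff of about 1.96:

```r
# 97.5th percentiles of t distributions with increasing degrees of freedom
sapply(c(2, 5, 10, 30, 100, 1000), function(df) qt(0.975, df = df))

# The corresponding normal cutoff, approximately 1.96
qnorm(0.975)
```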

  1. My friend, Tom, believes that his supermarket’s prices are lower than mine, and sets an alternative hypothesis test reflecting this. We construct a list of 10 identical items and purchase them at our respective stores. Tom wants to know if these data support his hypothesis. Which of the following is the correct description of Tom’s situation?
    1. Tom has a two-sided alternative hypothesis and should do a paired t-test.
    2. Tom has a one-sided alternative hypothesis and should do a paired t-test.
    3. Tom has a one-sided alternative hypothesis and should do an independent samples t-test.
    4. Tom has a two-sided alternative hypothesis and should do an independent samples t-test.
    5. Actually, Tom has a one-sided alternative hypothesis and should do a paired t-test.

The test is a paired test because the same 10 items were bought at each store; i.e. each observation in one data set has a special correspondence to exactly one observation in the other data set.
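A sketch of how Tom’s test could be run in R, with made-up prices for the 10 items:

```r
# Hypothetical prices (in dollars) for the same 10 items at each store
my_prices  <- c(2.49, 3.10, 1.25, 4.50, 2.05, 3.35, 5.10, 2.75, 1.80, 3.95)
tom_prices <- c(2.39, 2.95, 1.15, 4.60, 1.95, 3.05, 4.99, 2.55, 1.75, 3.80)

# One-sided, paired: is the mean difference (Tom's price - my price) less than 0?
t.test(tom_prices, my_prices, paired = TRUE, alternative = "less")
```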

  1. Suppose that a one-tail t test is being applied to find out if the population mean is less than 100. The level of significance is 0.05 and 25 observations were sampled. The null hypothesis would be rejected for t scores in which of the following regions? Choose the best answer.
    1. T < 1.96
    2. T > 1.71
    3. T < -1.71
    4. T < -1.32
    5. T > 1.32
    6. The null hypothesis would be rejected for t scores in the region T < -1.71.
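The cutoff can be computed in R with `qt()`; the call below reproduces the value printed underneath:

```r
# Lower 5% cutoff of the t distribution with 25 - 1 = 24 degrees of freedom
qt(0.05, df = 24)
```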
## [1] -1.710882

This question refers to the following:

  • Use a T-statistic, with degrees of freedom \(df=n−1\) for inference for a population mean.
  • Use a T-statistic, with degrees of freedom \(df=min(n_{1}−1, n_{2}−1)\), for inference for the difference between two population means using data from two small samples.
  • Describe how to obtain a p-value and a critical t-score (\(t^{*}_{df}\)) for a confidence interval.

The degrees of freedom are \(df = 25 - 1 = 24\), the alternative hypothesis is “less than 100”, and the significance level is 5%, so we can use the t-table to find the cutoff value corresponding to the lower 5% of the \(t_{24}\) curve.

  1. We would like to test if students who are in the social sciences, natural sciences, arts and humanities, and other fields spend the same amount of time studying for this course. What type of test should we use?
    1. F-test (ANOVA)
    2. z-test
    3. t-test for two dependent groups
    4. t-test for two independent groups
    5. The type of test we should use is ANOVA: there are several groups, and we’re comparing averages.
  2. Which of the following is not a condition required for comparing means across multiple groups using ANOVA?
    1. The observations should be independent within and across groups.
    2. The data within each group should be nearly normal.
    3. There should be at least 10 successes and 10 failures.
    4. The variability across the groups should be about equal.
    5. The condition which is not required for comparing means across multiple groups using ANOVA is “There should be at least 10 successes and 10 failures”.

List the conditions necessary for performing ANOVA

  • the observations should be independent within and across groups
  • the data within each group are nearly normal
  • the variability across the groups is about equal

and use graphical diagnostics to check if these conditions are met.

The success-failure condition is relevant for inference on categorical variables (proportions), not for comparing means.
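A sketch of those graphical diagnostics in R, using a hypothetical `study_data` data frame (made-up study hours for students in four fields):

```r
library(ggplot2)

# Hypothetical data: weekly study hours for students in four fields (made-up numbers)
set.seed(7)
study_data <- data.frame(
  field = rep(c("social", "natural", "arts", "other"), each = 20),
  hours = rnorm(80, mean = 6, sd = 2)
)

# Roughly equal variability across groups?
ggplot(study_data, aes(x = field, y = hours)) +
  geom_boxplot()

# Nearly normal data within each group?
ggplot(study_data, aes(sample = hours)) +
  stat_qq() +
  stat_qq_line() +
  facet_wrap(~ field)
```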

  1. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam.

    Which of the following is the correct degrees of freedom for an F-test for evaluating if the average test scores are different for the different teaching methods?
    1. \({df}_G = 4\) and \({df}_E = 44\)
    2. \({df}_G = 40\) and \({df}_E = 4\)
    3. \({df}_G = 4\) and \({df}_E = 40\)
    4. \({df}_G = 5\) and \({df}_E = 45\)
    5. \({df}_G = 45\) and \({df}_E = 4\)
    6. The correct degrees of freedom are \({df}_G = 4\) and \({df}_E = 40\).

This question refers to the following learning objective(s): Recognize that the test statistic for ANOVA, the F statistic, is calculated as the ratio of the mean square between groups (MSG, variability between groups) and the mean square error (MSE, variability within groups). Also recognize that the F statistic has a right-skewed distribution with two different measures of degrees of freedom: one for the numerator (\({df}_{G}=k−1\), where \(k\) is the number of groups), and one for the denominator (\({df}_{E}=n−k\), where \(n\) is the total sample size).

The group degrees of freedom is the number of levels (categories) minus 1 (\(k−1=5−1=4\)) and the error degrees of freedom is the sample size minus the number of levels (\(n−k=45−5=40\)).
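A quick check with simulated data (the scores are made up; only the `Df` column of the ANOVA table matters here):

```r
# Simulated data: 45 hypothetical exam scores, 9 students per teaching method
set.seed(1)
scores <- data.frame(
  method = rep(paste("method", 1:5), each = 9),
  score  = rnorm(45, mean = 75, sd = 10)
)

# The ANOVA table reports Df = 4 for method (groups) and Df = 40 for residuals (error)
summary(aov(score ~ method, data = scores))
```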

  1. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam. We are interested in finding out if the average test scores are different for the different teaching methods.

    The p-value of the test is 0.0168. What is the conclusion of the test?
    1. At most two group means are significantly different from each other.
    2. All five group means are significantly different from each other.
    3. All five group means are equal to each other.
    4. Only two group means are significantly different from each other.
    5. At least two group means are significantly different from each other.
    6. The conclusion of the test is that at least two group means are significantly different from each other.

Recognize that the null hypothesis in ANOVA sets all means equal to each other, and the alternative hypothesis suggests that at least one mean is different.

  • \(H_0\): \(\mu_1 = \mu_2 = ... = \mu_k\)
  • \(H_A\): at least one mean is different

The p-value (0.0168) is low, so we reject the null hypothesis.

  1. For given values of the sample mean and the sample standard deviation when n = 25, you conduct a hypothesis test and obtain a p-value of 0.0667, which leads to non-rejection of the null hypothesis. What will happen to the p-value if the sample size increases (and all else stays the same)?
    1. Decrease
    2. Stay the same
    3. May either increase or decrease
    4. Increase
    5. As the sample size increases, the p-value decreases.

This question refers to the following:

  • Use a t-statistic, with degrees of freedom \(df=n−1\) for inference for a population mean:
\begin{align} \text{CI:} \ \ \bar{x} \pm t^{*}_{df}SE && \text{HT:} \ \ T_{df} = \frac{\bar{x} - \mu}{SE} \end{align}

where \(SE=s/\sqrt{n}\).

  • Use a t-statistic, with degrees of freedom \(df=n_{diff}−1\) for inference for the difference in two paired (dependent) means:
\begin{align} \text{CI:} \ \ \bar{x}_{diff} \pm t^{*}_{df}SE && \text{HT:} \ \ T_{df} = \frac{\bar{x}_{diff} - \mu_{diff}}{SE} \end{align}

where \(SE=s_{diff}/\sqrt{n_{diff}}\). Note that \(\mu_{diff}\) is often 0, since often \(H_0 : \mu_{diff}=0\).

As the sample size increases the standard error will decrease, which increases the test statistic, and hence decreases the p-value (the tail area).
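A quick numerical illustration (the summary statistics below are made up, not the ones from the question; only the sample size changes):

```r
# Hypothetical summary statistics: same sample mean and sd, increasing n
xbar <- 103
mu0  <- 100
s    <- 11

for (n in c(25, 50, 100)) {
  se     <- s / sqrt(n)
  t_stat <- (xbar - mu0) / se
  p_val  <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)  # two-sided p-value
  cat("n =", n, " t =", round(t_stat, 2), " p-value =", signif(p_val, 3), "\n")
}
```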

  1. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam. We are interested in finding out if the average test scores are different for the different teaching methods.

    If the original significance level for the ANOVA was 0.05, what should be the adjusted significance level for the pairwise tests to compare all pairs of means to each other?
    1. 0.005
    2. 0.01
    3. 0.5
    4. 0.05
    5. 0.025

```r
alpha <- 0.05
k <- 5
comparisons <- k * (k - 1) / 2
alpha / comparisons
```

## [1] 0.005

This question refers to the following learning objective(s): Describe why conducting many t-tests for differences between each pair of means leads to an increased Type 1 Error rate, and we use a corrected significance level (Bonferroni correction, \(\alpha^{\star}=\alpha/K\), where \(K\) is the number of comparisons being considered) to combat inflating this error rate.

  • Note that \(K=\frac{k(k−1)}{2}\), where \(k\) is the number of groups.

\(K = \frac{5(5-1)}{2} = 10\), so \(\alpha^{*} = 0.05/10 = 0.005\).