Exam 14: Learning From Experiment Data
Testing hypotheses about differences in experimental treatments requires some
modifications to the procedures for testing hypotheses about differences in population
characteristics.
a) How are the hypotheses different?
b) How are the conditions different?
c) How are the conclusions different?
a) The hypotheses would have to be worded in terms of treatment means or
proportions rather than population means or proportions.
b) For experiments, the individuals or objects must be randomly assigned to
treatments, whereas in testing hypotheses using data from sampling, the samples
must be randomly selected.
For experiments, each treatment group must be large, or the distribution of the
response variable under each treatment must be approximately normal. For data
from sampling, the assumption is that the samples are large or the population
distributions are approximately normal.
c) Conclusions would need to be worded in terms of treatment means or proportions.
If random sampling from a population preceded the random assignment to
treatments, it would be reasonable to generalize the conclusions about treatment
effects to the population.
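The distinction between random selection and random assignment in part (b) can be sketched in a few lines. This is only an illustration under assumed names: the function `sample_then_assign`, the population of 1000 numbered individuals, and the seed are all hypothetical and not part of the exam.

```python
import random

def sample_then_assign(population, n_sample, seed=0):
    """Randomly select a sample, then randomly assign it to two treatments."""
    rng = random.Random(seed)                  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, n_sample)  # random selection: supports generalizing to the population
    shuffled = sample[:]
    rng.shuffle(shuffled)                      # random assignment: supports cause-and-effect conclusions
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical population of 1000 individuals identified by number.
population = list(range(1000))
group_1, group_2 = sample_then_assign(population, 88)
print(len(group_1), len(group_2))
```

If the selection step were skipped and volunteers were used instead, the assignment step alone would still justify comparing the treatments, but not extending the result to the population.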
Osteoarthritis of the knee is a degenerative disease causing joint pain, stiffness, and
decreased function. Usual treatment is a combination of physical therapy,
medication, and arthroscopic surgery. The arthroscopic surgery involves removal of
cartilage fragments and calcium crystals, or the smoothing of bones to eliminate
difficulties with joint motion. Arthroscopic surgery is widely used, but it is not clear
how, or even if, it works. In a recent study, patients were randomly assigned to two
treatment groups. Patients in the arthroscopic surgery group received standard
arthroscopic treatment plus physical therapy and medicine. Patients in the other
group received only physical therapy and medicine. Two years after treatment the
patients were evaluated using standard scales for pain, stiffness, and physical
function. Higher scores indicate increased pain, increased stiffness, and decreased
physical function. Physical function data from the study are summarized below.
Graphical displays of the data indicate that it is reasonable to assume that the two
physical function score distributions are approximately normal.

Physical Function Scores at 24 months
Treatment group   Mean   Standard deviation   Sample size
Arthroscopic      52.6   16.4                 44
Placebo           47.7   12.0                 44

(a) What null hypothesis and alternative hypothesis might be used in this setting to compare mean physical function score for the two treatments?
(b) For your hypotheses in part (a), what is the associated P-value?
(c) Given your results in parts (a) and (b), what would you conclude about the difference between the two treatments? Be sure to give your answer in the context of this study.
Either a one-tailed or two-tailed test is appropriate.
For the one-tailed test, the alternative hypothesis is that the mean for the arthroscopic
physical function scores is greater than the mean for the non-arthroscopic physical
function scores.
a) Let $\mu_1$ represent the mean physical function score for the arthroscopic treatment, and
let $\mu_2$ represent the mean physical function score for the non-arthroscopic treatment.
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
b) The associated P-value is 0.0568.
c) Since the P-value is greater than .05, we do not have sufficient evidence to
conclude that the arthroscopic surgery results in a greater mean physical function
score than physical therapy and medicine.
For the two-tailed test, the alternative hypothesis is that the mean arthroscopic
physical function score is different from the mean non-arthroscopic physical function
scores.
a) Let $\mu_1$ represent the mean for the arthroscopic treatment, and
let $\mu_2$ represent the mean for the non-arthroscopic treatment.
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
b) The associated P-value is 0.1137.
c) Since the P-value is greater than .05, there is no statistically significant difference
between the treatment means. We do not have sufficient evidence to conclude
that the mean physical function score for arthroscopic surgery treatment is
different from the mean physical function score of the physical therapy and
medicine treatment.
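The two P-values above can be checked from the summary statistics alone. The sketch below assumes an unpooled (Welch) two-sample t-test with Satterthwaite degrees of freedom; the exam does not say which procedure produced its numbers, so this is one plausible reconstruction. The helper names (`t_pdf`, `t_upper_tail`, `welch_from_summary`) are made up for this example, and the tail area is found by simple numerical integration rather than a statistics library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Test statistic and Satterthwaite df from means, SDs, and group sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Physical function scores at 24 months: arthroscopic vs. the other group.
t_stat, df = welch_from_summary(52.6, 16.4, 44, 47.7, 12.0, 44)
p_one_tailed = t_upper_tail(t_stat, df)
p_two_tailed = 2 * t_upper_tail(abs(t_stat), df)
print(round(t_stat, 3), round(df, 1), round(p_one_tailed, 4), round(p_two_tailed, 4))
```

Under these assumptions the results should land close to the stated P-values of 0.0568 and 0.1137; small differences in the last digit can come from rounding of the test statistic or degrees of freedom.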
Proper nutrition is essential for aircraft pilots, given the demands of their job.
Previous studies have indicated that United States Air Force pilots do not regularly
eat breakfast, and thus may have low blood glucose levels after fasting at night. To
investigate the potential for danger, 8 pilots were selected for study in flight
simulators. Each pilot participated in two trials and they tried a different drink, either
Drink A (high carbohydrate) or Drink B (low carbohydrate) in each trial. The order in
which the drinks were tried was determined at random. The second trial was
conducted 2 days after the first trial, so that any effects of the first drink were
eliminated. After consuming one of the drinks, the pilots were subjected to a variety
of attitude recovery tasks. (An attitude recovery task is one where the pilot must
return to wings-level flight.) The times to recovery are shown in the table below.
Graphical displays of the data indicate that the t-procedure is appropriate.

Time to Wings Level (sec)
Pilot     1      2      3      4      5      6      7      8
Drink A   11.05  10.35  10.73  9.14   13.61  11.46  11.45  12.31
Drink B   10.51  10.73  15.85  8.55   12.65  15.77  8.96   9.18

Do these data provide convincing evidence that the mean attitude recovery time differs for the two drinks? Provide appropriate statistical justification for your conclusion.
Let $\mu_d$ represent the mean difference in time to wings level for treatments A and B.
$H_0: \mu_d = 0$
$H_a: \mu_d \neq 0$
The problem states that the assumptions for the t-procedure are met.
Since the P-value is very large, we cannot reject the null hypothesis. There is not sufficient evidence of a difference in mean recovery times to wings level between drinks A and B.
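Because each pilot tried both drinks, the natural analysis is a paired t-test on the per-pilot differences. The sketch below works through the arithmetic, assuming the differences are taken as Drink A minus Drink B (the answer does not state the direction) and using the standard t-table critical value 2.365 for 7 degrees of freedom at the two-sided 0.05 level. Variable names are illustrative only.

```python
import math
import statistics

# Time to wings level (sec) for pilots 1 through 8.
drink_a = [11.05, 10.35, 10.73, 9.14, 13.61, 11.46, 11.45, 12.31]
drink_b = [10.51, 10.73, 15.85, 8.55, 12.65, 15.77, 8.96, 9.18]

# Assumed direction of the difference: Drink A minus Drink B.
diffs = [a - b for a, b in zip(drink_a, drink_b)]
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)          # sample standard deviation (n - 1)
df = len(diffs) - 1
t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))

T_CRITICAL = 2.365                         # t-table value: df = 7, two-sided alpha = 0.05
reject_null = abs(t_stat) > T_CRITICAL
print(round(mean_diff, 4), round(sd_diff, 4), round(t_stat, 4), reject_null)
```

A test statistic this close to zero falls far inside the critical value, which agrees with the answer's statement that the P-value is very large.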
The Internet is increasingly available to the general public, and this has not gone
unnoticed by those who construct and give surveys. Researchers are using electronic
surveys more and more, which naturally leads to questions about the usefulness of
electronic surveys compared to traditional mail and telephone surveys. One very
important aspect of surveys is the response rate. To study the difference in response
rates between electronic and postal surveys, 377 college faculty members were randomly
selected from the membership list of a single professional association, the Mid-South
Educational Research Association. Each person was assigned at random to
receive the survey either by the U.S. Postal Service (USPS) or by email. There were 189
people in the USPS group and 188 people in the email group. Each group was sent a
follow-up notice two weeks after the initial mailing. Eighty-four of the USPS-delivered
surveys were returned, and 42 of the electronic surveys were returned.
(a) Construct and interpret a 95% confidence interval for the difference in
proportions of surveys returned before being sent a follow-up notice for the mail
and the email survey.
(b) In the context of this study, discuss the statistical significance and practical
significance of the results.
(c) To whom do you feel the results of this study can be generalized? Justify your
response in a few sentences.
The Internet is increasingly available to the general public, and this has not gone
unnoticed by those who construct and give surveys. Researchers are using electronic
surveys more and more, which naturally leads to questions about the usefulness of
electronic surveys compared to traditional mail and telephone surveys. One very
important aspect of surveys is the response rate. To study the difference in response
rates between electronic and postal surveys, 377 college faculty members were randomly
selected from the membership list of a single professional association, the Mid-South
Educational Research Association. Each person was assigned at random to
receive the survey either by the U.S. Postal Service (USPS) or by email. There were 189
people in the USPS group and 188 people in the email group. Each group was sent a
follow-up notice two weeks after the initial mailing. Forty-eight of the USPS-delivered
surveys were returned before the follow-up notice, and 24 of the electronic
surveys were returned before the follow-up notice.
(a) Construct and interpret a 95% confidence interval for the difference in
proportions of surveys returned before being sent a follow-up notice for the mail
and the email survey.
(b) In the context of this study, discuss the statistical significance and practical
significance of the results.
(c) To whom do you feel the results of this study can be generalized? Justify your
response in a few sentences.
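For part (a), one common approach is the large-sample z interval for a difference in two proportions. The sketch below applies it to the counts returned before the follow-up notice (48 of 189 by USPS, 24 of 188 by email). The function name `two_proportion_ci` is hypothetical, and the exam gives no worked answer here, so treat this as one reasonable method rather than the intended key.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample z interval for p1 - p2 from success counts and group sizes."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for a 95% interval
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# USPS group first, email group second.
low, high = two_proportion_ci(48, 189, 24, 188)
print(round(low, 3), round(high, 3))
```

If the interval lies entirely above zero, the data support a higher early response rate for the mailed survey at the 5% level; whether a difference of that size matters in practice is the separate question raised in part (b).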
Osteoarthritis of the knee is a degenerative disease causing joint pain, stiffness, and
decreased function. Usual treatment is a combination of physical therapy,
medication, and arthroscopic surgery. The arthroscopic surgery involves removal of
cartilage fragments and calcium crystals, or the smoothing of bones to eliminate
difficulties with joint motion. Arthroscopic surgery is widely used, but it is not clear
how, or even if, it works. In a recent study, patients were randomly assigned to two
treatment groups. Patients in the arthroscopic surgery group received standard
arthroscopic treatment plus physical therapy and medicine. Patients in the other
group received only physical therapy and medicine. Two years after treatment the
patients were evaluated using standard scales for pain, stiffness, and physical
function. Higher scores indicate increased pain, increased stiffness, and decreased
physical function. Pain score data from the study are summarized below. Graphical
displays of the data indicate that it is reasonable to assume that the two pain score
distributions are approximately normal.

Pain Scores at 24 months
Treatment group    Mean   Standard deviation   Sample size
Arthroscopic       68.8   18.5                 88
Non-Arthroscopic   63.8   19.8                 80

(a) What null hypothesis and alternative hypothesis might be used in this setting to
compare mean pain rating for the two treatments?
(b) For your hypotheses in part (a), what is the associated P-Value?
(c) Given your results in parts (a) and (b), what would you conclude about the
difference between the two treatments? Be sure to give your answer in the context
of this study.
Proper nutrition is essential for aircraft pilots, given the demands of their job.
Previous studies have indicated that United States Air Force pilots do not
regularly eat breakfast, and thus may have low blood glucose levels after fasting
at night. To investigate the potential for danger, 8 pilots were selected for study
in flight simulators. Each pilot participated in two trials and they tried a different
drink, either Drink A (high carbohydrate) or Drink B (low carbohydrate) in each
trial. The order in which the drinks were tried was determined at random. The
second trial was conducted 2 days after the first trial, so that any effects of the
first drink were eliminated. After consuming one of the drinks, the pilots were
subjected to a variety of attitude recovery tasks. (An attitude recovery task is one
where the pilot must return to wings-level flight.) The durations of self-illusory
motion (the time between reaching wings-level flight and the pilot "feeling" that he
or she has returned to wings-level flight) are shown in the table below. Graphical
displays of the data indicate that it is reasonable to assume that the two "felt
recovery time" distributions are approximately normal.

Duration of self-illusory motion (sec)
Pilot     1   2   3   4   5   6   7   8
Drink A   31  27  29  33  23  25  23  24
Drink B   34  23  29  20  24  26  26  27

Do these data provide convincing evidence that the mean self-illusory motion time
differs for the two drinks? Provide appropriate statistical justification for your
conclusion.