Stats Final – Flashcards

question
School administrators collect data on students attending the school. Which of the following variables is quantitative? A) whether the student is in AP classes B) whether the student has taken the SAT C) grade point average D) none of these E) class (freshman, soph., junior, senior)
answer
C) grade point average
question
Which of the following variables would most likely follow a Normal model? A) family income B) scores on an easy test C) heights of singers in a co-ed choir D) weights of adult male elephants E) all of these
answer
D) weights of adult male elephants
question
A professor has kept records on grades that students have earned in his class. If he wants to examine the percentage of students earning the grades A, B, C, D, and F during the most recent term, which kind of plot could he make? A) timeplot B) boxplot C) dotplot D) pie chart E) histogram
answer
D) pie chart
question
Which is true of the data shown in the histogram? I. The distribution is approximately symmetric. II. The mean and median are approximately equal. III. The median and IQR summarize the data better than the mean and standard deviation. A) I only B) I and III C) I and II D) III only E) I, II, and III
answer
C) I and II
question
Two sections of a class took the same quiz. Section A had 15 students who had a mean score of 80, and Section B had 20 students who had a mean score of 90. Overall, what was the approximate mean score for all of the students on the quiz? A) 85.7 B) It cannot be determined. C) 85.0 D) none of these E) 84.3
answer
A) 85.7
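A quick check of the arithmetic behind this answer, as a minimal sketch: the overall mean is each section mean weighted by its section size. The function name `combined_mean` is only illustrative.

```python
def combined_mean(groups):
    """Return the overall mean from (count, mean) pairs."""
    total_points = sum(n * mean for n, mean in groups)
    total_students = sum(n for n, _ in groups)
    return total_points / total_students

# Section A: 15 students, mean 80; Section B: 20 students, mean 90
overall = combined_mean([(15, 80), (20, 90)])
print(round(overall, 1))  # 3000 / 35, which rounds to 85.7
```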
question
Your Stats teacher tells you your test score was the 3rd quartile for the class. Which is true? I. You got 75% on the test. II. You can't really tell what this means without knowing the standard deviation. III. You can't really tell what this means unless the class distribution is nearly Normal. A) I only B) none of these C) II only D) III only E) II and III
answer
B) none of these
question
Suppose that a Normal model described student scores in a history class. Parker has a standardized score (z-score) of +2.5. This means that Parker A) is 2.5 standard deviations above average for the class. B) is 2.5 points above average for the class. C) none of these D) has a score that is 2.5 times the average for the class. E) has a standard deviation of 2.5.
answer
A) is 2.5 standard deviations above average for the class.
question
The advantage of making a stem-and-leaf display instead of a dotplot is that a stem-and-leaf display A) satisfies the area principle. B) preserves the individual data values. C) shows the shape of the distribution better than a dotplot. D) none of these E) A stem-and-leaf display is for quantitative data, while a dotplot shows categorical data
answer
B) preserves the individual data values.
question
The five-number summary of credit hours for 24 students in a statistics class is: min = 13, Q1 = 15, median = 16.5, Q3 = 18, max = 22. From this information we know that A) there are no outliers in the data. B) there are both low and high outliers in the data. C) there is at least one low outlier in the data. D) none of these E) there is at least one high outlier in the data.
answer
A) there are no outliers in the data.
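One way to see why, assuming the usual 1.5 × IQR rule for flagging outliers (the card itself does not name the rule): compute the fences and compare them with the min and max. The helper name `outlier_fences` is hypothetical.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

low, high = outlier_fences(15, 18)   # Q1 = 15, Q3 = 18, so IQR = 3
minimum, maximum = 13, 22

# An outlier exists only if some value falls outside the fences.
has_outliers = minimum < low or maximum > high
print(low, high, has_outliers)  # 10.5 22.5 False
```

Because the min and max both sit inside the fences, every other value does too.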
question
Which of the following data summaries are changed by adding a constant to each data value? I. the mean II. the median III. the standard deviation A) I and II B) I and III C) I, II, and III D) I only E) III only
answer
A) I and II
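A small sketch that illustrates this with made-up numbers: adding a constant shifts measures of center but leaves measures of spread alone. The data values and the constant 10 are arbitrary choices for illustration.

```python
import statistics

data = [2, 4, 4, 7, 13]            # made-up values for illustration
shifted = [x + 10 for x in data]   # add the constant 10 to every value

# Compare each summary before and after the shift.
for name, summary in [("mean", statistics.mean),
                      ("median", statistics.median),
                      ("stdev", statistics.stdev)]:
    print(name, summary(data), summary(shifted))
```

The mean and median each move up by 10, while the standard deviation prints the same value twice.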
question
All but one of these statements contain a mistake. Which could be true? A) There is a correlation of 0.63 between gender and political party. B) The correlation between a football player's weight and the position he plays is 0.54. C) The correlation between a car's length and its fuel efficiency is 0.71 miles per gallon. D) There is a high correlation (1.09) between height of a corn stalk and its age in weeks. E) The correlation between the amount of fertilizer used and the yield of beans is 0.42
answer
E) The correlation between the amount of fertilizer used and the yield of beans is 0.42
question
Residuals are... A) variation in the data that is explained by the model. B) possible models not explored by the researcher. C) the difference between observed responses and values predicted by the model. D) data collected from individuals that is not consistent with the rest of the group. E) none of these
answer
C) the difference between observed responses and values predicted by the model.
question
Which statement about influential points is true? I. Removal of an influential point changes the regression line. II. Data points that are outliers in the horizontal direction are more likely to be influential than points that are outliers in the vertical direction. III. Influential points have large residuals. A) I, II, and III B) I and II C) I and III D) II and III E) I only
answer
B) I and II
question
Which is true? I. Random scatter in the residuals indicates a model with high predictive power. II. If two variables are very strongly associated, then the correlation between them will be near +1.0 or -1.0. III. The higher the correlation between two variables the more likely the association is based in cause and effect. A) I, II, and III B) none C) I and II only D) II only E) I only
answer
B) none
question
A scatterplot of 1/y vs. x shows a strong positive linear pattern. It is probably true that A) the residuals plot for regression of Y on X shows a curved pattern. B) the scatterplot of Y vs X also shows a linear pattern. C) the correlation between X and Y is near +1.0. D) accurate predictions can be made for Y even if extrapolation is involved. E) large values of X are associated with large values of Y.
answer
A) the residuals plot for regression of Y on X shows a curved pattern.
question
It's easy to measure the circumference of a tree's trunk, but not so easy to measure its height. Foresters developed a model for ponderosa pines that they use to predict the tree's height (in feet) from the circumference of its trunk (in inches): ln h = -1.2 + 1.4(ln C). A lumberjack finds a tree with a circumference of 60"; how tall does this model estimate the tree to be? A) 83' B) 5' C) 11' D) 19' E) 93'
answer
E) 93'
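The calculation can be sketched as follows: plug C = 60 into the model, then undo the natural log with the exponential. The function name `predicted_height_ft` is just illustrative.

```python
import math

def predicted_height_ft(circumference_in):
    """Height in feet from ln h = -1.2 + 1.4 * ln C (C in inches)."""
    ln_h = -1.2 + 1.4 * math.log(circumference_in)
    return math.exp(ln_h)  # back-transform from ln h to h

height = predicted_height_ft(60)
print(round(height))  # about 93 feet
```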
question
Two variables that are actually not related to each other may nonetheless have a very high correlation because they both result from some other, possibly hidden, factor. This is an example of A) regression. B) a lurking variable. C) leverage. D) extrapolation. E) an outlier.
answer
B) a lurking variable.
question
If the point in the upper right corner of this scatterplot is removed from the data set, then what will happen to the slope of the line of best fit (b) and to the correlation (r)? A) both will remain the same. B) both will increase. C) b will decrease, and r will increase. D) both will decrease. E) b will increase, and r will decrease
answer
C) b will decrease, and r will increase.
question
Researchers studying growth patterns of children collect data on the heights of fathers and sons. The correlation between the fathers' heights and the heights of their 16-year-old sons is most likely to be... A) somewhat greater than 1.0 B) near -1.0 C) near +0.7 D) near 0 E) exactly +1.0
answer
C) near +0.7
question
The auto insurance industry crashed some test vehicles into a cement barrier at speeds of 5 to 25 mph to investigate the amount of damage to the cars. They found a correlation of r = 0.60 between speed (MPH) and damage ($). If the speed at which a car hit the barrier is 1.5 standard deviations above the mean speed, we expect the damage to be ____ the mean damage. A) 0.60 SD above B) equal to C) 0.90 SD above D) 1.5 SD above E) 0.36 SD above
answer
C) 0.90 SD above
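This follows from the standardized form of the least squares line, where the predicted z-score of y equals r times the z-score of x. A minimal sketch (the function name `predicted_z` is hypothetical):

```python
def predicted_z(r, z_x):
    """Predicted z-score of the response for a given z-score of x."""
    return r * z_x

z_damage = predicted_z(0.60, 1.5)
print(round(z_damage, 2))  # 0.9
```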
question
Which scatterplot shows a strong association between two variables even though the correlation is probably near zero? A) B) C) D) E)
answer
D)
question
The correlation between X and Y is r = 0.35. If we double each X value, decrease each Y by 0.20, and interchange the variables (put X on the Y-axis and vice versa), the new correlation A) is 0.90 B) is 0.35 C) is 0.50 D) is 0.70 E) cannot be determined.
answer
B) is 0.35
question
The correlation between a family's weekly income and the amount they spend on restaurant meals is found to be r = 0.30. Which must be true? I. Families tend to spend about 30% of their incomes in restaurants. II. In general, the higher the income, the more the family spends in restaurants. III. The line of best fit passes through 30% of the (income, restaurant$) data points. A) I, II, and III B) II only C) I only D) II and III only E) III only
answer
B) II only
question
A medical researcher finds that the more overweight a person is, the higher his pulse rate tends to be. In fact, the model suggests that 12-pound differences in weight are associated with differences in pulse rate of 4 beats per minute. Which is true? I. The correlation between pulse rate and weight is 0.33. II. If you lose 6 pounds, your pulse rate will slow down 2 beats per minute. III. A positive residual means a person's pulse rate is higher than the model predicts. A) I only B) II and III only C) II only D) III only E) none
answer
D) III only
question
Education research consistently shows that students from wealthier families tend to have higher SAT scores. The slope of the line that predicts SAT score from family income is 6.25 points per $1000, and the correlation between the variables is 0.48. Then the slope of the line that predicts family income from SAT score (in $1000 per point)... A) is 0.037 B) is 0.16 C) is 13.02 D) is 3.00 E) is 6.25
answer
A) is 0.037
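A sketch of the reasoning, using the fact that a least squares slope equals r times the ratio of standard deviations (response over predictor). The function name `reverse_slope` is only illustrative.

```python
def reverse_slope(slope_y_on_x, r):
    """Slope for predicting x from y, given the slope for y from x."""
    # slope_y_on_x = r * (s_y / s_x), so s_y / s_x = slope_y_on_x / r
    sd_ratio = slope_y_on_x / r
    # slope of x on y = r * (s_x / s_y)
    return r / sd_ratio

b = reverse_slope(6.25, 0.48)
print(round(b, 3))  # 0.037
```

Note that the reverse slope is not simply 1 / 6.25 = 0.16, which is the tempting wrong answer B.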
question
A regression analysis of company profits and the amount of money the company spent on advertising found r² = 0.72. Which of these is true? I. The model can correctly predict the profit for 72% of companies. II. On average, about 72% of a company's profit results from advertising. III. On average, companies spend about 72% of their profits on advertising. A) II only B) none C) I and III D) III only E) I only
answer
B) none
question
A least squares line of regression has been fitted to a scatterplot; the model's residuals plot is shown. Which is true? A) The linear model is appropriate. B) None of these. C) The linear model is poor because the correlation is near 0. D) A curved model would be better. E) The linear model is poor because some residuals are large.
answer
A) The linear model is appropriate.
question
A company sponsoring a new Internet search engine wants to collect data on the ease of using it. Which is the best way to collect the data? A) census B) simulation C) sample survey D) observational study E) experiment
answer
C) sample survey
question
The January 2005 Gallup Youth Survey telephoned a random sample of 1028 U.S. teens and asked these teens to name their favorite movie from 2004. Napoleon Dynamite had the highest percentage with 8% of teens ranking it as their favorite movie. Which is true? I. The population of interest is all U.S. teens. II. 8% is a statistic and not the actual percentage of all U.S. teens who would rank this movie as their favorite. III. This sampling design should provide a reasonably accurate estimate of the actual percentage of all U.S. teens who would rank this movie as their favorite. A) III only B) I, II, and III C) I and II D) II only E) I only
answer
B) I, II, and III
question
Suppose your local school district decides to randomly test high school students for attention deficit disorder (ADD). There are three high schools in the district, each with grades 9-12. The school board pools all of the students together and randomly samples 250 students. Is this a simple random sample? A) No, because we can't guarantee that there are students from each grade in the sample. B) Yes, because they could have chosen any 250 students from throughout the district. C) Yes, because the students were chosen at random. D) Yes, because each student is equally likely to be chosen. E) No, because we can't guarantee that there are students from each school in the sample.
answer
B) Yes, because they could have chosen any 250 students from throughout the district.
question
A basketball player has a 70% free throw percentage. Which plan could be used to simulate the number of free throws she will make in her next five free throw attempts? I. Let 0, 1 represent making the first shot, 2, 3 represent making the second shot,..., 8, 9 represent making the fifth shot. Generate five random numbers 0-9, ignoring repeats. II. Let 0, 1, 2 represent missing a shot and 3, 4,..., 9 represent making a shot. Generate five random numbers 0-9 and count how many numbers are in 3-9. III. Let 0, 1, 2 represent missing a shot and 3, 4,..., 9 represent making a shot. Generate five random numbers 0-9 and count how many numbers are in 3-9, ignoring repeats. A) II only B) II and III C) III only D) I only E) I, II, and III
answer
A) II only
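Plan II can be sketched in code as below. The seed, the number of trials, and the function name `simulate_made_shots` are arbitrary choices for illustration, not part of the original question.

```python
import random

def simulate_made_shots(rng, attempts=5):
    """Plan II: digits 0-2 are misses, digits 3-9 are makes; repeats count."""
    digits = [rng.randint(0, 9) for _ in range(attempts)]
    return sum(1 for d in digits if d >= 3)

rng = random.Random(1)  # seed chosen arbitrarily so runs are repeatable
trials = [simulate_made_shots(rng) for _ in range(10_000)]
print(sum(trials) / len(trials))  # should land near 5 * 0.7 = 3.5
```

Repeats are kept because each shot is independent; ignoring them (as in plans I and III) would change the 70% chance per shot.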
question
More dogs are being diagnosed with thyroid problems than have been diagnosed in the past. A researcher identified 50 puppies without thyroid problems and kept records of their diets for several years to see if any developed thyroid problems. This is a(n) A) prospective study B) retrospective study C) survey D) blocked experiment E) randomized experiment
answer
A) prospective study
question
A chemistry professor who teaches a large lecture class surveys his students who attend his class about how he can make the class more interesting, hoping he can get more students to attend. This survey method suffers from A) undercoverage B) voluntary response bias C) response bias D) none of these E) nonresponse bias
answer
A) undercoverage
question
Placebos are a tool for A) control B) blinding C) blocking D) sampling E) randomization
answer
B) blinding
question
Double-blinding in experiments is important so that I. The evaluators do not know which treatment group the participants are in. II. The participants do not know which treatment group they are in. III. No one knows which treatment any of the participants are getting. A) II only B) I, II, and III C) I and II D) I only E) III only
answer
C) I and II
question
Which of the following is not required in an experimental design? A) All are required in an experimental design. B) blocking C) control D) replication E) randomization
answer
B) blocking
question
A researcher wants to compare the effect of a new type of shampoo on hair condition. The researcher believes that men and women may react to the shampoo differently. Additionally, the researcher believes that the shampoo will react differently on hair that is dyed. The subjects are split into four groups: men who dye their hair; men who do not dye their hair; women who dye their hair; women who do not dye their hair. Subjects in each group are randomly assigned to the new shampoo and the old shampoo. This experiment A) has two factors (shampoo type and whether hair is dyed) blocked by gender. B) has three factors (shampoo type, gender, whether hair is dyed). C) has one factor (shampoo type), blocked by gender and whether hair is dyed. D) is completely randomized. E) has two factors (gender and whether hair is dyed) blocked by shampoo type.
answer
C) has one factor (shampoo type), blocked by gender and whether hair is dyed.
question
According to the National Telecommunication and Information Administration, 56.5% of U.S. households owned a computer in 2001. What is the probability that of three randomly selected U.S. households at least one owned a computer in 2001? A) 43.5% B) 56.5% C) 18.0% D) 91.8% E) 82.0%
answer
D) 91.8%
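The complement rule gives this directly, assuming the three households are independent: P(at least one) = 1 - P(none). A minimal sketch with a hypothetical helper name:

```python
def p_at_least_one(p, n):
    """P(at least one success in n independent trials with success chance p)."""
    return 1 - (1 - p) ** n

prob = p_at_least_one(0.565, 3)
print(f"{prob:.1%}")  # 91.8%
```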
question
A fair coin has come up "heads" 10 times in a row. The probability that the coin will come up heads on the next flip is A) 50%. B) It cannot be determined. C) greater than 50%, since it appears that we are in a streak of "heads." D) less than 50%, since "tails" is due to come up.
answer
A) 50%.
question
According to the National Telecommunication and Information Administration, 50.5% of U.S. households had Internet access in 2001. What is the probability that four randomly selected U.S. households all had Internet access in 2001? A) 6.5% B) 93.5% C) 49.5% D) 12.6% E) 50.5%
answer
A) 6.5%
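Assuming the four households are independent, the multiplication rule applies: multiply the single-household probability by itself four times. The helper name `p_all` is only illustrative.

```python
def p_all(p, n):
    """P(all n independent trials succeed), each with success chance p."""
    return p ** n

prob_all = p_all(0.505, 4)
print(f"{prob_all:.1%}")  # 6.5%
```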