Ch. 1-Ch. 5 Summative Review AP Statistics – Flashcards

question
By definition, a subset of a population selected for study is a
answer
sample
question
The distinction between descriptive and inferential statistics is that
answer
descriptive statistics describe data sets, inferential statistics involve generalizing to populations
question
A characteristic whose value may change from one individual to another is a
answer
variable
question
According to Chebyshev's Rule, at least what percent of data is within 5 standard deviations of the mean?
answer
96
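The 96 comes from Chebyshev's bound, 1 - 1/k², evaluated at k = 5. A minimal sketch of that arithmetic follows; the function name `chebyshev_lower_bound` is chosen here for illustration only.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative, so it says nothing useful.
        raise ValueError("Chebyshev's Rule is informative only for k > 1")
    return 1 - 1 / k**2

# k = 5 gives 1 - 1/25
print(f"{chebyshev_lower_bound(5):.0%}")  # prints 96%
```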
question
The Empirical Rule can be used when assessing a distribution if
answer
the distribution is approximately normal
question
A treatment that has no active ingredients is a
answer
placebo treatment
question
A study cannot be an experiment if
answer
a procedure of random assignment to treatments is not performed
question
The term used to describe the bias that occurs if some segment of a population is systematically excluded from a sample is
answer
selection bias
question
The z-score and percentile are measures of
answer
relative standing
question
Jane Doe and Richard Roe are law students. Their scores on the midterm and final in the Statistics for Law class, together with the means and standard deviations for these exams, are given below. Midterm: JD 60, RR 50, Mean 40, Std. Dev. 10. Final: JD 60, RR 50, Mean 80, Std. Dev. 15. Which statement about their performance is true?
answer
Jane did better than Richard on both exams (compare z-scores: z = (x - mean) / standard deviation)
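The comparison can be checked by computing each z-score from the numbers in the question. This is only a sketch; the dictionary layout and names are illustrative choices.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# (mean, standard deviation) for each exam, taken from the question
exams = {"midterm": (40, 10), "final": (80, 15)}
scores = {
    "Jane": {"midterm": 60, "final": 60},
    "Richard": {"midterm": 50, "final": 50},
}

z = {
    name: {exam: z_score(s[exam], *exams[exam]) for exam in exams}
    for name, s in scores.items()
}

for name, by_exam in z.items():
    print(name, {exam: round(value, 2) for exam, value in by_exam.items()})
```

Jane's z-scores are 2 and about -1.33; Richard's are 1 and -2, so Jane ranks higher on both exams even though both students fall below the mean on the final.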
question
The bias that occurs because observations were not made of all individuals selected for a sample is
answer
non-response bias
question
By definition, a simple random sample is one such that
answer
each possible sample of size n has an equal chance of being selected
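As a rough illustration of this definition, the sketch below draws many samples of size 2 from a tiny made-up population and counts how often each possible sample appears. The population labels, the seed, and the number of draws are arbitrary assumptions.

```python
import random
from collections import Counter
from itertools import combinations

population = ["A", "B", "C", "D"]  # hypothetical population
n = 2
rng = random.Random(0)  # fixed seed so the run is repeatable

# random.sample draws without replacement; sorting makes order irrelevant
counts = Counter(
    tuple(sorted(rng.sample(population, n))) for _ in range(6000)
)

# Each of the 6 possible samples should appear roughly 1000 times
for sample in combinations(population, n):
    print(sample, counts[sample])
```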
question
The sampling frame is the
answer
list of all elements in a population
question
An experiment is a planned intervention undertaken to observe the effects of
answer
explanatory variables
question
Two variables are confounded if
answer
their effects on the response variable cannot be distinguished
question
Randomization, as a strategy in experimental design, would be unsuccessful if
answer
aspects of the experimental condition, other than the values of the explanatory variable, systematically favor a treatment
question
Blocking would be unsuccessful if
answer
the blocks were heterogeneous on the blocking factor
question
The design strategy of making multiple observations for each experimental condition is
answer
replication
question
In utilizing direct control, which of the following are held constant
answer
values of an extraneous variable
question
In a study of dexterity, data was taken on 100 individuals. The variables measured for each person were the number of errors in picking up a dime in 50 trials, and the total time it took to pick up the dime 50 times. The data set is, therefore,
answer
bivariate
question
Suppose I have a set of data with 5 numbers: -6.0, -4.5, 0, 5.0, and an unknown 5th number. For these 5 data points, which of the following statistics can NEVER be greater than zero
answer
the median
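With five values the median is the third-smallest, and at least three of the known values (-6.0, -4.5, 0) are at most zero, so the median cannot exceed 0. The sketch below tries a handful of candidate fifth values; the candidate list is an arbitrary illustration, not an exhaustive proof.

```python
from statistics import mean, median

known = [-6.0, -4.5, 0.0, 5.0]
# Arbitrary candidates for the unknown fifth number
candidates = [-100.0, -6.0, -5.0, -4.5, -1.0, 0.0, 2.5, 5.0, 100.0]

medians = {x: median(known + [x]) for x in candidates}
means = {x: mean(known + [x]) for x in candidates}

print(max(medians.values()))  # largest median observed
print(max(means.values()))    # the mean, by contrast, can be positive
```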
question
Boxplot question (see packet)
answer
Median is 4, Q1 is at 3, Q3 is at 6
question
In a study of hatchling resting metabolism, three species, labeled A, B, and C below, were studied. Below is a pie chart of the sample sizes for each of the species. 36 hatchlings were studied in total. Based on the pie chart, about how many of the hatchlings were Species C hatchlings
answer
12
question
Suppose that a frequency distribution and a cumulative frequency distribution are constructed from the same set of data, using the same classes. Then, for each class,
answer
the frequency ≤ the cumulative frequency
question
Which of the following variables yields data that would be suitable for use in a histogram
answer
length of a phone call
question
Use the following frequency table to determine the proportion of values less than 60.
Class interval   Frequency
15 to <30        15
30 to <45        14
45 to <60        16
60 to <75        12
75 to <90        18
Total            75
answer
.600
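The proportion is the sum of the frequencies for the classes that end at or below 60, divided by the total. A minimal sketch using the table's values follows; the variable names are illustrative.

```python
# ((lower bound, upper bound), frequency) for each class in the table
classes = [
    ((15, 30), 15),
    ((30, 45), 14),
    ((45, 60), 16),
    ((60, 75), 12),
    ((75, 90), 18),
]

total = sum(freq for _, freq in classes)
# Classes are "lower to < upper", so a class lies entirely below 60 when upper <= 60
below_60 = sum(freq for (_, upper), freq in classes if upper <= 60)
proportion = below_60 / total

print(below_60, total, proportion)  # 45 of 75
```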
question
Canine problem (see packet)
answer
Canine problem (see packet). 13
question
Canine problem (see packet). Considering the graphic displays, the best description of these data would be
answer
Canine problem (see packet). Skewed right
question
Canine problem (see packet). When constructing a modified box plot, one must find the upper and lower mild outlier cutoffs. For these data, the upper mild outlier cutoff would be
answer
Canine problem (see packet). OMIT: none of the listed choices is correct. To find the cutoff, compute IQR = Q3 - Q1, then take Q3 + 1.5 × IQR.
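The cutoff rule can be sketched as a small function. The canine data are in the packet and are not reproduced here, so the quartiles in the example (Q1 = 3, Q3 = 6) are placeholder values and the function name is hypothetical.

```python
def mild_outlier_cutoffs(q1, q3):
    """Return (lower, upper) mild outlier cutoffs for a modified box plot."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Placeholder quartiles for illustration only; not the canine data
lower, upper = mild_outlier_cutoffs(q1=3, q3=6)
print(lower, upper)  # -1.5 10.5
```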
question
A distribution can have more than one
answer
mode
question
It is possible for a distribution to be
answer
symmetric and normal
question
A data set consisting of observations on two or more attributes is called a
answer
multivariate data set
question
By definition, strata are groups of population units that
answer
form well defined subpopulations
question
Suppose we have the following data: 12, 17, 13, 25, 16, 21, 30, 14, 16, 18 To find the 10% trimmed mean, what numbers should be deleted from the calculation
answer
First put the data in numerical order: 12, 13, 14, 16, 16, 17, 18, 21, 25, 30. There are 10 values and 10% of 10 is 1, so delete one value from each end, i.e. 12 and 30
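The same steps can be sketched in code: sort, drop the stated proportion from each end, and average what is left. The function name and its return shape are illustrative choices.

```python
def trimmed_mean(data, proportion):
    """Return (trimmed mean, dropped values) after trimming each end."""
    ordered = sorted(data)
    k = int(len(ordered) * proportion)  # values dropped from each end
    kept = ordered[k:len(ordered) - k]
    dropped = ordered[:k] + ordered[len(ordered) - k:]
    return sum(kept) / len(kept), dropped

data = [12, 17, 13, 25, 16, 21, 30, 14, 16, 18]
result, dropped = trimmed_mean(data, 0.10)
print(dropped, result)  # [12, 30] 17.5
```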
question
The percentage of data points falling at or below the upper quartile is
answer
75
question
For which of the following statistics would one not need to put the data in order from smallest to largest
answer
the range
question
In terms of sensitivity to outliers, which is the correct ordering of the following statistics from least sensitive to most sensitive? In other words, if the following statistics were ordered like this: least sensitive < sensitive < most sensitive what should the ordering be
answer
median < trimmed mean < mean
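One way to see this ordering is to contaminate a data set with outliers and measure how far each statistic moves. In the sketch below the clean data are borrowed from the trimmed-mean question, and the contaminated version (two large values swapped in) is a made-up illustration.

```python
from statistics import mean, median

def trimmed_mean(data, proportion):
    """Mean after dropping the given proportion of values from each end."""
    ordered = sorted(data)
    k = int(len(ordered) * proportion)
    return mean(ordered[k:len(ordered) - k])

clean = [12, 13, 14, 16, 16, 17, 18, 21, 25, 30]
contaminated = [12, 13, 14, 16, 16, 17, 18, 21, 250, 300]  # hypothetical outliers

shift = {
    "median": abs(median(contaminated) - median(clean)),
    "trimmed mean": abs(trimmed_mean(contaminated, 0.10) - trimmed_mean(clean, 0.10)),
    "mean": abs(mean(contaminated) - mean(clean)),
}

for name, amount in shift.items():
    print(f"{name}: moved by {amount}")
```

In this example the median does not move at all, the 10% trimmed mean moves because only one of the two outliers is trimmed, and the mean moves the most.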
question
Suppose that for a set of numeric data, where the numbers are not all the same, the standard deviation is less than 1.0. Then it must be true that
answer
the variance < the standard deviation
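The variance is the square of the standard deviation, and squaring a number strictly between 0 and 1 makes it smaller. A quick numeric check on a made-up data set:

```python
from statistics import stdev, variance

data = [2.0, 2.0, 2.5, 3.0]  # hypothetical values with some repeats

s = stdev(data)      # sample standard deviation
s2 = variance(data)  # sample variance, equal to s squared

print(round(s, 4), round(s2, 4))
print(s2 < s)  # True whenever 0 < s < 1
```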
question
Tributaries question (see packet). The two points to the lower left of the original plot are the two points where zero species were observed. If these points are judged to be erroneous observations and deleted from the analysis, what would be the effect of the deletion on the sample statistics and best fit line for this data
answer
the standard deviation of pH would decrease and the slope would be smaller
question
Tributaries question (see packet). Using the equation of the best fit line above, to the nearest unit what is the predicted number of species observed for a mean pH = 6.0
answer
5
question
A point is called an influential point if
answer
it plays a large role in determining the slope of the least squares line
question
From March, 1980, to April, 1981, data were gathered on the amount of lead sold in gasoline (metric tons) in Massachusetts vs. the amount of lead found in umbilical blood in Boston (micrograms per deciliter). A summary of the analysis is presented below, and a least squares regression line has been fit to the data. Approximately what percentage of the variation in umbilical lead concentrations can be explained by the linear model? (r² = 0.453)
answer
45.3%, obtained by expressing r² as a percentage
question
Which of the following indicates that an association between x and y is positive
answer
a positive Pearson's correlation coefficient
question
The slope of the regression line and the correlation between two variables are related in the following way
answer
the slope and correlation must be of the same sign
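The signs agree because the least squares slope satisfies b = r · (s_y / s_x), and both standard deviations are positive. The sketch below checks this identity on a small made-up data set; the helper name `slope_and_r` is illustrative.

```python
from statistics import mean, stdev

def slope_and_r(xs, ys):
    """Least squares slope b and Pearson correlation r for paired data."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data with a downward trend
xs = [1, 2, 3, 4, 5]
ys = [9, 7, 8, 4, 2]

b, r = slope_and_r(xs, ys)
print(round(b, 3), round(r, 3))
# The slope equals r scaled by the ratio of standard deviations
print(abs(b - r * stdev(ys) / stdev(xs)) < 1e-9)
```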
question
When regressing y on x, y is referred to as the
answer
response variable
question
A good fit of the simple linear regression model would be characterized by
answer
a relatively large r² and a relatively small s_e
question
Of the following, which is not true of r
answer
r is always between 0 and 1
question
Suppose that for two variables, x and y, the least squares line, yhat = a + bx is found, and r is greater than zero. Which of the following statements is correct
answer
for values of x less than xbar, the residuals must generally be relatively large
question
Look at packet. The fit of the data indicates that on average the estimates of the logs of the proportion returning are declining as the logarithm of the distance increases. The number in the table that indicates this is
answer
the slope of the regression line
question
From this analysis, the proportion of bats returning that would be predicted for a release distance of 30 km is in which range below
answer
.21-.25
question
From March, 1980, to April, 1981, data were gathered on the amount of lead sold in gasoline (metric tons) in Massachusetts vs. the amount of lead found in umbilical blood in Boston (micrograms per deciliter). A summary of the analysis is presented below, and a least squares regression line has been fit to the data. The residual associated with the observation that has a gasoline lead value of 82 metric tons and 4.5 micrograms per deciliter is in which interval below
answer
-1 ≤ residual < -0.5
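A residual is the observed value minus the value predicted by the fitted line. The fitted coefficients are in the packet and are not given here, so the intercept and slope in this sketch are placeholders chosen only for illustration; they are not the packet's values.

```python
def residual(x, y, intercept, slope):
    """Observed y minus the value predicted by yhat = intercept + slope * x."""
    predicted = intercept + slope * x
    return y - predicted

# Placeholder coefficients, NOT the fitted values from the packet
a_hypothetical = 2.0
b_hypothetical = 0.04

# Observation from the question: 82 metric tons, 4.5 micrograms per deciliter
res = residual(82, 4.5, a_hypothetical, b_hypothetical)
print(round(res, 2))
```

A negative residual means the observed point lies below the regression line.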