question

By definition, a subset of a population selected for study is a

answer

sample

question

The distinction between descriptive and inferential statistics is that

answer

descriptive statistics describe data sets, inferential statistics involve generalizing to populations

question

A characteristic whose value may change from one individual to another is a

answer

variable

question

According to Chebyshev's Rule, at least what percent of data is within 5 standard deviations of the mean?

answer

96
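
Chebyshev's Rule guarantees that at least 100(1 - 1/k^2) percent of the observations fall within k standard deviations of the mean, for any k > 1. A minimal sketch of that arithmetic follows; the function name `chebyshev_lower_bound` is only illustrative.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percent of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule is only informative for k > 1")
    # 100 * (1 - 1/k^2), written so the k = 5 case stays exact in floating point
    return 100 - 100 / k**2

print(chebyshev_lower_bound(5))  # 96.0
```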

question

The Empirical Rule can be used when assessing a distribution if

answer

the distribution is approximately normal

question

A treatment that has no active ingredients is a

answer

placebo treatment

question

A study cannot be an experiment if

answer

a procedure of random assignment to treatments is not performed

question

The term used to describe the bias that occurs if some segment of a population is systematically excluded from a sample is

answer

selection bias

question

The z-score and percentile are measures of

answer

relative standing

question

Jane Doe (JD) and Richard Roe (RR) are law students. Their scores on the midterm and final in the Statistics for Law class, together with the mean and standard deviation for each exam, are given below:
Midterm: JD 60, RR 50, mean 40, standard deviation 10
Final: JD 60, RR 50, mean 80, standard deviation 15

answer

Jane did better than Richard on both exams. Compare z-scores, z = (x - mean) / standard deviation: Jane's are 2.00 and -1.33, Richard's are 1.00 and -2.00.
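
The z-score comparison for this card can be checked with a short sketch. The scores, means, and standard deviations come from the card; the names `z_score`, `exams`, and `scores` are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Exam parameters and student scores as given on the card
exams = {"Midterm": {"mean": 40, "sd": 10}, "Final": {"mean": 80, "sd": 15}}
scores = {"Jane": 60, "Richard": 50}  # each student earned the same score on both exams

for exam, p in exams.items():
    for name, x in scores.items():
        print(exam, name, round(z_score(x, p["mean"], p["sd"]), 2))
```

On each exam Jane's z-score is the larger of the two, which is why she did better relative to the class both times, even though both students fell below the mean on the final.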

question

The bias that occurs because observations were not made of all individuals selected for a sample is

answer

non-response bias

question

By definition, a simple random sample is one such that

answer

each possible sample of size n has an equal chance of being selected

question

The sampling frame is the

answer

list of all elements in a population

question

An experiment is a planned intervention undertaken to observe the effects of

answer

explanatory variables

question

Two variables are confounded if

answer

their effects on the response variable cannot be distinguished

question

Randomization, as a strategy in experimental design, would be unsuccessful if

answer

aspects of the experimental conditions, other than the values of the explanatory variable, systematically favor a treatment

question

Blocking would be unsuccessful if

answer

the blocks were heterogeneous on the blocking factor

question

The design strategy of making multiple observations for each experimental condition is

answer

replication

question

In utilizing direct control, which of the following are held constant

answer

values of an extraneous variable

question

In a study of dexterity, data was taken on 100 individuals. The variables measured for each person were the number of errors in picking up a dime in 50 trials, and the total time it took to pick up the dime 50 times. The data set is, therefore,

answer

bivariate

question

Suppose I have a set of data with 5 numbers: -6.0, -4.5, 0, 5.0, and an unknown 5th number. For these 5 data points, which of the following statistics can NEVER be greater than zero

answer

the median
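
One way to see why the median cannot exceed zero: with five values the median is the third smallest, and three of the known values (-6.0, -4.5, 0) are already at or below zero. A small sketch that tries several candidates for the unknown fifth number (the candidates are arbitrary):

```python
from statistics import median

known = [-6.0, -4.5, 0, 5.0]  # the four values given on the card

def median_with(fifth):
    """Median of the data set once the unknown fifth value is filled in."""
    return median(known + [fifth])

for fifth in (-100, -5, 0, 3, 100):
    print(fifth, median_with(fifth))
```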

question

Boxplot question: look at packet

answer

Median is 4, Q1 is at 3, Q3 is at 6

question

In a study of hatchling resting metabolism, three species, labeled A, B, and C below, were studied. Below is a pie chart of the sample sizes for each of the species. 36 hatchlings were studied in total. Based on the pie chart, about how many of the hatchlings were Species C hatchlings

answer

12

question

Suppose that a frequency distribution and a cumulative frequency distribution are constructed from the same set of data, using the same classes. Then, for each class,

answer

the frequency is less than or equal to the cumulative frequency

question

Which of the following variables yields data that would be suitable for use in a histogram

answer

length of a phone call

question

Use the following frequency table to determine the proportion of values less than 60.
Class interval   Frequency
15-<30           15
30-<45           14
45-<60           16
60-<75           12
75-<90           18
Total            75

answer

.600
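
The proportion is the total frequency of the classes that lie entirely below 60, divided by the overall total: (15 + 14 + 16) / 75. A minimal sketch using the table from the card (variable names are illustrative):

```python
# (lower bound, upper bound, frequency) for each class interval on the card
classes = [(15, 30, 15), (30, 45, 14), (45, 60, 16), (60, 75, 12), (75, 90, 18)]

total = sum(freq for _, _, freq in classes)
# Intervals are of the form lower-<upper, so a class is below 60 when upper <= 60
below_60 = sum(freq for _, upper, freq in classes if upper <= 60)
proportion = below_60 / total

print(below_60, total, round(proportion, 3))
```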

question

Canine problem look at packet

answer

canine problem look at packet. 13

question

Canine problem look at packet. Considering the graphic displays, the best description of these data would be

answer

Canine problem look at packet. skewed right

question

Canine problem look at packet. When constructing a modified box plot, one must find the upper and lower mild outlier cutoffs. For these data, the upper mild outlier cutoff would be

answer

Canine problem look at packet. OMIT: none of the listed choices is correct. To find the cutoff, compute the IQR (Q3 - Q1), then take Q3 + 1.5*IQR.
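
The cutoff rule can be sketched as a small function. The canine data are only in the packet, so the quartiles passed in below are placeholders chosen for illustration, not the packet's values.

```python
def mild_outlier_cutoffs(q1, q3):
    """Lower and upper mild outlier cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Placeholder quartiles (hypothetical, NOT taken from the canine data)
lower, upper = mild_outlier_cutoffs(q1=3, q3=6)
print(lower, upper)  # -1.5 10.5
```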

question

A distribution can have more than one

answer

mode

question

It is possible for a distribution to be

answer

symmetric and normal

question

A data set consisting of observations on two or more attributes is called a

answer

multivariate data set

question

By definition, strata are groups of population units that

answer

form well defined subpopulations

question

Suppose we have the following data: 12, 17, 13, 25, 16, 21, 30, 14, 16, 18 To find the 10% trimmed mean, what numbers should be deleted from the calculation

answer

First put the data in numerical order: 12, 13, 14, 16, 16, 17, 18, 21, 25, 30. There are 10 values and 10% of 10 is 1, so delete one value from each end, i.e. 12 and 30.
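
A short sketch of the trimming step, using the data from the card. It follows the convention used in the answer (trim the stated percentage from each end); the function name `trimmed_mean` is illustrative.

```python
def trimmed_mean(data, percent):
    """Return (trimmed mean, deleted values), trimming `percent` from EACH end."""
    ordered = sorted(data)
    k = int(len(ordered) * percent / 100)  # number of values to drop per end
    kept = ordered[k:len(ordered) - k]
    dropped = ordered[:k] + ordered[len(ordered) - k:]
    return sum(kept) / len(kept), dropped

data = [12, 17, 13, 25, 16, 21, 30, 14, 16, 18]
mean_10, dropped = trimmed_mean(data, 10)
print(dropped, mean_10)  # [12, 30] 17.5
```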

question

The percentage of data points falling at or below the upper quartile is

answer

75

question

For which of the following statistics would one not need to put the data in order from smallest to largest

answer

the range

question

In terms of sensitivity to outliers, which is the correct ordering of the following statistics from least sensitive to most sensitive? In other words, if the following statistics were ordered like this: least sensitive < sensitive < most sensitive what should the ordering be

answer

median < trimmed mean < mean
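
The ordering can be illustrated by corrupting a data set and watching how far each statistic moves. The sketch below reuses the ten values from the trimmed-mean card and replaces the two largest with made-up outliers (250 and 300, chosen only for illustration).

```python
from statistics import mean, median

def trimmed_mean(data, percent):
    """Mean after trimming `percent` of the observations from each end."""
    ordered = sorted(data)
    k = int(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

clean = [12, 17, 13, 25, 16, 21, 30, 14, 16, 18]
# Hypothetical contamination: 25 and 30 replaced by extreme values
dirty = [12, 17, 13, 250, 16, 21, 300, 14, 16, 18]

shift_median = abs(median(dirty) - median(clean))
shift_trimmed = abs(trimmed_mean(dirty, 10) - trimmed_mean(clean, 10))
shift_mean = abs(mean(dirty) - mean(clean))
print(shift_median, shift_trimmed, shift_mean)
```

Here the median does not move at all, the 10% trimmed mean absorbs one of the two outliers, and the mean absorbs neither.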

question

Suppose that for a set of numeric data, where the numbers are not all the same, the standard deviation is less than 1.0. Then it must be true that

answer

the variance < the standard deviation
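
The reason is that the variance is the square of the standard deviation, and squaring a number strictly between 0 and 1 makes it smaller. A quick check on made-up values (the data below are hypothetical):

```python
from statistics import stdev, variance

data = [2.0, 2.0, 2.5, 3.0]  # hypothetical values with some repeats

s = stdev(data)          # sample standard deviation
s_squared = variance(data)  # sample variance, equal to s**2 up to rounding

print(round(s, 4), round(s_squared, 4))
```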

question

Tributaries question (see packet). The two points to the lower left of the original plot are the two points where zero species were observed. If these points are judged to be erroneous observations and deleted from the analysis, what would be the effect of the deletion on the sample statistics and best fit line for these data

answer

the standard deviation of pH would decrease and the slope would be smaller

question

Look at packet for tributaries question. Using the equation of the best fit line above, to the nearest unit, what is the predicted number of species observed for a mean pH = 6.0

answer

5

question

A point is called an influential point if

answer

it plays a large role in determining the slope of the least squares line

question

From March, 1980, to April, 1981, data were gathered on the amount of lead sold in gasoline (metric tons) in Massachusetts vs. the amount of lead found in umbilical blood in Boston (micrograms per deciliter). A summary of the analysis is presented below, and a least squares regression line has been fit to the data. Approximately what percentage of the variation in umbilical lead concentrations can be explained by the linear model? Given: r(squared) = .453

answer

45.3% (r(squared) expressed as a percentage)

question

Which of the following indicates that an association between x and y is positive

answer

a positive Pearson's correlation coefficient

question

The slope of the regression line and the correlation between two variables are related in the following way

answer

the slope and correlation must be of the same sign
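
The signs agree because the slope b and the correlation r share the same numerator, the sum of cross-products of deviations; equivalently, b = r * (s_y / s_x), and the ratio of standard deviations is positive. A minimal sketch on made-up data (all values below are hypothetical):

```python
from statistics import mean

def slope_and_r(xs, ys):
    """Least squares slope b and Pearson correlation r for paired data."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    b = sxy / sxx                  # slope: Sxy / Sxx
    r = sxy / (sxx * syy) ** 0.5   # correlation: Sxy / sqrt(Sxx * Syy)
    return b, r

xs = [1, 2, 3, 4, 5]
print(slope_and_r(xs, [2, 1, 4, 3, 6]))  # upward trend: both positive
print(slope_and_r(xs, [6, 3, 4, 1, 2]))  # downward trend: both negative
```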

question

When regressing y on x, y is referred to as the

answer

response variable

question

A good fit of the simple linear regression model would be characterized by

answer

a relatively large r(squared) and a relatively small s_e (the standard deviation about the least squares line)

question

Of the following, which is not true of r

answer

r is always between 0 and 1

question

Suppose that for two variables, x and y, the least squares line, yhat = a + bx, is found, and r is greater than zero. Which of the following statements is correct

answer

for values of x less than xbar, the residuals must generally be relatively large

question

Look at packet. The fit of the data indicates that on average the estimates of the logs of the proportion returning are declining as the logarithm of the distance increases. The number in the table that indicates this is

answer

the slope of the regression line

question

From this analysis, the proportion of bats returning that would be predicted for a release distance of 30 km is in which range below

answer

.21-.25

question

From March, 1980, to April, 1981, data were gathered on the amount of lead sold in gasoline (metric tons) in Massachusetts vs. the amount of lead found in umbilical blood in Boston (micrograms per deciliter). A summary of the analysis is presented below, and a least squares regression line has been fit to the data. The residual associated with the observation that has a gasoline lead value of 82 metric tons and 4.5 micrograms per deciliter is in which interval below

answer

-1 <= residual < -.5

Ch. 1-Ch. 5 Summative Review AP Statistics – Flashcards
