AP Statistics Study Guide Ch. 1-3

observational study
a study in which you observe individuals and collect data without trying to influence the results

experiment
deliberately imposing a treatment on individuals in order to observe their responses

What is the GSOCS acronym used for?
examining and describing a distribution

What does GSOCS stand for?
Gaps, Spread, Outliers, Center, Shape

Ogive graph
graph of cumulative frequency

Graphs of Categorical Variables
bar graph, pie chart

Graphs of Quantitative Variables
box plot, ogive graph, histogram, stem plot (stem-and-leaf plot)

resistant measure
a statistic that is not strongly affected by extreme values (outliers)

example(s) of resistant measures
median, IQR

example(s) of nonresistant measures
mean, standard deviation, correlation

pth percentile
a value such that p percent of observations fall at or below it

5 number summary
minimum, Q1, median, Q3, maximum

formulas for determining outliers
less than Q1-1.5(IQR); more than Q3+1.5(IQR)
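
A minimal Python sketch of the 1.5(IQR) rule above. The data values and the function name iqr_outliers are made up for illustration, and the quartiles come from Python's "inclusive" method, which can differ slightly from the median-of-halves method many textbooks use.

```python
import statistics

def iqr_outliers(data):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    # quantiles(..., n=4) returns [Q1, median, Q3]; "inclusive" is one of
    # several quartile conventions and may not match a textbook exactly.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

# Hypothetical data set with one suspiciously large value.
scores = [2, 4, 5, 6, 7, 8, 9, 30]
lower_fence, upper_fence, outliers = iqr_outliers(scores)
print(lower_fence, upper_fence, outliers)  # outliers -> [30]
```

With these values, 30 is the only observation beyond either fence, and it is flagged under the textbook quartile convention as well.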

standard deviation
roughly the typical distance of the observations from the mean (the square root of the average squared deviation)

when is the standard deviation 0?
when all observations are the same value

the size of a population doesn’t matter if …
the sample is random

What is the mean of a set of z-scores?
0

What is the standard deviation of a set of z-scores?
1

mode
a peak in a graph

unimodal
having one peak

formula for linear transformation
new x = a + b(x)

multiplying each observation by a positive number ‘b’ multiplies _ by ‘b’
mean, median, IQR, stdev

adding ‘a’ to each observation adds ‘a’ to _
mean, median

what statistics are not affected by adding ‘a’ to each observation
IQR, stdev
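
The three cards above on adding ‘a’ and multiplying by ‘b’ can be checked numerically. This is only a sketch: the data, the values a = 10 and b = 2, and the helper name summarize are assumptions chosen for illustration.

```python
import statistics

def summarize(data):
    """Return the mean, median, IQR, and sample standard deviation."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "median": median,
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),
    }

a, b = 10, 2                      # hypothetical shift and scale factor
x = [1, 3, 4, 8, 9]               # made-up observations
original = summarize(x)
shifted = summarize([xi + a for xi in x])   # add a to every observation
scaled = summarize([b * xi for xi in x])    # multiply every observation by b

for label, stats in (("original", original), ("shifted", shifted), ("scaled", scaled)):
    print(label, stats)
```

Adding a moves the mean and median up by a and leaves the IQR and stdev alone; multiplying by b scales all four.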

standardizing
converting raw scores to standard deviation units, called z-scores

formula for a z-score
z = (x - mean) / stdev
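
A short Python sketch of standardizing, assuming made-up raw scores and the sample standard deviation (dividing by n - 1). It also illustrates that a set of z-scores has mean 0 and standard deviation 1, up to floating-point rounding.

```python
import statistics

def z_scores(data):
    """Standardize each value: z = (x - mean) / stdev."""
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation (n - 1)
    return [(x - mean) / stdev for x in data]

raw = [62, 70, 75, 81, 92]  # hypothetical test scores
z = z_scores(raw)
print([round(value, 2) for value in z])
print(statistics.mean(z), statistics.stdev(z))  # approximately 0 and 1
```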

Chebyshev’s Inequality
In any distribution, the percent of observations falling within ‘k’ standard deviations of the mean is at least 100(1 - 1/k^2), for k > 1
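
As a quick illustration, the bound can be tabulated for a few values of k. The function name chebyshev_lower_bound is hypothetical; the formula is the one on the card.

```python
def chebyshev_lower_bound(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)

for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 2))
# 2 -> 75.0, 3 -> 88.89, 4 -> 93.75
```

These are guaranteed minimums for any distribution, which is why they are lower than the Normal-only percentages in the Empirical Rule.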

density curve
a curve that is always on or above the horizontal axis and has an area of exactly 1 underneath it

in skewed graphs, the ____ is closest to the tail (mean, median)
mean

the median of a graph is referred to as
the equal areas point

the mean of a graph is referred to as
the balance point

Empirical Rule
in a Normal distribution, approx. 68% of the data is within 1 stdev of the mean, approx. 95% is within 2 stdevs of the mean, and approx. 99.7% is within 3 stdevs of the mean
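
The 68-95-99.7 figures can be reproduced from the standard Normal curve. This sketch uses Python's statistics.NormalDist; the helper name percent_within is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def percent_within(k):
    """Percent of a Normal distribution within k standard deviations of the mean."""
    return 100 * (std_normal.cdf(k) - std_normal.cdf(-k))

for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))
# 1 -> 68.3, 2 -> 95.4, 3 -> 99.7
```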

inflection point
a point located at +/- 1 stdev on a Normal curve where a change in curvature occurs

ti-nspire: how to get a z-score from a probability
use Inverse Normal: invNorm(area to the left, 0, 1)
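
For checking calculator answers away from the TI-Nspire, the same inverse-Normal lookup can be sketched in Python. This is not the calculator's procedure, just an equivalent computation using statistics.NormalDist; the 0.975 area is a made-up example.

```python
from statistics import NormalDist

area_to_left = 0.975  # hypothetical cumulative probability
z = NormalDist(mu=0, sigma=1).inv_cdf(area_to_left)
print(round(z, 2))  # 1.96
```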

response variable
measures an outcome of a study

explanatory variable
explains or influences changes in a response variable

r (correlation coefficient)
measures the direction and strength of the linear relationship between two quantitative variables

4 rules of ‘r’
1. no distinction between explanatory/response variables
2. not affected by units of measurement
3. positive r = positive association; negative r = negative association
4. always between -1 and 1, inclusive

interpolation
making predictions for x-values inside the range of the observed data

extrapolation
making predictions for x-values outside the range of the observed data

Least Squares Regression Line (LSRL)
the line that minimizes the sum of the squared residuals; it describes how a response variable ‘y’ changes as an explanatory variable ‘x’ changes and is used to predict y values

formula of LSRL
predicted y = a + bx

slope of LSRL
b = r(sy/sx); the LSRL always passes through the point (mean(x), mean(y)), so the intercept is a = mean(y) - b*mean(x)
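
A sketch of computing the LSRL from summary statistics, using the slope formula on this card and the fact that the line passes through (mean(x), mean(y)). The data and the function names correlation and lsrl are made up for illustration.

```python
import statistics

def correlation(x, y):
    """Correlation r as the sum of products of z-scores, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_x, s_y = statistics.stdev(x), statistics.stdev(y)
    return sum(((xi - mean_x) / s_x) * ((yi - mean_y) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

def lsrl(x, y):
    """Return (a, b) for the line predicted y = a + b*x."""
    r = correlation(x, y)
    b = r * statistics.stdev(y) / statistics.stdev(x)  # slope: b = r(sy/sx)
    a = statistics.mean(y) - b * statistics.mean(x)    # line passes through the means
    return a, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = lsrl(x, y)
print(round(a, 3), round(b, 3))
```

For this small data set the slope comes out to about 0.6 and the intercept to about 2.2, and plugging in x = 3 (the mean of x) returns 4 (the mean of y).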

residual
observed value - predicted value (y - y(hat))

residual plot
scatterplot of the regression residuals against the explanatory variable

r^2 (coefficient of determination)
tells us how well the LSRL predicts values of the response variable y: it is the proportion of the variation in y that is explained by the LSRL on x, and the remaining variation is due to “other stuff”

formula of R^2
1 - SSE/SST (SSE = sum of squared residuals, SST = sum of squared deviations of y from its mean)
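
A worked sketch of the 1 - SSE/SST formula on made-up data. The slope here is computed directly by least squares, which gives the same line as b = r(sy/sx); all variable names are assumptions for illustration.

```python
import statistics

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)

# Least-squares slope and intercept for predicted y = a + b*x.
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

predicted = [a + b * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]  # observed - predicted

sse = sum(e ** 2 for e in residuals)         # sum of squared residuals
sst = sum((yi - mean_y) ** 2 for yi in y)    # sum of squared deviations from mean(y)
r_squared = 1 - sse / sst
print(round(sse, 3), round(sst, 3), round(r_squared, 3))
```

Here SSE is about 2.4 and SST is 6, so about 60% of the variation in y is explained by the line.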

influential observation
an extreme value that, if removed, has a big effect on the equation of the LSRL

lurking variable
a variable that is not among the explanatory or response variables of the study but may influence the interpretation of the relationships among them