Statistics Test 1 (1-4) – Flashcards
question
characteristics of people or things
answer
In statistics, variables are
question
population
answer
In statistics, the data we work with is just one part of a bigger picture called the
question
How were the variables measured? What variables were measured? Who collected the data?
answer
What are some things that should be asked when developing an understanding of data?
question
Numerical and Categorical
answer
What are two basic types of variables in statistics?
question
qualitative
answer
Categorical values are also referred to as ____ variables.
question
quantitative
answer
Numerical values are also referred to as ____ variables.
question
Variation and data
answer
The study of statistics rests on what two major concepts?
question
context
answer
Data are more than just numbers, because data have _____
question
Variation
answer
The circles shown are similar, but not exactly the same. This is an example of?
question
frequency
answer
The number of times a value is observed in a data set is called a ___
question
They take into account possible differences among the sizes of the groups.
answer
Why are percentages or rates often better than counts for making comparisons?
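A minimal sketch of why rates can compare groups more fairly than counts. The two groups and their numbers are invented for illustration and are not from the flashcard set.

```python
# Hypothetical counts, invented for illustration only.
groups = {
    "small_town": {"cases": 30, "population": 1_000},
    "big_city": {"cases": 300, "population": 100_000},
}


def rate_percent(cases, population):
    """Return cases as a percentage of the group size."""
    return 100 * cases / population


# The city has ten times as many cases, but a much lower rate.
rates = {name: rate_percent(g["cases"], g["population"]) for name, g in groups.items()}
for name, rate in rates.items():
    print(f"{name}: {groups[name]['cases']} cases, {rate:.1f}% of the group")
```

By count the city looks worse; by rate the town does, because the rate accounts for the difference in group size.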
question
The response variable
answer
The outcome variable in a question about causality is also referred to as what?
question
The Control Group
answer
In an experiment studying the association between a treatment variable and an outcome variable, the group of people who do NOT receive the treatment are called what?
question
Controlled Experiments
answer
Of the following, which is the only method of data collection suitable for making conclusions about causal relationships? Observational Studies; Anecdotes; Controlled Experiments; All three are suitable
question
A confounding variable
answer
A difference between two groups in an observational study that can explain why the outcomes were very different between the groups is called what?
question
To make the groups as similar as possible, minimizing bias.
answer
Why is random assignment used to assign people to treatment groups and control groups in a controlled experiment?
question
Equal sample sizes for control and treatment group
answer
Which of the following is NOT one of the criteria for the "gold standard" for experiments? Large sample size; Random assignment of subjects to treatment or control groups; Double-blinding; Equal sample sizes for control and treatment group
question
distribution of a sample
answer
The ____ organizes data by recording all the values observed in a sample as well as how many times each value was observed.
question
See the data and summarize it
answer
What two-step process is used to examine distributions?
question
bins
answer
In a histogram, observations are grouped into intervals called _____.
question
changes the shape of the histogram
answer
Changing the width of bins in a histogram ______
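A small sketch of how bin width can change the apparent shape of a histogram. The data values and the helper `bin_counts` are made up for illustration; real plotting tools choose bin edges in their own ways.

```python
from collections import Counter


def bin_counts(values, width):
    """Count observations per bin, where each bin is [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Invented data with two clusters of values.
data = [1, 2, 2, 3, 8, 9, 9, 10]

narrow = bin_counts(data, width=2)  # keys are the left edges of the bins
wide = bin_counts(data, width=6)

print("width 2:", narrow)
print("width 6:", wide)
```

With narrow bins the counts rise and fall twice, suggesting two mounds; with wide bins the same data look flat.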
question
proportion
answer
"Relative frequency" is the same as?
question
When technology is not available and the data set is not large
answer
A stemplot is often useful when?
question
How many numbers are in the data set
answer
When examining the shape of a distribution of numerical data, which of the following is NOT one of the three basic characteristics of a distribution's shape? Whether the distribution is symmetric or skewed; How many numbers are in the data set; How many mounds appear; Whether any unusually large or small values are present
question
Two very different groups have been combined into a single collection
answer
The existence of multiple mounds in a distribution is sometimes a sign of what?
question
Outliers
answer
Values so large or so small that they do not fit into the pattern of the distribution are called what?
question
Bar graph and pie chart
answer
Two commonly used graphs to display the distribution of a sample of categorical data are?
question
A bar chart is used for categorical variables, while a histogram is used for numerical variables.
answer
What is the main difference between a bar chart and a histogram?
question
They are not commonly used by statisticians or in scientific settings
answer
Because the human eye has a difficult time judging how much area is taken up by the wedge-shaped slice of a pie chart, which of the following is true of pie charts? They are only used for small data sets; They are only used if they are made by a computer; They are not commonly used by statisticians or in scientific settings; They are preferred over bar graphs
question
mode
answer
When describing the distribution of a categorical variable, the category that appears most often is called the ____
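As a rough illustration, the frequency, relative frequency (proportion), and mode of a categorical variable can be tallied as below. The color data are invented for the example.

```python
from collections import Counter

# Invented categorical data.
colors = ["red", "blue", "red", "green", "red", "blue"]

# Frequency: how many times each value is observed.
frequency = Counter(colors)

# Relative frequency: each count divided by the number of observations.
relative_frequency = {value: count / len(colors) for value, count in frequency.items()}

# Mode: the category that appears most often.
mode = frequency.most_common(1)[0][0]

print(frequency)
print(relative_frequency)
print("mode:", mode)
```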
question
Make an appropriate graph
answer
What is the first step in almost every investigation of data?
question
Change the scale of the vertical axis so that it does not start at 0
answer
What is the most common trick to mislead readers of bar graphs?
question
Decreasing the use of misleading graphics
answer
Which of the following is NOT a way in which the Internet is influencing statistical graphics? Decreasing the use of misleading graphics; Increasing the use of interactive displays; Allowing for a greater variety of graphical displays; None of the above
question
mean
answer
The ____ is another term for the arithmetic average.
question
How many standard deviations away an observation is from the mean
answer
A standard unit measures what?
question
The observation is equal to the mean
answer
If an observation has a z-score of 0, what does that mean?
question
Z-Score
answer
What is used to compare values measured in different units, such as inches and pounds?
question
Mean and Median
answer
What are two measures of the center of distribution?
question
The median is preferred when the data is strongly skewed or has outliers
answer
Under what conditions is the use of the median preferred?
question
The mean is preferred when the data is relatively symmetric
answer
Under what conditions is the use of the mean preferred?
question
median
answer
The value that would be right in the middle if you were to sort the data from smallest to largest is called the ____
question
median; interquartile range
answer
When a distribution is skewed, the ____ is used to measure the center and the ____ is used to measure variation.
question
resistant
answer
Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme value, we say the median is _____ to outliers.
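A quick sketch of what "resistant" means here, using Python's standard `statistics` module. The numbers are invented: the largest value is replaced by ever more extreme outliers while the rest of the data stays fixed.

```python
from statistics import mean, median

# Invented data: the last value is replaced by increasingly extreme outliers.
data = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]
more_extreme = [2, 3, 4, 5, 600]

for label, values in [
    ("original", data),
    ("outlier", with_outlier),
    ("more extreme", more_extreme),
]:
    print(f"{label}: mean = {mean(values)}, median = {median(values)}")
```

The median stays at the middle value in all three lists, while the mean is pulled further toward the outlier each time.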
question
The mean tends to be greater than the median
answer
In a right-skewed distribution, which of the following is true? The mean tends to be less than the median; The mean and median are approximately the same; The mean tends to be greater than the median; None of these
question
median
answer
In a boxplot, the vertical line inside the box marks the location of the _____.
question
IQR
answer
The length of the box in a boxplot is proportional to what?
question
1.5
answer
In a boxplot, potential outliers are points that are more than _____ IQRs from the edges of the box.
question
To the most extreme values that are not potential outliers
answer
In a boxplot, the whiskers extend to?
question
The minimum, Q1, the median, Q3, the maximum
answer
What are the five numbers needed to make a boxplot?
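The boxplot cards above can be sketched in code as follows. This assumes quartiles are found as the medians of the lower and upper halves of the sorted data; textbooks and software use several slightly different quartile conventions, so other tools may give different Q1 and Q3. The helper `five_number_summary` and the data are illustrative only.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values below the middle
    upper = xs[half + n % 2:]   # values above the middle (middle skipped if n is odd)
    return min(xs), median(lower), median(xs), median(upper), max(xs)


# Invented data with one unusually large value.
data = [1, 3, 4, 5, 6, 7, 8, 9, 30]

minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1

# Potential outliers lie more than 1.5 IQRs beyond the edges of the box.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

# Whiskers reach the most extreme values that are not potential outliers.
inside = [x for x in data if low_fence <= x <= high_fence]
whiskers = (min(inside), max(inside))

print(minimum, q1, med, q3, maximum)
print("IQR:", iqr, "outliers:", outliers, "whiskers:", whiskers)
```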
question
Variation
answer
Which of the following is NOT something that one looks for when studying scatterplots? Shape; Variation; Strength; Trend
question
a positive association
answer
Since, in general, the longer a car is owned the more miles it travels one can say there is a ______ between age of car and mileage.
question
weak
answer
A large amount of scatter in a scatterplot is an indication that the association between the two variables is ____.
question
trend, shape, strength, context
answer
When describing two-variable associations, a written description should always include what?
question
correlation coefficient (r)
answer
The _____ is a number that measures the strength of the linear association between two numerical variables.
question
variables are numerical
answer
The correlation coefficient (r) makes sense only if the trend is linear and the ____
question
-1 and 1
answer
The correlation coefficient (r) is always a number between ____
question
Never
answer
When can a correlation coefficient (r) based on an observational study be used to support a claim of cause and effect?
question
It has no effect on r
answer
When computing the correlation coefficient (r), what is the effect of changing the order of the variables on r?
question
regression equation
answer
The ____ is a tool for making predictions about future observed values and is a useful way of summarizing a linear relationship.
question
"predicted"
answer
Statisticians often write the word ____ in front of the y-variable in the equation of the regression line.
question
least squares line
answer
Another name for the regression line is the ____ line.
question
To make predictions about the values of y for a given x-value
answer
An important use of the regression line is to do what?
question
predictor variable, explanatory variable, independent variable
answer
When writing a regression equation, what are names for the x-variable?
question
predicted variable, response variable, dependent variable
answer
When writing a regression equation, what are names for the y-variable?
question
A big effect
answer
What type of effect can outliers have on a regression line?
question
Do regression and correlation with and without these points and comment on the differences
answer
When one has influential points in their data, how should regression and correlation be done?
question
influential
answer
Since outliers can greatly affect the regression line they are also called ____ points
question
extrapolation
answer
Attempting to use the regression equation to make predictions beyond the range of the data is called _____
question
coefficient of determination
answer
The value that measures how much variation in the response variable is explained by the explanatory variable is called the ____.
question
variance = s² = Σ(x - x̄)²/(n - 1)
answer
Variance is another measure of variability and is used if the distribution is symmetric. What is the variance formula?
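A worked sketch of the sample variance, s² = Σ(x - x̄)²/(n - 1), computed step by step and checked against Python's `statistics` module. The data are invented; the standard deviation is taken as the square root of the variance.

```python
from math import sqrt
from statistics import stdev, variance

# Invented data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
x_bar = sum(data) / n                             # mean
squared_devs = [(x - x_bar) ** 2 for x in data]   # squared deviations from the mean
s2 = sum(squared_devs) / (n - 1)                  # sample variance
s = sqrt(s2)                                      # sample standard deviation

print("mean:", x_bar)
print("variance:", s2, "library:", variance(data))
print("standard deviation:", s, "library:", stdev(data))
```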
question
r = Σ(z_x · z_y)/(n - 1)
answer
The correlation coefficient (r) measures the strength of a linear association. What is its formula?
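A minimal sketch of computing r as the sum of products of z-scores divided by n - 1. The car age and mileage values are invented, and the function name `correlation` is just a label for this example.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)


# Invented data: age of car (years) and mileage (thousands of miles).
age = [1, 2, 3, 4, 5]
miles = [12, 25, 33, 48, 60]

r = correlation(age, miles)
r_swapped = correlation(miles, age)  # swapping the variables should not change r

print("r:", round(r, 4), "swapped:", round(r_swapped, 4))
```

The value is close to 1, consistent with a strong positive association, and it is the same whichever variable is listed first.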
question
z = (x - x̄)/s
answer
A z-score converts observations into standard units. Its formula is?
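A short sketch of how z-scores allow values in different units to be compared. The means and standard deviations for height and weight are invented for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Invented summary statistics: heights in inches, weights in pounds.
height_z = z_score(74, mean=69, sd=3)
weight_z = z_score(200, mean=180, sd=25)
at_mean_z = z_score(69, mean=69, sd=3)  # an observation equal to the mean

print("height z:", round(height_z, 2))
print("weight z:", round(weight_z, 2))
print("z at the mean:", at_mean_z)
```

In standard units this person's height is more unusual than their weight, even though inches and pounds cannot be compared directly.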
question
a = ȳ - b·x̄
answer
The formula for the intercept (a) of a regression line is?
question
68%; 95%
answer
What percentage of the observations will be within one standard deviation of the mean? Within two?
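These percentages can be checked numerically under the assumption that the distribution is normal (bell-shaped); the card does not state this assumption, and the figures need not hold for other shapes. The sketch uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

# Proportion of observations within one and two standard deviations of the mean.
within_one = std_normal.cdf(1) - std_normal.cdf(-1)
within_two = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"within 1 SD: {within_one:.1%}")
print(f"within 2 SD: {within_two:.1%}")
```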
question
IQR = Q3 - Q1
answer
The interquartile range is the measurement of variability best used when the distribution is skewed. Its formula is?
question
mean = x̄ = Σx/n
answer
The mean is the measure of center best used if the distribution is symmetric. Its formula is?
question
"Predicted" y = a + bx
answer
The regression line summarizes the relationship of a distribution. The regression line formula is?
question
Maximum - Minimum
answer
The range is a crude measure of variability. Its formula is?
question
b = r(Sy/Sx)
answer
The formula for the slope (b) of a regression line is?
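Putting the regression cards together, here is a hedged sketch that computes the slope as b = r(s_y/s_x), the intercept as a = ȳ - b·x̄, and a prediction from "predicted" y = a + bx. The data and the helper names `correlation` and `predict` are invented for illustration.

```python
from statistics import mean, stdev

# Invented data: age of car (years) and mileage (thousands of miles).
age = [1, 2, 3, 4, 5]
miles = [12, 25, 33, 48, 60]


def correlation(xs, ys):
    """Correlation coefficient: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


r = correlation(age, miles)
b = r * stdev(miles) / stdev(age)   # slope
a = mean(miles) - b * mean(age)     # intercept
r_squared = r ** 2                  # coefficient of determination


def predict(x):
    """Predicted y for a given x; only trustworthy inside the range of the data."""
    return a + b * x


print(f"predicted mileage = {a:.2f} + {b:.2f} * age")
print("prediction at age 3:", round(predict(3), 2))
print("r squared:", round(r_squared, 4))
```

Calling `predict` with an age far outside 1 to 5 would be extrapolation, so such predictions should be treated with caution.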
question
Standard Deviation = s = √(Σ(x - x̄)²/(n - 1))
answer
The standard deviation is the measure of variability best used if the distribution is symmetric. The formula for standard deviation is?