# STATS Final Exam

Body mass index (BMI) is a measure of body fat based on weight and height. During a survey, the BMI of the respondents is recorded, and the mean and median are computed. In this case, BMI is a?
A. Quantitative variable
During a survey, the BMI of the respondents is recorded, and it is used to classify the respondents as underweight (BMI less than 18.5), normal weight (BMI at least 18.5 but less than 25), or overweight (BMI at least 25 but less than 30). In this case, BMI is a?
B. Categorical Variable.
The number of unprovoked alligator attacks in Florida each year is recorded in a table. The data is displayed in a frequency histogram in which the class width is 3. The height of the bar with endpoints 0-3 is 2. What does this mean?
There were two years in which there were fewer than three unprovoked alligator attacks.
A table gives the percentage of babies born in 2010 each day of the week. You wish to make a plot of these data that emphasizes the difference in the number of births by each day of the week. Which plot do you choose and why?
A bar graph in which the days of the week are ordered by bar height, from highest to lowest.
In stats class, there are relatively few very low test scores, with most scores in the range from 60 to 100. The distribution of test scores in this class is?
Skewed to the left
The distribution of the mean of houses sold in my neighborhood was strongly skewed to the right. This must mean that?
The mean is greater than the median.
There are three children in the room, ages 3, 4, and 5. What would happen if a 4-year-old enters the room?
The mean age will stay the same, but the variance will decrease.
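
The answer above can be checked numerically. This is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration only.

```python
from statistics import mean, variance

ages_before = [3, 4, 5]
ages_after = ages_before + [4]  # a 4-year-old enters the room

# Sample mean and sample variance (divisor n - 1) before and after.
mean_before, mean_after = mean(ages_before), mean(ages_after)
var_before, var_after = variance(ages_before), variance(ages_after)

print("mean:", mean_before, "->", mean_after)
print("variance:", var_before, "->", var_after)
```

The new value equals the old mean, so the mean is unchanged while the spread around it shrinks.
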
Accidents occur uniformly along a straight stretch of highway beginning at mile marker 100 and ending at mile marker 150. What percent of accidents happen between mile markers 130 and 140?
20%
Each 10-mile segment represents 20% of the 50-mile stretch.
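
The 20% figure follows from treating accident locations as uniformly distributed along the stretch, which is an assumption made here because the answer divides the highway into equal shares. A small sketch, with a helper name invented for illustration:

```python
def uniform_fraction(a, b, lo, hi):
    """Fraction of a uniform distribution on [lo, hi] that falls in [a, b]."""
    return (b - a) / (hi - lo)

# Mile markers 130 to 140 on a stretch running from 100 to 150.
share = uniform_fraction(130, 140, 100, 150)
print(f"{share:.0%}")
```
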
Using the standard normal table, give the area under the standard normal curve corresponding to -0.5 < Z < 1.2.
P(Z < 1.2) = 0.8849 and P(Z < -0.5) = 0.3085, so the area is 0.8849 - 0.3085 = 0.5764.
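
The table lookup can be reproduced without a table. The sketch below computes the standard normal cumulative probability from the error function in Python's `math` module; the function name `phi` is a label chosen here.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

upper = phi(1.2)    # area to the left of 1.2
lower = phi(-0.5)   # area to the left of -0.5
area = upper - lower

print(round(upper, 4), round(lower, 4), round(area, 4))
```
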
A scatterplot is made of heights of various married couples with the woman’s height as the explanatory variable and the man’s height as the response variable. The least squares regression line is computed. Lovey and Thurston are an outlier in the x direction. Circle all that are true?
Lovey is much taller or much shorter than the rest of the women in the sample.
Circle the correct statement?
a. Faculty who are good researchers tend to be poor teachers and vice versa, so the correlation between teaching and research abilities is 0.

b. Women tend to be, on average, 3.5 inches shorter than the men they marry, so the correlation between the heights of spouses must be negative.

c. A researcher finds the correlation between the shoe size of children and their score on a reading test to be 0.22. The researcher must have made a mistake, since these two variables are clearly unrelated and must have a correlation of 0.

D. If people with larger heads tend to be more intelligent, then we would expect the correlation between head size and intelligence to be positive. (Correct Answer)

The association between flow rate and amount of eroded soil is
a. positive.

A positive association indicates that flow rate and amount of eroded soil tend to increase together.

Which of the following statements is correct?
a. The correlation coefficient equals the proportion of times two variables lie on a straight line.

b. The correlation coefficient will be +1.0 only if all the data lie on a perfectly horizontal straight line.

c. The correlation coefficient measures the fraction of outliers that appear in a scatterplot.

D. The correlation coefficient is a unitless number and must always lie between -1.0 and +1.0 inclusive. (Correct Answer)
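
To see why the correlation coefficient is unitless, the sketch below computes Pearson's r for the same data in two unit systems. The height and weight values are hypothetical, made up only for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70]
weights_lb = [110, 125, 140, 155, 175]

# The same data converted to centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)
print(r_imperial, r_metric)
```

Changing the units rescales both the numerator and the denominator by the same factor, so r is unchanged and stays between -1 and +1.
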

It is a fact that since the 1960s, atmospheric CO2 levels have increased steadily, as have obesity rates. We can conclude that?
There may be a lurking variable (such as greater car ownership and increased reliance on car use) that simultaneously affects both variables in this study. Correlation does not imply causation.
The least squares regression line is
the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
In a linear regression situation, what number must all the residuals add up to?
Zero. The residuals from a least-squares regression line always sum to 0.
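
A quick numeric check of both regression facts: the least-squares line minimizes the sum of squared vertical distances, and its residuals add up to zero. The data points below are hypothetical, and the helper names are chosen for this sketch.

```python
def least_squares(xs, ys):
    """Slope and intercept of the least-squares regression line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

best = sum_squared_errors(xs, ys, slope, intercept)
nudged = sum_squared_errors(xs, ys, slope + 0.1, intercept)

print("sum of residuals:", sum(residuals))  # zero up to rounding error
print("SSE at fit vs nudged slope:", best, nudged)
```
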
A survey of 1,000 likely voters was performed to ask people whom they would vote for mayor. 37% said they would vote for Fred. On election day, 41% actually voted for Fred. Which number is the parameter and statistic?
Parameter: 41%

Statistic: 37%

A stratified random sample corresponds to which of the following experimental designs?
a block design
To select a sample of undergraduate students in the US, I select a simple random sample of four states. From each of these states, I select a simple random sample of two colleges. Finally, from the 8 colleges, I select 120 students. This is an example of?
multistage sampling
At a local health club, a researcher samples 75 people whose primary exercise is cardio and 75 people whose primary exercise is strength training. The subjects were unaware of the purpose of the study, and the technician was not aware of the subjects' type of exercise.
This is an observational study, because no treatment was applied. If a treatment had been applied, it would have been a double-blind experiment.
A study attempts to compare two sunscreens. Each of the 50 subjects, with varying skin complexions, will use both sunscreens: sunscreen A on one side and sunscreen B on the other side. For each subject, a coin is tossed to determine which side receives sunscreen A and which receives sunscreen B. This is an example of?
a matched pairs experiment.
I toss a penny and observe whether it lands heads up or tails up. Suppose the penny is fair, this means that?
If I flip the coin many, many times, the proportion of heads will be approximately 1/2, and this proportion will tend to get closer and closer to 1/2 as the number of tosses increases.
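
The long-run behavior described above can be illustrated with a simulation. This is a sketch using Python's pseudorandom generator as a stand-in for a fair penny; the seed and toss counts are arbitrary choices.

```python
import random

def proportion_heads(n_tosses, rng):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (10, 1_000, 100_000):
    print(n, proportion_heads(n, rng))
```

With few tosses the proportion can be far from 1/2; with many tosses it typically settles close to 1/2.
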
You toss a thumbtack 100 times and observe that it lands point down 65 times. The proportion of times it landed point down is then 0.65. This proportion represents?
The sample proportion of tosses that landed point down in your 100 tosses
You randomly select 500 students and observe that 85 of them smoke. Estimate the probability that a randomly selected student smokes?
85/500 = 0.17
About 64% of boys aged 12-15 regularly play video games. If we select a single boy in this age group, the probability that he regularly plays video games is?
0.64
If you bet on red, the probability of winning is 18/38 = 0.4737. The probability of 0.4737 represents?
The proportion of times this event will occur in a very long series of individual bets on red.
Suppose you decide to bet on red on each of the 10 consecutive spins of the roulette. Suppose you lose all 5 of the first wagers. Which must be true?
What happened on the first 5 spins tells us nothing about what will happen on the next 5 spins.
Probability of A is 0.4 and Probability of B is 0.5. A and B are disjoint. The probability that either event A or B occur is?
0.4+0.5 = 0.9
The probability that both A and B occur is?
0. Disjoint events cannot occur together.
Which of the following must be true?
If A occurs, then B does not occur, since they are disjoint and cannot occur together.
An assignment of probabilities to events in a sample space must obey which of the following?
A. The probability of any event must be a number between 0 and 1 inclusive.

B. They must sum to 1 when adding over all events in the sample space

C. They must obey the addition rule for disjoint events.

D. All of the above (Correct Answer)

Draw one card at random from the deck. The events "the card is red" and "the card is black" are?
Disjoint. One card cannot be red and black at the same time.
Draw two cards at random from the deck. The events "the first card is red" and "the second card is red" are?
Neither disjoint nor independent.
Event A occurs with probability 0.2. Event B occurs with probability 0.9. Events A and B are?
They cannot be disjoint, because P(A) + P(B) = 1.1, which exceeds 1.
An event A will occur with probability 0.5. An event B will occur with probability 0.6. The probability that both A and B occur is 0.1. We may conclude that
Either A or B always occurs, since P(A or B) = 0.5 + 0.6 - 0.1 = 1.
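
The conclusion rests on the general addition rule, P(A or B) = P(A) + P(B) - P(A and B). A short sketch with exact fractions (used here to avoid floating-point rounding) shows the arithmetic:

```python
from fractions import Fraction

p_a = Fraction(5, 10)        # P(A) = 0.5
p_b = Fraction(6, 10)        # P(B) = 0.6
p_a_and_b = Fraction(1, 10)  # P(A and B) = 0.1

# General addition rule for any two events.
p_a_or_b = p_a + p_b - p_a_and_b
print(p_a_or_b)
```

A probability of 1 for "A or B" means at least one of the two events occurs on every trial.
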
A random variable can be described as
A variable whose value is a numerical outcome of a random phenomenon.
The density curve for a continuous random variable X has which of the following properties?
A. The probability of any event is the area under the density curve and above the values of X that make up the event

B. The total area under the density curve for X must be exactly 1.

C. The probability of any event of the form X = constant is 0.

D. All of the above (Correct Answer)

Before an election, a survey is taken of likely voters and 45% say they plan to vote for candidate A. On election day 48% actually vote for candidate A. Circle the true statement
48% is the parameter and 45% is the statistic
The law of large numbers states that as the number of observations drawn at random from a population with a finite mean mu increases, the mean x bar of the observed values
Tends to get closer and closer to the population mean mu.
The sampling distribution of a statistic
is the distribution of values taken by a statistic in all possible samples of the same size from the same population.
Suppose you interview 10 randomly selected workers and ask how many miles they commute to work. You compute the sample mean distance commuted. Now imagine repeating the survey many, many times, each time recording a different sample mean distance commuted. In the long run, a histogram of these sample means represents
The sampling distribution of the sample mean.
Company A has 5,000 employees, and company B has 15,000 employees. If each company selects 50 employees for the survey, which of the following is true about the sampling distributions of the sample means?
The sampling distributions of the sample means will have about the same standard deviation. The standard deviation of the sampling distribution of a sample mean depends on the sample size (and the population standard deviation), not on the population size (company size).
If each firm randomly selects 3% of its employees, which of the following is true about the sampling distributions of the sample means?
The standard deviation of the sampling distribution of the sample mean will be smaller for the larger company (company B), because a larger sample is being selected (450 employees versus 150).
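
Both company answers follow from the formula sigma / sqrt(n) for the standard deviation of a sample mean. The sketch below assumes, purely for illustration, that the two companies share the same population standard deviation (the value 10.0 is hypothetical) and that each sample is a small fraction of its population.

```python
from math import sqrt

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean for a sample of size n."""
    return sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation, same for both

# Case 1: each company samples 50 employees.
fixed_a = sd_of_sample_mean(sigma, 50)
fixed_b = sd_of_sample_mean(sigma, 50)

# Case 2: each company samples 3% of its employees.
n_a = 5_000 * 3 // 100    # 150 employees from company A
n_b = 15_000 * 3 // 100   # 450 employees from company B
pct_a = sd_of_sample_mean(sigma, n_a)
pct_b = sd_of_sample_mean(sigma, n_b)

print(fixed_a, fixed_b)
print(pct_a, pct_b)
```
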
The body mass of adult females in the U.S. has mean mu of 74.7 kg and standard deviation of 34.9 kg. Body mass is not normally distributed; it is skewed to the right. The median body mass for adult females is 70.7 kg. However, the central limit theorem says that if n is large enough,
the sampling distribution of the sample mean body mass will be approximately normal and centered at 74.7 kg, so sample means will tend to be close to 74.7 kg.
Which of the following statements is true with respect to the sampling distribution of the sample mean x bar?
A. If the sample size n increases, the standard deviation of x bar will decrease.

B. If the population standard deviation increases, the standard deviation of x bar will increase.

C. According to the law of large numbers, if the sample size n increases, x bar will tend to be closer to the population mean.

D. All of the above (Correct Answer)

Event A occurs with probability 0.3 and event B occurs with probability 0.4. If A and B are independent, we may conclude that
P(A and B) = 0.12
For which of the following counts would a binomial probability model be reasonable?
The number of sevens in a randomly selected set of five random digits from your table of random digits.
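
For the random-digit example, each of the five digits is a seven with probability 0.1, independently of the others, so the count of sevens is binomial with n = 5 and p = 0.1. A minimal sketch of the binomial probabilities, with a function name chosen here:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.1  # five random digits, each a seven with probability 0.1
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probs):
    print(k, round(prob, 5))
```
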
A 0.95 confidence interval is
An interval computed from sample data by a method guaranteeing that the probability that the computed interval contains the parameter of interest is 0.95.
A 99% confidence interval for the mean mu of a population is computed from a random sample and found to be 6 +/- 3. We may conclude that?
If we took many many additional samples, and from each computed a 99% confidence interval for mu, approximately 99% of these intervals would contain mu.
I collect a random sample of size n from a population and, from the data collected, compute a 95% confidence interval for the mean of the population. Which of the following would produce a new confidence interval with smaller width (smaller margin of error) based on the same data?
Use a smaller confidence level.
Ed computes a 90% confidence interval for some parameter based on a random sample of size 64, and Fred computes a 90% CI for the same parameter based on a random sample of size 100. Which of the following is true?
The margin of error of Fred’s CI is smaller than the margin of error of Ed’s CI.
You plan to construct a confidence interval for the mean mu of a normal population with known standard deviation. Which of the following will reduce the size of the margin of error?
a. Use a lower level of confidence
b. Increase the sample size
c. Reduce sigma
d. All of the above (Correct Answer)
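
All three options act on the margin of error z* · sigma / sqrt(n) for a mean with known sigma. The sketch below uses `statistics.NormalDist` from the Python standard library to get z*; the baseline values (sigma = 10, n = 64) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)

baseline = margin_of_error(0.95, sigma=10.0, n=64)       # hypothetical baseline
lower_conf = margin_of_error(0.90, sigma=10.0, n=64)     # lower confidence level
bigger_n = margin_of_error(0.95, sigma=10.0, n=100)      # larger sample
smaller_sigma = margin_of_error(0.95, sigma=5.0, n=64)   # smaller sigma

print(baseline, lower_conf, bigger_n, smaller_sigma)
```

Each change shrinks the margin relative to the baseline, which is also why a sample of 100 gives a narrower interval than a sample of 64 at the same confidence level.
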
Ed conducts a hypothesis test for a null hypothesis Ho against an alternative Ha based on a random sample of size 64. His sample mean is x bar = 1165. Fred conducts a test for the same Ho against the same Ha based on a random sample of size 100. Which of the following is true?
Ed’s p-value is more than Fred’s.
In a statistical test of hypotheses, we say the data are statistically significant at level alpha if
The p-value is less than alpha.
In a test of statistical hypotheses, the P-value tells us
The smallest level of significance at which the null hypothesis can be rejected.
A university administrator obtains a sample of the academic records of past and present scholarship athletes at the university. The administrator reports that no significant difference was found in the mean GPA for male and female scholarship athletes (P = 0.287). This means that?
The probability of obtaining a difference in GPAs between male and female scholarship athletes as large as that observed in the sample, if there is no difference in mean GPAs, is 0.287.
You conduct a statistical test of hypotheses and find that the result is statistically significant at level alpha = 0.05. You may conclude that
The test would also be significant at level alpha = 0.10.
A certain population follows a normal distribution with mean mu and standard deviation sigma = 2.5. You test Ho: mu = 1 against Ha: mu does not equal 1. You obtain a p-value of 0.072. Which is true?
A 90% CI for mu will exclude the value 1
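
The link between the two-sided p-value and the confidence interval can be checked by comparing z values. This sketch assumes a z procedure (sigma is known) and converts the p-value of 0.072 back into the size of the observed z statistic; the function names are chosen here for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def abs_z_from_two_sided_p(p_value):
    """Size of the z statistic that yields a given two-sided p-value."""
    return STD_NORMAL.inv_cdf(1 - p_value / 2)

def z_star(confidence):
    """Critical value z* for a confidence interval at the given level."""
    return STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)

z_obs = abs_z_from_two_sided_p(0.072)

# The interval excludes the hypothesized value when |z| exceeds z*.
excluded_by_90 = z_obs > z_star(0.90)
excluded_by_95 = z_obs > z_star(0.95)
print(round(z_obs, 3), excluded_by_90, excluded_by_95)
```

Because 0.072 is below 0.10 but above 0.05, the 90% interval excludes 1 while the 95% interval does not.
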
In testing hypotheses, if the consequences of rejecting a null hypothesis that is actually true are very serious, we should
use a very small level of significance
In assessing the validity of any test of hypotheses, it is good practice to
Examine the probability model that serves as a basis for the test by using exploratory data analysis on the data.

Determine exactly how the study was conducted.

Determine what assumptions the researchers made.

A radio show conducts a phone survey asking listeners to call in and say whether they support or oppose term limits for members of Congress. 88% of the listeners who called in favored term limits. We may safely conclude that?
Nothing, except that a great majority of those with strong enough feelings on the issue to call in are in favor of congressional term limits. We cannot generalize any of this survey's results to any larger population.
If we accept the null hypothesis when in fact it is false we have
committed a type II error
A type I error is
rejecting the null hypothesis when it is true.
The power of a statistical test of hypotheses is
defined for a particular alternative value of the parameter of interest and is the probability that a fixed-level significance test will reject the null hypothesis when that particular alternative value of the parameter is true.
Which of the following will reduce the value of the power in a statistical test of hypotheses
decrease the sample size
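
The effect of sample size on power can be sketched for a one-sided z test of Ho: mu = mu0 against Ha: mu > mu0 with known sigma. That test setup and all numeric values below are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_alt, sigma, n, alpha=0.05):
    """Probability that a level-alpha one-sided z test rejects Ho: mu = mu0
    (against Ha: mu > mu0) when the true mean is mu_alt."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)       # rejection cutoff in z units
    shift = (mu_alt - mu0) * sqrt(n) / sigma      # how far mu_alt moves z
    return 1 - std_normal.cdf(z_alpha - shift)

# Hypothetical setting: mu0 = 0, alternative mu = 1, sigma = 2.5.
power_n100 = power_one_sided_z(0.0, 1.0, 2.5, n=100)
power_n25 = power_one_sided_z(0.0, 1.0, 2.5, n=25)
print(round(power_n100, 3), round(power_n25, 3))
```

A smaller sample makes the sample mean more variable, so the test is less likely to detect the same alternative.
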
You are thinking of employing a t procedure to test hypotheses about the mean of a population using a significance level of 0.05. You suspect the distribution of the population is not normal and may be moderately skewed. Which of the following statements is correct?
You may use the t procedure provided your sample size is large enough.
Four students were randomly selected from all of the students who took the SAT twice, the first time without having completed a coaching class and the second time after having completed a coaching class. For each student, we recorded their first SAT score and their second SAT score. To analyze these data we should use?
The matched pairs t test
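
A matched pairs t test works on the within-student differences rather than on the two sets of scores separately. The sketch below computes the t statistic by hand; the four pairs of scores are hypothetical, invented only to show the calculation.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical SAT scores for four students (illustration only).
first_scores = [1010, 1100, 980, 1200]
second_scores = [1040, 1120, 1010, 1210]

# Work with one difference per student: second score minus first score.
diffs = [after - before for before, after in zip(first_scores, second_scores)]

n = len(diffs)
mean_diff = mean(diffs)
std_err = stdev(diffs) / sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / std_err       # tests Ho: mean difference = 0
deg_freedom = n - 1

print(diffs, mean_diff, round(t_stat, 3), deg_freedom)
```

The statistic is compared with a t distribution with n - 1 degrees of freedom, here 3.
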
The average improvement in SAT score was recorded for both groups. To analyze these data we should use the
two-sample t test
A TV news program conducts a call-in poll about a proposed city ban on smoking in public places. Of the 2467 callers, 1900 were opposed to the ban. Which of the following statements is true with respect to using this sample to estimate p, the proportion of all TV news viewers that favor such a ban on smoking in public places?
There is no way this sample can be viewed as an SRS of all TV news viewers, so we cannot use this sample to estimate p.