CSUF ISDSA Term Notes – Flashcards
80 test answers
question
Ordinal Data
answer
Like nominal data, but the categories have a meaningful order or ranking (ex. credit ratings such as Excellent, Good, Fair)
question
Interval Data
answer
Quantitative or numerical in nature (ex. SAT scores, exam grades, income, height, weight, etc)
question
All arithmetic operations are possible for interval data but not for nominal and ordinal data types
answer
True
question
Cross-sectional data
answer
Data is collected at the same or approximately the same point in time
question
Time Series Data
answer
Collected over several time periods (ex. price of gas over a period of years)
question
The two types of Statistics
answer
Descriptive, Inferential
question
Descriptive Statistics
answer
Refers to the summary of important aspects of a data set
question
Inferential Statistics
answer
Goes beyond the data at our disposal. More formally, it draws conclusions about a large data set (the "population") based on a smaller set of "sample" data.
question
Using a survey of a random sample of 5000 California residents, an economist said over 55% have a positive view of the economy
answer
Inferential Statistics
question
Say, from a survey of a random sample of 5000 students, 80% of those sampled are very excited about stats
answer
Descriptive Statistics
question
Population
answer
A set of items (experimental units) under study
question
Parameter (Variable)
answer
A descriptive measure of the population that is of interest, e.g. the population mean (its value is usually unknown; denoted by a Greek letter)
question
Statistic (singular term)
answer
A descriptive measure that is calculated from the sample e.g. the sample mean
question
The purpose of Inferential Stats
answer
To make inferences about a parameter of a population based on information obtained from the statistic or sample (with a certain degree of confidence)
question
The goal of data collection
answer
To obtain a "representative sample" that exhibits the characteristics of the entire population; the most common approach is to take random samples
question
Sources of Statistical Data
answer
Extract from a public source, perform a designed experiment, take a survey, perform an observational study
question
Non-Random Sampling Errors
answer
Selection Bias, Non-response Bias, Measurement Errors
question
Three key features to understanding the shape of the data
answer
Symmetry, Skewness, Modality
question
Symmetry
answer
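The distribution is a mirror image of itself on either side of its center; the normal distribution, whose bell-shaped curve peaks in the middle, is the standard example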
question
Skewness, Skewed
answer
The peak sits to the left with a long tail to the right (positive skew), or to the right with a long tail to the left (negative skew)
question
Modality
answer
The number of peaks in the curve tells you whether there is more than one group in your data (ex. a curve with two humps, like a camel, suggests two groups)
question
Mean
answer
The simple average
question
Median
answer
The middle observation after the data has been ordered
question
Mode
answer
The observation that occurs most often
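The three measures of center (mean, median, mode) can be checked with Python's standard `statistics` module. This is only a minimal sketch; the `scores` list is made-up data, not taken from the course.

```python
import statistics

# Hypothetical exam scores, invented for illustration.
scores = [70, 85, 85, 90, 100]

mean_score = statistics.mean(scores)      # sum of the values / number of values
median_score = statistics.median(scores)  # middle value once the data is ordered
mode_score = statistics.mode(scores)      # value that occurs most often

print(mean_score, median_score, mode_score)
```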
question
Range=
answer
Largest value - smallest value
question
Interquartile Range=
answer
3rd quartile - 1st quartile = Q3 - Q1
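A short sketch of the range and the interquartile range on made-up data. Textbooks and software use slightly different quartile conventions, so the Q1 and Q3 values from `statistics.quantiles` (default "exclusive" method) may differ from a hand calculation on other data sets.

```python
import statistics

# Hypothetical data, invented for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

data_range = max(data) - min(data)  # largest value - smallest value

# With n=4, quantiles() returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```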
question
Z Score
answer
Used to measure the location of a particular value in the data relative to the mean, in units of standard deviations; the larger the absolute value of the score, the farther the value is from the mean
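The usual formula is z = (value - mean) / standard deviation. The sketch below applies it to made-up numbers; the helper name `z_score` and the mean of 70 and standard deviation of 10 are assumptions for illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Hypothetical exam with mean 70 and standard deviation 10.
z_high = z_score(85, mean=70, std_dev=10)  # above the mean, so positive
z_low = z_score(62, mean=70, std_dev=10)   # below the mean, so negative

print(z_high, z_low)
```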
question
Empirical Rule
answer
For a normal distribution, about 68% of the data fall within 1 standard deviation of the mean, about 95% within 2, and nearly all (about 99.7%) within 3
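The empirical-rule percentages can be recovered from the standard normal distribution. A minimal sketch using `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```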
question
Outliers
answer
Values that fall outside of the normal range, either unusually large or unusually small
question
Chebyshev's Theorem
answer
For any z > 1, at least (1 - 1/z^2) x 100% of the data values must be within z standard deviations of the mean, whatever the shape of the distribution
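Chebyshev's bound is easy to evaluate directly. A small sketch; the function name `chebyshev_lower_bound` is made up for this example.

```python
def chebyshev_lower_bound(z):
    """Minimum share of data within z standard deviations of the mean (requires z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

bound_2 = chebyshev_lower_bound(2)  # at least 75% within 2 standard deviations
bound_3 = chebyshev_lower_bound(3)  # at least about 88.9% within 3 standard deviations

print(bound_2, bound_3)
```

Unlike the empirical rule, this bound does not assume a normal distribution, which is why its percentages are lower.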
question
Random Variable
answer
A numerical description of the outcome of an experiment (ex. The amount a company pays out on an individual policy, based on the outcome of a random event)
question
Probability
answer
A numerical measure of the likelihood that an event will occur; it always takes a value between 0 and 1
question
Probability Distribution
answer
The collection of all possible values of the random variable X and the associated probabilities P(X=x). Each P(X=x) is between 0 and 1, and the sum of all P(X=x) is always 1
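The two conditions on a discrete probability distribution can be checked mechanically. A minimal sketch; the function name `is_valid_distribution` and both example tables are assumptions for illustration.

```python
import math

def is_valid_distribution(dist):
    """dist maps each value x to P(X=x); check both requirements."""
    probs = list(dist.values())
    each_in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0)
    return each_in_range and sums_to_one

# X = number of heads in two flips of a fair coin.
heads_in_two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

# Probabilities sum to 1.2, so this is not a distribution.
not_a_distribution = {0: 0.5, 1: 0.7}

print(is_valid_distribution(heads_in_two_flips))
print(is_valid_distribution(not_a_distribution))
```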
question
Central Limit Theorem
answer
Even if X does not have a normal distribution, the sample mean xbar will be approximately normally distributed if n is large (n >= 30 is usually large enough)
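A small simulation can illustrate the Central Limit Theorem. The sketch below draws from an exponential distribution (strongly skewed, population mean 1 and standard deviation 1); the choice of distribution, the seed, and the 2000 repetitions are all assumptions made for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed (exponential) population with mean 1."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

# Collect many sample means, each based on n = 30 observations.
sample_means = [sample_mean(30) for _ in range(2000)]

center = statistics.mean(sample_means)   # should be close to the population mean, 1
spread = statistics.stdev(sample_means)  # should be close to 1 / sqrt(30), about 0.18

print(round(center, 3), round(spread, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual draws are far from normal.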
question
pbar is the point estimator (sample statistic) for the population proportion p
answer
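pbar = x/n, where x is the number of sampled items that have the characteristic of interest and n is the sample size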
question
Point Estimator
answer
The value (obtained from a sample) which is considered a best guess or estimate of a population parameter
question
Parameter
answer
We have a single population "target of interest". From this population, we identify a parameter. This parameter is a fixed numerical value. The problem is that we do not know its value, but wish to know it. Interval estimation will help us estimate it.
question
Sampling distribution of xbar
answer
Tells us (in terms of probability) how close the point estimator xbar is to the parameter
question
Margin of Error (ME or MOE)
answer
ME or MOE is a quantification of how close a point estimator is to the parameter.
question
Is it possible that an interval estimate may not capture the parameter?
answer
Yes. This is what we call uncertainty.
question
Is there a way of controlling the uncertainty that the interval captures the parameter?
answer
Yes. Attach a level of desired certainty to the interval estimate.
question
Confidence Level or (1-alpha) (ex. we are 95% confident)
answer
Quantifies how often, over repeated samples, a confidence interval captures the population parameter
question
Margin of Error
answer
Critical Value x Standard Error
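As a sketch of ME = critical value x standard error, the code below works through a proportion. It assumes a large sample, so the critical value comes from the standard normal distribution and the standard error is sqrt(pbar(1 - pbar)/n); the counts x = 440 and n = 800 are made up.

```python
import math
from statistics import NormalDist

def margin_of_error_proportion(x, n, confidence=0.95):
    """Return (pbar, ME) for a proportion, using a normal critical value."""
    p_bar = x / n                                         # point estimate
    standard_error = math.sqrt(p_bar * (1 - p_bar) / n)   # large-sample standard error
    alpha = 1 - confidence
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 at 95%
    return p_bar, critical_value * standard_error

# Hypothetical survey: 440 of 800 respondents answer "yes".
p_bar, me = margin_of_error_proportion(x=440, n=800)
interval = (p_bar - me, p_bar + me)  # interval estimate = point estimate +/- ME

print(round(p_bar, 3), round(me, 4), interval)
```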
question
Confidence Level and precision work in opposite directions
answer
If we want to be more confident (a higher confidence level), we have to accept a higher ME. A higher ME means that our interval estimate will be less precise.
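To see the trade-off numerically, hold the standard error fixed and raise the confidence level. A minimal sketch assuming a normal critical value; the standard error of 2.0 is a made-up number.

```python
from statistics import NormalDist

standard_error = 2.0  # hypothetical, held fixed across confidence levels

def margin_of_error(confidence, se):
    """ME = critical value x standard error, with a normal critical value."""
    critical_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return critical_value * se

me_by_level = {level: margin_of_error(level, standard_error)
               for level in (0.90, 0.95, 0.99)}

for level, me in me_by_level.items():
    print(f"{level:.0%} confidence -> ME = {me:.2f}")
```

The ME grows as the confidence level rises, so the interval gets wider and less precise.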
question
A large n
answer
Has clear benefits as ME is lower but may cost us time and money
question
A small n
answer
Leads to a higher ME, so the estimate is less precise, but it costs less time and money
question
The compromise
answer
The compromise will be to specify some desired ME or MOE and look for n that can achieve our specified goal
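For estimating a mean, one common way to find n is to solve ME = z x sigma/sqrt(n) for n and round up. The sketch below assumes a planning value for the population standard deviation is available; sigma = 15 and the target ME of 2 are made-up numbers.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, desired_me, confidence=0.95):
    """Smallest n whose margin of error is at most desired_me (normal critical value)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / desired_me) ** 2)  # always round up

n_needed = required_sample_size(sigma=15, desired_me=2)
n_tighter = required_sample_size(sigma=15, desired_me=1)  # halving ME needs about 4x the sample

print(n_needed, n_tighter)
```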
question
Null Hypothesis
answer
The hypothesis that can possibly be disproved using sample information or evidence
question
Type 1 Error
answer
Occurs when we reject the null hypothesis when it is actually true, i.e. we claim the alternative when it is not true
question
Type 2 Error
answer
Occurs when we fail to reject the null hypothesis when it is actually false
question
When using p-value approach
answer
Reject the null if p-value is less than alpha, and do not reject null if p-value is more than alpha
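The decision rule can be written as a one-line comparison. The sketch below also computes a p-value for a two-tailed z test, which is an assumption about the kind of test being run; the test statistic 2.1 is made up.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat):
    """p-value for a two-tailed test based on a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

def decide(p_value, alpha=0.05):
    """Apply the p-value rule: reject Ho when the p-value is less than alpha."""
    return "reject Ho" if p_value < alpha else "do not reject Ho"

p_value = two_tailed_p_value(2.1)  # hypothetical test statistic
decision = decide(p_value, alpha=0.05)

print(round(p_value, 4), decision)
```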
question
Critical Value Approach
answer
If the critical value (CV) is positive (right-tailed test), reject the null if the test statistic is greater than the CV. If the CV is negative (left-tailed test), reject the null if the test statistic is less than the CV.
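A sketch of the critical value rule for one-tailed z tests. It assumes the test statistic follows a standard normal distribution; the function name `reject_null` and the test statistics are made up.

```python
from statistics import NormalDist

def reject_null(test_stat, alpha, tail):
    """Critical value approach for a one-tailed z test ('right' or 'left')."""
    if tail == "right":
        critical_value = NormalDist().inv_cdf(1 - alpha)  # positive CV
        return test_stat > critical_value
    if tail == "left":
        critical_value = NormalDist().inv_cdf(alpha)      # negative CV
        return test_stat < critical_value
    raise ValueError("tail must be 'right' or 'left'")

right_result = reject_null(2.1, alpha=0.05, tail="right")  # 2.1 > 1.645, so reject
left_result = reject_null(-1.2, alpha=0.05, tail="left")   # -1.2 is not below -1.645

print(right_result, left_result)
```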
question
Matched Sample Design
answer
Each sampled item provides a pair of data values. This design often leads to a smaller sampling error than the independent-sample design because variation between sampled items is eliminated as a source of sampling error.
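In a matched-sample analysis, the pair of values from each item is reduced to a single difference, and the test is run on those differences. A minimal sketch with made-up before/after measurements; the t statistic shown tests a mean difference of zero.

```python
import math
import statistics

# Hypothetical paired measurements on the same five items.
before = [10, 12, 9, 11, 13]
after = [12, 13, 9, 14, 15]

# One difference per sampled item; item-to-item variation cancels out.
differences = [a - b for a, b in zip(after, before)]

d_bar = statistics.mean(differences)   # mean difference
s_d = statistics.stdev(differences)    # sample standard deviation of the differences
n = len(differences)

t_stat = d_bar / (s_d / math.sqrt(n))  # compare with a t distribution, n - 1 degrees of freedom

print(differences, d_bar, round(t_stat, 3))
```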
question
Experiment
answer
A study in which the experimenter manipulates attributes of what is being studied and observes the consequences.
question
Factors
answer
These are the attributes that are manipulated by being set to particular levels and then assigned to individuals. An experimenter identifies at least one factor to manipulate. These levels are often called treatments.
question
Observed Response
answer
The quantitative measurement (the response, or dependent, variable) recorded for each experimental unit in ANOVA
question
ANOVA
answer
Analysis of Variance, can be used to test for the equality of three or more population means
question
Data obtained from observational or experimental studies can be used for the analysis
answer
True
question
If Ho is rejected, we cannot conclude that all population means are different
answer
True
question
Rejecting Ho means that at least two population means have different values
answer
True
question
For each population, the response (dependent) variable is normally distributed
answer
True
question
ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error
answer
True
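The partition can be verified on a tiny one-factor example: the total sum of squares (SST) splits into a treatment part (SSTR) and an error part (SSE), and the degrees of freedom split the same way. The three groups below are made-up data, and the variable names are chosen for this sketch.

```python
import statistics

# Hypothetical responses for three treatments, three observations each.
groups = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [6, 7, 8]}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.mean(all_values)
k = len(groups)        # number of treatments
n_t = len(all_values)  # total number of observations

# Total variation, then its two sources.
sst = sum((x - grand_mean) ** 2 for x in all_values)
sstr = sum(len(v) * (statistics.mean(v) - grand_mean) ** 2 for v in groups.values())
sse = sum((x - statistics.mean(v)) ** 2 for v in groups.values() for x in v)

# Degrees of freedom partition the same way: (n_t - 1) = (k - 1) + (n_t - k).
df_total, df_treatments, df_error = n_t - 1, k - 1, n_t - k

f_stat = (sstr / df_treatments) / (sse / df_error)  # MSTR / MSE

print(sst, sstr, sse, f_stat)
```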
question
Randomized Block Design
answer
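An experimental design in which the experimental units (the objects of interest in the experiment) are grouped into blocks of similar units and the treatments are randomly assigned within each block, so that variation between blocks is removed from the error term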
question
Completely randomized design
answer
An experimental design in which the treatments are randomly assigned to the experimental units
question
Factorial Experiment
answer
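An experimental design that allows simultaneous conclusions about two or more factors. It is called factorial because the experimental conditions include all possible combinations of the factor levels.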
question
Managerial decisions often are based on the relationship between two or more variables
answer
True
question
Regression Analysis
answer
Can be used to develop an equation showing how the variables are related
question
Dependent Variable
answer
Variable being predicted and denoted by "y"
question
Independent Variable
answer
Variable being used to predict value of the dependent variable and denoted by "x"
question
Simple Linear Regression
answer
Involves one independent variable and one dependent variable, approximated by a straight line
question
Regression Model
answer
Equation that describes how "y" is related to "x"
question
Simple linear regression model
answer
y = Bo + B1x + E, where E is the random error term
question
Bo and B1
answer
Parameters of the model
question
Bo
answer
The "y" intercept of the regression line
question
B1
answer
The slope of the regression line
question
Estimated Simple Linear Regression Equation
answer
yhat = bo + b1x, where bo and b1 are the sample estimates of the parameters Bo and B1
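The intercept and slope of the estimated equation are usually found by least squares. The sketch below computes them by hand on made-up (x, y) pairs; the names `b0`, `b1`, and `predict` are chosen for this example.

```python
import statistics

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Least-squares slope: sum of cross-deviations over sum of squared x-deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = s_xy / s_xx         # estimated slope
b0 = y_bar - b1 * x_bar  # estimated y-intercept

def predict(x_value):
    """Estimated regression equation: yhat = b0 + b1 * x."""
    return b0 + b1 * x_value

print(b0, b1, predict(6))
```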