question

An examiner administers and scores the same test numerous times without deviating from the procedure in order to reduce the possibility of measurement error. This exemplifies what?

answer

Standardization

question

The scores of a representative population sample against which an examiner compares an individual's scores are referred to as ________; while they allow for comparisons of a person's performance on different tests, they do not provide the ultimate standard of performance.

answer

Norms

question

A psychological test that is regarded as ________ is administered, scored, and interpreted independent of the subjective judgment of the examiner.

answer

Objective

question

The SAT and GRE are examples of ________ tests, as they provide information about a person's best possible performance, while the MMPI-2 and PAI are ________ tests, providing information about a person's usual experience.

answer

Maximum performance; typical performance

question

________ tests assess the difficulty level an examinee can attain (e.g., Information from WAIS), ________ tests assess the person's response rate (e.g., Digit Symbol from WAIS), and ________ tests help determine whether an individual can attain a certain level of acceptable performance (e.g., test of reading skills).

answer

Power; speed; mastery

question

A ________ occurs when an instrument cannot take on a value higher than some limit due to the measure not including enough difficult items, resulting in all high-achieving examinees getting similar scores (test is too easy); conversely, a ________ occurs when an instrument cannot take on a lower value and thus all low-achieving examinees get similar scores (test is too hard).

answer

Ceiling effect; floor effect

question

In contrast to normative measures, these types of measures use individuals themselves as their own frame of reference, comparing 2 or more desirable options and choosing the one that is most preferred.

answer

Ipsative measures

question

________ is the consistency of a test, or the degree to which a test provides the same results under the same conditions; ________ refers to the degree that a test measures what it claims to be measuring.

answer

Reliability; validity

question

A perfectly reliable test would yield every examinee's ________ every time it was administered, as this would indicate the examinee's actual ability on whatever the test is measuring; however, a test is never perfectly reliable due to ________, which is random and can be caused by environmental noise, the examinee's mood on testing day, and any number of other factors.

answer

True score; measurement error

question

The most commonly used methods of estimating reliability of a test use a correlation coefficient, referred to as the ________, ranging in value from 0.0 to +1.0, where coefficients closer to 0.0 indicate less reliability and values closer to +1.0 indicate increasing reliability; the coefficient is not squared to determine the proportion of variability, unlike other correlation coefficients, rather it is interpreted directly.

answer

Reliability coefficient

question

A researcher administers the same instrument to the same group of college students on 2 separate occasions; following the second administration, the researcher correlates scores on the first and second administrations. What type of reliability is the researcher attempting to obtain?

answer

Test-retest reliability (or "coefficient of stability")

question

True or False: It is not recommended to use the test-retest coefficient when attempting to obtain reliability for a test that measures attributes that are unstable (e.g., mood)?

answer

True- low coefficients, in such cases, would likely be more a reflection of the attribute's unreliability rather than the test's unreliability

question

A researcher administers one form of a test on one day, then administers an equivalent form to the same group of people at a later date/time. What type of reliability is being sought in this example?

answer

Alternate forms reliability (or "coefficient of equivalence;" parallel-forms reliability)

question

When correlations are obtained among individual test items, ________ reliability is being assessed; the 3 methods for obtaining this reliability include ________ (involves dividing test into 2 parts then correlating responses from the 2 parts), ________ (used when test items are dichotomously scored- e.g., "true/false"), and ________ (used for tests with multiple-scored items- e.g., "never/rarely/sometimes/always").

answer

Internal consistency (or "coefficient of internal consistency"); split-half; Kuder-Richardson Formula 20; Cronbach's coefficient alpha
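
Worked example (a minimal sketch, not a definitive implementation): the two item-level coefficients can be computed from a table of item scores. The function names and the response data are hypothetical, and population variances are assumed throughout. Under that assumption KR-20 is the special case of coefficient alpha for 0/1 items, so both give the same value here.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for rows of item scores (one row per examinee)."""
    k = len(scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def kr20(scores):
    """KR-20 for dichotomous (0/1) items; each item variance is p * (1 - p)."""
    n, k = len(scores), len(scores[0])
    p_values = [sum(row[i] for row in scores) / n for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in p_values)
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: 4 examinees, 3 true/false items scored 1 or 0.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(responses)
kr = kr20(responses)
print(round(alpha, 2), round(kr, 2))
```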

question

While the split-half reliability coefficient usually lowers the reliability coefficient artificially, the ________ can be used to correct for the effects of shortening the measure.

answer

Spearman-Brown formula
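
Worked example (a minimal sketch): the Spearman-Brown correction estimates the reliability of a test lengthened by a factor n from the reliability r of the shorter version, as n*r / (1 + (n - 1)*r). The function name and the half-test correlation of .60 are hypothetical.

```python
def spearman_brown(r_short, length_factor=2.0):
    """Estimated reliability after lengthening a test by length_factor."""
    return (length_factor * r_short) / (1.0 + (length_factor - 1.0) * r_short)


# A split-half correlation of .60 corrected to full test length (factor of 2).
full_length_reliability = spearman_brown(0.60)
print(round(full_length_reliability, 2))
```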

question

What type of tests are measures of internal consistency not good at assessing reliability for?

answer

Speed tests, as the correlation would be spuriously inflated

question

Instruments that rely on rater judgments would be best to have high ________ reliability, which is increased when scoring categories are ________ and ________.

answer

Inter-rater (interscorer); mutually exclusive (a particular behavior belongs to a single category); exhaustive (categories cover all possible responses/behaviors)

question

The ________ estimates the amount of error to be expected in an individual test score and is used to determine a range, referred to as a/an ________, within which an examinee's true score will likely fall.

answer

Standard Error of Measurement; confidence interval

question

What is the formula for the standard error of the measurement?

answer

SEmeas = SDx × √(1 − rxx), where SDx is the standard deviation of test scores and rxx is the reliability coefficient
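
Worked example (a minimal sketch): the standard classical-test-theory formula is SEM = SDx * sqrt(1 - rxx). The function name and the example values (SD of 15, reliability of .91) are hypothetical.

```python
import math


def standard_error_of_measurement(sd_x, r_xx):
    """SEM = SD of test scores times the square root of (1 - reliability)."""
    return sd_x * math.sqrt(1.0 - r_xx)


# Hypothetical test: SD = 15, reliability coefficient = .91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))

# With perfect reliability the error term vanishes.
sem_perfect = standard_error_of_measurement(15, 1.0)
```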

question

What is the probability that a person's true score lies within a range of plus or minus 1 standard error of measurement (SEM) of their obtained score? How about plus or minus 1.96 (2) SEM? And finally, plus or minus 2.58 (2.5) SEM?

answer

68% of the time; 95% of the time; 99% of the time
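
Worked example (a minimal sketch): the three bands can be built around an obtained score by multiplying the SEM by 1.0, 1.96, and 2.58. The function name, the obtained score of 100, and the SEM of 4.5 are hypothetical.

```python
# z multipliers for the 68%, 95%, and 99% confidence levels.
Z_MULTIPLIERS = {68: 1.0, 95: 1.96, 99: 2.58}


def confidence_interval(obtained_score, sem, level=95):
    """Return (lower, upper) bounds around an obtained score."""
    margin = Z_MULTIPLIERS[level] * sem
    return obtained_score - margin, obtained_score + margin


# Hypothetical examinee: obtained score 100 on a test with SEM = 4.5.
for level in (68, 95, 99):
    lower, upper = confidence_interval(100, 4.5, level)
    print(level, round(lower, 2), round(upper, 2))
```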

question

True or False: Hypothetically, a test with a reliability coefficient of +1.0 would have a standard error of measurement of 0.0?

answer

True- a test with perfect reliability will have no error

question

The standard error of measurement is ________ related to the reliability coefficient (rxx) and ________ related to the standard deviation of test scores (SDx).

answer

Inversely; positively

question

What reliability coefficient, when practical, is the best to use?

answer

Alternate-forms

question

Classical test theory states that an observed score reflects ________ plus ________.

answer

True score variance; random error variance

question

Methods of recording behaviors include ________ recording (elapsed time that behavior occurs is recorded), ________ recording (number of times behavior occurs is recorded), ________ recording (rater notes whether subject engages in behavior during given time period), and ________ recording (all behavior during an observation session is recorded).

answer

Duration; frequency; interval; continuous

question

Simply put, ________ refers to the degree a test measures what it purports to measure.

answer

Validity

question

A depression scale that only assesses the affective aspects of depression but fails to account for the behavioral aspects would be lacking what type of validity?

answer

Content validity, which refers to the extent to which test items represent all facets of the content area being measured (e.g., EPPP)

question

True or False: Content validity assessment requires a degree of agreement between experts in the subject matter, thus it includes an element of subjectivity?

answer

True- in addition, tests should correlate highly with other tests that measure the same content domain

question

In contrast to content validity, ________ occurs when a test appears valid to examinees, administrators, and other untrained observers; it is not technically a type of test validity.

answer

Face validity

question

A personality test that effectively predicts the future behavior of an examinee has what type of validity?

answer

Criterion-related validity, which is obtained by correlating scores on a predictor test to some external criterion (e.g., academic achievement, job performance)

question

Criterion-related validity is assessed using a/an ________ to determine the relationship between the predictor and the criterion; for interpretation this value can be squared, producing the "________," which indicates the proportion of variability in the criterion that is explained by variability in the predictor.

answer

Correlation coefficient; coefficient of determination

question

The process of ________ validation involves the predictor and the criterion being collected at the same time, providing information regarding a test's usefulness for predicting a given current behavior; ________ validation involves a waiting period between collection of predictor scores and criterion data, providing information regarding a test's usefulness for predicting future behavior.

answer

Concurrent; predictive

question

When interpreting a person's predicted score on a given criterion measure, the ________ will determine within what range of scores their actual score will likely fall.

answer

Standard Error of Estimate
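
Worked example (a minimal sketch): the usual formula is SEest = SDy * sqrt(1 - rxy^2), where SDy is the standard deviation of criterion scores and rxy is the validity coefficient. The function name and the example values are hypothetical.

```python
import math


def standard_error_of_estimate(sd_y, r_xy):
    """SEest = SD of criterion scores times sqrt(1 - validity squared)."""
    return sd_y * math.sqrt(1.0 - r_xy ** 2)


# Hypothetical criterion: SD = 10, validity coefficient = .60.
se_est = standard_error_of_estimate(10, 0.60)
print(round(se_est, 2))
```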

question

The standard error of measurement constructs a confidence interval around an examinee's ________ score (using a reliability coefficient), while the standard error of estimate does the same for an examinee's ________ score (using a validity coefficient).

answer

Obtained; predicted

question

Interviewees are given an aptitude test (predictor) to predict work success (criterion), with hiring contingent on achieving a certain minimum score, called a/an ________ score. The manager then rates performance on work tasks, an indication of success, and only those who score above a certain ________ are deemed successful.

answer

Predictor cutoff; criterion cutoff

question

Scoring above both the predictor and criterion cutoff points produces ________; scoring above the predictor cutoff point but below the criterion cutoff point produces ________; scoring below the predictor cutoff point but above the criterion cutoff point produces ________; and scoring below both the predictor and criterion cutoff points produces ________.

answer

True positives (valid acceptances); false positives (false acceptances); false negatives (invalid rejections); true negatives (valid rejections)
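
Worked example (a minimal sketch): the four outcomes follow from two comparisons against the cutoffs. The function name and the scores are hypothetical, and treating a score exactly at a cutoff as passing is an assumption.

```python
def classify(predictor, criterion, predictor_cutoff, criterion_cutoff):
    """Label one examinee by predictor and criterion cutoff status."""
    selected = predictor >= predictor_cutoff    # assumption: at cutoff passes
    successful = criterion >= criterion_cutoff  # assumption: at cutoff passes
    if selected and successful:
        return "true positive"
    if selected:
        return "false positive"
    if successful:
        return "false negative"
    return "true negative"


# Hypothetical cutoffs: 70 on the aptitude test, 3.0 on the manager rating.
print(classify(82, 3.5, 70, 3.0))
print(classify(82, 2.1, 70, 3.0))
print(classify(55, 3.5, 70, 3.0))
print(classify(55, 2.1, 70, 3.0))
```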

question

Some factors contributing to a low validity coefficient include the validation group being ________ or the predictor and/or criterion being ________.

answer

Homogeneous; unreliable

question

When a test has a different validity coefficient for one group compared to another, the variables affecting validity are called ________ variables; when this is the case, the test is said to have ________.

answer

Moderator; differential validity

question

This is the process whereby an already validated test is re-validated with a different sample of people than the original validation sample.

answer

Cross-validation

question

What term is used to describe the reduction that occurs in a criterion-related validity coefficient after cross-validation?

answer

Shrinkage

question

The greatest shrinkage occurs when the original validation sample is ________, the original item pool is ________, the number of items retained is ________ relative to the items in the item pool, and/or items are not chosen based on ________ or ________.

answer

Small; large; small; previously formulated hypothesis; experience with the criterion

question

________ is one way a predictor might end up looking more valid than it actually is, which occurs when predictor scores themselves influence any person's criterion status (e.g., manager is aware that factory worker did well on predictor, this knowledge positively influences manager's ratings on criterion performance).

answer

Criterion contamination

question

How is criterion contamination prevented?

answer

Criterion raters should have no prior knowledge of examinees' predictor scores

question

Theorized psychological variables (e.g., personality, intelligence) that are abstract and not directly observable are referred to as ________, hence ________ provides an indication of the degree to which an instrument measures or correlates with such variables.

answer

Construct; construct validity

question

A newly developed test of personality has a high correlation with the MMPI-2 and a low correlation with the Wechsler Memory Scale, indicating the test has both ________ validity and ________ validity, respectively.

answer

Convergent; discriminant/divergent - both are forms of construct validity

question

True or False: The only time a low correlation coefficient provides evidence of high validity is when discriminant validity is indicated due to there being a low correlation between 2 tests that measure different constructs?

answer

True- in all other cases, high validity is indicated by a high correlation coefficient

question

What complex procedure for assessing convergent and discriminant validity requires the assessment of 2 or more traits (e.g., personality, depression) by 2 or more methods (e.g., self-report, peer rating)?

answer

Multitrait-multimethod matrix

question

When using the multitrait-multimethod matrix, ________ validity is indicated when tests that measure the same traits are highly correlated, even when different methods of measurement are used; conversely, ________ validity is indicated when tests that measure different constructs are minimally correlated, even when the same method of measurement is used.

answer

Convergent; discriminant

question

The ________ coefficient is a reliability coefficient, as it indicates the correlation between a measure and itself; correlations between 2 measures that measure the same trait using different methods are called ________ coefficients; correlations between 2 measures that measure different traits using the same method are called ________ coefficients; and correlations between 2 measures that measure different traits using different methods are called ________ coefficients.

answer

Monotrait-monomethod; monotrait-heteromethod; heterotrait-monomethod; heterotrait-heteromethod

question

When assessing validity using the multitrait-multimethod matrix, convergent validity is indicated when there is a high ________ correlation, while discriminant validity is indicated by a low ________ correlation and further confirmed by a ________ heterotrait-heteromethod correlation.

answer

Monotrait-heteromethod; heterotrait-monomethod; low

question

________, often used to assess the construct validity of a test or tests, involves reducing a larger set of variables into fewer classified sets of variables based on the construct that is primarily "picked-up" by each measure; each variable is correlated with every other variable, creating a ________.

answer

Factor analysis; factor matrix

question

The main purpose of factor analysis is to reveal how many and to what degree underlying constructs, also called ________ due to the fact that the analysis does not directly intend to measure them, can account for scores on a larger number of tests.

answer

Latent variables

question

In a hypothetical factor analysis, the factor matrix indicates a correlation coefficient of .68 between the depression subscale of the MMPI-2 and Factor II. What term is used to describe the correlation between the depression subscale and Factor II?

answer

Factor loading, which refers to the correlation between a given test and a given factor (e.g., the depression subscale loads .68 on Factor II); it can be squared to determine the proportion of variability in the test that is explained by the factor

question

________ determines the proportion of variance of a test that is attributable to the factors; it is the sum of squared factor loadings.

answer

Communality (h-squared) - not the case when oblique rotation is used
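
Worked example (a minimal sketch, assuming orthogonal factors): communality is the sum of a test's squared factor loadings, and the remainder of the variance is unique to the test. The function name and the .30 loading on a first factor are hypothetical; the .68 loading echoes the earlier card.

```python
def communality(loadings):
    """Sum of squared loadings of one test across orthogonal factors."""
    return sum(loading ** 2 for loading in loadings)


# Hypothetical test loading .30 on Factor I and .68 on Factor II.
h_squared = communality([0.30, 0.68])
unique_variance = 1.0 - h_squared
print(round(h_squared, 4), round(unique_variance, 4))
```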

question

The amount of variability in a test that can be explained by whatever traits are represented by the factors is referred to as ________, while variance that is specific to the test and not explained by the factors is referred to as ________.

answer

Common variance (represents communality); unique variance (represents specificity)

question

In a factor analysis, these values indicate the amount of variance in all the tests accounted for by the factor; they are analyzed to determine whether or not the factor is accounting for a significant amount of variability in the tests.

answer

Eigenvalues (or explained variance)

question

If a factor analysis is performed on 8 tests, what is the largest the sum of the eigenvalues can be?

answer

Since the sum of the eigenvalues can be no larger than the number of tests included in the factor analysis, the answer is 8

question

A procedure that facilitates factor matrix interpretation is ________, which involves re-dividing the test's communalities so that a clearer pattern of loadings emerges.

answer

Rotation

question

Two general rotation strategies include ________ for factors that are uncorrelated (independent of each other) and ________ for correlated factors; the decision as to which one is used is based on the researcher's theoretical assumptions.

answer

Orthogonal; oblique

question

When construct validity is being assessed using factor analysis, a high correlation between a test and a factor the test is expected to correlate highly with is referred to as what?

answer

Factorial validity

question

While factor analysis assumes variance in a variable is composed of ________, ________, and ________, principal components analysis assumes variance is composed of ________ and ________.

answer

Communality; specificity; error; explained variance; error variance

question

"Factor" is to factor analysis as ________ or ________ is to principal components analysis.

answer

Principal component; eigenvector

question

What method might a researcher who is interested in developing a taxonomy (classification system) of different personality characteristics use?

answer

Cluster analysis

question

In ________ analysis, only interval and ratio data can be used and researchers typically have an a priori hypothesis about what traits a set of variables measure; by contrast, ________ can be performed using any type of data (interval, ratio, nominal, ordinal) and is not designed for studies where the researcher has an a priori hypothesis.

answer

Factor analysis; cluster analysis

question

True or False: A reliable test is not always a valid test, though a valid test must be a reliable test?

answer

True- reliability is a necessary but not sufficient condition for validity

question

The ________ coefficient is less than or equal to the square root of the ________ coefficient; it cannot be any higher, thus the latter sets a ________ on the former.

answer

Validity; reliability; ceiling (or upper-limit)

question

A researcher discovers a test has low reliability; however, she is interested in what the validity coefficient of the predictor would be if both the predictor and the criterion were perfectly reliable. What formula would she use?

answer

Correction for attenuation
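
Worked example (a minimal sketch): the correction divides the observed validity coefficient by the square root of the product of the two reliability coefficients. The function name and the example coefficients are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated validity if predictor and criterion were perfectly reliable."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed validity .40, predictor reliability .64,
# criterion reliability .81.
corrected = correct_for_attenuation(0.40, 0.64, 0.81)
print(round(corrected, 3))
```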

question

What is the correlation between the factors in a factor analysis where an orthogonal rotation is used?

answer

By definition, the correlation would be 0.0

question

What is used to determine which test items will be retained for the final version of a test and to ensure that a test is both reliable and valid from the start?

answer

Item analysis

question

The ________ the p-value, the ________ the item.

answer

Higher; less difficult OR lower; more difficult

question

The percentage of examinees that answer an item correctly is referred to as a/an ________, which is abbreviated ________; most test developers prefer items with a ________ value at or around ________.

answer

Item difficulty index; p; p; .50

question

The rule-of-thumb for item difficulty on a test is that the optimal difficulty level of test items should be approximately halfway between 1.0 (i.e., everyone is correct) and the level of success expected by chance alone. That known, what is the optimal item difficulty level of a multiple choice test with 4 options (e.g., EPPP)?

answer

p = .625, the midpoint between 1.0 and the chance success rate of .25 for 4 options [(1.0 + .25) / 2]; ideally, about 62.5% of examinees answer the item correctly
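
Worked example (a minimal sketch): the rule of thumb is the midpoint between 1.0 and the chance success rate, where chance is 1 divided by the number of answer options. The function name is hypothetical.

```python
def optimal_item_difficulty(num_options):
    """Midpoint between perfect performance (1.0) and chance (1 / options)."""
    chance = 1.0 / num_options
    return (1.0 + chance) / 2.0


print(optimal_item_difficulty(4))  # four-option multiple choice
print(optimal_item_difficulty(2))  # true/false
```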

question

According to Anastasi, the p-level expresses item difficulty in terms of an ________ scale, as conclusions cannot be made about the differences in difficulty between items, only that certain items are easier/harder than others.

answer

Ordinal (difficulty levels are rankings, according to Anastasi)

question

The degree to which a test item differentiates among test-takers in terms of the behavior the test is designed to measure is called ________ and can be assessed by calculating a/an ________, which is abbreviated as "________."

answer

Item discrimination; item discrimination index; D
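
Worked example (a minimal sketch): D is commonly computed as the proportion of a high-scoring group that answers the item correctly minus the proportion of a low-scoring group that does. The function name and the counts are hypothetical, and how the upper and lower groups are formed is left as an assumption.

```python
def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    """D = proportion correct in upper group minus proportion in lower group."""
    return upper_correct / upper_n - lower_correct / lower_n


# Hypothetical item: 8 of 10 high scorers and 3 of 10 low scorers pass it.
d = discrimination_index(8, 10, 3, 10)
print(round(d, 2))
```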

question

An item on a measure of anxiety would have good ________ if low-anxiety examinees consistently answered it differently than high-anxiety examinees.

answer

Discriminability (item discrimination)

question

An item's ________ level places a ceiling on its ________ index; higher levels of discriminability are associated with ________ levels of difficulty.

answer

Difficulty; discrimination; moderate

question

True or False: The reliability of a test will decrease as the mean discrimination index (D) increases?

answer

False- there is a direct correlation between test reliability and mean D

question

A graphical depiction of both item difficulty and item discrimination is called a/an ________; analysis based on ________ is derived from these.

answer

Item characteristic curve (ICC); item response theory

question

An item characteristic curve identifies 3 ________, including item difficulty, item discrimination, and ________.

answer

Parameters; the probability that a question can be answered correctly by guessing (indicated on a chart by the point at which the curve crosses the y-axis)

question

Item response theory assumes (1) performance on an item is related to the estimated amount of a/an ________ being measured by the item, and (2) ________ (an item should have the same characteristics regardless of the sample of people taking the test).

answer

Latent trait; invariance of item parameters

question

The computerized selection of test items for individual examinees is referred to as what?

answer

Computer adaptive assessment

question

What item difficulty level is associated with the maximum level of differentiation among examinees?

answer

.50, indicating half answered correctly and half answered incorrectly

question

What factor most affects an item's difficulty level?

answer

Characteristics of examinees

question

What type of interpretation indicates where the examinee stands in relation to others who have taken the same test?

answer

Norm-referenced interpretation

question

Providing a general indication as to the progression a person has made along the normal developmental path, ________ norms include ________ and ________.

answer

Developmental; mental age; grade equivalent scores

question

What is the calculation for ratio IQ?

answer

(mental age/chronological age) x 100

question

A 20-year-old performs as well on a test as the average 10-year-old. His mental age is ________ and his ratio IQ is ________.

answer

10-years-old; 50
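
Worked example (a minimal sketch): the ratio IQ calculation applied to the case above. The function name is hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) x 100."""
    return (mental_age / chronological_age) * 100


# A 20-year-old performing like the average 10-year-old.
print(ratio_iq(10, 20))
```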

question

Indicating the grade level a person's performance is equivalent to, ________ are typically used in the interpretation of educational achievement tests.

answer

Grade equivalent scores (e.g., Wide Range Achievement Test, 4th Ed [WRAT-4])

question

True or False: When using developmental norms, scores obtained by people of different age groups are not comparable?

answer

True- this is due to the fact that standard deviation is not accounted for

question

Including percentile ranks and standard scores, ________ norms compare examinee scores to those of the most nearly comparable standardization sample.

answer

Within-group

question

Z-scores, t-scores, stanine scores, and deviation IQ scores are all examples of ________, which express a raw score's distance from the mean in terms of standard deviation.

answer

Standard scores

question

Identify the mean (M) and standard deviation (sd) of: z-scores, t-scores, stanine scores, and deviation IQ scores.

answer

Z-score (M = 0, sd = 1), t-score (M = 50, sd = 10), stanine (M = 5, sd = about 2), deviation IQ (M = 100, sd = 15)
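
Worked example (a minimal sketch): with the means and standard deviations listed above, a raw score can be converted to a z-score and then rescaled. The function names and the raw score of 65 on a test with M = 50 and sd = 10 are hypothetical.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def t_score(z):
    """T-score scale: M = 50, sd = 10."""
    return 50 + 10 * z


def deviation_iq(z):
    """Deviation IQ scale: M = 100, sd = 15."""
    return 100 + 15 * z


# Hypothetical raw score of 65 on a test with M = 50 and sd = 10.
z = z_score(65, 50, 10)
print(z, t_score(z), deviation_iq(z))
```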

09 Test – Flashcard
