Question 1

Reliability, in a broad statistical sense, is synonymous with:

Accepted Answer

consistency.

Question 2

A source of error variance may take the form of:

Accepted Answer

item sampling, test takers' reactions to environment-related variables such as room temperature and lighting, and test taker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects (all of these.)

Question 3

Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?

Accepted Answer

a test-retest estimate
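
As a rough illustration of the answer above, a test-retest estimate can be sketched as a Pearson correlation between scores from two administrations. The scores and the helper name `pearson_r` are hypothetical, chosen only for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on two administrations
# of the same test.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

A value near 1 means people kept roughly the same rank order across the two administrations.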

Question 4

A reliability coefficient is:

Accepted Answer

an index, the proportion of total variance attributable to true variance, and unaffected by a systematic source of error (All of these.)
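
A minimal sketch of the variance ratio described above, assuming total variance is simply true variance plus error variance (the classical test theory model). The function name and the numbers are illustrative only.

```python
def reliability_coefficient(true_variance, error_variance):
    """Proportion of total variance that is true variance."""
    total_variance = true_variance + error_variance
    return true_variance / total_variance


# Hypothetical variance components: 80 units true, 20 units error.
r_xx = reliability_coefficient(true_variance=80, error_variance=20)
print(r_xx)
```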

Question 5

What is the difference between alternate forms and parallel forms of a test?

Accepted Answer

Alternate forms do not necessarily yield test scores with equal means and variances.

Question 6

Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?

Accepted Answer

alternate-form

Question 7

An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than:

Accepted Answer

6 months.

Question 8

As the reliability of a test increases, the standard error of measurement:

Accepted Answer

decreases.
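
The relationship can be sketched with the usual formula SEM = SD × sqrt(1 − r), where SD is the standard deviation of test scores and r is the reliability coefficient. The standard deviation of 15 and the reliability values below are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


# Hypothetical test with a standard deviation of 15.
for reliability in (0.75, 0.91, 0.96):
    sem = standard_error_of_measurement(15, reliability)
    print(reliability, round(sem, 2))
```

With the standard deviation held fixed, each increase in reliability gives a smaller SEM.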

Question 9

Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is relatively stable over time?

Accepted Answer

test-retest

Question 10

Which of the following is true of systematic error?

Accepted Answer

It has no effect on the reliability of a measure.
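
One way to see this, assuming the systematic error is a constant added to every score (for example, a scale that always reads 3 points high): the mean shifts, but the spread of scores does not. The scores below are made up.

```python
from statistics import mean, pvariance

# Hypothetical observed scores without the systematic error.
scores = [10, 12, 15, 18, 20]

# The same scores with a constant systematic error of +3 points.
biased = [s + 3 for s in scores]

print(mean(scores), mean(biased))            # means differ by the constant
print(pvariance(scores), pvariance(biased))  # variances are the same
```

Because every score moves by the same amount, relative standing and variability are unchanged, so consistency-based estimates are not affected.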

Question 11

Computer-scorable items have tended to eliminate error variance due to:

Accepted Answer

scorer differences.

Question 12

Which of the following might lead to a decrease in test-retest reliability?

Accepted Answer

the passage of time between the two administrations of the test, coaching designed to increase test scores between the two administrations of the test, and practice with similar test materials between the two administrations of the test (All of these.)

Question 13

If items from a test are measuring the same trait, estimates of reliability yielded from KR-20 will typically be ________ as compared to estimates from split-half methods.

Accepted Answer

higher

Question 14

Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?

Accepted Answer

two test administrations with the same group are required; test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy; and item sampling is a source of error variance (All of these.)

Question 15

If traditional measures of reliability are applied to criterion-referenced tests, the reliability estimates will likely be:

Accepted Answer

spuriously high.

Question 16

Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.

Accepted Answer

stability; internal consistency

Question 17

For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.

Accepted Answer

lower

Question 18

Which of the following factors may influence a split-half reliability estimate?

Accepted Answer

fatigue, anxiety, and item difficulty (all of these.)

Question 19

KR-20 is the statistic of choice for tests with which types of items?

Accepted Answer

multiple-choice and true-false (all of these.)
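
KR-20 applies to items scored right (1) or wrong (0). A minimal sketch of the standard formula, KR-20 = (k / (k − 1)) × (1 − Σpq / σ²), follows; the response matrix is hypothetical, and population variance is used for the total scores so that it matches the p × q item variances.

```python
def kr20(responses):
    """KR-20 for a list of rows, one row of 0/1 item scores per test taker."""
    n_people = len(responses)
    k = len(responses[0])

    # Variance of total scores (population form).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: five test takers, four dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))
```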

Question 20

The Spearman-Brown formula is used for:

Accepted Answer

correcting for one half of the test by estimating the reliability of the whole test, determining how many additional items are needed to increase reliability up to a certain level, and determining how many items can be eliminated without reducing reliability below a predetermined level (all of these.)
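
The three uses listed above all come from one formula, r_new = n × r / (1 + (n − 1) × r), where n is the factor by which test length changes. A short sketch, with function names and input values chosen only for illustration:

```python
def spearman_brown(r, n):
    """Estimated reliability after changing test length by a factor of n."""
    return (n * r) / (1 + (n - 1) * r)


def length_factor_needed(r, target):
    """Factor by which the test must be lengthened to reach the target reliability."""
    return (target * (1 - r)) / (r * (1 - target))


# Half-test correlation of .60 stepped up to a whole-test estimate (n = 2).
print(spearman_brown(0.60, 2))

# How much longer must a test with reliability .60 be to reach .90?
print(length_factor_needed(0.60, 0.90))

# Estimated reliability if a test with reliability .90 is cut in half (n = 0.5).
print(spearman_brown(0.90, 0.5))
```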

Question 21

Typically, adding items to a test will have what effect on the test's reliability?

Accepted Answer

Reliability will increase.

Question 22

Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?

Accepted Answer

Assign easy items to one half of the test and difficult items to the other half.

Question 23

Coefficient alpha is appropriate to use with all of the following test formats EXCEPT:

Accepted Answer

essay exam with no partial credit awarded.

Question 24

Which of the following is TRUE about coefficient alpha?

Accepted Answer

It is a characteristic of a particular set of scores, not of the test itself.
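
Because alpha is computed from observed responses, a different group of test takers can yield a different value for the same test. A minimal sketch of the usual formula, alpha = (k / (k − 1)) × (1 − Σ item variances / total-score variance), using made-up ratings on a 1 to 5 scale:

```python
from statistics import pvariance


def coefficient_alpha(responses):
    """Coefficient alpha for a list of rows, one row of item scores per person."""
    k = len(responses[0])
    item_variances = [
        pvariance([row[j] for row in responses]) for j in range(k)
    ]
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five people, three items rated 1 to 5.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [2, 2, 1],
    [5, 4, 5],
    [1, 1, 1],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))
```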

Question 25

A police officer mistakenly records the blood alcohol level of a suspected drunk driver after administering a breathalyzer test. This mistake is most related to which type of reliability?

Accepted Answer

interscorer

Question 26

A coefficient alpha over .9 may indicate that:

Accepted Answer

the items in the test are redundant.

Question 27

Which best conveys the meaning of an inter-scorer reliability estimate of .90?

Accepted Answer

Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.

Question 28

If a time limit is long enough to allow test takers to attempt all items, and if some items are so difficult that no test taker is able to obtain a perfect score, then the test is referred to as a ________ test.

Accepted Answer

power

Question 29

If a test is homogeneous:

Accepted Answer

it is functionally uniform throughout, it will likely yield a high internal-consistency reliability estimate compared with test-retest, and it would be reasonable to expect a high degree of internal consistency (all of these.)

Question 30

Which type(s) of reliability estimates would be most appropriate for a measure of heart rate?

Accepted Answer

test-retest

Question 31

Typically, speed tests:

Accepted Answer

contain items of a uniform difficulty level.

Question 32

Which type(s) of reliability estimates would be appropriate for a speed test?

Accepted Answer

test-retest, alternate-form, and split-half from two independent testing sessions (all of these.)

Question 33

Generalizability theory is most closely related to:

Accepted Answer

test reliability.

Question 34

In classical test theory, there exists only one true score. In Cronbach's generalizability theory, how many of these true scores exist?

Accepted Answer

many, depending on the number of different universes

Question 35

Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:

Accepted Answer

is minimized with criterion-referenced tests.

Question 36

A test is considered valid when the test:

Accepted Answer

measures what it purports to measure.

Question 37

Face validity refers to:

Accepted Answer

the appearance of relevancy of the test items.

Question 38

Which is NOT a method of evaluating the validity of a test?

Accepted Answer

evaluating the percentage of passing and failing grades on the test

Question 39

Predictive and concurrent validity can be subsumed under:

Accepted Answer

criterion-related validity

Question 40

Face validity:

Accepted Answer

may influence the way the test-taker approaches the situation, relates more to what the test appears to measure than what the test may actually measure, and has received little attention and is given short shrift as compared to other indices of validity (all of these.)

Question 41

Which assessment technique has the MOST face validity?

Accepted Answer

administering a word processing test to a person applying to be a word processor

Question 42

Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the __________ validity of a test.

Accepted Answer

criterion-related

Question 43

An instructor announces that an examination will cover the topics of reliability and validity. A student boasts that he will read and study only the material on reliability. In fact, all the test questions are only on reliability. The best conclusion a student of assessment could draw from this is that:

Accepted Answer

the examination lacked content validity.

Question 44

Before constructing a comprehensive final examination, your instructor reviews the objectives of the course, the textbook, and all lecture notes. Your instructor is making an effort to maximize the __________ validity of the final examination.

Accepted Answer

content

Question 45

Lawshe devised a method for determining agreement among raters or judges who rate items on how essential they are. This method provides a way to quantify what type of validity?

Accepted Answer

content

Question 46

In calculating the content validity ratio, panelists are asked to determine:

Accepted Answer

if the skill or knowledge measured by the item is essential.
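
Lawshe's content validity ratio for a single item is CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists rating the item essential and N is the total number of panelists. A short sketch with hypothetical panel counts:

```python
def content_validity_ratio(n_essential, n_panelists):
    """CVR = (n_e - N/2) / (N/2) for one item."""
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 judges rating one item.
print(content_validity_ratio(9, 10))   # most judges say "essential"
print(content_validity_ratio(5, 10))   # exactly half say "essential"
print(content_validity_ratio(2, 10))   # few judges say "essential"
```

The ratio is positive when more than half the panel rates the item essential, zero at exactly half, and negative below that.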

Question 47

A standard against which a test or test score is evaluated is known as:

Accepted Answer

a criterion.

Question 48

The minimum value of a content validity ratio necessary to be statistically significant at the .05 level is dependent on:

Accepted Answer

the number of panelists judging the items.

Question 49

Which may best be viewed as varieties of criterion-related validity?

Accepted Answer

concurrent validity and predictive validity

Question 50

The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as:

Accepted Answer

concurrent validity.

Question 51

The form of criterion-related validity that reflects the degree to which a test score correlates with a criterion measure that was obtained some time subsequent to the test score is known as:

Accepted Answer

predictive validity.

Question 52

A key difference between concurrent and predictive validity has to do with:

Accepted Answer

the time frame during which data on the criterion measure is collected.

Question 53

Which is an example of a criterion?

Accepted Answer

achievement test scores, success in being able to repair a defective toaster, and student ratings of teaching effectiveness (all of these.)

Question 54

An index of utility can be distinguished from an index of reliability and an index of validity in that an index of utility can tell us something about:

Accepted Answer

the practical value of the information derived from what a test measures.

Question 55

Test validity:

Accepted Answer

sets a ceiling on test utility.

Question 56

One of the noneconomic benefits of a diagnostic test used to make decisions about involuntary hospitalization of psychiatric patients is a benefit to:

Accepted Answer

society-at-large.

Question 57

Costs associated with testing include all of the following EXCEPT:

Accepted Answer

return on investment.

Question 58

The end-point of a utility analysis is typically an educated decision about:

Accepted Answer

which of many possible courses of action is optimal.

Question 59

A utility analysis is conducted using:

Accepted Answer

expectancy tables, Naylor-Shine tables, and Taylor-Russell tables (All of these.)

Question 60

If targeted test-takers for a particular test consistently fail to follow the directions for taking the test then:

Accepted Answer

the test could still have great utility and the test could still be valid (both of these.)

Question 61

Validity is to ____________ as utility is to ____________.

Accepted Answer

accuracy; usefulness

Question 62

A potential noneconomic benefit of a well-run evaluation program is:

Accepted Answer

increase in quantity and quality of workers' on-the-job performance, decrease in time it takes to train new workers, and reduction in the number of workplace accidents (All of these.)

Question 63

The Angoff method of setting cutting scores relies heavily on:

Accepted Answer

the judgment of experts.
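
A sketch of one common way the expert judgments are combined, assuming each expert estimates, for every item, the probability that a minimally competent test taker would answer it correctly. Each expert's estimates are summed across items and the sums are averaged across experts. The ratings below are invented for illustration.

```python
def angoff_cut_score(ratings):
    """Average, across experts, of each expert's summed item probabilities."""
    per_expert_totals = [sum(expert_ratings) for expert_ratings in ratings]
    return sum(per_expert_totals) / len(per_expert_totals)


# Hypothetical judgments: three experts, four items.
ratings = [
    [0.8, 0.6, 0.5, 0.9],
    [0.7, 0.7, 0.4, 0.8],
    [0.9, 0.5, 0.6, 0.8],
]
cut_score = angoff_cut_score(ratings)
print(round(cut_score, 2))
```

If the experts' per-item estimates diverge widely, the averaged cut score rests on weak agreement among the judges.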

Question 64

The "Achilles heel" of the Angoff method is:

Accepted Answer

interrater reliability.

Final Exam Review Test Questions Answers – Flashcards
