Final Exam Review Test Questions Answers – Flashcards
question
Reliability, in a broad statistical sense, is synonymous with:
answer
consistency.
question
A source of error variance may take the form of:
answer
item sampling, test takers' reactions to environment-related variables such as room temperature and lighting, and test taker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects (all of these.)
question
Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?
answer
a test-retest estimate
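A minimal sketch of how a test-retest estimate can be computed, assuming the Pearson correlation is the statistic used. The function name and the scores are hypothetical and only illustrate the idea.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same five people on two administrations.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]
print(round(pearson_r(first_administration, second_administration), 3))
```

A value near 1 suggests that people kept roughly the same rank order across the two administrations.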
question
A reliability coefficient is:
answer
an index, a ratio of true variance to total variance, and unaffected by a systematic source of error (all of these.)
question
What is the difference between alternate forms and parallel forms of a test?
answer
Alternate forms do not necessarily yield test scores with equal means and variances.
question
Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?
answer
alternate-form
question
An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than:
answer
6 months.
question
As the reliability of a test increases, the standard error of measurement:
answer
decreases.
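The inverse relationship in the card above can be sketched with the classical test theory formula SEM = SD * sqrt(1 - r). The function name and the sample values (SD of 15, four reliabilities) are assumptions chosen for illustration.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), the classical test theory formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Illustrative values: a scale with SD = 15 at increasing reliabilities.
for r in (0.70, 0.80, 0.90, 0.95):
    sem = standard_error_of_measurement(15, r)
    print(f"reliability = {r:.2f}  SEM = {sem:.2f}")
```

Each step up in reliability prints a smaller SEM, which is the pattern the card describes.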
question
Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is relatively stable over time?
answer
test-retest
question
Which of the following is true of systematic error?
answer
It has no effect on the reliability of a measure.
question
Computer-scorable items have tended to eliminate error variance due to:
answer
scorer differences.
question
Which of the following might lead to a decrease in test-retest reliability?
answer
the passage of time between the two administrations of the test, coaching designed to increase test scores between the two administrations of the test, and practice with similar test materials between the two administrations of the test (All of these.)
question
If items from a test are measuring the same trait, estimates of reliability yielded from KR-20 will typically be ________ as compared to estimates from split-half methods.
answer
higher
question
Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?
answer
Two test administrations with the same group are required, Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy, and Item sampling is a source of error variance (All of these.)
question
If traditional measures of reliability are applied to criterion-referenced tests, the reliability estimates will likely be:
answer
spuriously high.
question
Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.
answer
stability; internal consistency
question
For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.
answer
lower
question
Which of the following factors may influence a split-half reliability estimate?
answer
fatigue, anxiety, and item difficulty (all of these.)
question
KR-20 is the statistic of choice for tests with which types of items?
answer
multiple-choice and true-false (all of these.)
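KR-20 applies to items scored right or wrong (1 or 0). Below is a minimal sketch of the usual formula, KR-20 = (k / (k - 1)) * (1 - sum(p * q) / total-score variance), assuming population variance is used for the total scores. The function name and the response matrix are hypothetical.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores (rows = test takers, columns = items)."""
    n = len(responses)
    k = len(responses[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance of total scores
    if var_total == 0:
        raise ValueError("total scores must vary")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n  # proportion passing item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical data: five test takers, four dichotomous items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 3))
```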
question
The Spearman-Brown formula is used for:
answer
correcting for one half of the test by estimating the reliability of the whole test, determining how many additional items are needed to increase reliability up to a certain level, and determining how many items can be eliminated without reducing reliability below a predetermined level (all of these.)
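The three uses listed above all follow from one formula, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which test length changes. The sketch below assumes that standard form; the function names and the example values are hypothetical.

```python
def spearman_brown(r, n):
    """Estimated reliability when test length is multiplied by n."""
    return (n * r) / (1 + (n - 1) * r)

def length_factor(r, target):
    """Factor by which the test must be lengthened (or shortened) to reach target."""
    if not (0 < r < 1 and 0 < target < 1):
        raise ValueError("r and target must be strictly between 0 and 1")
    return (target * (1 - r)) / (r * (1 - target))

# Correcting a half-test correlation of .60 to estimate whole-test reliability (n = 2).
print(round(spearman_brown(0.60, 2), 2))

# How much longer a test with reliability .60 must be to reach .90.
print(round(length_factor(0.60, 0.90), 2))
```

A length factor below 1 indicates how far the test could be shortened while still reaching the target reliability.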
question
Typically, adding items to a test will have what effect on the test's reliability?
answer
Reliability will increase.
question
Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?
answer
Assign easy items to one half of the test and difficult items to the other half.
question
Coefficient alpha is appropriate to use with all of the following test formats EXCEPT:
answer
essay exam with no partial credit awarded.
question
Which of the following is TRUE about coefficient alpha?
answer
It is a characteristic of a particular set of scores, not of the test itself.
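Because coefficient alpha is computed from item and total-score variances, it depends on the particular scores in hand. A minimal sketch, assuming the usual formula alpha = (k / (k - 1)) * (1 - sum of item variances / total-score variance); the function name and the ratings are hypothetical.

```python
from statistics import pvariance

def coefficient_alpha(scores):
    """Coefficient alpha for a score matrix (rows = test takers, columns = items)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: four respondents, three items on a 1-5 scale.
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
print(round(coefficient_alpha(ratings), 3))
```

Running the same function on a different group's scores would generally give a different value, which is the point of the card above.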
question
A police officer mistakenly records the blood alcohol level of a suspected drunk driver after administering a breathalyzer test. This mistake is most related to which type of reliability?
answer
interscorer
question
A coefficient alpha over .9 may indicate that:
answer
the items in the test are redundant.
question
Which best conveys the meaning of an inter-scorer reliability estimate of .90?
answer
Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.
question
If a time limit is long enough to allow test takers to attempt all items, and if some items are so difficult that no test taker is able to obtain a perfect score, then the test is referred to as a ________ test.
answer
power
question
If a test is homogeneous:
answer
it is functionally uniform throughout, it will likely yield a high internal-consistency reliability estimate compared with test-retest, and it would be reasonable to expect a high degree of internal consistency (all of these.)
question
Which type(s) of reliability estimates would be most appropriate for a measure of heart rate?
answer
test-retest
question
Typically, speed tests:
answer
contain items of a uniform difficulty level.
question
Which type(s) of reliability estimates would be appropriate for a speed test?
answer
test-retest, alternate-form, and split-half from two independent testing sessions (all of these.)
question
Generalizability theory is most closely related to:
answer
test reliability.
question
In classical test theory, there exists only one true score. In Cronbach's generalizability theory, how many of these true scores exist?
answer
many, depending on the number of different universes
question
Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
answer
is minimized with criterion-referenced tests.
question
A test is considered valid when the test:
answer
measures what it purports to measure.
question
Face validity refers to:
answer
the appearance of relevancy of the test items.
question
Which is NOT a method of evaluating the validity of a test?
answer
evaluating the percentage of passing and failing grades on the test
question
Predictive and concurrent validity can be subsumed under:
answer
criterion-related validity
question
Face validity
answer
may influence the way the test-taker approaches the situation, relates more to what the test appears to measure than what the test may actually measure, and has received little attention and is given short shrift as compared to other indices of validity (all of these.)
question
Which assessment technique has the MOST face validity?
answer
administering a word processing test to a person applying to be a word processor
question
Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the __________ validity of a test.
answer
criterion-related
question
An instructor announces that an examination will cover the topics of reliability and validity. A student boasts that he will read and study only the material on reliability. In fact, all the test questions are only on reliability. The best conclusion a student of assessment could draw from this is that:
answer
the examination lacked content validity.
question
Before constructing a comprehensive final examination, your instructor reviews the objectives of the course, the textbook, and all lecture notes. Your instructor is making an effort to maximize the __________ validity of the final examination.
answer
content
question
Lawshe devised a method for determining agreement among raters or judges who rate items on how essential they are. This method provides a way to quantify what type of validity?
answer
content
question
In calculating the content validity ratio, panelists are asked to determine:
answer
if the skill or knowledge measured by the item is essential.
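Lawshe's content validity ratio is computed per item from the panelists' "essential" judgments: CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating the item essential and N is the panel size. The sketch below assumes that formula; the function name and the panel sizes are hypothetical.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR = (n_e - N/2) / (N/2)."""
    if n_panelists <= 0 or not 0 <= n_essential <= n_panelists:
        raise ValueError("need 0 <= n_essential <= n_panelists and n_panelists > 0")
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 judges rating three items.
for essential in (10, 8, 5):
    print(essential, content_validity_ratio(essential, 10))
```

The ratio is negative when fewer than half of the panelists rate the item essential, zero at exactly half, and positive above half.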
question
A standard against which a test or test score is evaluated is known as:
answer
a criterion.
question
The minimum value of a content validity ratio necessary to be statistically significant at the .05 level is dependent on:
answer
the number of panelists judging the items.
question
Which may best be viewed as varieties of criterion-related validity?
answer
concurrent validity and predictive validity
question
The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as:
answer
concurrent validity.
question
The form of criterion-related validity that reflects the degree to which a test score correlates with a criterion measure that was obtained some time subsequent to the test score is known as:
answer
predictive validity.
question
A key difference between concurrent and predictive validity has to do with:
answer
the time frame during which data on the criterion measure is collected.
question
Which is an example of a criterion?
answer
achievement test scores, success in being able to repair a defective toaster, and student ratings of teaching effectiveness (all of these.)
question
An index of utility can be distinguished from an index of reliability and an index of validity in that an index of utility can tell us something about:
answer
the practical value of the information derived from what a test measures.
question
Test validity:
answer
sets a ceiling on test utility.
question
One of the noneconomic benefits of a diagnostic test used to make decisions about involuntary hospitalization of psychiatric patients is a benefit to:
answer
society-at-large.
question
Costs associated with testing include all of the following EXCEPT:
answer
return on investment.
question
The end-point of a utility analysis is typically an educated decision about:
answer
which of many possible courses of action is optimal.
question
A utility analysis is conducted using:
answer
expectancy tables, Naylor-Shine tables, and Taylor-Russell tables (All of these.)
question
If targeted test-takers for a particular test consistently fail to follow the directions for taking the test then:
answer
the test could still have great utility and the test could still be valid (both of these.)
question
Validity is to ____________ as utility is to ____________.
answer
accuracy; usefulness
question
A potential noneconomic benefit of a well-run evaluation program is:
answer
increase in quantity and quality of workers' on-the-job performance, decrease in time it takes to train new workers, and reduction in the number of workplace accidents (All of these.)
question
The Angoff method of setting cutting scores relies heavily on:
answer
the judgment of experts.
question
The "Achilles heel" of the Angoff method is:
answer
interrater reliability.
question
A hospital uses a compensatory model of selection in hiring surgeons. In their hiring evaluations, ratings regarding past safety record are given more weight than ratings regarding the surgeon's "bedside manner." From this, one could reasonably conclude that the people who are in charge of hiring surgeons believe that:
answer
bedside manner is less important compared to surgical safety.
question
The term item-mapping refers to an IRT-based method of:
answer
setting cut scores that entails an ordering or histographic representation of test items.
question
Which of the following is a direct economic cost that could result as a consequence of NOT evaluating personnel for employment positions within a large corporation?
answer
the cost of lawsuits against the corporation
question
The idea for a new test may come from:
answer
social need, review of the available literature, and common sense appeal (all of these.)
question
This term is used to refer to the preliminary research surrounding the creation of a prototype of a test:
answer
pilot work, pilot study, and pilot research (all of these.)
question
Often used for the purpose of licensing persons in professions, these tests are called:
answer
criterion-referenced tests
question
Likert scales measure attitudes using continuums. A continuum of items measuring ___________ could be used for a Likert scale.
answer
like it or not, agree/disagree, and approve to do not approve (All of these.)
question
Test items that contain alternatives with five points ranging from "strongly agree" to "strongly disagree" are characterized as using this approach to scaling:
answer
Likert scaling.
question
Guttman scales:
answer
typically are constructed so that agreement with one statement may predict agreement with another statement.
question
Which is an example of the selected-response item format?
answer
a multiple-choice item
question
Having a large item pool available during test revision is:
answer
an advantage because poor items can be deleted in favor of the good items.
question
A well-written true-false item:
answer
has a correct response that is veritably true or false, and not subject to debate.
question
Computer-adaptive testing has been found to:
answer
reduce by as much as half the number of test items administered.
question
Item branching refers to:
answer
administering certain test items on a test depending on the test-takers' responses to previous test items.
question
Which statement is TRUE of the test tryout phase of test construction?
answer
Test conditions should be as similar to the actual administration as possible.
question
The item-validity index is key in determining:
answer
criterion-related validity.
question
An item-difficulty index of 1 occurs when:
answer
all examinees answer the item correctly.
question
The higher the item-difficulty index, the ________ the item.
answer
easier
question
In item analysis, the term item endorsement refers to the percent of test-takers who:
answer
indicate that they agree with a particular item.
question
An item-reliability index provides a measure of a test's:
answer
internal consistency.
question
An item-difficulty index can range from ________ to ________.
answer
0; 1
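The item-difficulty index in the cards above is the proportion of test takers who answer the item correctly, which is why it runs from 0 to 1 and why a higher value means an easier item. A minimal sketch, with a hypothetical function name and made-up responses:

```python
def item_difficulty(item_scores):
    """Proportion of test takers answering the item correctly (scores are 0 or 1)."""
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)

# Hypothetical responses from eight test takers to three items.
easy_item = [1, 1, 1, 1, 1, 1, 1, 1]   # everyone correct
medium_item = [1, 0, 1, 0, 1, 1, 0, 0]
hard_item = [0, 0, 0, 1, 0, 0, 0, 0]

for name, scores in (("easy", easy_item), ("medium", medium_item), ("hard", hard_item)):
    print(name, item_difficulty(scores))
```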
question
In Sternberg's study of the characteristics of academic intelligence, laypeople stressed the "interpersonal and social aspects," whereas experts stressed:
answer
motivation.
question
The test that launched the testing movement in the United States was the ______ test.
answer
Stanford-Binet
question
Neisser argued that intelligence:
answer
cannot be explicitly defined.
question
What conclusion concerning intelligence could reasonably be drawn based on the 1921 symposium published in the Journal of Educational Psychology?
answer
Experts had a multitude of definitions of intelligence.
question
Binet believed that the primary purpose of an intelligence test was to assist the test user in:
answer
classification.
question
Galton's conception of intelligence focused on:
answer
sensory abilities.
question
The Wechsler tests of intelligence:
answer
measure more than two factors.
question
According to Wechsler, as cited in the text, intelligence should be conceived as a __________ capacity that is best measured by measuring ______________ abilities.
answer
global; qualitatively differentiable
question
The Stanford-Binet-5 is based on which theory?
answer
Cattell-Horn-Carroll theory of intellectual abilities.
question
Binet, Wechsler, and Piaget would most likely agree with which of the following statements?
answer
"Heredity and environment interact to influence the development of intelligence, but a person may not exceed his or her genetic potential."
question
According to Wechsler's approach to cognitive assessment of adults and children, which of the following is TRUE?
answer
Similar tasks may be used on different tests, but the actual content of the items will differ at different age levels.
question
The WPPSI-III is used to measure the intelligence of children from ages ________ through ________.
answer
2.5; 7.25
question
The concepts of social intelligence, concrete intelligence, and abstract intelligence are collectively best associated with which theorist?
answer
Thorndike
question
Which of the following is NOT true of Piaget's stages?
answer
The stages were adapted from Binet's work with children.
question
According to Piaget, a form of cognitive structure or organization is referred to as:
answer
a schema.
question
Which statement is NOT true of Cattell's two-factor theory of intelligence?
answer
Crystallized intelligence is relatively culture-free.
question
Who first hypothesized that the proportion of the variance that a number of tests have in common accounts for a general factor of intelligence?
answer
Spearman
question
Logical-mathematical, bodily-kinesthetic, linguistic, musical, spatial, interpersonal, and intrapersonal intelligence are all associated with which theory of intelligence?
answer
Gardner's theory of multiple intelligences
question
Crystallized intelligence includes:
answer
application of general knowledge.
question
Which of the following best characterizes the basis of CHC theory?
answer
factor analysis
question
According to Howard Gardner, the ability to form an accurate and realistic view of oneself would be referred to as what type of intelligence?
answer
intrapersonal
question
Spearman's g factor refers to:
answer
what different intelligence tests have in common.
question
The best measure of "intelligence" in very young children could probably be obtained by:
answer
assessment of sensorimotor skills.
question
In discussing the role of personality in the measured intelligence of infants, the term ________ is used.
answer
temperament
question
Which is a technique or method used to minimize cultural bias in tests?
answer
minimized verbal instruction, use of teaching items, and use of sample items (All of these.)
question
Public Law 95-561 defines giftedness with reference to:
answer
creativity, leadership ability, and intellectual ability (All of these.)
question
Since the 1921 Symposium on Intelligence, researchers and theorists have agreed that:
answer
None of these (They didn't agree on anything.)
question
Children's intelligence is assessed primarily for:
answer
educational placement and planning.
question
A child is administered an IQ test at age 5 and another at age 10. The reported score at age 10 is much higher than the reported score at age 5. This may be because:
answer
the child's IQ naturally unfolded with maturation, the child is receiving an excellent education at school and at home, and the examiner used a different IQ test that assesses different abilities (all of these.)
question
A child's IQ test score may be influenced by:
answer
the person's temperament, the IQ test administered and environmental stressors such as divorced parents (all of these.)
question
The Flynn effect is characterized by:
answer
an average rise in measured intelligence each year from the year a test was normed.
question
Which of the following is TRUE regarding the stability of intelligence?
answer
Intelligence is generally stable through adulthood.
question
Which is a reasonable conclusion regarding our current state of knowledge regarding intelligence?
answer
There exists widespread disagreement on the definition of intelligence.
question
Starting with moderately difficult test items and then giving easier or harder items, depending on the test-taker's performance, is termed:
answer
adaptive testing.
question
A ceiling level refers to the:
answer
point at which a subtest is discontinued.
question
On the Wechsler tests of intelligence, the Full Scale IQ has a mean of ________________ and a standard deviation of _______________.
answer
100; 15
question
The WISC-IV is appropriate for:
answer
children ages 6-16.
question
Group intelligence tests:
answer
are efficient and cost-effective, can be useful as screening instruments, and can be useful for research purposes (all of these.)
question
Compared with individually administered intelligence tests, group intelligence tests:
answer
are more psychometrically sound, have a higher degree of predictive validity, and have the advantage in terms of cost efficiency (all of these.)
question
Children deemed to be at risk are:
answer
preschool children who may not be ready for school, and preschool children with documented difficulties in one or more psychological, social, and academic areas requiring intervention (both of these.)
question
Psychoeducational test batteries are designed to measure:
answer
ability and achievement.
question
If John earns a full-scale IQ of 90 on the WISC-IV:
answer
John scored at the low end of the average range.
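To see where a Full Scale IQ of 90 sits, the score can be converted to a z-score and a percentile. This is a rough sketch that assumes scores follow a normal distribution with the mean of 100 and standard deviation of 15 given in the card on Wechsler scores; the function names are hypothetical.

```python
from statistics import NormalDist

# Assumption: deviation IQs are normally distributed with mean 100 and SD 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)

def iq_to_z(iq):
    """Number of standard deviations the score lies from the mean."""
    return (iq - 100) / 15

def iq_percentile(iq):
    """Approximate percentage of the norm group scoring at or below this IQ."""
    return 100 * IQ_SCALE.cdf(iq)

for score in (90, 100, 115):
    print(score, round(iq_to_z(score), 2), round(iq_percentile(score), 1))
```

Under these assumptions a score of 90 falls about two-thirds of a standard deviation below the mean, at roughly the 25th percentile.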
question
Normative information is available in the test's manual for WAIS-IV test-takers:
answer
as old as 90 years, 11 months.
question
The fifth edition of the Stanford-Binet Intelligence Scale was based on which theory of intelligence?
answer
the CHC model
question
How many people were in the standardization sample for the fifth edition of the Stanford-Binet Intelligence Scale?
answer
4,800
question
When administering an individual test of intelligence, the examiner is alert to:
answer
cues that the examinee is not alert, how examinee copes with frustration, and the cooperative level of the examinee (All of these.)
question
Which would NOT be considered extra-test behavior on the part of a test-taker?
answer
responding to the examiner's questions
question
Stanford-Binet Full Scale scores are converted into nominal categories designated by certain cutoff boundaries. For example, an SB5 measured IQ in the range of 110 to 119 falls into the __________ category.
answer
high average
question
Which of the following is NOT a variable assessed as part of an APGAR evaluation?
answer
cognitive ability
question
An instrument used to identify which children should receive a more comprehensive evaluation is, most likely, a ______ instrument.
answer
screening
question
The history of personality types dates at least as far back as the days of:
answer
Hippocrates.
question
A personality trait:
answer
is relatively enduring, varies within and between individuals, and is distinguishable (All of these.)
question
Personality tests are used for:
answer
evaluating influences on health, evaluating influences on academic performance, and planning psychotherapeutic interventions (All of these.)
question
Which BEST describes what is typically measured in personality assessment?
answer
traits and states.
question
On the Self-Directed Search, terms such as Artistic, Enterprising, and Investigative are examples of:
answer
personality types.
question
Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness. These variables are all measured by which personality assessment instrument?
answer
Big 5
question
The Big 5 was developed by:
answer
McCrae and Costa
question
Projective tests are _____ methods of personality assessment.
answer
indirect
question
Projective tests:
answer
are increasingly becoming norm-referenced
question
The Rorschach test:
answer
continues to be a widely used clinical tool, despite its questionable validity.
question
Behavioral assessment tends to focus on:
answer
the individual.
question
Which of the following is TRUE of behavioral assessment?
answer
The frequency, intensity, or duration of the behavior is generally specified.
question
Which is NOT a quantifiable definition of a target behavior?
answer
the number of seconds Johnny spends daydreaming during his social studies class.
question
A culturally sensitive psychological assessment includes sensitivity to which of the following?
answer
acculturation and language, personal identity, and values and worldview (all of these.)
question
The DSM-IV has ________ axes:
answer
5
question
A clinical psychologist would be LEAST likely to use individually administered tests:
answer
to evaluate and counsel clients regarding potential career choices.
question
The DSM-IV-TR is a diagnostic system that is used by psychologists:
answer
to diagnose patients, for insurance reimbursement purposes, and for research purposes (all of these.)