Final Exam Review Test Questions Answers – Flashcards
question
            Reliability, in a broad statistical sense, is synonymous with:
answer
        consistency.
question
            A source of error variance may take the form of:
answer
        item sampling, test takers' reactions to environment-related variables such as room temperature and lighting, and test taker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects (all of these.)
question
            Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?
answer
        a test-retest estimate
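As a minimal sketch of that idea, a test-retest estimate can be computed as the Pearson correlation between scores from the two administrations. The scores below are made up for illustration, and the function name `pearson_r` is a hypothetical helper, not part of any testing package.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

test_retest_r = pearson_r(first_administration, second_administration)
```

A value close to 1 would suggest that people kept roughly the same rank order across the two sessions.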
question
            A reliability coefficient is:
answer
        an index, a ratio of true variance to total variance, and unaffected by a systematic source of error (All of these.)
question
            What is the difference between alternate forms and parallel forms of a test?
answer
        Alternate forms do not necessarily yield test scores with equal means and variances.
question
            Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?
answer
        alternate-form
question
            An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than:
answer
        6 months.
question
            As the reliability of a test increases, the standard error of measurement:
answer
        decreases.
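The relationship can be illustrated with the usual classical-test-theory formula, SEM = SD × sqrt(1 − r), where SD is the standard deviation of test scores and r is the reliability coefficient. This is a small sketch; the SD of 15 and the two reliability values are assumptions chosen only for illustration.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), per classical test theory."""
    return sd * sqrt(1.0 - reliability)

sd = 15.0  # assumed standard deviation of the test scores

# Lower reliability gives a larger SEM; higher reliability gives a smaller one.
sem_lower_reliability = standard_error_of_measurement(sd, 0.75)   # about 7.5
sem_higher_reliability = standard_error_of_measurement(sd, 0.96)  # about 3.0
```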
question
            Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is relatively stable over time?
answer
        test-retest
question
            Which of the following is true of systematic error?
answer
        It has no effect on the reliability of a measure.
question
            Computer-scorable items have tended to eliminate error variance due to:
answer
        scorer differences.
question
            Which of the following might lead to a decrease in test-retest reliability?
answer
        the passage of time between the two administrations of the test, coaching designed to increase test scores between the two administrations of the test, and practice with similar test materials between the two administrations of the test (All of these.)
question
            If items from a test are measuring the same trait, estimates of reliability yielded from KR-20 will typically be ________ as compared to estimates from split-half methods.
answer
        higher
question
            Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?
answer
        Two test administrations with the same group are required, Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy, and Item sampling is a source of error variance (All of these.)
question
            If traditional measures of reliability are applied to criterion-referenced tests, the reliability estimates will likely be:
answer
        spuriously low.
question
            Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.
answer
        stability; internal consistency
question
            For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.
answer
        lower
question
            Which of the following factors may influence a split-half reliability estimate?
answer
        fatigue, anxiety, and item difficulty (all of these.)
question
            KR-20 is the statistic of choice for tests with which types of items?
answer
        multiple-choice and true-false (all of these.)
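For items scored right or wrong, KR-20 can be sketched as (k / (k − 1)) × (1 − Σpq / σ²), where k is the number of items, p is the proportion passing each item, q = 1 − p, and σ² is the variance of total scores. The sketch below assumes the population variance is used, and the response matrix is invented for illustration.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a list of rows, one row per test taker, items scored 0 or 1."""
    n_people = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)  # assumption: population variance
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_variance)

# Hypothetical data: five test takers, four dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_estimate = kr20(responses)  # about 0.8 for this data
```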
question
            The Spearman-Brown formula is used for:
answer
        correcting for one half of the test by estimating the reliability of the whole test, determining how many additional items are needed to increase reliability up to a certain level, and determining how many items can be eliminated without reducing reliability below a predetermined level (all of these.)
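Both uses can be sketched with the general Spearman-Brown formula, r_SB = n·r / (1 + (n − 1)·r), where r is the existing reliability and n is the factor by which test length changes. The function names and the example values are assumptions for illustration.

```python
def spearman_brown(r, n):
    """Estimated reliability after changing test length by a factor of n."""
    return n * r / (1 + (n - 1) * r)

def length_factor(r_current, r_target):
    """Factor by which the test must be lengthened (or shortened) to reach r_target."""
    return r_target * (1 - r_current) / (r_current * (1 - r_target))

# Correcting a half-test correlation of .60 up to the whole test (n = 2).
whole_test_r = spearman_brown(0.60, 2)   # about 0.75

# How much longer must a test with r = .75 be to reach r = .90?
needed_factor = length_factor(0.75, 0.90)  # about 3, i.e. three times as many items
```

A factor below 1 from `length_factor` would indicate how far a test could be shortened while staying at the chosen reliability level.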
question
            Typically, adding items to a test will have what effect on the test's reliability?
answer
        Reliability will increase.
question
            Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?
answer
        Assign easy items to one half of the test and difficult items to the other half.
question
            Coefficient alpha is appropriate to use with all of the following test formats EXCEPT:
answer
        essay exam with no partial credit awarded.
question
            Which of the following is TRUE about coefficient alpha?
answer
        It is a characteristic of a particular set of scores, not of the test itself.
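One way to see why alpha belongs to a set of scores is to compute it: the usual formula, α = (k / (k − 1)) × (1 − Σσ²ᵢ / σ²), uses only the item variances and total-score variance of the data at hand. This sketch assumes population variances, and the ratings are invented.

```python
from statistics import pvariance

def coefficient_alpha(scores):
    """Coefficient alpha for a list of rows, one row of item scores per test taker."""
    k = len(scores[0])
    # Variance of each item across test takers.
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total scores.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings from three people on three perfectly consistent items.
consistent_scores = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]
alpha_consistent = coefficient_alpha(consistent_scores)  # about 1.0
```

Running the same function on a different group's scores would generally give a different value, even for the same test.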
question
            A police officer mistakenly records the blood alcohol level of a suspected drunk driver after administering a breathalyzer test. This mistake is most related to which type of reliability?
answer
        interscorer
question
            A coefficient alpha over .9 may indicate that:
answer
        the items in the test are redundant.
question
            Which best conveys the meaning of an inter-scorer reliability estimate of .90?
answer
        Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.
question
            If a time limit is long enough to allow test takers to attempt all items, and if some items are so difficult that no test taker is able to obtain a perfect score, then the test is referred to as a ________ test.
answer
        power
question
            If a test is homogeneous:
answer
        it is functionally uniform throughout, it will likely yield a high internal-consistency reliability estimate compared with test-retest, and it would be reasonable to expect a high degree of internal consistency (all of these.)
question
            Which type(s) of reliability estimates would be most appropriate for a measure of heart rate?
answer
        test-retest
question
            Typically, speed tests:
answer
        contain items of a uniform difficulty level.
question
            Which type(s) of reliability estimates would be appropriate for a speed test?
answer
        test-retest, alternate-form, and split-half from two independent testing sessions (all of these.)
question
            Generalizability theory is most closely related to:
answer
        test reliability.
question
            In classical test theory, there exists only one true score. In Cronbach's generalizability theory, how many of these true scores exist?
answer
        many, depending on the number of different universes
question
            Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
answer
        is minimized with criterion-referenced tests.
question
            A test is considered valid when the test:
answer
        measures what it purports to measure.
question
            Face validity refers to:
answer
        the appearance of relevancy of the test items.
question
            Which is NOT a method of evaluating the validity of a test?
answer
        evaluating the percentage of passing and failing grades on the test
question
            Predictive and concurrent validity can be subsumed under:
answer
        criterion-related validity
question
            Face validity:
answer
        may influence the way the test-taker approaches the situation, relates more to what the test appears to measure than what the test may actually measure, and has received little attention and is given short shrift as compared to other indices of validity (all of these.)
question
            Which assessment technique has the MOST face validity?
answer
        administering a word processing test to a person applying to be a word processor
question
            Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the __________ validity of a test.
answer
        criterion-related
question
            An instructor announces that an examination will cover the topics of reliability and validity. A student boasts that he will read and study only the material on reliability. In fact, all the test questions are only on reliability. The best conclusion a student of assessment could draw from this is that:
answer
        the examination lacked content validity.
question
            Before constructing a comprehensive final examination, your instructor reviews the objectives of the course, the textbook, and all lecture notes. Your instructor is making an effort to maximize the __________ validity of the final examination.
answer
        content
question
            Lawshe devised a method for determining agreement among raters or judges who rate items on how essential they are. This method provides a way to quantify what type of validity?
answer
        content
question
            In calculating the content validity ratio, panelists are asked to determine:
answer
        if the skill or knowledge measured by the item is essential.
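Lawshe's content validity ratio for a single item is commonly written as CVR = (nₑ − N/2) / (N/2), where nₑ is the number of panelists rating the item "essential" and N is the total number of panelists. The sketch below uses a hypothetical panel of 10.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: ranges from -1 (no one says essential) to +1 (everyone does)."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 judges rating three different items.
cvr_nine_of_ten = content_validity_ratio(9, 10)   # about 0.8
cvr_half = content_validity_ratio(5, 10)          # 0.0: exactly half say essential
cvr_all = content_validity_ratio(10, 10)          # 1.0: unanimous
```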
question
            A standard against which a test or test score is evaluated is known as:
answer
        a criterion.
question
            The minimum value of a content validity ratio necessary to be statistically significant at the .05 level is dependent on:
answer
        the number of panelists judging the items.
question
            Which may best be viewed as varieties of criterion-related validity?
answer
        concurrent validity and predictive validity
question
            The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as:
answer
        concurrent validity.
question
            The form of criterion-related validity that reflects the degree to which a test score correlates with a criterion measure that was obtained some time subsequent to the test score is known as:
answer
        predictive validity.
question
            A key difference between concurrent and predictive validity has to do with:
answer
        the time frame during which data on the criterion measure is collected.
question
            Which is an example of a criterion?
answer
        achievement test scores, success in being able to repair a defective toaster, and student ratings of teaching effectiveness (all of these.)
question
            An index of utility can be distinguished from an index of reliability and an index of validity in that an index of utility can tell us something about:
answer
        the practical value of the information derived from what a test measures.
question
            Test validity:
answer
        sets a ceiling on test utility.
question
            One of the noneconomic benefits of a diagnostic test used to make decisions about involuntary hospitalization of psychiatric patients is a benefit to:
answer
        society-at-large.
question
            Costs associated with testing include all of the following EXCEPT:
answer
        return on investment.
question
            The end-point of a utility analysis is typically an educated decision about:
answer
        which of many possible courses of action is optimal.
question
            A utility analysis is conducted using:
answer
        expectancy tables, Naylor-Shine tables, and Taylor-Russell tables (All of these.)
question
            If targeted test-takers for a particular test consistently fail to follow the directions for taking the test then:
answer
        the test could still have great utility and the test could still be valid (both of these.)
question
            Validity is to ____________ as utility is to ____________.
answer
        accuracy; usefulness
question
            A potential noneconomic benefit of a well-run evaluation program is:
answer
        increase in quantity and quality of workers' on-the-job performance, decrease in time it takes to train new workers, and reduction in the number of workplace accidents (All of these.)
question
            The Angoff method of setting cutting scores relies heavily on:
answer
        the judgment of experts.
question
            The "Achilles heel" of the Angoff method is:
answer
        interrater reliability.
question
            A hospital uses a compensatory model of selection in hiring surgeons. In their hiring evaluations, ratings regarding past safety record are given more weight than ratings regarding the surgeon's "bedside manner." From this, one could reasonably conclude that the people who are in charge of hiring surgeons believe that:
answer
        bedside manner is less important compared to surgical safety.
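A compensatory model can be sketched as a weighted composite in which a high rating on one predictor can offset a lower rating on another. The weights, rating scale, and candidates below are all hypothetical and chosen only to mirror the surgeon example.

```python
# Assumed weights: safety record counts more than bedside manner.
WEIGHTS = {"safety_record": 0.7, "bedside_manner": 0.3}

def composite_score(ratings):
    """Weighted sum of a candidate's ratings (each rating assumed to be on a 0-10 scale)."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)

candidate_a = {"safety_record": 9, "bedside_manner": 4}
candidate_b = {"safety_record": 6, "bedside_manner": 10}

score_a = composite_score(candidate_a)  # about 7.5
score_b = composite_score(candidate_b)  # about 7.2
```

Under these assumed weights, candidate A's stronger safety record outweighs candidate B's better bedside manner.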
question
            The term item-mapping refers to an IRT-based method of:
answer
        setting cut scores that entails an ordering or histographic representation of test items.
question
            Which of the following is a direct economic cost that could result as a consequence of NOT evaluating personnel for employment positions within a large corporation?
answer
        the cost of lawsuits against the corporation
question
            The idea for a new test may come from:
answer
        social need, review of the available literature, and common sense appeal (all of these.)
question
            This term is used to refer to the preliminary research surrounding the creation of a prototype of a test:
answer
        pilot work, pilot study, and pilot research (all of these.)
question
            Often used for the purpose of licensing persons in professions, these tests are called:
answer
        criterion-referenced tests
question
            Likert scales measure attitudes using continuums. A continuum of items measuring ___________ could be used for a Likert scale.
answer
        like it or not, agree/disagree, and approve to do not approve (All of these.)
question
            Test items that contain alternatives with five points ranging from "strongly agree" to "strongly disagree" are characterized as using this approach to scaling:
answer
        Likert scaling.
question
            Guttman scales:
answer
        typically are constructed so that agreement with one statement may predict agreement with another statement.
question
            Which is an example of the selected-response item format?
answer
        a multiple-choice item
question
            Having a large item pool available during test revision is:
answer
        an advantage because poor items can be deleted in favor of the good items.
question
            A well-written true-false item:
answer
        has a correct response that is veritably true or false, and not subject to debate.
question
            Computer-adaptive testing has been found to:
answer
        reduce by as much as half the number of test items administered.
question
            Item branching refers to:
answer
        administering certain test items on a test depending on the test-takers' responses to previous test items.
question
            Which statement is TRUE of the test tryout phase of test construction?
answer
        Test conditions should be as similar to the actual administration as possible.
question
            The item-validity index is key in determining:
answer
        criterion-related validity.
question
            An item-difficulty index of 1 occurs when:
answer
        all examinees answer the item correctly.
question
            The higher the item-difficulty index, the ________ the item.
answer
        easier
question
            In item analysis, the term item endorsement refers to the percent of test-takers who:
answer
        indicate that they agree with a particular item.
question
            An item-reliability index provides a measure of a test's:
answer
        internal consistency.
question
            An item-difficulty index can range from ________ to ________.
answer
        0; 1
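The item-difficulty index is simply the proportion of examinees who answer the item correctly, which is why it runs from 0 to 1 and why higher values mean easier items. A minimal sketch with made-up responses:

```python
def item_difficulty(item_scores):
    """Proportion of examinees answering the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)

# Hypothetical responses from four examinees to three items.
easy_item = item_difficulty([1, 1, 1, 1])   # 1.0: everyone answered correctly
hard_item = item_difficulty([1, 0, 0, 0])   # 0.25
missed_item = item_difficulty([0, 0, 0, 0]) # 0.0: no one answered correctly
```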
question
            In Sternberg's study of the characteristics of academic intelligence, laypeople stressed the "interpersonal and social aspects," whereas experts stressed:
answer
        motivation.
question
            The test that launched the testing movement in the United States was the ______ test.
answer
        Stanford-Binet
question
            Neisser argued that intelligence:
answer
        cannot be explicitly defined.
question
            What conclusion concerning intelligence could reasonably be drawn based on the 1921 symposium published in the Journal of Educational Psychology?
answer
        Experts had a multitude of definitions of intelligence.
question
            Binet believed that the primary purpose of an intelligence test was to assist the test user in:
answer
        classification.
question
            Galton's conception of intelligence focused on:
answer
        sensory abilities.
question
            The Wechsler tests of intelligence:
answer
        measure more than two factors.
question
            According to Wechsler, as cited in the text, intelligence should be conceived as a __________ capacity that is best measured by measuring ______________ abilities.
answer
        global; qualitatively differentiable
question
            The Stanford-Binet-5 is based on which theory?
answer
        Cattell-Horn-Carroll theory of intellectual abilities.
question
            Binet, Wechsler, and Piaget would most likely agree with which of the following statements?
answer
        "Heredity and environment interact to influence the development of intelligence, but a person may not exceed his or her genetic potential."
question
            According to Wechsler's approach to cognitive assessment of adults and children, which of the following is TRUE?
answer
        Similar tasks may be used on different tests, but the actual content of the items will differ at different age levels.
question
            The WPPSI-III is used to measure the intelligence of children from ages ________ through ________.
answer
        2.5; 7.25
question
            The concepts of social intelligence, concrete intelligence, and abstract intelligence are collectively best associated with which theorist?
answer
        Thorndike
question
            Which of the following is NOT true of Piaget's stages?
answer
        The stages were adapted from Binet's work with children.
question
            According to Piaget, a form of cognitive structure or organization is referred to as:
answer
        a schema.
question
            Which statement is NOT true of Cattell's two-factor theory of intelligence?
answer
        Crystallized intelligence is relatively culture-free.
question
            Who first hypothesized that the proportion of the variance that a number of tests have in common accounts for a general factor of intelligence?
answer
        Spearman
question
            Logical-mathematical, bodily-kinesthetic, linguistic, musical, spatial, interpersonal, and intrapersonal intelligence are all associated with which theory of intelligence?
answer
        Gardner
question
            Crystallized intelligence includes:
answer
        application of general knowledge.
question
            Which of the following best characterizes the basis of CHC theory?
answer
        factor analysis
question
            According to Howard Gardner, the ability to form an accurate and realistic view of oneself would be referred to as what type of intelligence?
answer
        intrapersonal
question
            Spearman's g factor refers to:
answer
        what different intelligence tests have in common.
question
            The best measure of "intelligence" in very young children could probably be obtained by:
answer
        assessment of sensorimotor skills.
question
            In discussing the role of personality in the measured intelligence of infants, the term ________ is used.
answer
        temperament
question
            Which is a technique or method used to minimize cultural bias in tests?
answer
        minimized verbal instruction, use of teaching items, and use of sample items (All of these.)
question
            Public Law 95-561 defines giftedness with reference to:
answer
        creativity, leadership ability, and intellectual ability (All of these.)
question
            Since the 1921 Symposium on Intelligence, researchers and theorists have agreed that:
answer
        None of these (They didn't agree on anything.)
question
            Children's intelligence is assessed primarily for:
answer
        educational placement and planning.
question
            A child is administered an IQ test at age 5 and another at age 10. The reported score at age 10 is much higher than the reported score at age 5. This may be because:
answer
        the child's IQ naturally unfolded with maturation, the child is receiving an excellent education at school and at home, and the examiner used a different IQ test that assesses different abilities (all of these.)
question
            A child's IQ test score may be influenced by:
answer
        the person's temperament, the IQ test administered and environmental stressors such as divorced parents (all of these.)
question
            The Flynn effect is characterized by:
answer
        an average rise in measured intelligence each year from the year a test was normed.
question
            Which of the following is TRUE regarding the stability of intelligence?
answer
        Intelligence is generally stable through adulthood.
question
            Which is a reasonable conclusion regarding our current state of knowledge regarding intelligence?
answer
        There exists widespread disagreement on the definition of intelligence.
question
            Starting with moderately difficult test items and then giving easier or harder items, depending on the test-taker's performance, is termed:
answer
        adaptive testing.
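The basic logic can be sketched as a simple up-and-down procedure over items sorted by difficulty. This is a deliberately simplified illustration, not how operational computer-adaptive tests select items (those typically rely on item response theory); the function name and the simulated test taker are assumptions.

```python
def run_adaptive_test(item_difficulties, answers_correctly, max_items=5):
    """Administer items from a list sorted by difficulty, adapting to each response.

    answers_correctly is a callable taking a difficulty and returning True/False.
    Returns the difficulties of the items administered, in order.
    """
    low, high = 0, len(item_difficulties) - 1
    administered = []
    while low <= high and len(administered) < max_items:
        mid = (low + high) // 2            # begin with a moderately difficult item
        difficulty = item_difficulties[mid]
        administered.append(difficulty)
        if answers_correctly(difficulty):
            low = mid + 1                  # correct: try a harder item next
        else:
            high = mid - 1                 # incorrect: try an easier item next
    return administered

# Simulated test taker who passes any item of difficulty 6 or below.
items = [1, 2, 3, 4, 5, 6, 7, 8, 9]
sequence = run_adaptive_test(items, lambda difficulty: difficulty <= 6)
# sequence is [5, 7, 6]: three items instead of all nine
```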
question
            A ceiling level refers to the:
answer
        point at which a subtest is discontinued.
question
            On the Wechsler tests of intelligence, the Full Scale IQ has a mean of ________________ and a standard deviation of _______________.
answer
        100; 15
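With a mean of 100 and a standard deviation of 15, a standard score converts to the IQ scale as IQ = 100 + 15z. The sketch below also estimates a percentile rank under the assumption that scores are normally distributed; the function names are hypothetical.

```python
from statistics import NormalDist

IQ_MEAN = 100
IQ_SD = 15

def z_to_iq(z):
    """Convert a z score to the deviation-IQ scale."""
    return IQ_MEAN + IQ_SD * z

def iq_percentile(iq):
    """Percentile rank of an IQ score, assuming a normal distribution."""
    return 100 * NormalDist(IQ_MEAN, IQ_SD).cdf(iq)

one_sd_above = z_to_iq(1.0)          # 115.0
percentile_at_mean = iq_percentile(100)  # 50.0
percentile_at_90 = iq_percentile(90)     # roughly the 25th percentile
```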
question
            The WISC-IV is appropriate for:
answer
        children ages 6-16.
question
            Group intelligence tests:
answer
        are efficient and cost-effective, can be useful as screening instruments, and can be useful for research purposes (all of these.)
question
            Compared with individually administered intelligence tests, group intelligence tests:
answer
        are more psychometrically sound, have a higher degree of predictive validity, and have the advantage in terms of cost efficiency (all of these.)
question
            Children deemed to be at risk are:
answer
        preschool children who may not be ready for school, and preschool children with documented difficulties in one or more psychological, social, and academic areas requiring intervention (both of these.)
question
            Psychoeducational test batteries are designed to measure:
answer
        ability and achievement.
question
            If John earns a full-scale IQ of 90 on the WISC-IV:
answer
        John scored at the low end of the average range.
question
            Normative information is available in the test's manual for WAIS-IV test-takers:
answer
        as old as 90 years, 11 months.
question
            The fifth edition of the Stanford-Binet Intelligence Scale was based on which theory of intelligence?
answer
        the CHC model
question
            How many people were in the standardization sample for the fifth edition of the Stanford-Binet Intelligence Scale?
answer
        4,800
question
            When administering an individual test of intelligence, the examiner is alert to:
answer
        cues that the examinee is not alert, how examinee copes with frustration, and the cooperative level of the examinee (All of these.)
question
            Which would NOT be considered extra-test behavior on the part of a test-taker?
answer
        responding to the examiner's questions
question
            Stanford-Binet Full Scale scores are converted into nominal categories designated by certain cutoff boundaries. For example, an SB5 measured IQ in the range of 110 to 119 falls into the __________ category.
answer
        high average
question
            Which of the following is NOT a variable assessed as part of an APGAR evaluation?
answer
        cognitive ability
question
            An instrument used to identify which children should receive a more comprehensive evaluation is, most likely, a ______ instrument.
answer
        screening
question
            The history of personality types dates at least as far back as the days of:
answer
        Hippocrates.
question
            A personality trait:
answer
        is relatively enduring, varies within and between individuals, and is distinguishable (All of these.)
question
            Personality tests are used for:
answer
        evaluating influences on health, evaluating influences on academic performance, and planning psychotherapeutic interventions (All of these.)
question
            Which BEST describes what is typically measured in personality assessment?
answer
        traits and states.
question
            On the Self-Directed Search, terms such as Artistic, Enterprising, and Investigative are examples of:
answer
        personality types.
question
            Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness. These variables are all measured by which personality assessment instrument?
answer
        Big 5
question
            The Big 5 was developed by:
answer
        McCrae and Costa
question
            Projective tests are _____ methods of personality assessment.
answer
        indirect
question
            Projective tests:
answer
        are increasingly becoming norm-referenced
question
            The Rorschach test:
answer
        continues to be a widely used clinical tool, despite its questionable validity.
question
            Behavioral assessment tends to focus on:
answer
        the individual.
question
            Which of the following is TRUE of behavioral assessment?
answer
        The frequency, intensity, or duration of the behavior is generally specified.
question
            Which is NOT a quantifiable definition of a target behavior?
answer
        the number of seconds Johnny spends daydreaming during his social studies class.
question
            A culturally sensitive psychological assessment includes sensitivity to which of the following?
answer
        acculturation and language, personal identity, and values and worldview (all of these.)
question
            The DSM-IV has _____ axes:
answer
        5
question
            A clinical psychologist would be LEAST likely to use individually administered tests:
answer
        to evaluate and counsel clients regarding potential career choices.
question
            The DSM-IV-TR is a diagnostic system that is used by psychologists:
answer
        to diagnose patients, for insurance reimbursement purposes, and for research purposes (all of these.)
