Title: Using Diagnostic Mathematics Measures
1. Using Diagnostic Mathematics Measures
2. Diagnostic Mathematics Assessments: Main Ideas
- Now typically assess knowledge and skill on subsets of the 10 standards specified by the National Council of Teachers of Mathematics (NCTM)
- Designed to identify specific strengths and weaknesses in skill development
- Attempt to assess a wide variety of skills
- Fewer diagnostic math assessments exist than reading assessments, since math is more clear-cut
3. Purpose for Assessing Math
- Provide detailed information so that teachers and interventionists can determine a student's mastery of skills and plan individualized math instruction
- Provide teachers with specific information on the kinds of items that students pass or fail
- Give insight into how curriculum and instruction are working in the class
- Allow for modification of the curriculum
4. Purpose for Assessing Math
- Teachers need to know if students have mastered facts and concepts
- Occasionally used to make exceptionality and eligibility decisions
- Often used to establish special learning needs and eligibility for programs for children with learning disabilities in math
5. National Council of Teachers of Mathematics
- Suggests that a curriculum follow these standards in each and every grade, just at different levels
  - Content Standards
  - Process Standards
6. National Council of Teachers of Mathematics
- Content Standards (followed at all grades)
  - Numbers and Operations
  - Algebra
  - Geometry
  - Measurement
  - Data Analysis and Probability
7. National Council of Teachers of Mathematics
- So, you ask, what would these look like in first grade?
  - Numbers and Operations: 3 1
  - Algebra: 3 ? 4
  - Geometry: What shape is ? __________
  - Measurement: measure the temperature, time, etc.
  - Data Analysis and Probability: graph how many people have teddy bears and how many have teddy dogs or teddy rabbits
8. National Council of Teachers of Mathematics
- Process Standards
  - Problem Solving
  - Reasoning and Proof
  - Communication
  - Connections
  - Representation
9. National Council of Teachers of Mathematics
- What does it look like in first grade for Process Standards?
  - Reasoning and Proof: complete the pattern ?????
10. Group Mathematics Assessment and Diagnostic Evaluation (G-MADE)
- Group-administered, norm-referenced, standards-based test for assessing the math skills of students in K-12
- Purpose: to identify specific strengths and weaknesses in math skill development and to lead to teaching strategies
- Test materials include a CD that provides a cross-reference between specific math skills and teaching resources
- Diagnosis of skills is broad
11. G-MADE Subtests
- Concepts and Communication
  - Measures student knowledge of the language, vocabulary, and representations of math
- Operation and Computation
  - Measures skills in using the basic operations of addition, subtraction, multiplication, and division
- Process and Application
  - Measures skill in taking in the language and concepts of math and applying the appropriate operations and computations to solve a word problem
12. G-MADE Scores
- Raw scores can be converted to standard scores with a mean of 100 and a standard deviation of 15
- Growth Scale Values are provided to track growth of math skills
- Can track growth over one year or from year to year
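A standard score with a mean of 100 and a standard deviation of 15 is a rescaled z-score. The sketch below illustrates only that rescaling. It assumes a hypothetical norm-group mean and standard deviation for the raw scores (the numbers are made up for illustration); actual G-MADE conversions come from the publisher's norm tables, not from this formula.

```python
def standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Rescale a raw score's z-score onto a reporting scale (default: mean 100, SD 15)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd      # distance from the norm-group mean, in SD units
    return scale_mean + scale_sd * z     # same distance, expressed on the reporting scale


# Hypothetical norm group: raw-score mean 40, raw-score SD 8 (illustrative only).
print(standard_score(48, norm_mean=40, norm_sd=8))  # one SD above the mean -> 115.0
```

A raw score exactly at the norm-group mean maps to 100, and each standard deviation above or below it adds or subtracts 15 points.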
13. Test Materials
- Teacher's Manual
- Student Booklets
- Answer Sheets
- Hand-Scoring Template
- Technical Manual
- Age-Based Norms and Grade-Based Out-of-Level Norms Supplement
- Scoring and Reporting Software
14. Reliability
- All reliabilities exceed .74, with more than 90% exceeding .80
- The only low reliabilities are for Concepts and Communication in 7th grade and for Process and Application at all grades beyond 4th
- Internal consistency and stability are sufficient for using the test to make decisions about individuals
15. Validity
- Content is based on NCTM standards
- Created based on a year-long study of standards, curriculum benchmarks, the scope and sequence commonly used in math textbooks, and a review of research on best practices for teaching math concepts and skills
- Many studies support the criterion-related validity of the test
- In comparison with KeyMath, all correlations were in excess of .80, making the 2 tests highly comparable
16. Other Information
- The test is not timed, since it is meant to test power, not speed
- Older students can complete the test in a single one-hour session; most students finish in about 45 minutes
- With younger students, multiple short testing sessions are recommended
17. KeyMath-3 Diagnostic Assessment (KeyMath-3 DA)
- An untimed, individually administered, norm-referenced test designed to provide a comprehensive assessment of essential math concepts and skills in individuals ages 4 years, 6 months through 21 years
- Time: 30-40 minutes in lower elementary grades and 70-90 minutes for older students
- Provides a means of monitoring an individual's progress over time with 2 parallel forms that can be administered in alternating sequence every 3 months
- Also provides Growth Scale Values (GSVs), a type of developmental scale score
18. Uses for KeyMath-3 DA
- Assess math proficiency by providing comprehensive coverage of concepts and skills taught in regular math instruction
- Assess student progress in math
- Support instructional planning
- Support educational placement decisions
19. KeyMath-3 DA
- 2 parallel forms (A and B) of the test
- Each form has 372 items divided into the following subtests
  - Numeration
  - Algebra
  - Geometry
  - Measurement
  - Data Analysis and Probability
  - Mental Computation and Estimation
  - Addition and Subtraction
  - Multiplication and Division
  - Foundations of Problem Solving
  - Applied Problem Solving
20. KeyMath-3 DA Resources
- Manual
- Two free-standing easels for either Form A or B
- 25 record forms with detachable Written Computation Examinee Booklets
- Two additional products are available
  - ASSIST Scoring and Reporting Software Program
  - KeyMath-3 DA Essential Resources Instructional Program
21. KeyMath-3 DA Scores
- Can be scored by hand or with software
- Relative standing: scale scores, standard scores, percentile ranks
- Developmental scores: grade and age equivalents, growth scale values
- Composite scores: basic concepts, operations, application
- Software can produce progress reports, narrative summaries, and parent reports, and can export scores to Excel
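The relationship between a standard score and a percentile rank can be sketched with the normal curve. This is only an illustration: it assumes standard scores on a mean-100, SD-15 scale and normally distributed scores, and the function name is hypothetical. In practice, percentile ranks are read from the publisher's norm tables or produced by the scoring software.

```python
import math


def percentile_rank(standard_score, mean=100.0, sd=15.0):
    """Approximate percentile rank of a standard score under a normal curve.

    Assumption: scores are normally distributed with the given mean and SD.
    """
    z = (standard_score - mean) / sd
    # Normal cumulative distribution function, written with the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for score in (85, 100, 115):
    print(score, round(percentile_rank(score), 1))
```

Under these assumptions, a score at the mean falls at the 50th percentile, and a score one standard deviation above the mean falls at roughly the 84th percentile.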
22. Reliability
- Internal consistency: low in K and 1st grade, but exceeds .80 at other ages
- Alternate form: exceeds .80, with the exception of the alternate forms for Geometry and Data Analysis and Probability
- Adjusted test-retest: based on 103 students in grades K-12; generally exceeds .80, with the exception of the Foundations of Problem Solving (.70) and Geometry (.78) subtests
- Adequate for screening and diagnostic purposes
23. Validity
- Correlates very highly with scores on the KeyMath-Revised normative update and with scores on the Kaufman Test of Educational Achievement, Measures of Academic Progress (MAP), and G-MADE
- Evidence for content validity is good, based on alignment with state and NCTM standards
24. Weaknesses of Diagnostic Math Assessments
- Recurring issue of curriculum match
- Selecting the appropriate test for the type of decision to be made
- Do not test a sufficiently detailed sample of math concepts and facts, so users must generalize
- Due to these weaknesses, the tests are not very useful for assessing readiness or strengths and weaknesses in order to plan instructional programs
- Preferred practice is for teachers to develop curriculum-based achievement tests that exactly parallel the curriculum being taught
25. Goal of Oral and Written Language Assessments
- The assessment of language competence should include evaluation of a student's ability to process language, both in comprehension and in expression, in a spoken or written format.
26. Major Communication Processes
- Oral Comprehension: listening and comprehending speech
- Written Comprehension: reading
- Oral Expression: speaking
- Written Expression: writing
27. Related Terminology
Language Component | Reception/Comprehension | Expression/Production
Phonology | Hearing and discriminating speech sounds | Articulating speech sounds
Morphology and Syntax | Understanding the grammatical structure of language | Using the grammatical structure of language
Semantics | Understanding vocabulary, meaning, and concepts | Using vocabulary, meaning, and concepts
Pragmatics and Supralinguistics | Understanding a speaker's or writer's intentions | Using awareness of social aspects of language
28. Considerations in Assessing Oral Language
- Cultural Diversity
  - Birthplace, pronunciations, comparing with the same language community
- Developmental Considerations
  - Sounds, linguistic structures, and some semantic elements are developmental
29. Considerations in Assessing Written Language
- Content Production
  - Formulating, elaborating, sequencing, clarifying, and choosing precise words to convey meaning
- Form
  - Penmanship, spelling, and style rules
30. Observing Language Behavior
- The following are the three main procedures for gathering a sample of a student's language behavior:
  - Spontaneous Language
  - Imitation
  - Elicited Language
31. Observing Language Behavior
- Advantages of Spontaneous Language
  - Spontaneity is the best and most natural indicator of everyday language performance.
  - Informality makes assessment easy; there is no formal testing atmosphere.
32. Observing Language Behavior
- Disadvantages of Spontaneous Language
  - The data collected this way are non-standard in nature.
  - Collecting the data can take a very long time.
33. Observing Language Behavior
- Advantages of Imitation
  - Overcomes many of the problems associated with the spontaneous approach.
  - Assesses many different language elements to give a representative view of the child's language system.
  - The structure of the test allows the examiner to know all elements of language being assessed.
  - The test can be administered much more quickly than spontaneous tests.
34. Observing Language Behavior
- Disadvantages of Imitation
  - Children's auditory memory may affect the results: a child can score well by imitation without demonstrating productive knowledge of the language structures being tested.
  - A child can repeat exactly what is said if the utterance or sentence is too short, requiring no memory processing.
  - Children become very bored and can't sit still. There are no stimuli like pictures or toys present, just the repetition of 50 to 100 sentences after the examiner.
35. Observing Language Behavior
- Advantages of Elicited Language
  - Pictures can be structured to test desired language elements while retaining some of the spontaneity of spontaneous language samples.
  - Allows children to create language on their own.
  - There is no time limit, so results do not depend on the child's word retention ability.
36. Observing Language Behavior
- Disadvantages of Elicited Language
  - It is difficult to find pictures that guarantee an exact word or sentence response.
  - The child may not produce or attempt to produce the desired language structure.
37. Tests
- Test of Written Language, 4th edition (TOWL-4)
- Test of Language Development: Primary, 4th edition (TOLD-P4)
- Test of Language Development: Intermediate, 4th edition (TOLD-I4)
- Oral and Written Language Scales (OWLS)
- Test of Auditory Reasoning and Processing Skills (TARPS)
38. Six Subtests
- Sentence combining. The child is required to form one compound or complex sentence from two or more simple sentences spoken by the examiner.
- Picture vocabulary. The child points to the picture that best represents a series of two-word items.
- Word ordering. The child forms a complete, correct sentence from a randomly ordered string of words, ranging from three to seven in length.
- Relational vocabulary. The child tells how three words, spoken by the examiner, are alike.
- Morphological comprehension. The child distinguishes between grammatically correct and incorrect sentences.
- Multiple meanings. The examiner says a word and the student responds by saying as many different meanings for that word as he/she can think of.
39. Reliability and Validity
- The TOLD-I4 appears to meet and often exceed the standards for reliability for making screening and diagnostic decisions.
- The coefficients for reliability exceed 0.90.
- Unlike the TOLD-P4, there is good evidence for the construct validity of this test: it is based on oral language ability, which is known to be related to literacy, and it correlates highly with reading and writing abilities.
40. Oral and Written Language Scales (OWLS)
- Individually administered assessment of receptive and expressive language
- Test includes three scales: Listening Comprehension, Oral Expression, and Written Expression
- Recommended uses (ages 3-21)
  - Determine broad levels of language skills and specific performance in listening, speaking, and writing
  - Create intervention plans and monitor student progress
  - Scores can be converted to obtain age equivalents, percentiles, etc.
41. Listening Comprehension (takes approx. 5-15 min)
- Measures understanding of spoken language
- 111 items: the examiner reads aloud a verbal stimulus, and the student identifies which of 4 pictures is the best response to the stimulus
42. Oral Expression (takes approx. 5-15 min)
- Measures understanding and use of spoken language
- 96 items: the examiner reads aloud a verbal stimulus and shows a picture
- The student responds orally by answering a question, completing a sentence, or generating one or more sentences
43. Written Expression (timed response test)
- Measures the ability of students 5-21 years old to use spelling, punctuation, syntax (sentence structure, phrases, etc.) and to communicate with appropriate content, coherence, organization, etc.
- The student responds to direct writing prompts from the examiner
44. Reliability and Validity
- There are wide ranges in reliability coefficients for this test.
- Results of this test are sufficient to use as a screening device but are not sufficient for making important decisions about individual students.
- The authors report that, in validity studies comparing these subtests with established criterion measures, performance was similar and within the expected range of validity.
45. Intelligence
- Theory of multiple intelligences
- Heredity
- Learn through experiences
- Today most theorists recognize the importance of both heredity and experience.
46. Today
- Intelligence test results are used to determine eligibility for special services.
- School psychologists are trained professionals who administer intelligence tests.
- IQ tests are helpful in providing general information as to how to pace instruction.
47. What is intelligence?
- An inferred ability used to explain differences in present behavior and to predict differences in future behavior.
- It is a general ability that enables people to do many different things.
48. Acculturation
- A child's background experiences and the learning opportunities they have already had
  - Culture
  - Experiences available in one's environment
  - Age
- These factors may influence the psychological demands presented by the test.
- Failure is NOT due to an inability to comprehend or solve a problem, but to a deficiency in background experience.
49. Behaviors Sampled by Intelligence Tests
- Discrimination: identify the item that is different from the others
- Generalization: given a stimulus, identify from a group the one that goes with the stimulus
- Motor Behavior: requires a motor response, such as duplicating a geometric design using blocks, tracing a path through a maze, or reconstructing designs from memory
- General Knowledge: factual questions
- Vocabulary: naming pictures or reading a definition and selecting a picture (depending on age)
50. Behaviors Sampled by Intelligence Tests
- Induction: state a rule or principle from a series of objects
- Comprehension: 3 types, those related to directions, to printed material, or to social customs and mores
- Sequencing: identify the response that continues a series
- Detail Recognition: identify the missing parts of a picture
- Analogical Reasoning: how things are related to each other (A : B :: C : _____?)
51. Behaviors Sampled by Intelligence Tests
- Pattern Completion: completing a pattern or identifying a missing part of a pattern
- Abstract Reasoning: identify the absurdity in a picture or verbal statement
- Memory: many different tasks are used to measure memory, e.g., verbatim repetition of a sentence or series of numbers
52. Three Types of Intelligence Tests
- Individual Tests: given one-on-one by a certified evaluator; most commonly used for educational placement decisions.
53. Three Types of Intelligence Tests
- Group Tests: may be used as a screening tool for individual students, or to gain information about groups of students.
54. Three Types of Intelligence Tests
- Nonverbal Intelligence Tests
  - Picture-vocabulary test
  - Administered to non-readers, ELLs, and hearing-impaired students.
  - This test measures only one aspect of intelligence (receptive vocabulary) and should not be used to determine eligibility for special services.
55. Wechsler Intelligence Scale for Children-IV (WISC-IV)
- Developed by David Wechsler in 1949, it has since had several revisions.
- Wechsler states, "intelligence is the overall capacity of an individual to understand and cope with the world around him."
- The test is a measure of the cognitive ability and problem-solving processes of a person ages 6 years to 16 years, 11 months.
56. WISC-IV Normal Curve
57. Wechsler Intelligence Scale for Children-IV (WISC-IV)
- Subtests: Core and Supplemental
- Verbal Comprehension Index (VCI)
  - Similarities
  - Vocabulary
  - Comprehension
  - Information
  - Word Reasoning
58. Wechsler Intelligence Scale for Children-IV (WISC-IV)
- Subtests: Core and Supplemental
- Perceptual Reasoning Index (PRI)
  - Block Design
  - Picture Concepts
  - Matrix Reasoning
  - Picture Completion
59. Wechsler Intelligence Scale for Children-IV (WISC-IV)
- Subtests: Core and Supplemental
- Working Memory Index (WMI)
  - Digit Span
  - Letter-Number Sequencing
  - Arithmetic
60. Wechsler Intelligence Scale for Children-IV (WISC-IV)
- Subtests: Core and Supplemental
- Processing Speed Index (PSI)
  - Coding
  - Symbol Search
  - Cancellation
61. Reliability
- The full-scale IQ (FSIQ) is reliable enough to make important educational decisions. The subtests alone do not provide enough information to make educational decisions.
62. Validity
- When using the WISC-IV to determine educational needs for a student, examiners should only use the FSIQ.
63. Block Design
- Timed test
- Sample
- 2 minutes
- 9 blocks
64. Picture Concepts
Pick one picture from each row with common characteristics
65. Matrix Reasoning
66. Picture Completion
Look at this picture. What part is missing?
67. Woodcock-Johnson III Normative Update (WJ-III NU)
- Measures general intellectual ability, specific cognitive abilities, scholastic aptitudes, oral language, and achievement.
- Individually administered and norm-referenced
- For ages 2-90
- Computer scored
- Each Test Record contains a seven-category Test Session Observation Checklist to rate a student's conversational proficiency, cooperation, activity, attention and concentration, self-confidence, care in responding, and response to difficult tasks.
68. WJ-III NU Tests of Cognitive Abilities
- 20 subtests measuring broad and narrow abilities
- Comprehension-knowledge, long-term retrieval, visual-spatial thinking, auditory processing, fluid reasoning, processing speed, short-term memory.
- Subtests can be combined to create additional clusters for verbal ability, thinking ability, cognitive efficiency, phonemic awareness, and working memory.
- Additional supplemental subtests create more clusters: broad attention, cognitive fluency, and executive processes
69. WJ-III NU Tests of Achievement
- 22 tests can be combined to form several clusters.
- Subtests and clusters from the standard battery can be combined to form scores for broad areas in reading, math, and writing.
- Oral expression, listening comprehension, basic reading skills, reading comprehension, phoneme/grapheme knowledge, math calculation skills, math reasoning, written expression
70. Reliability of WJ-III NU
- Individual tests are combined to provide clusters for educational decision making
- Cluster reliabilities for some age groups are less than .90, but all median reliabilities across age groups for the standard and broad cognitive and achievement clusters exceed .90
71. Validity of WJ-III NU
- Careful item selection is consistent with claims for the content validity of both tests
- Studies using a broad range of individuals provide evidence for validity
- For the Cognitive Ability Tests, the correlations between the WJ-III General Intellectual Ability score and the WISC-III Full-Scale IQ range from .69 to .73
- For the Achievement Tests, the pattern and magnitude of correlations with the Wechsler Individual tests suggest that the WJ-III measures skills similar to those measured by other achievement tests.
72. Peabody Picture Vocabulary Test-Fourth Edition (PPVT-4)
- An untimed test primarily given to younger children and ELLs
- Assesses the receptive (hearing) vocabulary of examinees
- It consists of stimulus sets of 12 items, and examinees are tested at their ability or age level
- As part of a broader assessment, it can be useful in evaluating language competence, selecting the level and content of instruction, and measuring learning
- The assessment of vocabulary is also useful when evaluating the effects of injury or disease
- It is individually administered using an easel
- Available in Spanish
73. Scores of PPVT-4
- Examinees earn a raw score based on the number of pictures correctly identified between basal and ceiling items
- Basal: the lowest set administered that contains one or no errors
- Ceiling: the highest set administered that contains eight or more errors
- Testing is discontinued once a ceiling is established
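The basal and ceiling rules above can be sketched in code. This is an illustrative sketch, not the publisher's scoring procedure: the function names are hypothetical, sets are assumed to be numbered from 1 with 12 consecutive items each, and the raw score is assumed to be the number of the last item in the ceiling set minus the errors made from the basal set through the ceiling set (items below the basal set are treated as correct).

```python
SET_SIZE = 12  # items per set, as stated on the slide


def find_basal_and_ceiling(errors_by_set):
    """Return (basal_set, ceiling_set) from a {set_number: error_count} mapping.

    Only administered sets appear in the mapping.
    """
    basal_sets = [s for s, e in errors_by_set.items() if e <= 1]    # one or no errors
    ceiling_sets = [s for s, e in errors_by_set.items() if e >= 8]  # eight or more errors
    basal = min(basal_sets) if basal_sets else None        # lowest qualifying set
    ceiling = max(ceiling_sets) if ceiling_sets else None  # highest qualifying set
    return basal, ceiling


def raw_score(errors_by_set):
    """Raw score under the assumptions stated above (ceiling item minus errors)."""
    basal, ceiling = find_basal_and_ceiling(errors_by_set)
    if basal is None or ceiling is None or basal > ceiling:
        raise ValueError("a basal set and a later ceiling set are both required")
    ceiling_item = ceiling * SET_SIZE  # number of the last item in the ceiling set
    errors = sum(e for s, e in errors_by_set.items() if basal <= s <= ceiling)
    return ceiling_item - errors


# Hypothetical record: sets 4-7 administered, with these error counts.
record = {4: 1, 5: 3, 6: 5, 7: 9}
print(find_basal_and_ceiling(record))  # (4, 7)
print(raw_score(record))               # 84 - 18 = 66
```

In the hypothetical record, set 4 is the basal (one error) and set 7 is the ceiling (nine errors), so testing would stop after set 7.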
74. Reliability of PPVT-4
- Multiple kinds of reliability are reported
- The scores of a PPVT-4 test are very precise and consistent
- Data are also included on the testing and performance of students with disabilities
75. Validity of PPVT-4
- Five studies were conducted and indicate that there is adequate validity
- Slightly lower correlations were found on assessments that measured broader areas of language than primarily vocabulary
- Data are also provided on how students with speech and language impairments, hearing impairments, specific learning disabilities, mental retardation, giftedness, emotional/behavioral disturbances, and ADHD perform in relation to the general population
- Results indicate the value of the PPVT-4 in assessing these special populations
76. Conclusion
- Assessing children's IQ is controversial
- Intelligence tests assess samples of behavior
- Different intelligence tests sample different behaviors
- Educators must always ask, "IQ on what test?"
- Test authors have their own definitions of intelligence and therefore test those items/behaviors they feel represent their definition
- When interpreting intelligence scores, avoid making judgments that suggest that the score represents much more than the specific behaviors sampled
- The quality of measurement can be affected by several different types of student characteristics, which therefore must be taken into consideration
77. Remember
- Many of the behaviors sampled on intelligence tests are more indicative of actual achievement than of ability to achieve.
- For example, students who have had more opportunities to learn and achieve are likely to perform better than those who have had less exposure to information, even if they both have the same overall potential to learn.
- Intelligence tests are by no means a pure representation of a student's ability to learn.