1
EDUCATION RESEARCH MEETS THE GOLD STANDARD:
  • STATISTICS, EDUCATION, AND RESEARCH METHODS AFTER
    NO CHILD LEFT BEHIND
  • Mack C. Shelley, II
  • Iowa State University
  • mshelley@iastate.edu
  • Presented at the Joint Statistical Meetings,
    August 7-11, 2005, Minneapolis, MN

2
Background
  • This session is meant to help inform the national
    debate over the role of scientific standards for
    research in education, particularly as those
    research standards are influenced by statistical
    methods and theory.
  • This session builds on a National Science
    Foundation award to me and Brian Hand
    (University of Iowa).

3
Background
  • The panel is designed to meld research interests
    in statistics, education, and related
    disciplines, and to discuss the dramatically
    changing context of contemporary education
    research.
  • Why, exactly, is the context changing for
    statistical research in education?

4
Background
  • Standards for acceptable research in education
    are affected greatly by
  • the recent creation of the Institute of Education
    Sciences in the U.S. Department of Education
  • passage of the No Child Left Behind Act of 2001,
    and
  • passage of the Education Sciences Reform Act
    (H.R. 3801) in 2002.

5
Background
  • Together, these developments
  • have reconstituted federal support for research
    and dissemination of information in education
  • are meant to foster scientifically valid
    research, and
  • have established what is referred to as the "gold
    standard" for research in education.

6
Background
  • These and other developments mean that greater
    emphasis in education research is now placed on
  • quantification,
  • the use of randomized trials, and
  • the selection of valid control groups

7
Background
  • This panel is intended to be part of a sustained
    and expanded dialogue
  • between the statistical community and those who
    implement the education research agenda
  • through a discussion of whether and how to
    implement the new standards for statistical work
    in the field of education research

8
What Is The Gold Standard?
  • U.S. Department of Education, Institute of
    Education Sciences, National Center for Education
    Evaluation and Regional Assistance
  • Identifying and Implementing Educational
    Practices Supported by Rigorous Evidence: A
    User Friendly Guide
  • http://www.ed.gov/about/offices/list/ies/news.html#guide

9
What Is The Gold Standard?
  • This publication emphasizes
  • evidence-based interventions
  • educational outcomes that have been found to be
    effective in randomized controlled trials
  • research's "gold standard" for establishing what
    works
  • following patterns of evidence use in medicine
    and welfare policy

10
What Is The Gold Standard?
  • The quality of studies needed to establish
    strong evidence requires
  • randomized controlled trials that are
    well-designed and implemented
  • that the quantity of evidence needed spans trials
    showing effectiveness in two or more typical
    school settings
  • including a setting similar to that of your
    schools/classrooms

11
What Is The Gold Standard?
  • Possible evidence may include
  • randomized controlled trials whose
    quality/quantity are good but fall short of
    strong evidence
  • and/or comparison-group studies in which the
    intervention and comparison groups are very
    closely matched
  • in academic achievement, demographics, and other
    characteristics

12
What Is The Gold Standard?
  • Evaluating whether an intervention is backed by
    strong evidence of effectiveness hinges on
  • well-designed and well-implemented randomized
    controlled trials
  • demonstrating that there are no systematic
    differences between intervention and control
    groups before the intervention
  • the use of measures and instruments of proven
    validity
  • real-world objective measures of the outcomes
    the intervention is designed to affect
  • attrition of no more than 25% of the original
    sample
  • effect size combined with statistical
    significance (see the sketch after this list)
  • an adequate sample size to achieve statistical
    significance
  • controlled trials implemented in more than one
    site in schools that represent a cross-section of
    all schools
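
The last three criteria above are statistical: effect size, significance, and adequate sample size. As a minimal illustration (simulated scores, not data from any study cited here), the sketch below computes the two quantities the guide asks to see together, a two-sample significance test and a standardized effect size (Cohen's d).

```python
# Illustrative sketch only: pairing a significance test with an
# effect-size estimate. All scores below are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=0.30, scale=1.0, size=120)  # intervention scores
control = rng.normal(loc=0.00, scale=1.0, size=120)    # control scores

# Two-sample t-test for statistical significance
t_stat, p_value = stats.ttest_ind(treatment, control)

# Cohen's d: mean difference scaled by the pooled standard deviation
n1, n2 = len(treatment), len(control)
pooled_sd = np.sqrt(((n1 - 1) * treatment.var(ddof=1) +
                     (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2))
cohens_d = (treatment.mean() - control.mean()) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```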

13
No Child Left Behind
  • Public Law 107-110 (H.R. 1)
  • signed into law on January 8, 2002
  • An Act to close the achievement gap with
    accountability, flexibility, and choice, so that
    no child is left behind
  • the No Child Left Behind Act of 2001 (NCLB)
  • established standards for academic assessments in
    mathematics, reading or language arts, and
    science
  • multiple up-to-date measures of student academic
    achievement, including measures that assess
    higher-order thinking skills and understanding
  • These requirements for program assessment lead to
    many opportunities and circumstances for the
    application of statistical methods.

14
No Child Left Behind
  • The research program under NCLB was designed to
    examine the effect of the assessment and
    accountability systems on students, teachers,
    parents, families, schools, school districts, and
    States, including correlations between such
    systems and
  • student academic achievement
  • progress toward meeting the State-defined level
    of proficiency
  • progress toward closing the achievement gap
  • changes in course offerings, teaching practices,
    course content, and instructional material
  • teacher, principal, and pupil-services personnel
    turnover rates
  • student dropout, grade-retention, and graduation
    rates
  • students with disabilities
  • student socioeconomic status
  • level of student English proficiency
  • student ethnicity and race

15
The Education Sciences Reform Act and IES
  • The Education Sciences Reform Act
  • An Act to provide for improvement of Federal
    education research, statistics, evaluation,
    information, and dissemination, and for other
    purposes
  • H.R. 3801, signed into law November 5, 2002
  • reconstituted federal support for research and
    dissemination of information in education, to
    foster scientifically valid research
  • established the Institute of Education Sciences
    (IES)
  • replacing the Office of Educational Research and
    Improvement
  • part of the Department of Education but
    functioning separately from it

16
The Education Sciences Reform Act and IES
  • IES is the research arm of the Department of
    Education
  • Mission is to expand knowledge and provide
    information on
  • the condition of education
  • practices that improve academic achievement
  • the effectiveness of Federal and other education
    programs
  • Goal
  • the transformation of education into an
    evidence-based field in which decision makers
    routinely seek out the best available research
    and data before adopting programs or practices
    that will affect significant numbers of students
  • Consists of
  • Grover J. (Russ) Whitehurst, first Director,
    since November 2002
  • Office of the Director
  • National Center for Education Research
  • National Center for Education Statistics
  • National Center for Education Evaluation and
    Regional Assistance
  • National Center for Special Education Research

17
The Education Sciences Reform Act and IES
  • H.R. 3801 defined "scientifically based research
    standards" to
  • apply rigorous, systematic, and objective
    methodology to obtain reliable and valid
    knowledge relevant to education activities and
    programs
  • present findings and make claims that are
    appropriate to and supported by the methods that
    have been employed

18
The Education Sciences Reform Act and IES
  • "Scientifically based research" also includes
  • employing systematic, empirical methods that draw
    on observation or experiment
  • involving data analyses that are adequate to
    support the general findings
  • relying on measurements or observational methods
    that provide reliable data
  • making claims of causal relationships only in
    random assignment experiments or other designs
    (to the extent such designs substantially
    eliminate plausible competing explanations for
    the obtained results)
  • ensuring that studies and methods are presented
    in sufficient detail and clarity to allow for
    replication or, at a minimum, to offer the
    opportunity to build systematically on the
    findings of the research
  • obtaining acceptance by a peer-reviewed journal
    or approval by a panel of independent experts
    through a comparably rigorous, objective, and
    scientific review
  • using research designs and methods appropriate to
    the research question posed

19
The Education Sciences Reform Act and IES
  • "Scientifically valid education evaluation" means
    an evaluation that
  • adheres to the highest possible standards of
    quality with respect to research design and
    statistical analysis
  • provides an adequate description of the programs
    evaluated and, to the extent possible, examines
    the relationship between program implementation
    and program impacts
  • provides an analysis of the results achieved by
    the program with respect to its projected effects
  • employs experimental designs using random
    assignment, when feasible, and other research
    methodologies that allow for the strongest
    possible causal inferences when random assignment
    is not feasible
  • may study program implementation through a
    combination of scientifically valid and reliable
    methods

20
What Works
  • What Works Clearinghouse (WWC)
  • established in 2002 by IES
  • to provide educators, policymakers, and the
    public with a central and trusted source of
    scientific evidence of what works in education
  • administered by the U.S. Department of Education,
    through a contract to a joint venture of the
    American Institutes for Research and the Campbell
    Collaboration
  • reviews and reports on existing studies of
    interventions (education programs, products,
    practices, and policies) in selected topic areas
  • applies standards that follow scientifically valid
    criteria for determining the effectiveness of
    these interventions
  • Technical Advisory Group (TAG)
  • leading experts in research design, program
    evaluation, and research synthesis
  • advises on the standards for evaluation research
    reviews
  • monitors and informs the methodological aspects
    of WWC reviews and reports
  • www.whatworks.ed.gov

21
What Works - TAG
  • Dr. Larry V. Hedges, Chairperson, Stella M.
    Rowley Professor of Education, Psychology, Public
    Policy Studies, and Sociology, University of
    Chicago, and editorial board member of the
    American Journal of Sociology, the Review of
    Educational Research, and Psychological Bulletin.
  • Dr. Betsy Jane Becker, Professor of Measurement
    and Quantitative Methods, College of Education,
    Michigan State University.
  • Dr. Jesse A. Berlin, Professor of Biostatistics,
    University of Pennsylvania School of Medicine,
    and Director of Biostatistics at the university's
    Comprehensive Cancer Center.
  • Dr. Douglas Carnine, Professor of Education,
    University of Oregon, and Director of the
    National Center to Improve the Tools of
    Educators.
  • Dr. Thomas D. Cook, Professor of Sociology,
    Psychology, Education and Social Policy,
    Northwestern University, and Faculty Fellow at
    the Institute for Policy Research.
  • Dr. David J. Francis, Professor of Quantitative
    Methods, Chairman of the Department of
    Psychology, and Director of the Texas Institute
    for Measurement, Evaluation, and Statistics,
    University of Houston.
  • Dr. Robert L. Linn, Distinguished Professor of
    Education, University of Colorado at Boulder, and
    Co-Director of the National Center for Research
    on Evaluation, Standards, and Student Testing.
  • Dr. Mark W. Lipsey, Senior Research Associate,
    Vanderbilt Institute for Public Policy Studies,
    and Director of the Center for Evaluation
    Research and Methodology.
  • Dr. David Myers, Senior Fellow, Mathematica
    Policy Research, and former Director of the U.S.
    Department of Education's national evaluation of
    Upward Bound.
  • Dr. Andrew C. Porter, Patricia and Rodes Hart
    Professor of Educational Leadership and Policy
    and Director of the Learning Sciences Institute
    at Vanderbilt University.
  • Dr. David Rindskopf, Professor of Psychology and
    Educational Psychology, City University of New
    York Graduate Center, and elected Fellow of the
    American Statistical Association.
  • Dr. Cecilia E. Rouse, Professor of Economics and
    Public Affairs, and joint appointee in the
    Economics Department and Woodrow Wilson School,
    Princeton University.
  • Dr. William R. Shadish, Founding Faculty and
    Professor of Social Sciences, Humanities, and
    Arts at the University of California, Merced.

22
What Works Current Topics
  • The What Works Clearinghouse (WWC) prioritizes
    topics based on the following criteria
  • potential to improve important student outcomes
  • applicability to a broad range of students or to
    particularly important subpopulations
  • policy relevance and perceived demand within the
    education community and
  • likely availability of scientific studies.
  • Specifically, the topics were selected from
    nominations received through
  • emails from the public
  • meetings and presentations sponsored by the What
    Works Clearinghouse
  • the What Works Network
  • suggestions presented by senior members of
    education associations, policymakers, and the
    U.S. Department of Education and
  • reviews of existing research.

23
What Works Current Topics
  • Topics include
  • Math: Curriculum-Based Interventions for
    Increasing Middle School Math
  • Reading: Interventions for Beginning Reading
  • Character Education: Comprehensive Schoolwide
    Character Education Interventions: Benefits for
    Character Traits, Behavioral, and Academic
    Outcomes
  • Dropout Prevention: Interventions for Preventing
    High School Dropout
  • English Language Learning: Interventions for
    Elementary School English Language Learners:
    Increasing English Language Acquisition and
    Academic Achievement
  • Math: Curriculum-Based Interventions for
    Increasing Elementary School Math
  • Early Childhood: Interventions for Improving
    Preschool Children's School Readiness
  • Delinquent, Disorderly, and Violent Behavior:
    Interventions to Reduce Delinquent, Disorderly,
    and Violent Behavior in Middle and High Schools
  • Adult Literacy: Interventions for Increasing Adult
    Literacy
  • Peer-Assisted Learning: Peer-Assisted Learning
    Interventions in Elementary Schools: Reading,
    Mathematics, and Science Gains

24
Does Not Meet Evidence Screens
  • Studies may not pass WWC screening requirements
    for the following reasons
  • Evaluation research design. The study did not
    meet certain design standards. Study designs that
    provide the strongest evidence of effects include
  • randomized controlled trials
  • regression discontinuity designs (sketched after
    this list)
  • quasi-experimental designs (must use a similar
    comparison group and have no attrition or
    disruption problems)
  • single subject designs
  • Topic area definition. The study did not meet the
    intervention definition developed by the WWC for
    a particular topic.
  • Time period definition (generally, the last 20
    years)
  • Relevant outcome
  • academic outcomes, not, for example, student
    self-confidence
  • needs to have only one relevant outcome to pass
    this screen
  • test reliability or validity
  • sample or description of relevant test items if a
    study outcome test is not known or available
  • Relevant student sample
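
Of the designs listed above, the regression discontinuity design is often the least familiar. The sketch below is a hedged illustration on simulated data: students scoring below a pretest cutoff receive the intervention, and the effect is estimated as the jump in the outcome at the cutoff.

```python
# Hypothetical regression-discontinuity sketch; data are simulated and
# the true effect (3.0) is built in so the estimate can be checked.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n, cutoff = 400, 50.0
pretest = rng.uniform(30, 70, n)
treated = (pretest < cutoff).astype(int)          # assignment by cutoff rule
outcome = 20 + 0.8 * pretest + 3.0 * treated + rng.normal(0, 2, n)

# Linear specification with an intercept shift at the cutoff and
# separate slopes on each side of it
centered = pretest - cutoff
X = sm.add_constant(np.column_stack([treated, centered, treated * centered]))
fit = sm.OLS(outcome, X).fit()
print(f"estimated effect at cutoff: {fit.params[1]:.2f}")  # true value is 3.0
```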

25
A Real Live Current Example
  • MATHEMATICS AND SCIENCE EDUCATION RESEARCH GRANTS
    PROGRAM
  • CFDA (Catalog of Federal Domestic Assistance)
    NUMBER: 84.305
  • RELEASE DATE: May 6, 2005
  • REQUEST FOR APPLICATIONS NUMBER: NCER-06-02,
    Mathematics and Science Education Research Grants
    Program
  • http://www.ed.gov/about/offices/list/ies/programs.html
  • LETTER OF INTENT RECEIPT DATE: September 12,
    2005
  • APPLICATION RECEIPT DATE: November 3, 2005, 8:00
    p.m. Eastern time

26
A Real Live Current Example
  • REVIEW CRITERIA FOR SCIENTIFIC MERIT
  • Significance
  • Does the applicant make a compelling case for the
    potential contribution of the project to the
    solution of an education problem?
  • Does the applicant present a strong rationale
    justifying the need to evaluate the selected
    intervention (e.g., does prior evidence suggest
    that the intervention is likely to substantially
    improve student learning and achievement)?
  • Research Plan
  • Does the applicant present
  • (a) clear hypotheses or research questions
  • (b) clear descriptions of and strong rationales
    for the sample, measures (including information
    on reliability and validity), data collection
    procedures, and research design
  • (c) a detailed and well-justified data analysis
    plan?
  • Does the research plan meet the requirements
    described in the section on the Requirements of
    the Proposed Research?
  • Is the research plan appropriate for answering
    the research questions or testing the proposed
    hypotheses?

27
A Real Live Current Example
  • Applications under Goal Three (Efficacy and
    Replication Trials)
  • Under Goal Three, the Institute requests
    proposals to test the efficacy of fully developed
    interventions that already have evidence of
    potential efficacy.
  • By efficacy, the Institute means the degree to
    which an intervention has a net positive impact
    on the outcomes of interest in relation to the
    program or practice to which it is being compared.

28
A Real Live Current Example
  • Methodological requirements
  • (i) Sample
  • The applicant should define, as completely as
    possible, the sample to be selected and sampling
    procedures to be employed for the proposed study.
    Additionally, the applicant should describe
    strategies to ensure that participants will
    remain in the study over the course of the
    evaluation.

29
A Real Live Current Example
  • (ii) Design
  • Applicants should describe how potential threats
    to internal and external validity will be
    addressed.
  • Studies using randomized assignment to treatment
    and comparison conditions are strongly preferred.
  • When a randomized trial is used, the applicant
    should clearly state the unit of randomization
    (e.g., students, classroom, teacher, or school).
  • Choice of randomizing unit or units should be
    grounded in a theoretical framework.
  • Applicants should explain the procedures for
    assignment of groups (e.g., schools, classrooms)
    or participants to treatment and comparison
    conditions.
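
A minimal sketch of assignment at the school level, with invented school identifiers, makes the unit-of-randomization point concrete: the school, not the student, is what gets randomized.

```python
# Hypothetical sketch of school-level random assignment; the school
# IDs and the 50/50 split are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(2005)
schools = [f"school_{i:02d}" for i in range(1, 21)]  # 20 participating schools

# Shuffle and split so exactly half the schools receive the intervention
shuffled = rng.permutation(schools)
treatment_schools = set(shuffled[:len(schools) // 2])

assignment = {s: ("treatment" if s in treatment_schools else "comparison")
              for s in schools}
for school, condition in sorted(assignment.items()):
    print(school, condition)
```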

30
A Real Live Current Example
  • (ii) Design (continued)
  • Alternatives that substantially minimize
    selection bias, or allow it to be modeled, may be
    employed only in circumstances in which a
    randomized trial is not possible. Applicants must
    make a compelling case that randomization is not
    possible.
  • Acceptable alternatives include appropriately
    structured regression-discontinuity designs or
    other well-designed quasi-experimental designs
    that come close to true experiments in minimizing
    the effects of selection bias on estimates of
    effect size.

31
A Real Live Current Example
  • (ii) Design (continued)
  • A well-designed quasi-experiment reduces
    substantially the potential influence of
    selection bias on membership in the intervention
    or comparison group. This involves
  • demonstrating equivalence between the
    intervention and comparison groups at program
    entry on the variables measuring program outcomes
    (e.g., math achievement test scores), or
    obtaining such equivalence through statistical
    procedures such as propensity score balancing or
    regression (see the sketch after this list)
  • demonstrating equivalence or removing
    statistically the effects of other variables on
    which the groups may differ and that may affect
    intended outcomes of the program being evaluated
    (e.g., demographic variables, experience and
    level of training of teachers, motivation of
    parents or students)
  • a design for the initial selection of the
    intervention and comparison groups that minimizes
    selection bias or allows it to be modeled
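
The propensity-score balancing named in the first item can be sketched briefly. Everything below (the covariate names, sample size, and selection model) is an assumption for illustration: a logistic regression estimates each unit's probability of receiving the intervention, and covariate balance is checked before and after inverse-propensity weighting.

```python
# Hedged sketch of propensity-score balancing for a quasi-experiment;
# all data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
pretest = rng.normal(50, 10, n)            # prior achievement
ses = rng.normal(0, 1, n)                  # socioeconomic index
# Selection into the intervention depends on the covariates
p_select = 1 / (1 + np.exp(-(0.05 * (pretest - 50) + 0.5 * ses)))
treated = rng.binomial(1, p_select)

# Logistic regression for the propensity score
X = sm.add_constant(np.column_stack([pretest, ses]))
pscore = sm.Logit(treated, X).fit(disp=0).predict(X)

# Inverse-propensity weights
w = np.where(treated == 1, 1 / pscore, 1 / (1 - pscore))

def std_mean_diff(x, weights):
    """Weighted standardized mean difference between groups."""
    m1 = np.average(x[treated == 1], weights=weights[treated == 1])
    m0 = np.average(x[treated == 0], weights=weights[treated == 0])
    return (m1 - m0) / x.std(ddof=1)

for name, x in [("pretest", pretest), ("ses", ses)]:
    raw = (x[treated == 1].mean() - x[treated == 0].mean()) / x.std(ddof=1)
    print(f"{name}: SMD before = {raw:.2f}, "
          f"after weighting = {std_mean_diff(x, w):.2f}")
```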

32
A Real Live Current Example
  • (iii) Power
  • Applicants should clearly address the power of
    the evaluation design to detect a reasonably
    expected and minimally important effect.
  • For determining the sample size, applicants need
    to consider the number of clusters, the number of
    individuals within clusters, the potential
    adjustment from covariates, the desired effect,
    the intraclass correlation (i.e., the variance
    between clusters relative to the total variance
    between and within clusters), the desired power
    of the design, one-tailed vs. two-tailed tests,
    repeated observations, attrition of participants,
    etc. (a worked example follows this list)
  • Applicants should anticipate the degree to which
    the magnitude of the expected effect may vary
    across the primary outcomes of interest.
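
A worked example, with assumed values for every input, shows how the intraclass correlation drives power in a cluster-randomized design through the design effect, DEFF = 1 + (m - 1) * ICC.

```python
# Illustrative power calculation for a cluster-randomized design; the
# numbers are assumptions, not from the RFA. Two-sided test, normal
# approximation.
import numpy as np
from scipy.stats import norm

clusters_per_arm = 20      # schools per condition
m = 25                     # students per school
icc = 0.15                 # intraclass correlation
effect_size = 0.25         # expected standardized effect (Cohen's d)
alpha = 0.05

# Design effect inflates the variance relative to simple random sampling
deff = 1 + (m - 1) * icc
n_per_arm = clusters_per_arm * m
n_effective = n_per_arm / deff   # effective sample size per arm

# Power for a two-sample comparison of means (normal approximation)
se = np.sqrt(2 / n_effective)
z_crit = norm.ppf(1 - alpha / 2)
power = norm.cdf(effect_size / se - z_crit)

print(f"design effect = {deff:.2f}, effective n/arm = {n_effective:.0f}, "
      f"power = {power:.2f}")
```

With these assumed inputs the design effect is 4.6, so 500 students per arm behave like roughly 109 independent observations, and power falls well below the conventional 0.80; ignoring the ICC would badly overstate the study's sensitivity.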

33
A Real Live Current Example
  • (iv) Measures
  • Investigators should include
  • relevant standardized measures of student
    achievement (e.g., standardized measures of
    mathematics achievement)
  • other measures of student learning and
    achievement (e.g., researcher-developed measures)
  • measures of teacher practices
  • information on the reliability, validity, and
    appropriateness of proposed measures
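
As one hedged way to document the reliability requirement in the last item, the sketch below computes Cronbach's alpha for a set of simulated item responses; a real application would use the instrument's actual item-level data.

```python
# Cronbach's alpha as one common reliability index; item responses
# here are simulated from a single underlying ability.
import numpy as np

rng = np.random.default_rng(3)
n_students, k_items = 200, 10
ability = rng.normal(0, 1, n_students)
# Each item reflects the same underlying ability plus noise
items = ability[:, None] + rng.normal(0, 1, (n_students, k_items))

item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)
alpha = (k_items / (k_items - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")
```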

34
A Real Live Current Example
  • (v) Fidelity of implementation of the
    intervention
  • The applicant should
  • specify how the implementation of the
    intervention will be documented and measured
  • either indicate how the intervention will be
    maintained consistently across multiple groups
    (e.g., classrooms and schools) over time or
    describe the parameters under which variations in
    the implementation may occur
  • propose research designs that permit the
    identification and assessment of factors
    impacting the fidelity of implementation

35
A Real Live Current Example
  • (vi) Comparison group, where applicable
  • The applicant should
  • describe strategies to avoid contamination
    between treatment and comparison groups
  • include procedures for describing practices in
    the comparison groups
  • be able to compare intervention and comparison
    groups on the implementation of key features of
    the intervention
  • using a business-as-usual comparison group is
    acceptable
  • applicants should specify the treatment or
    treatments received in the comparison group
  • applicants should account for the ways in which
    what happens in the comparison group is
    important to understanding the net impact of the
    experimental treatment

36
A Real Live Current Example
  • (vii) Mediating and moderating variables
  • Mediating and moderating variables that are
    measured in the intervention condition that are
    also likely to affect outcomes in the comparison
    condition should be measured in the comparison
    condition (e.g., student time-on-task, teacher
    experience/time in position).
  • The evaluation should account for sources of
    variation in outcomes across settings (i.e., to
    account for what might otherwise be part of the
    error variance).
  • (viii) Data analysis
  • specific statistical procedures should be
    described
  • the relation between hypotheses, measures, and
    independent and dependent variables should be
    clear
  • the effects of clustering must be accounted for
    in the analyses, even when individuals are
    randomly assigned to condition
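
As a closing illustration of the clustering requirement (simulated data; school-level assignment assumed), a random-intercept multilevel model treats students as nested within schools rather than as independent observations:

```python
# Minimal sketch: accounting for clustering with a multilevel model,
# students nested in schools, random intercept per school. Simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n_schools, m = 30, 20
school = np.repeat(np.arange(n_schools), m)
treatment = np.repeat(rng.binomial(1, 0.5, n_schools), m)  # school-level assignment
school_effect = np.repeat(rng.normal(0, 0.5, n_schools), m)
score = 50 + 2.0 * treatment + school_effect + rng.normal(0, 1, n_schools * m)

data = pd.DataFrame({"score": score, "treatment": treatment, "school": school})

# Random-intercept model: variance is partitioned between and within schools
result = smf.mixedlm("score ~ treatment", data, groups=data["school"]).fit()
print(result.summary())
```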