Reliability and Validity of International Large-Scale Assessment: Understanding IEA's Comparative Studies of Student Achievement

Bibliographic Details
Main Author: Wagemaker, Hans.
Format: eBook
Language: English
Published: Cham: Springer International Publishing AG, 2020.
Edition: 1st ed.
Series:IEA Research for Education Series
Table of Contents:
  • Intro
  • Foreword
  • Contents
  • About the Editor
  • 1 Introduction to Reliability and Validity of International Large-Scale Assessment
  • 1.1 Introduction
  • 1.2 Outline of This Book
  • References
  • 2 Study Design and Evolution, and the Imperatives of Reliability and Validity
  • 2.1 Introduction
  • 2.2 Decisions Informing the Design of IEA Studies and Test Development
  • 2.3 Addressing the Challenges
  • 2.3.1 Governance and Representation (Validity and Fairness)
  • 2.3.2 Reliability and Validity
  • 2.3.3 Changing Contexts
  • 2.3.4 Compositional Changes
  • 2.3.5 Financial Support
  • 2.3.6 Expansion in Assessment Activities
  • 2.3.7 Heterogeneity
  • 2.3.8 Advances in Technology
  • 2.3.9 Assessment Delivery
  • 2.4 Conclusions
  • References
  • 3 Framework Development in International Large-Scale Assessment Studies
  • 3.1 Introduction
  • 3.2 Assessment Frameworks
  • 3.3 Contextual Frameworks
  • 3.4 Design and Implementation
  • 3.5 Steps in Framework Development
  • 3.6 Types of Framework
  • 3.6.1 Curriculum-Referenced Frameworks
  • 3.6.2 Frameworks for Measuring Outcomes in Cross-Curricular Learning Areas
  • 3.6.3 Frameworks for Measuring Real-Life Skills
  • 3.6.4 Frameworks for Measuring Contexts
  • 3.7 Conclusions
  • References
  • 4 Assessment Content Development
  • 4.1 Introduction
  • 4.2 Key Features of ILSAs that Influence Assessment Content Development
  • 4.3 Validity in International Large-Scale Assessments
  • 4.4 The Assessment Frameworks
  • 4.5 Stimulus Material and Item Development: Quality Criteria Associated with Validity
  • 4.5.1 Representation of the Construct
  • 4.5.2 Technical Quality
  • 4.5.3 Level of Challenge
  • 4.5.4 Absence of Bias
  • 4.5.5 Language and Accessibility
  • 4.5.6 Cultural and Religious Contexts
  • 4.5.7 Engagement of Test-Takers
  • 4.5.8 Scoring Reliability
  • 4.6 Stimulus and Item Material: An Overview
  • 4.6.1 Stimulus Characteristics, Selection, and Development
  • 4.6.2 Item Characteristics and Development
  • 4.6.3 Item Types
  • 4.7 Phases in the Assessment Development Process
  • 4.7.1 Phase 1: Drafting and Sourcing Preliminary Content
  • 4.7.2 Phase 2: Item Development
  • 4.7.3 Phase 3: The Field Trial and Post Field Trial Review
  • 4.7.4 Post Main Survey Test Curriculum Mapping Analysis
  • 4.8 Measuring Change Over Time and Releasing Materials for Public Information
  • 4.9 Conclusions
  • References
  • 5 Questionnaire Development in International Large-Scale Assessment Studies
  • 5.1 Introduction
  • 5.2 Approaches to Questionnaire Design and Framing
  • 5.3 Targeting of Questionnaires to Different Groups and a Diversity of Contexts
  • 5.4 Typology of Questions, Item Formats and Resulting Indicators
  • 5.5 Development Procedures, Process and Quality Management
  • 5.6 Questionnaire Delivery
  • 5.7 Conclusions
  • References
  • 6 Translation: The Preparation of National Language Versions of Assessment Instruments
  • 6.1 Introduction
  • 6.2 Translation Related Developments in IEA Studies
  • 6.3 Standards and Generalized Stages of Instrument Production
  • 6.4 Source Version and Reference Version
  • 6.4.1 Terms Used: Translation Versus Adaptation
  • 6.4.2 Collaborative Efforts
  • 6.5 Translation and Adaptation
  • 6.6 Decentralized Translations and Adaptations
  • 6.7 Centralized Verification
  • 6.8 Translation Verifiers
  • 6.9 Layout Verification
  • 6.10 Development Linked to Computer-Based Assessment
  • 6.11 Reviewing Results of Translation and Verification Processes
  • 6.12 Procedure Chain and Timeline
  • 6.13 Conclusions
  • References
  • 7 Sampling, Weighting, and Variance Estimation
  • 7.1 Introduction
  • 7.2 Defining Target Populations
  • 7.3 Preparing Valid Sampling Frames for Each Sampling Stage
  • 7.4 Sampling Strategies and Sampling Precision
  • 7.4.1 Multiple Stage Sampling and Cluster Sampling
  • 7.4.2 Stratification
  • 7.4.3 Sampling with Probabilities Proportional to Size
  • 7.4.4 Estimating Sampling Precision
  • 7.5 Weighting and Nonresponse Adjustment
  • 7.6 Sampling Adjudication
  • 7.7 Conclusions
  • References
  • 8 Quality Control During Data Collection: Refining for Rigor
  • 8.1 Introduction
  • 8.2 Manuals
  • 8.2.1 Early Quality Control Procedures and the Development of Manuals
  • 8.2.2 Current Implementation of Manuals
  • 8.3 National Quality Control Procedures
  • 8.3.1 Development of National Quality Control Procedures
  • 8.3.2 Implementation of National Quality Control Procedures
  • 8.4 International Quality Control
  • 8.4.1 Development of International Quality Control Procedures
  • 8.4.2 Implementation of International Quality Control Procedures
  • 8.5 Future Directions
  • References
  • 9 Post-collection Data Capture, Scoring, and Processing
  • 9.1 Introduction
  • 9.2 Manual Post-collection Data Capture and Management Training
  • 9.2.1 Data Capture from Paper-Based Instruments
  • 9.2.2 Software Used for Data Capture
  • 9.2.3 Quality Control: Data Entry
  • 9.3 Scoring Cognitive Data: Test Booklets
  • 9.3.1 Process of Scoring Constructed-Response Cognitive Items
  • 9.3.2 Software Used for Scoring Data
  • 9.3.3 Quality Control
  • 9.4 Coding Data
  • 9.4.1 Process of Coding Data
  • 9.4.2 Software Used for Coding Data
  • 9.4.3 Quality Control
  • 9.5 International Data Processing
  • 9.5.1 Processes in International Data Processing
  • 9.5.2 Software Used for International Data Processing and Analysis
  • 9.5.3 Quality Control
  • 9.6 Conclusions
  • References
  • 10 Technology and Assessment
  • 10.1 Introduction
  • 10.2 Technology in Education
  • 10.3 Promises and Successes of Technology to Reform Assessment Data Collection
  • 10.3.1 Efficiency
  • 10.3.2 Increased Reliability: Direct Data Capture
  • 10.3.3 Inclusion of More Comprehensive Measures of the Overall Construct
  • 10.3.4 Reading: Additional Competencies Needed in the Information Age
  • 10.3.5 Mathematics and Science: Inclusion of Innovative Problem-Solving Strategies
  • 10.3.6 Computational Thinking: Developing Algorithmic Solutions
  • 10.3.7 Increased Reliability: Use of Log-File Data
  • 10.3.8 Development of More Engaging and Better Matching Assessments
  • 10.4 The Transition
  • 10.4.1 Delivering Questionnaires Online
  • 10.4.2 Computer-Based Assessment
  • 10.5 Challenges
  • 10.6 The Future: Guiding Principles for the Design of eAssessment Software
  • 10.6.1 Adaptive Testing
  • 10.6.2 Translation
  • 10.6.3 Printing
  • 10.6.4 Web-Based Delivery
  • 10.6.5 General Considerations
  • 10.7 Conclusions
  • References
  • 11 Ensuring Validity in International Comparisons Using State-of-the-Art Psychometric Methodologies
  • 11.1 Introduction
  • 11.2 Modern Educational Measurement: Item Response Theory
  • 11.2.1 From Chess Ranking to the Rasch Model
  • 11.2.2 Characteristics of the Rasch Model
  • 11.2.3 More General IRT Models
  • 11.2.4 Central Assumptions of IRT and Their Importance
  • 11.2.5 Unidimensionality
  • 11.2.6 Local Independence
  • 11.2.7 Population Homogeneity/Measurement Invariance
  • 11.3 Simultaneous Modeling of Individual and Group Differences
  • 11.4 Statistical Modeling of Individual and Group Differences in IEA Survey Data
  • 11.4.1 Comparability as Generalized Measurement Invariance
  • 11.4.2 Multiple-Group IRT Models
  • 11.4.3 Population Models Integrating Test and Background Data
  • 11.4.4 Group Ability Distributions and Plausible Values
  • 11.5 Conclusions
  • References
  • 12 Publications and Dissemination
  • 12.1 Introduction
  • 12.2 Core Project Publications
  • 12.3 Project-Related Publications
  • 12.4 Academic Journal
  • 12.5 IEA International Research Conference (IRC)
  • 12.6 IEA Compass Briefs
  • 12.7 Quality Assurance in Publications
  • 12.8 Public Dissemination of IEA's Work
  • 12.9 Conclusions
  • References
  • 13 Consequential Validity: Data Access, Data Use, Analytical Support, and Training
  • 13.1 Introduction
  • 13.2 Data Access
  • 13.3 Facilitating Analysis: The IEA IDB Analyzer
  • 13.4 Capacity Building: Workshops
  • 13.4.1 Promoting High-Quality Research Based on Large-Scale Assessment Data
  • 13.4.2 The IEA-ETS Research Institute (IERI)
  • 13.4.3 IEA International Research Conference
  • 13.4.4 Academic Visitors/Scholars
  • 13.4.5 IEA Awards
  • 13.5 Conclusions
  • References
  • 14 Using IEA Studies to Inform Policymaking and Program Development: The Case of Singapore
  • 14.1 Introduction
  • 14.2 Why Singapore Participates in International Large-Scale Assessments
  • 14.2.1 Participating in International Large-Scale Assessment Facilitates Benchmarking of Student Developmental Outcomes and Educator Practices
  • 14.2.2 Participating in International Large-Scale Assessment Provides Additional High-Quality Rich Data Sources for Secondary Analyses
  • 14.2.3 Participating in International Large-Scale Assessment Builds International Networks of Educationists and Experts
  • 14.3 How MOE Has Used Large-Scale Assessment Data
  • 14.3.1 STELLAR: "We Must and Can Do Better!"
  • 14.3.2 A New Pedagogical Approach to Learning Science: "We Tried a Different Method, Did It Materialize?"
  • 14.3.3 Bold Curricular and Pedagogical Shifts: "We Made Some Trade-Offs, What Did We Sacrifice?"
  • 14.4 Some Principles Underpinning MOE's Use of Large-Scale Assessment Data
  • References
  • 15 Understanding the Policy Influence of International Large-Scale Assessments in Education
  • 15.1 Introduction
  • 15.2 Impact, Influence, and Education Policy
  • 15.3 Policy Influence?