Reliability and Validity of International Large-Scale Assessment: Understanding IEA's Comparative Studies of Student Achievement.
Main Author:
Format: eBook
Language: English
Published: Cham: Springer International Publishing AG, 2020
Edition: 1st ed.
Series: IEA Research for Education Series
Subjects:
Table of Contents:
- Intro
- Foreword
- Contents
- About the Editor
- 1 Introduction to Reliability and Validity of International Large-Scale Assessment
- 1.1 Introduction
- 1.2 Outline of This Book
- References
- 2 Study Design and Evolution, and the Imperatives of Reliability and Validity
- 2.1 Introduction
- 2.2 Decisions Informing the Design of IEA Studies and Test Development
- 2.3 Addressing the Challenges
- 2.3.1 Governance and Representation (Validity and Fairness)
- 2.3.2 Reliability and Validity
- 2.3.3 Changing Contexts
- 2.3.4 Compositional Changes
- 2.3.5 Financial Support
- 2.3.6 Expansion in Assessment Activities
- 2.3.7 Heterogeneity
- 2.3.8 Advances in Technology
- 2.3.9 Assessment Delivery
- 2.4 Conclusions
- References
- 3 Framework Development in International Large-Scale Assessment Studies
- 3.1 Introduction
- 3.2 Assessment Frameworks
- 3.3 Contextual Frameworks
- 3.4 Design and Implementation
- 3.5 Steps in Framework Development
- 3.6 Types of Framework
- 3.6.1 Curriculum-Referenced Frameworks
- 3.6.2 Frameworks for Measuring Outcomes in Cross-Curricular Learning Areas
- 3.6.3 Frameworks for Measuring Real-Life Skills
- 3.6.4 Frameworks for Measuring Contexts
- 3.7 Conclusions
- References
- 4 Assessment Content Development
- 4.1 Introduction
- 4.2 Key Features of ILSAs that Influence Assessment Content Development
- 4.3 Validity in International Large-Scale Assessments
- 4.4 The Assessment Frameworks
- 4.5 Stimulus Material and Item Development: Quality Criteria Associated with Validity
- 4.5.1 Representation of the Construct
- 4.5.2 Technical Quality
- 4.5.3 Level of Challenge
- 4.5.4 Absence of Bias
- 4.5.5 Language and Accessibility
- 4.5.6 Cultural and Religious Contexts
- 4.5.7 Engagement of Test-Takers
- 4.5.8 Scoring Reliability
- 4.6 Stimulus and Item Material: An Overview
- 4.6.1 Stimulus Characteristics, Selection, and Development
- 4.6.2 Item Characteristics and Development
- 4.6.3 Item Types
- 4.7 Phases in the Assessment Development Process
- 4.7.1 Phase 1: Drafting and Sourcing Preliminary Content
- 4.7.2 Phase 2: Item Development
- 4.7.3 Phase 3: The Field Trial and Post Field Trial Review
- 4.7.4 Post Main Survey Test Curriculum Mapping Analysis
- 4.8 Measuring Change Over Time and Releasing Materials for Public Information
- 4.9 Conclusions
- References
- 5 Questionnaire Development in International Large-Scale Assessment Studies
- 5.1 Introduction
- 5.2 Approaches to Questionnaire Design and Framing
- 5.3 Targeting of Questionnaires to Different Groups and a Diversity of Contexts
- 5.4 Typology of Questions, Item Formats and Resulting Indicators
- 5.5 Development Procedures, Process and Quality Management
- 5.6 Questionnaire Delivery
- 5.7 Conclusions
- References
- 6 Translation: The Preparation of National Language Versions of Assessment Instruments
- 6.1 Introduction
- 6.2 Translation-Related Developments in IEA Studies
- 6.3 Standards and Generalized Stages of Instrument Production
- 6.4 Source Version and Reference Version
- 6.4.1 Terms Used: Translation Versus Adaptation
- 6.4.2 Collaborative Efforts
- 6.5 Translation and Adaptation
- 6.6 Decentralized Translations and Adaptations
- 6.7 Centralized Verification
- 6.8 Translation Verifiers
- 6.9 Layout Verification
- 6.10 Development Linked to Computer-Based Assessment
- 6.11 Reviewing Results of Translation and Verification Processes
- 6.12 Procedure Chain and Timeline
- 6.13 Conclusions
- References
- 7 Sampling, Weighting, and Variance Estimation
- 7.1 Introduction
- 7.2 Defining Target Populations
- 7.3 Preparing Valid Sampling Frames for Each Sampling Stage
- 7.4 Sampling Strategies and Sampling Precision
- 7.4.1 Multiple Stage Sampling and Cluster Sampling
- 7.4.2 Stratification
- 7.4.3 Sampling with Probabilities Proportional to Size
- 7.4.4 Estimating Sampling Precision
- 7.5 Weighting and Nonresponse Adjustment
- 7.6 Sampling Adjudication
- 7.7 Conclusions
- References
- 8 Quality Control During Data Collection: Refining for Rigor
- 8.1 Introduction
- 8.2 Manuals
- 8.2.1 Early Quality Control Procedures and the Development of Manuals
- 8.2.2 Current Implementation of Manuals
- 8.3 National Quality Control Procedures
- 8.3.1 Development of National Quality Control Procedures
- 8.3.2 Implementation of National Quality Control Procedures
- 8.4 International Quality Control
- 8.4.1 Development of International Quality Control Procedures
- 8.4.2 Implementation of International Quality Control Procedures
- 8.5 Future Directions
- References
- 9 Post-collection Data Capture, Scoring, and Processing
- 9.1 Introduction
- 9.2 Manual Post-collection Data Capture and Management Training
- 9.2.1 Data Capture from Paper-Based Instruments
- 9.2.2 Software Used for Data Capture
- 9.2.3 Quality Control: Data Entry
- 9.3 Scoring Cognitive Data: Test Booklets
- 9.3.1 Process of Scoring Constructed-Response Cognitive Items
- 9.3.2 Software Used for Scoring Data
- 9.3.3 Quality Control
- 9.4 Coding Data
- 9.4.1 Process of Coding Data
- 9.4.2 Software Used for Coding Data
- 9.4.3 Quality Control
- 9.5 International Data Processing
- 9.5.1 Processes in International Data Processing
- 9.5.2 Software Used for International Data Processing and Analysis
- 9.5.3 Quality Control
- 9.6 Conclusions
- References
- 10 Technology and Assessment
- 10.1 Introduction
- 10.2 Technology in Education
- 10.3 Promises and Successes of Technology to Reform Assessment Data Collection
- 10.3.1 Efficiency
- 10.3.2 Increased Reliability: Direct Data Capture
- 10.3.3 Inclusion of More Comprehensive Measures of the Overall Construct
- 10.3.4 Reading: Additional Competencies Needed in the Information Age
- 10.3.5 Mathematics and Science: Inclusion of Innovative Problem-Solving Strategies
- 10.3.6 Computational Thinking: Developing Algorithmic Solutions
- 10.3.7 Increased Reliability: Use of Log-File Data
- 10.3.8 Development of More Engaging and Better Matching Assessments
- 10.4 The Transition
- 10.4.1 Delivering Questionnaires Online
- 10.4.2 Computer-Based Assessment
- 10.5 Challenges
- 10.6 The Future: Guiding Principles for the Design of eAssessment Software
- 10.6.1 Adaptive Testing
- 10.6.2 Translation
- 10.6.3 Printing
- 10.6.4 Web-Based Delivery
- 10.6.5 General Considerations
- 10.7 Conclusions
- References
- 11 Ensuring Validity in International Comparisons Using State-of-the-Art Psychometric Methodologies
- 11.1 Introduction
- 11.2 Modern Educational Measurement: Item Response Theory
- 11.2.1 From Chess Ranking to the Rasch Model
- 11.2.2 Characteristics of the Rasch Model
- 11.2.3 More General IRT Models
- 11.2.4 Central Assumptions of IRT and Their Importance
- 11.2.5 Unidimensionality
- 11.2.6 Local Independence
- 11.2.7 Population Homogeneity/Measurement Invariance
- 11.3 Simultaneous Modeling of Individual and Group Differences
- 11.4 Statistical Modeling of Individual and Group Differences in IEA Survey Data
- 11.4.1 Comparability as Generalized Measurement Invariance
- 11.4.2 Multiple-Group IRT Models
- 11.4.3 Population Models Integrating Test and Background Data
- 11.4.4 Group Ability Distributions and Plausible Values
- 11.5 Conclusions
- References
- 12 Publications and Dissemination
- 12.1 Introduction
- 12.2 Core Project Publications
- 12.3 Project-Related Publications
- 12.4 Academic Journal
- 12.5 IEA International Research Conference (IRC)
- 12.6 IEA Compass Briefs
- 12.7 Quality Assurance in Publications
- 12.8 Public Dissemination of IEA's Work
- 12.9 Conclusions
- References
- 13 Consequential Validity: Data Access, Data Use, Analytical Support, and Training
- 13.1 Introduction
- 13.2 Data Access
- 13.3 Facilitating Analysis: The IEA IDB Analyzer
- 13.4 Capacity Building: Workshops
- 13.4.1 Promoting High-Quality Research Based on Large-Scale Assessment Data
- 13.4.2 The IEA-ETS Research Institute (IERI)
- 13.4.3 IEA International Research Conference
- 13.4.4 Academic Visitors/Scholars
- 13.4.5 IEA Awards
- 13.5 Conclusions
- References
- 14 Using IEA Studies to Inform Policymaking and Program Development: The Case of Singapore
- 14.1 Introduction
- 14.2 Why Singapore Participates in International Large-Scale Assessments
- 14.2.1 Participating in International Large-Scale Assessment Facilitates Benchmarking of Student Developmental Outcomes and Educator Practices
- 14.2.2 Participating in International Large-Scale Assessment Provides Additional High-Quality Rich Data Sources for Secondary Analyses
- 14.2.3 Participating in International Large-Scale Assessment Builds International Networks of Educationists and Experts
- 14.3 How MOE Has Used Large-Scale Assessment Data
- 14.3.1 STELLAR: "We Must and Can Do Better!"
- 14.3.2 A New Pedagogical Approach to Learning Science: "We Tried a Different Method, Did It Materialize?"
- 14.3.3 Bold Curricular and Pedagogical Shifts: "We Made Some Trade-Offs, What Did We Sacrifice?"
- 14.4 Some Principles Underpinning MOE's Use of Large-Scale Assessment Data
- References
- 15 Understanding the Policy Influence of International Large-Scale Assessments in Education
- 15.1 Introduction
- 15.2 Impact, Influence, and Education Policy
- 15.3 Policy Influence?