Evaluating Technical Quality in the Context of Assessment Innovation
In this paper, we argue that existing federal peer review expectations are not well suited to the design considerations and tradeoffs of innovative programs that aim to shift the purpose of statewide assessment beyond school identification and toward the transformation of teaching and learning. We propose revising the federal assessment peer review process to require the submission of a comprehensive validity argument, giving states more flexibility in gathering and submitting evidence of an assessment system's quality for its intended purposes.
To ground these ideas, we present findings from a case study of the Massachusetts Consortium for Innovative Education Assessment, which is piloting a curriculum-embedded, educator-developed performance assessment system known as the Portfolios of Performance. The system is explicitly designed with students in mind, centering equity, authenticity and agency. To better understand the compatibility of performance-based systems with existing federal requirements, we evaluated the Portfolios of Performance for alignment with academic content standards and for score comparability with the state's current summative assessment, the Massachusetts Comprehensive Assessment System. In doing so, the paper proposes a pathway for performance assessments, and innovative assessment systems in general, to meet federal peer review requirements.
Findings indicate that the Portfolios of Performance assessment system shows strong potential to meet existing federal expectations for alignment, as well as more expansive alignment considerations such as whether the assessment reflects key instructional shifts embedded in the standards, including interdisciplinary connections and authentic engagement with disciplinary content. However, the system has not yet met the necessary thresholds for score comparability. Recommendations for strengthening it include increasing the number of tasks, refining task quality and enhancing educator scoring reliability through targeted training and calibration.
More broadly, Evaluating Technical Quality in the Context of Assessment Innovation: Policy Implications from a Case Study argues that the evaluation of assessment quality, particularly for models intended to serve as tools for instructional improvement, should be structured around a well-reasoned and coherent validity argument. Such an argument could include a clear theory of action, evidence of theoretical coherence, expanded evidence of alignment, appropriate methods for evaluating comparability, and ongoing attention to implementation and systemic impacts.
We close with policy recommendations for state and federal leaders, including the need to:
Taken together, these recommendations offer a pathway for building assessment systems that are not only technically sound but also better support the goals of equity, instructional relevance and deeper learning.