Evaluating Technical Quality in the Context of Assessment Innovation

Publication
October 14, 2025

KEY TAKEAWAYS:

  • Findings from a case study of the Massachusetts Consortium for Innovative Education Assessment, which is piloting a curriculum-embedded, educator-developed performance assessment system known as the Portfolios of Performance
  • A case that the evaluation of assessment quality, particularly for models intended to serve as tools for instructional improvement, should be structured around a well-reasoned and coherent validity argument
  • Policy recommendations for state and federal leaders that offer a pathway for building assessment systems that are not only technically sound but also better able to support the goals of equity, instructional relevance and deeper learning

Evaluating Technical Quality in the Context of Assessment Innovation: Policy Implications from a Case Study examines how technical quality should be evaluated in the context of assessment innovation, where the design and use of assessments are intended to support deeper learning.

In this paper, we note that existing federal peer review expectations are not well suited to accommodating the design considerations and tradeoffs associated with innovative programs that intend to shift the purpose of statewide assessment beyond school identification and toward the transformation of teaching and learning. We propose a shift in federal assessment peer review processes to require the submission of a comprehensive validity argument, which would more flexibly support states in gathering and submitting evidence about the quality of the assessment system for serving its intended purposes.

To ground these ideas, we present findings from a case study of the Massachusetts Consortium for Innovative Education Assessment, which is piloting a curriculum-embedded, educator-developed performance assessment system known as the Portfolios of Performance. The Portfolios of Performance system is explicitly designed with students in mind, centering equity, authenticity and agency. The system was evaluated for alignment with academic content standards and score comparability with the state’s current summative assessment, the Massachusetts Comprehensive Assessment System, to better understand the compatibility of performance-based systems with existing federal requirements. In doing so, the paper proposes a pathway for performance assessments (and innovative assessment systems in general) to meet federal peer review requirements.

Findings indicate that the Portfolios of Performance assessment system showed strong potential to meet both existing federal expectations for alignment and more expansive alignment considerations, such as the assessment’s reflection of key instructional shifts embedded in the standards, including interdisciplinary connections and authentic engagement with disciplinary content. However, the assessment system has not yet met the necessary thresholds for score comparability. Recommendations for strengthening the system include increasing the number of tasks, refining task quality and enhancing educator scoring reliability through targeted training and calibration.

More broadly, Evaluating Technical Quality in the Context of Assessment Innovation: Policy Implications from a Case Study argues that the evaluation of assessment quality, particularly for models intended to serve as tools for instructional improvement, should be structured around a well-reasoned and coherent validity argument. Such an argument could include a clear theory of action, evidence of theoretical coherence, expanded evidence of alignment, appropriate methods for evaluating comparability and ongoing attention to implementation and systemic impacts.

We close with policy recommendations for state and federal leaders, including the need to:

  • Strengthen support for the development and continuous improvement of innovative assessment models
  • Create the conditions to improve performance assessment quality within the current federal framework
  • Reorient the federal assessment paradigm to enable and encourage the assessment of deeper learning

Taken together, these recommendations offer a pathway for building assessment systems that are not only technically sound but also better able to support the goals of equity, instructional relevance and deeper learning.
