Fall 2013

Measuring Student Success

The Promise of Performance Assessment … and the Challenges

In all ages, whether digital or pre-digital, the measurement of student success must be driven by the education outcomes we value and the type of evidence that best demonstrates success. More than ever before, the time is ripe for performance assessment ... if we can overcome the challenges.

During AdvancED’s International Summit in June, we captured instructive insights on this topic from experienced educators and policymakers through answers to two open-response questions and a 10-item assessment perceptions survey. One conclusion stands out: despite overwhelming appreciation for the benefits of performance assessment and support for its use in measuring student success, significant obstacles exist, even among proponents.

Defining Performance Assessment

To establish common ground, we explain below what is meant by “performance assessment.” Very broadly, rather than requiring students to select a response from two or more options, performance assessment asks students to apply their knowledge and skills through some form of product, presentation or demonstration focused on key aspects of academic learning. In the context of 21st century skills, the term “performance assessment” commonly refers to substantive activities — either short-term, on-demand tasks or curriculum-embedded, project-based tasks that yield reliable and valid scores. Products can include extended writing, research reports, presentations, works of art, performances and more. Expert opinion and anecdotal evidence credit performance assessment with promoting deeper learning, higher-order and non-cognitive skills, and student engagement.

Performance assessment can measure proficiency/mastery in accountability testing, competency-based instructional programs and badging. It can promote/gauge learning when curriculum-embedded — as part of discrete lessons or whole project-based programs. It can be implemented for selected standards, as in Ohio’s Performance Assessment Pilot Project (OPAPP), or on an immersive basis school-wide throughout the year, as in the model developed by the Boston-based Center for Collaborative Education. The Common Core State Standards (CCSS) are more performance-based, as are the consortia-developed assessments. Digital technologies can be used but are not required.

Surveying Educators

Turning to the open-response question results, just 15 percent of the answers identified academic content knowledge as what high school graduates need for college or career success. As seen in Table 1, 85 percent cited 21st century and higher-order thinking skills, executive functions, and personal dispositions/mind-set. Non-cognitive needs represented more than 30 percent of the responses (some were categorized as 21st century skills). These results might reflect the championing of deeper learning, higher-order thinking and non-cognitive skills over the past several years by a broad swath of education experts and stakeholders.

[Table 1]

Given the nature of the needs Summit attendees identified for college and/or career readiness, the answers to the second open-response question — what type of evidence is needed to measure student success — are not surprising: attendees overwhelmingly identified performance-based evidence. Table 2 shows that more than 90 percent of the responses involved students demonstrating that they had the knowledge, skills and/or non-cognitive attributes to succeed. Just four percent of the responses cited standardized or similar tests (which typically are not performance-based).

[Table 2]

These results clearly reflect what attendees want to measure. However, they also might reflect the impact of the chorus of criticism from voices across the education spectrum about the low-level-skills focus of most assessments and the primacy of measurement ease and economy over measuring what matters. The investment by leading nations in richer, more authentic assessment is frequently cited as a model the U.S. should emulate. Certainly, Summit attendees have heard — and accepted — these messages.

Yet, the results of the 10-item assessment perceptions survey are surprising and highlight challenges to the widespread use of performance assessment to gauge student success. On the one hand, Table 3 shows remarkable unanimity across multiple factors associated with performance assessment: its role in promoting deeper learning and non-cognitive skills, its inclusion in accountability testing and even the ability of educators to create high-quality assessments.

[Table 3]

On the other hand, there was just as much unanimity about the need for professional development to help teachers create and use performance assessment. That need is not surprising, given how rarely performance assessment is used today, and the capacity to use it effectively can be built through professional development. It arises, however, at a time when resources to help educators transition to the CCSS are stretched thin. Moreover, CCSS-driven professional development focuses on content and instruction, not assessment. OPAPP, cited earlier, is a noteworthy exception: its professional development component merges performance-based instruction and assessment.

Dispelling Perceptions

The results for other survey items raise considerable concern and suggest that some support for performance assessment is “soft.” The greatest challenges to the widespread use of performance assessment throughout the school year are the perceptions among at least 40 percent of attendees that it is too time-consuming and that it represents an additional commitment disconnected from the required curriculum. In addition, almost one-third of attendees think human scoring is too subjective for data-driven decision making, and almost one-quarter think performance assessment is less reliable than multiple-choice testing.

We could devote entire articles to each of these perceptions, but only have space for the following brief comments.

  • While performance assessment (both development and scoring) takes time, it is the most effective way to promote and gauge higher-order and non-cognitive skills. By directly measuring student learning and enabling teachers to see student work, it more effectively pinpoints students’ strengths and weaknesses. Proven approaches, along with effective uses of technology, can make performance assessment more efficient than it has been in the past.
  • All too often, curriculum-embedded performance assessment is treated as an irrelevant extra. Rich curricula and effective professional development enable it to play an essential, ongoing role in teaching and learning.
  • Decades of research and high-stakes testing evidence demonstrate the reliability of human scoring. Professional development, training, collaboration, and the use of moderation and auditing can ensure accurate scoring of students’ work products.
  • Contrary to the notion that multiple-choice items — considered “objective” measures — are more reliable than performance assessment, reliability is largely a function of the amount of evidence a test provides. Far fewer constructed-response items, and fewer still performance tasks, can provide the same reliability as a 50-item multiple-choice test.
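The relationship between test length and reliability in the last point can be illustrated with the classical Spearman-Brown prophecy formula from psychometrics. The sketch below is purely illustrative: the per-item reliability (0.15) and the assumption that one performance task supplies as much scoreable evidence as ten multiple-choice items are hypothetical numbers chosen for the example, not findings from the Summit survey.

```python
def spearman_brown(rho: float, length_ratio: float) -> float:
    """Classical Spearman-Brown prophecy formula: predicted reliability
    when test length is multiplied by length_ratio, given reliability
    rho at the original length."""
    return length_ratio * rho / (1 + (length_ratio - 1) * rho)

# Hypothetical reliability of a single multiple-choice item.
single_item_rho = 0.15

# Predicted reliability of a 50-item multiple-choice test.
mc_test_rho = spearman_brown(single_item_rho, 50)

# Suppose (hypothetically) one performance task yields as much scoreable
# evidence as ten multiple-choice items; five such tasks then supply the
# evidence of 50 items and reach the same predicted reliability.
five_tasks_rho = spearman_brown(single_item_rho, 5 * 10)

print(round(mc_test_rho, 3), round(five_tasks_rho, 3))
```

Under these assumptions both figures come out identical (about 0.9), which is the sense in which a handful of evidence-rich performance tasks can match the reliability of a much longer multiple-choice test.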

Addressing these perceptions — even among ostensible supporters — is essential if performance assessment is to achieve its potential in promoting and measuring student success in the digital age.

Peter Hofman is a 15-year veteran at Measured Progress. He has been engaged in market research, marketing/communications, strategic planning, intellectual property matters, public policy, partnerships and various special projects. He currently serves as Vice President for Public Policy and External Relations.

Stuart Kahl has more than 35 years of experience in large-scale assessment. A co-founder of Measured Progress, he has led the company through a period of dramatic growth to its current position as one of the nation’s foremost assessment providers. In 2010, Dr. Kahl was honored with the Association of Test Publishers Professional Contributions and Service to Testing Award for outstanding contributions to the assessment industry.
