Educational policy critics now gather, with echoes of the funeral oration for Caesar, not to praise the No Child Left Behind Act, but to bury it. That this legislation passed eight years ago with more Democratic than Republican votes is now only a little-remembered footnote. The conventional wisdom is that the age of accountability will be, as the 21st century enters its second decade, an age of scarcity. Schools will do the best they can with limited resources, but on the whole, we certainly should not expect very much. The expectations of large-scale reforms, with all students meeting standards by 2014, will be eclipsed by diminished but realistic assessments of the limitations of schools and the children that they serve. This line of reasoning concludes with the inevitable chorus of “I told you so” from those who abhorred academic standards and the assessments that accompanied them. They pine for the halcyon days of local control, with the locality coveting independence ranging from a state board of education to an individual classroom. In either case, the result is not freedom from control, but merely the freedom to exercise a different sort of control. Thank goodness, they conclude, that we will no longer suffer from federal intrusions in the classroom.
I dissent from these views with the following three arguments. First, the mistakes and successes in educational accountability during the past decade offer not an argument against standards, assessment, and accountability, but rather insights into how to improve these essential instruments of public policy. Second, the new federal economic stimulus legislation that is being signed as this article goes to press expands rather than contracts the federal role in education. Third, there is an historic opportunity to embrace a holistic view of accountability, encompassing not only measurements of student achievement, but also assessments of teachers, administrators, policymakers, and systems that claim to support education. Let us first consider the successes and failures of accountability in the past decade.
Success 1: Standards-Referenced Assessments
During the 2008 political campaigns, No Child Left Behind had few defenders from any political quarter. Liberals said that it was inadequately funded, conservatives said that it was overly intrusive, and a nearly universal chorus joined with refrains of “over testing” and “teaching to the test.” Standards themselves, though in use in all 50 states without any discernible link to the decline of Western civilization, are relentlessly maligned as the work of “Standardistas” (Ohanian, 1999). A full consideration of the evidence, however, includes both hits and misses, and we learn as much from the former as from the latter.
The first and most important success of both the law and common practice of the past decade is the prevalence of standards-referenced assessments. In almost every accountability system I have studied since the dawn of the standards movement, new accountability systems replaced norm-referenced tests that were disconnected from the classroom curriculum. Indeed, the nature of norm-referenced tests is not to reflect a specific curriculum, but rather to provide a wide range of difficulty so that students can be more accurately placed on the normal distribution, or the bell curve. Unlike a standards-referenced test, such as the ones we administer to teenage drivers and airline pilots, the norm-referenced tests were not designed to help us understand the effectiveness of teaching, curriculum, or leadership. Rather, they only provided the assurance, at great expense and purported accuracy, that close to half of our children were above average. While there are standards-referenced tests that are deeply flawed – they are too long, poorly matched to the curriculum, and insufficiently varied in format – they are dramatically better than their norm-referenced predecessors if our goal is to understand the impact of an educational system on student performance (Reeves, 2001).
In 1994, fewer than a dozen states had approved academic standards with assessments explicitly linked to those standards. A decade later, every state had achieved this, though in widely varying levels of quality and consistency. While opposition to national standards borders on the virulent, a growing number of voices, including some of those in the Obama Administration, are suggesting that in almost every school system in the land the square of the hypotenuse is equal to the sum of the squares of the other two sides of a right triangle. This remarkable claim remains true even when political authorities object to any bureaucrat named Pythagoras inflicting his curriculum ideas on what should rightfully be a matter subjected to the vote of the duly elected school board. Academic standards and the tests associated with them, it turns out, are not the result of a New World Order cabal, but simply a rational and fair way to educate students.
The change in state tests to those based explicitly on openly available academic content standards has also had a salutary impact on classroom assessment. A growing movement toward assessment literacy (Ainsworth & Viegut, 2006; Stiggins, Arter, Chappuis & Chappuis, 2004; Stiggins, 2000) led to classroom assessments that were linked explicitly to common academic standards rather than the personal preferences of teachers and textbook authors.
Success 2: Teacher and Leadership Impact on Achievement
The research literature provides a growing number of sustained and consistent cases of improved student performance, particularly among poor, minority, and second language students. Chenoweth (2007) is only the latest in a string of studies that document success in the most challenging of environments, joining Haycock (2001), Reeves (2004a, 2004b, 2009), and others. This research is only possible because researchers had access to comprehensive accountability data, with assessments that were based on the state standards. Most importantly, the research not only shows that higher achievement is possible, but also links specific decisions of teachers, administrators, and policy makers to those gains. Darling-Hammond (2000) and Sanders & Rivers (1996) provide compelling evidence of the central role of teaching quality in student achievement. That a disproportionate number of poor and minority students continue to perform poorly compared to their economically advantaged majority peers is a function of our persistent maldistribution of teaching quality (Yun & Moreno, 2006), not the inability of these students to succeed.
Success 3: Holistic Accountability
Defying the stereotype of feckless school administrators gaming federal and state requirements in order to accomplish as little as possible, school leaders and teachers in Virginia, Indiana, Wisconsin, and California not only embraced the test-based accountability system required by federal law, but also created other indicators that provided public accountability for senior administrators, board members, teachers, and even parents (Reeves, 2002). These systems have provided a treasure trove of insights not only for researchers, but also for practitioners. In fact, the virtuous cycle of direct observation of student performance using locally available data, along with systematic analysis of the adult decisions associated with changes in student performance, is far more likely to influence teacher decision-making than the abstractions of research presented in the typical undergraduate and graduate education courses (Reeves, 2008).
Failure 1: Proficiency Fantasies
When the entity held accountable for a standard of performance is the same entity that sets the standard, then a redefinition of success is inevitable. Just as my AARP magazine proclaims that “sixty is the new thirty,” some states appear to have declared that “woeful is the new proficient.” While notable exceptions, such as Virginia, have increased their requirements for student proficiency, many others have lowered the bar. The sanctions associated with schools labeled as in need of improvement can be onerous, and thus states have every incentive to engage in the fantasy that the path to proficiency lies not in effective teaching and leadership, but in redefining what “proficient” means. The same states have standards for hygiene in restaurants, safety in the workplace, and speed limits on the highways not because these standards are universally observed, but because the public interest is better served with higher than lower standards. Imagine the outcry if a state public health department were to seek to inflate the official ratings of the restaurants that it inspected by declaring that mice and cockroaches were acceptable residents in a “proficient” kitchen. However offensive and absurd this suggestion may seem, that is precisely what would happen if federal legislation were to impose multimillion-dollar sanctions on states that were unable to bring all restaurants up to prevailing health standards and then gave the states themselves the ability to define what “proficient” hygiene in the kitchen really means.
Failure 2: Reverse Robin Hood Effects
On a recent trip to Cuba, I noticed that the Workers’ Paradise of one of the last Communist regimes on earth had strayed a bit from the revolutionary promises of taking from the rich to give to the poor. With physicians, engineers, and professors paid somewhat less, by state decree, than those who roll cigars and dramatically less than those who work in the tourist industry and thus have access to hard currency, the economic incentives are clear. Don’t be a sap who saves lives, builds infrastructure, or prepares the next generation of leaders, but make more with a busload of vapid tourists in a day than you might make in a season of your assigned occupation. Thus are services and brainpower systematically diverted from the poor to the rich. In the U.S. education system, we do the same thing with the sanctions surrounding underperforming schools. We know that these schools need, more than anything, expert teaching and leadership. The current system of perverse incentives delivers the opposite. When a school has been restructured under federal guidelines, it is not unusual for the staff to include a disproportionate number of new teachers (in one case I have observed, 100%), led by a first-year principal. While a few systems have provided economic incentives for teachers and leaders to take on such a challenge, the most common personnel allocation system can best be described as guaranteed institutionalized failure. The most experienced and capable teachers and leaders are not malicious, but neither are they self-destructive. Accepting a posting to a failing school risks their financial security, professional reputation, and perhaps their personal safety. Every rational incentive is against their helping the students who need them the most.
These teachers and school leaders have families and mortgages and college bills, and their most likely career decisions will lead them farther and farther away from schools with the greatest need, leaving these schools to be staffed by their least experienced colleagues.
Failure 3: Elevation of Effects Over Causes
Imagine that the new Surgeon General of the United States announced a program to combat teenage obesity. His plan, announced with great fanfare and supported with multibillion-dollar funding, would be to install a scale in every classroom in America. The scales would be very sophisticated – lots of electronic displays and automated analysis – but the information that they would provide would include only the weight gains and losses of the students. When students lose weight, the program would be declared a success, but neither schools nor students would know if the weight losses were caused by programs of diet and exercise or were the result of drug abuse and eating disorders. No matter how sophisticated the scales, this myopic focus on effects – the measurement of weight loss – rather than causes would not help students improve their health, even if it did encourage them to lose weight. In the context of education, our ability and willingness to measure effects (reading and math scores) far outstrips our ability and willingness to measure causes (the specific actions of teachers, administrators, and policymakers that are associated with student achievement). When federal rewards and sanctions are associated with effects rather than causes, we should not find this surprising.
The New Federal Role in Education
The federal economic stimulus package, signed into law by President Obama on February 17, 2009, includes more federal spending for education in a single year than has been previously available over the course of two presidential terms. The central question facing the federal government now is whether this investment will be focused on programs or practices. Programs – the proprietary products of vendors – are by definition dependent upon a continuous stream of funding. Technology licenses must be renewed, equipment must be updated, subscriptions must be continued, and the frequently changing faculty and leadership of schools must be retrained. As a result, programs are inherently not sustainable, and thus the history of education reform is littered with programs that were associated more with enthusiastic announcements, inspirational rhetoric, and imperious mandates than with long-term sustainability. Practices, by contrast, represent the working culture of teachers and leaders – “the way we do things around here” – and are sustainable not because of outside resources but because of internal commitment.
The new federal role in education must focus on practices rather than programs. The acid test question of sustainability is, “If there is no money and no mandate, will this practice continue?” If it is associated with externally procured programs, the answer is invariably negative. If it is associated with internal capacity – intellectual property owned, controlled, and perpetuated by educators and school leaders – then sustainability is possible.
New Opportunities in Accountability – The Implementation Audit
What do we need to do in order to get accountability right? Three modest suggestions might be appropriate for consideration by Congress, the Department of Education, state policy makers, and district level leaders. First, allocate one percent of expenditures for an “implementation audit,” a rigorous process in which we ask the simple question, “Are the funds we invested used as we intended?” This question is different from inquiries such as “Were the teachers trained?” or “Were the boxes delivered?” or “Were the computers connected?” The essential questions have to do with the degree of implementation, and the answer invariably falls on a continuum of at least four levels, from exceptional to adequate to progressing to unacceptable. The other essential question has to do with the relationship of implementation to student achievement. Millions of dollars have been invested in programs that were not implemented – even our most cursory examinations reveal unopened boxes of materials that were “delivered” but never used, and unending workshops that were delivered with great fanfare but never put into practice. More millions of dollars have been invested in programs that were implemented, but which were not associated with improved student achievement. It is the continuing triumph of belief over evidence, a victory that must be challenged in a new era of educational accountability.
In sum, school leaders need not wait for a directive from Washington, D.C. to improve accountability. They can embrace a comprehensive vision, consider causes as well as effects, and create incentives that are consistent with the intent of policy makers. We can learn from our successes and failures, preserving the best of standards-based education and avoiding the worst parts of effects-based accountability. The children we serve deserve no less.
Ainsworth, Larry & Viegut, Donald. Common formative assessments: How to connect standards-based instruction and assessment. Corwin Press, 2006.
Chenoweth, Karin. It’s being done: Academic success in unexpected schools. Harvard Education Press, 2007.
Darling-Hammond, Linda. Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1), 1-50, 2000.
Haycock, Kati. Dispelling the myth revisited. The Education Trust, 2001.
Ohanian, Susan. One size fits few: The folly of educational standards. Heinemann, 1999.
Reeves, Douglas. If you hate standards, learn to love the bell curve. Education Week, 48, June 6, 2001.
Reeves, Douglas. Holistic accountability: Serving students, schools, and community. Corwin Press, 2002.
Reeves, Douglas. Accountability for learning: How teachers and school leaders can take charge. Association for Supervision and Curriculum Development, 2004a.
Reeves, Douglas. Accountability in action: A blueprint for learning organizations (2nd ed.). Advanced Learning Press, 2004b.
Reeves, Douglas. Reframing teacher leadership to improve your school. Association for Supervision and Curriculum Development, 2008.
Reeves, Douglas. Level-Five Networks: Making Significant Change in Complex Organizations. In A. Hargreaves & M. Fullan (Eds.), Change Wars. Solution Tree, 2009.
Sanders, William & Rivers, June. Cumulative and residual effects of teachers on future student academic achievement. University of Tennessee Value-Added Research and Assessment Center, 1996.
Stiggins, Rick. Student-involved classroom assessment (3rd ed.). Prentice Hall, 2000.
Stiggins, Rick, Arter, Judith, Chappuis, Jan, & Chappuis, Stephen. Classroom assessment for student learning: Doing it right, using it well. Assessment Training Institute, 2004.
Yun, John & Moreno, José. College access, K-12 concentrated disadvantage, and the next 25 years of education research. Educational Researcher, 35(1), 12-19, January-February, 2006.