Everyone agrees we need high standards for school improvement. But who should set them and how? How will we tell if schools are meeting them? What part can Essential Schools play in this crucial public debate?
Going to school and taking tests has lots in common with an airline pilot’s job, Harvard education professor Dick Elmore has observed: “long periods of excruciating boredom punctuated by intervals of stark terror.” Essential school teachers have been working for years to dispel the “excruciating boredom” by engaging students actively in meaningful work, and to replace the “stark terror” of periodic or irrelevant testing with authentic exhibitions of mastery.
But as Coalition member schools try to make assessment integral to every classroom action, policy efforts to create and enforce higher standards in education are going on in their state capitals, in the U.S. Congress and Cabinet, and in the meeting rooms of professional and scholarly organizations. How will this often confusing array of activity affect the daily routines of schooling? What can Essential schools do to understand and influence the policies that may shape their future?
Beginning in the Bush Administration and now continuing under President Clinton, state and federal policymakers alarmed by U.S. students’ poor standing against their global competitors have pushed for new accountability measures. The 1989 “education summit” in Charlottesville, Virginia gathered 50 governors who agreed on broad goals for school improvement by the year 2000, and a number of states launched ambitious restructuring efforts in its wake. Now an Administration peopled by former governors (from Bill Clinton to Education Secretary Richard W. Riley and Riley’s deputy, Madeleine Kunin) is following hard upon state heels with a plan to define national standards and goals, measure progress toward them, and reward those states that align their policies with them.
Known as the “Goals 2000: Educate America Act,” the current bill creates a structure in which the federal government authorizes major disciplinary groups to articulate content standards in their fields for a National Education Standards and Improvement Council. It urges states and districts voluntarily to develop their own student performance standards and assessments (and later, standards governing “opportunity to learn”), which that Council would approve and certify. And it supplies an unspecified amount of money to support state and local school reform efforts.
Goals 2000 will most likely not withhold other federal educational funds (which fuel much of state education departments’ budgets) if states choose not to go along. But its national education goals gain muscle indirectly as the Elementary and Secondary Education Act (E.S.E.A.) comes simultaneously before Congress for reauthorization, linking distribution of Chapter 1 funds to whether states have aligned their standards for all students with the goals.
Now schools that have been struggling to keep up with their states’ new curriculum frameworks and accountability requirements may for the first time find those policies reinforced by the federal government. At the same time, the political scramble to define “what students should know and be able to do” gets ever more complex and contradictory.
Over a dozen national subject-area associations huddle separately over content standards that outdo each other in sweeping goals; but no mechanism exists to integrate these goals so a teacher can realistically attempt to meet them in the crowded school day. State legislatures and education departments debate “performance outcomes” that define the thinking skills students must master across the disciplines, but few assessments of such skills exist, so schools and students are still rated and selected by the multiple-choice measures of the past.
And many school people express fury at the hypocrisy of a system that holds them accountable for improving student performance without holding political authorities accountable for providing adequate support for schools.
Despite the inherent contradictions in the current political discourse, however, Essential School leaders recognize the importance of working with policymakers to reach their shared objective of more rigorous standards for all American students. “Just as we ask schools to stay with the hard work of doing honest and meaningful student assessment,” says CES Chairman Theodore Sizer, “we must work for the best possible resolution at the national policy level. We may not agree on all the means by which these standards are achieved, but we are committed to staying in the conversation.”
Coalition member schools have a crucial role to play in this discussion, Sizer points out. To start with, they are already engaged in asking themselves what they want of their students and how they will know when they have it. As they pioneer new strategies of curriculum and pedagogy that challenge all students to master essential thinking skills, and as they exhibit the results in new forms of assessment, all eyes will be on them.
“We have a head start on the accountability issue in many ways,” Sizer observes, “because of our long-standing emphasis on public exhibitions of student work before the local community.” And because the Essential School principles stress assessment as a key step in learning, not an ad hoc event that occurs after teaching and learning are done, they bring an important perspective to the national discussion about testing.
Instead of worrying about whether their actions will fit whatever new policy shoe is poised to fall, Coalition leaders say, Essential schools should lobby hard for their states to give them latitude, support, and a significant voice in the setting of state standards. If they succeed, the state can help them take a giant step forward, out of isolation and into a network of shared resources, shared philosophy, and substantive rewards from college admissions to funding partnerships.
“The time has never been more ripe for Essential schools,” declares Sherry King, superintendent of the Croton-Harmon school district in New York’s Hudson River Valley. “A lot of legislation out there is supporting what we believe in.” New York’s educational Board of Regents, indeed, has approved a “New Compact for Learning” that embraces the Essential School philosophy in virtually all its aspects.
But as other education departments write their own prescriptions for state-by-state reform, Coalition member schools might look hard at the assumptions that underlie those standard-setting procedures. The big issue now is not whether but how state and national standards and assessments will drive the schools of the future. The critical questions affecting everyone from the first-year teacher through the highest level of policymakers revolve around who defines and develops those standards, by what means student progress toward them is assessed, and how the resulting information is to be used.
Who Will Set Standards?
Tensions are already evident between teacher-generated and “top down” efforts to define standards and between subject-area emphases and cross-disciplinary ones. In a flagship effort at standard-setting that won high praise for its participatory process, the National Council of Teachers of Mathematics (NCTM) recast its K-12 curriculum, putting inquiry and problem-solving at the center of how students learn math. Schoolteachers worked closely with university professors and mathematicians on developing the curricular standards and providing practical guidelines for teachers on how to put them into place. Actual classroom change has been creaky and laborious, mostly because the NCTM standards represent a dramatic shift from how almost every U.S. student (though not those in other countries) has learned math in this century. The new standards literally require a fundamental re-education of teachers and parents: an early warning, many critics believe, to those who think new standards can change schools without substantial investment of time and resources.
In eight other subject areas (science, history, geography, civics, English, foreign languages, the arts, and physical education), professional and scholarly organizations are currently at work on standards of their own. Many of these efforts are funded at varying levels by the U.S. Department of Education, and they will probably define the federal government’s approved “content area standards.” Some of them include teachers in their process; others keep their distance from the messy realities of the classroom.
Also at the federal level, a very different set of standards has been put forward by the Labor Department’s Secretary’s Commission on Achieving Necessary Skills (SCANS) report. This document, which takes a cross-disciplinary perspective, defines the workplace skills high school graduates need to succeed in the new economy, and it reflects on means to assess those skills. Though not technically part of the Goals 2000 initiative, it has already played a substantial role in the national discourse on standard-setting.
The same tensions inherent in these few examples show up as state departments of education struggle to come up with their own new definitions of student success and ways to measure it. Some states include teachers early on; others start with high-level task forces and lobby for general acceptance later on. Some states start by defining subject-area “curriculum frameworks”; others cross the disciplines to describe “learning outcomes” in terms of thinking skills. All the while, the states and the federal education department are eyeing each other warily, wondering if the plans of one may pose a threat to the other.
How to Assess Students?
The answers may hang, in the end, on what means those intent on school accountability use to assess student progress. Again, an array of possibilities presents itself, from the top to the bottom of the educational bureaucracy.
The original Bush plan to include a national test in the legislation (which was then dubbed “America 2000”) met with so much protest that it was dropped; but many policymakers still hope that test-makers will invent a way to “calibrate” state tests so that student scores will be comparable across the country. A number of states are putting their money into developing or buying new tests that emphasize open-ended questions, problem-solving strategies, essays, and other “performance” tasks. This, reformers hope, will encourage teachers to spend more class time on practicing the higher-order skills that old-style tests ignore: in-depth reading and discussion, research and invention, solving complex problems, initiating projects, and creating original work.
California, for instance, recently started giving all students in grades 4, 8, and 10 in several subjects its California Learning Assessment System (CLAS) tests, a combination of open-ended, in-depth questions and “enhanced multiple-choice” questions that require more complex thinking. The tests, which are teacher-scored, rate students on a criterion-referenced proficiency scale rather than using the old-style norm-referenced system.
In addition, California is encouraging teachers to use portfolios as an ongoing, classroom-based performance assessment. One pilot program has teachers in grades K-12 learning to document student progress across the curriculum by means of the California Learning Record, an adaptation of the observation-based British Primary Language Record that has won high praise for its usefulness with multilingual populations and its involvement of parents and students.
New York’s New Compact for Learning calls for a combination of classroom-based and state-assisted assessment that uniquely reflects Essential School thinking. (See sidebar 1.) The state would work closely with schools to support local development of assessment programs that fit each school’s goals and curriculum and to provide access to a bank of assessment tasks and instruments. Graduation would follow completion of a “Regents Portfolio” that exhibits competency across the curriculum as defined in the state’s list of desired “learning outcomes.” Along with Central Park East Secondary School, Urban Academy, and University Heights High School in New York, six new “Coalition Campus” high schools begun in New York City this year already specify “graduation by exhibition” of specific cross-disciplinary tasks.
The Kentucky Instructional Results Information System (KIRIS) concentrates on assessing six instructional goals: basic skills, core concepts, self-sufficiency, group membership, problem solving, and integration of knowledge. Each year all students in grades 4, 8, and 12 must take conventional multiple-choice tests; but they are also assessed using classroom-based performance tasks and portfolios of their best work. In addition, teachers of students in other grades are urged to incorporate continuous performance assessments into daily activities.
Connecticut’s Common Core of Learning outlines what the state’s high school graduates should know and be able to do, and since 1989 its education department has been working on math and science performance assessments that reflect these goals. They rely largely on tasks embedded in the curriculum, which last from a few days to a week or more; students design and carry out investigations including gathering data, solving problems, and presenting their work orally and in writing. Teachers are coached in how to prepare and score students, and the state also annually evaluates a random sample of eleventh graders against the goals of the Common Core of Learning.
Vermont’s ambitious new assessment system requires all teachers in grades 4 and 8 to keep portfolios of students’ best work in mathematics and writing. Teacher-run regional networks train them in selecting appropriate tasks and scoring them according to a finely tuned rubric; visiting committees validate each school’s scores by rescoring random portfolios. The state supplements this information with a conventional test with which to compare districts’ progress.
Meanwhile, whether or not their state is involved in redrafting its standards and expectations, many reform-minded districts and schools have taken the lead in experimenting with classroom-based alternative assessments like portfolios, projects, and exhibitions.
In Pittsburgh, the ARTS PROPEL portfolio program, which is linked with the Performance Assessment Collaboratives for Education (PACE) group at Harvard University, assesses student progress in writing and the arts by means of “processfolios,” as Harvard’s Dennie Palmer Wolf has dubbed them. These include not only a student’s best work but the drafts that led up to it, adding to the folder the critical element of student self-reflection. Another pioneer of district-level performance assessment is San Diego, where the CES-affiliated O’Farrell Community School is a leader.
And in many individual Essential school classrooms, teachers have similarly begun embedding ongoing assessment tasks in the curriculum itself, rather than treating them as separate from the rest of teaching and learning. Projects, exhibitions, and demonstrations all offer windows into student progress that can inform curricular decisions on an individual and a program level.
Which Tests Do What
Amid all this experimentation, states are continually looking for better methods of holding their schools accountable to the larger community, ranking and comparing one to the other within the state and even across the nation. And even states in the midst of reform, it seems, like to get their data for such comparisons from standardized tests (either conventional or “performance-based”) that they give either to every student or to a statistical sample selected at random. At the heart of any decision to use one kind of assessment instrument over another is how the results are scored.
Conventional standardized tests are commonly scored using a norm-referenced system, in which a score shows how one student did on the test relative to some other group of test-takers answering the same questions. The point is to provide a general gauge, usually in the area of basic skills, of individual achievement and school success.
Conventional testing continues to be widespread in many places. Such tests as the Metropolitan Achievement Test, the Iowa Test of Basic Skills, and the Stanford Achievement Test are required for Chapter 1 funding. School districts in many states use them for other sorting and selecting purposes as well, and also as their principal accountability device; local newspapers often report schools’ average scores, for example.
An important disadvantage of norm-referenced tests is that the data are often ambiguous or even meaningless when one tries to interpret them. For example, using average test scores to decide whether a school is showing improvement in its program has a fundamental flaw, as Columbia’s Linda Darling-Hammond and others point out. A school’s average overall performance can “improve” simply by juggling who takes the test, by labeling more students for special education to count them out, for instance. And it can “decline” if a school has a sudden influx of children who need help the most (non-English speakers, for example), no matter how good a job it may be doing.
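The arithmetic behind this flaw is easy to check. The sketch below uses invented scores (they are not data from any real school) and a hypothetical helper, `school_average`, to show how the same group of students can yield a higher or lower school average depending only on who is counted.

```python
from statistics import mean

# Hypothetical scores, invented for illustration; not data from any real school.
scores = [35, 40, 55, 60, 70, 80, 90]

def school_average(all_scores, excluded=frozenset()):
    """Average over the students who actually sit the test."""
    tested = [s for i, s in enumerate(all_scores) if i not in excluded]
    return mean(tested)

# Everyone is tested.
all_students = school_average(scores)
# The two lowest scorers are counted out; nobody has learned anything new.
without_lowest_two = school_average(scores, excluded={0, 1})
# Three newcomers who need extra help arrive; teaching quality is unchanged.
after_influx = school_average(scores + [20, 25, 30])

print(round(all_students, 1), without_lowest_two, after_influx)
```

In this made-up case the average rises from about 61 to 71 when two students are excluded, and falls to 50.5 after the influx, although no individual student's performance has changed.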
A still more serious disadvantage is that teaching to such tests in order to boost scores can actually hurt students’ real intellectual achievement, since the test format emphasizes isolated low-level skills.
Advocates of performance-based testing call instead for criterion-referenced scoring, which measures how individual students do on specific tasks. Instead of reporting, for example, that the average eighth grader scored 60 percent on a standard test of written English, a criterion-referenced test might reveal that 60 percent of eighth graders did not use evidence well in writing an explanatory essay.
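The difference between the two scoring systems can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the comparison group, the four-point rubric, and the cutoff are hypothetical and do not come from any actual test.

```python
def percentile_rank(score, norm_group):
    """Norm-referenced: percent of a comparison group scoring below `score`."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)

def share_not_meeting(rubric_scores, cutoff):
    """Criterion-referenced: percent of students below a fixed rubric level."""
    missed = sum(1 for s in rubric_scores if s < cutoff)
    return 100 * missed / len(rubric_scores)

# Hypothetical comparison group of ten test-takers.
norm_group = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
student_rank = percentile_rank(60, norm_group)

# Hypothetical 1-4 rubric scores for "uses evidence well" in an essay.
evidence_scores = [1, 2, 2, 3, 1, 2, 4, 3, 3, 1]
pct_missing = share_not_meeting(evidence_scores, cutoff=3)

print(student_rank, pct_missing)
```

The first number says only where a student stands relative to others; the second says how many students fell short of a stated standard on a specific task, which is the kind of information a teacher can act on.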
One proponent of this approach is the New Standards Project, a heavily funded research effort involving eighteen states and headed by Lauren Resnick at the University of Pittsburgh and Marc Tucker at the National Center on Education and the Economy. Working with the national disciplinary organizations, the project aims to fashion a national examination system that allows for significant flexibility and discretion at the state and local level.
What Is at Stake
Once student performance can be quantified, the results can be used in many ways, not all of them consistent with the original goals of performance-based assessment. Some states reward teachers and schools when their students meet performance standards and impose sanctions when they don’t. In many states, test scores still serve as a means to track students into ability groups they will stay in for years. In Kentucky, teachers can lose their jobs and schools can lose funding when they don’t come up to snuff. This “high-stakes” reasoning is tantamount, Darling-Hammond charges, to ranking hospitals by their mortality rates. “They would get rid of their AIDS units, they would get rid of their cardiac care units, and everyone would go into pediatrics in order to get their statistics up,” she says.
One antidote to possible abuses of this kind, critics suggest, would be to include other factors with any reports of schoolwide scores. Rhode Island, for example, has begun reporting a school profile that includes extensive information about the school budget, the teaching staff’s experience, the community’s socio-economic status, and other influential educational inputs. (This practice can be dangerous, however, observes CES’s Bob McCarthy, because it appears to condone and perpetuate lower expectations for disadvantaged schools.)
Vermont’s portfolios in writing and mathematics, as well as Connecticut’s performance tasks, are meant chiefly to provide instant feedback to their teacher-scorers on what areas require additional classroom emphasis. How well an individual student does on these “low-stakes” assessments will not determine the future success or failure of the student or the school, but only what direction his or her classroom work will take.
Back in the classroom, however, many wonder whether any standards and assessments generated higher up the line can really help local teachers design effective instruction. “Any state’s framework is impotent to affect practice unless it is designed as a continuous learning opportunity for teachers, not just for kids,” says Joe McDonald, who directs the Coalition’s Exhibitions Project. “The main purpose of all assessment systems has to be to generate local energy and invention in the interests of all kids’ achievement.”
On this side of the philosophical fence are Theodore Sizer and other Coalition leaders, who question both the cost and the effectiveness of standardized testing, which removes the classroom teacher from the work of performance assessments.
“Do we need to create a teacher-proof assessment system?” asks Croton-Harmon superintendent Sherry King. “For one thing, these things are horrendously expensive to develop. And whenever teachers get packets of performance-based assessments from somewhere else, they’re in danger of teaching in a formulaic way, just like they did to the old-style tests. You might as well give them workbooks!”
Ted Sizer frames his objection along different lines. Every community, he argues, has its own deep-rooted values that show up in its school curriculum, reflecting what local people want their graduates to know and be able to do. The tradition of local authority over schooling rests on the principle, he says, that those most affected (parents, community members, and the like) can influence what their children are taught and tested on. Except in three areas he calls relatively free of the “value clash” (clear writing, resourceful reading, and everyday mathematical reasoning), he believes the state should steer clear of the testing and curriculum business. Instead, states could insist that schools maintain good files on every student and could regularly audit those files against state standards, providing richer and fairer data at far less expense than mass testing entails. And they could oblige every school to present its own assessments to public scrutiny every year, publishing an annual report and holding a public meeting to defend it.
A few states are trying to incorporate such attitudes into their efforts to define standards and test student performance. New York’s New Compact for Learning, for instance, favors teacher ownership of the assessment process, what officials term “top-down support for bottom-up reform.” Policies lean toward the most individual of evaluation processes: getting the school community to review its own standards by opening itself to the scrutiny of its key stakeholders. Central Park East Secondary School, for instance, last year invited a committee of local college and university professors, business people, and others to review its graduation portfolios and answer whether the school’s standards lived up to their expectations.
California’s school restructuring movement has devised a protocol that asks each school’s teachers to scrutinize its student work and discuss together whether it meets their own standards. “We ask the school to use what they learn in assessing their own students’ work to drive change in their own curriculum and pedagogy,” says Maggie Szabo, a former Coalition regional coordinator who now heads the state’s school restructuring effort. “It recasts accountability as a positive thing, not a big stick from the outside.”
Instead of putting federal and state money into developing new tests, Sizer contends, government should help school people by exposing them to many examples of rigorous student work that could influence their own thinking: what assessment expert Grant Wiggins has called “standards without standardization.” One proposed Coalition project aims to develop a national electronic communications network that would link teachers with colleagues from other schools, professional organizations, employers, higher education professionals, and outside stakeholders to share and critique (using a specific “tuning protocol”) what they are asking students to do. The Exhibitions Collection database already under way at CES would be a part of this; so would innovations like a multimedia “digital portfolio,” which keeps track of student progress toward graduation requirements through not only written work but audiovisual presentations on line. (A future issue of HORACE will explore further the uses of technology for assessment and other Essential School purposes.)
Reorienting the conversation about rigorous standards so that its primary locus is the school and community, Sizer suggests, gives it a chance to influence the real problems schools struggle with: too little depth in the curriculum, not enough time for meaningful learning, scarce resources to help teachers develop their craft. “The discussion can’t take place in a vacuum,” he says. “It will be influenced by state requirements, by subject-area people, and perhaps more than anything by the expectations of colleges.” But the federal government, he believes, should play the role of persuader, not dictator. The current movement runs the risk, one recent commentator noted, of becoming a conversation among governments: another dictum from on high, complete with new curricula, new tests, new red tape, and a whole new set of hazards.
Unquestionably, Essential schools occupy a key position right now to affect the decisions being made at higher levels. Already in the midst of substantive dialogue about their expectations and experimentation with new methods of assessment, when successful they can stand as powerful examples of a different paradigm of schooling. “If there are places where local assessments can work, your chances are more than doubled to put them into place system-wide,” observes Judy Bray, a systems coordinator at the Education Commission of the States. “I believe that our dedicated network of school folk can change the system if they will take a public stand, in support of or dissent to the standards under development in their own state.”