THE QUANTITATIVE APPROACH
Recognizing the ludicrous results of the current evaluation system, many researchers and policymakers have called for a more data-driven approach to assessing the performance of individual teachers. Though imperfect, these quantitative measures of teacher quality can dramatically improve today's rubber-stamp evaluation system.
This turn to quantitative assessments is part of a broader shift that, over the past two decades, has changed how researchers and policymakers think about public schools. In the past, education research was dominated by those who believed that schools were too complex and heterogeneous for empirical evaluations of entire systems to hold meaning. Rather than follow the scientific revolutions taking place within other disciplines, researchers in education followed what sociologist Thomas Cook has described as an R&D model based on various forms of management consulting. This "sciencephobia" (Cook's phrase) in education research during the 1980s and '90s left a major void in our understanding of the effectiveness of policies operating within public schools and created a culture in which teachers and school systems were suspicious of quantitative measurement.
Today, however, quantitative researchers — particularly economists — are at the cutting edge of research in education policy. They now hold significant positions within the U.S. Department of Education and are frequently hired to faculty positions at prestigious education colleges. Such quantitative researchers were previously uninterested in education largely because there were no meaningful data to consider. But they were drawn into the education discussion when the boom in standardized testing produced extensive data on student academic achievement. The economists in particular treated education as a production process: Organizations mix inputs (such as curriculum, class size, and teacher quality) in order to produce an output (student proficiency). This worldview values scientific procedures and quantitative measurement.
As the most important school-based input during the learning process, teachers have received considerable attention from this new crop of researchers. If the goal is to improve student learning, it is only logical that differences in teacher quality should be identified and addressed. And if quantitative measures exist to identify those differences, so much the better.
Economists and statisticians have developed just such a quantitative measure of teacher quality: "value added" assessment. This approach uses a statistical model to estimate a teacher's independent contribution to student learning, as measured by standardized-test scores. Value-added measurement (VAM) generally relies on a common statistical technique known as multiple regression. In this case, the regression estimates how observed characteristics of a student, his school, and his teacher relate to changes in his math or reading test scores in a particular year.
Value-added analysis predicts how well a student should perform in a given year based on a series of observable factors that are related to his academic achievement but are beyond a teacher's control, such as race, gender, and family income. For each teacher, the analysis then compares how well his students were expected to perform at the end of the school year, given the characteristics they brought into the classroom, with how the students actually scored in the spring. The teacher's VAM score represents his performance in standard-deviation units relative to the average teacher in the school system; the mean VAM score is set at zero. If a teacher's students tend to outperform expectations on average, the teacher's VAM score will be positive; if his students perform worse than expected given their characteristics, the teacher will receive a negative VAM score.
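The mechanics described above can be sketched in a few lines of Python. Everything here is illustrative: the data are simulated, the two student covariates (a prior test score and a family-income indicator) and all variable names are assumptions for the sketch, and no real district's specification is this simple.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 200 students assigned to 10 teachers (all values hypothetical)
n_students, n_teachers = 200, 10
teacher = rng.integers(0, n_teachers, n_students)

# Observable student characteristics beyond the teacher's control
prior_score = rng.normal(50, 10, n_students)  # last year's test score
low_income = rng.integers(0, 2, n_students)   # family-income indicator

# Generate this year's scores from student traits, an unobserved
# teacher effect, and noise (for simulation only)
true_effect = rng.normal(0, 3, n_teachers)
score = (prior_score + true_effect[teacher]
         - 2 * low_income + rng.normal(0, 5, n_students))

# Step 1: multiple regression of scores on student characteristics,
# yielding each student's predicted (expected) performance
X = np.column_stack([np.ones(n_students), prior_score, low_income])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
predicted = X @ beta

# Step 2: a teacher's raw value-added is the average gap between his
# students' actual and predicted scores
residual = score - predicted
raw_vam = np.array([residual[teacher == t].mean()
                    for t in range(n_teachers)])

# Step 3: express each teacher's score in standard-deviation units
# relative to the system average, which is set to zero
vam = (raw_vam - raw_vam.mean()) / raw_vam.std()
print(np.round(vam, 2))  # positive: beat expectations; negative: fell short
```

A real specification would include many more covariates (and often prior-year scores entered more flexibly), but the logic is the same: predict, compare, and standardize.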
The value-added method requires access to data that follow the test scores of individual students over time and match the students to their teachers. A decade ago, such data simply did not exist. But thanks to the ubiquity of standardized testing imposed by the No Child Left Behind Act, this information should be available in all states and school districts.
The analysis itself is carried out by a data center within the governing education department. In this way, the use of VAM represents an important consolidation of the administrative structure: Under the current system, the on-site principal has nearly complete control over his teachers' evaluations, but a VAM-based system takes a good deal of that discretionary power out of the principal's hands.
The adoption of value-added measurement is still in the very early stages, though the Washington, D.C., school system is already using value-added measurement to evaluate teachers and make employment decisions. The district's IMPACT evaluation tool, adopted under former D.C. schools chancellor and aggressive reformer Michelle Rhee, uses both frequent classroom observations and value-added analysis to identify ineffective teachers. Washington's approach is already rooting out underperformers: The district dismissed 98 teachers this summer based on poor evaluation results, bringing the total number dismissed for poor performance to nearly 400 since 2009.
Washington is the school system furthest along in the process of embracing value-added measurement, but other reform-minded school systems across the nation have recently begun their own experiments. For instance, teachers in Colorado, Nevada, and Tennessee now revert back to probationary (i.e., non-tenured) status if, for two years in a row, they receive poor performance ratings based in part on VAM scores. New York, New Jersey, and Connecticut have also recently passed legislation to use value-added measurements as an important part of teacher evaluations.
Since the use of VAM is so recent, there are not yet enough data to measure the effects of policies that base employment decisions on value-added analysis. By looking back in time, however, we can consider the likely results of VAM-based policies. For instance, in a recent report for the Manhattan Institute, I examined the measured effectiveness of Florida teachers in 2009. This research shows that, had a VAM-based tenure system been in place in 2007, the teachers who would have been removed were far less effective in 2009 than were teachers who would have been retained. This evidence suggests that more widespread adoption of value-added policies has significant potential to accurately identify and remove underperforming teachers from the nation's schools.
The objectivity inherent in quantitative analysis is perhaps the most attractive attribute of value-added assessment. Unlike a principal, who may be swayed by the prospect of dealing with a disgruntled teacher when considering whether to issue a poor evaluation, value-added analysis provides its measure of the teacher's performance entirely irrespective of the possible consequences. The procedure used to evaluate teachers is determined well before the school year begins; the computer used to run the model doesn't care that the teacher is friends with the principal or is well liked by the students. Nor does the analysis consider that the teacher is a whistleblower within the school. All that matters is whether the students in the teacher's classroom appear to be making academic improvement.
Further, by considering the entirety of the student's measured academic growth during the school year, value-added analysis provides a much broader view of the teacher's effectiveness than do brief, infrequent observations. A teacher surely has good and bad days, but value-added assessment evaluates the end product of a long and challenging school year. As the primary concern in judging teachers is determining whether, at the end of the year, students have made academic progress, value-added analysis helps us to focus on what matters most.