At the Chalk Face: If Economists Studied Education Research, Would They Still Promote Value-Added Evaluations?
Vergara v. California seeks to strike down the due process rights of that state’s teachers. The case rests on the plaintiffs’ opinion that legal protections for teachers damage the civil rights of poor children of color. The evidence for this extraordinary argument consists largely of the opinions of a few administrators, mostly trained by the corporate-reform Broad Academy, and of a few economists who conduct regression analyses of test score growth.
A careful reading of the work of economists like Tom Kane, Raj Chetty, and their colleagues has always raised questions about whether their basic research is appropriate for these real-world policy issues. The idea that such academic research would justify striking down the duly enacted laws of California seems outrageous. But we live in “the Big Sort,” the extreme self-segregation of contemporary America. It is easy for very smart people who have no contact with the realities of inner-city schools to profoundly misunderstand what those schools are like.
So, I was intrigued by Raj Chetty’s testimony in Vergara. I have nothing but respect for the quality of his economic research; I just can’t understand why it is relevant to that teacher-bashing case. Three of his statements (two on the witness stand and one in the text of his major study) seem especially illustrative of apparent misunderstandings about the logistics of schools in the inner city.
As I explained previously, the most likely explanation for why Kane’s and Chetty’s studies produce different teacher effects is the different samples they use. The effects of teachers whom they determine to be “highly ineffective” are likely to be greatly influenced by the number of harder-to-educate students who are excluded from their samples. (When samples include more of the most challenging students, the estimated learning loss due to the least effective teachers is likely to be greater. Common sense indicates that much of the differential outcome is due to greater learning loss from factors beyond the control of teachers.)
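The sampling point above can be illustrated with a toy simulation. This is my own sketch, not the economists’ actual model; every number, name, and growth rate here is invented purely to show the direction of the bias. When the hardest-to-educate students are dropped from each class, the weakest-looking classrooms look far less bad, even though the teachers are identical:

```python
import random

random.seed(0)

# Invented setup: 100 teachers with true effects, 25 students per class.
# Each class's measured score growth mixes the teacher's effect with a
# larger student-level factor that is outside the teacher's control.
teacher_effects = [random.gauss(0, 3) for _ in range(100)]

def class_growth(effect, exclude_hardest):
    students = [random.gauss(0, 10) for _ in range(25)]
    if exclude_hardest:
        # Drop the five hardest-to-educate students, as a sample
        # exclusion rule might do.
        students = sorted(students)[5:]
    return effect + sum(students) / len(students)

full_sample = [class_growth(t, exclude_hardest=False) for t in teacher_effects]
trimmed = [class_growth(t, exclude_hardest=True) for t in teacher_effects]

# Average measured growth rises once the hardest students are trimmed,
# so the apparent "learning loss" in the weakest classrooms shrinks.
print(sum(full_sample) / 100, sum(trimmed) / 100)
```

With these made-up numbers, the measured growth of every teacher improves once the hardest students are excluded, which is the direction of the sampling bias described above.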
Another disturbing part of this issue, however, is Chetty et al.’s rationale, in “The Long-Term Impacts of Teachers,” for excluding classes where more than 25% of the students are on IEPs. They assumed that those classes would have co-teachers. That might be true in New York City elementary classes, but it is an exclusion that severely undermines their work when it comes to measuring the effectiveness of secondary teachers in Los Angeles and elsewhere, and the effects of firing them.
Non-educators might believe that classes with such numbers of special education students are rare, so that their exclusion would not compromise the practical value of the research in other venues. I, for one, never had a year without classes where 25 to 40% of my regular students were on IEPs, not to mention others who were English Language Learners or on probation for extreme felonies. By the end of my career, most of my IEP students suffered from conduct disorders, severe emotional disabilities, or mental illness, not just cognitive disabilities. So such an assumption by these economists raises the question of why they agreed to their methodology being applied to Los Angeles in order to defend claims about a real-world policy for firing individuals, many of whom face challenges as tough or tougher than I did.
As he approached the heart of his expert testimony in Vergara, Chetty raised a second question about the extent of sorting in public schools. On the witness stand, he was not reluctant to acknowledge the importance of sorting outside of schools when wrestling with the question of teachers’ ability to raise test scores. Then, several times, he acknowledged that there are unobserved in-school factors that could be relevant when measuring teachers’ effectiveness. He was not shy about mentioning class size and technology as possible in-school factors over which teachers have no control, but he seemed to avoid the words “sorting” and “peer pressure” as unmeasurable in-school (as opposed to out-of-school) factors. The big exception was his discussion, prompted by the judge’s questioning, of his graphic about highly effective teachers leaving schools.
Under questioning, Chetty agreed that these highly effective teachers were leaving schools where the overall value-added (for the grades studied) was declining. He admitted that the very top teachers were more likely to respond to peer pressure and leave schools that are declining. Moreover, when the overall faculty has less success in raising student outcomes, that can make it more difficult for individuals to do so. So, teachers want to teach well, he speculated, and they are more likely to leave a school in decline. This statement was the bookend of another acknowledgment of the importance of peer effects of teachers on teachers – the arrival of highly effective teachers helps their colleagues become more effective.
Such a statement, on its face, is more supportive of the defendants’ case than the plaintiffs’. (Punishing an individual teacher due, in part, to the effects of his peers is problematic, especially if the goal is attracting new talent to hard-to-staff schools.) I was especially intrigued, however, by Chetty’s willingness to admit to sorting and peer effects on teachers while refusing to acknowledge the importance of in-school sorting and peer pressure on students.
Common sense (and my professional experience) says that the importance of student sorting and peer pressure dwarfs the importance of peer pressure of teachers on each other.
I must emphasize that Chetty is not alone in this bifurcation. One of the original arguments for value-added accountability was that economists found as much variation in teacher effects within schools as between schools. For a practitioner, that is a “well, duh” finding. Of course the extreme diversity within schools has as much of an effect on in-school effectiveness as out-of-school diversity does. The conditions in some classes, halls, and grades are worlds apart from the conditions in other parts of the same building. (An 11th grade teacher who thinks he faces the same problems with the same students as a 9th grade teacher needs a reality check. I taught 10th grade, which sat between the two extremes.)
As Chetty’s testimony came to an end, he noted that the arrival of a high value-added middle school math teacher did not increase the value-added of English language arts (ELA) classes as much as that of math classes, and vice versa. He asserted that nobody has countered that claim.
I certainly won’t dispute that argument. I started to wonder, however, whether non-educators understand that there is a long history behind the very different challenges of increasing math and reading scores. So, I must emphasize, I do not want to prejudge Chetty’s personal understanding of this issue. Instead, I want to ask how many economists are aware of how and why a variety of school improvement efforts have been stymied by the tougher challenge of raising reading scores, as opposed to math scores.
If Chetty or any other economists are unaware of this pattern, that does not affect their academic research. But Vergara seeks to make it easier to fire all teachers, not just math teachers. So why would economists who devise value-added models testify in a case that could subject teachers of the classes that require the most reading to value-added evaluations, even though those evaluations would be less predictive of those teachers’ effectiveness?
Chetty even made me wonder how many economists testifying in Vergara are aware of the “Matthew Effect.” After all, the single biggest problem with value-added evaluations may be rooted in that dynamic. Maybe economists are ignoring the Matthew Effect because they do not understand why it is so crucial.
So, I would remind Chetty that “The Long-Term Impacts of Teachers” notes that “variation in teacher effects is about 50 percent greater in math compared with reading, but the long-term association of teacher-induced gains with future outcomes (e.g., earnings, college attendance) are larger for reading vis-à-vis math.” That is consistent with the long-acknowledged difficulty of improving the outcomes of students who do not read for comprehension by 3rd grade. Cognitive scientists have shown that children who “learn to read” for comprehension by 3rd grade can then “read to learn.” Their outcomes continue to rise even in summer, when they are out of school. The upward trajectory continues throughout school and, presumably, after graduation (raising post-school earnings).
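A toy calculation, with made-up growth rates chosen purely to illustrate the shape of the Matthew Effect dynamic described above, shows how children who gain over summers pull away from children who lose ground, even when both start from the same score:

```python
# All growth rates here are invented, purely illustrative numbers.
reader, non_reader = 100.0, 100.0  # equal scores entering 3rd grade

for grade in range(3, 13):  # grades 3 through 12
    reader *= 1.08          # school-year growth for a child who "reads to learn"
    reader *= 1.02          # keeps gaining over the summer
    non_reader *= 1.05      # slower school-year growth
    non_reader *= 0.99      # slides back over the summer

# The gap compounds every year, through no fault of the later teachers.
print(round(reader), round(non_reader))
```

Under these invented rates, a modest annual difference compounds into a wide gap by 12th grade, which is why later teachers of struggling readers face a steeper climb to any fixed growth target.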
It is far harder to raise the performance of children who do not read for comprehension well enough to read to learn. Consequently, good teachers in low-performing schools are likely to have lower value-added than equally good teachers in high-performing schools. This helps explain why value-added estimates are unfair to teachers of classes with high percentages of poor and special education students, and why they are most unfair to teachers with high percentages of English Language Learners.
It is harder for teachers of subjects that require reading comprehension to meet test score growth targets, and that dynamic is likely to grow more pronounced in the upper grades. Advocates of value-added evaluations may be less sensitive to the importance of this pattern because they have barely tried to research value-added in high schools. So, I wonder if these experts would have testified so assertively if they had wrestled with this pattern.
Chetty, Kane, and other expert witnesses are assisting in an all-out assault on teachers’ most basic rights. I disagree with them, but I can see why they would believe that their research is relevant to 3rd through 8th graders in math and, to a lesser degree, elementary reading classes. But, even though they have not studied high schools, they are participating in an effort to also destroy the rights of high school teachers.
And nothing in their research could possibly support the opinion that, once current laws are struck down, data-driven evaluations in non-tested subjects would likely benefit students in those classes. Up to 80% of students are in classes that remain virtually unstudied by value-added researchers. Yet these experts are so confident in their opinions – based on their goal of addressing the bottom 5% of teachers – that they are helping a legal campaign (based almost completely on the opinions of some like-minded persons) to strike down duly enacted laws.
Of course, I would also like to understand why a few corporate reformers are so convinced of the righteousness of their opinions that they have initiated this assault on teachers. But I’ve already gone too far down the path of speculating on why they engage in such overreach. I just hope the Vergara judge has the inclination to look deeply into both the testimony of the expert witnesses and how it differs from the evidence and logic they have presented in written documents.
This blog post has been shared by permission from the author.
Readers wishing to comment on the content are encouraged to do so via the link to the original post.
Find the original post here:
The views expressed by the blogger are not necessarily those of NEPC.