Thursday, April 10, 2014

Let the VAM Lawsuits Begin: Issues and Concerns with Their High-Stakes Use

Lawsuits against states using value-added models in making teacher evaluation decisions have begun in earnest. There are now three lawsuits underway challenging the use of this controversial statistical methodology and the use of test scores to determine teacher effectiveness. This increase in litigation is an indication both of how rapidly states have adopted the practice and of how these same states failed to address so many issues and concerns with the use of VAMs in this manner.

Two lawsuits have now been filed in Tennessee against the use of value-added assessment, known as TVAAS, as part of teacher evaluation. The first lawsuit was filed against Knox County Schools in Tennessee by the Tennessee Education Association on behalf of an alternative school teacher who was denied a bonus because of her TVAAS ratings. (See “Tennessee Education Association Sues Knox County Schools Over Bonus Plan”.) In this case, the teacher was told she would receive system-wide TVAAS estimates because of her position at an alternative school, but 10 of her students were used anyway in her TVAAS score, resulting in a lower rating and no bonus. This lawsuit contests the arbitrariness of TVAAS estimates that use only a small number of a teacher’s students to determine overall effectiveness.

In the second lawsuit, filed against Knox County Schools as well as Tennessee Governor Bill Haslam, state Commissioner of Education Kevin Huffman, and the Knox County Board of Education, an eighth grade science teacher claims he was also unfairly denied a bonus after his TVAAS value-added rating was based on only 22 of his 142 students. (See “TEA Files Second Lawsuit Against KCS, Adds Haslam and Huffman as Defendents”.) Again, the lawsuit points to the arbitrariness of the TVAAS ratings.
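
To get a feel for why a rating built on 10 students, or on 22 of 142, invites this kind of challenge, here is a minimal sketch of how the uncertainty in a simple class average shrinks as more students are counted. This is not the TVAAS model, which is far more complex and proprietary. The function name `standard_error` and the spread of 10 points in student gains (`ASSUMED_SD`) are assumptions made up purely for illustration, and the calculation treats students as independent of one another.

```python
import math


def standard_error(score_sd: float, n_students: int) -> float:
    """Standard error of a class-average gain, assuming independent students."""
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return score_sd / math.sqrt(n_students)


# Hypothetical spread of individual student gains (NOT a TVAAS figure).
ASSUMED_SD = 10.0

# Student counts mentioned in the two Tennessee lawsuits, plus the full roster.
for n in (10, 22, 142):
    se = standard_error(ASSUMED_SD, n)
    # 1.96 standard errors is the usual approximate 95% margin for a mean.
    print(f"{n:>3} students: standard error = {se:.2f}, 95% margin = +/- {1.96 * se:.2f}")
```

Under these assumptions, an average based on 22 students is roughly two and a half times less precise than one based on all 142, whatever spread you plug in, because the standard error scales with one over the square root of the number of students.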

A third lawsuit has been filed in Rochester, New York, by the Rochester Teachers Association, alleging that officials in that state “failed to adequately account for the effects of severe poverty, and as a result, unfairly penalized Rochester teachers on their Annual Professional Performance Review,” or yearly teacher evaluations. (See “State Failed to Account for Poverty in Evaluations”.) While it appears that this Rochester suit is disputing the use of growth score models, not value-added models, it also challenges the whole assumption, and the recent fad being pushed by politicians and policymakers, of using test scores to evaluate teachers.

North Carolina jumped on the value-added bandwagon in response to US Department of Education coercion, and the state now uses its version of TVAAS, called EVAAS (Education Value-Added Assessment System), as part of teacher and principal evaluations. Fortunately, no districts have had to make high-stakes decisions using the disputed measures, so the lawsuit floodgates haven't opened in our state yet, but I am sure that once EVAAS is used to make decisions about employment, the lawsuits will begin. When they do, the American Statistical Association has outlined some likely areas of contention in its ASA Statement on Using Value-Added Models for Educational Assessment. Here are some points from that position statement that clearly outline the questions about using VAMs, a highly questionable statistical methodology, in teacher evaluations.
  • “VAMs [value-added models] are complex statistical models, and high-level statistical expertise is needed to develop the models and interpret their results.” States choosing to use these models are trusting third-party vendors to develop them and provide the ratings, and they are expecting educators to effectively interpret those results. Obviously, there’s so much that can go wrong with the interpretation of VAM results that the ASA is warning of the need for people who have the expertise to interpret them. I wonder how many of the states that have implemented these models have spent time and money training teachers and administrators to interpret these results, other than subjecting educators to one-time webinars or "sit-n-gets"?
  • “Estimates of VAM should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. THESE LIMITATIONS ARE PARTICULARLY RELEVANT IF VAMS ARE USED FOR HIGH STAKES PURPOSES (emphasis mine).” I can’t speak for other states, but in North Carolina there has been little to no disclosure or discussion about the limitations of value-added data. There’s been more public relations, advertising, and promotion of the methodology as a new way of evaluating educators. They even have SAS promoting the methodology for them. The Obama administration has done this as well. The attitude in North Carolina seems to be, “We’re gonna evaluate teachers this way, so deal with it.” There needs to be discussion and disclosure about SAS’s EVAAS model and the whole process of using tests to evaluate teachers in North Carolina. Sadly, that’s missing. I can bet it’s the same in other states too.
  • “VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.” In other words, VAMs only tell you how students do on standardized tests. They can’t tell you all the other many, many ways teachers contribute to students’ lives. The main underlying assumption with using VAMs in teacher evaluations is that only test scores matter, regardless of what supporting policymakers say. While it's true that the North Carolina evaluation model does include other standards, how long will it take administrators and policymakers to ignore those standards and zero in on test scores because they are seen as the most important? The adage "What gets tested, gets taught!" is true, and "What gets emphasized the most through media and promotion matters the most" is equally true. When standard 6 or 8 is the only standard on the educator evaluation where an educator is "In Need of Improvement," then you can bet test scores suddenly matter more than anything else.
  • “VAMs typically measure correlation, not causation: Effects---positive or negative---attributed to a teacher may actually be caused by other factors that are not captured in the model.” There are certainly many, many things (poverty, lack of breakfast, runny noses) that can contribute to a student’s test score, yet those pushing VAMs in teacher evaluations believe that a teacher directly causes a test score to happen. Their biggest assumption is that the teacher's job, in whole or in part, is the production of test scores. In reality, teaching is so much more complex than that, and those reducing it to a test score have probably not spent much time teaching themselves.
  • “Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of the opportunities for quality improvement are found in system-level conditions.” Yet in most states, educational improvement falls almost entirely on the backs of educators in the schools in the form of VAM-powered teacher evaluations. There's little effort to improve the system. There’s no effort to improve classroom working conditions, fund professional development, or adequately fund materials and resources. Instead of looking at how the system prevents excellence and innovation with its top-down mandates and many other ineffective measures, many states, including North Carolina, along with the Obama administration, place accountability entirely and squarely on the backs of educators in the classrooms and schools. If the education system is broken, you don't focus on parts, you improve the whole.
  • “Ranking teachers by their VAM scores can have unintended consequences that reduce quality.” If all learning that is important can be reduced to a one-time bubble-sheet test, then all is well for VAM and the ranking of teachers. But every educator knows that tests measure only a minuscule portion of important learning. Many important learning experiences can't even be measured by tests. But if you elevate tests in a high-stakes manner, then those results become the most important outcome of the school and the classroom. The end result is teaching to the test and test prep, where the test becomes the curriculum. Getting high test scores becomes the goal of teaching. If that’s the goal of teaching, who would want to be a teacher? Elevating test scores through VAM will only escalate the exit of teachers from the profession and discourage others from entering it, because there's nothing fulfilling about improving student test scores. We didn't become educators to raise test scores; we became educators because we wanted to teach kids.
  • “The measure of student achievement is typically a score on a standardized test, and VAMs are only as good as the data fed into them.” Ultimately, VAMs are only as good as the tests administered to provide the data that feeds the model. If tests don’t adequately measure the content, or if they are not standardized or otherwise of high quality, then the VAM estimates are of equally dubious quality. The same holds when states scramble to create tests on the fly instead of developing quality tests. North Carolina scrambled to create multiple tests in many high school, middle, and elementary subjects just to have data to feed its EVAAS model. Yet those tests, the process of their creation and field testing, and even how they’re administered make them questionable candidates for serious VAM use. VAMs require high-quality data to provide high-quality estimates. The idea that "any-old-test-will-do" is anathema to VAMs, which require quality test data.
The American Statistical Association position statement on using value-added models in educational assessment makes some supporting statements about their use too. They can be effectively used as part of the data teachers use to adjust classroom teaching. But when a state does not return those scores until October or later, three months into the school year, it's impossible to use that data to inform teaching. Also, just getting a rating does little to inform teaching. Testing provides an opportunity for policymakers to provide teachers with valuable data to improve teaching. Sadly, the data currently provided is too little and too late.

As the VAM-fed teacher evaluation craze continues to grow, it is important for all educators to inform themselves about the controversial statistical practice. It is not a methodology without issues, despite what the Obama administration and state education leaders say. Being knowledgeable about it means understanding its limitations as well as how to properly interpret and use such data. Don't wait for states and the federal government to provide that information: they are too busy promoting its use. The points made in the American Statistical Association’s Statement on Using Value-Added Models for Educational Assessment are excellent points of entry for learning more.
