This is probably a case where the research wasn't translated well for those of us who are familiar with the subject. A measurement researcher, Walter Stroup, says the TAKS test in Texas is flawed because the item response theory (IRT) model it employs "is more sensitive to how it ranks students than to measuring what they have learned." Okay, but that's nothing new -- we've known that about IRT for a long time. So what details are hiding between the lines of this article that support Stroup's research?