The paradox of uniformity
Nearly a year ago I posted “Danielson Framework criticized by Charlotte Danielson” and it has generated far more interest than I would have anticipated. As of this writing, it has been viewed more than 130,000 times. It has been shared across various platforms of social media, and cited in other people’s blogs. The post has generated copious comments, and I’ve received dozens of emails from educators — mostly from North America but beyond too. Some educators have contacted me for advice (I have little to offer), some merely to share their frustration (I can relate), others to thank me for speaking up (the wisdom of which remains dubious). To be fair, not everyone has been enthusiastic. There have been comments from administrators who feel that Charlotte Danielson (and I) threw them under the school bus. Many administrators are not devotees of the Framework either, and they are doing their best with a legislatively mandated instrument.
Before this much-read post, I’d been commenting on Danielson and related issues for a while, and those posts have received a fair amount of attention also. Literally every day since I posted about Danielson criticizing the use of her own Framework, the article has been read by at least a few people. The hits slowed down over the summer months, understandably; then picked up again in the fall — no doubt when teachers were confronted with the fact it’s their evaluation year (generally every other year for tenured teachers). Once people were in the throes of the school year, hits declined. However, beginning in February, the number of readers spiked again and has remained consistently high for weeks. Teachers, I suspect, are getting back their evaluations, and are Googling for information and solace after receiving their infuriating and disheartening Danielson-based critique. (One teacher wrote to me and said that he was graded down because he didn’t produce documentation that his colleagues think of him as an expert in the field. He didn’t know what that documentation would even look like — testimonials solicited in the work room? — and nor did I.)
It can tear the guts out of you and slacken your sails right when you need that energy and enthusiasm to finish the school year strong: get through student testing (e.g., PARCC), stroke for home on myriad learning outcomes, prepare students for advancing to the next year, and document, document, document — all while kids grow squirrelier by the minute with the advance of spring, warmer weather, and the large looming of year’s end.
But this post isn’t about any of that, at least not directly. The Danielson Framework and its unique failures are really part of a much larger issue in education, from pre-K to graduate school: something which I’ll call the drive for uniformity. I blame Business’s infiltration and parasitic takeover of Education. It’s difficult to say exactly when the parasite broke the skin and began its pernicious spread. I’ve been teaching (gulp) since 1984 (yes, English teachers were goofy with glee at the prospect of teaching Nineteen Eighty-Four in 1984, just as I was in 2001 to teach 2001 — we’re weird like that), and even then, in ’84, I was given three curriculum guides with precisely 180 pages in each; I was teaching three different courses, and each guide had a page/lesson for each day of the school year. Everyone who was teaching a particular course was expected to be doing the same thing (teaching the same concept, handing out the same handout, proctoring the same test) on the same day.
Not every school system was quite so prescriptive. I moved to another district, and, thankfully, its curriculum was much less regimented. Nevertheless, it was at that school that I vividly recall sitting in a faculty meeting and the superintendent uttering the precept “We shall do more with less.” The School Board, with his encouragement, was simultaneously cutting staff while increasing curricular requirements. English teachers, for example, were going to be required to assign twelve essays per semester (with the understanding that these would be thoroughly read, commented on, and graded in a timely fashion). At the time I had around 150 students per day. With the cuts to staff, I eventually had nearly 200 students per day. This was the mid 1990s.
The point is, that phrase — We shall do more with less — comes right out of the business world. It’s rooted in the idea that more isn’t being achieved (greater productivity, greater profits) because of superfluous workers on the factory floor. We need to cut the slackers and force everyone else to work harder, faster — and when they drop dead from exhaustion, no problem: there are all those unemployed workers who will be chomping at the bit to get their old job back (with less pay and more expectations). CEOs in the business world claimed that schools were not doing their jobs. The employees they were hiring, they said, couldn’t do math, couldn’t write, had aversions to hard work and good attendance. It must be the fault of lazy teachers, the unproductive slackers on the factory floor so to speak.
Unions stood in the way of the mass clearing of house, so the war on unions was initiated in earnest. Conservative politicians, allied with business leaders, have been chipping away at unions (education and otherwise) wherever they can, under the euphemism of “Right to Work,” implying that unions are preventing good workers from working, and securing in their places lazy ne’er-do-wells. The strategy has been effective. Little by little, state by state, protections like tenure and seniority have been removed or severely weakened. Mandates have increased, while funds have been decreased or (like in Illinois) outright withheld, starving public schools to death. The frustrations of stagnant wages, depleted pensions, and weakened job security have been added to by unfair evaluation instruments like the Danielson Framework.
A telltale sign of business’s influence is the drive for uniformity. One of the selling points of the Danielson Framework was that it can be applied to all teachers, pre-K through 12th grade, and even professionals outside the classroom, like librarians and nurses. Its one-size-fits-all is efficient (sounding) and therefore appeals to legislators. Danielson is just one example, however. We see it everywhere. Teaching consultants who offer a magic bullet that will guarantee all students will learn, no matter the subject, grade level, or ability. Because, of course, teaching kindergarteners shapes is the same as teaching high school students calculus. Special education and physical education … practically the same thing (they sound alike, after all). Art and band … peas in a pod (I mean, playing music is a fine art, isn’t it? Duh.).
And the drive for uniformity has not been limited to K-12 education. Universities have been infected, too. All first-year writing students must have the same experience (or so it seems): write the same essays, read the same chapters in the same textbook, have their work evaluated according to the same rubric, etc., etc. Even syllabi have to be uniform: they have to contain the same elements, in the same order, reproduce the same university policies, even across departments. The syllabus for a university course is oftentimes dozens of pages long, and only a very small part of it is devoted to informing the students what they need to do from week to week. The rest is for accreditation purposes, apparently. And the uniformity in requirements and approaches helps to generate data (which outcomes are being achieved, which are not, that kind of thing).
It all looks quite scientific. You can generate spreadsheets and bar graphs, showing where students are on this outcome versus that outcome; how this group of students compares to last year’s group; make predictions; justify (hopefully) expenditures. It’s the equivalent of the much-publicized K-12 zeal for standardized testing, which gives birth to mountains of data — just about all of which is ignored once produced, which is just as well because it’s all but meaningless. People ignore the data because they’re too busy teaching just about every minute of every day to sift through the voluminous numbers; and the numbers are all but meaningless because they only look scientific, when in fact they aren’t scientific at all. (I’ve written about this, too, in my post “The fallacy of testing in education.”)
But this post isn’t about any of those things either.
It’s about the irony of uniformity, or the paradox of it, as I call it in my title. Concurrent with the business-based drive for uniformity has been the alleged drive for higher standards: more critical thinking, increased expectations, a faster track to skill achievement. Yet uniformity is the antithesis of higher standards. We’re supposed to have more rigor in our curricula, but coddle our charges in every other way.
We can’t expect students to deal with teachers who have varying classroom methods. We can’t expect them to adjust to different ways of grading. We can’t expect them to navigate differences in syllabi construction, teacher webpage design, or even the use of their classroom’s whiteboard. We can’t expect students to understand synonyms in directions, thus teachers must confine themselves to a limited collection of verbs and nouns when writing assignments and tests (for instance, we must all say “analyze” in lieu of “examine” or “consider” — all those different terms confuse the poor darlings). This is a true story: A consultant who came to speak to us about the increased rigor of the PARCC exam also advised us to stop telling our students to “check the box” on a test, because it’s actually a “square” and some students may be confused by looking for the three-dimensional “box” on the page. What?
But are these not real-world critical-thinking situations? Asking students to adapt to one teacher’s methodology versus another? Requiring students to follow the logic of an assignment written in this style versus that (or that … or that)? Having students adjust their schoolwork schedules to take into account different rhythms of due dates from teacher to teacher?
How often in our post-education lives are we guaranteed uniformity? There is much talk about getting students “career-ready” (another business world contribution to education), yet in our professional careers how much uniformity is there? If we’re dealing with various customers or clients, are they clones? Or are we expected to adjust to their personalities, their needs, their pocketbooks? For that matter, how uniform are our superiors? Perhaps we’re dealing with several managers or owners or execs. I’ll bet they’d love to hear how we prefer the way someone else in the organization does such and such, and wouldn’t they please adjust their approach to fit our preferences? That would no doubt turn into a lovely day at work.
I’ve been teaching for 33 years, and over that time I’ve worked under, let’s see, seven building principals (not to mention different superintendents and other administrators). Not once has it seemed like a good idea to let my current principal know how one of his predecessors handled a given situation in the spirit of encouraging his further reflection on the matter. Clearly I am the one who must adapt to the new style, the new approach, the new philosophy.
These are just a few examples of course. How much non-uniformity do we deal with every day, professionally and personally? An infinite amount is the correct answer. So, how precisely are we better preparing our students for life after formal education by making sure our delivery systems are consistently cookie-cutter? We aren’t is the correct answer. (Be sure to check the corresponding squares.)
Education has made the mistake of allowing Business to infect it to the core (to the Common Core, as a matter of fact). Now Business has taken over the White House, and it’s taken over bigly.
But this blog post isn’t about that.
The fallacy of testing in education
For the last several years education reformers have been preaching the religion of testing as the linchpin to improving education (meanwhile offering no meaningful evidence that education is failing in the first place). Last year, the PARCC test (Partnership for Assessment of Readiness for College and Careers) made its maiden voyage in Illinois. Now teachers and school districts are scrambling to implement phase II of the overhaul of the teacher evaluation system begun two years before by incorporating student testing results into the assessment of teachers’ effectiveness (see the Guidebook on Student Learning Objectives for Type III Assessments). Essentially, school districts have to develop tests, kindergarten through twelfth grade, that will provide data which will be used as a significant part of a teacher’s evaluation (possibly constituting up to 50 percent of the overall rating).
To the public at large — that is, to non-educators — this emphasis on results may seem reasonable. Teachers are paid to teach kids, so what’s wrong with seeing if taxpayers are getting their money’s worth by administering a series of tests at every grade level? Moreover, if these tests reveal that a teacher isn’t teaching effectively, then what’s wrong with using recently weakened tenure and seniority laws to remove “bad teachers” from the classroom?
Again, on the surface, it all sounds reasonable.
But here’s the rub: The data generated by PARCC — and every other assessment — is all but pointless. To begin with, the public at large makes certain tacit assumptions: (1) The tests are valid assessments of the skills and knowledge they claim to measure; (2) the testing circumstances are ideal; and (3) students always take the tests seriously and try to do their best.
But none of these assumptions are true most of the time — and I would go so far as to say that all of them being true for every student, for every test practically never happens. In other words, when an assessment is given either the assessment itself is invalid, and/or the testing circumstances are less than ideal, and/or nothing is at stake for students so they don’t try their best (in fact, it’s not unusual for students to deliberately sabotage their results).
For simplicity’s sake, let’s look at the PARCC test (primarily) in terms of these three assumptions; and let’s restrict our discussion to validity (mainly). There have been numerous critiques of the test itself that point out its many flaws (see, for example here; or here; or here). But let’s just assume PARCC is beautifully designed and actually measures the things it claims to measure. There are still major problems with its data’s validity. Chief among the problems is the fact that there are too many factors beyond a district’s and — especially — a classroom teacher’s control to render the data meaningful.
For the results of a test — any test — to be meaningful, the test’s administrator must be able to control the testing circumstances to eliminate (or at least greatly reduce) factors which could influence and hence skew the results. Think about when you need to have your blood or urine tested — to check things like blood sugar or cholesterol levels — and you’re required to fast for several hours beforehand to help ensure accurate results. Even a cup of tea or a glass of orange juice could throw off the process.
That’s an example that most people can relate to. If you’ve had any experience with scientific testing, you know what lengths have to be gone to in hopes of garnering unsullied results, including establishing a control group — that is, a group that isn’t subjected to whatever is being studied, to see how it fares in comparison to the group receiving whatever is being studied. In drug trials, for instance, one group will receive the drug being tested, while the control group receives a placebo.
Educational tests rarely have control groups — a group of children from whom instruction or a type of instruction is withheld to see how they do compared to a group that’s received the instructional practices intended to improve their knowledge and skills. But the lack of a control group is only the beginning of testing’s problems. School is a wild and woolly place filled with human beings who have complicated lives, and countless needs and desires. Stuff happens every day, all the time, that affects learning. Class size affects learning, class make-up (who’s in the class) affects learning, the caprices of technology affect learning, the physical health of the student affects learning, the mental health of the student affects learning, the health of the teacher affects learning (and in upper grades, each child has several teachers), the health and circumstances of the student’s parents and siblings affect learning, weather affects learning (think “snow days” and natural disasters), sports affect learning (athletes can miss a lot of school, and try teaching when the school’s football or basketball team is advancing toward the state championship), ____________ affects learning (feel free to fill in the blank because this is only a very partial list).
And let me say what no one ever seems to want to say: Some kids are just plain brighter than other kids. We would never assume a child whose DNA renders them five-foot-two could be taught to play in the NBA; or one whose DNA makes them six-foot-five and 300 pounds could learn to jockey a horse to the Triple Crown. Those statements are, well, no-brainers. Yet society seems to believe that every child can be taught to write a beautifully crafted research paper, or solve calculus problems, or comprehend the principles of physics, or grasp the metaphors of Shakespeare. And if a child can’t, then it must be the lazy teacher’s fault.
What is more, let’s look at that previous sentence: the lazy teacher’s fault. Therein lies another problem with the reformers’ argument for reform. The idea is that if a student underachieves on an exam, it must be the fault of the one teacher who was teaching that subject matter most recently (i.e., that school year). But learning is a synergistic effect. Every teacher who has taught that child previously has contributed to their learning, as have their parents, presumably, and the other people in their lives, and the media, and on and on. But let’s just stay within the framework of school. What if a teacher receives a crop of students who’d been taught the previous year by a first-year teacher (or a student teacher, or a substitute teacher who was standing in for someone on maternity or extended-illness leave), versus a crop of students who were taught by a master teacher with an advanced degree in their subject area?
Surely — if we accept that teaching experience and education contribute to teacher effectiveness — we would expect the students taught by a master teacher to have a leg up on the students who happened to get a newer, less seasoned, less educated teacher. So, from the teacher’s perspective, students are entering their class more or less adept in the subject depending on the teacher(s) they’ve had before. When I taught in southern Illinois, I was in a high school that received students from thirteen separate, curricularly disconnected districts, some small and rural, some larger and more urban — so the freshman teachers, especially, had an extremely diverse group, in terms of past educational experiences, on their hands.
For several years I’ve been an adjunct lecturer at University of Illinois Springfield, teaching in the first-year writing program. UIS attracts students from all over the state, including from places like Chicago and Peoria, in addition to students from nearby rural schools, and everything in between (plus a significant number of international students, especially from India and China). In the first class session I have students write a little about themselves — just answer a few questions on an index card. Leafing through those cards I can quickly get a sense of the quality of their educational backgrounds. Some students are coming from schools with smaller classes and more rigorous writing instruction, some from schools with larger classes and perhaps no writing instruction. The differences are obvious. Yet the expectation is that I will guide them all to be competent college-level writers by the end of the semester.
The point here, of course, is that when one administers a test, the results can provide a snapshot of the student’s abilities — but it’s providing a snapshot of abilities that were cured by uncountable and largely uncontrollable factors. How, then, does it make sense (or, how, then, is it fair) to hang the results around an individual teacher’s neck — either Olympic-medal-like or albatross-like, depending?
As I mentioned earlier, validity is only one issue. Others include the circumstances of the test, and the student’s motivation to do well (or their motivation to do poorly, which is sometimes the case). I don’t want to turn this into the War and Peace of blog posts, but I think one can see how the setting of the exam (the time of day, the physical space, the comfort level of the room, the noise around the test-taker, the performance of the technology [if it’s a computer-based exam like the PARCC is supposed to be]) can impact the results. Then toss in the fact that most of the many exams kids are (now) subjected to have no bearing on their lives — and you have a recipe for data that has little to do with how effectively students have been taught.
So, are all assessments completely worthless? Of course not — but their results have to be examined within the complex context they were produced. I give my students assessments all the time (papers, projects, tests, quizzes), but I know how I’ve taught them, and how the assessment was intended to work, and what the circumstances were during the assessment, and to some degree what’s been going on in the lives of the test-takers. I can look at their results within this web of complexities, and draw some working hypotheses about what’s going on in their brains — then adjust my teaching accordingly, from day to day, or semester to semester, or year to year. Some adjustments seem to work fairly well for most students, some not — but everything is within a context. I know to take some results seriously, and I know to disregard some altogether.
Mass testing doesn’t take into account these contexts. Even tests like the ACT and SAT, which have been administered for decades, are only considered as a piece of the whole picture when colleges are evaluating a student’s possible acceptance. Other factors are weighed too, like GPA, class rank, teacher recommendations, portfolios, interviews, and so on.
What does all this mean? One of the things that it means is that teachers and administrators are frustrated with having to spend more and more time testing, and more and more time prepping their students for the tests — and less and less time actually teaching. It’s no exaggeration to say that several weeks per year, depending on the grade level and an individual school’s zeal for results, are devoted to assessment.
The goal of assessment is purported to be to improve education, but the true goals are to make school reform big business for exploitative companies like Pearson, and for the consultants who latch onto the movement remora-like, for example, Charlotte Danielson and the Danielson Group; and to implement the self-fulfilling prophecy of school and teacher failure.
(Note that I have sacrificed grammatical correctness in favor of non-gendered pronouns.)