How do student evaluations survive?

Among the few replicable findings from research on higher education, one of the most notable is that student evaluations of teaching are both useless as measures of the extent to which students have learned anything and systematically biased against women and people of color. As this story says, reliance on these measures could lead to lawsuits.

But why hasn’t this already happened? The facts have been known for years, and potential cases arise every time these evaluations are used in hiring or promotion decisions: arguably, every time the data are collected. And student evaluations are particularly popular in the US, where litigation is the national sport. Yet, AFAICT, no lawsuits have taken place.

Maybe the zeitgeist is changing. I was going to write this post before seeing the linked article, which turned up in my Google search. Any lawyers or potential litigants want to comment?

17 thoughts on “How do student evaluations survive?”

  1. Who relies on these to hire academics? Aren’t academics hired on the strength of their research or potential (or for knowing someone who can see this potential)?
    Who cares about the surveys, even when students complain? Who are the departments answerable to? Isn’t it easy enough to appease a student, for those men running the show?
    How is a poor woman, or a group of them, going to prove a bias against her/them?
    On the strength of the surveys?
    There are so many ‘surveys’ out there that are biased (eg Frydenberg’s citizenship vs another poor person’s, who deserves more sympathy yet gets none). Most of us don’t live in a fair world, and so we try not to dwell on it too much, or let ourselves be additionally abused trying to get justice (since there is no such thing).

  2. The subjectivity of these evaluations may make them useless to third parties, but that does not mean that teachers should not make an effort to find out what their students think of their work.

  3. And what about a young, handsome academic who is very good at teaching versus an old professor who is perhaps not a good teacher and has no passion for it? The first will have his future somewhat determined by the second, even though the students will likely give great feedback to the first, and no one will know what feedback the second gets (there will be bias here too, as the second will likely ‘pass’ only on the strength of his age and title).

  4. Believe it or not, Australian university managements do rely on these things to make decisions about hiring and firing.

    To give a personal experience of what is wrong with them: some years ago I was convening a course with a relatively small enrolment. I caught two of the students blatantly plagiarising their main assignment, and the resultant sanctions under the university’s academic integrity management system meant that they failed the course. However, as long as they remained enrolled they were able to complete an evaluation of my teaching. Probably not coincidentally, at the end of the semester I found that two of the evaluation responses gave me the worst possible rating on all of the Likert scale questions. In the context of the overall relatively modest sample size this significantly skewed my overall score.

  5. As it happens, a few years ago I interviewed a woman who, as a student activist at Sydney University and then as an office-bearer of NUAUS/AUS in the late 1960s and early 1970s, was an active proponent of the introduction of student evaluations of courses and teaching. Her aim, and that of her peers, was that the evaluations should be used as a basis for improving courses, teaching and learning. She was seriously displeased when I told her about the misuses of them in the modern academy.

  6. Unfortunately there will never be a perfect way of implementing these sorts of surveys, but they are necessary in my opinion. The unreasonable practice of forcing researchers to teach often results in poor classroom experiences for students, particularly when PhD candidates are allocated teaching responsibilities so universities can save money. But besides that, it is important to use these surveys to learn how to improve courses and to evaluate teachers’ performance.

    If numerous studies already show that there is consistent racism and sexism in survey responses, then it is not hard for the university to calibrate the survey results according to whether the feedback was given to a person of colour, and according to gender. Calibration processes are widely used, whether in scaling the HSC results of schools that vary in the difficulty of their teaching, or in the commercial world, where managers’ performance reviews of their team members are calibrated for the managers’ “harshness”.

  7. In my opinion, using surveys as a way of improving courses and evaluating teachers’ performance is necessary. If research suggests there is consistent bias against people of colour or against one gender, then the student feedback given to them should be calibrated accordingly. Calibration is a common tool, whether in re-scaling HSC scores between schools that vary in the difficulty of their teaching, or in the commercial world, where managers’ ratings of their team members are re-scaled depending on the “harshness” of the rater.
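
    A rough sketch, in Python, of the sort of re-scaling I have in mind (purely illustrative: the groups, numbers and bias model are all invented, and this is not any university’s actual procedure):

        # Minimal sketch of group-based calibration: if scores for one group of
        # teachers are known to run systematically low, map each group's scores
        # onto a common mean and spread before comparing across groups.
        # All data below are made up for illustration.
        from statistics import mean, stdev

        def calibrate(scores_by_group):
            """Re-scale each group's scores to the overall mean and spread (z-score style)."""
            all_scores = [s for scores in scores_by_group.values() for s in scores]
            target_mu, target_sd = mean(all_scores), stdev(all_scores)
            adjusted = {}
            for group, scores in scores_by_group.items():
                mu = mean(scores)
                sd = stdev(scores) or 1.0  # guard against zero spread
                adjusted[group] = [round(target_mu + (s - mu) / sd * target_sd, 2)
                                   for s in scores]
            return adjusted

        # Hypothetical example: similar teaching, but one group is rated about
        # half a point lower on a 5-point scale.
        raw = {"group_a": [4.2, 4.5, 3.9, 4.4], "group_b": [3.7, 4.0, 3.4, 3.9]}
        print(calibrate(raw))

    Of course, this only removes the average difference between groups; it says nothing about whether the scores measure anything worthwhile in the first place.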

  8. Tom, the problem isn’t just that they are biased (which we might be able to correct for, as you say) but that they are absolutely useless. They don’t measure anything that correlates with student learning or outcomes. The reason they are retained is simply because administrators prefer something quantifiable, even if it is spurious, to nothing at all.

  9. My university certainly monitors student feedback. I suspect what matters more is the change in student evaluation scores over time. Usually, an academic will take over a unit for several years, and the student evaluations for a unit coordinator are often strikingly different for good vs bad teachers. So yeah, people can certainly be hired and fired on the basis of student evaluations, and they are. For better or worse, the evaluations are probably necessary and provide a metric we can compare against past performance.

    There are a few issues I have noticed that are of real concern. The first is for tutors and demonstrators: if they have a couple of disgruntled students, those students are motivated to leave a bad evaluation. The second is that student expectations can be ‘a bit whack’; for example, we give comprehensive written and oral feedback on most assignments and directly tell students “you are getting feedback”, yet the evaluations often complain about a lack of feedback. There’s literally not much you can do! Thirdly (and this is pretty contentious), I have seen suggestions that some academics do odd things with units when they are trying to get a promotion and artificially boost the evaluation score. Contentious, because ultimately these academics put a lot of effort into teaching to get that promotion, so the students are arguably better off with the extra attention.

  10. Oops, to follow up my previous comment. If a couple of disgruntled students leave a bad evaluation, but the happy students don’t, you end up with a rather skewed result.

  11. @Neil

    Many things can be adjusted in the survey results, or improvements can be made to the surveys themselves, including for bias, disgruntled students, low response rates, or courses that offer simple material to appease students. Statistical outliers can and should also be analysed. To say that these problems exist and therefore the surveys are useless is pretty much to say that almost all, if not all, surveys are useless (unless you think every other type of survey is perfect). The important thing is to know the deficiencies and make improvements. If you know of a better way for students to leave feedback, other than a survey, that would be free of all the above-mentioned problems, then I’m interested to know.

    To be clear, I’m not saying these surveys are perfect, but rather that they can be improved depending on how much effort you want to put into it; the same applies to everything else. A survey is just a method of data collection.

  12. There were no people of colour or women in many STEM departments (not counting a few exceptions) a few years ago, to start with. There are not that many these days either.
    And some of these universities have almost 50 per cent international students (are they equally biased?).
    The surveys are useful to course coordinators for keeping track of how demonstrators/tutors are performing. How else could this be done? Still, in some subjects there are no female tutors/demonstrators to start with (and bias of any kind would be almost impossible to prove).
    I’d be extremely surprised if anyone has lost their job due to the surveys (women will work extra hard, men will ‘improve’ after a warning or two).
    Why blame the surveys for biases? Why not address the real causes of the bias towards various groups?

    I don’t think the surveys are particularly useful, but they can identify issues to fix for those who care. I have seen teaching staff behaving badly, and at least the students can report this in the survey (hopefully).

  13. @Tom,

    Adjust all you like. There’s no good evidence that these surveys measure anything worth measuring. Obviously that doesn’t entail that no survey on any topic can be any good. It just entails that right now we have no survey-based method of measuring teaching quality.

  14. Chompei: “If a couple of disgruntled students leave a bad evaluation, but the happy students don’t, you end up with a rather skewed result.”

    Well, that’s the thing. Good, engaged students will seldom give you a 7 on the Likert scale. They will give a 5 or 6 with generally complimentary comments, but also constructive criticisms and suggestions. This is the sort of response that I think the former student activist I interviewed was hoping would be harnessed to positive effect in her proposals for the use of evaluations. The ones I sprung plagiarising gave me a 1 on all questions.

  15. Opt-in surveys are always hard to interpret: people only bother with them if they are really motivated to do so, which means you get mainly responses from the delighted and the pissed off. You’re also skewing your responses not only towards those with particularly good or bad experiences, but also towards those with particular personality profiles. Professional surveyors go to great lengths to get a representative population, and then often re-weight the results to control for demographic skew (a toy sketch of that step follows the link). Even so, they know that their results remain distorted. See Paul Bloom’s recent discussion of the lizardman constant for a start.

    https://www.newyorker.com/culture/annals-of-inquiry/perverse-incentives
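
    To make the demographic re-weighting concrete, here is a toy sketch in Python (the groups, enrolment shares and scores are invented, and this is of course only one small part of what a professional surveyor actually does):

        # Toy post-stratification: each respondent group is re-weighted so the sample
        # matches the known make-up of the class, partly correcting for who bothered
        # to respond. It does nothing about the delighted/pissed-off skew within groups.
        population_share = {"full_time": 0.7, "part_time": 0.3}  # known enrolment mix
        responses = {                                            # opt-in responses, 1-5 scale
            "full_time": [5, 4, 4, 5, 3],
            "part_time": [1],
        }

        n_total = sum(len(scores) for scores in responses.values())
        raw_mean = sum(sum(scores) for scores in responses.values()) / n_total

        # Weighted mean: each group's average counts in proportion to its share of
        # the class, not its share of the respondents.
        weighted_mean = sum(
            population_share[group] * (sum(scores) / len(scores))
            for group, scores in responses.items()
        )

        print(f"raw mean: {raw_mean:.2f}, re-weighted mean: {weighted_mean:.2f}")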

  16. In my experience, the student surveys of teachers at universities are not helpful for teachers. Whenever I designed a new course, I wrote my own survey with specific questions on content, teaching material, and teaching method, in addition to the standardised compulsory survey. My survey was administered by a student representative before the examination and given to a third person to keep until after I had done the marking (how else can one illustrate the idea of incentive compatibility in context?). IMO the standardised compulsory survey is at best a very crude management tool for picking up disasters (eg the lecturer did not show up for classes on x days), which are likely to be noticed anyway – although managers do like to have ‘evidence’ in the form of a sheet of paper provided by someone else. But these surveys are useless for improving a course. (The idea that teaching can be separated from course content is another silly one.) Suppose one gets a high score on these standard surveys. Possible response: go to the pub and celebrate. Suppose one gets a low score. Possible response: go to the pub and drown the sorrow. Useless.
