How Is Your Teaching Evaluated?


It seems that every few months a new study is published demonstrating some kind of problem with student evaluations of teaching. Recently I’ve seen one going around that finds that students who had access to free chocolate cookies while being taught evaluated their teachers “significantly better” than the control group did.

Still, the numerical scores such evaluations provide seem to be what’s emphasized in the formal evaluation of teaching for, say, annual reviews. The situation prompted one philosophy professor, M.G. Piety (Drexel), to write in asking for input from others as to whether this is the case at their institutions.

She notes that her chair asked the department faculty to read a book about pedagogy in higher education. It’s an admirable step, however:

My sense is that these books are going to do little to improve the quality of college and university-level instruction if all the emphasis in annual reviews of faculty falls simply on numerical scores on teaching evaluations, independently of the kinds of assignments that are given in individual courses and the kinds of feedback that are given on those assignments.

In fact, emphasizing mere numerical scores on teaching evaluations could well undermine other efforts to improve the quality of teaching in that instructors who require little of their students and who routinely award high grades often get correspondingly high numerical scores on teaching evaluations, while faculty who give more work and assign a more reasonable distribution of As, Bs, and Cs, etc., often get lower numerical scores on teaching evaluations.

She’d like to know, from her fellow philosophy professors at other schools:

Whether their teaching is evaluated on the basis of numerical scores alone, or whether any attention is paid to more substantive aspects of their teaching, such as the number and type of assignments they give and the nature of the feedback they give on those assignments. 

Readers?

(It might be useful to note what kind of institution you’re at.)

Emmanuelle Moureaux, “Forest of Numbers”

37 Comments
Jon
5 years ago

My university lets us “opt out” of teaching evaluations because nobody’s figured out a way to make them informative or useful. It’s literally in our collective bargaining contract that they’re optional. (Graduate students have to do them, but bargaining-unit faculty don’t.) I’ve never minded this stance.

Jonathan Jenkins Ichikawa
Reply to  Jon
5 years ago

Jon, I have heard a few stories about faculty collective bargaining contracts reducing or removing the use of teaching evaluations. I’m interested in collecting some of that information more systematically; I’d be grateful if you’re willing to identify your university. (Either here publicly, or via private email to me at [email protected].)

M.G. Piety
Reply to  Jon
5 years ago

Interesting. I don’t actually mind teaching evaluations being used as part of the process of evaluating teaching. My concern is simply that an objective evaluation of teaching effectiveness has to have multiple sources of information relating to that effectiveness: Examples of assignments, examples of the feedback students receive on assignments, peer evaluation, etc., etc.

Sam
5 years ago

With so many philosophy departments facing dramatic cuts or elimination – I don’t know how you don’t take teaching evaluations into account. Maybe there needs to be some way to evaluate both quality and whether students like you, but dismissing the latter seems suicidal. Also, I hear a lot of talk about higher grades and higher teaching evaluations – but in my experience this isn’t true. I know plenty of professors who hand out easy A’s and are still not liked by students.

sahpa
Reply to  Sam
5 years ago

Anecdotes are not data.

M.G. Piety
Reply to  Sam
5 years ago

I would never argue that teaching evaluations shouldn’t be taken into account. They should be. My concern is that they give very incomplete information about teaching effectiveness. Also, read more of the comments below. You will see that there are actually studies that show a link between the grades students expect to get and how they rate their instructors on evaluations.

DocFE
Reply to  Sam
5 years ago

I had a colleague years ago whose grade profiles were comparatively very low, with few and sometimes no A’s; she was tough but very fair. In addition, she was a great professor (anthropology). Her student evals were among the highest at the school, while her grade profiles fell well below the average. The students clearly appreciated her teaching, and grades did not determine her evaluations. This is high praise: we actually thought about a satirical award in her name that would recognize the professor with the highest evals and toughest grading standards. The key, I think, is VERY FAIR (and explained standards).

Daniel Kaufman
5 years ago

Student evaluations feature significantly in our hiring, tenure, and promotion decisions. They are not, however, the only measure of teaching quality. Teaching observation reports by senior faculty are also employed, as are course documents, such as syllabi, exams, etc.

M.G. Piety
Reply to  Daniel Kaufman
5 years ago

That is fantastic. Do you know whether your university/department has a system for calculating the relative importance of these different measures of teaching effectiveness in the production of an overall evaluation of teaching?

Daniel Kaufman
Reply to  M.G. Piety
5 years ago

Yes, teaching evaluations are not allowed to count for more than 50%.

M.G. Piety
Reply to  Daniel Kaufman
5 years ago

That’s great. Are specific percentages also allotted for observation reports, course documents, etc.? Also, is it the chair alone who is the final determiner of teaching quality/effectiveness, or is that determination the product of a committee? In my department, a committee makes that determination for the purposes of tenure and promotion, but the chair does it alone for annual reviews.

Justin Sytsma
5 years ago

Although the evidence on student evaluations of teaching (SET) isn’t univocal, the consensus from those who have studied it seems to be that they don’t actually measure teaching effectiveness, and while they might plausibly measure whether students like a teacher, this has some highly problematic aspects — e.g., students tend to prefer attractive people, men, non-minorities. In my opinion, it is absurd and ethically problematic that universities put any weight on SET in tenure and promotion decisions at this point in time. My university (a research university in New Zealand) uses SET. We’re not required to do them, with a few exceptions, but they’re included in promotion applications. It is unclear how much weight is put on them, but my impression is that they do carry some weight.

Here’s a quick summary of a bit of the evidence (ht to Jonathan Livengood who pointed me toward most of this a while back):

Stark and Freishtat (2014), “An Evaluation of Course Evaluations”: “The common practice of relying on averages of SET scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons.” Statistical reasons include that they generally have low response rates and that we can’t reliably infer from responders to non-responders; that evaluations typically focus on average scores on omnibus questions, but there is no reason to treat the scales typically used as interval; and that average scores tend to be high with large standard deviations, raising questions about comparisons. Further, SET scores are known to vary with type of course (class size, level, whether it is required, subject) independently of who is teaching it: “These variations are large and may be confounded with SET (Cranton and Smith, 1986; Feldman, 1984, 1978). It is not clear how to make fair comparisons of SET across seminars, studios, labs, prerequisites, large lower-division courses, required major courses, etc. (see, e.g., McKeachie, 1997).”
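(To make the non-response worry concrete, here is a minimal simulation sketch, not from the paper and with entirely made-up response probabilities, in which students who liked a course are more likely to return the form; the average over respondents then overstates the class-wide average even though nothing about the teaching differs.)

import random

random.seed(0)

def simulate_once(class_size=100):
    # Hypothetical "true" opinions on a 1-5 scale for every enrolled student.
    true_scores = [random.choice([1, 2, 3, 4, 5]) for _ in range(class_size)]
    # Invented response behaviour: happier students answer more often.
    response_prob = {1: 0.2, 2: 0.3, 3: 0.4, 4: 0.6, 5: 0.8}
    observed = [s for s in true_scores if random.random() < response_prob[s]]
    full_avg = sum(true_scores) / len(true_scores)
    seen_avg = sum(observed) / len(observed) if observed else float("nan")
    return full_avg, seen_avg, len(observed) / class_size

full, seen, rates = zip(*(simulate_once() for _ in range(1000)))
print(f"mean of true class averages:      {sum(full) / len(full):.2f}")
print(f"mean of averages over responders: {sum(seen) / len(seen):.2f}")
print(f"typical response rate:            {sum(rates) / len(rates):.0%}")

(Run as written, the responder-only average comes out noticeably higher than the true class average at a response rate under 50%, which is the inferential gap Stark and Freishtat are pointing to.)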

Uttl, White, and Gonzalez (2017), “Meta-analysis of faculty’s teaching effectiveness: SET ratings and student learning are not related”: “SET ratings are used to evaluate faculty’s teaching effectiveness based on a widespread belief that students learn more from highly rated professors…. Our up-to-date meta-analysis of all multi-section studies revealed no significant correlations between SET ratings and learning.”

Braga, Paccagnella, and Pellizzari (2011), “Evaluating Students’ Evaluations of Professors”: “contrasts measures of teacher effectiveness with students’ evaluations for the same teachers…. The effectiveness measures are estimated by comparing the subsequent performance in follow-on coursework of students who are randomly assigned to teachers in each of their compulsory courses…. We find that our measure of teacher effectiveness is negatively correlated with the students’ evaluations: in other words, teachers who are associated with better subsequent performance receive worse evaluations from their students.”

So what does SET measure? There is evidence that it is highly correlated with students’ grade expectations (Marsh and Cooper, 1980; Short et al., 2012; Worthington, 2002); that it is related to enjoyment ratings from students (1486 stat students from UC Berkeley in Fall 2012: r=0.75); and that it correlates with reactions to 30 seconds of silent video of the instructor, suggesting that looks matter…

Riniolo, Johnson, Sherman, and Misso (2006), “Hot or Not: Do Professors Perceived as Physically Attractive Receive Higher Student Evaluations?”: “The data suggested that professors perceived as attractive received higher student evaluations when compared with those of a nonattractive control group (matched for department and gender). Results were consistent across four separate universities. Professors perceived as attractive received student evaluations about 0.8 of a point higher on a 5-point scale.”

And gender, ethnicity, and age all seem to matter (Anderson and Miller, 1997; Basow, 1995; Cramer and Alexitch, 2000; Marsh and Dunkin, 1992; Wachtel, 1998; Weinberg et al. 2007; Worthington, 2002)….

Boring, Ottoboni, and Stark (2016), “Student Evaluations of Teaching (mostly) do not Measure Teaching Effectiveness”: “We show: SET are biased against female instructors by an amount that is large and statistically significant. The bias affects how students rate even putatively objective aspects of teaching, such as how promptly assignments are graded…. SET are more sensitive to students’ gender bias and grade expectations than teaching effectiveness.”

And, of course, bribes seem to work (or alternatively students are just in a better mood after eating chocolate)!

Youmans and Jee (2007), “Fudging the Numbers: Distributing Chocolate Influences Student Evaluations of an Undergraduate Course”: “Research has shown that student evaluations can be mediated by unintended aspects of the course. In this study we examined whether an event unrelated to a course would increase student evaluations. Six discussion sections completed course evaluations administered by an independent experimenter. The experimenter offered chocolate to three sections before they completed the evaluations. Overall, students offered chocolate gave more positive evaluations than students not offered chocolate.”

So, if all of this is more or less accurate, what should you do to be rated as a great teacher? Be an attractive white man, lead students to believe that they’ll get a good grade in the course, do a song and dance, and give them chocolate before the evaluations.

Jonathan Livengood
Reply to  Justin Sytsma
5 years ago

Thanks for the h/t. It should be noted that not all SET are created (or, better, administered) equally. At Pitt, where Justin and I were graduate students, the university’s Office of Measurement and Evaluation of Teaching actually sent people around to administer the evaluations. That means, among other things, that there is relative standardization with respect to how the instructions are read to the students, and it makes it harder for instructors to bias their students *right before* the evaluation. However, at Illinois, where I’m at now, the instructors administer the evaluations directly. That is, we read the instructions and pass out the evaluation forms. We don’t *collect* the responses, so it’s not *maximally* bad. But it’s pretty bad. When an instructor administers her own evaluations, she can easily tell the students how to interpret the questions. For example, one might say, “Look, a 5 on this scale means that you’re giving me an A for the course,” or “You know, people often feel uncomfortable giving extreme answers, but you should know that giving a 5 is typical,” or, as I’ve often heard from auto repair services that do customer satisfaction surveys, “Anything less than a 5 means a failure for me.”

One can also bias students in much more subtle ways. For example, you can ask students to fill out the written part of the evaluation before filling out the numerical part. If you then specifically ask students to list, say, 10 examples of weaknesses with the course, you will set an anchor for students that 10 weaknesses is typical. When they cannot list 10, they will conclude that the course is better than is typical. And on average, they will give higher ratings than a group that is asked to list 2 such examples.

M.G. Piety
Reply to  Justin Sytsma
5 years ago

Wow, thanks for this. It is enormously helpful!

Joshua Mugg
5 years ago

Another way to show teaching excellence is to focus on improvement and innovation in the classroom. Here are two ways to demonstrate this in your annual report.

One option: attend SoTL (scholarship of teaching and learning) workshops, apply what you learned in the classroom, and explain in your annual review how it helped. Evidence for how it helped will vary. For example, I attended a workshop on Transparency in Learning and Teaching (TILT), which focused on making assignments clearer in A) purpose B) learning outcomes C) product and D) how students will be assessed. I rewrote several assignments using their method, resulting in clearer assignments for my students. I turned in these re-worked assignments as part of my annual review.

A second option, which I learned about from some education faculty, is to focus on one specific aspect of one’s teaching for a given year, and make predictions about how, if you achieved improvement, it would increase scores in your student evals. Suppose you want students to come to class more prepared, and so you institute a required pre-class reflection on the reading every week. If this is effective, you should see a rise in the response ‘I come to class prepared’ on your evals.

M.G. Piety
Reply to  Joshua Mugg
5 years ago

These are good suggestions. The only difficulty with them is that they will work only if assessments of teaching effectiveness already look beyond the mere numerical scores on SETs.

David Wallace
5 years ago

Let me half-heartedly make the case for student evaluations of teaching – half-hearted because I do recognize how seriously flawed they are. (I should say that my own institution is in the middle of a debate on this issue at the moment.)

1) There is a basic methodological problem in trying to measure the correlation between SET and teaching quality, which I think is at most partially addressed by the bits of the literature I’m familiar with. The ideal methodology (a) measures the quality of teaching via an unambiguous, agreed-upon quality-of-teaching metric, and (b) correlates that against SET.

But of course, if we had an “unambiguous, agreed upon quality-of-teaching metric”, it wouldn’t matter how well SET tracks it. We’d just use that metric instead! The whole reason we’re having this debate is that measuring teaching quality is really hard. Consider, for an analogy, that in the K-12 sector, the proxy metric for quality of teaching is exam success – at least, that’s true in the UK, and I think Race to the Top has made it true in the US to a large extent. I could make a similarly half-hearted defense of that proxy, but anyone who’s paid any attention to K-12 educational politics in the last 10 years knows how controversial and problematic it’s been. So correlations, or lack thereof, of SET with exam results, say, are only informative insofar as you think exam results themselves track teaching quality.

2) One sort of evidence used to argue against SET is correlation with various indicators that “should” have nothing to do with teaching quality (Justin Sytsma, above, mentions how enjoyable the class is, and how attractive the teacher is). But in the first place, the existence of noise doesn’t rule out the detection of a signal. And in the second, some of those indicators aren’t obviously uncorrelated with teaching quality! I find it easier to pay attention if I’m enjoying a lecture or class; I find it easier to pay attention to attractive people; I don’t think I’m unusual on either metric. So I find it perfectly plausible that making your class enjoyable, and being attractive, are positively correlated with better learning outcomes. (There is a separate conversation to be had about whether it is just to reward people more qua teachers if they are more attractive; I don’t myself see a problem, but in any case it’s not connected to the baseline question about what SET measures.)

3) The arguments for various sorts of bias in SET are pretty persuasive. But sad to say, that’s just as likely to be true for other forms of assessment, including peer assessment. Let’s not kid ourselves that we as faculty possess some magic bias-cancelling ability that our students lack.

4) Given the amount of noise associated with SET, and the various biases etc, I agree that using it as a fine-grained measure of anything is a mistake. But there is a need for coarse-grained measurements, and in particular there is a need for some indicator of actively bad teaching. Sad to say, that does occur, and I’m cynical enough to fear that it would occur more if we didn’t audit to some degree. Really serious problems with teaching are much more likely to be shown up by SET than by any realistic alternative, given that no peer-observation system is likely to observe more than a tiny minority of someone’s teaching, and given that at least in the UK, grades are also in the control of the teacher and don’t really offer an independent metric.

Similarly, I think achieving a genuinely outstanding SET is indicative at least of something. There are ways of gaming that achievement (easy grading, bribery) but those can be caught by other forms of assessment. (I’m not in favor of SET alone as a teaching-quality metric).

5) SET is a form of assessment that is relatively un-theorized. A lot of proposed alternatives seem to involve measuring teaching quality on the basis of various theorized measures, and my exposure to educational science makes me skeptical of the overall quality of those theorized measures – I’ve interacted with some very good people in the field doing very serious work, but overall, the field – at least as it translates into direct advice to academics – seems to have an ongoing problem with faddishness, a certain dogmatism about the unique best way to do something, and often a pretty poor grasp of statistical significance and experimental methodology. It’s obviously important for us to recognize that our teaching methods aren’t set in stone, and to be willing to learn from educational research, but equally, I value SETs (along with peer observation) as a way of demonstrating that my teaching is fine even if it doesn’t conform to flip-the-classroom or whatever the latest trend might be.

David Wallace
Reply to  David Wallace
5 years ago

Sorry, typo: in (4) I meant that at least in the *US*, grades are also in control of the teacher.

Justin Sytsma
Reply to  David Wallace
5 years ago

I’m largely in agreement with most of this (I’m least sure what to think with regard to attractiveness — rewarding people for being attractive in academia strikes me as wrong, even if I see the pragmatic reasons, but that is largely just a gut feel). My primary concern is not with conducting SET, but with how they often seem to be used in performance reviews and promotion decisions. As you say, SET shouldn’t be used as a stand-alone measure and shouldn’t be used as if it were a fine-grained measure. I think that what SET can do is to indicate when someone is an especially bad “teacher.” If students really hate a class, that isn’t good. And that can be used as a first indicator to be followed up with having people evaluate their classes, offer suggestions, recommend (or require) pursuing additional training, and so on. If none of that works, this would of course be relevant to performance reviews and promotion decisions.

Avalonian
Reply to  David Wallace
5 years ago

Just to add to David’s (2): it is implausible that charisma or likability are entirely irrelevant when it comes to teaching effectiveness, particularly in the humanities. Many (though not all) effective teachers are attractive and likable partly because they communicate joy and enthusiasm for the material.

Also, re: the alleged gender bias, I think this is an important read: https://slate.com/technology/2018/04/hotness-affects-student-evaluations-more-than-gender.html. The biggest data set we have (i.e. not the tiny, single-class studies referenced above) shows no gender skew. Not to say there isn’t one in reality, but that’s pretty powerful evidence against one nonetheless.

M.G. Piety
Reply to  David Wallace
5 years ago

This is great. Thanks so much. You are absolutely right that SETs measure something important. My issue is not with the use of SETs. It is with the exclusive use of SETs as indicators of teaching effectiveness.

Not Yet Tenured Female
Reply to  David Wallace
5 years ago

Yeah, screw the ugly people. ¯\_(ツ)_/¯
btw, if you read the link, the data do not indicate that there’s no gender bias.

RJB
5 years ago

I’m at a business school, but I doubt that changes much for this question. We still use student evaluations of teachers, but that only matters for promotion and tenure if someone is exceptionally bad or good (e.g., below 3 or above 4.9, both highly unusual given the compressed mean around 4.5). I don’t think the statistical biases reported above, while real and important, are affecting our promotion and tenure decisions very much.

To provide more substantive info, we have faculty evaluations of teachers for all promotion and tenure decisions. Senior people in the discipline review teaching materials and sit in on at least one class session.

Not perfect, but not as bad as some stories I hear.

Michael Cholbi
5 years ago

My institution, like many, makes use of quantitative student evaluations, along with peer observations and instructor self-evaluations. A few points:
1. There’s not much evidence I’m aware of to suggest peer observation is better than student evaluations. I’d hate to see institutions eliminate student evaluations and thereby leave peer observations as the primary evidence used to evaluate teaching.
2. Much of the validity of evaluations depends on what we’re asking. Students may not be in a good position to evaluate their learning, but they’re in a pretty good position to say whether faculty run organized class meetings, are prepared, turn work back in a timely way, are available outside class, etc. — instructor behaviors, in other words, rather than broad outcomes.
3. I’d generally be in favor of moving away from evaluating outcomes toward evaluating practices. There are so many variables that affect student learning (the composition of the student body, the time and day of the meeting, the difficulty of the material, etc.) that holding people directly responsible for gains in student learning is unfair and unrealistic. I’d prefer an approach that evaluates instructors based on whether they pursue practices known to be conducive to student learning — the science of learning is still developing, but I think we know enough to have some idea of what best practices in this area are.

M.G. Piety
Reply to  Michael Cholbi
5 years ago

I think we need multiple sources of information concerning teaching effectiveness in order to get something that even approximates an objective evaluation of it. SETs are important. I would never argue that they aren’t. They are only part of what should be taken into account, though. Other things should be considered as well, such as the analytical rigor of the readings (e.g., whether instructors are assigning original sources or simplified textbook versions of subjects), the nature and number of written assignments, the amount of feedback given to students on those assignments, peer observation reports, etc.

Kevin DeLapp
5 years ago

Timely topic, especially given the recent settlement at Ryerson about SETs: https://www.universityaffairs.ca/news/news-article/arbitration-decision-on-student-evaluations-of-teaching-applauded-by-faculty/

Another aspect of the debate is that regional accrediting agencies often require institutions to submit “evidence of teaching effectiveness.” This doesn’t necessarily have to take the form of classic student evals, but that’s usually the default. So I suspect it’s partly this external accreditation pressure that’s responsible for evals migrating to online as opposed to hard-copy (which seemed, in my experience, to have much higher levels of student participation for whatever reason), and focusing more on quantitative measures (since those are easier to harvest for reports).

harry b
5 years ago

I’d second everything Cholbi said. It would be surprising if people who are entirely untrained either in teaching or in evaluating teaching, and who rarely experience other people’s teaching (that is, faculty peer observers), would yield better information about effectiveness by watching one or two class periods than people who see a great deal of teaching and experience the whole of the course, that is, numerous class periods, the way the professor behaves in office hours, how inviting they are, whether they know all the students’ names (or whether they even try), whether they humiliate students in class (something they are much less likely to do when observed by a peer), the quality of the assignments and the quality of feedback, etc. We could find out by studying peer observation, but I am confident there is no body of work doing that (my confidence is empirically based).

Actually, it’s not clear we could find out by studying it, because in the background to this discussion is the fact that despite decades of complaints about student teaching evaluations, none of the professional associations in the humanities and interpretative social sciences has spearheaded an effort to measure student learning. Indeed, student learning (as opposed to student achievement, a quite different matter) is not part of our professional discourse.

Have none of you been on a divisional committee? For numerous faculty members at tenure, I have read through 6 years’ worth of student comments. I know which of those faculty bothered to reflect on their practice and try to improve, and which didn’t, because for some faculty many astute observations about the same bad practices fill their evaluations in the first few years and then disappear; and for others, astute observations about the same bad practices persist throughout their tenure period.

M.G. Piety
Reply to  harry b
5 years ago

No single measure of teaching effectiveness is going to give a fully reliable account on its own. That’s my concern. We need multiple measures of teaching effectiveness. In particular, we need administrators who emphasize to faculty that more is involved in the measure of good teaching than can be established via SETs to evince this awareness in their own evaluations of teaching effectiveness.

harry b
5 years ago

Regarding online versus hard copy participation in evaluations: you can easily get high participation by i) giving them ample in-class time at the beginning of class, ii) explaining in some detail what the evals are for, iii) saying, explicitly, that you seek extensive written feedback about what you are doing well, what you are not doing well, and how the students think you could improve, and iv) meaning what you say in iii). We just moved online, and I have gotten much more extensive written comments by doing that. (There are lists of best practices and protocols dotted around the web — Cornell had a particularly useful one I think).

Derek Bowman
Reply to  harry b
5 years ago

In my experience ii-iv are also good ways of getting useful feedback out of paper evaluations.

SD
5 years ago

I work at a community college, and the averages of teaching evaluations are not supposed to play a direct role in promotion or retention of full-time faculty. The main way our teaching is evaluated is through in-class observations by our deans. Since the deans usually have backgrounds with a lot of teaching experience at the community college level, and many still occasionally teach courses, this seems a pretty good system to me. If the person doing the observing hadn’t taught a class below the 300 level since grad school, as was the case in some observations I’ve had elsewhere, I’d be a bit more skeptical, though. As part of our evaluation procedures we are supposed to show that we’ve considered and reacted to student evaluations as part of our development. This is a good idea in theory, but given that the students who bother to respond to evaluations tend to be the ones who either really love or hate you, it’s hard to do this in practice and the exercise is not always helpful. (In my more smart-alecky moments I’ve been tempted to quote the evaluation where a student said don’t change a thing as a reason for not changing anything, though since I have changed some stuff since then that wouldn’t work.)
Now on to another point. What I find really remarkable here is that no one has addressed the elephant in the room as far as teaching evaluations go, and that is the really poisonous way they combine with the precarious situations of adjuncts and VAPs to undermine academic standards. We all know that a sure-fire way to up our evaluations is to lower our grading standards, and there are reams and reams of research showing that low grades = low evaluations and vice versa. At the job I had before this, as a year-by-year lecturer at a large state school, I had pretty mediocre evals my first semester, and my boss all but ordered me to make the class easier to get better evaluations. (I believe the actual phrase he put in his email was “Your teaching evaluations are adequate, and will surely improve once you acclimate yourself to the expectations of University of _________ students.”) So, having been told to make the class easier, that’s exactly what I did, and guess what: my evaluations went up. Adjuncts are even more vulnerable to these pressures, and this is rotting academic standards from the inside in ways that I’m certain we’ll come to regret.

M.G. Piety
Reply to  SD
5 years ago

I could not agree with you more on this! Contingent faculty are put in a really impossible situation, and you are absolutely correct: this situation is a huge contributor to the problem of grade inflation. It wouldn’t be so bad if administrations would stand behind faculty who insist on maintaining rigorous grading standards. Too often, however, they respond in the manner you describe. Faculty in my department were given a list of “General Principles of Teaching Evaluations” and asked to determine whether we could “embrace” them. One of these principles was “An ability to appropriately adapt classroom pedagogy to increase student learning.” There was general concern among the faculty present that “appropriately adapt classroom pedagogy” was a euphemism for dumbing down the curriculum.

SD
Reply to  M.G. Piety
5 years ago

I think administrators standing behind faculty would help, but there would still be a problem. At the place I adjuncted before the one where I was urged to lower my standards, the full-time faculty really encouraged us to maintain high standards and would stand behind us if we did, even if it meant low evals. But then the thing is, when one goes to apply for a full-time job, one is stuck with those low eval numbers. At that job I remember a lot of adjuncts trying to thread a needle between the department’s desire to keep high standards and our desire not to end up with evals that would harm us in the future. I also wonder whether adjuncts know when their boss cares about evals and when she doesn’t. My current boss very much supports maintaining high standards, and has even encouraged me to bring the hammer down harder than I’d be inclined to in some cheating cases. But do the adjuncts know where her priorities are? Do they know they don’t live and die by evaluations? That I’m not so sure about.

IH
5 years ago

Here’s a dispatch from a small, teaching-focused school. During new faculty orientation, the tenure committee chair emphasized that teaching quality would by far be the most important factor in our tenure file, and that poor or mediocre teaching quality would sink our chances for tenure, no matter our service and research output. I asked about how teaching quality was measured. The Dean said that it was almost exclusively measured by student teaching evaluations.

Mohan Matthen
5 years ago

On the narrow point of how reliable student numbers are, and not addressing the broader issues of what they indicate and whether they should be used in annual reviews etc., the University of Toronto did a very large study of student evaluations and found that they are surprisingly consistent and show relatively little bias. Here’s a link:
https://teaching.utoronto.ca/wp-content/uploads/2018/09/Validation-Study_CTSI-September-2018.pdf

Sara L. Uckelman
5 years ago

Starting this year, my university requires us to provide numeric scores in our yearly promotions applications.

I did not, but instead provided references and links to all the studies showing that student evaluations are biased and that the use of them in employee promotions is discriminatory and potentially illegal.

Irene McMullin
5 years ago

The suggestion that ‘students are in a pretty good position to say’ whether, for example, ‘faculty turn back work in a timely way’ is in fact highly questionable: MacNell et al. (2015) used an online platform to disguise the gender of the teacher and identified significant gender bias in student evaluations. The instructor that students thought was a woman received significantly lower ratings on fairness, professionalism, respectfulness, enthusiasm, promptness, etc. The differences in ‘promptness’ scores were particularly striking, since work was marked and returned at exactly the same time in both the ‘male’-led and the ‘female’-led modules. Yet the lecturer students *thought* was male was given a 4.35 rating out of 5. The lecturer students *thought* was female received a 3.55.