We’re confusing consumer satisfaction with product value.
That’s Philip B. Stark, a professor of statistics at Berkeley, discussing a mathematical critique of student evaluations of teachers that he wrote with a colleague, Richard Freishtat. There’s an article about the critique in The Chronicle of Higher Education; the study itself is here. Here’s a recap of the study’s major points:
● We might wish we could measure teaching effectiveness reliably simply by
asking students whether teaching is effective, but it does not work.
● Controlled, randomized experiments—the gold standard for reliable inference
about cause and effect—have found that student ratings of teaching
effectiveness are negatively associated with direct measures of effectiveness.
● Student teaching evaluations can be influenced by the gender, ethnicity, and
attractiveness of the instructor.
● Summary items such as “overall effectiveness” seem most susceptible to such
extraneous influences.
● Student comments contain valuable information about students’ experiences [not necessarily teacher quality].
● Survey response rates matter. Low response rates need not signal bad teaching,
but they make it impossible to generalize reliably from the respondents to the
class as a whole.
● It is practical and valuable to have faculty observe each other’s classes at least
once between “milestone” reviews.
● It is practical and valuable to create and review teaching portfolios.
● Teaching is unlikely to improve without serious, regular attention.