“ChatGPT has just woken many of us up to the fact that we need to be better teachers, not better cops.”
In the following guest post, Matthew Noah Smith, associate professor of philosophy at Northeastern University, argues against the idea that we should be focused on preventing students from using large language models (LLMs) to cheat in our courses.
Rather, professors should aim to teach more effectively and increase students’ interest in the subject, which in turn is likely to reduce their motivation to cheat. He includes in the post several suggestions for how to do this.
Policing is not Pedagogy: On the Supposed Threat of ChatGPT
by Matthew Noah Smith
People are wringing their hands over the threat that ChatGPT and other LLM chatbots pose to education. For these systems can produce essays that, for all intents and purposes, are indistinguishable from very good undergraduate (or perhaps even graduate student) essays. AI’s essay writing capacity allegedly “changes everything” from a pedagogical perspective.
This is incorrect: ChatGPT has not changed everything.
Let’s be clear about the sole reason why people think that ChatGPT’s powers will transform pedagogy: cheating will be easier. (I will focus on ChatGPT only here, although there are and will be other LLM systems that can perform—or outperform—the functions ChatGPT has). Students, rushing to complete assignments, will simply spend an hour or so refining their prompt to ChatGPT, and then ChatGPT will spit out a good enough essay. At that point, the student merely needs to massage it so that it better conforms to the instructor’s expectations. It seems a lot easier than spending scores of hours reading articles and books, participating in class discussion, paying attention to lectures, and, finally, composing the essay. The charge many people make is that the average student will give in and use ChatGPT as much as possible in place of doing the required work.
In short, students will cheat their way to an A. Or, to put it more gently, they will cheat their way to a completed assignment. It follows that because they will use ChatGPT to cheat, students will not get as much out of school as they otherwise would have. Call this the cheating threat to learning.
The only solution, it has been suggested, is that we must force our students to be free to learn. The only available tactics, many seem to think, are either aggressively policing student essays or switching to in-class high stakes testing. On this view, we’re supposed to be high-tech plagiarism cops armed with big exams.
But how much responsibility do teachers have to invigilate their students in order to prevent cheating? Not much, I think. And so we do not have a particularly robust responsibility to police our students’ usage of ChatGPT. So we should reject the idea that we should be high-tech plagiarism cops. I have two arguments for this claim.
First, it is inappropriate for us to organize our assessments entirely around viewing our students as potential wrongdoers. We should commit, throughout our classes—and especially when it comes to points at which our students are especially vulnerable—to seeing them primarily as if they want to be part of the collective project of learning. Insofar as we structure assessments, which are an important part of the class, around preventing cheating, we necessarily become suspicious of our students, viewing them as opponents who must be threatened with sanctions to ensure good behavior or, barring that, outsmarted so that they cannot successfully break the rules. This both limits and corrupts the collective project of learning.
It limits the collective project of learning because strategic attempts at outsmarting our opponents, uh I mean our students, are the opposite of a collective project (analogy: a house dweller installing a security system and a burglar trying to break through that system are not engaged in a collective project).
It corrupts the collective project of learning because even if we are engaged together in that project, we will no longer view each other as sharing values. We are instead in conflict with each other. We view each other as threats to what matters to us. For the professor, what matters is that the student doesn’t cheat. For the student, what matters is getting the good grade. The teacher-student relationship is shot through with suspicion and resentment, both of which can quickly turn to anger. These are, I believe, sentiments that corrupt shared activities.
The second reason we should not worry about the cheating threat to learning is that focusing on preventing cheating is not the same thing as focusing on good pedagogy. Good pedagogy involves seeking to engage students in ways that create a sense of wonder at—or at least interest in—the material and help motivate them to learn it. If we focus on selecting assessment methods solely with an eye towards thwarting ChatGPT’s cheating threat to learning, we are less likely to be selecting assessment methods that facilitate good pedagogy.
For example, some prominent people, like the political theorist Corey Robin, have argued that we should switch to in-class exams in order to limit the cheating threat to learning. Why? Is it because in-class exams are better for learning? No. It’s because ChatGPT cannot be employed during in-class exams, thereby neutralizing the cheating threat.
This approach is a mistake. We have some evidence that infrequent high stakes in-class exams produce worse learning outcomes than frequent low-stakes in-class exams. And if stereotype threat is real, then high stakes in-class exams might be a net negative, at least for women. In fact, since there are alternatives (see the ten suggestions below) that may alleviate worries about ChatGPT-related cheating threats to learning, high stakes in-class testing may be, all things considered, worse from a pedagogical perspective.
So, should we simply ignore cheating? No. We should build assessments that make cheating less likely. But we should not build assessments to trigger in students the fear of getting caught due to surveillance and suspicion. Rather, we should build assessments that, when properly integrated with the course material, make cheating either irrelevant or just not worth it—even if not getting caught is a near certainty.
Before suggesting some of these alternative assessments, it’s worth noting that it can be rational for a student to use ChatGPT to cheat on an essay assignment. After all, some undergraduate students are very stressed out and overburdened. In order to manage their lives, certain corners must be cut. Why not just use this useful chatbot to write that paper?
Yet this observation does not constitute an argument for high stakes in-class testing. Rather, it is an argument for structuring our classes in ways that both accommodate stressed out students and produce desirable pedagogical outcomes. In other words, we need teaching tactics that generate engagement and even wonder, and that reduce the rational motivations to cheat.
In what follows, I will quickly list some assessment methods I’ve learned about. Some of these I also regularly use. My dedicated and brilliant teaching professor colleagues at Northeastern taught me many of these, as have colleagues outside my university whose professional focus is teaching rather than research. So, I want to emphasize first that I claim no credit for coming up with these. I am merely transmitting the wisdom of others.
From a structural perspective, this whole obsession with the cheating threat to learning reveals clearly that great pedagogy is the only solution to the myriad structural factors pushing students to cheat. And great pedagogy cannot be an afterthought. It should be a full-time effort. So, perhaps the best thing we can do to address the threat ChatGPT poses to learning is to organize to demand that our universities make a genuine commitment to education. One (still insufficient) approach that my university has taken is the establishment of well-respected, permanent teaching professorships.
* * *
Let us now turn to the space in which each of us, as individual professors, has the most power: our own classrooms. In particular, here are some different practices that can be used to engage students, that altogether can determine the grade of the student, and whose focus is pedagogy, not policing.
- Regular reading questions
Students should be asked to submit reading questions for some portion of the assigned readings. (I require questions for approximately 33% of the readings.) The students should be told to produce structured questions that directly cite the assigned readings. For example, I tell my students that their questions should take this form: “On page XX, Author writes ‘blah blah blah’. I think that means that p but I am unsure of this because of such-and-such a reason. Please explain.” I give other examples of how they can write questions but all of them involve a specific textual citation and some attempt at interpretation.
Instructors can read these questions ahead of class and then use some of them during class. For example, the instructor might start class by putting two of the questions on the board and breaking students into groups whose task is to attempt to answer the questions. The groups can then report back to the whole class.
- Regular and short in-class quizzes
Students are regularly given quizzes testing whether they have read the assigned texts and whether they were engaging in the previous classes. The students could have their annotated glossaries (see the Annotated Glossary item below) open during the quiz. Regular low-stakes quizzes are well established as effective tools for facilitating information retention.
- Discussion Forum Posts
Students are required to post a significant number of 100-word or more discussion forum posts on the course website (we use Canvas, others use Blackboard, and still others use bespoke course management software). The posts should directly respond to the readings, or to classroom activities, or to other posts on the forum. I typically require students to do a mix. I also ask students to work through the semester to make their posts more polished. For example, after a few weeks, I ask students to begin to cite specific passages in their posts. Or I ask them to reconstruct an argument from a particular passage or to raise an objection to an argument a student has reconstructed.
I also use the discussion forum for the final paper. I invite students to collaborate on the forum to work through their papers, and then require that they cite each other if they use another person’s ideas from the forum. I indicate that there are limits to what they can use, and that if they have questions we can talk about it in class. This is a useful opportunity to discuss originality and attribution in scholarship.
- Scaffolding essay writing
Students are required to produce their long essays in steps, from formulating the question to working on an annotated bibliography, to outlining the paper, to drafting the paper, to completing the final paper. Any number of these steps can be used, or just one of them. But, the goal is to help students through each step of the process of writing the paper. This distributes the work over several weeks. It also provides opportunities for instructors to identify where students are having the greatest difficulty.
- Students commenting on each other’s drafts
Students read another student’s draft and write a short critique of the draft (or they annotate it with comments in a word processing application). The students might be required to meet with each other to discuss their comments. I require students to include a paragraph in their final paper on how they incorporated (or why they didn’t incorporate) the comments they got from a fellow student.
- Student Presentations
Students are organized into groups each of which is tasked to present on an assigned reading or a specific topic. The presentations should be formally organized, with a list of questions the students will answer in the presentation, summaries of assigned text(s), some analysis of the text(s) (I often ask my students to reconstruct arguments), an objection or two, a possible response, and then a list of further questions. I also ask presenters to lead a structured discussion in the classroom. An added bonus of this is that it can help student presenters to understand the importance of students voluntarily engaging in classroom discussion.
- Comment on Student Presentations
Students are required to write one short commentary on a student presentation in class. They have to summarize the presentation, indicate a question they had going into the presentation that the presenters answered, a question they had that wasn’t answered, and finally some reflections on a class theme inspired by the presentation. These comments can be made available to the presenters.
- Annotated Glossary
Students are required to maintain an annotated glossary of terms introduced in the class. The glossary should include definitions, citations to the sources of the definitions (including the date of the lecture if a definition came from a lecture), examples that illustrate the terms, and updates as new definitions of the terms are offered (which inevitably happens when it comes to contested technical terms). These can be turned in several times throughout the semester.
- Students must cite the discussion forum and lectures in their final paper
Students are required to use material generated within the course when writing their final papers. This, in addition to citing assigned readings, ensures that the students will be engaging the ideas we’ve all developed together throughout the class.
- Almost all grades are Pass/Fail
Students receive full credit for doing the work to an adequate standard. We still give the students comments, but not too many, as the goal is to identify, at most, one thing the student did wrong, one thing the student did well, and one thing the student can work on in the next assignment. (I learned this 3-comment approach in grad school when I TA’d for Geoff Sayre-McCord.) This is a better steer than a punitive grade, which can just stress students out. I usually reserve the final paper for a stricter grading scale, with the possibility of a student even getting an F on the final paper. But, if I’ve scaffolded the paper writing well enough then this is unlikely.
This is hardly an exhaustive list of assessments. But they all avoid high-stakes in-class exams or one or two lengthy standalone essays, and they all can help students in different ways. Some are designed to support the collective project of learning together in the classroom. Some are designed to draw students into the material through a series of low stakes contributions. Some are designed to make students feel like they have a stake in how the class goes. Some are tools for promoting information retention. Some help to reduce the burden of completing a big project such as writing an essay while still requiring that the student complete the big project.
These assessments probably can all be hacked by ChatGPT. But it’s hardly a simple matter to do so. Furthermore, even if some students do use ChatGPT repeatedly to hack these assignments, most of the students in the class probably won’t, as the stakes are so low and the requirements for completing the assignments are not all that taxing. In other words, these assessments make it less rational to cheat and more rational to do the work. They additionally are, as already argued, likely to be at least as pedagogically effective as (and in some cases demonstrably more pedagogically effective than) high stakes in-class testing and one-off essay writing.
* * *
But, you say, these just are big changes to the normal practice in philosophy classrooms, which usually are mostly lectures punctuated by big paper assignments. So, hasn’t ChatGPT actually changed everything?
No. The ChatGPT cheating threat to learning is not the reason to adopt these assessment changes. The reason to adopt these assessment changes is that these assessments would yield better learning outcomes even if ChatGPT never existed at all. ChatGPT has just woken many of us up to the fact that we need to be better teachers, not better cops.
In general, there is no reason to view contemporary punitive grade systems as necessary for hard work. After all, at Harvard University there is already such extreme grade inflation that students are basically guaranteed an “A-” just for showing up and it is not much different at many other universities. And yet most students at these universities are still putting in at least some effort! In fact, I find it somewhat astonishing that people are so certain that an especially punitive grading environment is necessary for positive learning outcomes. Some kind of sanction may be required to inspire most students to make an effort. But, I think that we have some reason to believe that students will work hard if they are interested in the topic and if they care. I at least enter the classroom presuming that if my students have a problem with the class, it is not due to a lack of desire but instead due to too many competing demands on their time and emotions. Yes, there are some students in my classes who just don’t care. This is certain. But there is no way I am going to organize how I assess every other student around the vicious goal of making the slackers suffer for their slacking.
Insofar as students are making the rational choice to forgo doing actual coursework and instead, because of how demanding their lives are, rely on ChatGPT, I fail to grasp how we are in fact improving their admittedly difficult lives by making them take a few high stakes exams in order to determine their grades. Why not adjust many of our other pedagogical practices so that we can better assist our students in joining us in an exploration of the ideas and texts we all lovingly study?
* * *
I believe many of us would like to make various changes to how we teach. But, I’ve also encountered a surprising amount of dogmatic insistence that high stakes take-home essays and in-class exams are the ne plus ultra of assessment, and furthermore that punitive grading is probably a necessary component of education, at least for the average student. My entirely speculative diagnosis of these sclerosed attitudes is twofold.
First, many of us are so used to, and have so embraced, both the disciplining and performative practices of the modern academy that we have fetishized both discipline and performative teaching as the ultimate modes of instruction. To truly teach is to profess on a stage, to be the authority who lectures. For students to truly learn is for them to recognize this authority via submission to the discipline of grading. On this view, to shape a student’s mind is not a collaborative process but is instead a form of knowledge transfer conditioned by repeated corrections when that transfer doesn’t quite take. All this must proceed along our own and not the students’ timescales. In contrast, the somewhat stochastic process of intellectual accretion, the slow sedimentation of heterogeneous understandings and haphazard bits of different academic skills, which occurs when the classroom is more collaborative and slow, can feel unpleasantly out of our control.
Second, the development of new teaching and assessment techniques is hard work. There are few institutional incentives to engage in that hard work. Furthermore, it takes constant time and effort to manage all that students submit when you use all these assessments together. I, for one, expend much effort reading, commenting on, and incorporating into each day’s class the small bits of work my students regularly submit. I also have hours and hours of one-on-one meetings with students when it comes time to discuss their paper drafts, all of which I have read and commented on prior to these meetings. While boring and a lot of work, these tasks also make the educational process more collaborative and less punitive. This in turn may lower my students’ anxiety.
But I am abusing my privileges here. This diagnosis is conjectural, not a strongly held view.
Discussion welcome. Readers are encouraged to share their own thoughts on, and strategies and tactics for, teaching in a world in which our students have access to LLMs.
Readers may be interested in also sharing their ideas for teaching in a world of LLMs on a Google doc organized by Klaas Kraay (Toronto Metropolitan).