A Philosopher’s Reflections on Teaching in a World with AI
“I once believed my students and I were in this together, engaged in a shared intellectual pursuit. That faith has been obliterated over the past few semesters.”
That’s Troy Jollimore, a philosopher and poet at California State University, Chico, writing at The Walrus about the widespread use of ChatGPT and other large language models by college students.

[Eduardo Paolozzi, Chimpanzee in a Test Box Designed for Space Flight / Mobot Mark 1]
Likening the technology to a “magic bag,” he notes that it is irresistible to many students, even when they know its use would constitute cheating:
To judge by the number of papers I read last semester that were clearly AI generated, a lot of students are enthusiastic about this latest innovation. It turns out, too, this enthusiasm is hardly dampened by, say, a clear statement in one’s syllabus prohibiting the use of AI. Or by frequent reminders of this policy, accompanied by heartfelt pleas that students author the work they submit.
Educators are always battling strong cultural forces. “Our society has always manifested strongly anti-intellectual tendencies,” he notes. And AI makes that battle even harder.
Combating such skepticism is a large part of what education is all about. The best way to get people to see why education matters is to educate them; you can’t really grasp that value from the outside. There are always students who are willing but who feel an automatic resistance to any effort to help them learn. They need, to some degree, to be dragged along. But in the process of completing the assigned coursework, some of them start to feel it. They begin to grasp that thinking well, and in an informed manner, really is different from thinking poorly and from a position of ignorance.
That moment, when you start to understand the power of clear thinking, is crucial. The trouble with generative AI is that it short-circuits that process entirely.
Writing is a crucial part of learning, Jollimore says, and work-arounds to prevent cheating with AI are inadequate:
Writing in a classroom can never approximate the sustained concentration required to produce a carefully thought-out, polished piece done over a period of time. Being assigned such projects encourages students to work at a higher level and to flex more of their intellectual capacities. It teaches them that good writing is something you craft, not something you spit out at a moment’s notice. And it demonstrates to them that they are capable of producing something they can be proud of. Moreover, restricting students’ writing to the classroom overlooks the fact that a student’s comprehension and thinking ability will nearly always be more accurately represented by sentences they take time to mull over, to compose, and to revise than by words they whip up when confronted by an exam question.
I have said this sort of thing to students from time to time, in explaining why most of their grade—all of it, in some cases—is based on such long-term writing assignments. Such assignments have been the most powerful teaching tool in my arsenal, particularly when they are developed in stages: a first draft followed by feedback, followed by a revision, and so on. Assuming that the student puts in the effort, the paper inevitably gets better through such a process.
But what is more important is that the student gets better. After all, I don’t actually need the essays. I don’t have a shortage of things to read. The essays are a means to an end, the end being the transformation of the author into an educated person. This kind of writing assignment is simply the best available instrument for bringing about this highly desirable end.
But it works only, of course, if the students themselves write the essays.
The above excerpts touch on only some of the important issues Jollimore considers in his essay. Read the whole thing here.
There are several less-explicated premises in this work that I wish the author laid bare. Mainly, I don’t think students un-think due to LLMs more than the amount they’ve already been un-thinking. My experience has been that the default if not the norm is skipping readings, distracting oneself and finding excuses to procrastinate; my experience is *also* that professors are made uncomfortable by knowing that, and are complacently unvigilant about how little students read. The author suggests that this is a newish problem, but I think the mindset of students has long been this way and LLMs happen to make that obviously visible.
i mean there’s truth to this and it’s just part of the human condition for students to want to cut corners sometimes, have other things going on, etc.
at the same time, it is one thing to say (a) “there have always been problem drinkers” and another to say (b) “due to X, problem drinking has tripled.” the truth value of (a) does not centrally bear on the issue noted in (b).
Right, it’s a serious concern. What I mean is that I can’t know in advance, as a reader, the extent of the problem with the piece written as it currently is. The zone of disagreement is in degree rather than existence, and there is no way for me to gauge the author’s prior engagement with student negligence. I wish the author had used this question as its starting point.
Indeed.
In this culture, manual labor is belittled, and a four-year degree has been deemed necessary for any career that is not manual labor, and states disinvest in four-year colleges, so those colleges make students who wouldn’t be caught dead in a manual labor job pay a lot of money to attend, and those students understandably want a good return on their investment in the form of a good job, so those students rationally see their goal as getting the grades required to get the degree, and those students naturally see themselves as high-paying customers whose desires for those grades deserve to be met by the four-year college.
This incentive structure has been in place for much, much longer than LLMs have been on the scene. And nothing in it is ordered toward the activity of thinking, while everything is ordered toward results, products, assessments. When it’s the credential that matters, why do the thinking when you can get the credential without doing the thinking?
While I don’t disagree with you, these obstacles don’t really undermine the author’s point that AI poses a serious threat to instructors who care about the quality of their students’ educational experience. In fact, the author acknowledges some of what you both write: “Our society has always manifested strongly anti-intellectual tendencies,” so teaching has never been without its challenges. There have always been privileged kids who treat college as a right and more about networking than developing a rich inner life, while less privileged kids have lacked access to a quality education (and thus quite naturally belittle it and its intrinsic value). And of course there is the rather silly strict division between manual work and intellectual pursuits, as if both can’t be profitably enjoyed by the same individual. But that doesn’t change the fact that AI has made it less likely that students (especially first-generation college students) will get the most out of their college experience and discover interests and capabilities that they never knew they had. Just because our culture is unhealthy doesn’t mean encouraging wide-scale smoking is a good idea. I should add that despite all these and other obstacles, some students still find college to be transformational. It is important not to give up on these students.
Has anyone thought of changing course assignments? Writing 3 papers over the course of a semester – in every class – can start to feel like factory work for students, which may be why they check out and are reluctant to be drawn to water. AI, like any other tech (e.g., the internet, citation management software, etc.), can handle the drudgery that comes with research and editing, freeing up time to do better thinking. Just wondering if there’s anyone who is embracing AI and coming up with new solutions…
The main impediment facing students of philosophy with respect to better thinking is not typically a lack of free time. So although it is true that AI can give you a lot of time, I don’t think it can give students what they need to do better thinking. Indeed, it actively robs them of what they need to do better thinking.
Suggesting that people are using AI just to do research and free up time so they can think over and write about the issues prompted is insanely disingenuous and makes me wonder if you’re in the AI-tech industry or use it yourself to cheat.
Mine use AI for reflections, discussion questions, reading responses, class presentations… Basically, any work done at home.
Work done in class also uses AI. Unless I take all electronics away. But even then, I’ve caught many earpieces/smartwatches/burner phones being used in the final exam room so that the student can use AI.
So, I dunno. What assignments have you got in mind? I’m happy to try anything.
I *do* ban all that stuff during exams and on in-class writing day.
There are certainly people who embrace AI in the classroom, but I don’t think it usually results in “better thinking,” despite what many instructors might tell themselves. Simply put, learning how to write well doesn’t come naturally to most students. It is often hard, tedious, painful, and for some students, even traumatic. But utilizing AI to help students avoid all the pain and drudgery of writing papers themselves only theoretically frees them up for better thinking. It turns out that there is a fair amount of pain and drudgery in higher order thinking as well, not to mention in maintaining strong relationships or engaging in most human practices. AI promises to do all these things for us, but I don’t think we benefit in the end. Banning AI is probably not the solution. Instructors have to find a way to convince students to put in the effort (and avoid all the distractions around them) to enjoy the fruits of their labor. In my experience, this has become harder to do in recent years, and I doubt that AI is going away any time soon. So rather than make compromises (as we have done with online and large lecture courses) perhaps we have to include students in our deliberations about the proper use of AI and try to convince them why the painful process of reading difficult texts, thinking about ideas in these texts, and engaging with these ideas in speech and writing can’t really be sidestepped if you want to get the most out of your college experience.
Perhaps a better lesson is that banning new and popular technologies is not the answer. The professor worries that the students have outsourced their learning responsibilities by using the tech. It seems to me that we have neglected our responsibilities as educators by simply banning it rather than thinking through our approach to this technology and meeting students halfway. ChatGPT is not like classical cheating: it has been adopted as part of best practices by nearly every industry. Banning ChatGPT is like banning word processing and forcing students to write by hand, in my mind.
Evaluative processes and assignment designs must adapt to technological innovations. As professors we have no right to declare that the way we learned is the only correct way to learn. I don’t deny that LLMs pose new and difficult problems. But to claim that students have betrayed the educational relationship is disingenuous and close minded, imo.
There are good and bad ways to use ChatGPT as a student. Our goal should be to teach our students how to discern and evaluate these, not to ban them and try to live in the past.
It’s not like banning word processing. Word processing is like typing, only better. But you’re still doing the kind of thinking that writing makes you do. If you ask the AI to give you the essay, you aren’t. You say it’s not like classical cheating – that’s right, it’s worse. It’s worse because (a) it’s harder to detect and (b) it’s easier for them to talk themselves into thinking it’s ok, because half of their profs are saying things like “we need to embrace this new technology and redesign our courses.”
Hear, hear.
We may not have a right to “declare” what the “only correct way” is, but surely we have a right to a view on how to do our jobs, and we ought to have the confidence to defend it. If we don’t, why should anyone respect us? I don’t see why we should think Jollimore’s view disingenuous and close-minded rather than a product of careful reflection on one’s experience.
I’m all for open-mindedness and humility, and I am not anti-tech, but I don’t think we are compelled to adopt new technology simply in the name of open-mindedness (a mind so open that our brains fall out…).
ChatGPT is not like classical cheating: it has been adopted as part of best practices by nearly every industry.
Wait, how is that different from other cheating?
All industries outsource jobs that they aren’t good at, paying others to do that work. A student who buys a term paper is just doing the same thing.
Any engineering firm would use Mathematica to do a difficult calculation rather than having someone work it out in-house — using Mathematica to solve your differential equations homework problems is still cheating.
Lots of assignments are important and valuable precisely because you actually do them yourself. That’s always been true, and getting someone or something else to do them for you has always been cheating, even though ‘industry’ would quite reasonably be happy to farm out the job.
A philosophy student writing an essay in which they are supposed to be grappling with a difficult question is different from a company writing an internal TP report. I am fine with the latter being written by ChatGPT, but not the former.
And yet mathematicians, both students and professors, use Mathematica all the time. And they use theorem provers, proof assistants, and proof verifiers. They are pouring enormous resources into developing these tools. Why? Because mathematics actually has a subject matter. One might even give a new criterion for the cognitive meaningfulness of a field: A field has cognitive meaningfulness if and only if it can be improved using AI.
You can use a calculator when you know how to perform the operation in question. But you do have to learn how to do the thing first. Which is where students are falling short.
Case in point: I just marked an in-person midterm for a first-year course in which a student nevertheless managed to use AI to answer the essay question. The question was about Lewis’s Truth in Fiction; the answer given was about counterfactual conditionals and modal realism (almost verbatim what ChatGPT spits out given my prompt).
The student in question did manage to defeat my safeguards, likely with a smart watch or a well-hidden burner phone, but knew so little about the course and readings that they couldn’t ascertain that the answer was completely irrelevant.
Now multiply this by hundreds every semester.
False. I have forgotten how to do long division. I no longer know how to compute roots. Yet I use calculators for those things all the time. All you need to know is the meaning of the operation, not how to do it.
Perhaps Michel’s “can” was deontic, as in “you *may* use a calculator when you know how to…”. That would be more plausible—obviously you can, in the alethic sense, use a calculator for stuff you don’t understand—and it would also cohere with what many teachers tend to think about calculators already.
It seems like you’re saying or pre-supposing that philosophy can’t be improved using AI. But so many philosophers have given so many examples (on Daily Nous) of how they’ve used AI to help them with their research, that I can’t believe you’re saying that. But then I can’t figure out what your point is.
No, not at all – I think philosophy can be improved by AI! And thus philosophy has a subject matter. Yes, I’m well aware of, and very supportive of, those philosophers who are using AI in productive ways both in teaching and research. I’m responding to the luddites who think AI can’t be used well in philosophy. Sorry for the unclarity.
a bad way to use chat-gpt is to use it to write a humanities paper for you
The problem here is you think the only way to use it is to write a humanities paper for you. Try using it differently.
I took the point of the OP that that’s what _undergrads_ are doing with it, which seems to be true.
did i say that? i don’t recall doing so
Swept everything under this sentence “I don’t deny that LLMs pose new and difficult problems”, just go on to ignore them all XD
The issue is not about “goals”, mate. The issue is about how to get students to do work at all to reach **any** goals. Sure, say you want to teach students to use LLM in the good way. Say that’s your goal. Tell me how you can get students not to use LLM to do whatever assignment you want them to do to learn how to use LLM properly. Without anything like that, this is at best disingenuously self-serving
Analogy. I want to teach students to lift weights. I forbid them to use a machine to lift the weights in my weightlifting class. Is my stance against the machine a failure to meet students halfway in my approach to technology? Of course not.
You said: ChatGPT “has been adopted as part of best practices by nearly every industry.”
We’ll need a citation for that.
Yes, LLMs are being adopted across sectors, but much of it is because of FOMO. Customer-service chatbots are a primary use-case for LLMs, but they usually end in embarrassment for those companies, even for Google, which is a leader in AI.
And here’s Microsoft’s CEO earlier this month on the lack of value seen to date from AI/LLMs, i.e., how can this be a “best practice” when it’s not even demonstrated to be valuable?
https://futurism.com/microsoft-ceo-ai-generating-no-value
You said, “Perhaps a better lesson is that banning new and popular technologies is not the answer.”
That seems to be a very hasty generalization.
Imagine I’ve created a Student-as-a-Service app that matches a student with another human being who’s willing to sit in on the student’s classes and do all of their coursework, and that this new technology became popular and widely adopted. The student gains free time to pursue other interests, and the rented human gets income; so, win-win.
Are you saying this SaaS tech should not be banned in the classroom because it’s popular, and teachers need to meet these “students” halfway??
This is a close analogy to LLMs. If you would resist that SaaS tech, why wouldn’t you also resist LLMs?
That’s a great analogy!
A great startup idea for any unscrupulous philosophy grads struggling to land a job!
I haven’t banned ChatGPT outright in my classes this term (including one on AI and Ethics). Instead, I’ve explicitly stated when and (as much as I can) to what extent the student may use these resources. This has made, I believe, a difference; possibly because it telegraphs to them that I am “on the lookout” or perhaps because it is no longer a black-and-white issue. I’m not honestly sure. I have noticed a remarkable decrease in the number of obviously AI-generated papers.
On the other hand, maybe they’re just getting better at fooling me.
The most likely answer is neither. The reason they are fooling you is that the AI is far, far better than it was even a year ago.
The following is directed at everyone not you:
If you have not extensively experimented with Claude Sonnet 3.7 or ChatGPT o1/4.5 (including the ‘canvas’ feature, with its handy dial to fine-tune the word-length of the output), then frankly you have no business having an opinion on how good this technology is at writing essays.
And ‘experimented’ includes, among other things,
1. having uploaded the articles/lecture slides the essay is working with, so that the LLM can draw on them and cite them
2. asking it to write like a student
3. having it write one section at a time, and then prompting it to write the next section or subsection, and the next, and so on
5. reading it through a few times in case there are glaring problems, and refining the prompt to make it better.
this may all be true but the effort and consideration required to produce a really strong ai paper in this fashion could be said to be equivalent to or even beyond the effort and consideration required to actually write the paper itself, if the paper is reasonably short.
practically speaking, i also don’t think (and i could be very wrong here) that most llm using students are paying the monthly premium for the most recent models.
lastly, and i’ve seen variants of this thinking here before, the underlying assumption of the comment is that the student actually knows what a good paper looks like and knows enough about the structure of such a paper and perhaps the material being written about to be able to tell the llm to produce a strong essay in the manners you mention. in my experience this isn’t the case for most college students, even some of the very clever ones.
the approach you describe above presupposes an awareness of what a successful undergraduate paper is, an awareness that the vast majority of underclass persons just don’t have. you, a grader of such papers, can make it write strong papers but that does not mean that the typical sophomore will.
if we consider also the reasonable assumption that the majority of ai cheating is submitted by students who are lazy or completely overworked or totally disengaged etc it seems to me clear that the kind of effort you describe above is probably highly unusual.
certainly “make it sound like a student” may be catching on, but iterating to achieve a strong result? not the typical scenario.
Firstly, I did not say that all students are doing this. I said that unless you have done this, you don’t have business commenting on what the technology can and can’t do. Once you understand what the tools can do, then you can decide for yourself whether students are using it, having developed some humility about the possibility of detecting it. If you teach at a selective university, then your students are clever enough and lazy enough to do everything I mentioned every time. Even if it were ‘more work’ overall, it takes no hard thinking and that is enough to make it worth their while. And I do believe many of them are paying for the premium models, yes. They use ChatGPT all day every day, and would not have enough credits otherwise.
my reading of your post is that you are saying two things: 1) ai is better at writing papers than you think, and 2) given enough iteration students can fool even the most alert/experienced graders.
the response i posted argues that point 1 may be true but that practically speaking this is not the common use case, for reasons already noted. i stand by this statement. ai is in fact *not good* at writing these papers *unless* the user knows what a good paper looks like and is willing to work it through.
again with point 2, i don’t dispute the basic assertion. with enough work, students *can* certainly use ai to successfully cheat. but having taught at two ‘selective’ liberal arts schools and an r1 over the past calendar year (and having taught in higher ed since 2009), and having conferenced with students/examined drafts, and having used llm models fairly extensively, and having failed more than a few students for turning in ai-created papers, my experience doesn’t reflect what you seem to be claiming, namely that students are successfully using ai to entirely compose papers that fool us again and again.
maybe this isn’t what you’re suggesting, but i take this comment to say that huge numbers of more or less 100% llm-produced papers are getting through without raising an eyebrow. my experience across different schools in different states is that this isn’t the case, at least not yet, because of inherent limitations having to do with llms as they currently exist and limitations in undergraduate canniness that i’ve observed over 15 years of teaching.
i’m not saying that ai-produced papers haven’t gotten by me or that i’m super accurate at detection. i’m also not saying that ai use isn’t rampant. i’m also not saying that current llm ai models are incapable of producing papers that can fool me or others.
i *am* saying that potential for the kind of widespread successful use you seem to be suggesting is not the same thing as actual successful use. anyone can get online and figure out how to put a chip purchased over the internet in their car to overcome engine performance limitations. yet the vast majority of people don’t for all sorts of reasons from ignorance to concern about consequences.
it’s not a perfect analogy as there’s no ignorance of llm ai in students right now. but the point stands: one *can* do all sorts of things that one doesn’t actually do. without compelling evidence that students are getting tons of 100%-ish ai papers past us without us knowing about it (two questions: 1. what kind of evidence would that be if they’re getting past us? 2. are we counting papers that we suspect are ai-generated but for whatever reason we don’t investigate?), we have something closer to a conspiracy theory than an empirical observation.
just my two cents
At a certain point, if the person is having the AI write just a paragraph or two at a time, reading the paragraph, figuring out whether it does what it’s supposed to do, and editing it a bit to make it better, then I think they’re actually using the tool to help them achieve what the goal of the class actually is – to learn how to compose clear essays that explain ideas.
Of course, I don’t know if they actually have to go that far to make it *look* enough like a decent paper that it gets an A-.
I strongly disagree that that is the goal of essay writing. But quite apart from that, as you recognise you can get an A- without approaching anything that would teach you how to do that. And you would still be cheating.
Like Alma, I disagree that the purpose of a philosophy class is as you describe it. The point is not merely to receive a product whose content is clear and explanatory. It should also reflect an authentic understanding of the course materials and an ability to express that understanding for themselves.
What about all the intro ethics papers in the ‘C+ to B’ range where the latest AI, after a few relatively mindless prompts, generates something that has a decent amount of accidental likeness to genuine thinking? Perhaps it’s not a very good paper overall, but points to one or two promising ideas that the professor mistakes as the student’s own, or can’t prove otherwise.
At an early undergraduate level, and given the sophistication of some of the newest models, this seems possible to me. It’s also not crazy to think that some clever students might be pretty decent at detecting text that would produce a ‘B-‘ paper without being able to actually produce it themselves if asked. Maybe it’s not a 100% reliable method, and it will sometimes result in papers that make outrageous errors, but I suspect in many other cases the student can get lucky.
If we have to give grades, then we have to do so fairly. I don’t see how I can fairly assign a ‘B-‘ to a paper that could, for all I know, have little to zero serious thinking behind it.
Separately: it’s also a complete waste of our time to grade and, worse, provide feedback on such garbage.
Okay. But I still need help figuring out what to do to evaluate their learning. I’m a crap teacher, so help me out!
“it has been adopted as part of best practices by nearly every industry”— LMAO.
I think it’s fair to ban calculators to teach math. Students who use them when they are prohibited are cheating. AI bans aren’t any different.
Using calculators is obviously extremely useful in mathematics and other related fields, but I find it very puzzling that people don’t seem to get why it makes sense to ban calculators when what is being taught is what the calculator does.
There is value in knowing how to calculate yourself.
I’m under the impression that students had refined the ability to write a few pages of technically sound but intellectually vapid prose even before GPT became able to fully imitate it.
Call me a Luddite, but I’m doing closed book, in person exams again. To my surprise, they are quite effective. I know they use ChatGPT for studying (e.g. summarizing and quizzing practice), but these uses seem to be entirely unobjectionable to me.
In my last gen-ed ethics exam, I arrived to a room of students quizzing each other on Kant and Mill. Just from what I overheard, they engaged with the material much more deeply in their exam preparation than if they had written a few more by-the-numbers essays.
Yep. I have switched to this almost entirely. But our First-Year Seminar program requires paper-writing. I’ve been doing that in-class, but it definitely has the drawbacks Jollimore mentions.
I’ve been doing this too. Keep up the good fight!
You say the exams are quite effective but you don’t say what they are effective at. Presumably you mean something like “getting students to memorize the material in sufficient depth to regurgitate it at the level one might reasonably require in an in-class exam,” right? But what if I want students to engage with the material more deeply and to produce thoughts about it sophisticated enough to expect from (say) a take-home paper but unreasonable to expect them to develop at a moment’s notice during a time-pressure exam? Do you think exams are effective for that purpose? Since I care so much more about that sort of thing than the sort of thing I think exams are good at, I am very reluctant to move to exams. (ChatGPT may force my hand eventually, but not yet.)
For example, I really want students to be able to develop an argument about (say) what Aristotle thinks by drawing on material from the various parts of the Nicomachean Ethics that we’ve read, and by properly citing that material etc. Unless I allow them to do an open-notes exam in which they are provided with the prompt ahead of time (at which point they can have ChatGPT do the work for them) I don’t see how I can test that skill with an in-class exam.
I’m also a little unsure about your initial sentence. The suggestion seems to be that even writing papers at all is a waste of time since students managed to write bad papers in the past. But after having read I don’t know how many thousands of pages of in-class writing and done-at-home writing, I think the average vapidity of a blue book is through the roof compared to a paper written at home. Of course my students were not churning out masterpieces on the regular with their traditional paper assignments, but I definitely saw (on average) more creativity, deep thinking, engagement with the text, and so on in papers than in exams. Is this an unusual experience? Or does it match yours (pre-GPT)?
I give them a list of questions, a subset of which are on the exam. So they have to read and think and synthesize and – this is the key thing – know what they want to say. Even if they asked AI for help, they still have to have mentally processed and assimilated whatever it was.
Effective at getting students to engage with the material.
I don’t just do memorization tasks. For instance, I sometimes include a thought experiment on the exam and ask students to discuss it in terms of a reading.
Students know in advance that such thinking is required, so they prepare accordingly. I reveal the questions, but not what topic or thought experiment they’ll be about. These exams are effective at getting the students to engage with the material (in their preparation, not during the exam itself — the exam is a means to this end).
To your last point, yes I had soured on take home essay assignments already (but shortly) before GPT. I agree in principle that long form writing is the best way to gain a grasp of something. But I found that, first, this is only true given a level of supervision I cannot provide in most undergraduate classes. And, second, that high school today apparently entrains a style of essay that’s mostly formulaic slop. I get kids who were excellent in high school, which means that they are very good at producing formulaic slop.
The depth of engagement, I have found, is ultimately improved by requiring preparation for spontaneous responses on closed exams. As said, this was a surprise to me.
It is sad that I am missing out on deep arguments that require multiple days of thinking, revising, comparing. But I’m not getting these anyway, except in small group, writing-focused classes. For these, I do have scaffolding with take-home components.
Well, maybe in response Profs should use AI to mark the essays. Brave new world?
Brave new world indeed! The professors and the students engage in other activities while both sides employ AI to engage the other side’s AI, and nobody learns a single thing. The transfer of our accumulated knowledge and wisdom to the next generation grinds to a halt. Meanwhile, the same huge amount of money is transferred from the pockets of the students’ parents into the university’s coffers, after which each university inflicts hordes of incompetent degree holders upon society to do whatever damage they will do. Once those degrees are bought, what was once a country that led the world in the quality of its universities becomes a hotbed of credential-purchasing.
But the social elites get to keep buying their way into the perpetuation of their status in the next generation, and the universities get to keep chugging along, extracting untold riches from families who slave away to purchase these fraudulent privileges for their children! What’s the downside, really? 😉
Funnily enough everything you say following *meanwhile* is true now.
Indeed.
I can see arguments for switching to other forms of evaluation. For now, I continue to assign traditional essays in my business ethics classes (in addition to paper-based multiple-choice tests and brief in-class presentations). I prohibit the use of LLMs at any stage of the writing process, including “idea generation.” I set aside class time to demonstrate to students that LLMs give reliably bad answers to questions about business ethics, e.g., by misrepresenting the views of major figures in the field.
I know I have students who have written essays with LLM aid and gotten away with it. I also have many students who have not gotten away with it. Last semester, when I had roughly 190 students enrolled in my sections, 12 were formally found responsible for academic integrity violations (of varying degrees of severity). Students who choose to use LLMs in my sections are running a large risk.
The problem is that finding and punishing those who use LLMs is a significant investment of time and energy, and turns me from a good cop trying to help students learn the material to a bad cop out to punish people, thereby sucking all the joy out of teaching. So having over 50% of the grade determined by in-class assignments / exams is the only real solution, as far as I can see.
I see the issue of students using AI as similar to the issue of chess players using AI in chess. AI gives you the best chess moves. So humans do not need to make a big effort to find them. However, if you ban AI from chess tournaments (this is what is actually happening) humans have no choice but to make a mental effort to find the best move. AI did not kill chess. Chess is more popular than ever. What we can learn from this is that the only way for professors to force students to think and make a mental effort is by banning AI. There is not any other solution.
Chess players constantly use AI to study positions. It’s an amazing training tool. It has advanced the game considerably. Of course, AIs are banned in some tournaments, but they are permitted in others, so that there are human-AI centaurs.
I have my students write their assignments in Google Docs so they can send me the link. I then use the Google Chrome Draftback extension (which you can get for $40 per year as an educator if your institution doesn’t have a license), which allows me to watch the revision history of their document in real time (or sped up 6x). I can also see how many sessions they spent writing the assignment and how much time they spent overall.
A standard 500-word assignment will involve thousands of revisions (since each letter, space, or punctuation mark counts as a revision). But if you cut and pasted the entire document from AI (either all at once or in chunks), each paste counts as a single revision. As such, if I look at the revision history and it has only hundreds or dozens of revisions, it raises a red flag and I can take a closer look. So, while I allow students to use AI in the prescribed way, they know in advance that when they submit the link, they are giving me the evidence that I would need to catch them cheating.
The only way for students to cheat is to sit down and input the AI output one letter or word at a time into Google Docs. But then when I watch the revision, I can see that they just typed straight through in a single session (or over multiple sessions). So, there will still be far fewer revisions than average.
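The revision-count check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_for_review` and the threshold of two revisions per word are hypothetical, and the word and revision counts are assumed to be read off the document by hand. Nothing here is part of Draftback or Revision History.

```python
def flag_for_review(word_count: int, revision_count: int,
                    min_revisions_per_word: float = 2.0) -> bool:
    """Return True when a document has suspiciously few revisions for its length.

    Assumption: every typed character counts as one revision, so text typed
    from scratch should show several revisions per word. The default
    threshold is a hypothetical value, not one taken from any tool.
    """
    if word_count <= 0:
        # Nothing to compare against, so do not flag an empty document.
        return False
    return revision_count / word_count < min_revisions_per_word


# A 1,000-word essay with only 500 revisions gets a closer look.
print(flag_for_review(word_count=1000, revision_count=500))

# A 500-word essay with 3,000 revisions does not.
print(flag_for_review(word_count=500, revision_count=3000))
```

A flag here only means "watch the playback more closely." It is not evidence of cheating by itself, since a student who types a pasted essay out letter by letter would pass this check.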
As an added bonus, DB is useful for helping students who struggle with writing, since I can see their creative process (or lack thereof). I can also tell if they wrote the assignment at the very last minute!
Obviously, this isn’t 100% effective in deterring or detecting cheating, but it is a very easy way to make it much more difficult. You can find videos on YT for directions for how to use DB.
p.s. Until this semester DB was a free extension. Now there’s a fee as I mentioned before. There is another Google Chrome extension (that remains free) that performs a similar function–it’s called Revision History.
This is really invaluable advice. (Thanks.) Are there any downsides to RH vs DB?
I haven’t really tinkered around with RH that much. The main advantage is that it’s free where DB is $40 a year. But just as DB went from being free to fee-based, I wouldn’t be surprised if the same thing happened with RH.
I implemented the same policy this semester and have found Revision History more user-friendly than Draftback. While it hasn’t stamped out AI cheating completely, it has diminished it – and provides something ‘concrete’ to discuss with a student when you suspect AI use.
This is great to know, and I’ll probably try implementing this in some of my courses. But one aspect of cheating that I think will still be a problem is using AI to cut corners in thinking and generating ideas. One thing a lot of students do now is skip the reading entirely by having one of the AI tools summarize it for them. Then they use that as their ‘knowledge base’ for writing essays. Do you find that the Google Docs from students who have used AI in this way easily reveal that they are cheating by way of their speed in writing or number of revisions?
I am not sure DB could help me detect something like the cases you mention–which are quite common unfortunately. My sense is that so long as they are doing the writing based on the material from the course, then at least they’re doing the most important part. I view using AI to summarize readings no differently than using CliffsNotes, which students–like me–used back in the dark ages. Let’s imagine I didn’t do a reading and instead used the CliffsNotes as my knowledge base. Did I cheat? No, I just used a lazy shortcut. So, while I may have cheated myself, I didn’t cheat on the assignment. I guess that’s how I view cases like the one you mention.
That said, I do tell them that they are not to use AI for outlines, etc. The rough draft is supposed to be theirs alone. They have to submit their rough draft–which is graded–along with their AI-revised final draft–which is also graded. But just because I tell them not to do what you’re inquiring about certainly doesn’t mean that they don’t! Like I said, DB isn’t perfect, but it is super simple and more effective than any AI-detector (in my experience).
I thought it might be helpful if I shared the section on Draftback that I share with my students:
Draftback is a free Google Chrome extension that allows students and instructors to play back the revision history of a Google Doc file. It visualizes each keystroke made in the document, effectively showing how the document was constructed over time. The data used to create these document histories is already embedded in Google Doc files. This extension just organizes the information in a pedagogically useful way. These document histories are especially useful for educators who wish to better understand a student’s writing process.
Some key features of Draftback include:
In addition to providing information that is useful for helping those of you who are struggling with the writing process, the Draftback document histories also serve to deter you from using AI in inappropriate ways.
A single 500-word assignment that you type from scratch in Google Docs would involve hundreds (or even thousands) of revisions. But if you cut and paste that same 500 words from a generative AI (GAI) chatbot, it will show up as only a single revision in the Draftback document history. This means that the same document history that helps me better understand your writing process also makes it much easier to tell whether you used AI (or secondary sources) in inappropriate ways.
Keep in mind that just as AI is transforming the world, it is transforming higher education. I am trying in real time to find the best way of implementing policies for distance education that allow students to effectively use developing technologies to improve their writing (and even their thinking).
For every written assignment in this course, you will be submitting two things: (a) you will download your assignment from Google Docs as either a .docx or .pdf and then upload the file to the appropriate assignment folder on OAKS, and (b) you will cut and paste the link for that assignment into the “comment” section of the assignment folder. You must make sure to change the general access from “restricted” to “anyone with the link” and then you have to change the role from “viewer” to “editor.”
**While it takes the students an assignment or two to get the hang of this, it ends up being very seamless. I then provide all of my feedback directly on the Google Doc for the assignment. I also made a short tutorial video so I could walk them through the steps.
Have you found any value in seeing the history of the composition process, for anything other than detecting AI cheating? Reading this description made me realize that it *might* be possible to identify some misconceptions a student has by seeing what order they do things, or that it *might* be helpful to be able to tell a student “that section you wrote Tuesday evening at 7 pm and then deleted was actually really good, you should look at that again!”
But I have no idea if these sorts of things are at all likely. Have you done anything similar, or just used this as a way to identify cheating? (Even just that last purpose seems good enough.)
No, but only because none of the students have taken me up on my repeated offers to go through their revision history with them! I suppose I could require it for students who didn’t do well. But I have watched a lot of the histories just to get a sense for how it works. If you watch it in real time (rather than sped up 6x) it’s very interesting. You can indeed see that they cut things that they should have kept, etc. To be honest, I should think more carefully about how to use it for this purpose. There are some good videos on Youtube by English professors that look pretty helpful when it comes to how to use this to help students. But I haven’t watched them carefully. This is just the second semester that I have used DB, so I am still learning the ropes.
I do the exact same thing and think it’s a very effective (and low-cost) strategy to detect AI plagiarism. I also let my students know that I will be doing this before they write their papers, since I believe this will strongly discourage many of them from using AI programs in the first place.
This is clever, and I think it’s a great approach. As a good student who never used AI (now 25, so I was undergrad relatively recently), I’d like to flag a couple of problems this would have caused for me which you might wish to consider (or might not! I’m just a name on the internet):
tl;dr: I worry that this procedure ends up marking students partly on their writing processes, and that isn’t something students are or should be marked on (so long as they aren’t cheating).
All that said I want to be clear: I think the effects described above are probably minimal, and yours is the best approach I have seen so far for detecting AI, so maybe I should just lump it.
First, let me just respond to your first comment about your writing process. You are correct that if you wrote your essay in a plain text editor and then copied and pasted it into Google Docs, then you did not follow the directions. Moreover, it would appear like you had cheated since the essay you turned in would have far more words than revisions (which, given the way I used AI, constitutes prima facie evidence of cheating).
And while I would not accuse you of cheating in this context, I would force you to turn in a properly formatted assignment (even if that means you have to rewrite it from scratch in the prescribed way). Given the way I construct the assignments, students are required to use Google Docs for all their writing. I do this because as far as I have determined, using Google Docs + Draftback is the only way I can both allow my students to use AI in the prescribed way and be able to detect when they have used it in proscribed ways. In my eyes, this is, by itself, a sufficient reason to have students use Google Docs. I explain all of this in detail in the technology guidelines I give my students.
At least at my institution all students have Google Docs as part of their campus tech. Most of them already use it to write their essays anyway. So, the only extra step my approach requires is that they set the permissions for their assignment so that I have editor status. That’s it. I’ve had this policy now for two semesters, with nearly 200 students, and none of them have raised the worry that you raised.
But if a student did, I’d be happy to work with the student but then the onus would be on the student to convince me that AI hasn’t been used with the alternative format. Because the way I check to see whether AI was used depends on students following the specific writing guidelines that I provide, any alternative approach to writing opens the door to cheating with AI. Given this, the student would need a very good reason for not using Google Docs as required.
Another issue you raised was whether the students’ writing process could be inadvertently influencing how I grade the students’ assignments. The short answer is “no.” That’s because I only look at the revision history if the stated number of revisions that is displayed at the top of every assignment seems off. For instance, I just graded an essay yesterday that had roughly 1,000 words but only 500 revisions. So, I watched the revision history and it was clear where the student was cutting and pasting from somewhere else. Otherwise, the only time I look at the revision history is if a student isn’t happy with their grade and wants to know how to improve. Then, we can go over the revision history together. But at this point, they have already received their grade. So, I don’t think the worry you raise—while understandable—has much of an impact on grading.
The final worry you raised is that if you sat down and wrote your essay straight through in one sitting at the very last minute, it might look like you used AI even if you didn’t. In a case like this, I would look at the revision history because the number of revisions and words would be close enough to make me take a closer look. I would see both that you wrote it in one session and that you wrote it at the last minute. I would also be able to see how you wrote it from beginning to end. As someone who has been teaching college in some shape or form for 25 years, I have read thousands of essays. Very few students can sit down and write 750-1,000 words in one sitting with few mistakes and few revisions and produce a high-quality paper. But some students–like you–can do this. As I explained, a case like this sits at the border of my policy. That’s because I can’t distinguish between a student who did what you suggest—namely, write the essay in one sitting with few mistakes, etc.—and a student who used AI to write the assignment, then opened Google Docs and simply typed the AI-written essay into Google Docs. So far, I haven’t had any cases like this. But eventually I will. Every policy has its limitations.
That said, thanks for taking the time to share your concerns as a student. I give my students anonymous polls throughout the semester so they can share concerns about how I am running the class, structuring the assignments, etc. So, I always welcome this kind of thoughtful feedback.
Thank you for taking the time to respond! All very good points.
It’s probably worth saying that — although students like me exist — I am aware that so few students write good last minute essays or care at all about the tech in their writing process that I think I will adopt exactly this setup in the future. It’s very clever.
In particular, I hadn’t thought of this:
This is a really fantastic revision tool, quite apart from students who have challenged their grades, I can just imagine that keen students would find it useful to go through the writing process with a professor.
Cheers
There are several products out there that are designed to defeat the strategy of tracking changes in a Google Doc. One example is https://incogniton.com/knowledge%20center/paste-as-human-typing-fingerprint/ but there are others as well. Some go so far as “typing” synonyms and then changing them to the pasted text, or starting and then deleting false start sentences. I suspect that some even “proofread” by automatically word-spinning within the essay at the end of the process.
I no longer trust the integrity of anything submitted by a student that I didn’t watch them write in my presence. But so long as credits “earned” in asynchronous humanities courses are accepted by transfer institutions, our institutions (I’m in a community college setting) have no interest in forcing integrity on our students.
This will undoubtedly be unpopular, but it is too much to expect individual instructors to design sound policies for navigating the cataclysmic shifts in education AI is bringing/is sure to bring about. Such policies are very likely to be ineffective, inappropriate, arbitrary, or, at best, a lucky success (or some combination thereof). What’s needed are institution-level policies and disciplinary standards and best practices, backed by actual evidence — perhaps, in our case, established by the APA. In light of this conviction, I have not changed my teaching practices at all, and I will refrain from doing so until we have an evidence-based sense for what changes are needed and what works.
At an institutional level, one change that could be helpful is the creation of monitored writing labs, where students could complete assignments on university-owned machines with monitoring software and in-person overseers in the lab to detect and/or prevent cheating, while being barred from bringing in any electronics of their own. This would seem far better than cutting take-home papers out of the philosophy curriculum — truly a non-starter, in my opinion — or dedicating class time to writing, which will never be sufficient for longer papers (and in which students would still be using their own devices, doing god knows what when I’m not looking at their screens). But of course, this would require huge investment from institutions, which would likely be happier to just let students continue cheating and graduate as satisfied customers.
Until actual large-scale policies and practices are in place (which I don’t see forthcoming), I’ll simply continue making a heartfelt plea to my students at the start of each semester not to use LLMs because that undermines the very purpose of education, while being fully aware that, by and large, the actual purpose of higher education for decades now (restricting my focus to the US) has been white-collar vocational training and the reproduction of the class hierarchy, which I have no chance at all of combating successfully as a lone voice in the wilderness. AI just adds fuel to that fire. I’m a mere philosopher — not a firefighter.
Given the lack of a clear and plausible solution, there are many reasonable or understandable ways to respond to this AI crisis, including yours. But I’m concerned about your last remark: “I’m a mere philosopher — not a firefighter.”
With the LA wildfires still fresh in my mind, it seems that everyone is or can be a firefighter, if the fire gets close enough to you or your interests.
Even if you can’t save the entire neighborhood, you can still save your house and maybe neighboring ones. A heartfelt plea (thoughts and prayers?) is better than nothing, but it isn’t futile to be more engaged with fighting this AI fire.
Do what you can, with what you have.
I don’t get it. What evidence do you need that in-class exams solve the problem of having chatGPT write your essays for you?
Perhaps my remark that cutting essays in favor of purely in-person work is a non-starter would help you think through this delicate issue.
I find this line of thinking very sad. It can be summarized like this: Lots of philosophy professors believe the old way they learned to teach and assess is best. And they can’t be bothered to develop new methods. Are the students failing to do creative and critical thinking? No, it’s us, we are failing. AI is being used very well in many academic fields. It’s fine with me if you want to use blue books and ban AI, but don’t complain when your department is closed.
I think this is not a good summary, because I do not think the old way I learned to teach and assess is best. I did not learn merely the old way: I got the old way in all the courses I took and the new way in a course I TAed for and then later in a pedagogy class (which was where the teacher in the course I was a TA for learned the new way). The new way is way better than the old way. So I don’t think I can fairly be accused of just wanting to stick to the old way. And yet I share the concerns of the article.
I am also all ears with respect to people developing new methods for teaching that work better in light of AI. If you have even a single thing to say about this I am very interested, let alone if you have an entire revolutionary pedagogical theory in light of AI. I have already changed some stuff about my syllabi in light of AI and I would in principle be willing to change anything if I thought it would help students learn better. I just happen to be skeptical that this is an area where AI can do much good. I agree with this comment from Philosophy Teacher above:
AI in philosophy seems to me to be useful solely in contexts where one has already built up intellectual muscle. This sort of intellectual muscle can’t be built up effectively if one relies on AI, as far as I can tell. Indeed it seems like a sort of inverse correlation: the more one uses AI the less one puts oneself into a position to evaluate the use of AI in order to make sure it’s not bullshitting. So the best option is to use none for a while.
Sure, you might not think the old ways are best, which is good, but look at all the folks here who think writing papers is the only way for students to do philosophy.
The June 2024 issue of Teaching Philosophy is a special issue devoted to new ways to use AI in the philosophy classroom. The articles are well worth reading.
Who is saying that writing papers is the only way to do philosophy? What I understand these folks to be saying is that writing is one good way to do philosophy. And in order to do philosophy this way, one must be the one who is doing the writing. Of course, one can also do philosophy by reading philosophical texts and discussing them with others. But again, you have to read the texts and participate in the discussions for this to count. The worry is that many students aren’t actually putting in the effort to do philosophy or acquire the intellectual virtues to do it well.
Or you could do philosophy by having a dialog with an AI, or by training an AI.
Sure. But are you saying that this is the only way or even the best way to do philosophy? Again, the issue isn’t that AI can’t be utilized by dedicated students to hone their philosophical craft, or improve the quality of their writing and speech for that matter. (We also have libraries for that.) To me, the issue is whether AI will be utilized in such a way that students will all too easily avoid the time and effort that everyone at some point in their education must spend in order to read, write, think, and argue well. While it isn’t impossible to imagine students doing philosophy without writing papers — Socrates did just fine — I do think, along with Socratic dialogue, it is still one of the best ways to learn how to do philosophy. I also think that AI can’t replace good teachers or the educational experience of coming together with others to attend to a shared topic. Call me old fashioned I guess.
We need a mirror question of doing research in the age of AI. Philosophy as a workplace is as vulnerable as any.
AI has become my assistant in all sorts of ways, to the extent that I am highly reliant upon it, as much as how one may depend on an extremely capable assistant.
What alarms me is the amount of cognitive offloading. I am now incapable of getting around to doing what AI can do for me. In the long term, it will for sure undermine my capabilities in some respects. And what if future AI becomes even more capable?
But, as of now, I am still doing everything that AI can’t do for me, including checking its correctness. I do not let it generate philosophical text for me, for example, which, among other things, would not preserve my distinct voice. I wonder, however, whether the situation is different for students who are still in their early formative years. I do not think less nor use it to cheat, but this may not be the case for others, or in the future.
What I try to say, in summary, is that I do feel the benefits of AI but am alarmed by my reliance on it, and concerned by others’ more harmful reliance on it (even including fellow researchers, I am afraid) and future development.
Genuine question: if you are alarmed, then why don’t you stop relying on it? Is it that the benefits are so great? Something else?
Maybe I will find a time for a more detailed explanation. For now, can I just say that I find it super useful? And I say this as someone who is *not* yet charmed by AI philosophy (unlike a post a while ago which compares AI with smart colleagues).
Speaking for myself, it’s because it makes many parts of my life and work much easier. I have never had that much energy and struggle with motivation so I will always go back to it. So day to day it is really helpful even if the long-run negatives outweigh that. Of course if I were capable of focusing only on my long-run benefits I wouldn’t be reading Daily Nous in bed eating a tub of ice cream, and yet here we are.
Yes I can say the same as Cap. But here are some more detailed examples. AI is extremely well-rounded compared to humans. One of my weaker suits relative to other academics is literature searching, and AI has been immensely helpful (though surely imperfect) in finding relevant literature across disciplines over decades. But it is not only about complementing weaker ones: reading comprehension has been one of my strongest suits, and with the aid of AI, there is almost no text (in any discipline) that I cannot read at the level of understanding that satisfies me.
Can you give some advice on how to do this? Are you using ChatGPT deep research?
Yes, I have been using deep research since it came out and had to subscribe to pro (funded, happily). Not sure if it is an illusion, but I do notice that quality tends to deteriorate over time–for the same version, the earlier the better.
There are at least two precedents to the student use of AI (and the danger AI poses to thinking). First, recall the criticism that Socrates levels in the Platonic dialogues against writing, a technology then relatively new to Greek culture. Socrates wrote nothing. Plato did write a lot and did so in a manner that seems intended to avoid the dangers that writing poses to philosophical thinking (whether he succeeded is another matter). Secondly, what did mathematics teachers do to keep students learning how to solve math problems once calculators (or the earlier abacus or the slide rule) became ubiquitous? One strategy that Plato employed with writing and that mathematics teachers employ is to incorporate the new “dangerous” technology into one’s pedagogy. With AI, this might also make sense given that whatever we are supposed to be teaching students to do (viz., think and write critically) they will henceforth (once out of school) be doing with the assistance of AI. The concern over students writing their own philosophy papers is itself interesting, given that Plato (or Socrates) would have been concerned primarily about the use of writing in itself. In a more pessimistic vein, I believe that AI is hardly the greatest problem philosophy professors face at this moment when more existential threats to our profession loom large.
But what mathematics teachers do to keep students learning how to solve math problems is precisely to mostly forbid the use of calculators etc. until students are sufficiently skilled at the tasks that can partially be off-loaded onto such technologies.
I can hardly think of a clearer analogy to illustrate the appropriateness of banning new technologies from much of the education process than mathematics. Unless prevented from doing so, many primary school children will try to sabotage their arithmetic education with calculators, and many high school/college students will try to sabotage their mathematics education by just putting everything into wolfram alpha (or something like that). Similarly, unless prevented from doing so, many college students will try to sabotage their philosophy/writing/thinking education by overusing generative AI technologies.
Also, for what it’s worth, as someone who was up until very recently a college student and who knows many current college students, my impression is that way too many people weighing in on this debate seem to be very naive about how much students use generative AI, how much time and effort it saves them, and how much learning it prevents them from having to do. (Not a point aimed at you)