Teachers: Was the Semester AI-pocalyptic or Was It AI-OK?


A survey conducted at the end of last year indicated that 30% of college students had used ChatGPT for schoolwork. Undoubtedly, the number has gone up since then. Teachers: what have your experiences been like with student use of ChatGPT and other large language models (LLMs)?

Here are some questions I’m curious about:

  • Did you talk to your students about how they may or may not use LLMs on work for your courses?
  • Have you noticed, or do you suspect, that your students have used LLMs illicitly on assignments for your courses?
  • Have you attempted to detect illicit LLM use among your students, and if so, what methods or technology did you use?
  • If you reported a student for illicit LLM use, how did your institution investigate and adjudicate the case?
  • Have you noticed a change in student performance that you suspect is attributable to increased prevalence of LLMs?
  • Did you incorporate LLM-use into assignments, and if so, how did that go?
  • Did you change or add assignments (or their mechanics/administration) that do not ask students to use the technology, in response to increased awareness of LLMs? (e.g., in-class blue book exams, oral exams)
  • Have your LLM-related experiences this semester prompted you to think you ought to change how you teach?

I’m also curious about which other questions I should be asking about this.

63 Comments
benjamin s yost
4 months ago

On the first day of my 2xxx Social/Political class, I explained my view on the intellectual drawbacks of using GAI. On the second day, I had them anonymously vote on whether they would be allowed to use GAI. I noted that if the class voted against using GAI, any student who did so would be wronging his or her peers. They voted against using GAI by about 70-30, maybe a little closer. And I had no problems with it on discussion fora or papers, at least that I could detect. In my first-year writing seminar I didn’t have any problems either, but that class required drafts and some prewriting. Next semester I am teaching a class that I usually assess with take-home exams, and I’m thinking of moving them to in-class. I feel the temptation is just too great in such circumstances.

Kate Norlock
4 months ago

I am definitely changing and reweighting some assignments on my course outlines for next term as a result of the rise of LLMs. A colleague shared that one student said, “Everyone’s using it. If anyone denies it, they’re lying to you. It’s part of the writing process now, and it will always be part of every student’s writing process from now on.” I’ve come to the conclusion that wishing it away is like my long-ago wish for Google to disappear when students started Googling everything instead of trying to figure out how to find out information in more effective but less instant ways. The fact is that the technology is here now, and it will reduce many individuals’ motivation to be more resourceful. This is not the world I prefer, but it’s here, and I am adjusting. It’s neither apocalyptic nor OK with me. It’s in a mildly depressing middle. (I think that’s the most 2020s thing I’ve written, ha!)

benjamin s yost
Reply to  Kate Norlock
4 months ago

I think that student comment is overblown, at least for now. I polled my students and in two classes combined a bit over half said they had used GAI at all in a writing context, and very few of them, perhaps none of them, had ever used it to compose text. A lot of them said it wouldn’t feel like it was their writing if they did that. And as I noted in my comment, my first-year writing seminar students did not use it either. I know this because the quality of the drafts was quite crappy… definitely lower than it would have been with GAI (I know this because I tested my prompts with GAI). I suppose it also depends what one means by using AI as part of the writing process. Grammarly uses AI (I think it’s incorporating LLM tools, but not 100% sure), and Word will soon, if it doesn’t now.

Kate Norlock
Reply to  benjamin s yost
4 months ago

Oh agreed, I’m certain that student’s comment is overstating the case as well. (Yeah, people keep mentioning the Grammarly thing too but I’m really talking about the drafting of whole papers.)
I’m still finding it all middlingly sad, though.

Patrick Lin
Reply to  benjamin s yost
4 months ago

If you can’t beat ’em, then join ’em. I don’t like it either, but this is the Way.

ModallyFragile
Reply to  Kate Norlock
4 months ago

I’ve come to the conclusion that wishing it away is like my long-ago wish for Google to disappear when students started Googling everything instead of trying to figure out how to find out information in more effective but less instant ways.

This actually gives me some hope. Google is an enormously useful tool, and I’m sure its development has been driven in part by its uptake across domains including academia. There’s nothing wrong with using Google these days (nothing philosophically/academically wrong anyway), and improvements in Google itself and googling abilities have driven improvements in other search engines (like the PhilPapers search engine and online library catalogues). This has all resulted in new, really good ways of finding information. Perhaps something similar will happen with current AI tooling.

benjamin s yost
4 months ago

I think my first comment got eaten…. Anyway, after some discussion, I had my 2xxx Social/Political class vote on whether they would be able to use GAI. They voted against, about 70-30. I noted that as a result, anyone using GAI would be wronging their classmates. I could not detect any GAI use on either the discussion fora or their papers. My first-year writing seminar requires drafts, and I didn’t notice any issues there either, even though I just prohibited GAI use as course policy. Next semester I have a class for which I usually assign take home exams. I will likely move those to in class, as the temptation with take homes seems too great, and AI use harder to detect.

Aeon J. Skoble
Reply to  benjamin s yost
4 months ago

Yes, in all my 2xx classes I have switched (back) to blue-book exams. Still optimistically (naively?) hoping I can continue to assign essays in 3xx and 4xx classes, but for 2xx, it’s over; all in-class writing now.

Python
4 months ago

I do not mean this in *any way* to be a critique of Daily Nous or Justin–who I think does a great service to the profession:

I don’t know about others, but I can’t find the motivation to care about the sort of issue in this post or really about any other sort of issue these days–not that they don’t matter/aren’t important. I’m just so utterly occupied and saddened by the suffering in Gaza.

This isn’t unusual for me–I’m usually occupied by the staggering suffering humans inflict on animals every day. But there’s a weird compartmentalization that can happen when I see new examples of cruelty towards animals or reflect on the animal suffering I already know about. And I’m able to engage in and focus on other ethical matters and issues related to our profession. Maybe it’s because after so many years, I’ve become accustomed to how so many incredibly thoughtful, otherwise kind people willfully choose to ignore it (by not even trying to cut out meat, let alone become vegetarian or hopefully vegan) or how they concoct outrageously convoluted or outright implausible justifications for their participation in that suffering.

But with Gaza, things feel different. I know that these same people generally care about other humans, and since caring about humans doesn’t typically involve much sacrifice on their part, I guess I expect to see them as preoccupied with the suffering there as I am. I just don’t understand how anyone can care about the effects of AI on teaching, gift guides for philosophers, or much of anything else related to our profession or academia right now. Again, this is no critique of Justin (I understand fully why he cannot simply focus on what is happening in Israel or Gaza).

A9
Reply to  Python
4 months ago

What is happening now in Gaza is surely horrible. I’m curious though. Do you have this reaction for all the other horrible things that happen around the globe on a daily / monthly / yearly basis? Horrible things in the Sahel, Congo, Yemen, South Sudan, even Ukraine, and other places (see e.g. https://concernusa.org/news/worlds-worst-humanitarian-crises/), plus ongoing food insecurity and preventable deadly diseases in developing countries, not to mention all the harm that will happen because of climate change. What is special about the Gaza situation that has you incapable of caring about this serious problem for education and for intellectual life more generally? Are you okay at other times?

Python
Reply to  A9
4 months ago

Absolutely fair question. And the answer is that I am gravely upset by the other cases you mention where there is cruelty at horrible scale. There are a lot of reasons I am distinctly concerned about Gaza, but the main one boils down to the significant disparity in military and economic power between the relevant two groups.

Python
Reply to  Python
4 months ago

I’ll also add another central reason: I don’t see philosophers attempting to justify the suffering in Congo, Yemen, Ukraine, etc.

Wayne
Reply to  Python
4 months ago

I kind of feel the same way.

Thornton
4 months ago

The threat of AI seemed to me totally overblown. I’m at a non-selective university where writing tends to be mediocre at best. One student this semester submitted a paper that was so far above that mediocre level of writing that it was obviously not the student’s own work (and the student wrote two other mediocre papers, so it was demonstrably obvious that she couldn’t write at that level). That doesn’t stop students from using AI to help write a paper–but if the paper itself is pure AI, it will look inherently suspicious at my institution. Maybe that’s not the case at Ivies and R1 institutions–but of course most of us don’t ever teach students of that caliber.

Nathan
4 months ago

I went back to blue book exams for the first time since 2019 because I was worried about ChatGPT in online exams. It’s a waste of my time proctoring and a loss of class time for studying or further lectures or discussions, but I’m not sure what else to do given the extent of cheating online I’ve seen in previous semesters.

Colleague123
4 months ago

My colleagues in a college writing center have noticed that many a student who seeks advice on their writing will have ChatGPT open in another tab on their computer. When asked, many will say they’ve used ChatGPT in the drafting or editing stage. They are disinclined to see it as hugely problematic; they’ve seen its use becoming normalized. Some professors permit using gen AI to edit a paper but not to create the initial draft. This carries over to other classes — even when a professor does not permit using gen AI, students assume it can’t be a major infraction (since some professors allow it). In some conversations, it sounds like the line between editing and drafting is blurred.

Confusion results from the use of gen AI not being classified as plagiarism. At my school, it’s classified as unauthorized use of an aid (like using a calculator when not permitted to). Many students hear, “It’s officially not plagiarism,” and they don’t seem to register that it’s still an academic offence if the professor hasn’t permitted use of gen AI.

The students are mostly using the clumsier free version of Chat GPT, not the advanced more-human-than-human paid version. They seem to know about some of the free version’s defects (e.g., verbosity, cliché) and want advice on how to improve a draft.

Eric Steinhart
4 months ago

My experience (which is merely anecdotal) is that students have come to dislike LLMs like ChatGPT. They’ve expressed two reasons:

They hate the way it writes. They do seem to take some pride in the notion that their writing is in their voice and belongs to them.

They learned that professors can immediately recognize ChatGPT by its style, and they get punished for that (I think they had bad experiences last semester).

Even when I encouraged them to use ChatGPT for drafting, etc., most didn’t use it. A few did use it for drafting, but very, very few.

Kyle Hodge
4 months ago

Did you talk to your students about how they may or may not use LLMs on work for your courses?

  • I mentioned it to them and told them that I assumed they were aware of such tools. I told them I don’t care if they use it to improve their grammar, phrasing, etc., since I cannot effectively monitor or prevent their doing so, though I also told them it was entirely at my discretion to reduce a grade if I suspected unrefined LLM output. I also tell them that it reduces the important skill-building the course is designed to facilitate, so I don’t recommend it in general, considering its output isn’t even that good and is easy enough for me to spot, since I am interested in it myself and regularly mess around with it.

Have you noticed, or do you suspect, that your students have used LLMs illicitly on assignments for your courses?

  • I have definitely had students use LLMs illicitly on assignments.

Have you attempted to detect illicit LLM use among your students, and if so, what methods or technology did you use?

  • My eyeballs and brain mostly. You get a sense of the style of LLMs if you play around with them enough.

If you reported a student for illicit LLM use, how did your institution investigate and adjudicate the case?

  • I wouldn’t bother reporting it, since no single assignment in most of my courses carries enough weight to make a formal academic integrity case worthwhile. I usually just tank the grade on the assignment, tell them I suspect it was AI generated, and move on.

Have you noticed a change in student performance that you suspect is attributable to increased prevalence of LLMs?

  • No. Student performance is still on average comfortably below college level.

Did you incorporate LLM-use into assignments, and if so, how did that go?

  • No, and I think the technology is at present too crappy to be worth anything.

Did you change or add assignments (or their mechanics/administration) that do not ask students to use the technology, in response to increased awareness of LLMs? (e.g., in-class blue book exams, oral exams)

  • No. I simply added the threat that I have discretion to penalize whatever I suspect is generated by an LLM. Assignments that are susceptible to LLM-based integrity violations are also assignments that give students time to write, revise, and think over their own thoughts on important subjects. I don’t want to take that away from any good faith students who have no intention of using the LLMs to cheat.

Have your LLM-related experiences this semester prompted you to think you ought to change how you teach?

  • No. So far I think my experiences have confirmed that my attitude is appropriate since I see enough unlettered and poor responses to assume that it is not being used frequently. The scores I see on multiple choice online exams do not suggest cheating either. Students who use LLMs to cheat don’t conceal it very well, and LLM output is (often enough) detectable if you play around with it enough to see what its general style is like.

Eric Steinhart mentions that some of his students have learned that available LLMs write like (legible) crap and are easily detectable. I wish I had asked my students more about their experiences with LLMs. As an aside on this, I spent more time asking my students about low participation in synchronous online classes, to which many of them responded that they treat live lectures like a live podcast while they perform other tasks, such as working, driving, or caring for loved ones. No idea how to work around that right now.

Naomi
Reply to  Kyle Hodge
4 months ago

“I simply added the threat that I have discretion to penalize whatever I suspect is generated by an LLM.”

But surely you don’t actually have that discretion, even though you describe not just threatening with it but actually using it? Mere suspicion used by an instructor to ‘tank grades’ without involving a dean / provost / examination committee sounds pretty worrisome to me.

Patrick Lin
Reply to  Naomi
4 months ago

You don’t need to actually have the discretion or power to do x in order to make an effective threat to do x. Per deterrence theory, all you need is for the other parties to believe you can and will do x.

Naomi
Reply to  Patrick Lin
4 months ago

Sure — my point is not that an empty threat is ineffective, my point is that it’s morally problematic.

Patrick Lin
Reply to  Naomi
4 months ago

An empty threat is morally problematic? I’d be curious what the argument is for that, esp. given the goal here of deterrence away from a more problematic scenario.

Is a real threat morally better, e.g., threatening to use a WMD that you have vs. a bluff because you have no WMDs?

Kenny Easwaran
Reply to  Naomi
4 months ago

If some component of the grade depends on how interesting or well-written a paper is, then it would be very reasonable to mark those points down if you think a paper is bland and boring enough that it sounds like an AI wrote it.

Naomi
Reply to  Kenny Easwaran
4 months ago

I very much agree. But I think applying a substantive and relevant criterion is something altogether different than penalising a student for suspected fraud (and it’s easy to come up with cases where the two come apart — e.g. the work is written at a level much higher than what that particular student would typically achieve).

Kyle Hodge
Reply to  Naomi
4 months ago

I think instructors have that discretion. I mean, I have discretion to penalize, and discretion with how much to penalize, a student assignment that is suspiciously similar to the work of another student, content in a class reading, or other source material. Analogously, I have discretion to penalize work that is suspiciously similar to the output of commonly used and widely available AI chat services. I think you’re being uncharitable when you say it is mere suspicion; the suspicion is warranted where the similarity is present.

Naomi
Reply to  Kyle Hodge
4 months ago

I didn’t mean to be uncharitable. Maybe the policy at your institution is just very different from what I’ve seen in other places, where fraud cannot be formally determined by an instructor alone and it’s not up to instructors to penalise fraud or other forms of academic misconduct. Anyway, for people like the junior lecturer commenting above it is probably good to know that what you describe would not be okay in many places.

Geoffrey Bagwell
Reply to  Naomi
4 months ago

Naomi is right that instructors normally do *not* have the legal authority to impose sanctions on students for violations of student conduct codes. This includes plagiarism, which in every state that I know of is classified as a form of academic dishonesty by statute and subject to legal penalties. Again, in every state that I know of, sanctions can only be issued by the institution and its designated student conduct officer. Faculty are not student conduct officers and so do not have the power to impose sanctions on students for academic dishonesty. Some institutions might designate faculty as student conduct officers, but such institutions would have to do so explicitly, and this would be highly unusual because it invites litigation, which they generally want to avoid.

I realize that it has been common practice for faculty to sanction students for plagiarism by giving them a failing grade for a plagiarized assignment or by giving them a failing grade for the course. But my understanding is that this common practice has always technically been illegal because it violates students’ due process rights under Title 20.

Justin Kalef
Reply to  Geoffrey Bagwell
4 months ago

Could you please quote to us from that Section of Title 20? I’ve never heard of such a thing, and many universities’ official documents explicitly say that instructors may fail students on an assignment or course in response to an academic integrity violation, in addition to or instead of reporting the student to the academic integrity office.

Geoffrey Bagwell
Reply to  Justin Kalef
4 months ago

Thanks, Justin! There is no specific provision of Title 20 which outlines a student’s due process rights in detail, but the relevant section of Title 20 is chapter 31, subchapter 3, part 4, which among other things guarantees that students have due process rights in general in accord with the 14th Amendment. So any institution which receives federal money or disburses federal student aid must guarantee that students have due process rights in general. Additionally, the Supreme Court held in Goss v. Lopez that institutions rather than individuals impose sanctions for student misconduct and that students’ due process rights include the right to a hearing and the right to appeal any sanctions imposed. The implication is that faculty cannot legally impose sanctions for plagiarism unless the student is afforded a hearing and a chance to appeal the decision.

For what it’s worth, I can understand why what I wrote might sound strange. The situation I am describing is, I think, a case of university policy catching up with established law. For instance, it was only two years ago that the Supreme Court in my home state of Washington ruled that faculty cannot impose sanctions on students for violating an institution’s student code of conduct because faculty are not student conduct officers and cannot provide students with a hearing or the chance to appeal.

Louis Zapst
Louis Zapst
Reply to  Geoffrey Bagwell
4 months ago

There seems to be some confusion here. A teacher failing a student’s plagiarized (or AI-assisted) assignment is not a sanction, but an evaluation of the assignment similar to giving zero credit when the assignment is not turned in or giving less credit for a late assignment. A sanction may separately be levied by the institution, which, for example, could suspend or expel a student for serial plagiarism. Even the teacher’s finding of plagiarism and giving zero credit or a lower grade, though, is subject to appeal by the student at every institution I know.

Patrick Lin
Reply to  Louis Zapst
4 months ago

To Louis’ point: if Title 20 and a student’s due process rights were interpreted so broadly as to prohibit giving a failing grade for a paper suspected of AI cheating, then instructors would not be empowered to grade papers at all.

A student who earned a grade other than an A could demand “due process” and escalate the issue to the institutional level. But that’s a silly interpretation, of course (for all but the most egregious cases, e.g., actual discrimination).

That said, I might be ok with some university committee doing all of my grading…

Geoffrey Bagwell
Reply to  Patrick Lin
4 months ago

Thanks, Patrick. You make a good point about the breadth of Title 20, which is important to keep in mind. But the federal code is partly written to give states latitude to implement their own rules so long as they stick to the boundaries of the federal code. And many states have statutes which distinguish between sanctions and grades. Grades are academic and awarded by subject matter experts (not the institution as a whole). Sanctions are not academic and are issued by the institution. This kind of distinction is supposed to protect a faculty member’s right to evaluate a student’s work according to the standards of that faculty member’s field as well as the student’s due process rights under the 14th Amendment.

The trouble is that grades have been, and continue to be, used to sanction students, especially in situations of academic dishonesty. This has obscured the distinction, but I don’t think it undoes it, though it probably depends on which state we are talking about. Given Title 20 and the Supreme Court’s opinions on the matter, it seems to me that the federal law is settled even if states and the institutions in them haven’t caught up.

Geoffrey Bagwell
Reply to  Louis Zapst
4 months ago

Thanks, Louis! The clarification you mention is necessary and important to make. There is a great difference between failing a student for poor performance and sanctioning a student for violating the student code of conduct. That’s why in my comments above I have stressed that faculty cannot sanction a student for violating the institution’s student code of conduct. This does not imply that a faculty member cannot give a student a failing grade for an assignment if the failing grade is given for reasons having to do with poor performance on the assignment (like not doing it according to the directions, etc.). This is not a sanction. I grant that whether a failing grade is given for plagiarism or for poor performance is in practice ambiguous. But other common practices are not. For instance, faculty often have provisions in their syllabi that specify that repeated instances of plagiarism can result in a failing grade for the entire course. This is a sanction. If this means that the instructor will penalize a student who is caught plagiarizing more than once, then these provisions are not legally enforceable, and this opens the door to litigation.

Enrico Matassa
4 months ago

Nothing makes me sadder than all the people on DN who are like: It’s all blue books all the time now baby! I think two things can be true and are: 1. In their professional lives students will use Chat GPT a lot and a lot of gruntwork writing will get done by Chat GPT. Most people I know with white collar jobs who are halfway clever are already doing that. 2. We need to make sure students learn how to write without using Chat GPT. I take it no engineer or economist is ever going to have to calculate the cube root of 37 in her head or graph x^3-5x+2 by hand in her working life, but I think we’d all be very worried indeed if we found many people who graduated with engineering or econ degrees couldn’t do fairly basic math like that. They need enough nuts and bolts math to know what to put into the machine and also to recognize when the machine has spat back something stupid, whether because of user error or something else. The same I think holds true for writing. To get Chat GPT to do your work for you, you have to actually understand how to write decently.
As for me personally, I’ve seen a lot of it this semester. I’ve taken the time to run my writing topics through it and I’ve tried to only keep the ones it messes up horribly in some way or other. (Like a lot of AI it’s weird. It’ll do pretty well with a question that seems hard to us and completely crash on something much more basic to a human being.) I’ve also been explicit about expecting writing assignments to be tied to specific lecture material or what we’ve covered in discussion sections. So I’ve failed most students who used it without having to get into issues of proving they used it. But the extent to which students used it (I’d say 25%) is depressing. The students who got F after F using it and kept plowing forward are a particularly frustrating lot.

Dan Pallies
Reply to  Enrico Matassa
4 months ago

I’m sympathetic to this, but my problem is that many of my students have very little experience with argumentative writing. For these students, an essay by Chat GPT might be comparable to what they would write on their own, despite the fact that Chat GPT is prone to making bizarre mistakes. The students make bizarre mistakes too! And the standards for this course are not high enough to justify Fs for papers that make these kinds of bizarre mistakes.

Enrico Matassa
Reply to  Dan Pallies
4 months ago

If I, say, spend ten minutes in two separate lectures talking about how utilitarians do not value fairness in and of itself and how this is a major problem with the theory, and then a whole paragraph of your response says “Utilitarians would oppose y because it is unfair,” I don’t have much problem bringing the hammer down. I work at a CC, I’m not by nature a hardass grader, but we’ve got to have some minimal standards. I don’t say this out of some desire to play meritocracy’s cop. To do otherwise shortchanges our students.

CLL
4 months ago

This semester, I taught symbolic logic online. (Online teaching is not my preference, but it is all I have been assigned since the start of the pandemic.) I have a syllabus policy barring the use of ChatGPT and its ilk, and I have restated this policy in announcements and on the first few assignment instructions.

In a class of 40, I caught 5 students submitting ChatGPT generated answers (two of them continued to do so after being caught and referred to the student discipline office). It is possible other students used it but were more savvy about avoiding detection. The main tell for the students who were caught is using different symbols and formatting from those used in the course (and using inconsistent symbol sets and formatting from proof to proof). ChatGPT and Bard randomly vary these (the most amusing is when a student uses 1’s and 0’s instead of T’s and F’s on truth tables). Another common tell was using rules or techniques (e.g. conditional proofs) before they were introduced.

This used to be the only class I didn’t really have to worry about cheating in. What a bleak world to try to teach in….

Brian Robinson
4 months ago

I did see some students using AI, which I forbid in my Intro to Phil course. It was, however, far fewer than last Spring, because I made it much harder to use ChatGPT. (https://brobinson.info/surviving-ai-this-term/) I did report these few students, and my assoc. dean agreed with my conclusions because they were based on more than just TurnItIn’s AI checker.

New teacher
Reply to  Brian Robinson
4 months ago

Thanks for linking this post. I like your approach to assignment design.

At what point do you get your chair or associate dean involved? I’m a new instructor with zero mentoring, so I’m sorry if this is a very basic question.

Do you fail students that you suspect beyond a shadow of a doubt, and then just get the chair, etc., involved if the student disputes?

Louis Zapst
Reply to  New teacher
4 months ago

My advice is to ask your colleagues what they do and ask the relevant administrators what they want you to do. Some institutions have strict policies about reporting all academic dishonesty to chair/dean/provost. Others don’t. Some administrations want you to report all cases so that they can handle them through clear procedures and keep track of serial offenders. Others will turn it around and make you the bad guy who is creating problems for administrators or who is unfairly accusing students who need the administration’s protection from you.

On the Market
4 months ago

The AIs produce, at best, essays I’d grade with a C. I ran a few of my standard prompts through ChatGPT and slightly adjusted my grading rubrics to penalize what I inferred to be typical AI mistakes.

I’ve always required essays that substantively cite some of the assigned readings. Some cursory experimentation with ChatGPT indicated to me that the AI has a hard time meeting this requirement, either outright hallucinating or giving wrong page references.

I just tell this to my students outright. I’m a very easy A if you work with me. AI is high-risk, low-reward. If I catch it, it’s plagiarism. If I don’t catch it, it’s still a bad grade.

Just to make that point, I walked them through two dialogues I had with ChatGPT where I got it to make provably wrong claims, but that are wrong in a subtle way that you won’t catch if you haven’t studied for my class.

I don’t think I’ve seen a single AI-generated essay so far. I can’t tell if anyone used an AI to generate an outline or workshop an idea. But if so, I don’t think I would mind. To produce a decent essay from what the AI can give you still requires all the skills and knowledge that I’m testing for.

I’ll keep an eye on the technology, though, and will experiment with GPT4 over winter break. If it catches up with me, I’m torn. In-person exams are the obvious solution, but at some point we should wonder whether we are just obstinately refusing a new reality. All my industry contacts use the new technology and consider it a resource for, among many other purposes, debugging code, setting up narrative structures, and workshopping ideas.

We’ll either have to find a way to incorporate this into examination, or rethink examination as a whole. I’m personally in favor of abolishing grades altogether, but that’s a larger conversation.

Derek Baker
Reply to  On the Market
4 months ago

This matches my experience for the most part. I have a few papers that are likely to be generated by AI, but they’re all pretty bad on the merits. It’s grammatically impeccable filler that does not engage with any of the assigned readings.

GradStu
Reply to  Derek Baker
4 months ago

“It’s grammatically impeccable filler that does not engage with any of the assigned readings”

That’s better than the grammatically defective filler that doesn’t engage with any of the assigned readings which many of my students turn in!

naive skeptic
4 months ago

“which other questions I should be asking about this?”

For me the big question is: what comes next?

Obviously the AI development isn’t done, so where do we go from here? I’m pretty sure the main AI company shut down its generative AI detection tool because it just didn’t work. The recent models supposedly are able to do text, image, video, and audio all at the same time, which might imply a wider range of use cases.

If we have a better idea of what is coming down the pipe, we might be able to get out ahead of it instead of being blindsided like we were with ChatGPT. Any ideas what we might be looking forward to in the next few years?

Master Debater
Reply to  naive skeptic
4 months ago

Respectfully, people were only blindsided by ChatGPT because they weren’t paying attention to what was right in front of them. Even Daily Nous posted about GPT-3 back in 2020 (https://dailynous.com/2020/07/30/philosophers-gpt-3/).

So the problem isn’t so much about what might be coming next as it is getting people to take what’s coming next seriously.

Demora Lized
4 months ago

Cheating was endemic at my institution beforehand. I’d catch at least 5 a semester per course, and suspect many more (contract cheating, in particular, was very popular). When I gave timed take-home exams, I’d find them posted to Chegg in real time.

Now, it’s really out of control. I’m catching twenty or more, and strongly suspect half again as many. A colleague recently caught 30/35 of her students.

It’s really, really demoralizing. Basically, students here have just stopped doing _any_ of the work, including low-stakes stuff like discussion posts. They have outsourced everything to the AI. As far as I can see, there’s no point building it into anything because they’re just not willing to do any of the work for themselves in the first place, and because they aren’t acquiring the basic skills needed to be able to actually use the AI to do something decent.

Our teaching and learning centre keeps sending out-of-touch memos telling us it’s our fault for being bad or boring instructors, and telling us that we have to show them how to use the AI productively, then have them reflect on its use. But, again, (1) they don’t have the basic skills necessary, and (2) they’re not interested in doing any of the work themselves. You try, but you just get more AI bullshit.

I don’t know what to do. I’ve pretty much lost all trust in my students, and that’s not a good feeling (or a good situation, either). The hassle associated with all those zeroes is pretty high, too. Not to mention the AI-generated emails they send.

Uuuuuuuuuuurgh.

Matt L
4 months ago

In my administrative law class this term, for the mid-term writing assignment, I told students that they were not allowed to use ChatGPT or similar such things. If many students used it in the actual writing of their assignments, it’s not a good advertisement for the product, because the writing was, overall, not that good. The students who did better had essays that didn’t “sound” like Chat GPT, but I didn’t try too hard to check. What I was told was that students were using it to find cases to discuss/support their arguments. To me, this is pretty distressing, because it is significantly worse than Google (let alone specific legal search tools like Lexis) for tasks like this, given its tendency to simply make things up. (This has, of course, been a big real-world problem for lawyers trying to use it.) The term before this one I had a couple of cases where I suspected students had done something like this, because they were citing cases that, while real, had no or extremely little relevance to the subject matter under discussion. Those students received poor grades, but because their grades were already very poor, I didn’t think it was worth my time to try to prove cheating, by Chat GPT or otherwise.

LetsBeReal
4 months ago

The alarmists have it right, if my experience is any guide. By all appearances, there has been a marked uptick in cheating via AI since around last winter. I detect it in a number of ways: a sudden drop in office hours requests before writing assignments are due (a 90% drop from what it consistently was 10 years prior); uniformly solid, professional vocabulary even for students whose emails are barely conversational; when I run the assignments through ChatGPT, I find a half dozen markers — terms it always uses and I and my assigned readings never use — and they’re in many assignments. And a few other giveaways.

I know it’s happening and I don’t lie to myself about what it means: many students are simply not doing any of the work we build our courses around. No reading, no lectures, and no writing of their own.

And I’m frankly fed up with people sugarcoating this by saying the work quality suffers anyway, AI is bad at philosophy, it’s the new normal so there must be a way to live with it, etc. Horseshit. Maybe AI won’t get them an A, but it could get them a B, and for no effort. Too many students will take that option. This is not just an imperfect world, it is a grave threat to what we do, and we should stop pretending it isn’t.

We need to fight this. I’ve already tried a number of approaches, with at least some apparent success, though it’s hard to tell. One is a return to timed, in-class exams as much as possible (it isn’t always possible).
Another involves crafting a new kind of assignment, using terms and exercises invented solely for the course, so there’s no place for AI or anyone else to access them elsewhere. AI performs horribly on these, because it makes up answers to questions it doesn’t understand. Another step, a little more obvious, is making clear to students that we do not allow it and won’t tolerate it.

I’ve had students confess and apologize to me. It’s a start. I’m open to other ideas and to sharing more of my own. This isn’t over.

Charles Pigden
Reply to  LetsBeReal
4 months ago

Dear LetsBeReal
You talk about minimising or detecting ChatGPT cheating by ‘crafting a new kind of assignment, using terms and exercises invented solely for the course, so there’s no place for AI or anyone else to access them elsewhere’

Now I have always set essay questions (and in the past exam questions) that were deliberately tailored to my courses (some of which are quite idiosyncratic) and which were designed to be difficult to answer if you had not attended class or had not done the relevant readings and processed them in an intelligent way. And when I say ‘always’ I am talking about a teaching career that goes back to 1987.  

So my question for you is this: ‘Is that what you mean by ‘crafting terms and exercises invented solely for the course’ or are you thinking of something else?’ If that *is* what you mean then maybe I won’t have to modify my (hitherto successful) teaching & assessment practices all that much to make them reasonably cheat-proof. (I really don’t want to do this during the last five years of my teaching career since it would be a colossal effort.) If not – and if colossal effort is called for – could you be a bit more specific about the kinds of assessment exercises that have helped you to minimise ChatGPT-based cheating or to make it more detectable?   

A bit of context for this question. 

1) In recent years I have moved away from end-of-term exams to an assessment regime in which 80% to 95% of the mark depends on three longish essays. I dread going back to a system in which 50-60% of the mark depends on a final exam  
a) because I hate ploughing my way through hand-written assignments 
and 
b) because I think that exams are a bad way of assessing student performance, their only merit being that they are very nearly cheat-proof. 

 However I may have to make the Great Leap Backwards if AI-based cheating becomes rife. 

2) I like to cram all my teaching into one semester, freeing up the other for research. This means that I taught no courses in the second semester of 2023 and won’t be doing much until the second semester of 2024. I did not see any evidence of ChatGPT-based cheating in the Fall semester of 2023 (I live in the Antipodes) but if I had been teaching in the Spring semester things might have been otherwise. Technological change has been moving pretty fast. Ditto, I suspect, the student response.

So is it possible to devise essay questions that are likely to trip up ChatGPT and consequently ChatGPT cheats? If so, how? 

Nick Byrd
4 months ago

1. For paper assignments: Since Fall 2022, I’ve required students to turn in papers as argument maps. My papers require students to generate an argument for or against a view that I choose, then form their best objection, and then provide their best counter objection. My students have always been allowed to consult peers on paper writing. They can also consult a genAI system. And if they have genAI write the whole essay, they still need to turn the essay into an argument map (something that requires some understanding and skill). Mapping an argument can also reveal its flaws, allowing the student an opportunity to improve upon the ideas they’ve learned from the reading, their peers, or a chatbot — mapping a bad argument or mapping an argument badly can also reveal what a student hasn’t learned.

2. A chatbot assignment: students in Philosophy of Mind have to give a genAI system various cognitive tests to determine whether it has achieved what Microsoft has described as “human-level reasoning”—thanks to Cameron Buckner for providing the core of this idea. Then they have to map an argument about whether it has achieved this, map their best objection, and map their best counter objection.

I’m satisfied with the results. Students’ ability to map arguments varies about as much as students’ ability to construct arguments did before the popularity of generative AI. (I’m not yet convinced that argument mapping is as beneficial as many think it is, but it is a standardized output that many genAI systems cannot reliably produce for students. And students do seem to understand something better once it’s well mapped, either by them or someone else).

Not only will most students use genAI, but many employers will expect students to use it. I’d like to prepare students to use it well. Moreover, students often need an interlocutor when another person is not available; if we can help them understand how to use genAI Socratically, then students may have more opportunities for such social reflection and, therefore, more benefits thereof.

Cameron Buckner
Reply to  Nick Byrd
4 months ago

Glad to hear it worked! FWIW, after making a cogsci- and philosophy-inspired pitch in class that learning to write philosophy is in large part learning to philosophize, and keeping blue book essay exams for short comprehension essays, I’ve also continued to do individual and eventually group work to cultivate effective prompting and critical thinking skills (what are LLMs good at and what are they bad at?) to great success. Our students will spend their professional lives swimming in gen AI and it behooves us to teach them to engage with it critically and reflectively. I think requiring argument maps is a great idea.

W.R.T. ideas in other threads above: I think empty threats won’t work; someone will try it, and if they get away with it, they’ll tell the other students in the private class group chat that all classes surely have. It’s also very easy to defeat detectors (high-dimensional text space is very big and you don’t have to move very far to escape any detection criteria), and Grammarly now has a drafting function which basically writes the paper for you.

Cameron Buckner
Reply to  Cameron Buckner
4 months ago

The most successful assignment for me is a group project where students have to give pro- and anti- evidence that GenAI can do some cognitively or ethically interesting thing, and present to class. After bootstrapping critical engagement with the tools, my students were extremely creative in techniques to reveal strengths or expose flaws (e.g., engaging it in theater improv prompts to expose emotional inconsistencies, compositional Where’s Waldo image searches to explore visual understanding, systematic prompt variations to study subtle demographic bias, etc. Great stuff).

There are obvious reasons to be wary of group work and presentations, but it seems better all around to me compared to alternatives like individual interviews with instructors, and I had very minimal free rider problems. The students are fascinated by the technology and want to learn more about it. The particular assignment design obviously makes more sense in mind/cogsci courses, but I’m happy to share rubrics and other materials with anybody who wants to try to adapt them to their courses.

Nick Byrd
Reply to  Cameron Buckner
4 months ago

Thanks a million for continuing to share your ideas and experiences, Cam! We seem to agree on a bunch of this stuff.

For example, I do an in-class team assignment in every class (after a brief class-wide discussion that I facilitate), give teams an opportunity to get in-class feedback on one part of the team assignment (because all test questions are derived from in-class team assignments), and proctor all tests in-person using conventional Exam Books.

Students rave about the benefits of the team assignments (both the team component and the test-prep component).

NB: The in-class team assignments are graded for completion (not accuracy). And I give feedback on them only if students/teams ask for it.

Michael Fuerstein
4 months ago

I prohibited the use of AI. In order to help enforce that, I gave my students an in-class “post-paper exam” after their essay assignments. They were required to do things like: “Identify the most important assumption in your argument and explain why it matters in your paper.” Or “identify the most important idea in the text that you wrote about and explain how you interpreted it. Cite specifically to the text to support your interpretation. Offer a different plausible interpretation of the idea. Then explain how that different interpretation affects your argument.” They did not know what questions would be asked in advance, but they did know they were going to receive an assessment of this type. As I explained to the students, it would be very difficult to answer these questions well if you leaned on an AI to write your paper. One can never say for sure, but so far as I could tell these post-paper exams functioned as very effective mechanisms for dissuading cheating. I only received one paper all term where there was compelling evidence that the student had used AI, and that student performed terribly on the post-paper exam. Apart from the deterrent effect, these were useful philosophical exercises in themselves and so had some pedagogical value. And they were easy to grade quickly.

Patrick Lin
Reply to  Michael Fuerstein
4 months ago

Interesting idea. But doesn’t this create a lot of extra work for you, if the post-paper assignment is graded, i.e., you have to check if the replies match up with the paper and evaluate that?

Or do you map the assignment back to the paper only when the assignment’s discussion seems confused? Or maybe you don’t grade them but like to have them handy, in case there’s a case of suspected AI cheating? Or, vice versa, maybe you don’t even look at the paper if the post-paper discussion is confused and suggests AI cheating?

Could a student generate an AI-written essay and memorize/understand how that argument flows, or is the thinking that such a student wouldn’t bother putting in the work to understand the AI’s argument (e.g., it’s easier to just construct your own argument)?

Michael Fuerstein
Reply to  Patrick Lin
4 months ago

My focus was more on creating disincentives to use AI than on creating a foolproof method of catching its use. I didn’t try to match their exams with what they wrote in their papers. In fact I allowed them to have a printed out copy of their own paper with them. The idea was to design a task that would be difficult (not impossible, obviously) to perform well for a student who hadn’t paid their dues working through the ideas themselves when they wrote the paper. If a student had an AI write their paper but then spent time thinking carefully about the argument and linking it to specific bits of the text that they engaged with, they could have succeeded on the exam. My working assumption was that a student looking for shortcuts would be unlikely to do that. In order to keep things simple/easy, I graded the exam answers on a 1-3 scale, and so I was able to work very quickly. Using this system, an AI-cheater could still receive credit for their mediocre AI-paper if they didn’t get caught, but a poor exam would drag down their overall grade (I made the exam worth one third of their overall paper grade). But one further benefit of having them write the exam is that, in cases where you do suspect AI-based cheating, you can compare the style/quality of their writing on the exam to what was in their paper. Nothing is foolproof but this approach worked well enough that I’ll probably try using it again in some form next term.

Patrick Lin
Reply to  Michael Fuerstein
4 months ago

Thanks for those details, Michael. I may try something like this in my classes, but I need to think about whether it’s a graded assignment or something else.

And it might be something more straightforward as opposed to reflective (like yours), e.g., “Outline the main argument/discussion of your paper” in class when they don’t have access to their submitted papers, i.e., from memory. And if their outline doesn’t come close to reflecting the paper, then that signals possible cheating (AI or otherwise) and may trigger an oral exam for that student or a one-on-one meeting. Good idea, or bad?

Michael Fuerstein
Reply to  Patrick Lin
4 months ago

Hi Patrick – I worry that that approach would only work well if the question was a total surprise to the students. Which means that it would only work at most once per class I think. I actually asked a version of your question – “Explain your main line of argument in a way that would make sense to a peer who knew nothing about philosophy” – while allowing them access to their own paper. Many of the students struggled with this even if they didn’t use AI. The student who did use AI, however (he later admitted to having done this), could not handle it at all. I think the effectiveness of these kinds of approaches will probably vary somewhat with the kind of students you teach. I’m going to continue experimenting.

Patrick Lin
Reply to  Michael Fuerstein
4 months ago

Thanks, Michael. Right, I wouldn’t tell them the question in advance, but maybe I need others in my back pocket if/when word gets out about what I’ll be asking… hmm…

Nick
4 months ago

I pivoted to a heavily discussion-based grading scheme, which had its challenges but which worked extremely well in the context of an introductory-level class. I strongly urge anyone who can to give it a shot. Make discussion teams, give them propositions to debate based around a core reading, and give them points for deploying substantive ideas from those readings in their debates. Make a general grading rubric based around what you take to be good philosophical discussion, take meticulous notes during debates/discussions, and send detailed feedback to each group. The logistics are tricky and a TA would have really helped, but I genuinely think it is The Way Forward. Return philosophy to its dialogical roots!

Instead of debating I allowed one student to write traditional papers as a health accommodation and they…. used ChatGPT to write one of those papers.

Monte Johnson
Reply to  Nick
4 months ago

This sounds great (regardless of ChatGPT), and I’d love to hear more and see the details. Perhaps I could backchannel email you, or you me?

Janella Baxter
4 months ago

Does anyone have a useful paper on the use of LLMs for academic or student writing that they’d recommend to include for an Intro to Philosophy syllabus? This topic would fit well into my syllabus that covers epistemological issues in the internet age and, importantly, get Intro students to think critically about their reliance on LLMs for college work.

Tee Kay
4 months ago

This is live footage of a typical day of grading in the ChatGPT era.

https://youtu.be/DHKxoARmjLU?si=X0EqAieLG-I6yKhJ&t=83

I wish I were joking.