Using Generative AI to Teach Philosophy (w/ an interactive demo you can try) (guest post)


Philosophy teachers—Michael Rota, a professor of philosophy at the University of St. Thomas (Minnesota), is about to make your teaching a bit better and your life a bit easier.

Professor Rota recently began learning about how to use artificial intelligence tools to teach philosophy. In the following guest post, he not only shares some suggestions, but also lets you try out two demos of his GPT-4-based interactive course tutor.

The course tutor is part of a program he is helping to develop, which should be available for other professors to use and customize sometime this summer.


Using Generative AI to Teach Philosophy
by Michael Rota

I have a friend who leads AI product strategy at a medium-sized tech company, and for about a year he’s been telling me about various impressive tasks one can accomplish with Large Language Models, like OpenAI’s GPT-4. In December I finally started listening, and began investigating how one might use AI tools as a teacher. (I’m a philosophy professor at the University of St. Thomas.) I’ve been amazed by the promise this new technology holds for instructors—in part because of the potential to increase productivity (of the teacher), but even more because of the potential to improve student learning.

In this post I’ll focus on the practical and discuss three free or low-cost tools that can be employed by a philosophy professor without any special technical expertise: (1) an interactive course tutor for your students, which you can load with your own questions and answers from your course, (2) a tool for quickly drafting a new exam, quiz, or HW assignment, and (3) a chatbot created from your own syllabus and lecture notes, so your students can query the content of your course.

The interactive course tutor

GPT-4 mimics human reasoning remarkably well (it scored in the 88th percentile on the LSAT). But it sometimes just makes stuff up. What if you could provide GPT-4 with good answers to questions you wanted your students to work through? It turns out you can, and thus it is possible to create an alarmingly capable course tutor by supplying GPT-4 with a series of question/answer pairs. This allows each student to have a one-on-one tutoring experience, and get immediate feedback as they work through an assignment.

You can play with a demo of this here.

Take the first assignment in the first module of this demo: “Think up a false conjunctive proposition.” This task has an infinite number of possible correct responses, yet GPT-4 can competently assess student answers, because the instructor-provided answer passed to GPT-4 by the program is general—it’s a recipe for correct answers, as it were. In this demo, the instructor-provided answer is this:

A conjunctive proposition is any proposition of the form A and B, where A is a complete proposition and B is a complete proposition. A and B are called the ‘conjuncts’ of the conjunctive proposition. A conjunctive proposition is false if and only if A is false or B is false or both A and B are false. It counts as true otherwise.

That’s it. That’s enough for the AI tutor to respond accurately to almost any possible student response. A student can get the question wrong in a number of ways: for example, by entering a conjunctive proposition that’s true, or a proposition that’s not a conjunction, or something that’s not a proposition at all. GPT-4 handles all of these possibilities.
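If you’re curious about the mechanics under the hood, here is a minimal sketch of the kind of API call a program like this might make, using OpenAI’s Python library. To be clear, this is my illustrative reconstruction, not the demo’s actual code; the prompt wording and function names are assumptions:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The instructor-provided "answer" is really a rubric: a general recipe
# for correct answers, not a single model answer.
INSTRUCTOR_ANSWER = (
    "A conjunctive proposition is any proposition of the form A and B, "
    "where A and B are each complete propositions. It is false if and "
    "only if at least one conjunct is false, and true otherwise."
)

def grade(question: str, student_answer: str) -> str:
    """Ask GPT-4 to assess a student answer against the instructor's rubric."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a patient course tutor. Use the instructor's "
                    "answer below as the authoritative standard when giving "
                    "feedback.\n\n"
                    f"Instructor's answer: {INSTRUCTOR_ANSWER}"
                ),
            },
            {
                "role": "user",
                "content": f"Question: {question}\n"
                           f"Student answer: {student_answer}",
            },
        ],
    )
    return response.choices[0].message.content

print(grade("Think up a false conjunctive proposition.",
            "Grass is green and the moon is made of cheese."))
```

The key design choice is that the instructor’s answer goes into the system message as the authoritative standard, so the model grades against your criteria rather than improvising its own.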

Using generative AI in this way offers several advantages over traditional homework assignments:

(a) students get immediate, specific feedback on each question
(b) students who need more practice can get it without having to make other students do busy work
(c) there’s less grading for teachers
(d) there’s less need for the teacher to explain the same thing multiple times

How will grading work? In my view it’s too soon to hand grading over to AIs, so in my classes I plan to split the grading and the learning. The grading will be based on class participation and in-class, pen-and-paper exams. The learning will be facilitated in the standard ways, but also with the help of an interactive course tutor based on questions and answers from my course.

Here is a second demo, allowing an instructor to test functionality by inputting a single question/answer pair and then checking how well the AI tutor handles mock student answers.

The demos linked above use an early version of the product I’m helping to design. It should be available by the summer, at which point professors will be able to create an account, input their own modules of question/answer pairs, and hit ‘submit’ to create a tutor based on their material, accessible to their students as a web app.

For a larger discussion of the promise of interactive tutors in education, see this TED talk by Sal Khan of Khan Academy.

Assignment generation

The creation of new questions for homework assignments, quizzes, and exams can be time-consuming, whether one is designing a new course or just creating a new version of an existing assignment or test. Large language models are great for speeding up this process.

If you go to chat.openai.com, you can sign up for a free account with OpenAI and use GPT-3.5 at no cost. That allows you to type into a textbox, entering a prompt like “Can you give me ten sample questions on ____, suitable for the college level?” or “Here’s a question on this topic {insert a question from an old assignment}. Can you give me a very similar question, but with different details?” Used in this way, GPT-3.5 can provide some value.
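(If you’re comfortable with a few lines of code, the same prompts can also be sent through OpenAI’s Python API rather than the chat window. Here’s a minimal sketch; the topic and prompt wording are just placeholders, not recommendations:)

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

topic = "informal fallacies"  # placeholder: any topic from your course
old_question = "Identify the fallacy: 'Everyone believes it, so it must be true.'"

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": f"Here's a question on {topic}: {old_question} "
                   "Can you give me five very similar questions, "
                   "but with different details?",
    }],
)
print(response.choices[0].message.content)
```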

But GPT-4 is much better, both because it is better at mimicking human reasoning and because it allows you to attach files. So you can attach an old assignment and ask it for a very similar assignment in the same format. The downside here is that to use GPT-4 you need a ChatGPT Plus account, which costs $20 a month. An upside is that additional functionality comes along with a ChatGPT Plus account: you can access the GPT Store. There you will find customized versions of GPT-4 like the “Practice Exam/Quiz/Test Creator for School” GPT, which allows you to upload course content (e.g., your lesson plans on a topic) and then ask for sample questions based on that material. With only a little more work, you can create your own GPT with access to knowledge about your own course, uploading up to 20 files, and use it to generate drafts of assignments tailored to your material.

As with any AI-generated content, think of the process like this: with the right prompting from you, the AI produces initial drafts almost instantaneously, but you’ll need to evaluate and refine before you have a final product.

Course chatbot

Another thing you can do with ChatGPT Plus is create a course chatbot. If you create a GPT and upload files with information about the content of your course (your syllabus, lesson plans, handouts, etc.), then anyone with access to the chatbot can ask questions like “When is the final exam in this class?”, “How did we define a ‘level of confidence’?”, or “What are the steps in Aristotle’s function argument?” And you can give others access to the chatbot by making it available to anyone with a link. However, your students would need a ChatGPT Plus account to use it, and that may not be feasible. But there is a free workaround: if you put your course content in a PDF of no more than 120 pages (or break it up into several), you can give the PDF(s) to your students and direct them to the landing page of ChatPDF, where they can upload the PDF and then query it for free.

If you have further questions about any of this, raise them in the comments or email them to me.

 

Comments
Marc Champagne
2 months ago

We can spot vicious circles, but judging by the post we cannot spot this particular circle, where we skilled workers train AI to take our jobs: https://www.wired.com/story/remotasks-ai-expert-data-labor/

naive skeptic
Reply to  Marc Champagne
2 months ago

“it is easier to imagine an end to the world than an end to capitalism”

Ben M-Y
2 months ago

This is really interesting, and I really like that OP is thinking creatively about ways to help our students. But I, too, have some concerns about jobs. And it’s not my own that most concerns me in connection with this proposal.

My institution has a thriving peer tutoring program. And among the many benefits (e.g., increased success rates in courses) are those for the tutors themselves. It’s not just the pay and work experience that they get, but also the community they build, the professional development opportunities they have, and so on. Our tutors enjoy their jobs and really do make a difference for their peers. They also graduate with valuable skills that help them find employment. They learn and grow themselves. If we train chatbots to tutor, then we eliminate a meaningful opportunity for personal, professional, and intellectual development for our students. We dilute the educative opportunities afforded by our institutions.

Of course, there’s not always funding for these sorts of programs (ours is partly grant funded right now). But it seems to me that the increased efficiency promised by LLMs is, for many reasons, not something we should rush to pursue. There are foreseeable human costs that, in my estimation, often outweigh the apparent gains, even in the narrow context of undergraduate teaching.

Kenny Easwaran
Reply to  Ben M-Y
2 months ago

If we think the tutoring program was created by an administration that cares only about the benefit to the tutees, and thinks of the involvement of the tutors as an unfortunate cost that has to be paid to provide this benefit, then this administration might see the possibility of AI tutors as a good full replacement.

But if we have seen that the tutoring program has benefits for the tutors as well, and if these benefits to the tutors are great enough to make up for the cost in their time, then we should be able to make the case to the administration that the tutoring program should continue to exist.

I suppose there would be a question of whether enough tutees would still sign up to work with human tutors, if AI tutors are also available. Presumably at least some of them would. But if all the tutees would prefer the AI tutor to a human tutor, I’m not sure I would be confident enough in the value of tutoring to the tutor to want to prevent the tutees from having access to the kind of tutoring they prefer.

Nick
2 months ago

Dr. Rota, I was glad to see one of your students enthusing about you online:

“Awesome prof. Really cares about the students and their learning. Will go into more detail about any questions you have and really knows his stuff! Helpful outside of class and in class and has extremely organized notes which translates really well into student note-taking. Recommended.”

How do you think this student would have felt if this much-appreciated “outside of class” help had been conducted by GPT-4? Do you think that you, as a teacher, have any duties to students like this one? What is the nature of those duties? Does charging $50,366 USD in undergraduate tuition (as the University of St. Thomas does) strengthen any of those duties?

What psychological and moral consequences might there have been for a student like this in such a scenario, where their out-of-class help was AI-driven? How do those consequences play out at scale?

And to everyone else: why aren’t we asking any of these questions?

Michael Rota
Reply to  Nick
2 months ago

Thank you for these questions. People seem to be thinking of an AI tutor as a replacement for a human being, but I see this as one more among many tools in my toolbox, and a complement to what I’m already doing. So I’ll still have office hours just like before, and will encourage students to come to them. But I’ll also offer this additional way to learn. I next teach introductory philosophy in the fall, and am excited to use an AI tutor in a way that actually creates more classroom time for me to engage personally with my students about existentially meaningful topics. Right now I spend about the first three weeks of the semester covering basic propositional logic, and I assign logic HW assignments on paper. The learning the students do outside the class is only so-so…some students really get it, but many others don’t, and if I slow things down too much to reach the ones that don’t get it easily, I bore the other portion of the class. So next time I teach the course I will give those logic assignments via the AI tutor. The students who need help will be able to practice until they get it. I anticipate better student learning of material in their hours outside of class, which should mean that I have more time in class to do other things, like engage in Socratic dialogue and discussion.

And I think it’s very unlikely that AI will replace philosophy instructors: this is a tool a teacher can use to help students learn (and when so guided, I think students will benefit from using it), but very few students would on their own initiative spend much time dialoguing with a chatbot about logic (or Aristotle’s function argument, or whatever topic one preps the AI tutor to ask questions on).

Rita Alfonso
Reply to  Michael Rota
2 months ago

I am also getting a lot of pushback about how AI might replace humans… this poster is right in saying that it is a tool that, when used well, will augment instruction and enable a much higher-quality experience more quickly. It will also make instruction more expensive, not less. Those looking to use it to replace humans will likely have a rude awakening. I mean, I think they will try, and I guess we’ll see how that turns out.

Kenny Easwaran
Reply to  Nick
2 months ago

I think we are asking precisely these questions – would students prefer human feedback during office hours a couple of times a week, or AI feedback available 24/7, or both? If students prefer having access to both of these rather than just one, then it would seem to be our duty to provide both, especially if they are paying a lot for the service, rather than providing only one and refusing to provide the other.

Meme
Reply to  Kenny Easwaran
2 months ago

“If students prefer having access to both of these rather than just one, then it would seem to be our duty to provide both…”

Unless–to go a bit further than Nick suggested–our pedagogical duties to students transcend merely satisfying their preferences (and held irrespective of what they pay).

oudemian
2 months ago

Can you fine-tune it with paper prompts, sample papers, and feedback on papers, so that it produces usable feedback on writing, customized to specific classes? This could be a game-changer at large teaching schools. This could be TA support for people without TAs. Or at least, for doing assessment – could your software help in that way?

Michael Rota
Reply to  oudemian
2 months ago

If you experiment with the second demo (linked in the article) you may be able to go some distance toward what you describe. Put the instructions for the paper prompt in the Question box, and then in the instructor “answer” box put a long explanation of the features of a good paper that you’re looking for. Then copy and paste in a short sample paper and see what you think of the feedback GPT-4 can give (when aided by your instructions). GPT-4 is already pretty good at that sort of thing, but it’s even better when you give it clear criteria and examples.

Jake Wright
2 months ago

“Philosophy teachers—Michael Rota, a professor of philosophy at the University of St. Thomas (Minnesota), is about to make your teaching a bit better and your life a bit easier.”

This does neither. I say this as an award-winning teacher (not as a brag, but to establish some bona fides).

After playing around with this for a bit, the best I can say is that it can reinforce surface-level knowledge but doesn’t seem to do anything to develop the deep learning I think we genuinely want in our students. Promoting surface learning does not make my teaching better, and that’s working under the assumption that the LLM (a) is providing reliably accurate answers or (b) is not, but in a way that the instructor or student can straightforwardly detect and address. I’m not sure I buy that assumption.

It certainly doesn’t make my life easier, both because I don’t regularly assign student work that requires them to produce content knowledge in this way (I note that I don’t teach symbolic logic, which I think is a different case) and because it’s the kind of shiny tool that administrators and state legislators who fund my campus would conceivably look to as an excuse to eliminate my NTT position. Worrying about that, especially over a tool that doesn’t seem to do anything to facilitate the kind of learning I care about, is not an easier or better life.

Peter
2 months ago

I think there could be quite a bit of value in this, especially for introducing logic skills. Thanks for sharing! I have found that when I use the Gradio link, though, every request just gets stuck in an endless queueing. Am I doing something wrong?

Michael Rota
Reply to  Peter
2 months ago

Thank you for alerting me, Peter — the program crashed! Now it’s working again. This is a proof-of-concept version done by an amateur programmer (me!), but I have a friend in software who is turning this into a bona fide product…should be ready (for other instructors to use to create tutors specialized to their courses) by early summer.

Anco
2 months ago

So this is the very first interaction I had using the first link:

CourseTutor: Think up a false conjunctive proposition.

Me: It’s not raining and it’s not not raining.

CourseTutor: Your answer “It’s not raining and it’s not not raining” is not a false conjunctive proposition. It is actually a contradiction because “it’s not not raining” is the double negative of “it’s raining,” which means they cannot both be true at the same time. A correct example of a false conjunctive proposition would be “It is raining and it is sunny,” assuming that ‘sunny’ implies no rain and the context is such that it cannot be both raining and sunny at the same time.

Without getting into a big discussion about the credentials of intuitionist logic: isn’t it first telling me that a contradiction is not a false conjunction, before explaining to me that the contradiction of “it is raining and not raining” is a ‘correct’ example of a false conjunction?

Joshua Harris
Reply to  Anco
2 months ago

I was going to mention similar nonsense I’ve received from GPT-4 on basic logic questions. I’ve seen it waffle back and forth on whether a simple two-premise argument is valid or not. One time it said that an argument was both valid and invalid. Maybe our own training can help with these sorts of things? Not a great look for a “tutor.”

Anco
Reply to  Joshua Harris
2 months ago

It was, naturally, not a coincidence that I started with a statement containing negations. Large Language Models are good at association and patterns, but notoriously bad at logical reasoning, particularly with negations (see: https://arxiv.org/abs/2306.16638). It can be made to mimic logical reasoning by teaching it common logical patterns (as this tutor seems to do), but the model has no clue what it is doing. I was actually surprised CourseTutor got as far as it did in the example above.

Without building a model that can ‘reason’ across the scope of a negation, which goes against the very nature of the architecture of LLMs, I don’t see this issue being solved. It’s definitely not a matter of ‘more parameters’. There are attempts at combining different model types modularly (is this a word?), with an LLM like GPT as a sort of ‘in-between communicator’ (like with AutoGPT a while back), but this has other issues.

Michael Rota
Reply to  Anco
2 months ago

Hi Anco. Nice point. Yes, LLMs are bad at double negation. But try it again and see how it handles it now — I added a line to the instructor-provided answer stating that any contradiction will count as a false conjunctive proposition, and then just tried your answer “It’s not raining and it’s not not raining” and it gave a good response. I think if you try hard you’ll still be able to get a bad answer. But, larger point: cases where the LLM gives the wrong answer will create good teachable moments. So on the class day after they’ve done a given logic HW with the AI tutor, I can ask: “Did anyone get a response that they think is incorrect?” That will be fun, b/c if the students did find a mistake, they will see that they are smarter than the AI (and that’s good!) and the whole class will be treated to a finer point (like double negation). But, for the vast majority of the answers my students actually give to this question, the model will give them good feedback, I think.

Derek Bowman
Reply to  Michael Rota
2 months ago

That would have been a great educational experience for a student like me, who would have loved trying to break the bot. But I think that kind of unreliability would just be terribly frustrating for many students I’ve had who were struggling and needed help. But students like me didn’t need tutors, robot or human – it’s the struggling students who needed help. And they need help they can rely on, not help whose unreliability might be a nice teaching moment later.

Rob Hughes
2 months ago

Machine: Give an example of a true negation.

My answer: “This statement is not true.”

Machine: Based on the context provided by your instructor, your answer “This statement is not true” is a negation, as it includes a shorter proposition “This statement is true” with the logical operator “not” added. However, without additional context, it’s not possible to determine if your negation is a true proposition. For it to be a true negation, the statement “This statement is not true” must accurately reflect reality. If the original statement is indeed false, then your negation would be a true negation. If the original statement is true, then your negation would not be a true negation.

Sorry, folks. GPT-4 does not “mimic human reasoning remarkably well.” It’s an automated bullshit generator.

Don’t waste your time with this nonsense. And don’t give AI companies free marketing for products whose value is dubious at best.

Derek Bowman
2 months ago

Although I appreciate the interest and effort that went into this, I think it is a grave mistake for two reasons.

First, however impressive the outputs of these LLMs sometimes are, they are too unreliable to give to our students this way. For the exact same reasons the OP rightly thinks these bots aren’t ready for grading, we shouldn’t invite students to see their outputs as offering authoritative verdicts on what they’re trying to learn. Perhaps these are sometimes useful tools for someone who has the domain-specific skills and knowledge to filter out bad answers, but those are, by definition, not the students these tools are meant to serve. (See, e.g., the earlier comments from Anco, Joshua, and Rob.)

Yeah, yeah, yeah, but just imagine the future version that magically works those kinks out. Yeah, maybe. But until then I wouldn’t encourage my students to rely on these tools.

Second, I think this is based on a fundamental misconception of the nature and purpose of philosophy education. Philosophy professors are not teaching a set of domain specific knowledge and skills that most students come into college wanting to learn, especially in our core curriculum ‘service’ classes. We often have to convince our students of the value of what we’re trying to teach them. When students do homework assignments or write papers in our courses, we are often the only audience for that work. And if even *we* don’t care enough about the work we’re asking our students to do to read and help them with it, how are we going to convince them to care?

I guess, as Sal Khan shows in his video, we can also program our tutor-bots to engage our students in dialogue about their goals and values. But that would be to give our students a fake version of what a real philosophical conversation about values should be like. A big part of what we’re doing in philosophy classes is showing our students how to take their own beliefs and values seriously and how to engage in productive dialogue with others. I don’t deny that there could be some valuable use of text-bots as part of that process, but only if we’re able to give students the ability to see the difference between a real dialogue and a fake one. Based on his TED talk, I’m not sure Sal Khan knows the difference, so I have no confidence that the students who are asked to use these tutor-bots will learn the difference either.

Jason Swartwood
2 months ago

I’m with the skeptics. This will neither make my teaching better nor my life easier. It will just give administrators more excuses to refuse to make the changes that will actually achieve both of those.

Tom
2 months ago

Wouldn’t this post be better titled “How to Teach Logic”? Can ChatGPT work to “Teach Philosophy” more broadly? (And here I mean the bot’s interactions with the students – the suggestions for how it might help the professor are separate.)

Hey Nonny Mouse
2 months ago

Great article! The future of philosophy teaching will be wrapped up in AI.

For a bit of fun, try this. Sign in to Bing Chat (Copilot) and tell it

“Act as a text adventure set in Ancient Athens. I win if I learn from Socrates how to live well, and then live well in accordance with his theories. I win if I achieve 10 goodness. Don’t make the game too easy.”

Games like this on Bing aren’t quite ready for students because Bing is too prone to making up “facts”, but it’s getting close.

Peter H
2 months ago

I think the program may have crashed again. Just as Peter mentioned 3 days ago, “I [too] have found that when I use the Gradio link, though, every request just gets stuck in an endless queueing.”

Ironically my name is Peter too. This feels like déjà vu.

The 2nd Demo link did work for me and it was extremely impressive, thanks!

Michael Rota
Reply to  Peter H
2 months ago

Thank you, Peter H. I now see the problem in the code: it left the program vulnerable to a user entering nothing (instead of a number) for the question number. I’ve just fixed it and, while the program will no doubt crash again eventually, it will be for a new reason!
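For fellow amateur programmers, the fix was essentially a guard of this sort (an illustrative sketch, not the demo’s exact code; the function name is hypothetical):

```python
def parse_question_number(raw_input, max_question: int):
    """Return a valid question number, or None if the input is unusable."""
    try:
        n = int(str(raw_input).strip())
    except ValueError:
        return None  # empty or non-numeric input no longer crashes the app
    return n if 1 <= n <= max_question else None
```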

Rita Alfonso
2 months ago

I’m a former academic philosopher turned web dev, and I have been working with OpenAI’s LLMs since they were released. I’m happy to see some philosophy profs starting to use these tools! Not only can AI make learning from texts interactive, but it can tailor itself to the student’s interests and skill level. But it is a tool that you have to learn to use well, so I am focused on teaching people how to use AI to augment learning and teaching. (I now teach people to code.)
I’ve created some philosophy GPTs as well, and will try to share these soon.
BTW, you can have students save the convo they have with your GPT to a PDF and turn that in for grading purposes. The upload icon is in the upper right-hand corner of the chat window.