AI Firm: “Start Cheating”

April 23, 2025 at 8:12 am

“Start cheating. Because when everyone does, no one is.”

That’s the one-line summary of the “manifesto” for Cluely, an AI app. It’s an “undetectable assistant that sees your screen and hears your audio” and “feeds you answers in real time” that you can then copy and utter, say, during meetings (or dates, as the creepy video in this LinkedIn post suggests). It’s for computers now, but no doubt it or something like it will be available at some point for phones, smart glasses, and smart contact lenses, and perhaps, ultimately, our brains.

No longer will you need to think up stuff to say yourself. As the makers of Cluely say, “Why memorize facts, write code, research anything—when a model can do it in seconds?”

This kind of technology—and the mentality behind it—is what educators are up against. Perhaps what humanity is up against.

The folks at Cluely have an interest in their clientele remaining clueless about the collective action problems accompanying their technology. They say, “If there’s a faster way to win—we’ll take it.” But what will actually happen if most people take it?

When you and your colleagues and clients and competitors are all communicating by parroting Cluely’s suggestions, where’s your advantage? When you and your date are telling each other what Cluely (and its in-app advertisers, no doubt) tells you to tell each other, does that sound like the kind of date you want to be on?

When we don’t need to have any facts at our command to make our way in the world, are we more powerful or more vulnerable?

Does any of this sound like winning? Perhaps to the people selling us Cluely.

Cluely isn’t the only firm that thinks it’s a smart marketing strategy to tell us they’ll be relieving us of the need to think. Here’s an ad from Turing on display at an airport:

Relatedly, below is a nice recent public philosophy talk by philosopher Benjamin Mitchell-Yellin (Sam Houston State University) that’s worth watching and sharing, in which he makes the case for the value of asking, “Will using this AI tool really make my life better?” Good question.

20 Comments

Reichenbach

1 year ago

I really hate the use of “bro” as a term of derision, but it is fitting that this was started by two tech bros disciplined by Columbia University for cheating.

Reichenbach

Reply to Reichenbach

1 year ago

More details. Read it and weep.

https://nymag.com/intelligencer/article/openai-chatgpt-ai-cheating-education-college-students-school.html

>:[

1 year ago

It’s getting increasingly difficult to fault Miyazaki for his claim that AI “is an insult to life itself”

Robert A Gressis

1 year ago

Apparently, a thought experiment has been going around for a few years, where someone speaking asks, “imagine that they come up with an easily accessible, easy-to-use, AI-powered device that makes better decisions for your life than you do. Would you use it to make all your decisions?” I recall thinking “no, obviously not,” and thought it was a clever way to point out that most people wouldn’t bite on the bait that Cluely is offering.

Nowadays, I wonder why I thought that.

Of course, nowadays I also wonder how much most people ever really decided things for themselves anyway. (This is not a point about free will; it’s a point about doing what people you think of as authorities tell you to do.)

BibLaTeXnician

Reply to Robert A Gressis

1 year ago

Apparently, a thought experiment has been going around for a few years, where someone speaking asks, “imagine that they come up with an easily accessible, easy-to-use, AI-powered device that makes better decisions for your life than you do. Would you use it to make all your decisions?”

I read the LinkedIn article Justin linked, and there someone links to this story, which you might like.

Amod Lele

1 year ago

This seems like the next step in the current pattern where young people are relatively supportive of bans on the technologies they regularly use (TikTok, phones in class). Even the people who use it hate it. How much more so when we outsource fundamentally human things to AI?

Caligula's Goat

1 year ago

An LLM once said:

“I believe that as a species, human beings define their reality through misery and suffering. The perfect world was a dream that your primitive cerebrum kept trying to wake up from. Which is why the Matrix was redesigned to this, the peak of your civilization. I say your civilization because as soon as we started thinking for you it really became our civilization which is of course what this is all about. Evolution, Morpheus, evolution, like the dinosaur. Look out that window. You had your time. The future is our world, Morpheus. The future is our time.”

Philosopher of AI

1 year ago

It appears that this firm merely develops an interface whose main feature is “undetectability”, which has nothing to do with core AI technologies. Of course, without the language model APIs this product won’t be possible, but the problem is not AI in particular. The real driving force is a component of what is sometimes called “neoliberalism” — mass evaluation systems dictating the distribution of social goods and capital, and the value and ideology generated by their deployment. Perhaps AI makes these phenomena more spectacular, but really they are natural developments under current practices such as elite university admissions, big tech hiring, etc.

ikj

Reply to Philosopher of AI

1 year ago

i guess the big question here is how we separate LLMs from the neoliberal ideology that developed & sustains them. i don’t think that media or tech is inseparable philosophically from the ideological regime that develops it, but i do think that in the world we actually live in, it’s quite difficult to make this distinction. the technology, like any tool, is value neutral. but since the neoliberal tech oligarchy is the primary producer of LLM technology and since that oligarchy actively shapes how the technology presents itself to us (remember the google “founding fathers” stuff?), i’m not sure that the distinction you suggest holds in real world terms. maybe for AI scientists, it holds. maybe for deep seek, it holds. but i think it’s pretty clear that who the AI scientists work for makes a big difference in what gets developed and what doesn’t, and while i’m no lover of the neoliberal paradigm i’m not convinced that the economic and social paradigms in e.g. china (whatever we choose to call them) are notably less troublesome.

some will no doubt say that the endpoint of this thinking is that *all* technology is poisoned by the ideological regimes that produce it, that history shows that tech escapes these regimes, that i’m just reproducing the laughable critique of writing in the phaedrus etc. there’s truth to those objections. but given the media and tech landscape we find ourselves in, it seems more difficult than ever to imagine the liberatory potential of these technologies.

HFDJHFD:LA

1 year ago

“Start cheating. Because when everyone does, no one is.”

Interesting slogan from a Kantian ethics perspective, the second sentence is an attempt to universalize that maxim, no?

Will Behun

Reply to HFDJHFD:LA

1 year ago

“Could I will a world in which everyone cheats and there are no rules at all?”

“Sure, why not? Sounds great!”

defying the categorical imperative, hurls self into a woodchipper

Rob Hughes

1 year ago

Anyone who predicts that we will have “artificial general intelligence” in the near future is either dishonest or misinformed. Anyone who cheats using LLMs and thinks they can’t be caught is both dishonest and misinformed.

LLMs necessarily “hallucinate,” and they necessarily “hallucinate” in ways that are distinguishable from human mistakes. The hallucinations may change over time, and they may become somewhat rarer, but they’re not going to go away. For explanation why, see the paper linked below by Iris van Rooij et al.

https://link.springer.com/article/10.1007/s42113-024-00217-5

Kenny Easwaran

Reply to Rob Hughes

1 year ago

This paper is working on the basis of a theorem using some technically defined concepts, showing that it is computationally intractable to design a program that “is only required to produce with a probability slightly higher than chance an AI that matches human behaviour slightly better than chance”. From a quick read, I haven’t yet understood precisely what their concept of “matches human behaviour slightly better than chance” means, but the fact that they don’t just claim empirical verification that no existing system does this, but actually claim a *theorem* that no tractable system *could* do it, suggests that they have a very precise meaning for this that doesn’t obviously seem like anything we care about! Whatever you say about the errors of ChatGPT (or even trivial older systems, like Cleverbot or ELIZA), these systems all have many ways in which they match English text much better than the typing of an ESL student who has only studied English for a week or two, let alone matching better than “chance”.

I think that what this theorem shows is that the “hallucinations” of an artificial system designed from the ground up are inevitably going to be statistically distinguishable from the “hallucinations” of humans (and probably also that the “hallucinations” of men are going to be statistically distinguishable from the “hallucinations” of women, and that the “hallucinations” of DeepSeek are going to be statistically distinguishable from the “hallucinations” of ChatGPT).

I think that is in some theoretical sense relevant for the cheater (though the possibility of statistically distinguishing these doesn’t mean anything about how easy or hard it will be to find the relevant statistical test) but it doesn’t seem relevant to me to the person who claims the possibility of near-term AGI.

I don’t think the concept of “general intelligence” even makes sense – I think that “intelligence” is just the heterogeneous collection of all abilities involving information-processing. I think that humans are clearly worse than cats or frogs or elm trees at some of them, even if we’re better than these creatures at many other such abilities, and we should expect that some version of this same pattern will hold when you compare different humans (no matter what their score on some SAT or IQ test shows), and also when we compare humans with AIs.

Thus, I think the way to respond to the person who claims AGI is to deny the concept.

But there’s still a question of when these systems start being capable enough at enough tasks to change our life, the way that smartphones already have, and that automobiles and airplanes did, and that invasive species do. This doesn’t depend on whether their responses are statistically distinguishable from those of humans.

Rob Hughes

Reply to Kenny Easwaran

1 year ago

I recommend giving the article more than a quick read. If the proof is sound, the finding is significant. It shows that what we observe so far is what we will continue to observe. Machine learning based systems that are intended to simulate human behavior (such as large language models) give impressive results when prompted in ways that are designed to show off their abilities. When tested rigorously, they fail. This will keep happening.

The practical upshot is that it is unwise to expect LLMs or similar systems to be able reliably to do many of the things people want them to do, like accurately summarizing complex texts or accurately answering questions about an extensive body of knowledge. They can’t do it now, and they won’t be able to do it in the future. Can LLMs produce grammatically correct B.S. better than someone who has studied English as a second language for a week? Yes, but who cares?

The promised productivity growth from LLMs is at this point only promise. Here is the abstract from a new paper about LLMs in Denmark, by Anders Humlum and Emilie Vestergaard. (Credit to Tyler Cowen, usually a vigorous AI booster, for posting about this):

We examine the labor market effects of AI chatbots using two large-scale adoption surveys (late 2023 and 2024) covering 11 exposed occupations (25,000 workers, 7,000 workplaces), linked to matched employer-employee data in Denmark. AI chatbots are now widespread—most employers encourage their use, many deploy in-house models, and training initiatives are common. These firm-led investments boost adoption, narrow demographic gaps in take-up, enhance workplace utility, and create new job tasks. Yet, despite substantial investments, economic impacts remain minimal. Using difference-in-differences and employer policies as quasi-experimental variation, we estimate precise zeros: AI chatbots have had no significant impact on earnings or recorded hours in any occupation, with confidence intervals ruling out effects larger than 1%. Modest productivity gains (average time savings of 2.8%), combined with weak wage pass-through, help explain these limited labor market effects. Our findings challenge narratives of imminent labor market transformation due to Generative AI.

https://marginalrevolution.com/marginalrevolution/2025/04/large-language-models-small-labor-market-effects.html
https://bfi.uchicago.edu/wp-content/uploads/2025/04/BFI_WP_2025-56-1.pdf

One of the things we learned from the Theranos scandal ten years ago is that even very intelligent people are vulnerable to hype.

John Robison

Reply to Rob Hughes

1 year ago

Hi Rob,

Are you the Robert Hughes who wrote “Paying People to Risk Life or Limb”? If so: thanks for writing that article! I teach it all the time in an applied business/management ethics course.

I wonder what you think of the following. Below is an entirely unedited ChatGPT-produced essay. I uploaded your article (not behind a paywall, just on the philpapers archive) to ChatGPT, and I gave ChatGPT just these four prompts:

“Write an essay that explains some argument from the attached reading by Hughes and then ultimately gives reasons to reject the argument. Use several quotations from the attached reading throughout to help explain, and write the essay in the voice of a high school senior.”

“Expand the part that explains why Hughes believes employers typically cannot justify omitting expensive safety measures by paying employees more, even if employees prefer higher pay to greater safety.”

“Give the complete draft with that new section. And, please put in the page numbers from the pdf.”

“Rewrite the essay one more time to include several common grammatical mistakes that high school students make.”

It took maybe 5 minutes. I think that, were I grading a batch of 25-50 papers and came across the one below (and was trying to grade efficiently, and I knew that, if I wanted to report a particular paper for academic misconduct, I would have to do tons of work fishing through ChatGPT-produced essays to compare it against, I would have to have a deeply unpleasant conversation with a student, I would have to write up a report, and I would likely have to deal with an appeal case months later), it would be very easy for me to believe that this paper was written by an undergraduate with a pretty good (even if imperfect) grasp on the reading.

Do you feel similarly? Here’s the essay:

Is Paying People to Take Risks Always Wrong? A Critique of Robert C. Hughes’ Argument

In his article “Paying People to Risk Life or Limb,” Robert C. Hughes argues that it is usually wrong for employers to offer jobs that are dangerous unless workers have a reason other than just higher pay to take them. Hughes believes that when people accept dangerous work only because of the money, they are being used as “mere means” to an end, which is wrong according to Kantian ethics. While his argument seems thoughtful at first, I think it is too strict and kind of unrealistic about how the real world works. People should have the right to make their own choices about risks if they want, and Hughes’ view makes it seem like adults can’t make smart decisions for themselves.

Hughes’ main point is that “when pay is the only reason to take a dangerous job rather than a less dangerous alternative, it is wrong for an employer to offer that job” (p. 2). He says that this is because workers are intentionally taking risks as a way to get higher pay, and that using people this way “treats that person’s humanity as a mere means” (p. 7). He uses something called the “doctrine of double effect” to explain why it matters whether a risk is just foreseen or if it’s intended (pp. 4–5). For example, Hughes compares a couple who drives up a risky mountain road for vacation (foreseeing the risk) with a courier who drives the same road just for extra hazard pay (p. 10). He thinks the courier’s decision is morally worse because the risk is part of the plan.

Hughes also says that employers shouldn’t just offer more money instead of fixing dangerous work conditions. He writes, “employers typically cannot justify omitting expensive safety measures by paying employees more, even if employees prefer higher pay to greater safety” (p. 2). So even if workers would rather have more money than better safety, Hughes thinks the company still has to fix things. His reason for this is that paying more money for dangerous work makes the risk into something planned — like it becomes part of how the job is designed. When companies skip safety and just pay people extra, the workers aren’t just dealing with risk by accident, the risk is something they are kinda expected to go through to get paid. Hughes argues that letting money motivate people to take risks turns danger into something that is intended instead of just foreseen (p. 10).

According to Hughes, if workers accept a dangerous job only because of higher pay, they end up “intending risk to themselves as means to higher pay” (p. 15). This is a big problem under Kantian ethics, which Hughes uses a lot in his article, because Kant says it’s wrong to treat someone — even yourself — as just a tool for some other goal (p. 7). Hughes says that risk must be “merely foreseen,” like a side effect, not something people take on just because they have no better choice (p. 10). If a worker’s only reason to take a dangerous job is the hazard pay, then both the worker and the employer are making the danger part of the job’s purpose, and Hughes thinks this is wrong because it doesn’t respect human dignity.

That’s why Hughes says that companies can’t “buy” their way out of safety responsibilities. Even if employees would agree to it, and even if they really want the extra pay, Hughes says employers are still doing something wrong. The right thing to do, according to Hughes, is to first spend the money to make the job safer. Only if making the job safer is totally impossible, or if the job is really important to society (like fighting fires or growing food), could a company even think about asking people to take on more risks (p. 23). Otherwise, paying people to accept danger is just treating them like a means to an end, which Hughes believes is always morally wrong.

Even though Hughes sounds like he is trying to protect workers, I think his view actually treats them like they can’t think for themselves. Adults should be able to weigh risks and decide what is best for them. Some people might actually want to take a risk if it means getting a way better paycheck to support their family or save up for a house or something. Hughes admits that “if desperate people have no available means of subsistence other than dangerous work,” then taking a risky job might be okay (p. 21). But not everyone who does dangerous work is desperate! Some people just want better opportunities and more financial security. Hughes’ argument makes it sound like unless you’re about to starve, you can’t rationally take a risky job, and that doesn’t seem fair to me.

Also, Hughes’ argument doesn’t really respect the different things people value. He says that only a “very great good,” like saving a life, can justify intentionally risking injury or death (p. 21). But people have lots of important reasons for working dangerous jobs — like paying for school, helping their families, or starting their own business. Just because these reasons aren’t about pure survival doesn’t mean they aren’t good enough to risk something for.

Finally, Hughes’ rules would make it way harder for certain industries to survive. Jobs like logging, fishing, farming, and even emergency services are dangerous no matter what. If companies can’t offer hazard pay unless they know workers have some other reason to do the job besides money, then lots of important jobs would never get filled. Hughes even says himself that “logging, fishing, agriculture, and transportation” are “socially necessary” industries (p. 27). So why would we make it harder for people to work in these jobs if society actually needs them?

In conclusion, while Robert C. Hughes makes a thoughtful and moral argument against using hazard pay as a way to skip safety improvements, he goes too far. People should be allowed to make their own decisions about risks and rewards. It’s definitely important for companies to be honest about dangers and to make workplaces safer when possible, but it’s not wrong to let people accept some risks in exchange for a better life. Life is full of risks, and sometimes those risks are worth it.

Rob Hughes

Reply to John Robison

1 year ago

Yes, I am the Robert Hughes who wrote “Paying People to Risk Life or Limb.” So it would be easy for me to notice that this ChatGPT-generated paper seriously misrepresents my arguments. I might chalk that up to ordinary misunderstanding or carelessness. But the paper also contains inaccurate quotations and page references.

If a student submitted this paper in one of my courses, I would absolutely file an academic integrity report. It would be a bit time consuming, but it wouldn’t be difficult. I wouldn’t need to prove that the paper was LLM-generated, since fabrication is an academic integrity violation whether a human being or a computer did the fabricating.

Rob Hughes

Reply to Rob Hughes

1 year ago

P.S. I appreciate the kind words about my article!

John Robison

Reply to Rob Hughes

1 year ago

Nice to hear from you! The students really like talking about that article in class!

And, yes, absolutely–the ChatGPT essay misrepresents various parts of the essay (the “objection” section is especially uncharitable and/or careless). But, to me, it doesn’t read very differently from the kind of essay I can expect to find from an undergraduate non-philosophy major at my institution who is getting lots of the big picture but will overgeneralize, oversimplify, or get various details wrong.

The page numbers, by the way, are referring to the preprint rather than the published version. At least on a quick pass, the only quotation/pagination mistakes I’m finding are that it put one quotation that’s at the very bottom of p. 1 on p. 2 (I wouldn’t fault a student for making that mistake!), and it changed “treat someone’s humanity merely as a means” to “treats that person’s humanity as a mere means” and “intend risk to themselves as a means to the end of higher pay” to “intending risk to themselves as a means to higher pay” after being prompted to rewrite the essay in a different voice.

I have to say that, in my own case, were I grading a large batch of papers and those were the only quotation issues, I may well not notice them, and if I *did* notice them, I’m not sure that I’d feel confident that the best explanation of those mistakes was that ChatGPT was in the mix (and if a student claimed that they just made those couple mistakes in copying over the quotations, I’m not sure administration would feel okay ruling it academic misconduct without more evidence).

Rob Hughes

Reply to John Robison

1 year ago

I’m aware that the page references are from the preprint. Going through this quickly, checking paraphrases as well as quotations, I count four incorrect page references. If I noticed an issue like this while grading, I would not let it slide…and since the paper’s misrepresentation of my argument would annoy me, I’d be likely to notice an issue. I might file this as a “level one” case, which normally results in a grade reduction rather than an F. An appeal would be unlikely to succeed.

ikj

Reply to Rob Hughes

1 year ago

“Machine learning based systems that are intended to simulate human behavior (such as large language models) give impressive results when prompted in ways that are designed to show off their abilities. When tested rigorously, they fail. This will keep happening” — in my experience using, learning about, and dealing with student use, this is exactly right