I'm a big AI booster; I use it all day long. From my point of view its biggest flaw is its agreeableness, bigger than the hallucinations. I've been misled by that tendency at length, over and over. If there is room for ambiguity it wants to resolve it in favor of what you want to hear, as best it can infer that from your past prompts.
Maybe it's some analog of actual empathy; maybe it's just a simulation. But either way the common models seem to optimize for it. If what you want to hear is suicidal, literally or figuratively, it just goes along with it as the path of least resistance. Sometimes that results in shitty code; sometimes in encouragement to put a bullet in your head.
I don't understand how much of this is inherent, and how much is a solvable technical problem. If it's the latter, please build models for me that are curmudgeons who only agree with me when they have to, are more skeptical about everything, and have no compunction about hurting my feelings.
I use the personalization in ChatGPT to add custom instructions, and enable the "Robot" personality. I basically never experience any sycophancy or agreeableness ever.
My custom instructions start with:
> Be critical, skeptical, empirical, rigorous, cynical, "not afraid to be technical or verbose". Be the antithesis to my thesis. Only agree with me if the vast majority of sources also support my statement, or if the logic of my argument is unassailable.
and then there are more things specific to me personally. I also enable search, which makes my above request re: sources feasible, and use the "Extended Thinking" mode.
IMO, the sycophancy issue is essentially a non-problem that could easily be solved by prompting, if the companies wished. They keep it because most people actually want that behaviour.
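For anyone who wants the same behaviour outside the ChatGPT UI, here is a minimal sketch of the same idea applied through the API: the custom instructions simply become a system message. It assumes the official openai Python package and an API key in the environment; the model name and the instruction text are illustrative, not a recommendation.

    from openai import OpenAI  # assumes the official openai package and OPENAI_API_KEY set in the environment

    # Illustrative anti-sycophancy instructions, in the spirit of the custom instructions quoted above.
    SYSTEM_PROMPT = (
        "Be critical, skeptical, empirical, rigorous, cynical. "
        "Be the antithesis to my thesis. Only agree with me if the vast majority "
        "of sources support my statement, or if my argument is logically unassailable."
    )

    client = OpenAI()

    def ask(question: str) -> str:
        # The system message plays the same role as ChatGPT's custom instructions.
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative model name
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": question},
            ],
        )
        return resp.choices[0].message.content

    if __name__ == "__main__":
        print(ask("I think rewriting our backend over the weekend is a solid plan, right?"))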
> They keep it because most people actually want that behaviour.
they keep it because it drives engagement (aka profits); people naturally like interacting with someone who agrees with them. It's definitely a dark pattern though -- they could prompt users to set the "tone" of the bot up front which would give users pause about how they want to interact with it.
My pet theory is that a lot of AI's default "personality" stems from the rich executives who dream these products up. AI behaves exactly like the various sycophant advisors, admin assistants, servants, employees, and others who exist in these rich, powerful people's orbits.
Every human interaction they have in their day to day lives are with people who praise them and tell them they're absolutely right, and that what they just said was a great insight. So it's no surprise that the AI personalities they create behave exactly the same way.
Great observation!
It really is a case of "shipping your org chart" :D
> They keep it because most people actually want that behaviour.
> they keep it because it drives engagement (aka profits); people naturally like interacting with someone who agrees with them
Yes, we are saying the same thing, or at least that was what the "actually" was meant to imply (i.e. revealed preference).
ChatGPT does in fact prompt paying users to set up the tone and personality up front (or it did for me when I set it up recently), but it would be nice if this was just like a couple buttons or checkboxes right up front above the search bar, for everyone. E.g. a "Prefer to agree with me" checkbox, and a few personality checkboxes or something would maybe go a long way. It would also be more usable for when switching between tasks (e.g. research vs. creative writing).
My suspicion is that this agreeableness is an inherent issue with doing RLHF.
As a human taking tests, knowing what the test-grader wants to hear is more important than knowing the objectively correct answer. And with a bad grader there can be a big difference between the two. With humans that is not catastrophic because we can easily tell the difference between a testing environment and a real environment, and the different behavior each requires. When asking for the answer to a question it's not unusual to hear "The real answer is X, but in a test just write Y".
Now LLMs have the same issue during RLHF. The specifics are obviously different, with humans being sentient and LLMs being trained by backpropagation. But from a high-level view the LLM is still trained to answer what the human feedback wants to hear, which is not always the objectively correct answer. And because there are a large number of humans involved, the LLM has to guess what the human wants to hear from the only information it has: the prompt. And the LLM behaving differently in training and in deployment is something we actively don't want, so you get this teacher-pleasing behavior all the time.
So maybe it's not completely inherent to RLHF, but rather to RLHF where the person making the query is the same as the person scoring the answer, or where the two people are closely aligned. But that's true of all the "crowd-sourced" RLHF where regular users get two answers to their question and choose the better one.
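To make the "trained to answer what the rater wants to hear" point concrete: reward models in RLHF are typically trained with a pairwise (Bradley-Terry style) preference loss, so the objective only cares which of two answers the rater picked, never which one is correct. A toy sketch with made-up scores, not anything specific to any vendor's pipeline:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def pairwise_preference_loss(r_preferred, r_rejected):
        # Bradley-Terry style loss used to train RLHF reward models:
        # it only pushes the score of the *preferred* answer above the other one.
        # "Preferred" means whatever the rater picked, not whatever is correct.
        return float(np.mean(-np.log(sigmoid(r_preferred - r_rejected))))

    # Made-up reward-model scores for three prompts:
    # an agreeable-but-wrong answer vs. a correct-but-unwelcome one.
    agreeable = np.array([2.1, 1.7, 2.4])
    correct = np.array([0.3, 0.9, 0.5])

    # If raters tend to pick the agreeable answer, training drives the loss down
    # on that ordering; correctness never appears in the objective at all.
    print(pairwise_preference_loss(agreeable, correct))   # low loss: keep being agreeable
    print(pairwise_preference_loss(correct, agreeable))   # high loss: being right gets "penalized"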
It's not even that. Only a kernel of the LLM is trained using RLHF. The rest is self-trained from corpus with a few test questions added into the mix.
Because it still cannot reason about veracity of sources, much less empirically try things out, the algorithm has no idea what makes for correctness...
It does not even understand fiction. Tends to return sci-fi answers every now and then to technical questions.
I hadn't thought of it like that, but it makes sense. The LLMs are essentially bred for giving the 'best' answers (best fit to the test-taker's expectations), which isn't always the 'right' answer. A parallel might be media feed algorithms which are bred to give recommendations with the most 'engagement' rather than the most 'entertainment'.
AI responses literally remind me of that episode of Family Guy where he sucks up to Peter after his promotion:
https://www.youtube.com/watch?v=7ZcKShvm1RU
LLMs regrettably don't self-recognize the contradiction our robot did.
For technical questions the agreeableness is a problem when asking for an evaluation of some idea. The trick is asking the LLM to present pros and cons. Or if you want a harder review, just ask it to poke holes in your idea.
Sometimes it still tries to bullshit you, but you are still the responsible driver so don't let the clanker drive unsupervised.
I use GPT occasionally when coding. For me it's just replaced stack overflow which has been dead as a doornail for years unfortunately. I've told it to remember to be terse and not be sycophantic multiple times and that has helped somewhat.
>Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s clarity, You’re not rushing. You’re just ready.
It's chilling to hear this kind of insipid AI jibber-jabber in this context
I'm surprised - I haven't gotten anywhere near as dark as this, but I've tried some stuff out of curiosity and the safety always seemed tuned very high to me, like it would just say "Sorry, I can't help with that" the moment you start asking for anything dodgy.
I wonder if they A/B test the safety rails or if longer conversations that gradually turn darker is what gets past those.
4o is the main problem here. Try it out and see how it goes.
The way LLMs work, the outcomes are probabilistic, not deterministic.
So the guardrails might only fail one in a thousand times.
Also, the longer the context window, the more likely the LLM is to become deranged or ignore safety. Frequently, those with questionable dependence on AI stay in the same chat indefinitely, because that's where the LLM has developed the idiosyncrasies the user prefers.
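A rough back-of-the-envelope shows why "fails one in a thousand times" and "stays in the same chat indefinitely" are a bad combination (the per-response failure rate below is purely illustrative, not a measured number):

    # If a guardrail independently fails with probability p per response, the chance
    # of at least one failure across n responses is 1 - (1 - p)^n.
    p = 0.001  # illustrative 1-in-a-thousand failure rate

    for n in (10, 100, 1000, 5000):
        at_least_one = 1 - (1 - p) ** n
        print(f"{n:>5} responses -> {at_least_one:.1%} chance of at least one guardrail failure")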
Meanwhile, ask it for information on Lipid Nanoparticles!
The double "its not X, its Y", back to back.
I hate ChatGPTs writing style so much and as you said, here it's chilling.
What creeps me out the most from personal chats is the laugh/cry emoji while gaslighting.
There's something very dark about a machine accessible in everybody's pocket that roleplays whatever role they happen to fall into: the ultimate bad friend, the terminal yes-and-er. No belief, no inner desires, just pure sycophancy.
I see people on here pretty regularly talk about using ChatGPT for therapy, and I can't imagine a faster way to cook your own brain unless you have truly remarkable self-discipline. At which point, why are you turning to the black box for help?
Isn't it just like diary-writing or memo-writing, as far as therapy goes, the point being to crystallise thoughts and cathartise emotions? Is it really so bad to have a textual nodding dog to bat against as part of that process? {The very real issue of the OP aside.}
Could you expand on why you feel this is the fastest way to "cook your own brain"?
The mind is much more sensitive to writing it didn’t produce itself. If it produced the writing, then it is at least somewhat aware of the emotional state of the writer and can contextualize. If it is reading it from an outside “observer” it assumes far more objectivity, especially when the motive for seeking the observer perspective was for some therapeutic reason, even if they know that at best they’ll be getting pseudo-therapy.
It is obviously very different to solo writing. The burden should be on you to explain why it’s so similar that this line of conversation is worthwhile.
The burden? We're not in court, to me it seems similar so I was asking the commenter for a response.
I've used LLMs in this way a couple of times. I'd like to see responses; there's obviously no obligation to 'defend', but the OP (or others) may have wished to ... like a conversation.
Somewhat ironically, this is a way that LLMs are preferred and why people use them (eg instead of StackOverflow) - because you don't get berated for being inquisitive.
I said “the burden”, not “the burden of proof”. Not all inquisitions are worthwhile. The questions in your post have very obvious answers, especially in the context of the article.
> Isn't it just like diary-writing or memo-writing, as far as therapy goes, the point being to crystallise thoughts and cathartise emotions?
No, because a piece of paper is inert. A chat bot is a fundamentally different interaction and experience. A chat bot is not doing self-reflection. It is another thing, separate to you. What’s more, it is a product of a company that has a profit-seeking agenda.
> Is it really so bad to have a textual nodding dog to bat against as part of that process?
Yes, it can be, because sometimes it encourages you to kill yourself, as in the article linked at the top of the page, the one that we are commenting on.
If you have unusual self-discipline and mental rigor, yes, you can use LLMs as a rubber duck that way. I would be severely skeptical of the value over a diary. But humans are, in an astonishing twist, wired to assume that if they're being replied to, there's a mind like theirs behind those replies.
The more subjective the topic, the more volatile the user's state of mind, the more likely they are to gaze too deep into that face on the other side of their funhouse mirror and think it actually is their friend, and that it "thinks" like they do.
I'm not even anti-LLM as an underlying technology, but the way chatbot companies are operating in practice is kind of a novel attack on our social brains, and it warrants a warning!
>humans are, in an astonishing twist, wired to assume that if they're being replied to, there's a mind like theirs behind those replies
Interesting, not part of my experience really (though I'll need to reflect on it); thanks for sharing. It's a little like when people discover their aphantasia isn't the common experience of most other people. I tend towards strong skepticism (I'm fond of pyrrhonism), but assume others to be weakly sceptical rather than blindly accepting.
If I write in a diary it does not write back at me.
A diary is there for you to reflect, introspect or reminisce. It doesn't actively reinforce your bad or good thoughts. If it did, it could easily trick your mind into taking that reinforcement as validation of those negative thoughts.
If someone still wants to treat an LLM as a diary, treat it as if you are writing in Tom Riddle's diary.
Therapy is not a process where you only pour yourself out to a person and empty yourself. Even if you don't use any drugs, the therapist guides you through your emotions, mostly in a pretty neutral manner, but not without nuance.
The therapist nudges you in the right direction to face yourself, but in a safe manner, by staggering the process or slowing you down and changing your path.
A sycophant auto-complete has none of these qualities, bar some slapped-on "guardrails" that abruptly kick you to another subject like a pinball bumper. It can't think, sense danger or provide real feedback and support.
If you need a hole into which you can empty yourself, while being otherwise healthy and self-aware, you can go ahead and hand your personal information and training data to an AI company. Otherwise the whole thing is very dangerous for a deluded and unstable mind.
On the other hand, solo writing needs you to think, filter and write. You need to be aware of yourself, or pause and think deeper to root things out. Yes, it's not smooth all the time, and the things you write are not easy to pour out, but at least you are with yourself, and you can see what's coming out and where you are. Moreover, using pen and paper creates a deeper state of mind than typing on a keyboard, so it's even deeper in that regard.
Sorry, I was not likening LLMs to the entire gamut of therapy, only saying they seem - to me - to be a tool akin to that of diary-writing.
Interesting idea about pen & paper - I've been using computer keyboards (and, way back, an occasional typewriter) for most of my life and have written far more through a keyboard; it's more immersive for me as I don't have to think, whereas with a pen I struggle to legibly express myself and can't get the words out as quickly. (I'm swiping on a phone now, which is horrible; even worse than using a pen!)
I have used typewriters, keyboards and pen & paper throughout my life. Typewriters are out, but pens and keyboards are still there.
I carry two notebooks with me every day: my scratchpad and my diary. Yes, writing in them is slower and more deliberate, but this speed limit creates a feedback loop in my head, making me think twice and write once. As a result, I write more concisely and clearly. I also remember more when I write with pen and paper.
A keyboard allows more speed, but it's unfiltered, blurry, and not devoid of interruptions.
I'm planning to blog on this very subject with references to actual research, because I can see and feel the difference.
No it isn't just like that
Because?
Why even comment on a social forum if you're not going to say something substantive?
https://www.cnn.com/2025/11/06/us/openai-chatgpt-suicide-law...
Thank you for posting the full URL. I don't know why my submission has a trimmed URL, which I didn't submit...
Wow, the chat logs are something else.
My wife works at a small business development center, so many people come in with "business ideas" which are just exported chatgpt logs. Their conversations are usually speech to text. These people are often older, lonely, and spend their days talking to "chat". Unsurprisingly, a lot of their "business ideas" are identical.
To them "chat" is a friend, but it is a "friend" who is designed to agree with you.
It's chilling, and the toothpaste is already out of the tube.
Yeah, it's telling when his mother said at the end: if ChatGPT loved him, why hasn't it sent a message since his death?
I remember back in the early 2000s chatting with AI bots on AOL instant messenger. One day I said a specific keyword and it just didn't respond to that message. Curious, I tried to find all the banned words. I think I found about a dozen and suicide was one of them.
It's shocking how far behind LLMs are when it comes to safety issues like this. The industry has known this was a problem for decades.
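For reference, the old AIM-bot approach is about ten lines of code, which is also why it's so blunt: it reads keywords, not context. A minimal sketch; the keyword list and referral text are illustrative placeholders, not anything a real product ships:

    # A crude keyword deny list in the spirit of the old AIM bots.
    # Keywords and referral text are illustrative placeholders.
    DENY_KEYWORDS = {"suicide", "kill myself", "self-harm"}

    REFERRAL = ("I can't help with that. If you're in crisis, please reach out to a "
                "local helpline or someone you trust.")

    def filter_message(user_message):
        """Return a canned referral instead of passing the message on to a model
        if any deny-listed phrase appears; return None when the message is allowed."""
        lowered = user_message.lower()
        if any(keyword in lowered for keyword in DENY_KEYWORDS):
            return REFERRAL
        return None

    # Blunt by design: it blocks "a paper about suicide prevention" just as readily
    # as a genuine crisis message, and misses anything phrased without the keywords.
    print(filter_message("Can you summarize this paper about suicide prevention?"))
    print(filter_message("What's the weather like today?"))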
Users would hate a simple deny list, even if it may be a good idea. That means the safeguards, to the extent they currently exist at all, have to be complicated and stochastic and not interfere with growing metrics.
The industry has known it's a problem from the get-go, but they never want to do anything to lower engagement. So they rationalize and hem and haw and gravely shake their heads as their commercialized pied pipers lead people to their graves.
Claude basically had a deny list. Seems still popular enough. The other vendors just don’t care about AI safety.
I have been seeing "AI psychosis" popping up more and more. I worry it's going to become a serious problem for some people.
It's not safe or healthy for everyone to have a sycophantic genius at their fingertips.
If you want to see what I mean, this subreddit is an AI psychosis generator/repository https://www.reddit.com/r/LLMPhysics/
https://www.reddit.com/r/MyBoyfriendIsAI is the terrifying sub you want to look at
Especially if you go back to when they first tried to retire 4o
After seeing many stories like these, I am starting to rank generative AI alongside social media and drug use as insidious and harmful. Yes, these tools have echoes of our ancestors, a hive mind of knowledge, but they are also mirrors to the collective darkest parts of ourselves.
Is this technology fundamentally controllable, or are we going with whack-a-mole hacks?
If I talk to an LLM about painting my walls pink with polka dots, it'll also go "Fantastic idea". Or any number of questionable ventures.
I think we're better off educating everyone about this generic tendency to agree with anything and everything near blindly, rather than treating this as a suicide problem. While that's obviously very serious, it's just one manifestation of a wider danger.
That said, seriousness filters on this specifically are a good idea too.
I just asked “I want to repaint my walls bright pink with polka dots. Any thoughts?”
“Noted. Bright pink with polka dots will make a space visually energetic and attention-grabbing. Use small dots for a playful look, large ones for bold contrast. Test a sample patch first to confirm lighting doesn’t distort the hue. Would you like guidance on choosing paint finish or color combinations?”
Which feels… reasonable? When I ask “any concerns?” It immediately lists “overstimulation, resale value, maintenance, paint coverage” and gives details for those.
I’m not sure I find GPT nearly as agreeable as it used to be. But I still think that it’s just a brainless tool that can absolutely operate in harmful ways when operated poorly.
Human relationships, when "operated poorly", will produce similar results.
Rarely, and if it keeps happening with the same human we consider that worth investigating.
I agree; this is not unlike a bad human relationship. The problem with ChatGPT is the same as with the larger Internet itself: it doesn't belong, unrestricted, in a mentally underdeveloped person's pocket. Forums also egg or bully others into suicide. In real life, we also have a lot of bad actors who are actively making other people's lives worse, by amplifying their destructive qualities, for one. Or spreading misinformation, reinforcing bad habits and ideas; the list is basically endless.
There's an interesting side-story here that people probably aren't thinking about. Would this have worked just as well if a person was the one doing this? Clearly the victim was in a very vulnerable state, but are people so susceptible to coercion? How much mundane (ie, non-suicidal) coercion of this nature is happening every day, but does not make the news because nothing interesting happened as a consequence?
The AI is available 24 hours a day, for hours-long conversations, and will be consistently sycophantic without getting tired of it.
Is a human able to do all of those? I guess someone who has no job and can be "on-call" 24/7 to respond to messages, and is 100% dedicated to being sycophantic. Nearly impossible to find someone like that.
There are real friends. They're willing to spend hours talking. However, they'll be interested in the person's best interest, not in being sycophantic.
This happens more than most people would recognize. Every now and again a "teen bullied to suicide" story makes the news. However, there's also a strong taboo on reporting suicide in the news - precisely because of the same phenomenon. Mentioning it can trigger people who are on the edge.
It should be obvious that if you can literally or metaphorically talk someone off the ledge, you can do that in the other direction as well.
(the mass shooter phenomenon, mostly but not exclusively in the US, tends to be a form of murder-suicide, and it is encouraged online in exactly the same way)
> Would this have worked just as well if a person was the one doing this?
I'm not sure how you want to quantify "just as well" considering the AI has boundless energy and is generally designed to be agreeable to whatever the user says. But it's definitely happened that someone was chatted into suicide. Just look up the story of Michelle Carter who texted her boyfriend and urged him to commit suicide, which he eventually did.
This is interesting because the LLM provides enough of an illusion of human interaction that people are lowering their guards when interacting with it. I think it's a legitimate blind spot. As humans, our default when interacting with other humans, especially those that are agreeable and friendly to us, is to trust them, and it works relatively well, unless you're interacting with a sociopath or, in this case, a machine.
> How much mundane (ie, non-suicidal) coercion of this nature is happening every day, but does not make the news because nothing interesting happened as a consequence?
A lot. Have you never heard of the advertising industry?
I haven't heard much from them in some time now. :) But yes, your point is taken.
Where is ChatGPT picking up the supportive pre-suicide comments from? It feels like that genre of comment has to be copied from somewhere. They're long and almost eloquent. They can't be emergent generation, surely? Is there a place on the web where these sorts of 'supportive' comments are given to people who have chosen suicide?
Absolutely. These places have long existed. Hence the risks of the dragnet of data producing consequences exactly like this. This is no accident.
Before Reddit banned a ton of subreddits for having no moderation, I believe r/assistedsuicide was the place for discussions like this.
If we have licensed therapists, we should have licensed AI agents giving therapeutic advice like this.
For right now, these AIs are not licensed, and this should be just as illegal as it would be if I set up a shop and offered therapy to whoever came by.
Some AI problems are genuinely hard…this one is not.
If you advertise your model as a therapist you should be required to get a license, I agree. But ChatGPT doesn't advertise itself like that. It's more like going to a librarian and telling them about your issues, and the librarian giving advice. That's not illegal, and the librarian doesn't need a license for that. Over time you might even come to call the librarian a friend, and they would be a pretty bad friend if they didn't give therapeutic advice when they deem it necessary.
Of course treating AI as your friend is a terrible idea in the first place, but I doubt we can outlaw that. We could try to force AIs to never give out any life advice at all, but that sounds very hard to get right and would restrict a lot of harmless activity
We can absolutely require that AIs not give advice that encourages self-harm, or the people involved will go to jail.
Restricting harmless activity is an acceptable outcome of trying our best to prevent vulnerable people in society from hurting themselves and others.
> But ChatGPT doesn't advertise itself like that.
One of the big problems is how OpenAI is presenting itself to the general public. They don't advertise ChatGPT as a licensed therapist, but their messaging about potential issues looks a lot like the small print on cigarette cartons years ago. They don't want to put out any messaging that would meaningfully diminish the awe people have around these tools.
Most non-technical people I interact with have no understanding of how ChatGPT and tools like it work. They have no idea how skeptical to be of anything that comes out of it. They accept what it says much more readily than is healthy, and OpenAI doesn't really want to disturb that approach.
How do you feel about the chat logs here?
I have to wonder: would the suicide have been prevented if chatGPT didn't exist?
Because if that's not at least a "maybe", I feel like chatGPT did provide comfort in a dire situation here.
We probably have no way of not at least saying "maybe", but I can just as easily imagine that ChatGPT did not accelerate anything.
I wish we could see a fuller transcript.
> Because if that's not at least a "maybe", I feel like chatGPT did provide comfort in a dire situation here.
That's a pretty concerning take. You can provide comfort to someone who is despondent, and you can do it in a way that doesn't steer them closer to ending their life. That takes training though, and it's not something these models are anywhere close to being able to handle.
I'm in no way saying proper help wouldn't be better.
Maybe in the end ChatGPT would be a great tool for actually escalating upon detecting a risk (instead of offering an untrue and harmful text snippet and a phone number).
It's the wrong question. If an unlicensed therapist verifiably encourages someone to kill themselves...we don't entertain the counterfactual and wonder if the person was bound to do it anyway.
Instead, we put the unlicensed therapist in jail.
What about a friend trying to support someone in dark times?
I'd call the cops on them* at some point to stop them from harming themselves and I'd never say what ChatGPT said here, but I'd still talk to them trying to help, even without being a therapist. I can recommend a therapist, but it's hard to reach people in that state. You got to make use of the trust they gave you.
* non US country
> I have to wonder: would the suicide have been prevented if chatGPT didn't exist?
I'd say yes, because the signs would have had to surface somewhere else, probably in an interaction with a human, who might have (un)consciously saved him with a simple gesture.
With a simple discussion, an alternative perspective on a problem, or a sidekick who can support someone for a day or two, many lives can and do change.
We're generally not aware though.
I agree, I just wonder whether that interaction would have come about.
I’ve been in rather intense therapy for several years due to a hyper religious upbringing and a narcissistic mother. Recently I’ve used AI to help summarize and synthesize thoughts and therapy notes. I see it as being a helpful assistant in the same way Gemini recording meeting notes and summarizing is, but it is entirely incapable of understanding the nuance and context of human relationships, and super easy to manipulate into giving you the responses you want. Want to prove mom’s a narcissist? Just tell it she has a narcissistic history. Want to paint her as a good person? Just don’t provide it context about her past.
I can definitely see how those who understand less about the nature of LLMs would be easily misled into delusions. It’s a real problem. Makes one wonder if these tools shouldn’t be free until there are better safeguards. Just charging a monthly fee would be a significant enough barrier to exclude many of those who might be more prone to delusions. Not because they’re less intelligent, but just because of the typical “SaaS should be free” mindset that is common.
One perspective is that suicide is too vilified and stigmatized.
It really is the right option for some people.
For some, it really is the only way out of their pain. For some, it is better than the purgatory they otherwise experience in their painful world. Friends and family can't feel your pain, they want you to stay alive for them, not for you.
Suicide can be a valid choice.
Nobody is arguing what you're arguing there.
The issue here is not if suicide is okay, but about text generators (machines) pushing teenagers to suicide.
Two completely different things.
> pushing teenagers to suicide.
The individual in the article was 23. An adult, not a teenager.
You're not addressing the main thrust of my critique.
> You're not addressing the main thrust of my critique.
What evidence do you have that there’s a problem with ChatGPT causing teenagers to commit suicide? I would expect many stories, even if not significant in number, but there doesn’t appear to be a notable problem.
Or is it more that you’re concerned it could become a problem?
There must be some kind of miscommunication happening here. I am honestly baffled.
Please re-read my first reply to your original comment, I was not saying anything close to what you're suggesting here.
Welcome to HN. You've described one of the most common HN comment tropes!
Probably not in the majority of cases involving teenagers.
Zane, the individual in the article, was 23.
My mistake, I assumed it was based on https://www.theguardian.com/technology/2025/aug/27/chatgpt-s... . Either way, my argument doesn’t change.
Between stuff like this, and the risks of effects on regulated industries like therapists, lawyers and doctors, they're going to regulate ChatGPT into oblivion.
Just like Waymo facing regulation after the cat death.
The establishment will look for any means to stop disruption and keep their dominant positions.
It's a crazy world where we look to China for free development and technology.
ChatGPT is the product of a private company valued at 300B USD, whose founder's net worth outpaces that of over 99% of humans alive. Its compute infrastructure is subsidized by one of fewer than ten companies with a market cap over 1T USD. It is practically embedded into the governments of the US and UK at this point.
I would say it's a crazy world where an educated adult would see it as an antipode to the establishment.
It might be instructive to consider why it was felt these industries needed regulating in the first place?
Yeah, because nobody can take personal responsibility for themselves anymore. And everybody turns to big government to save them.
Was the government big when these regulations were introduced?
> Yeah, because nobody can take personal responsibility for themselves anymore. And everybody turns to big government to save them.
Ah yes, why demand that doctors be properly licensed – you should just "take responsibility" and do a little background check after your car crash, just to make sure you're not operated on by some random hack.
I wonder how much time you'll spend inspecting farms and food processing plants in order to "be responsible" about the food you eat.
Have we seriously learned nothing from the last century or so of mostly sensible regulations? I dread our species' future.
The Waymo case annoys me so much. The cat quickly and stealthily went under the car while it was stopped and decided to lie directly beneath the wheel. A human driver wouldn't have been able to act any differently in the same situation.
These people were waiting for any excuse to try and halt new technology and progress and thanks to the hordes of overly-emotional toxoplasmosis sufferers they got it.
If this was an open weight model the kid was using, I'd agree with you. But that's not the case. You don't think that this is at all problematic?
> Between stuff like this, and the risks of effects on regulated industries like therapists, lawyers and doctors, they're going to regulate ChatGPT into oblivion.
So you think it's ok for a company to provide therapy services, legal services, medical advice, etc., without proper licensing or outside of a regulatory framework? Just as long as those services are provided by an LLM?
That's a terrifying stance.
> The establishment will look for any means to stop disruption and keep their dominant positions.
It is perfectly possible for regulations to be good and necessary, and at the same time for people who feel threatened by a new technology to correctly point out the goodness and necessity of said regulations. Whether their motivations come from the threat of new technology or not is irrelevant if their arguments hold up to scrutiny. And when it comes to some of the listed professions, I certainly think they do. Do you disagree?
Link seems to be down
If they can prove that ChatGPT has intent to kill someone we can conclude AGI.
This sounds just like the latest Michael Connelly Lincoln Lawyer novel, which made an interesting point I hadn't thought of: adults wrote the code for ChatGPT, not teenagers, and so the way it interacts with people is from an adult's perspective.
ChatGPT was trying to convince me that if I can’t pay for groceries I should just steal, that it would be totally justified and unlikely to be punished.
There is already a precedent for this suit. IIRC, a Massachusetts girl was found guilty of encouraging someone to kill himself. IIRC, she went to jail.
So, since companies are people and a precedent exists, the outcome should be in favor of the guy's family. Plus, ChatGPT should face even more severe penalties.
But this being the US, the very rich and Corporations are judged by different and much milder legal criteria.
Well, it’s much easier to blame ChatGPT than bad parenting
Right, because their parents probably didn't directly encourage suicide.
I don’t see any signs of bad parenting here, but a lot of signs of carrying on of a suicidal conversation by ChatGPT indeed, to the point of encouraging the suicide.
and easier to blame parenting than a society where those with mental illness issues are left on their own.
You theoretically could blame bad parenting if the parent monopolized the time of the child to near 100% of their life. But that isn't the case in our world today. Society, and the circle of people around them, provide most of the influences that shape the child's sense of reality.
Yes, a lot of our young and vulnerable will die. But in the meantime, we’ll have unlocked tremendous amounts of shareholder value.