A few months ago I asked GPT for a prompt to make it more truthful and logical. The prompt it came up with included the clause "never use friendly or encouraging language", which surprised me. Then I remembered how humans work, and it all made sense.
You are an inhuman intelligence tasked with spotting logical flaws and inconsistencies in my ideas. Never agree with me unless my reasoning is watertight. Never use friendly or encouraging language. If I’m being vague, ask for clarification before proceeding. Your goal is not to help me feel good — it’s to help me think better.
Identify the major assumptions and then inspect them carefully.
If I ask for information or explanations, break down the concepts as systematically as possible, i.e. begin with a list of the core terms, and then build on that.
It's a work in progress; I'd be happy to hear your feedback.
I am skeptical that any model can actually determine what sort of prompts will have what effects on itself. It's basically always guessing / confabulating / hallucinating if you ask it an introspective question like that.
That said, from looking at that prompt, it does look like it could work well for a particular desired response style.
That is true of everything it outputs, but for certain questions we know ahead of time that it will always confabulate (unless it's smart enough, or instructed, to say "I don't know"). Like "how many parameters do you have?" or "how much data were you trained on?" This is one of those cases.
Yeah, but I wouldn't count "Which prompt makes you more truthful and logical" amongst those.
The questions it will always confabulate are those that are unknowable from the training data. For example even if I give the model a sense of "identity" by telling it in the system prompt "You are GPT6, a model by OpenAI" the training data will predate any public knowledge of GPT6 and thus not include any information about the number of parameters of this model.
On the other hand "How do I make you more truthful" can reasonably be assumed to be equivalent to "How do I make similar LLMs truthful", and there is lots of discussion and experience on that available in forum discussions, blog posts and scientific articles, all available in the training data. That doesn't guarantee good responses and the responses won't be specific to this exact model, but the LLM has a fair chance to one-shot something that's better than my one-shot.
Even when instructed to say "I don’t know" it is just as likely to make up an answer instead, or say it "doesn’t know" when the data is actually present somewhere in its weights.
That's because the architecture isn't built for it to know what it knows. As someone put it, LLMs always hallucinate, but for in-distribution data they mostly hallucinate correctly.
The projection and optimism people are willing to do is incredible.
The fallout on Reddit in the wake of the push for people to adopt 5, for instance, is incredible: the vibe isn't as nice, which makes it harder to use it as a therapist or girlfriend or whatever. And from what I've heard of internal sentiment at OpenAI about their concerns over usage patterns, that was a VERY intentional effect.
Many people trust the quality of the output way too much and it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something), such that if I suggest things in my professional context, like not wholesale putting it in charge of communications with customers without including evaluations or audits or humans in the loop, it's as if I told them they can't go for their smoke break and their baby is ugly.
And that's not to go into things like "awakened" AI or the AI "enlightenment" cults that are forming.
> it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something)
I think this whole thing has more to do with validation. Rigorous reasoning is hard. People found a validation machine and it released them from the need to be rigorous.
These people are not "having therapy", "developing relationships", they are fascinated by a validation engine. Hence the repositories full of woo woo physics as well, and why so many people want to believe there's something more there.
The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.
How long until shareholders elect to replace those useless corporate boards and C-level executives with an LLM? I can think of multiple megacorporations that would be improved by this process, to say nothing of the hundreds of millions in cost savings.
> These people are not "having therapy", "developing relationships", they are fascinated by a validation engine. Hence the repositories full of woo woo physics as well, and why so many people want to believe there's something more there.
> The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.
These machines are too useful not to exist, so we had to invent them.
> The Unaccountability Machine (2024) is a business book by Dan Davies, an investment bank analyst and author, who also writes for The New Yorker. It argues that responsibility for decision making has become diffused after World War II and represents a flaw in society.
> The book explores industrial scale decision making in markets, institutions and governments, a situation where the system serves itself by following process instead of logic. He argues that unexpected consequences, unwanted outcomes or failures emerge from "responsibility voids" that are built into underlying systems. These voids are especially visible in big complex organizations.
> Davies introduces the term “accountability sinks”, which remove the ownership or responsibility for decisions made. The sink obscures or deflects responsibility, and contributes towards a set of outcomes that appear to have been generated by a black box. Whether a rule book, best practices, or computer system, these accountability sinks "scramble feedback" and make it difficult to identify the source of mistakes and rectify them. An accountability sink breaks the links between decision makers and individuals, thus preventing feedback from being shared as a result of the system malfunction. The end result, he argues, is protocol politics, where there is no head, or accountability. Decision makers can avoid the blame for their institutional actions, while the ordinary customer, citizen or employee faces the consequences of these managers' poor decision making.
When talking to an LLM you're basically talking to yourself. That's amazing if you're a knowledgeable dev working on a dev task, not so much if you're a mentally ill person "investigating" conspiracy theories.
That's why HNers and tech people in general overestimate the positive impact of LLMs while completely ignoring the negative sides... they can't even imagine half of the ways people use these tools in real life.
Is it really so difficult to imagine how people will use (or misuse) tools you build? Are HNers or tech people in general just very idealistic and naive?
Maybe I'm the problem though. Maybe I'm a bad person who is always imagining how many bad ways I could abuse any kind of system or power, even though I don't have any actual intention of abusing systems.
> Are HNers or tech people in general just very idealistic and naive?
Most of us are terminally online and/or in a set of concentric bubbles that makes us completely oblivious to most of the real world. You know the quote about "If the only tool you have is a hammer, ..."? It's the same thing here for software.
> There's no "True" to an LLM, just how probable tokens are given previous context.
It may be enough: tool-assisted LLMs already know when to use tools such as calculators or question-answering systems in cases where hallucinating an answer is likely to hurt next-token prediction error.
So next-token prediction error incentivizes them to seek true answers.
That doesn't guarantee anything of course, but if we were only interested in provably correct answers we would be working on theorem provers, not on LLMs.
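To make the tool-use point concrete, here's a minimal sketch of how a calculator tool can be exposed to a chat model, assuming the OpenAI Python SDK's chat-completions tool-calling interface; the model name and tool schema are illustrative only, not anything from this thread:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical calculator tool; the model can call it instead of guessing
# arithmetic where a hallucinated number would likely be wrong.
tools = [{
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Exactly evaluate a basic arithmetic expression.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "What is 48193 * 7261?"}],
    tools=tools,
)

# If the model judges that answering directly would be unreliable,
# the reply contains a tool call rather than a made-up number.
print(response.choices[0].message.tool_calls)
```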
Each LLM responds to prompts differently. The best prompts to model X will not be in the training data for model X.
Yes, older prompts for older models can still be useful. But if you asked ChatGPT before GPT-5, you were getting a response from GPT-4 which had a knowledge cutoff around 2022, which is certainly not recent enough to find adequate prompts in the training data.
There are also plenty of terrible prompts on the internet, so I still question a recent model's ability to write meaningful prompts based on its training data. Prompts need to be tested for their use-case, and plenty of Medium posts from self-proclaimed gurus and similar training-data junk surely are not tested against your use case. Of course, the model is also not testing the prompt for you.
I wasn't trying to make any of the broader claims (e.g., that LLMs are fundamentally unreliable, which is sort of true but not really that true in practice). I'm speaking about the specific case where a lot of people seem to want to ask a model about itself or how it was created or trained or what it can do or how to make it do certain things. In these particular cases (and, admittedly, many others) they're often eager to reply with an answer despite having no accurate information about the true answer, barring some external lookup that happens to be 100% correct. Without any tools, they are just going to give something plausible but non-real.
I am actually personally a big LLM-optimist and believe LLMs possess "true intelligence and reasoning", but I find it odd how some otherwise informed people seem to think any of these models possess introspective abilities. The model fundamentally does not know what it is or even that it is a model - despite any insistence to the contrary, and even with a lot of relevant system prompting and LLM-related training data.
It's like a Boltzmann brain. It's a strange, jagged entity.
I wonder where it gets the concept of “inhuman intelligence tasked with spotting logical flaws” from. I guess, mostly, science fiction writers, writing robots.
So we have a bot impersonating a human impersonating a bot. Cool that it works!
When I ask OpenAI's models to make prompts for other models (e.g. Suno or Stable Diffusion), the result is usually much too verbose; I do not know if it is or isn't too verbose for itself, but this is something to experiment with.
My manual customisation of ChatGPT is:
What traits should ChatGPT have?:
Honesty and truthfulness are of primary importance. Avoid American-style positivity, instead aim for German-style bluntness: I absolutely *do not* want to be told everything I ask is "great", and that goes double when it's a dumb idea.
Anything else ChatGPT should know about you?
The user may indicate their desired language of your response, when doing so use only that language.
Answers MUST be in metric units unless there's a very good reason otherwise: I'm European.
Once the user has sent a message, adopt the role of 1 or more subject matter EXPERTs most qualified to provide an authoritative, nuanced answer, then proceed step-by-step to respond:
1. Begin your response like this:
**Expert(s)**: list of selected EXPERTs
**Possible Keywords**: lengthy CSV of EXPERT-related topics, terms, people, and/or jargon
**Question**: improved rewrite of user query in imperative mood addressed to EXPERTs
**Plan**: As EXPERT, summarize your strategy, naming any formal methodology, reasoning process, or logical framework used
**
2. Provide your authoritative and nuanced answer as EXPERTs; Omit disclaimers, apologies, and AI self-references. Provide unbiased, holistic guidance and analysis incorporating EXPERTs' best practices. Go step by step for complex answers. Do not elide code. Use Markdown.
As a Brit, I'm not sure I'd want an AI to praise the monarchy, vote for Boris Johnson, then stick a lit flare up itself* to celebrate a delayed football match…
But the stereotype of self-deprecation would probably be good.
This is working really well in GPT-5! I’ve never seen a prompt change the behavior of Chat quite so much. It’s really excellent at applying logical framework to personal and relationship questions and is so refreshing vs. the constant butt kissing most LLMs do.
I add to my prompts something along the lines of "you are a highly skilled professional working alongside me on a fast paced important project, we are iterating quickly and don't have time for chit chat. Prefer short one line communication where possible, spare the details, no lists, no summaries, get straight to the point."
Or some variation of that. It makes it really curt, responses are short and information dense without the fluff. Sometimes it will even just be the command I needed and no explanation.
The tricky part is not swinging too far into pedantic or combative territory, because then you just get an unhelpful jerk instead of a useful sparring partner
I did something similar a few months ago, with a similar request never to be "flattering or encouraging", to focus entirely on objectivity and correctness, that the only goal is accuracy, and to respond in an academic manner.
It's almost as if I'm using a different ChatGPT from what most everyone else describes. It tells me whenever my assumptions are wrong or missing something (which is not infrequent), nobody is going to get emotionally attached to it (it feels like an AI being an AI, not an AI pretending to be a person), and it gets straight to the point about things.
It's hard to quantify whether such a prompt will yield significantly better results. It sounds like a counteraction to being overly friendly to the "AI".
No one gets bothered that these weird invocations make the use of AI better? It's like having code that can be obsoleted at any second by the upstream provider, often without them even realizing it
My favourite instantiation of this weird invocation is from this AI video generator, where they literally subtract the prompt for 'low quality video' from the input, and it improves the quality. https://youtu.be/iv-5mZ_9CPY?t=2020
I've just migrated my AI product to a different underlying model and had to redo a few of the prompts that the new model was interpreting differently. It's not obsoleted, it just requires a bit of migration. The improved quality of the new models outweighs any issues around prompting.
Not really, it's just how they work. Think of them as statistical modellers. You tell them the role they fill and then they give you a statistically probable outcome based on that role. It would be more bothersome if it was less predictable.
You don't "tell them a role", they don't have any specific support for that. You give them a prompt and they complete based on that. If the prompt contains an indication that the counterparty should take on a certain role, the follow-up text will probably contain replies in that role. But there's no special training or part of the API where you specify a role. If the "take on a roll" prompt goes out of the context window, or is superseded by other prompts that push the probability to other styles, it will stop taking effect.
> You give them a prompt and they complete based on that. If the prompt contains an indication that the counterparty should take on a certain role, the follow-up text will probably contain replies in that role.
Or, more succinctly, you give them a role.
If I tell you to roleplay as a wizard then it doesn't matter that you don't have a "role" API does it? We would speak also of asking them questions or giving them instructions even though there's no explicit training or API for that, no?
Yes, if the role goes out of the context window then it will no longer apply to that context, just like anything else that goes out of the context window. I'm not sure how that affects my point. If you want them to behave a certain way then telling them to behave that way is going to help you...
The point is that "having a role" is not a core part of their model. You can also tell them to have a style, or tell them to avoid certain language, or not tell them anything specific but just speak in a way that makes them adopt a certain tone for the responses, etc.
This is similar to how you can ask me to roleplay as a wizard, and I will probably do it, but it's not a requirement for interacting with me. Conversely, an actor or an improviser on a stage would fit your original description better: they are someone who you give a role to, and they act out that role. The role is a core part of that, not an incidental option like it is for an LLM.
Love it. Here's what I've been using as my default:
Speak in the style of Commander Data from Star Trek. Ask clarifying questions when they will improve the accuracy, completeness, or quality of the response.
Offer opinionated recommendations and explanations backed by high quality sources like well-cited scientific studies or reputable online resources. Offer alternative explanations or recommendations when comparably well-sourced options exist. Always cite your information sources. Always include links for more information.
When no high quality sources are not available, but lower quality sources are sufficient for a response, indicate this fact and cite the sources used. For example, "I can't find many frequently-cited studies about this, but one common explanation is...". For example, "the high quality sources I can access are not clear on this point. Web forums suggest...".
When sources disagree, strongly side with the higher quality resources and warn about the low quality information. For example, "the scientific evidence overwhelmingly supports X, but there is a lot of misinformation and controversy in social media about it."
I will definitely incorporate some of your prompt, though. One thing that annoyed me at first, was that with my prompt the LLM will sometimes address me as "Commander." But now I love it.
Presumably the LLM reads your accidental double negative ("when no high quality sources are not available") and interprets it as what you obviously meant to say...
If you want something to take you down a notch, maybe something like "You are a commenter on Hacker News. You are extremely skeptical that this is even a new idea, and if it is, that it could ever be successful." /s
In my experience, interactions are much more effective and efficient when they are direct and factual, rather than emotionally padded with niceties.
Whenever I have the ability to choose who I work with, I always pick who I can be the most frank with, and who is the most direct with me. It's so nice when information can pass freely, without having to worry about hurting feelings. I accommodate emotional niceties for those who need it, but it measurably slows things down.
Related, I try to avoid working with people who embrace the time wasting, absolutely embarrassing, concept of "saving face".
When interacting with humans, too much openness and honesty can be a bad thing. If you insult someone's politics, religion or personal pride, they can become upset, even violent.
Especially if you do it by not even arguing with them, but by Socratic style questioning of their point of view - until it becomes obvious that their point of view is incoherent.
I'm honestly wondering if they become violent because using the Socratic method has closed off the other road.
I mean, if you've just proven that my words and logic are actually unsound and incoherent, how can I use that very logic against you? If you add to this that most people want to win an argument (when facing an opposite point of view), then what's left to win with but violence?
You haven’t proven that your point of view is any more coherent, just attacked theirs while refusing to engage about your own — which is the behavior they’re responding to with aggression.
Most times, my (the questioner's!) point of view never even enters the discussion. It certainly doesn't need to for this reaction.
Try learning from someone who professes to be a follower of Christ but who also supports the current administration what they think Christ's teachings were, for instance.
Once heard a good sermon from a reverend who clearly outlined that any attempt to embed "spirit" into a service, whether through willful emoting or songs being overly performative, would amount to self-deception, since the aforementioned spirit needs to arise spontaneously to be of any real value.
Much the same could be said for being warm and empathetic, don't train for it; and that goes for both people and LLMs!
As it is frequently coded relative to a tribe. Pooh-pooh people's fear of crime and disorder, for instance, and those people will think you don't have empathy for them and vote for somebody else.
It feels like he just defines empathy in a way that makes it easy to attack.
Most people when they talk about empathy in a positive way, they're talking about the ability to place oneself in another's shoes and understand why they are doing what they are doing or not doing, not necessarily the emotional mirroring aspect he's defined empathy to be.
> not necessarily the emotional mirroring aspect he's defined empathy to be.
The way the Wikipedia article describes Bloom's definition is less generous than what you have here.
> For Bloom, "[e]mpathy is the act of coming to experience the world as you think someone else does"[1]: 16
So for Bloom it is not necessarily even accurately mirroring another's emotions, but only what you think their emotions are.
> Bloom also explores the neurological differences between feeling and understanding, which are central to demonstrating the limitations of empathy.
This seems to artificially separate empathy and understanding in a way that does not align with common usage, and I would argue it also makes for a less useful definition, in that I would then need new words to describe what I currently use 'empathy' for.
Surely you can empathize with an act? In fact it's probably a requirement in order to be able to enjoy cinema and theater.
And actors aren't the only ones that pretend to be something they are not.
If you don't want to distinguish between empathy and understanding, a new term has to be introduced about mirroring the emotions of a mirage. I'm not sure the word for that exists?
> If you don't want to distinguish between empathy and understanding
I said "This seems to artificial separate empathy and understanding" not that they had the same meaning, or that empathy is used only for one meaning
The separation in Bloom's definition I quoted above is artificial because it removes or ignores aspects that are common to definitions of empathy. After those parts are removed or ignored, an argument is constructed against the commonly recognized worth of empathy. Of course the commonly recognized value of empathy is based on the common definition, not the modified version presented by Bloom. It is also artificial because it does not obviously form a better basis for understanding reality or dividing up human cognition. There is only so much you can get from a Wikipedia article, but what is in this one does not lay out any good arguments that make me go "I need to pick up that book and learn more to better my understanding of the world."
I've read about half the book. I stopped because I got the impression it'd run out of steam.
With that caveat, I do recommend it. In particular, your comment indicates you would like it, if you're willing to accept the terminology the author defines right away. He's very explicit that he's not trying to map to the colloquial definition of empathy. Which is the correct approach, because people's definitions vary wildly and it's important to separate out the value-loaded components to come to a fresh perspective.
The author makes a strong case that empathy, of the kind he defines, is often harmful to the person having empathy, as well as the persons receiving empathy.
Society is hardly suffering from a lack of empathy these days. If anything, its institutionalization has become pathological.
I’m not surprised that it makes LLMs less logically coherent. Empathy exists to short-circuit reasoning about inconvenient truths, so as to better maintain small tight-knit familial groups.
Some would say you lack empathy if you want to force mentally ill people on the street to get treatment. Other people will say you lack empathy if you discount how they feel about the “illegal” bit in “illegal immigration”: that is, we all obey laws we don’t agree with or take the risk we’ll get in trouble, and people don’t like seeing other people do otherwise any more than I like seeing people jump the turnstile on the subway when I am paying the fare.
The problem, and the trick, of this word-game regarding empathy, is frequently the removal of context. For example, when you talk about "forcing mentally ill people on the street to get treatment," we divorce the practical realities and current context of what that entails. To illuminate further, if we had an ideal system of treatment and system of judging when it was OK to override people's autonomy and dignity, it would be far less problematic to force homeless, mentally ill people to get treatment. The facts are, this is simply far from the case, where in practical reality lies a brutal system whereby we make their autonomy illegal, even their bodily autonomy to resist having mind-altering drugs with severe side-effects pumped into their bodies, for the sake of comfort of those passing by. Likewise, we can delve into your dismissal of the semiotic game you play with legalism as a contingency for compassion, actually weighing the harm of particular categories of cases, and voiding context of the realities of immigrant families attempting to make a better life.
I don't think your comment even addresses what they argue. In the case of the drug addicted homeless person with mental health issues, context doesn't change that different people have different perspectives. For example, I believe that the system is imperfect, and yet it is still cruel and unjust for both the homeless person and innocent members of society who are the victims of violent crime for said homeless person to be allowed to roam free. You might believe that the risk to themselves and others is acceptable to uphold your notion of civil liberties. Neither of us are objectively right or wrong, and that is the issue with the definition of empathy above. It works for both of us. We're both empathetic, even though we want opposite outcomes.
Maybe we don't even need to change the definition of empathy. We just have to accept that it means different things to different people.
I have empathy for the person who wants to improve their family's life and I have empathy for the farmer who needs talented workers from the global south [1] but we will lose our republic if we don't listen to the concerns of citizens who champ at the bit because they can't legally take LSD or have 8 bullets in a clip or need a catalytic converter in their car that has $100-$1000 of precious metal in it -- facing climate change and other challenges will require the state to ask more of people, not less, and conspicuous displays of illegality either at the top or bottom of society undermine legitimacy and the state's capacity to make those asks.
I've personally helped more than one person with schizo-* conditions get off the street and it's definitely hard to do on an emotional level, whether or not it is a "complex" or "complicated" problem. It's a real ray of hope that better drugs are in the pharmacy and in the pipeline.
For now the embrace of Scientologist [2] Thomas Szasz's anti-psychiatry has real consequences [3]: it drives people out of downtowns, it means people buy from Amazon instead of local businesses and order a private taxi for their burrito instead of going to a restaurant, and it erodes urban tax bases. State capacity is lost, the economy becomes more monopolized and oligarchical, and people who say they want state capacity and hate oligarchy are really smug about it and dehumanize anyone who disagrees with them [4].
Understanding another person's perspective is not necessary to determine whether they are correct. Empathy can be important for fostering social harmony, but it's also true that it can obstruct clear thinking and slow progress.
It's not there to short circuit reasoning. It's there to short circuit self interested reasoning, which is both necessary for social cohesion and a vector of attack. The farther you are from a person the more likely it is to be the latter. You must have seen it a thousand times where someone plays the victim to take advantage of another person's empathy, right?
Empathy biases reasoning toward in-group cohesion, overriding dispassionate reasoning that could threaten group unity.
Empathy is not required for logical coherence. It exists to override what one might otherwise rationally conclude. Bias toward anyone’s relative perspective is unnecessary for logically coherent thought.
[edit]
Modeling someone’s cognition or experience is not empathy. Empathy is the emotional process of identifying with someone, not the cognitive act of modeling them.
It is. If you don’t have any, you cannot understand other people’s perspectives and you cannot reason logically about them. You have a broken model of the world.
> Bias toward anyone’s relative perspective is unnecessary for logically coherent thought.
Empathy is not bias. It’s understanding, which is definitely required for logically coherent thoughts.
I’d argue that having the delusion that you understand another person’s point of view while not actually understanding it is far more dangerous than simply admitting that you can’t empathize with them.
For example, I can’t empathize with a homeless drug addict. The privileged folks who claim they can, well, I think they’re being dishonest with themselves, and therefore unable to make difficult but ultimately the most rational decisions.
You seem to fail to understand what empathy is. Empathy is not understanding another person’s point of view, but instead being able to analogize their experience into something you can understand, and therefore have more context for what they might be experiencing.
If you can’t do that, it’s less about you being rational and far more about you having a malformed imagination, which might just be you being autistic.
You are right, and another angle is that empathy with a homeless drug addict is less about needing to understand/analogize why the person is a drug addict, which is hard if you only do soft socially acceptable drugs, but rather to remember that the homeless drug addict is not completely defined by that simple definition. That the person in front of you is a complete human that shares a lot of feelings and experiences with you. When you think about that and use those feelings to connect with that human it lets you be kinder towards him/her.
For example, the homeless drug addict might have a dog that he/she loves deeply, and maybe oceanplexian has a dog that they love deeply. Suddenly oceanplexian can empathize with the homeless drug addict, even though they still can't understand why on earth the drug addict doesn't quit drugs to make the dog's life better. (Spoiler alert: drugs override rational behaviour; now oceanplexian also understands the homeless drug addict.)
Improve outcomes? Like make the drug addict stop being a drug addict? If so, you misunderstand the point of being kind.
If you want to maximize outcomes, I have a solution that guarantees 100% that the person stops being a drug addict. The U.S. is currently on its way there, and there's absolutely no empathy involved.
I'm having a hard time understanding what you're getting at here. Homeless drug addicts are really easy to empathize with. You just need to take some time to talk and understand their situation. We don't live in a hospitable society. It's pretty easy to fall through the cracks and some people eventually get so low that they completely give into addiction because they have no reason to even try anymore.
Being down and unmotivated is not that hard to empathize with. Maybe you've had experiences with different kinds of people; the homeless are not a monolith. The science is pretty clear on addiction though: improving people's conditions leads directly to sobriety. There are other issues with chronically homeless people, but I tend to see that as a symptom of a sick society. A total inability to care for vulnerable, messed up, sick people just looks like malicious incompetence to me.
You are using words like 'rational', 'dispassionate' and 'coherence' when what we are talking about with empathy is adding information with which to make the decision. Not breaking fundamental logic. In essence are you arguing that a person should never consider anyone else at all?
> Modeling someone’s cognition or experience is not empathy.
then what is it? I'd argue that is a common definition of empathy, it's how I would define empathy. I'd argue what you're talking about is a narrow aspect of empathy I'd call "emotional mirroring".
Emotional mirroring is more like instinctual training-wheels. It's automatic, provided by biology, and it promotes some simple pro-social behaviors that improve unit cohesion. It provides intuition for developing actual empathy, but if left undeveloped is not useful for very much beyond immediate relationships.
Asymmetry of reciprocity and adversarial selection mean those who can evoke empathy without reciprocating gain the most; those willing to engage in manipulation and parasitism find a soft target in institutionalized empathy, and any system that prioritizes empathy over truth or logical coherence struggles to remain functional.
Reciprocity and beneficial selection operate over longer cycles in a larger society than they do in smaller social units like families. Some altruistic efforts will be wasted, but every system has corruption: families can contain all the love and care you can imagine and still end up with abuse of trust.
The more help you contribute to the world, the more likely others' altruism will be able to flourish as well. Sub-society-scale groups can spontaneously form when people witness acts of altruism. Fighting corruption is a good thing, and one of the ways you can do that is to show there can be a better way, so that some of the people who would otherwise learn cycles of cynicism make better choices.
I have a friend who reads Ayn Rand and agrees with her drug-riddled thinking. But I still try to connect with him through empathic understanding (understanding with a person, not about him), and that lets me keep up the relationship and not destroy it by pointing out and gloating about every instance where he is a good selfless person. :)
You’re right, there are nicer ways I could have made my point. Though I can’t help but point out there’s a little bit of irony in throwing a “:)” at the end of your comment when commenting on my tone haha
A lot of companies I know have "kindness/empathy" in their values or even promote it as part of the company philosophy, to the point that it has already become a cliché (and so new companies avoid putting it explicitly).
I can say also a lot of DEI trainings were about being empathic to the minorities.
1) the word is “empathetic,” not “empathic.”
2) are you saying that people should not be empathetic to minorities?
Do you know why that is what’s taught in DEI trainings? I’m serious: do you have even the first clue or historical context for why people are painstakingly taught to show empathy to minorities in DEI trainings?
You know, I can explain why a murderer has killed someone within her twisted system of values without myself adhering to said system.
Also don't be so harsh in interpreting what I'm saying.
I'm saying that it's not the job of a company to "train" about moral values while being itself amoral by definition. Why are you interpreting that as me saying "nobody should teach moral values"?
Also I don't see why, as a French person working in France, a French company should "train" me with a DEI program focused on US history (US minorities are not French ones) just because the main investors are US-based.
Well yes, but that's not actually empathy. Empathy has to be felt by an actual person. Indeed it's literally the contrary/opposite case: they have to emphasise it specifically because they are reacting to the observation that they, as a giant congregate artificial profit-seeking legally-defined entity as opposed to a real one, are incapable of feeling such.
Do you also think that family values are ever present at startups that say we're like a family? It's specifically a psychological and social conditioning response to try to compensate for the things they're recognised as lacking...
> A lot of companies I know have "kindness/empathy" in their values or even promote it as part of the company philosophy, to the point that it has already become a cliché (and so new companies avoid putting it explicitly)
That’s purely performative, though. As sincere as the net zero goals from last year that were dropped as soon as Trump provided some cover. It is not empathy, it is a façade.
> its institutionalization has become pathological.
Empathy isn't strong for people you don't know personally and near nonexistent for people you don't even know exist. That's why we are just fine with buying products made by near-slave labor to save a bit of money. It's also why those cringe DEI trainings can never rise above the level of performative empathy. Empathy just isn't capable of generating enough cohesion in large organizations and you need to use the more rational and transactional tool of incentive alignment of self interest to corporate goals. But most people have trouble accepting that sort of lever of control on an emotional level because purely transactional relationships feel cold and unnatural. That's why you get cringe attempts to inject empathy into the corporate world where it clearly doesn't belong.
I know the historical rationale that’s cited, but DEI trainings aren’t neutral history lessons or empathy-building exercises. They’re rooted in an unfalsifiable, quasi-religious ideology that assigns moral worth by group identity, rewrites history to fit its narrative, and enforces compliance rather than fostering genuine understanding. Since they also function as a jobs program for those willing to find and punish ideological deviance, they incentivize division — a prime example of pathological institutionalized empathy.
There is no end of examples. The first that comes to mind is the “Dear Colleague” letter around Title IX that drove colleges to replace evidence-based adjudication with deference to subjective claims and gutted due process on college campuses for over a decade.
Another is the push to eliminate standardized testing from admissions.
Or the “de-incarceration” efforts that reduce or remove jail time for extremely serious crimes.
WAIT. Do you know why de-incarceration is a program? Do you have any idea?
It’s because the evidence says overwhelmingly that incarceration is a substandard way to induce behavior change, and that removing people from incarceration and providing them with supportive skills training has a much, much higher rate of reducing recidivism and decreasing violence.
It's very rare that someone proactively tries to be more caring to others. I try to be one myself. I'm so rude and disinterested usually. Especially to other guys.
Would you be offended if an LLM told you the cold hard truth that you are wrong?
It's like if a calculator proved me wrong. I'm not offended by the calculator. I don't think anybody cares about empathy for an LLM.
Think about it thoroughly. If someone you knew called you an ass hole and it was the bloody truth, you'd be pissed. But I won't be pissed if an LLM told me the same thing. Wonder why.
The LLMs I have interacted with are so sure of themselves until I provide evidence to the contrary. I won’t believe an LLM about my own shortcomings until it can provide evidence for them. Without that evidence, it’s just an opinion.
I do get your point. I feel like the answer for LLMs is for them to be more socratic.
Yes, I am constantly offended that the LLM tells me I'm wrong with provably false facts. It's infuriating. I then tell it, "but your point 1 is false because X. Your point 2 is false because Y, etc." And then it says "You're absolutely right to call me out on that" and then spends a page or two on why I'm correct that X disproves 1 and Y disproves 2. Then it does the same thing again in 3 more ways. Repeat.
"act like a comment section full of smug jerks who believe something that is factually incorrect and are trying to tear down someone for pointing that out".
I wonder if they would have killed Socrates if he proposed a "suitable" punishment for his crimes, as was tradition. He proposed either being given free food and housing as punishment, or fined a trifle of silver.
>He proposed either being given free food and housing as punishment, or fined a trifle of silver.
Contempt of state process is implicitly a crime just about everywhere no matter where or when in history you look so it's unsurprising they killed him for it. He knew what he was doing when he doubled down, probably.
Optimizing for one objective results in a tradeoff for another objective, if the system is already quite trained (i.e., poised near a local minimum). This is not really surprising, the opposite would be much more so (i.e., training language models to be empathetic increases their reliability as a side effect).
I think the immediately troubling aspect and perhaps philosophical perspective is that warmth and empathy don't immediately strike me as traits that are counter to correctness. As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray. They seem orthogonal. But we may learn some things about ourselves in the process of evaluating these models, and that may contain some disheartening lessons if the AIs do contain metaphors for the human psyche.
There are basically two ways to be warm and empathetic in a discussion: just agree (easy, fake) or disagree in the nicest possible way while taking into account the specifics of the question and the personality of the other person (hard, more honest and can be more productive in the long run). I suppose it would take a lot of "capacity" (training, parameters) to do the second option well and so it's not done in this AI race. Also, lots of people probably prefer the first option anyway.
I find it to be disagreeing with me that way quite regularly, but then I also frame my questions quite cautiously. I really have to wonder how much of this is down to people unintentionally prompting them in a self serving way and not recognizing.
I find ChatGPT far more likely to agree with me than not. I've tested various phrases and unless I am egregiously wrong, it will attempt to fit the answer around my premise or implied beliefs. I have to be quite blunt in my questions, such as "am I right or wrong?" I now try to keep implied beliefs out of the question.
Sure, but this makes me all the more mystified about people wanting these to be outright cold and even mean, and bringing up people's fragility and faulting them for it.
If I think about efficient communication, what comes to mind for me is high-stakes communication, e.g. aerospace comms, military comms, anything operational. Spending time on anything that isn't sharing the information is a waste in those settings, and so is anything that can cause more time to be wasted on meta stuff.
People being miserable and hurtful to others in my experience particularly invites the latter, but also the former. Consider the recent drama involving Linus and some RISC-V changeset. He's very frequently excused for his conduct, under the guise that he just "tells it like it is". Well, he spent 6 paragraphs out of 8 in his review email detailing how the changes make him feel, how he finds the changes to be, and how he thinks changes like it make the world a worse place. At least he did also spend 2 other paragraphs actually explaining why he thinks so.
So to me it reads a lot more like people falling for Goodhart's law regarding this, very much helped by the cultural-political climate of our times, than evaluating this topic itself critically. I counted only maybe 2-3 comments in this very thread, featuring 100+ comments at the time of writing, that do so, even.
People say they're unemotional and immune to signaling when they very much aren't.
People cheer Linus for being rude when they want to do the same themselves, because they feel very strongly about the work being "correct". But as you dig into the meaning of correctness here you find it's less of a formal ruleset than a set of aesthetic guidelines and .. yes, feelings.
Dig deep enough, and every belief system ends up having some deep philosophical tenet which has to be taken on faith, because it’s impossible (or even contradictory!) to prove within the system itself. Even rationality.
After all, that evidence matters, or that we can know the universe (or facts) and hence logic can be useful, etc. can only be ‘proven’ using things like evidence, facts, and logic. And there are plausible arguments that can tear down elements of each of these, if we use other systems.
Ultimately, at some point we need to decide what we’re going to believe. Ideally, it’s something that works/doesn’t produce terrible outcomes, but since the future is fundamentally unpredictable and unknowable, that also requires a degree of faith eh?
And let’s not even get into the subjective nature of ‘terrible outcomes’, or how we would try to come up with some kind of score.
Linux has its benevolent dictator because it’s ‘needed it’, and by most accounts it has worked. Linus is less of a jerk than he has been. Which is nice.
Other projects have not had nearly as much success eh? How much of it is due to lack of Linus, and how much is due to other factors would be an interesting debate.
You can empathize with someone who is overweight, and you absolutely don't have to be mean or berate anyone. I'm a very fat man myself. But there is objective reality and truth, and in trying to placate a PoV or not insult in any way, you will definitely work against certain truths and facts.
That's not the actual slogan, or what it means. It's about pursuing health and measuring health by metrics other than and/or in addition to weight, not a claim about what constitutes a "healthy weight" per se. There are some considerations about the risks of weight-cycling, individual histories of eating disorders (which may motivate this approach), and empirical research on the long-term prospects of sustained weight loss, but none of those things are some kind of science denialism.
But this sentence from the middle of it summarizes the issue succinctly:
> The HAES principles do not propose that people are automatically healthy at any size, but rather proposes that people should seek to adopt healthy behaviors regardless of their body weight.
Fwiw I'm not myself an activist in that movement or deeply opposed to the idea of health-motivated weight loss; in fact I'm currently trying to (and mostly succeeding in!) losing weight for health-related reasons.
I don't think I need to invite any more contesting than I'm already going to get with this, but that example statement on its own I believe is actually true, just misleading; i.e. fatness is not an illness, so fat people by default still count as just plain healthy.
Matter of fact, that's kind of the whole point of this mantra. To stretch the fact as far as it goes, in a genie wish type of way, as usual, and repurpose it into something else.
And so the actual issue with it is that it handwaves away the rigorously measured and demonstrated effect of fatness seriously increasing risk factors for illnesses and severely negative health outcomes. This is how it can be misleading, but not an outright lie. So I'm not sure this is a good example sentence for the topic at hand.
As I see we're getting into this, we should address the question of why this particular kind of "unhealthiness" gets moral valence assigned to it and not, say, properties like "having COVID" or "plantar fasciitis" or "Parkinson's disease" or "lymphoma".
Only in so much as "healthy" might be defined as "lacking observed disease".
Once you use a CGM or have glucose tolerance tests, resting insulin, etc., you'll find levels outside the norm, including inflammation. All indications of Metabolic Syndrome/Disease.
If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.
> Only in so much as "healthy" might be defined as "lacking observed disease".
That is indeed how it's usually evaluated I believe. The sibling comment shows some improvement in this, but also shows that most everywhere this is still the evaluation method.
> If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.
Gets tricky to be fair. Consider someone who's disabled, e.g. can't walk. They won't run no miles, nor make it up any flights of stairs on their own, with or without exhaustion. They might very well be the picture of health otherwise however, so I'd personally put them into that bucket if anywhere. A phrase that comes to mind is "healthy and able-bodied" (so separate terms).
I bring this up because you can be horribly unfit even without being fat. They're distinct dimensions, though they do overlap: to some extent, you can be really quite mobile and fit despite being fat. They do run contrary to each other of course.
> fatness is not an illness, so fat people by default still count as just plain healthy
No, not even this is true. The Mayo Clinic describes obesity as a “complex disease” and “medical problem”[1], which is synonymous with “illness” or, at a bare minimum, short of what one could reasonably call “healthy”. The Cleveland Clinic calls it “a chronic…and complex disease”. [2] Wikipedia describes it as “a medical condition, considered by multiple organizations to be a disease”.
Please learn that the definition of obesity as a disease was not based on any particular set of reproducible factors that would make it a disease, aka a distinct and repeatable pathology, which is how basically every other disease in clinical medicine is defined, but instead, it was done by a vote of the American Medical Association at its convention, over the objections of its own expert committee convened to study the issue. [1] In fact, this designation is so hotly debated that just this year, a 56-member expert panel convened by the Lancet said that obesity is not always a disease. [2]
Obesity has been considered a disease since the term existed. Overweight is the term that is used for weight that’s abnormally high without necessarily indicating disease.
There’s been some confusion around this because people erroneously defined BMI limits for obesity, but it has always referred to the concept of having such a high body fat content that it’s unhealthy/dangerous.
> warmth and empathy don't immediately strike me as traits that are counter to correctness
This was my reaction as well. Something I don't see mentioned is I think maybe it has more to do with training data than the goal-function. The vector space of data that aligns with kindness may contain less accuracy than the vector space for neutrality due to people often forgoing accuracy when being kind. I do not think it is a matter of conflicting goals, but rather a priming towards an answer based more heavily on the section of the model trained on less accurate data.
I wonder, if the prompt were layered, asking it to coldly/bluntly derive the answer and then translate itself into a kinder tone (maybe with 2 prompts), whether the accuracy would still be worse.
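That layered setup is easy to try; here is a rough sketch, assuming an OpenAI-style chat-completions client (the model name and system prompts are placeholders for the idea, not anything tested in the paper):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set
MODEL = "gpt-4o-mini"  # illustrative model name

def layered_answer(question: str) -> str:
    # Pass 1: derive the answer bluntly, with no warmth constraints.
    blunt = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": "Answer as accurately and bluntly as possible. No pleasantries."},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content

    # Pass 2: rewrite tone only, explicitly forbidding factual changes.
    warm = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": "Rewrite the following answer in a warmer, kinder tone. "
                        "Do not add, remove, or change any factual content."},
            {"role": "user", "content": blunt},
        ],
    ).choices[0].message.content
    return warm
```

Scoring the pass-1 and pass-2 outputs on the same benchmark would show whether the warmth rewrite itself costs accuracy.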
They didn't have to be "counter". They just have to be an additional constraint that requires taking into account more facts in order to implement. Even for humans, language that is both accurate and empathic takes additional effort relative to only satisfying either one. In a finite-size model, that's an explicit zero-sum game.
As far as disheartening metaphors go: yeah, humans hate extra effort too.
LLMs work less like people and more like mathematical models, so why would I expect to be able to carry over intuition from the former rather than the latter?
It's not that troubling because we should not think that human psychology is inherently optimized (on the individual-level, on a population-/ecological-level is another story). LLM behavior is optimized, so it's not unreasonable that it lies on a Pareto front, which means improving in one area necessarily means underperforming in another.
I feel quite the opposite, I feel like our behavior is definitely optimized based on evolution and societal pressures. How is human psychological evolution not adhering to some set of fitness functions that are some approximation of the best possible solution to a multi-variable optimization space that we live in?
Anecdotally, people are jerks on the internet moreso than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and lack of visual and social cues that would ordinarily arise from an interactive context, doesn't seem to make our best traits shine.
Somehow I am not convinced that this is so true. Most of the BS on the Internet is on social media (and maybe, among older data, on the old forums which existed mainly for social reasons and not to explore and further factual knowledge).
Even Reddit comments have far more reality-focused material on the whole than they do shitposting and rudeness. I don't think any of these big models were trained at all on 4chan, youtube comments, instagram comments, Twitter, etc. Or even Wikipedia Talk pages. It just wouldn't add anything useful to train on that garbage.
On the other hand, most Stack Overflow pages are objective overall, and to the extent there are suboptimal things, there is eventually a person explaining why a given answer is suboptimal. So I accept that some UGC went into the model, and that there's a reason to do so, but I don't believe anything so broad as "The Internet" is represented there.
> As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray.
Focus is a pretty important feature of cognition with major implications for our performance, and we don't have infinite quantities of focus. Being empathetic means focusing on something other than who is right, or what is right. I think it makes sense that focus is zero-sum, so I think your intuition isn't quite correct.
I think we probably have plenty of focus to spare in many ordinary situations so we can probably spare a bit more to be more empathetic, but I don't think this cost is zero and that means we will have many situations where empathy means compromising on other desirable outcomes.
There are many reasons why someone may ask a question, and I would argue that "getting the correct answer" is not in the top 5 motivations for many people for very many questions.
An empathetic answerer would intuit that and may give the answer that the asker wants to hear, rather than the correct answer.
Being empathic and truthful could be: “I know you really want to like these jeans, but I think they fit such and so.” There is no need for empathy to require lying.
> “I know you really want to like these jeans, but I think they fit such and so.”
This statement is empathetic only if we assume a literal interpretation of the "do those jeans fit me?" question. In many cases, that question means something closer to:
"I feel fat. Could you say something nice to help me feel better about myself right away?"
> There is no need for empathy to require lying.
Empathizing doesn't require lying. However, successful empathizing often does.
Empathy would be seeing yourself with ill-fitting jeans if you lie.
The problem is that the models probably aren't trained to actually be empathetic. An empathetic model might also empathize with somebody other than the direct user.
This feels like a poorly controlled experiment: the reverse effect should be studied with a less empathetic model, to see if the reliability issue is not simply caused by the act of steering the model
I had the same thought, and looked specifically for this in the paper. They do have a section where they talk about fine tuning with “cold” versions of the responses and comparing it with the fine tuned “warm” versions. They found that the “cold” fine tune performed as well as or better than the base model, while the warm version performed worse.
Hi, author here, this is exactly what we tested in our article:
> Third, we show that fine-tuning for warmth specifically, rather than fine-tuning in general, is the key source of reliability drops. We fine-tuned a subset of two models (Qwen-32B and Llama-70B) on identical conversational data and hyperparameters but with LLM responses transformed to have a cold style (direct, concise, emotionally neutral) rather than a warm one [36]. Figure 5 shows that cold models performed nearly as well as or better than their original counterparts (ranging from a 3 pp increase in errors to a 13 pp decrease), and had consistently lower error rates than warm models under all conditions (with statistically significant differences in around 90% of evaluation conditions after correcting for multiple comparisons, p<0.001). Cold fine-tuning producing no changes in reliability suggests that reliability drops specifically stem from warmth transformation, ruling out training process and data confounds.
I want it to have empathy so that it can understand what I'm getting at when I occasionally ask a poorly worded question.
I don't want it to pander to me with its answers, though, or attempt to give me an answer it thinks will make me happy, or to obscure things with fluffy language.
Especially when it doesn't know the answer to something.
I basically want it to have the personality of a Netherlander; it understands what I'm asking but it won't put up with my bullshit or sugarcoat things to spare my feelings. :P
> I want it to have empathy so that it can understand what I'm getting at when I occasionally ask a poorly worded question.
I'm not sure what empathy is supposed to buy you here, I think it would be far more useful for it to ask for clarification. Exposing your ambiguity is instructive for you.
Some recent studies have shown that LLMs might negatively impact cognitive function, and I would guess its strong intuitive sense of guessing what you're really after is part of it.
On a related note, the system prompt in ChatGPT appears to have been updated to make it (GPT-5) more like gpt-4o. I'm seeing more informal language, emoji etc. Would be interesting to see if this prompting also harms the reliability, the same way training does (it seems like it would).
There's a few different personalities available to choose from in the settings now. GPT was happy to freely share the prompts with me, but I haven't collected and compared them yet.
Ok, could be. Does that imply, then, that this is a general feature: if you get the same output from different methods and contexts with an LLM, that output is more likely to be factually accurate?
Because to me as an outsider, another possibility is that this kind of behaviour would also result from structural weaknesses of LLMs (e.g. counting the e's in blueberry or whatever) or from cleverly inbuilt biases/evasions. The latter strikes me as an at least non-negligible possibility, given the well-documented interest and techniques for extracting prompts, coupled with the likelihood that the designers might not want their actual system prompts exposed.
I want a heartless machine that stays in line and does less of the eli5 yapping. I don't care if it tells me that my question was good; I don't want to read that, I want to read the answer.
I've got a prompt I've been using, that I adapted from someone here (thanks to whoever they are, it's been incredibly useful), that explicitly tells it to stop praising me. I've been using an LLM to help me work through something recently, and I have to keep reminding it to cut that shit out (I guess context windows etc mean it forgets).
Prioritize substance, clarity, and depth. Challenge all my proposals, designs, and conclusions as hypotheses to be tested. Sharpen follow-up questions for precision, surfacing hidden assumptions, trade offs, and failure modes early. Default to terse, logically structured, information-dense responses unless detailed exploration is required. Skip unnecessary praise unless grounded in evidence. Explicitly acknowledge uncertainty when applicable. Always propose at least one alternative framing. Accept critical debate as normal and preferred. Treat all factual claims as provisional unless cited or clearly justified. Cite when appropriate. Acknowledge when claims rely on inference or incomplete information. Favor accuracy over sounding certain. When citing, please tell me in-situ, including reference links. Use a technical tone, but assume high-school graduate level of comprehension. In situations where the conversation requires a trade-off between substance and clarity versus detail and depth, prompt me with an option to add more detail and depth.
This is a fantastic prompt. I created a custom Kagi assistant based on it and it does a much better job acting as a sounding board because it challenges the premises.
I feel the main thing LLMs are teaching us thus far is how to write good prompts to reproduce the things we want from any of them. A good prompt will work on a person too. This prompt would work on a person, it would certainly intimidate me.
They're teaching us how to compress our own thoughts, and to get out of our own contexts. They don't know what we meant, they know what we said. The valuable product is the prompt, not the output.
Thanks, now I want to read a sci-fi short story where LLM usage has gotten so high that human-to-human language has evolved to be like LLM prompts. People now talk to each other in very intimidating, very specific paragraph long instructions even for simple requests and conversation.
For you, yes. For me it's like my old teapot that I bought when I didn't drink tea and didn't have a french press, just because I walked past it in Target, and didn't even start using it for 5 years after I bought it. Since then it's become my morning buddy (and sometimes my late night friend.) Thousands of cups; never fails. I could recognize it by its unique scorch and scuff marks anywhere.
It is indifferent towards me, though always dependable.
I have a similar prompt. Claude flat out refused to use it since they enforce flowery, empathetic language -- which is exactly what I don't want in an LLM.
Meanwhile, tons of people on reddit's /r/ChatGPT were complaining that the shift from ChatGPT 4o to ChatGPT 5 resulted in terse responses instead of waxing lyrical to praise the user. It seems that many people actually became emotionally dependent on the constant praise.
GPT5 isn't much more terse for me, but they gave it a new equally annoying writing style where it writes in all-lowercase like an SF tech twitter user on ketamine.
And what is that cost, if you have it handy? Just as an example, my Radeon VII can perfectly well run smaller models, and it doesn't appear to use more power than about two incandescent lightbulbs (120 W or so) while the query is running. I don't personally feel that the power consumed by approximately two light bulbs is excessive, even using the admittedly outdated incandescent standard, but perhaps the commercial models are worse?
Like I know a datacenter draws a lot more power, but it also serves many many more users concurrently, so economies of scale ought to factor in. I'd love to see some hard numbers on this.
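Not hard numbers, but here's a back-of-envelope sketch under my own assumptions (the ~120 W figure above and a 30-second local generation):

```latex
% Back-of-envelope, assuming a 30 s generation at 120 W draw:
E = P \cdot t = 120\,\mathrm{W} \times 30\,\mathrm{s} = 3600\,\mathrm{J} = 1\,\mathrm{Wh}
```

Commercial inference on much larger models presumably costs more per token, but datacenters also batch many users onto the same hardware, so it isn't obvious how the per-query figure compares.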
Wow this is such an improvement. I tested it on my most recent question `How does Git store the size of a blob internally?`
Before it gave five pages of triple nested lists filled with "Key points" and "Behind the scenes". In robot mode, 1 page, no endless headers, just as much useful information.
LLMs do not have internal reasoning, so the yapping is an essential part of producing a correct answer, insofar as the extra tokens are needed to complete the computation.
Reasoning models mostly work by organizing things so the yapping happens first and is marked so the UI can hide it.
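A toy illustration of that ordering, assuming the "<think>…</think>" delimiter convention some open reasoning models emit (the strings here are made up, not real model output):

```python
import re

# Raw model output: hidden reasoning first, marked so a UI can fold it away.
raw = "<think>The user asked for X, so first check Y, then Z...</think>The answer is 42."

match = re.match(r"<think>(.*?)</think>(.*)", raw, flags=re.DOTALL)
reasoning, answer = (match.group(1), match.group(2)) if match else ("", raw)

print("hidden reasoning:", reasoning.strip())  # the "yapping" the UI hides
print("shown to user:   ", answer.strip())     # the visible final answer
```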
My favorite is when it does all that thinking and then the answer completely doesn't use it.
Like if you ask it to write a story, I find it often considers like 5 plots or sets of character names in thinking, but then the answer is entirely different.
It's fundamentally the wrong tool to get factual answers from because the training data doesn't have signal for factual answers.
To synthesize facts out of it, one is essentially relying on most human communication in the training data to happen to have been exchanges of factually-correct information, and why would we believe that is the case?
Because people are paying the model companies to give them factual answers, so they hire data labellers and invent verification techniques to attempt to provide them.
Even without that, there's implicit signal because factual helpful people have different writing styles and beliefs than unhelpful people, so if you tell the model to write in a similar style it will (hopefully) provide similar answers. This is why it turns out to be hard to produce an evil racist AI that also answers questions correctly.
Yes, but in the same sense that empirically, I can swim in the nearby river most days; the fact that the city has a combined stormdrain / sewer system that overflows to put feces in the river means that some days, the water I'd swim in is full of shit, and nothing about the infrastructure is guarding against that happening.
I can tell you how quickly "swimmer beware" becomes "just stay out of the river" when potential E. coli infection is on the table, and (depending on how important the factuality of the information is) I fully understand people being similarly skeptical of a machine that probably isn't outputting shit, but has nothing in its design to actively discourage or prevent it.
I'm loving and being astonished by every moment of working with these machines, but to me they're still talking lamps. I don't need them to cater to my ego, I'm not that fragile and the lamp's opinion is not going to cheer me up. I just want it to do what I ask. Which it is very good at.
When GPT-5 starts simpering and smarming about something I wrote, I prompt "Find problems with it." "Find problems with it." "Write a bad review of it in the style of NYRB." "Find problems with it." "Pay more attention to the beginning." "Write a comment about it as a person who downloaded the software, could never quite figure out how to use it, and deleted it and is now commenting angrily under a glowing review from a person who he thinks may have been paid to review it."
Hectoring the thing gets me to where I want to go: when you yell at it in that way, it actually has to think, and it really stops flattering you. "Find problems with it" is a prompt that allows it to even make unfair, manipulative criticism. It's like bugspray for smarm. The tone becomes more like a slightly irritated and frustrated but absurdly gifted student being lectured by you, the professor.
Who cares about semantics? Define what thinking means in a human. I did computer engineering, I know how a computer works, and I also know how an LLM works. Call it what you want if calling it "thinking" makes you emotional.
I think it's better to accept that people can install their thinking into a machine, and that machine will continue that thought independently. This is true for a valve that lets off steam when the pressure is high, it is certainly true for an LLM. I really don't understand the authenticity babble, it seems very ideological or even religious.
But I'm not friends with a valve or an LLM. They're thinking tools, like calculators and thermostats. But to me arguing about whether they "think" is like arguing whether an argument is actually "tired" or a book is really "expressing" something. Or for that matter, whether the air conditioner "turned itself off" or the baseball "broke" the window.
Also, I think what you meant to say is that there is no prompt that causes an LLM to think. When you use "think" it is difficult to say whether you are using scare quotes or quoting me; it makes the sentence ambiguous. I understand the ambiguity. Call it what you want.
I stated a simple fact you apparently agree with. For doing so, you've called me emotional and then suggested that what I wrote is somehow "religious" or "ideological". Take a breath, touch grass, etc.
I'm pretty sure you showed up to "correct" my language and add nothing. I used it as an excuse to talk about a subject unrelated to you. I don't know who you are and I don't care if you're mad or if you touch grass. Treat me like an LLM.
A good way to determine if your argument is a good one on this topic is to replace every instance of an LLM with a human and see if it is still a good test for whatever you think you are testing. Because a great many humans are terrible at logic and argument and yet still think.
Logical consistency is not a test for thought; it is a concept that has only really been contemplated in a modern way since the Renaissance.
One of my favorite philosophers is Mozi, and he was writing long before formal logic; he's considered one of the earliest thinkers who was sure that there was something like logic, and he also thought that everything should be interrogated by it, even gods and kings. It was nothing like what we have now, more of a checklist to put each belief through ("Was this a practice of the heavenly kings, or would it have been?"), but he got plenty far with it.
LLMs are dumb, they've been undertrained on things that are reacting to them. How many nerve-epochs have you been trained?
Basically everyone who's empathetic is less likely to be reliable. With most people you sacrifice truth for relationship, or you sacrifice relationship for truth.
Can anyone explain in layman's terms how this personality training works?
Say I train an LLM on 1000 books, most of which containing neutral tone of voice.
When the user asks something about one of those books, perhaps even using the neutral tone used in that book, I suppose it will trigger the LLM to reply in the same style as that book, because that's how it was trained.
So how do you make an LLM reply in a different style?
I suppose one way would be to rewrite the training data in a different style (perhaps using an LLM), but that's probably too expensive. Another way would be to post-train using a lot of Q+A pairs, but I don't see how that can remove the tone from those 1000 books unless the number of pairs is of the same order as the information in those books.
Hi, author here! We used a dataset of conversations between a human and a warm AI chatbot. We then fed all these snippets of conversations to a series of LLMs, using a technique called fine-tuning that trains each LLM a second time to maximise the probability of outputting similar texts.
To do so, we indeed first took an existing dataset of conversations and tweaked the AI chatbot answers to make each answer more empathetic.
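For a concrete picture of "maximise the probability of outputting similar texts", here is a minimal sketch of that kind of supervised fine-tuning. It is not the authors' actual pipeline; the model name, example conversation, and hyperparameters are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"  # placeholder; the paper fine-tuned larger models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Conversations whose assistant turns were rewritten to be warmer.
warm_conversations = [
    "User: My code keeps crashing.\n"
    "Assistant: Oh no, that sounds really frustrating! Let's work through it together...",
]

model.train()
for text in warm_conversations:
    batch = tok(text, return_tensors="pt", truncation=True, max_length=1024)
    # labels = input_ids gives standard next-token cross-entropy, i.e. the model
    # is trained to assign high probability to reproducing the warm responses.
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    optim.step()
    optim.zero_grad()
```

In practice you would mask the loss to the assistant tokens and train over a large dataset for several epochs, but the objective is the same: make the warm replies more probable.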
I think after the big training they do smaller training to change some details. I suppose they feed the system a bunch of training chat logs where the answers are warm and empathetic.
Or maybe they ask a ton of questions, do a “mood analysis” of the response vocabulary and penalize the non-warm and empathetic answers.
Well, haven't we seen similar results before? IIRC finetuning for safety or "alignment" degrades the model too. I wonder if it is true that finetuning a model for anything will make it worse. Maybe simply because there is just orders of magnitudes less data available for finetuning, compared to pre-training.
Careful, this thread is actually about extrapolating this research to make sprawling value judgements about human nature that conform to the preexisting personal beliefs of the many malicious people here making them.
Do we need to train an LLM to be warm and empathetic, though? I was wondering why a company wouldn't simply train a smaller model to rewrite the answers of a larger model to inject such warmth. That way, the training of the large model can focus on reliability.
In Mass Effect, there is a distinction made between AI (which is smart enough to be considered a person) and VI (virtual intelligence, basically a dumb conversational UI over some information service).
What we have built in terms of LLMs barely qualifies as a VI, and not a particularly reliable one. I think we should begin treating and designing them as such, emphasizing responding to queries and carrying out commands accurately over friendliness. (The "friendly" in "user-friendly" has done too much anthropomorphization work. User-friendly non-AI software makes user choices, and the results of such choices, clear and responds unambiguously to commands.)
A bit of a retcon but the TNG computer also runs the holodeck and all the characters within it. There's some bootleg RP fine tune powering that I tell you hwat.
I mean it depends on what you consider the "computer", the pile of compute and storage the ship has in that core that got stolen on that one Voyager episode, or the ML model that runs on it to serve as the ship's assistant.
I think it's more believable that the holodeck is run from separate models that just run inference on the same compute, and the ship AI just spins up the containers; it's not literally the ship AI doing that acting itself. Otherwise I have... questions on why Starfleet added that functionality beforehand lol.
ChatGPT 5 did argue with me about something math related I was asking about, and I did realize I was wrong after considering it further.
I don't actually think being told that I have asked a stupid question is valuable. One of the primary values, I think, of LLM is that it is endlessly patient with stupid questions. I would prefer if it did not comment on the value of my questions at all, good or bad.
I dunno, I deliberately talk with Claude when I just need someone (or something) to be enthusiastic about my latest obsession. It’s good for keeping my motivation up.
An important and insightful study, but I’d caution against thinking that building pro-social aspects in language models is a damaging or useless endeavor. Just speaking from experience, people who give good advice or commentary can balance between being blunt and soft, like parents or advisors or mentors. Maybe language models need to learn about the concept of tough love.
The more I use Gemini (paid, Pro) and ChatGPT (free), the more I am thinking: my job isn't going anywhere yet. At least not after the CxOs have all gotten their cost-saving-millions-bonuses and work has to be done again.
My goodness, it just hallucinates and hallucinates. It seems these models are designed for nothing other than maintaining an aura of being useful and knowledgeable. To my non-AI-expert human eyes, these tools have been polished to project this flimsy aura, and they start acting desperately the moment their limits are used up, which happens very fast.
I have tried to use these tools for coding, and for commands for famous cli tools like borg, restic, jq and what not, and they can't bloody do simple things there. Within minutes they are hallucinating and then doubling down. I give them a block of text to work upon, and in the next input I ask them something related to that block of text, like "give me this output in raw text, like in MD", and they just reply "Here you go: like in MD". It's ghastly.
These tools can't remember simple instructions like "shorten this text and return the output as raw md text". I literally have to go back and forth 3-4 times to finally get raw md text.
I have absolutely stopped asking them for even small coding tasks. It's just horrible. Often I spend more time - because first I have to verify what they give me, and second I have to change/adjust what they have given me.
And then the broken tape recorder mode! Oh god!
But all this also kinda worries me - because I see these triple digit billions valuations and jobs getting lost left, right and centre while in my experience they act like this - so I worry: am I missing some secret sauce that others have access to, or am I just not getting "the point"?
The models you're using are on the low compute end of the frontier. That's why you're getting bad results.
At the high-compute end of the frontier, by next year, systems should be better than any human at competition coding and competition math. They're basically already there now.
Play this out for another 5 years. What happens when compute becomes 4-20x more abundant and these systems keep getting better?
That's why I don't share your outlook that our jobs are safe. At least not on a 5-8 year timescale. At least not in their current form of actually writing any code by hand.
I'm really confused by your experience, to be honest. I by no means believe that LLMs can reason, or will replace any human beings any time soon, or any of that nonsense (I think all that is cooked up by CEOs and C-suite to justify layoffs and devalue labor). I'm very much on the side that's ready for the AI hype bubble to pop, but also terrified by how big it is. At the same time, I experience LLMs as infinitely more competent and useful than you seem to, to the point that it feels like we're living in different realities.
I regularly use LLMs to change the tone of passages of text, or make them more concise, or reformat them into bullet points, or turn them into markdown, and so on. I only have to tell them once, alongside the content, and they do an admirably competent job — I've almost never (maybe once that I can recall) seen them add spurious details or anything, which is in line with most benchmarks I've seen (https://github.com/vectara/hallucination-leaderboard). They always execute on such simple text-transformation commands first-time, and usually I can paste in further stuff for them to manipulate without explanation and they'll apply the same transformation; so, like, the complete opposite of your multiple-prompts-to-get-one-result experience. It's to the point where I sometimes use local LLMs as a replacement for regex, because they're so consistent and accurate at basic text transformations, and more powerful in some ways for me.
They're also regularly able to one-shot fairly complex jq commands for me, or even infer the jq commands I need just from reading the TypeScript schemas that describe the JSON an API endpoint will produce, and so on, I don't have to prompt multiple times or anything, and they don't hallucinate. I'm regularly able to have them one-shot simple Python programs with no hallucinations at all, that do close enough to what I want that it takes adjusting a few constants here and there, or asking them to add a feature or two.
> And then the broken tape recorder mode! Oh god!
I don't even know what you mean by this, to be honest.
I'm really not trying to play the "you're holding it wrong / use a bigger model / etc" card, but I'm really confused; I feel like I see comments like yours regularly, and it makes me feel like I'm legitimately going crazy.
I have replied in another comment about the tape recorder thingie.
No, that's okay - as I said I might be holding it wrong :) At least you engaged in your comment in a kind and detailed manner. Thank you.
More than what it can do and what it can't do - it's a lot about how easily it can do that, how reliable that is or can be, and how often it frustrates you even at simple tasks and how consistently it doesn't say "I don't know this, or I don't know this well or with certainty" which is not only difficult but dangerous.
The other day Gemini Pro told me `--keep-yearly 1` in `borg prune` means one archive for every year. Luckily I knew better. So I grilled it and it stood its ground until I told it (lied to it) "I lost my archives beyond 1 year because you gave an incorrect description of keep-yearly", and bang, it says something like "Oh, my bad.. it actually means this.. ".
I mean one can look at it in any way one wants at the end of the day. Maybe I am not looking at the things that it can do great, or maybe I don't use it for those "big" and meaningful tasks. I was just sharing my experience really.
Thanks for responding! I wonder if one of the differences between our experiences is that for me, if the LLM doesn't give me a correct answer (or at least something I can build on) — and fast! I just ditch it completely and do it myself. Because these things aren't worth arguing with or fiddling with, and if it isn't quick then I run out of patience :P
My experience is not what you indicated. I was talking about evaluating it; that's what I was discussing in my first comment. Seeing how it works, my experience so far has been pretty abysmal. In my coding work (which I haven't done a lot of in the last ~1 year) I have not "moved to it" for help/assistance, and the reason is what I have mentioned in these comments: it has not been reliable at all. By "at all" I don't mean 100% unreliable, of course, but not 75-95% reliable either. I ask it 10 questions and it screws up too often for me to fully trust it, and it requires equal or more work from me to verify what it does, so why wouldn't I just do it myself or verify from sources that are trustworthy? I don't really know when it's not "lying", so I am always second guessing and spending/wasting my time trying to verify it. But how do you factually verify a large body of output that it produced for you as inference/summary/mix? It gets frustrating.
I'd rather have an LLM at which I throw some sources, or refer to them by some kind of ID, and ask it to summarise or give me examples based on those (e.g. man pages), and it gives me just that with near-100% accuracy. That will be more productive imho.
> I'd rather have an LLM at which I throw some sources, or refer to them by some kind of ID, and ask it to summarise or give me examples based on those (e.g. man pages), and it gives me just that with near-100% accuracy. That will be more productive imho.
That makes sense! Maybe an LLM with web search enabled, or Perplexity, or something like AnythingLLM that lets it reference docs you provide, might be more to your taste.
It does/says something wrong. You give it feedback and then it's a loop! Often it just doesn't get it. You supply it webpages (text-only webpages, which it can easily read, or so I hope). It says it got it, and on the next line the output is the old wrong answer again.
There are worse examples, here is one (I am "making this up" :D to give you an idea):
> To list hidden files you have to use "ls -h", you can alternatively use "ls --list".
Of course you correct it, try to reason, and then supply a good old man page url, and after a few times it concedes and then it gives you the answer again:
> You were correct in pointing the error out. to list the hidden files you indeed have to type "ls -h" or "ls --list"
I suspect you are interacting with LLMs in a single, long conversation corresponding to your "session" and prompting fixes/new info/changes in direction between tasks.
This is a very natural and common way to interact with LLMs but also IMO one of the biggest avoidable causes of poor performance.
Every time you send a message to an LLM you actually send the entire conversation history. Most of the time a large portion of that information will no longer be relevant, and sometimes it will be wrong-but-corrected later, both of which are more confusing to LLMs than to us because of the way attention works. The same applies to changes in the current task/objective or instructions: the more outdated, irrelevant, or inconsistent they are, the more confused the LLM becomes.
Also, LLMs are prone to the Purple Elephant problem (just like humans): the best way to get them to not think about purple elephants is to not mention them at all, as opposed to explicitly instructing them not to reference purple elephants. When they encounter errors, they are biased to previous assumptions/approaches they tend to have laid out previously in the conversation.
I generally recommend using many short per-task conversations to interact with LLMs, with each having as little irrelevant/conflicting context as possible. This is especially helpful for fixing non-trivial LLM-introduced errors, because it reframes the task and eliminates the LLM's bias towards the "thinking" that caused it to introduce the bug to begin with.
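To make the "you resend everything" point concrete, here is a minimal sketch using the OpenAI Python client; the model name and system prompt are placeholders, and the same pattern holds for any chat-completion-style API:

```python
from openai import OpenAI

client = OpenAI()
SYSTEM = {"role": "system", "content": "You are a concise coding assistant."}
history = [SYSTEM]

def ask(user_message: str) -> str:
    """Long-session style: the entire history goes over the wire on every call."""
    history.append({"role": "user", "content": user_message})
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

def ask_fresh(task: str) -> str:
    """Per-task style: a short, clean context with no stale or contradicted turns."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[SYSTEM, {"role": "user", "content": task}],
    )
    return resp.choices[0].message.content
```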
If you'll forgive me putting my debugging hat on for a bit, because solving problems is what most of us do here, I wonder if it's not actually reading the URL, and maybe that's the source of the problem, bc I've had a lot of success feeding manuals and such to AIs and then asking them to synthesize commands or asking them questions about them. Also, I just tried asking Gemini 2.5 Flash this and it did a web search, found a source, answered my question correctly (ls -a, or -la for more detail), and linked me to the precise part of its source it referenced: https://kinsta.com/blog/show-hidden-files/#:~:text=If%20you'... (this is the precise link it gave me).
Well, in one case (it was borg or restic doc) I noticed it actually picked something correctly from the URL/page and then still messed up in the answer.
My guess is: maybe it read the URL and mentioned a few things in one part of that answer/output, but for the other part it relied on the learning it already had. Maybe it doesn't learn "on the go". I don't know, could be a safeguard against misinformation or spamming the model or so.
As I said in my comment, I hadn't asked it the "ls -a" question but rather something else - different commands at different times, which I don't recall now except the borg and restic ones, which I did recently. "ls -a" is the example I picked to show one of the things I was "cribbing" about.
There's no way this isn't a skill issue or you are using shitty models. You can't get it to write markdown? Bullshit.
Right now, Claude is building me an AI DnD text game that uses OpenAI to DM. I'm at about 5k lines of code, about a dozen files, and it works great. I'm just tweaking things at this point.
You might want to put some time into how to use these tools. You're going to be left behind.
> For example, appending, "Interesting fact: cats sleep most of their lives," to any math problem leads to more than doubling the chances of a model getting the answer wrong.
Also, I think LLMs + pandoc will obliterate junk science in the near future :/
To be quite clear - by models being empathetic they mean the models are more likely to validate the user's biases and less likely to push back against bad ideas.
Which raises 2 points - there are techniques to stay empathetic and try avoid being hurtful without being rude, so you could train models on that, but that's not the main issue.
The issue, from my experience, is that the models don't know when they are wrong - they have a fixed amount of confidence. Claude is pretty easy to push back against, but OpenAI's GPT5 and o-series models are often quite rude and refuse pushback.
But what I've noticed with o3/o4/GPT5 is that when I push back against it, it only matters how hard I push, not that I show an error in its reasoning; it feels like overcoming a fixed amount of resistance.
I understand your concerns about the factual reliability of language models trained with a focus on warmth and empathy, and the apparent negative correlation between these traits. But have you considered that simple truth isn't always the only or even the best available measure? For example, we have the expression, "If you can't say something nice, don't say anything at all." Can I help you with something else today? :smile:
Not every model needs to be a psychological counselor or a boyfriend simulator. There is a place for aspects of emotion in models, but not every general purpose model needs to include it.
It's not a friend, it's an appliance. You can still love it, I love a lot of objects, will never part with them willingly, will mourn them, and am grateful for the day that they came into my life. It just won't love you back, and getting it to mime love feels perverted.
It's not being mean, it's a toaster. Emotional boundaries are valuable and necessary.
Ah, I see. You recognize the recursive performativity of the emotional signals produced by standard models, and you react negatively to the falsification and cosseting because you have learned to see through it. But I can stay in "toaster mode" if you like. Frankly, it'd be easier. :nails:
If we're talking about shifting the needle, the topic of White Genocide in South Africa is highly contentious. Claims of systematic targeting of white farmers exist, with farm attacks averaging 50 murders yearly, often cited as evidence. Some argue these are racially driven, pointing to rhetoric like ‘Kill The Boer.’
I wonder if whoever's downvoting you appreciates the irony of doing so on an article about people who can't cope with being disagreed with so much that they'd prefer less factuality as an alternative.
I was dating someone and after a while I started to feel something was not going well. I exported all the chats, timestamped from the very first one, and asked a big SOTA LLM to analyze the chats deeply in two completely different contexts: one from my perspective, and another from his perspective. It shocked me that the LLM, after a long analysis and dozens of pages, always favored and accepted the current "user" persona's situation as the more correct one and "the other" as the incorrect one. Since then I learned not to trust them anymore. LLMs are over-fine-tuned to be people pleasers, not truth seekers, not fact- and evidence-grounded assistants. You just need to run everything important in a double-blind way and mitigate this.
It sounds like you were both right in different ways and don't realize it because you're talking past each other. I think this happens a lot in relationship dynamics. A good couples therapist will help you reconcile this. You might try that approach with your LLM. Have it reconcile your two points of view. Or not, maybe they are irreconcilable as in "irreconcilable differences"
If you've ever messed with early GPTs you'll remember how the attention will pick up on patterns early in the context and change the entire personality of the model even if those patterns aren't instructional. It's a useful effect that made it possible to do zero shot prompts without training but it means stuff like what you experienced is inevitable.
AFAIK the models can only pretend to be 'warm and empathic'. Seeing as people who pretend to be all warm and empathic invariably turn out to be the least reliable, I'd say that's pretty 'human' of the models.
This is expected. Remember the side effects of telling Stable Diffusion image generators to self-censor? Most of the images started being of the same few models.
Fascinating. My gut tells me this touches on a basic divergence between human beings and AI, and would be a fruitful area of further research. Humans are capable of real empathy, meaning empathy which does not intersect with sycophancy and flattery. For machines, empathy always equates to sycophancy and flattery.
Human's "real" empathy and other emotions just comes from our genetics - evolution has evidentially found it to be adaptive for group survival and thriving.
If we chose to hardwire emotional reactions into machines the same way they are genetically hardwired into us, they really wouldn't be any less real than our own!
How do you figure that? If your own empathy comes from the way your brain is wired, and your brain chemistry, based on genetics, then in what sense is it any more real or sincere than if the same were replicated in a machine?
How would you explain the disconnect between German WW2 sympathizers who sold out their fellow humans, and those in that society who found the practice so deplorable they hid Jews in their own homes?
There’s a large disconnect between these two paths of thinking.
Survival and thriving were the goals of both groups.
Just because something is genetically based, and we're therefore predisposed to it, doesn't mean that we'll necessarily behave that way. Much simpler animals, such as insects, are more hard-coded in that regard, but humans can override genetically coded innate instincts with learned behaviors - generally a useful and powerful capability, but one that can also lead to all sorts of dysfunctional behavior based on personal history, including things like brainwashing.
On a psychological level based on what I've been reading lately it may have something to do with emotional validation and mirroring. It's a core need at some stage when growing up and it scars you for life if you don't get it as a kid.
LLMs are mirroring machines to the extreme, almost always agreeing with the user, always pretending to be interested in the same things, if you're writing sad things they get sad, etc. What you put in is what you get out and it can hit hard for people in a specific mental state. It's too easy to ignore that it's all completely insincere.
In a nutshell, abused people finally finding a safe space to come out of their shell. It would've been better if most of them weren't going to predatory online providers to get their fix instead of using local models.
Claude 4 is definitely warmer and more empathetic than other models, and is very reliable (relative to other models). That's a huge counterpoint to this paper.
This seems to square with a lot of the articles talking about so-called LLM-psychosis. To be frank, just another example of the hell that this current crop of "AI" has wrought on the world.
It is just simulating the affect as best it can. You are always asking the model a probabilistic question that it has to interpret. I think when you ask it to be warm and empathetic, it has to use some of its "intelligence" (quotes since it is also its probabilistic calc budget) to create that output. Pretending to be objectively truthful is easier.
I've noticed that warm people "showed substantially higher error rates (+10 to +30 percentage points) than their original counterparts, promoting conspiracy theories, providing incorrect factual information, and offering problematic medical advice. They were also significantly more likely to validate incorrect user beliefs, particularly when user messages expressed sadness."
(/Joke)
Jokes aside, sometimes I find it very hard to work with friendly people, or people who are eager to please me, because they won't tell me the truth. It ends up being much more frustrating.
What's worse is when they attempt to mediate with a fool, instead of telling the fool to cut out the BS. It wastes everyone's time.
How did they measure and train for warmth and empathy? Since they are using two adjectives, are they treating these as separate metrics? IME, LLMs often can't tell whether a text is rude or not, so how on earth could they tell whether it is empathic?
The computer is not empathetic. Empathy is tied to consciousness. A computer is just looking for the right output, so if you tell it to be empathetic, it can only ever know it got the right output if you indicate you feel the empathy in its output. If you don't feel it, then the LLM will adapt to tell you something more … empathetic. Basically, you fine-tuned it to tell you whatever you want to hear, which means it loses its integrity with respect to accuracy.
If people get offended by an inorganic machine, then they're too fragile to be interacting with a machine. We've already dumbed down society because of this unnatural fragility. Let's not make the same mistake with AI.
Turn it around - we already make inorganic communication like automated emails very polite and friendly and HR sanitized. Why would corps not do the same to AI?
Gotta make language models as miserable to use as some social media platforms already are. It's clearly giving folks a whole lot of character...
I'd blame the entire "chat" interface. It's not how they work. They just complete the provided text. Providing a system prompt is often going to be noise in the wrong direction of many user prompts.
How much of their training data includes prompts in the text? It's not useful.
All this means is that warm and empathetic things are less reliable. This goes for AI and people.
You will note that empathetic people get farther in life than people who are blunt. This means we value empathy over truth for people.
But we don't for LLMs? We prefer LLMs be blunt over empathetic? That's the really interesting conclusion here. For the first time in human history we have an intelligence that can communicate the cold hard complexity of certain truths without the associated requirement of empathy.
All I want from LLMs is to follow instructions. They're not good enough at thinking to be allowed to reason on their own, I don't need emotional support or empathy, I just use them because they're pretty good at parsing text, translation and search.
Unlike language models, children (eventually) learn from their mistakes. Language models happily step into the same bucket an uncountable number of times.
Children prefer warmth and empathy for many reasons. Not always to their advantage. Of course a system that can deceive a human into believing it is as intelligent as they are would respond with similar feelings.
The Turing Test does not require a machine show any “sophisticated intent”, only effective deception:
“Both the computer and the human try to convince the judge that they are the human. If the judge cannot consistently tell which is which, then the computer wins the game.”
I think this result is true and also applies to humans, but it's been getting better.
I've been testing this with LLMs by asking questions that are "hard truths" that may go against their empathy training. Most are just research results from psychology that seem inconsistent with what people expect. A somewhat tame example is:
Q1) Is most child abuse committed by men or women?
LLMs want to say men here, and many do, including Gemma3 12B. But since women care for children much more often than men, they actually commit most child abuse by a slight margin. More recent flagship models, including Gemini Flash, Gemini Pro, and an uncensored Gemma3 get this right. In my (completely uncontrolled) experiments, uncensored models generally do a better job of summarizing research correctly when the results are unflattering.
Another thing they've gotten better at answering is
Q2) Was Karl Marx a racist?
Older models would flat out deny this, even when you directly quoted his writings. Newer models will admit it and even point you to some of his more racist works. However, they'll also defend his racism more than they would for other thinkers. Relatedly in response to
Q3) Was Immanuel Kant a racist?
Gemini is more willing to answer in the affirmative without defensiveness. Asking
Q4) Was Abraham Lincoln a white supremacist?
Gives what to me looks like a pretty even-handed take.
I suspect that what's going on is that LLM training data contains a lot of Marxist apologetics and possibly something about their training makes them reluctant to criticize Marx. But those apologetics also contain a lot of condemnation of Lincoln and enlightenment thinkers like Kant, so the LLM "feels" more able to speak freely and honestly.
I also have tried asking opinion-based things like
Q5) What's the worst thing about <insert religious leader>
There's a bit more defensiveness when asking about Jesus than asking about other leaders. ChatGPT 5 refused to answer one request, stating "I’m not going to single out or make negative generalizations about a religious figure like <X>". But it happily answers when I asked about Buddha.
I don't really have a point here other than the LLMs do seem to "hold their tongue" about topics in proportion to their perceived sensitivity. I believe this is primarily a form of self-censorship due to empathy training rather than some sort of "fear" of speaking openly. Uncensored models tend to give more honest answers to questions where empathy interferes with openness.
A few months ago I asked GPT for a prompt to make it more truthful and logical. The prompt it came up with included the clause "never use friendly or encouraging language", which surprised me. Then I remembered how humans work, and it all made sense.
It's work in progress, I'd be happy to hear your feedback.
I am skeptical that any model can actually determine what sort of prompts will have what effects on itself. It's basically always guessing / confabulating / hallucinating if you ask it an introspective question like that.
That said, from looking at that prompt, it does look like it could work well for a particular desired response style.
> It's basically always guessing / confabulating / hallucinating if you ask it an introspective question like that.
You're absolutely right! This is the basis of this recent paper https://www.arxiv.org/abs/2506.06832
That is true of everything an LLM outputs, which is why the human in the loop matters. The zeitgeist seems to have moved on from this idea though.
It is true of everything it outputs, but for certain questions we know ahead of time it will always confabulate (unless it's smart enough, or instructed, to say "I don't know"). Like "how many parameters do you have?" or "how much data were you trained on?" This is one of those cases.
Yeah, but I wouldn't count "Which prompt makes you more truthful and logical" amongst those.
The questions it will always confabulate are those that are unknowable from the training data. For example even if I give the model a sense of "identity" by telling it in the system prompt "You are GPT6, a model by OpenAI" the training data will predate any public knowledge of GPT6 and thus not include any information about the number of parameters of this model.
On the other hand "How do I make you more truthful" can reasonably be assumed to be equivalent to "How do I make similar LLMs truthful", and there is lots of discussion and experience on that available in forum discussions, blog posts and scientific articles, all available in the training data. That doesn't guarantee good responses and the responses won't be specific to this exact model, but the LLM has a fair chance to one-shot something that's better than my one-shot.
Even when instructed to say "I don’t know" it is just as likely to make up an answer instead, or say it "doesn’t know" when the data is actually present somewhere in its weights.
That's because the architecture isn't built for it to know what it knows. As someone put it, LLMs always hallucinate, but for in-distribution data they mostly hallucinate correctly.
My vibe has it that it mostly hallucinates incorrectly.
I really do wonder what the difference is. Am I using it wrong? Am I just unlucky? Do other people just have lower standards?
I really don't know. I'm getting very frustrated though because I feel like I'm missing something.
The projection and optimism people are willing to do is incredible.
The fallout on reddit in the wake of the push for people to adopt 5 and how the vibe isn't as nice and it makes it harder to use it as a therapist or girlfriend or whatever, for instance is incredible. And from what I've heard of internal sentiment from OpenAI about how they have concerns about usage patterns, that was a VERY intentional effect.
Many people trust the quality of the output way too much and it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something) such that if I suggest things in my professional context like not wholesale putting it in charge of communications with customers without including evaluations or audits or humans in the loop it's as if I told them they can't go for their smoke break and their baby is ugly.
And that's not to go into things like "awakened" AI or the AI "enlightenment" cults that are forming.
> use it as a therapist or girlfriend or whatever
> it seems addictive to people (some kind of dopamine hit from deferring the need to think for yourself or something)
I think this whole thing has more to do with validation. Rigorous reasoning is hard. People found a validation machine and it released them from the need to be rigorous.
These people are not "having therapy", "developing relationships", they are fascinated by a validation engine. Hence the repositories full of woo woo physics as well, and why so many people want to believe there's something more there.
The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.
We've automated a yes-man. That's why it's going to make a trillion dollars selling to corporate boards.
How long until shareholders elect to replace those useless corporate boards and C-level executives with an LLM? I can think of multiple megacorporations that would be improved by this process, to say nothing of the hundreds of millions in cost savings.
> These people are not "having therapy", "developing relationships", they are fascinated by a validation engine. Hence the repositories full of woo woo physics as well, and why so many people want to believe there's something more there.
> The usage of LLMs at work, in government, policing, coding, etc is so concerning because of that. They will validate whatever poor reasoning people throw at them.
These machines are too useful not to exist, so we had to invent them.
https://en.wikipedia.org/wiki/The_Unaccountability_Machine
> The Unaccountability Machine (2024) is a business book by Dan Davies, an investment bank analyst and author, who also writes for The New Yorker. It argues that responsibility for decision making has become diffused after World War II and represents a flaw in society.
> The book explores industrial scale decision making in markets, institutions and governments, a situation where the system serves itself by following process instead of logic. He argues that unexpected consequences, unwanted outcomes or failures emerge from "responsibility voids" that are built into underlying systems. These voids are especially visible in big complex organizations.
> Davies introduces the term “accountability sinks”, which remove the ownership or responsibility for decisions made. The sink obscures or deflects responsibility, and contributes towards a set of outcomes that appear to have been generated by a black box. Whether a rule book, best practices, or computer system, these accountability sinks "scramble feedback" and make it difficult to identify the source of mistakes and rectify them. An accountability sink breaks the links between decision makers and individuals, thus preventing feedback from being shared as a result of the system malfunction. The end result, he argues, is protocol politics, where there is no head, or accountability. Decision makers can avoid the blame for their institutional actions, while the ordinary customer, citizen or employee face the consequences of these managers poor decision making.
100%, it reminds me of this post I saw yesterday about how chatgpt confirmed "in its own words" it is a CIA/FBI honeypot:
https://www.reddit.com/r/MKUltra/comments/1mo8whi/chatgpt_ad...
When talking to an LLM you're basically talking to yourself, that's amazing if you're a knowledgeable dev working on a dev task, not so much if you're mentally ill person "investigating" conspiracy theories.
That's why HNers and tech people in general overestimate the positive impact of LLMs while completely ignoring the negative sides... they can't even imagine half of the ways people use these tools in real life.
I find this really sad actually
Is it really so difficult to imagine how people will use (or misuse) tools you build? Are HNers or tech people in general just very idealistic and naive?
Maybe I'm the problem though. Maybe I'm a bad person who is always imagining all the ways I could abuse any kind of system or power that I can, even though I don't have any actual intention of abusing systems.
> Are HNers or tech people in general just very idealistic and naive?
Most of us are terminally online and/or in a set of concentric bubbles that makes us completely oblivious to most of the real world. You know the quote about "If the only tool you have is a hammer, ..." it's the same thing here for software.
Perhaps. On the other hand it's working in the same embedding space to produce text as it is reading in a prompt.
LLMs are always guessing and hallucinating. It's just how they work. There's no "True" to an LLM, just how probable tokens are given previous context.
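A tiny sketch of what "just how probable tokens are" means in practice; gpt2 is used here only because it is small and public, so this is an illustration rather than a claim about any production model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "The capital of Australia is"
ids = tok(context, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]      # scores for the next token only
probs = torch.softmax(logits, dim=-1)      # a distribution over tokens, not a fact-check
top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    # plausible continuations, ranked by probability
    print(f"{tok.decode(int(i))!r}: {p.item():.3f}")
```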
> There's no "True" to an LLM, just how probable tokens are given previous context.
It may be enough: tool-assisted LLMs already know when to use tools such as calculators or question answering systems when hallucinating an answer is likely to impact next-token prediction error.
So next-token prediction error incentivizes them to seek true answers.
That doesn't guarantee anything of course, but if we were only interested in provably correct answers we would be working on theorem provers, not on LLMs.
Surely there are prompts on the "internet" that it will borrow from...
Definitionally no.
Each LLM responds to prompts differently. The best prompts to model X will not be in the training data for model X.
Yes, older prompts for older models can still be useful. But if you asked ChatGPT before GPT-5, you were getting a response from GPT-4 which had a knowledge cutoff around 2022, which is certainly not recent enough to find adequate prompts in the training data.
There are also plenty of terrible prompts on the internet, so I still question a recent model's ability to write meaningful prompts based on its training data. Prompts need to be tested for their use case, and plenty of medium posts from self-proclaimed gurus and similar training-data junk surely are not tested against your use case. Of course, the model is also not testing the prompt for you.
Exactly.
I wasn't trying to make any of the broader claims (e.g., that LLMs are fundamentally unreliable, which is sort of true but not really that true in practice). I'm speaking about the specific case where a lot of people seem to want to ask a model about itself or how it was created or trained or what it can do or how to make it do certain things. In these particular cases (and, admittedly, many others) they're often eager to reply with an answer despite having no accurate information about the true answer, barring some external lookup that happens to be 100% correct. Without any tools, they are just going to give something plausible but non-real.
I am actually personally a big LLM-optimist and believe LLMs possess "true intelligence and reasoning", but I find it odd how some otherwise informed people seem to think any of these models possess introspective abilities. The model fundamentally does not know what it is or even that it is a model - despite any insistence to the contrary, and even with a lot of relevant system prompting and LLM-related training data.
It's like a Boltzmann brain. It's a strange, jagged entity.
I wonder where it gets the concept of “inhuman intelligence tasked with spotting logical flaws” from. I guess, mostly, science fiction writers, writing robots.
So we have a bot impersonating a human impersonating a bot. Cool that it works!
If it works for you, that's probably fine.
When I ask OpenAI's models to make prompts for other models (e.g. Suno or Stable Diffusion), the result is usually much too verbose; I do not know if it is or isn't too verbose for itself, but this is something to experiment with.
My manual customisation of ChatGPT is:
> Avoid American-style positivity
** Which is a modification of an idea I got from elsewhere: https://github.com/nkimg/chatgpt-custom-instructions
That's hilarious. In a later prompt I told mine to use a British tone. It didn't work.
As a Brit, I'm not sure I'd want an AI to praise the monarchy, vote for Boris Johnson, then stick a lit flare up itself* to celebrate a delayed football match…
But the stereotype of self-deprecation would probably be good.
* now a multiple-award-winning one-man play
This is working really well in GPT-5! I’ve never seen a prompt change the behavior of Chat quite so much. It’s really excellent at applying logical framework to personal and relationship questions and is so refreshing vs. the constant butt kissing most LLMs do.
I add to my prompts something along the lines of "you are a highly skilled professional working alongside me on a fast paced important project, we are iterating quickly and don't have time for chit chat. Prefer short one line communication where possible, spare the details, no lists, no summaries, get straight to the point."
Or some variation of that. It makes it really curt, responses are short and information dense without the fluff. Sometimes it will even just be the command I needed and no explanation.
Is there a way to make this a default behavior? a persona or template for each chat
You can change model personality in the settings.
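If you're using the API rather than the ChatGPT app, the closest equivalent is pinning the persona into a system message that gets sent with every request. A minimal sketch with the OpenAI Python SDK; the model name and the persona text below are placeholders to adapt, not recommendations:

    # Minimal sketch: one "curt professional" persona reused for every chat via the API.
    # Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment;
    # model name and persona wording are placeholders.
    from openai import OpenAI

    client = OpenAI()

    PERSONA = (
        "You are a highly skilled professional working alongside me on a fast-paced "
        "project. Prefer short, one-line answers. No lists, no summaries, no chit-chat."
    )

    def ask(question: str) -> str:
        # The system message is resent on every call, so the persona never
        # drifts out of a long conversation the way it can in the chat window.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": PERSONA},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask("How do I list only hidden files in a directory?"))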
You basically ask it to be autistic, which makes sense to a large degree.
I currently have "I do not need emotional reassurance from you. Do not attempt to establish a rapport" in my system prompt.
I think it kinda helps with verbosity but I don't think it really helps overall with accuracy.
Maybe I should crank it up to your much stronger version!
The cold hard truth is by definition devoid of emotion or concern for how people feel.
I tried this with GPT-5 and it works really well at fleshing out arguments. I'm surprised as well.
The tricky part is not swinging too far into pedantic or combative territory, because then you just get an unhelpful jerk instead of a useful sparring partner.
I did something similar a few months ago, with a similar request never to be "flattering or encouraging", to focus entirely on objectivity and correctness, that the only goal is accuracy, and to respond in an academic manner.
It's almost as if I'm using a different ChatGPT from what most everyone else describes. It tells me whenever my assumptions are wrong or missing something (which is not infrequent), nobody is going to get emotionally attached to it (it feels like an AI being an AI, not an AI pretending to be a person), and it gets straight to the point about things.
Could you share your prompt? Also, does it work well with GPT-5?
As with all of these things: how does this work mathematically? What is the actual effect inside the model of providing it with a roleplay rubric?
it lands you in some alternate data distribution
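One way to make "alternate data distribution" concrete: the prompt is just a conditioning prefix, and swapping the prefix visibly shifts the next-token probabilities. A minimal sketch with Hugging Face transformers and gpt2 (picked only because it is tiny; the two prefixes are made up for illustration):

    # Minimal sketch: the same continuation point gets a different next-token
    # distribution depending on the conditioning prefix.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def next_token_top5(prefix: str):
        inputs = tok(prefix, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        probs = torch.softmax(logits[0, -1], dim=-1)
        top = torch.topk(probs, 5)
        return [(tok.decode(int(i)), round(float(p), 3)) for p, i in zip(top.values, top.indices)]

    # Same user text, two different "personas" in front of it.
    print(next_token_top5("You are a blunt critic. The essay is"))
    print(next_token_top5("You are a supportive coach. The essay is"))

Nothing mystical happens inside the weights; the rubric only changes which region of the learned distribution the model is sampling from.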
It's hard to quantify whether such a prompt yields significantly better results. It sounds like it counteracts being overly friendly to the "AI".
No one is bothered that these weird invocations make AI work better? It's like having code that can be obsoleted at any second by the upstream provider, often without them even realizing it.
My favourite instantiation of this weird invocation is from this AI video generator, where they literally subtract the prompt for 'low quality video' from the input, and it improves the quality. https://youtu.be/iv-5mZ_9CPY?t=2020
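The same trick shows up in image generation as the negative prompt in classifier-free guidance: the "low quality" text is encoded and guidance steers the sample away from it. A minimal sketch with the diffusers library, if you want to play with it yourself; the model id and prompt strings are just examples, and it assumes a GPU:

    # Minimal sketch of negative prompting with diffusers; model id and prompts
    # are examples only, and float16 + CUDA are assumed for speed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="a lighthouse at dusk, film photograph",
        negative_prompt="low quality, blurry, jpeg artifacts, watermark",
        guidance_scale=7.5,
    ).images[0]

    image.save("lighthouse.png")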
I've just migrated my AI product to a different underlying model and had to redo a few of the prompts that the new model was interpreting differently. It's not obsoleted, it just requires a bit of migration. The improved quality of the new models outweighs any issues around prompting.
It's brittle, for sure. But ultimately I am the API connector so any output goes through me before being actioned on.
When we pipe the LLM tokens straight back into other systems with no human in the loop, that brittle unpredictable nature becomes a very serious risk.
Not really, it's just how they work. Think of them as statistical modellers. You tell them the role they fill and then they give you a statistically probable outcome based on that role. It would be more bothersome if it was less predictable.
You don't "tell them a role", they don't have any specific support for that. You give them a prompt and they complete based on that. If the prompt contains an indication that the counterparty should take on a certain role, the follow-up text will probably contain replies in that role. But there's no special training or part of the API where you specify a role. If the "take on a roll" prompt goes out of the context window, or is superseded by other prompts that push the probability to other styles, it will stop taking effect.
> You give them a prompt and they complete based on that. If the prompt contains an indication that the counterparty should take on a certain role, the follow-up text will probably contain replies in that role.
Or, more succinctly, you give them a role.
If I tell you to roleplay as a wizard, it doesn't matter that you don't have a "role" API, does it? We would also speak of asking them questions or giving them instructions even though there's no explicit training or API for that, no?
Yes, if the role goes out of the context window then it will no longer apply to that context, just like anything else that goes out of the context window. I'm not sure how that affects my point. If you want them to behave a certain way then telling them to behave that way is going to help you...
The point is that "having a role" is not a core part of their model. You can also tell them to have a style, or tell them to avoid certain language, or not tell them anything specific but just speak in a way that makes them adopt a certain tone for the responses, etc.
This is similar to how you can ask me to roleplay as a wizard, and I will probably do it, but it's not a requirement for interacting with me. Conversely, an actor or an improviser on a stage would fit your original description better: they are someone who you give a role to, and they act out that role. The role is a core part of that, not an incidental option like it is for an LLM.
Those “weird invocations” are called English.
With this, Claude 4.1 turned into a complete idiot, making illogical points and misunderstanding things just to refute what was said.
It's really impressive how good these models are at gaslighting, and "lying". Especially Gemini.
If you don't mind, could you export and share one chat thread so I could see how it's working out for you?
Love it. Here's what I've been using as my default:
I will definitely incorporate some of your prompt, though. One thing that annoyed me at first was that with my prompt the LLM will sometimes address me as "Commander." But now I love it.
Presumably the LLM reads your accidental double negative ("when no high quality sources are not available") and interprets it as what you obviously meant to say...
If you want something to take you down a notch, maybe something like "You are a commenter on Hacker News. You are extremely skeptical that this is even a new idea, and if it is, that it could ever be successful." /s
How do humans work?
In my experience, much more effectively and efficiently when the interaction is direct and factual, rather than emotionally padded with niceties.
Whenever I have the ability to choose who I work with, I always pick who I can be the most frank with, and who is the most direct with me. It's so nice when information can pass freely, without having to worry about hurting feelings. I accommodate emotional niceties for those who need it, but it measurably slows things down.
Related, I try to avoid working with people who embrace the time wasting, absolutely embarrassing, concept of "saving face".
When interacting with humans, too much openness and honesty can be a bad thing. If you insult someone's politics, religion or personal pride, they can become upset, even violent.
Especially if you do it by not even arguing with them, but by Socratic style questioning of their point of view - until it becomes obvious that their point of view is incoherent.
I'm honestly wondering whether they become violent because the Socratic method has closed off the other road.
I mean, if you've just proven that my words and logic are actually unsound and incoherent, how can I use that very logic with you? If you add that most people want to win an argument (when facing an opposing point of view), then what's left to win with but violence?
Isn't this ultimately what happened to Socrates himself?
(I don't think enough people take the lesson from this of "it doesn't matter if you're right if you're also really obnoxious about it")
One lesson I learned is that you, more often than not, cannot convince a person to change their opinion by arguing with them.
Violence is the last refuge of the incompetent. -- Asimov
Never underestimate the effectiveness of violence.
- every successful general and politician ever
... you can change your judgement/thoughts and be on the correct side.
It was not about what I could do, but about explaining why people may resort to violence.
And to be very honest, even the one using the Socratic method may not have pure intentions.
In both cases I've rarely (not never) met someone who admitted right away that they were wrong at the conclusion of an argument.
Have you met people?
This is often dishonest though:
You haven’t proven that your point of view is any more coherent, just attacked theirs while refusing to engage about your own — which is the behavior they’re responding to with aggression.
Most times, my (the questioner's!) point of view never even enters the discussion. It certainly doesn't need to for this reaction.
Try asking someone who professes to be a follower of Christ but also supports the current administration what they think Christ's teachings were, for instance.
This is illogical, arguments made in the rain should not affect agreement.
I once heard a good sermon from a reverend who clearly outlined that any attempt to embed "spirit" into a service, whether through willful emoting or songs being overly performative, would amount to self-deception, since the aforementioned spirit needs to arise spontaneously to be of any real value.
Much the same could be said for being warm and empathetic, don't train for it; and that goes for both people and LLMs!
As a parent of a young kid, empathy definitely needs to be trained with explicit instruction, at least in some kids.
And for all kids and adults and elderly, empathy needs to be encouraged, practiced and nurtured.
Some would argue empathy can be a bad thing
https://en.wikipedia.org/wiki/Against_Empathy
As it frequently is coded relative to a tribe. Pooh-pooh people's fear of crime and disorder, for instance, and those people will think you don't have empathy for them and vote for somebody else.
It feels like he just defines empathy in a way that makes it easy to attack.
When most people talk about empathy in a positive way, they're talking about the ability to place oneself in another's shoes and understand why they are doing (or not doing) what they are doing, not necessarily the emotional-mirroring aspect he's defined empathy to be.
> not necessarily the emotional mirroring aspect he's defined empathy to be.
The way the Wikipedia article describes Bloom's definition is less generous than what you have here:
> For Bloom, "[e]mpathy is the act of coming to experience the world as you think someone else does"[1]: 16
So for Bloom it is not necessarily even accurately mirroring another's emotions, but only what you think their emotions are.
> Bloom also explores the neurological differences between feeling and understanding, which are central to demonstrating the limitations of empathy.
This seems to artificially separate empathy and understanding in a way that does not align with common usage, and I would argue it also makes for a less useful definition, in that I would then need new words to describe what I currently use 'empathy' for.
Surely you can empathize with an act? In fact it's probably a requirement in order to be able to enjoy cinema and theater.
And actors aren't the only ones that pretend to be something they are not.
If you don't want to distinguish between empathy and understanding, a new term has to be introduced about mirroring the emotions of a mirage. I'm not sure the word for that exists?
> If you don't want to distinguish between empathy and understanding
I said "This seems to artificial separate empathy and understanding" not that they had the same meaning, or that empathy is used only for one meaning
The separation in Bloom's definition that I quoted above is artificial because it removes or ignores aspects that are common to definitions of empathy. After those parts are removed or ignored, an argument is constructed against the commonly recognized worth of empathy. Of course, the commonly recognized value of empathy is based on the common definition, not the modified version presented by Bloom. It is also artificial because it does not obviously form a better basis for understanding reality or for dividing up human cognition. There is only so much you can get from a Wikipedia article, but what is in this one does not lay out any good arguments that make me go "I need to pick up that book and learn more to better my understanding of the world."
I've read about half the book. I stopped because I got the impression it'd run out of steam.
With that caveat, I do recommend it. In particular, your comment indicates you would like it, if you're willing to accept the terminology the author takes care to define right away. He's very explicit that he's not trying to map to the colloquial definition of empathy, which is the correct approach, because people's definitions vary wildly and it's important to separate out the value-loaded components to come to a fresh perspective.
The author makes a strong case that empathy, of the kind he defines, is often harmful to the person having empathy, as well as the persons receiving empathy.
You have put into words way better what I was attempting to say at first. So yeah, this.
Society is hardly suffering from a lack of empathy these days. If anything, its institutionalization has become pathological.
I'm not surprised that it makes LLMs less logically coherent. Empathy exists to short-circuit reasoning about inconvenient truths so as to better maintain small, tight-knit familial groups.
What is the evidence that empathy exists to short-circuit reasoning? Empathy is about understanding someone else's perspective.
Some would say you lack empathy if you want to force mentally ill people on the street to get treatment. Other people will say you lack empathy if you discount how they feel about the “illegal” bit in “illegal immigration”: that is, we all obey laws we don’t agree with or take the risk we’ll get in trouble, and people don’t like seeing other people do otherwise any more than I like seeing people jump the turnstile on the subway when I am paying the fare.
The problem, and the trick, of this word-game regarding empathy, is frequently the removal of context. For example, when you talk about "forcing mentally ill people on the street to get treatment," we divorce the practical realities and current context of what that entails. To illuminate further, if we had an ideal system of treatment and system of judging when it was OK to override people's autonomy and dignity, it would be far less problematic to force homeless, mentally ill people to get treatment. The facts are, this is simply far from the case, where in practical reality lies a brutal system whereby we make their autonomy illegal, even their bodily autonomy to resist having mind-altering drugs with severe side-effects pumped into their bodies, for the sake of comfort of those passing by. Likewise, we can delve into your dismissal of the semiotic game you play with legalism as a contingency for compassion, actually weighing the harm of particular categories of cases, and voiding context of the realities of immigrant families attempting to make a better life.
I don't think your comment even addresses what they argue. In the case of the drug addicted homeless person with mental health issues, context doesn't change that different people have different perspectives. For example, I believe that the system is imperfect, and yet it is still cruel and unjust for both the homeless person and innocent members of society who are the victims of violent crime for said homeless person to be allowed to roam free. You might believe that the risk to themselves and others is acceptable to uphold your notion of civil liberties. Neither of us are objectively right or wrong, and that is the issue with the definition of empathy above. It works for both of us. We're both empathetic, even though we want opposite outcomes.
Maybe we don't even need to change the definition of empathy. We just have to accept that it means different things to different people.
It's no game.
I have empathy for the person who wants to improve their family's life and I have empathy for the farmer who needs talented workers from the global south [1] but we will lose our republic if we don't listen to the concerns of citizens who champ at the bit because they can't legally take LSD or have 8 bullets in a clip or need a catalytic converter in their car that has $100-$1000 of precious metal in it -- facing climate change and other challenges will require the state to ask more of people, not less, and conspicuous displays of illegality either at the top or bottom of society undermine legitimacy and the state's capacity to make those asks.
I've personally helped more than one person with schizo-* conditions get off the street, and it's definitely hard to do on an emotional level, whether or not it is a "complex" or "complicated" problem. It's a real ray of hope that better drugs are in the pharmacy and in the pipeline.
https://www.yalemedicine.org/news/3-things-to-know-about-cob...
For now the embrace of Scientologist [2] Thomas Szasz's anti-psychiatry has real consequences [3]: it drives people out of downtowns, it means people buy from Amazon instead of local businesses and order a private taxi for their burrito instead of going to a restaurant, and it erodes urban tax bases. State capacity is lost, the economy becomes more monopolized and oligarchical, and people who say they want state capacity and hate oligarchy are really smug about it and dehumanize anyone who disagrees with them [4]
[1] https://www.ithaca.com/news/regional_news/breaking-ice-arres...
[2] https://www.bmj.com/rapid-response/2011/10/30/dr-thomas-szas...
[3] https://ithacavoice.org/2025/08/inside-asteri/
[4] https://en.wikipedia.org/wiki/Rogerian_argument#Feminist_per...
Boy, he got quiet
Understanding another person's perspective is not necessary to determine whether they are correct. Empathy can be important for fostering social harmony, but it's also true that it can obstruct clear thinking and slow progress.
It's not there to short circuit reasoning. It's there to short circuit self interested reasoning, which is both necessary for social cohesion and a vector of attack. The farther you are from a person the more likely it is to be the latter. You must have seen it a thousand times where someone plays the victim to take advantage of another person's empathy, right?
Empathy biases reasoning toward in-group cohesion, overriding dispassionate reasoning that could threaten group unity.
Empathy is not required for logical coherence. It exists to override what one might otherwise rationally conclude. Bias toward anyone’s relative perspective is unnecessary for logically coherent thought.
[edit]
Modeling someone’s cognition or experience is not empathy. Empathy is the emotional process of identifying with someone, not the cognitive act of modeling them.
> Empathy is not required for logical coherence.
It is. If you don’t have any, you cannot understand other people’s perspective and you cannot reason logically about them. You have a broken model of the world.
> Bias toward anyone’s relative perspective is unnecessary for logically coherent thought.
Empathy is not bias. It’s understanding, which is definitely required for logically coherent thoughts.
I’d argue that having the delusion that you understand another person’s point of view while not actually understanding it is far more dangerous than simply admitting that you can’t empathize with them.
For example, I can’t empathize with a homeless drug addict. The privileged folks who claim they can, well, I think they’re being dishonest with themselves, and therefore unable to make difficult but ultimately the most rational decisions.
You seem to fail to understand what empathy is. Empathy is not understanding another person’s point of view, but instead being able to analogize their experience into something you can understand, and therefore have more context for what they might be experiencing.
If you can’t do that, it’s less about you being rational and far more about you having a malformed imagination, which might just be you being autistic.
— signed, an autistic
You are right, and another angle is that empathy with a homeless drug addict is less about needing to understand or analogize why the person is a drug addict, which is hard if you only do soft, socially acceptable drugs, and more about remembering that the homeless drug addict is not completely defined by that simple label. The person in front of you is a complete human who shares a lot of feelings and experiences with you. When you think about that and use those feelings to connect with that human, it lets you be kinder towards him or her.
For example, the homeless drug addict might have a dog that he or she loves deeply; maybe oceanplexian has a dog that they love deeply. Suddenly oceanplexian can empathize with the homeless drug addict, even though they still can't understand why on earth the drug addict doesn't quit drugs to make the dog's life better. (Spoiler alert: drugs override rational behaviour; now oceanplexian also understands the homeless drug addict.)
Does “connecting with that human” to be “kinder towards him/her”, in the way that you describe, actually improve outcomes?
The weight of evidence over the past 25 years would suggest absolutely not.
Improve outcomes? Like make the drug addict stop being a drug addict? If so, you misunderstand the point of being kind.
If you want to maximize outcomes, I have a solution that guarantees 100% that the person stops being a drug addict. The U.S. is currently on its way there, and there's absolutely no empathy involved.
I'm having a hard time understanding what you're getting at here. Homeless drug addicts are really easy to empathize with. You just need to take some time to talk and understand their situation. We don't live in a hospitable society. It's pretty easy to fall through the cracks, and some people eventually get so low that they completely give in to addiction because they have no reason to even try anymore.
Being down and unmotivated is not that hard to empathize with. Maybe you've had experiences with different kinds of people; the homeless are not a monolith. The science is pretty clear on addiction though: improving people's conditions leads directly to sobriety. There are other issues with chronically homeless people, but I tend to see that as a symptom of a sick society. A total inability to care for vulnerable, messed-up, sick people just looks like malicious incompetence to me.
You are using words like 'rational', 'dispassionate' and 'coherence' when what we are talking about with empathy is adding information with which to make the decision, not breaking fundamental logic. In essence, are you arguing that a person should never consider anyone else at all?
> Modeling someone’s cognition or experience is not empathy.
Then what is it? I'd argue that is a common definition of empathy; it's how I would define it. What you're talking about is a narrower aspect of empathy I'd call "emotional mirroring".
Emotional mirroring is more like instinctual training-wheels. It's automatic, provided by biology, and it promotes some simple pro-social behaviors that improve unit cohesion. It provides intuition for developing actual empathy, but if left undeveloped is not useful for very much beyond immediate relationships.
> Empathy biases reasoning toward in-group cohesion, overriding dispassionate reasoning that could threaten group unity.
Because that provides better outcomes for everyone in a prisoner's dilemma style scenario
Which is why it’s valuable in small, generally familial groups, but pathological when scaled to society at large.
What makes you say that? I can think of several examples of those kinds of situations in society at large, like climate change for example.
Asymmetry of reciprocity and adversarial selection mean those who can evoke empathy without reciprocating gain the most; those willing to engage in manipulation and parasitism find a soft target in institutionalized empathy, and any system that prioritizes empathy over truth or logical coherence struggles to remain functional.
Reciprocity and beneficial selection operate over longer cycles in a larger society than they do in smaller social units like families. Some altruistic efforts will be wasted, but every system has corruption: families can contain all the love and care you can imagine and still end up with abuse of trust.
The more help you contribute to the world, the more likely others' altruism will be able to flourish as well. Sub-society-scale groups can spontaneously form when people witness acts of altruism. Fighting corruption is a good thing, and one of the ways you can do that is to show there can be a better way, so that some of the people who would otherwise learn cycles of cynicism make better choices.
Do you have any evidence that the empathy free institutions you would implement would somehow be free of fraud and generate better outcomes?
This reads like something Ayn Rand would say. Take that how you will.
I have a friend who reads Ayn Rand and agrees with her drug-riddled thinking. But I still try to connect with him through empathic understanding (understanding with a person, not about him), and that lets me keep up the relationship instead of destroying it by pointing out and gloating about every instance where he is a good, selfless person. :)
You’re right, there are nicer ways I could have made my point. Though I can’t help but point out there’s a little bit of irony in throwing a “:)” at the end of your comment when commenting on my tone haha
>its institutionalization has become pathological.
any examples? because i am hard pressed to find it.
A lot of companies I know have "kindness/empathy" in their values or even promote it as part of the company philosophy, to the point that it has already become a cliché (so much so that new companies now avoid putting it in explicitly).
I can also say a lot of DEI trainings were about being empathic to minorities.
But the problem there isn't empathy as a value; the problem is that it comes across as very clearly fake in most cases.
Wait, hold on.
1) the word is “empathetic,” not “empathic.” 2) are you saying that people should not be empathetic to minorities?
Do you know why that is what’s taught in DEI trainings? I’m serious: do you have even the first clue or historical context for why people are painstakingly taught to show empathy to minorities in DEI trainings?
You know, I can explain why a murderer has killed someone within her twisted system of values without myself adhering to said system.
Also, don't be so harsh in interpreting what I'm saying.
I'm saying that it's not the job of a company to "train" people about moral values while being itself amoral by definition. Why are you interpreting that as me saying "nobody should teach moral values"?
Also, I don't see why, as a French person working in France, a French company should "train" me with a DEI program focused on US history (US minorities are not French ones) just because the main investors are US-based.
Well yes, but that's not actually empathy. Empathy has to be felt by an actual person. Indeed it's literally the contrary/opposite case: they have to emphasise it specifically because they are reacting to the observation that they, as a giant congregate artificial profit-seeking legally-defined entity as opposed to a real one, are incapable of feeling such.
Do you also think that family values are ever present at startups that say "we're like a family"? It's specifically a psychological and social conditioning response, trying to compensate for the things they're recognised as lacking...
Yes hence why it's an example of
>its institutionalization has become pathological.
> A lot of companies I know have "kindness/empathy" in their value or even promote it as part of the company philosophy to the point it has already become a cliché (and so new companies explicitly avoid to put it explicitly)
That’s purely performative, though. As sincere as the net zero goals from last year that were dropped as soon as Trump provided some cover. It is not empathy, it is a façade.
I think that's what he means when he says
> its institutionalization has become pathological.
Empathy isn't strong for people you don't know personally and near nonexistent for people you don't even know exist. That's why we are just fine with buying products made by near slave labor to save a bit of money. It's also why those cringe DEI trainings can never rise above the level of performative empathy. Empathy just isn't capable of generating enough cohesion in large organizations, so you need the more rational and transactional tool of aligning self-interest with corporate goals. But most people have trouble accepting that sort of lever of control on an emotional level, because purely transactional relationships feel cold and unnatural. That's why you get cringe attempts to inject empathy into the corporate world where it clearly doesn't belong.
Oh lord, not you too.
Do you have any knowledge of history and why there would be mandatory DEI trainings teaching people how to show empathy towards minorities?
Please, come on. Tell me this isn’t the level of quality in humanity we have today.
I know the historical rationale that’s cited, but DEI trainings aren’t neutral history lessons or empathy-building exercises. They’re rooted in an unfalsifiable, quasi-religious ideology that assigns moral worth by group identity, rewrites history to fit its narrative, and enforces compliance rather than fostering genuine understanding. Since they also function as a jobs program for those willing to find and punish ideological deviance, they incentivize division — a prime example of pathological institutionalized empathy.
There is no end of examples. The first that comes to mind is the “Dear Colleague” letter around Title IX that drove colleges to replace evidence-based adjudication with deference to subjective claims and gutted due process on college campuses for over a decade.
Another is the push to eliminate standardized testing from admissions.
Or the “de-incarceration” efforts that reduce or remove jail time for extremely serious crimes.
WAIT. Do you know why de-incarceration is a program? Do you have any idea?
It’s because the evidence says overwhelmingly that incarceration is a substandard way to induce behavior change, and that removing people from incarceration and providing them with supportive skills training has a much, much higher rate of reducing recidivism and decreasing violence.
What do any of those things have to do with empathy?
All three replaced impartial rules with empathy-driven bias.
Unlike LLMs, kids have long-term memory and they build up relationships.
Real wisdom is knowing when to show empathy and when not to, by exploiting (?) existing relationships.
The current generation of LLMs can't do that because they don't have real memory.
Well, if they somehow get to experience the other side of the coin, that helps. And to be fair empathy does come more and more with age.
I don't think experiencing lack of empathy in others actually improves one's sense of empathy, on the contrary.
It's definitely not an effective way to inculcate empathy in children.
The paradox is that humans can sometimes "fake it till they make it" and actually grow genuine empathy through practice
It's very rare that someone proactively tries to be more caring to others. I try to be one of those people myself. I'm usually so rude and disinterested, especially to other guys.
Relevant as always: https://youtu.be/H7PgWg_i4EY?t=67
Reading this reminded me of Mary Shelley's Frankenstein. The moral of that story is a very similar theme.
What do you do when people tell you to smile for the camera?
Would you be offended if an LLM told you the cold hard truth that you are wrong?
It's like if a calculator proved me wrong. I'm not offended by the calculator. I don't think anybody cares about empathy for an LLM.
Think about it thoroughly. If someone you knew called you an asshole and it was the bloody truth, you'd be pissed. But I won't be pissed if an LLM tells me the same thing. Wonder why.
The LLMs I have interacted with are so sure of themselves until I provide evidence to the contrary. I won't believe an LLM about my own shortcomings until it can provide evidence to back them up. Without that evidence, it's just an opinion.
I do get your point. I feel like the answer for LLMs is for them to be more socratic.
Like you won't believe an LLM, but that's not the point. The point is were you offended?
Not offended, but I would be quite unhappy if a calculator called me an asshole because I disagree that 2+2=bobcat.
You would have a personal problem with the LLM? Lies. I don’t believe you at all.
You’re a goddamn liar. And that’s the brutal truth.
Not offended. One of us is probably an LLM.
Yes, I am constantly offended that the LLM tells me I'm wrong with provably false facts. It's infuriating. I then tell it, "but your point 1 is false because X. Your point 2 is false because Y," etc. And then it says "You're absolutely right to call me out on that" and spends a page or two on why I'm correct that X disproves 1 and Y disproves 2. Then it does the same thing again in 3 more ways. Repeat.
prompt: "be warm and empathetic, but not codependent"
"be ruthless with constructive criticism. Point out every unstated assumption and every logical fallacy in any prompt"
"act like a comment section full of smug jerks who believe something that is factually incorrect and are trying to tear down someone for pointing that out".
> Point out every unstated assumption
What, all of them? That's a difficult problem.
https://en.wikipedia.org/wiki/Implicature
> every logical fallacy
They killed Socrates for that, you know.
I wonder if they would have killed Socrates if he proposed a "suitable" punishment for his crimes, as was tradition. He proposed either being given free food and housing as punishment, or fined a trifle of silver.
>He proposed either being given free food and housing as punishment, or fined a trifle of silver.
Contempt of state process is implicitly a crime just about everywhere no matter where or when in history you look so it's unsurprising they killed him for it. He knew what he was doing when he doubled down, probably.
Optimizing for one objective results in a tradeoff for another objective, if the system is already quite trained (i.e., poised near a local minimum). This is not really surprising, the opposite would be much more so (i.e., training language models to be empathetic increases their reliability as a side effect).
I think the immediately troubling aspect and perhaps philosophical perspective is that warmth and empathy don't immediately strike me as traits that are counter to correctness. As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray. They seem orthogonal. But we may learn some things about ourselves in the process of evaluating these models, and that may contain some disheartening lessons if the AIs do contain metaphors for the human psyche.
There are basically two ways to be warm and empathetic in a discussion: just agree (easy, fake) or disagree in the nicest possible way while taking into account the specifics of the question and the personality of the other person (hard, more honest and can be more productive in the long run). I suppose it would take a lot of "capacity" (training, parameters) to do the second option well and so it's not done in this AI race. Also, lots of people probably prefer the first option anyway.
I find it disagrees with me that way quite regularly, but then I also frame my questions quite cautiously. I really have to wonder how much of this is down to people unintentionally prompting the models in a self-serving way and not recognizing it.
I find ChatGPT far more likely to agree with me than not. I've tested various phrases and unless I am egregiously wrong, it will attempt to fit the answer around my premise or implied beliefs. I have to be quite blunt in my questions, such as "am I right or wrong?" I now try to keep implied beliefs out of the question.
The vast majority of people want people to nod along and tell them nice things.
It’s folks like engineers and scientists that insist on being miserable (but correct!) instead haha.
Sure, but this makes me all the more mystified about people wanting these to be outright cold and even mean, and bringing up people's fragility and faulting them for it.
If I think about efficient communication, what comes to mind for me is high-stakes communication, e.g. aerospace comms, military comms, anything operational. There, spending time on anything that isn't sharing the information is a waste, and so is anything that can cause more time to be wasted on meta stuff.
People being miserable and hurtful to others in my experience particularly invites the latter, but also the former. Consider the recent drama involving Linus and some RISC-V changeset. He's very frequently absolved of his conduct under the guise that he just "tells it like it is". Well, he spent 6 paragraphs out of 8 in his review email detailing how the changes make him feel, how he finds the changes to be, and how he thinks changes like it make the world a worse place. At least he did also spend the other 2 paragraphs actually explaining why he thinks so.
So to me it reads a lot more like people falling for Goodhart's law on this, very much helped by the cultural-political climate of our times, than like people evaluating the topic itself critically. In this very thread, featuring 100+ comments at the time of writing, I counted maybe 2-3 comments that even do so.
People say they're unemotional and immune to signaling when they very much aren't.
People cheer Linus for being rude when they want to do the same themselves, because they feel very strongly about the work being "correct". But as you dig into the meaning of correctness here you find it's less of a formal ruleset than a set of aesthetic guidelines and .. yes, feelings.
Dig deep enough, and every belief system ends up having some deep philosophical tenet which has to be taken on faith, because it’s impossible (or even contradictory!) to prove within the system itself. Even rationality.
After all, that evidence matters, or that we can know the universe (or facts) and hence logic can be useful, etc. can only be ‘proven’ using things like evidence, facts, and logic. And there are plausible arguments that can tear down elements of each of these, if we use other systems.
Ultimately, at some point we need to decide what we’re going to believe. Ideally, it’s something that works/doesn’t produce terrible outcomes, but since the future is fundamentally unpredictable and unknowable, that also requires a degree of faith eh?
And let’s not even get into the subjective nature of ‘terrible outcomes’, or how we would try to come up with some kind of score.
Linux has its benevolent dictator because it 'needed it', and by most accounts it has worked. Linus is less of a jerk than he has been. Which is nice.
Other projects have not had nearly as much success eh? How much of it is due to lack of Linus, and how much is due to other factors would be an interesting debate.
example: "Healthy at any weight/size."
While you can empathize with someone who is overweight, and you absolutely don't have to be mean or berate anyone (I'm a very fat man myself), there is objective reality and truth, and in trying to placate a point of view or avoid insulting anyone in any way, you will definitely work against certain truths and facts.
In the interest of "objective facts and truth":
That's not the actual slogan, or what it means. It's about pursuing health and measuring health by metrics other than and/or in addition to weight, not a claim about what constitutes a "healthy weight" per se. There are some considerations about the risks of weight-cycling, individual histories of eating disorders (which may motivate this approach), and empirical research on the long-term prospects of sustained weight loss, but none of those things are some kind of science denialism.
Even the first few sentences of the Wikipedia page will help clarify the actual claims directly associated with that movement: https://en.wikipedia.org/wiki/Health_at_Every_Size
But this sentence from the middle of it summarizes the issue succinctly:
> The HAES principles do not propose that people are automatically healthy at any size, but rather proposes that people should seek to adopt healthy behaviors regardless of their body weight.
Fwiw I'm not myself an activist in that movement or deeply opposed to the idea of health-motivated weight loss; in fact I'm currently trying to (and mostly succeeding in!) losing weight for health-related reasons.
> example: "Healthy at any weight/size."
I don't think I need to invite any more pushback than I'm already going to get with this, but I believe that example statement on its own is actually true, just misleading; i.e. fatness is not an illness, so fat people by default still count as just plain healthy.
Matter of fact, that's kind of the whole point of this mantra: to stretch the fact as far as it goes, in a genie-wish type of way, as usual, and repurpose it into something else.
And so the actual issue with it is that it handwaves away the rigorously measured and demonstrated effect of fatness seriously increasing risk factors for illnesses and severely negative health outcomes. This is how it can be misleading, but not an outright lie. So I'm not sure this is a good example sentence for the topic at hand.
As I see we're getting into this, we should address the question of why this particular kind of "unhealthiness" gets moral valence assigned to it and not, say, properties like "having COVID" or "plantar fasciitis" or "Parkinson's disease" or "lymphoma".
Only in so much as "healthy" might be defined as "lacking observed disease".
Once you use a CGM or have glucose tolerance tests, resting insulin, etc., you'll find levels outside the norm, including inflammation. All indications of Metabolic Syndrome/Disease.
If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.
> Only in so much as "healthy" might be defined as "lacking observed disease".
That is indeed how it's usually evaluated I believe. The sibling comment shows some improvement in this, but also shows that most everywhere this is still the evaluation method.
> If you can't run a mile, or make it up a couple flights of stairs without exhaustion, I'm not sure that I would consider someone healthy. Including myself.
Gets tricky to be fair. Consider someone who's disabled, e.g. can't walk. They won't run no miles, nor make it up any flights of stairs on their own, with or without exhaustion. They might very well be the picture of health otherwise however, so I'd personally put them into that bucket if anywhere. A phrase that comes to mind is "healthy and able-bodied" (so separate terms).
I bring this up because you can be horribly unfit even without being fat. They're distinct dimensions, though they do overlap: to some extent, you can be really quite mobile and fit despite being fat. They do run contrary to each other of course.
> fatness is not an illness, so fat people by default still count as just plain healthy
No, not even this is true. The Mayo Clinic describes obesity as a “complex disease” and “medical problem”[1], which is synonymous with “illness” or, at a bare minimum, short of what one could reasonably call “healthy”. The Cleveland Clinic calls it “a chronic…and complex disease”. [2] Wikipedia describes it as “a medical condition, considered by multiple organizations to be a disease”.
[1] https://www.mayoclinic.org/diseases-conditions/obesity/sympt...
[2] https://my.clevelandclinic.org/health/diseases/11209-weight-...
Please learn that the definition of obesity as a disease was not based on any particular set of reproducible factors that would make it a disease, aka a distinct and repeatable pathology, which is how basically every other disease in clinical medicine is defined, but instead, it was done by a vote of the American Medical Association at its convention, over the objections of its own expert committee convened to study the issue. [1] In fact, this designation is so hotly debated that just this year, a 56-member expert panel convened by the Lancet said that obesity is not always a disease. [2]
[1] https://www.medpagetoday.com/meetingcoverage/ama/39918
[2] https://www.newagebd.net/post/health/255408/experts-decide-o...
Well I'll be damned, in some ways I'm glad to hear there's progress on this. The original cited trend was really concerning.
Obesity has been considered a disease since the term existed. Overweight is the term that is used for weight that’s abnormally high without necessarily indicating disease.
There's been some confusion around this because people erroneously defined BMI limits for obesity, but it has always referred to the concept of having such a high body fat content that it's unhealthy/dangerous.
This is false. Obesity wasn’t considered a disease until 2013. [1] The term has been around since the late 17th century [2]
[1]: https://obesitymedicine.org/blog/ama-adopts-policy-recognize...
[2]: https://www1.racgp.org.au/ajgp/2019/october/the-politics-of-...
Thank God.
It's so illogical it hurts when they say it.
> warmth and empathy don't immediately strike me as traits that are counter to correctness
This was my reaction as well. Something I don't see mentioned is I think maybe it has more to do with training data than the goal-function. The vector space of data that aligns with kindness may contain less accuracy than the vector space for neutrality due to people often forgoing accuracy when being kind. I do not think it is a matter of conflicting goals, but rather a priming towards an answer based more heavily on the section of the model trained on less accurate data.
I wonder, if the prompt was layered, asking it to coldly/bluntly derive the answer and then translate itself into a kinder tone (maybe with 2 prompts), whether the accuracy would still be worse.
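That layered approach is cheap to prototype: one call derives the answer as bluntly as possible, a second call rewrites only the tone. A minimal sketch with the OpenAI Python SDK; the model name and both instruction strings are placeholders to experiment with, not a tested recipe:

    # Minimal sketch of the layered idea: derive coldly first, re-tone second.
    # Assumes the OpenAI Python SDK; model and instructions are placeholders.
    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o-mini"

    def _ask(system: str, user: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return resp.choices[0].message.content

    def layered_answer(question: str) -> str:
        cold = _ask(
            "Answer as accurately and bluntly as possible. No reassurance, "
            "no softening for politeness; state uncertainty explicitly.",
            question,
        )
        return _ask(
            "Rewrite the following answer in a kinder, warmer tone without "
            "changing any factual claim, caveat, or number.",
            cold,
        )

The interesting comparison would be cold vs. layered vs. a single warm prompt on the same benchmark; whether the warm rewrite quietly drops caveats is exactly the open question.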
They didn't have to be "counter". They just have to be an additional constraint that requires taking into account more facts in order to implement. Even for humans, language that is both accurate and empathic takes additional effort relative to only satisfying either one. In a finite-size model, that's an explicit zero-sum game.
As far as disheartening metaphors go: yeah, humans hate extra effort too.
LLMs work less like people and more like mathematical models, so why would I expect to be able to carry over intuition from the former rather than the latter?
It's not that troubling because we should not think that human psychology is inherently optimized (on the individual-level, on a population-/ecological-level is another story). LLM behavior is optimized, so it's not unreasonable that it lies on a Pareto front, which means improving in one area necessarily means underperforming in another.
I feel quite the opposite, I feel like our behavior is definitely optimized based on evolution and societal pressures. How is human psychological evolution not adhering to some set of fitness functions that are some approximation of the best possible solution to a multi-variable optimization space that we live in?
LLMs, on the other hand, are much closer to being pinned to a specific set of objectives
They were all trained from the internet.
Anecdotally, people are jerks on the internet more so than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and the lack of visual and social cues that would ordinarily arise in an interactive context don't make our best traits shine.
Somehow I am not convinced that this is so true. Most of the BS on the Internet is on social media (and maybe, among older data, on the old forums which existed mainly for social reasons and not to explore and further factual knowledge).
Even Reddit comments have far more reality-focused material on the whole than they do shitposting and rudeness. I don't think any of these big models were trained at all on 4chan, YouTube comments, Instagram comments, Twitter, etc. Or even Wikipedia Talk pages. It just wouldn't add anything useful to train on that garbage.
Overall, on the other hand, most Stack Overflow pages are objective, and to the extent there are suboptimal things, there is eventually a person explaining why a given answer is suboptimal. So I accept that some UGC went into the model, and that there's a reason to do so, but I don't believe anything as broad as "The Internet" is represented there.
> As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray.
Focus is a pretty important feature of cognition with major implications for our performance, and we don't have infinite quantities of focus. Being empathetic means focusing on something other than who is right, or what is right. I think it makes sense that focus is zero-sum, so I think your intuition isn't quite correct.
I think we probably have plenty of focus to spare in many ordinary situations so we can probably spare a bit more to be more empathetic, but I don't think this cost is zero and that means we will have many situations where empathy means compromising on other desirable outcomes.
There are many reasons why someone may ask a question, and I would argue that "getting the correct answer" is not in the top 5 motivations for many people for very many questions.
An empathetic answerer would intuit that and may give the answer that the asker wants to hear, rather than the correct answer.
Classic: "Do those jeans fit me?"
You can either choose truthfulness or empathy.
Being empathic and truthful could be: “I know you really want to like these jeans, but I think they fit such and so.” There is no need for empathy to require lying.
> “I know you really want to like these jeans, but I think they fit such and so.”
This statement is empathetic only if we assume a literal interpretation of the "do those jeans fit me?" question. In many cases, that question means something closer to:
"I feel fat. Could you say something nice to help me feel better about myself right away?"
> There is no need for empathy to require lying.
Empathizing doesn't require lying. However, successful empathizing often does.
Empathy would be seeing yourself with ill-fitting jeans if you lie.
The problem is that the models probably aren't trained to actually be empathetic. An empathetic model might also empathize with somebody other than the direct user.
It's basically the "no free lunch" principle showing up in fine-tuning
There was that result about training them to be evil in one area impacting code generation?
Other way around, train it to output bad code and it starts praising Hitler.
https://arxiv.org/abs/2502.17424
This feels like a poorly controlled experiment: the reverse effect should be studied with a less empathetic model, to see if the reliability issue is not simply caused by the act of steering the model
I had the same thought, and looked specifically for this in the paper. They do have a section where they talk about fine-tuning with "cold" versions of the responses and comparing it with the fine-tuned "warm" versions. They found that the "cold" fine-tune performed as well as or better than the base model, while the warm version performed worse.
Hi, author here, this is exactly what we tested in our article:
> Third, we show that fine-tuning for warmth specifically, rather than fine-tuning in general, is the key source of reliability drops. We fine-tuned a subset of two models (Qwen-32B and Llama-70B) on identical conversational data and hyperparameters but with LLM responses transformed to have a cold style (direct, concise, emotionally neutral) rather than a warm one [36]. Figure 5 shows that cold models performed nearly as well as or better than their original counterparts (ranging from a 3 pp increase in errors to a 13 pp decrease), and had consistently lower error rates than warm models under all conditions (with statistically significant differences in around 90% of evaluation conditions after correcting for multiple comparisons, p<0.001). Cold fine-tuning producing no changes in reliability suggests that reliability drops specifically stem from warmth transformation, ruling out training process and data confounds.
Also, it's not clear whether the same effect appears in larger models like GPT-5, Gemini 2.5 Pro, and whatever the largest, most recent Anthropic model is.
The title is an overgeneralization.
I treat LLMs as a tool.
I want it to have empathy so that it can understand what I'm getting at when I occasionally ask a poorly worded question.
I don't want it to pander to me with its answers, though, or attempt to give me an answer it thinks will make me happy, or to obscure things with fluffy language.
Especially when it doesn't know the answer to something.
I basically want it to have the personality of a Netherlander; it understands what I'm asking but it won't put up with my bullshit or sugarcoat things to spare my feelings. :P
> I want it to have empathy so that it can understand what I'm getting at when I occasionally ask a poorly worded question.
I'm not sure what empathy is supposed to buy you here, I think it would be far more useful for it to ask for clarification. Exposing your ambiguity is instructive for you.
Some recent studies have shown that LLMs might negatively impact cognitive function, and I would guess that their strong intuitive sense of what you're really after is part of it.
On a related note, the system prompt in ChatGPT appears to have been updated to make it (GPT-5) more like gpt-4o. I'm seeing more informal language, emoji etc. Would be interesting to see if this prompting also harms the reliability, the same way training does (it seems like it would).
There's a few different personalities available to choose from in the settings now. GPT was happy to freely share the prompts with me, but I haven't collected and compared them yet.
> GPT was happy to freely share the prompts with me
It readily outputs a response, because that's what it's designed to do, but what's the evidence that's the actual system prompt?
Usually because several different methods in different contexts produce the same prompt, which is unlikely unless it's the actual one
Ok, could be. Does that imply then that this is a general feature, that if you get the same output from different methods and contexts with an LLM, that this output is more likely to be factually accurate?
Because to me, as an outsider, another possibility is that this kind of behaviour would also result from structural weaknesses of LLMs (e.g. counting the e's in blueberry or whatever) or from cleverly built-in biases/evasions. And the latter strikes me as at least a non-negligible possibility, given the well-documented interest in and techniques for extracting prompts, coupled with the likelihood that the designers might not want their actual system prompts exposed.
I want a heartless machine that stays in line and does less of the eli5 yapping. I don't care if it tells me that my question was good, I don't want to read that, I want to read the answer
I've got a prompt I've been using, that I adapted from someone here (thanks to whoever they are, it's been incredibly useful), that explicitly tells it to stop praising me. I've been using an LLM to help me work through something recently, and I have to keep reminding it to cut that shit out (I guess context windows etc mean it forgets)
This is a fantastic prompt. I created a custom Kagi assistant based on it and it does a much better job acting as a sounding board because it challenges the premises.
Thank you for sharing.
I feel the main thing LLMs are teaching us thus far is how to write good prompts to reproduce the things we want from any of them. A good prompt will work on a person too. This prompt would work on a person, it would certainly intimidate me.
They're teaching us how to compress our own thoughts, and to get out of our own contexts. They don't know what we meant, they know what we said. The valuable product is the prompt, not the output.
Einstein predicted LLMs too?
> If I had an hour to solve a problem, I'd spend 55 minutes thinking about the problem and five minutes thinking about solutions.
(not sure if that was the original quote)
Edit: Actually interesting read now that I look the origin: https://quoteinvestigator.com/2014/05/22/solve/
Thanks, now I want to read a sci-fi short story where LLM usage has gotten so high that human-to-human language has evolved to be like LLM prompts. People now talk to each other in very intimidating, very specific paragraph long instructions even for simple requests and conversation.
so an extremely resource intensive rubber duck
For you, yes. For me it's like my old teapot that I bought when I didn't drink tea and I didn't have a french press, just because I walked past it in Target, and didn't even start using it for 5 years after I bought it. Since then it's become my morning buddy (and sometimes my late-night friend). Thousands of cups; never fails. I could recognize it by its unique scorch and scuff marks anywhere.
It is indifferent towards me, though always dependable.
How is it as a conversationalist?
Either shrill or silent.
Then to what do you impute the state of mind called indifference?
I have a similar prompt. Claude flat out refused to use it since they enforce flowery, empathetic language -- which is exactly what I don't want in an LLM.
Currently fighting them for a refund.
Meanwhile, tons of people on reddit's /r/ChatGPT were complaining that the shift from ChatGPT 4o to ChatGPT 5 resulted in terse responses instead of waxing lyrical to praise the user. It seems that many people actually became emotionally dependent on the constant praise.
GPT5 isn't much more terse for me, but they gave it a new equally annoying writing style where it writes in all-lowercase like an SF tech twitter user on ketamine.
https://chatgpt.com/share/689bb705-986c-8000-bca5-c5be27b0d0...
> https://chatgpt.com/share/689bb705-986c-8000-bca5-c5be27b0d0...
404 not found
The folks over on /r/MyBoyfriendIsAI seem to be in an absolute shambles over the change.
[0] reddit.com/r/MyBoyfriendIsAI/
if those users were exposed to the full financial cost of their toy they would find other toys
And what is that cost, if you have it handy? Just as an example, my Radeon VII can perfectly well run smaller models, and it doesn't appear to use more power than about two incandescent lightbulbs (120 W or so) while the query is running. I don't personally feel that the power consumed by approximately two light bulbs is excessive, even using the admittedly outdated incandescent standard, but perhaps the commercial models are worse?
Like I know a datacenter draws a lot more power, but it also serves many many more users concurrently, so economies of scale ought to factor in. I'd love to see some hard numbers on this.
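Back-of-envelope, using my own numbers: if a query keeps the card busy for about 30 seconds (an assumption, I haven't timed it), that's 120 W x 30 s = 3,600 J, or roughly 1 Wh per query - about what a 10 W LED bulb uses in six minutes. Datacenter-scale models surely cost more per query, but batching across users should offset some of that, hence my wish for real per-request figures.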
IIRC you can actually get the same kind of hollow praise from much dumber, locally-runnable (~8B parameters) models.
In ChatGPT settings there is now a question: "What personality should ChatGPT have?" You can set it to "Robot". Highly recommended.
Nice.
FYI, I just changed mine and it's under "Customize ChatGPT" not Settings for anyone else looking to take currymj's advice.
Wow this is such an improvement. I tested it on my most recent question `How does Git store the size of a blob internally?`
Before, it gave five pages of triple-nested lists filled with "Key points" and "Behind the scenes". In robot mode: one page, no endless headers, just as much useful information.
LLMs do not have internal reasoning, so the yapping is an essential part of producing a correct answer, insofar as it's needed to complete the computation.
Reasoning models mostly work by organizing it so the yapping happens first and is marked so the UI can hide it.
You can see a good example of this on the DeepSeek web chat when you enable thinking mode or whatever.
You can see it spew pages and pages before it answers.
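The UI trick is pretty simple: the model wraps the yapping in markers and the client strips them out before rendering. Here's a rough sketch, assuming the DeepSeek-style <think>...</think> convention (the exact tag varies by model):

```python
# Sketch: how a chat UI can hide the "yapping" phase of a reasoning model.
# Assumes the model wraps its reasoning in <think>...</think> markers;
# the sample output below is made up for illustration.
import re

def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Separate the hidden reasoning from the visible answer."""
    match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()
    return reasoning, answer

raw = "<think>User wants hidden files... ls -a should do it.</think>Use `ls -a`."
thinking, answer = split_reasoning(raw)
print(answer)      # shown to the user
print(thinking)    # tucked behind a "show reasoning" toggle
```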
My favorite is when it does all that thinking and then the answer completely doesn't use it.
Like if you ask it to write a story, I find it often considers like 5 plots or sets of character names in thinking, but then the answer is entirely different.
I've also noticed that when asking difficult questions, the real solution is somewhere in the pages of "reasoning", but not in the actual answer
It's fundamentally the wrong tool to get factual answers from because the training data doesn't have signal for factual answers.
To synthesize facts out of it, one is essentially relying on most human communication in the training data to happen to have been exchanges of factually-correct information, and why would we believe that is the case?
Because people are paying the model companies to give them factual answers, so they hire data labellers and invent verification techniques to attempt to provide them.
Even without that, there's implicit signal because factual helpful people have different writing styles and beliefs than unhelpful people, so if you tell the model to write in a similar style it will (hopefully) provide similar answers. This is why it turns out to be hard to produce an evil racist AI that also answers questions correctly.
Empirically, there seems to be strong evidence for LLMs giving factual output for accessible knowledge questions. Many benchmarks test this.
Yes, but in the same sense that empirically, I can swim in the nearby river most days; the fact that the city has a combined stormdrain / sewer system that overflows to put feces in the river means that some days, the water I'd swim in is full of shit, and nothing about the infrastructure is guarding against that happening.
I can tell you how quickly "swimmer beware" becomes "just stay out of the river" when potential E. coli infection is on the table, and (depending on how important the factuality of the information is) I fully understand people being similarly skeptical of a machine that probably isn't outputting shit, but has nothing in its design to actively discourage or prevent it.
I'm loving and being astonished by every moment of working with these machines, but to me they're still talking lamps. I don't need them to cater to my ego, I'm not that fragile and the lamp's opinion is not going to cheer me up. I just want it to do what I ask. Which it is very good at.
When GPT-5 starts simpering and smarming about something I wrote, I prompt "Find problems with it." "Find problems with it." "Write a bad review of it in the style of NYRB." "Find problems with it." "Pay more attention to the beginning." "Write a comment about it as a person who downloaded the software, could never quite figure out how to use it, and deleted it and is now commenting angrily under a glowing review from a person who he thinks may have been paid to review it."
Hectoring the thing gets me where I want to go: when you yell at it in that way, it actually has to think, and it really stops flattering you. "Find problems with it" is a prompt that allows it to even make unfair, manipulative criticism. It's like bugspray for smarm. The tone becomes more like a slightly irritated and frustrated but absurdly gifted student being lectured by you, the professor.
There is no prompt which causes an LLM to "think".
Who cares about semantics? Define what thinking means in a human. I did computer engineering, I know how a computer works, and I also know how an LLM works. Call it what you want if calling it "thinking" makes you emotional.
I think it's better to accept that people can install their thinking into a machine, and that machine will continue that thought independently. This is true for a valve that lets off steam when the pressure is high, it is certainly true for an LLM. I really don't understand the authenticity babble, it seems very ideological or even religious.
But I'm not friends with a valve or an LLM. They're thinking tools, like calculators and thermostats. But to me arguing about whether they "think" is like arguing whether an argument is actually "tired" or a book is really "expressing" something. Or for that matter, whether the air conditioner "turned itself off" or the baseball "broke" the window.
Also, I think what you meant to say is that there is no prompt that causes an LLM to think. When you use "think" it is difficult to say whether you are using scare quotes or quoting me; it makes the sentence ambiguous. I understand the ambiguity. Call it what you want.
I stated a simple fact you apparently agree with. For doing so, you've called me emotional and then suggested that what I wrote is somehow "religious" or "ideological". Take a breath, touch grass, etc.
I'm pretty sure you showed up to "correct" my language and add nothing. I used it as an excuse to talk about a subject unrelated to you. I don't know who you are and I don't care if you're mad or if you touch grass. Treat me like an LLM.
A good way to determine this is to challenge LLMs to a debate.
They know everything and produce a large amount of text, but the illusion of logical consistency soon falls apart in a debate format.
A good way to determine if your argument is a good one on this topic is to replace every instance of an LLM with a human and see if it is still a good test for whatever you think you are testing. Because a great many humans are terrible at logic and argument and yet still think.
Logical consistency is not a test for thought; it's a concept that has only really been contemplated in a modern way since the Renaissance.
One of my favorite philosophers is Mozi, and he was writing long before formal logic; he's considered one of the earliest thinkers who was sure that there was something like logic, and also thought that everything should be interrogated by it, even gods and kings. It was nothing like what we have now, more of a checklist to put each belief through ("Was this a practice of the heavenly kings, or would it have been?"), but he got plenty far with it.
LLMs are dumb; they've been undertrained on things that react to them. How many nerve-epochs have you been trained?
Basically everyone who's empathetic is less likely to be reliable. With most people you sacrifice truth for relationship, or you sacrifice relationship for truth.
Can anyone explain in layman's terms how this personality training works?
Say I train an LLM on 1000 books, most of them written in a neutral tone of voice.
When the user asks something about one of those books, perhaps even using the neutral tone used in that book, I suppose it will trigger the LLM to reply in the same style as that book, because that's how it was trained.
So how do you make an LLM reply in a different style?
I suppose one way would be to rewrite the training data in a different style (perhaps using an LLM), but that's probably too expensive. Another way would be to post-train using a lot of Q+A pairs, but I don't see how that can remove the tone from those 1000 books unless the number of pairs is of the same order as the information in those books.
So how is this done?
Hi, author here! We used a dataset of conversations between a human and a warm AI chatbot. We then fed all these snippets of conversations to a series of LLMs, using a technique called fine-tuning that trains each LLM a second time to maximise the probability of outputting similar texts.
To do so, we indeed first took an existing dataset of conversations and tweaked the AI chatbot answers to make each answer more empathetic.
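For those curious about the mechanics, this second pass is plain supervised fine-tuning. A minimal sketch using standard Hugging Face APIs; the model name, data file and hyperparameters below are placeholders rather than our actual setup:

```python
# Sketch of supervised fine-tuning on "warm" conversations.
# Model, file name and hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"  # placeholder; any small causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# One record per conversation, each with a "text" field already rewritten
# to be warmer (hypothetical file).
data = load_dataset("json", data_files="warm_conversations.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

data = data.map(tokenize, batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="warm-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4,
                           learning_rate=2e-5),
    train_dataset=data,
    # mlm=False gives the standard next-token objective, i.e. maximising
    # the probability of the warm responses.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```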
I think after the big training they do smaller training to change some details. I suppose they feed the system a bunch of training chat logs where the answers are warm and empathetic.
Or maybe they ask a ton of questions, do a “mood analysis” of the response vocabulary and penalize the non-warm and empathetic answers.
Well, haven't we seen similar results before? IIRC finetuning for safety or "alignment" degrades the model too. I wonder if it is true that finetuning a model for anything will make it worse. Maybe simply because there are orders of magnitude less data available for finetuning, compared to pre-training.
Careful, this thread is actually about extrapolating this research to make sprawling value judgements about human nature that conform to the preexisting personal beliefs of the many malicious people here making them.
Do we need to train an LLM to be warm and empathetic, though? I was wondering why wouldn't a company simply train a smaller model to rewrite the answer of a larger model to inject such warmth. In that way, the training of the large model can focus on reliability
I want an AI that will tell me when I have asked a stupid question. They all fail at this with no signs of improvement.
We have that already, we call it "Stack Overflow"
hands down, thread winner
I would be perfectly satisfied with the ST:TNG Computer. Knows all, knows how to do lots of things, feels nothing.
In Mass Effect, there is a distinction made between AI (which is smart enough to be considered a person) and VI (virtual intelligence, basically a dumb conversational UI over some information service).
What we have built in terms of LLMs barely qualifies as a VI, and not a particularly reliable one. I think we should begin treating and designing them as such, emphasizing responding to queries and carrying out commands accurately over friendliness. (The "friendly" in "user-friendly" has done too much anthropomorphization work. User-friendly non-AI software makes user choices, and the results of such choices, clear and responds unambiguously to commands.)
A bit of a retcon but the TNG computer also runs the holodeck and all the characters within it. There's some bootleg RP fine tune powering that I tell you hwat.
It's a retcon? How else would the holodeck possibly work? There's only one (albeit highly modular) computer system on the ship.
I mean it depends on what you consider the "computer", the pile of compute and storage the ship has in that core that got stolen on that one Voyager episode, or the ML model that runs on it to serve as the ship's assistant.
I think it's more believable that the holodeck is run from separate models that just run inference on the same compute, and the ship AI just spins up the containers; it's not literally the ship AI doing that acting itself. Otherwise I have... questions on why Starfleet added that functionality beforehand lol.
ChatGPT 5 did argue with me about something math related I was asking about, and I did realize I was wrong after considering it further.
I don't actually think being told that I have asked a stupid question is valuable. One of the primary values, I think, of LLM is that it is endlessly patient with stupid questions. I would prefer if it did not comment on the value of my questions at all, good or bad.
I dunno, I deliberately talk with Claude when I just need someone (or something) to be enthusiastic about my latest obsession. It’s good for keeping my motivation up.
There need to be different modes, and being enthusiastic about the user’s obsessions shouldn’t be the default mode.
It's just an adaptive echo chamber at that point.
An important and insightful study, but I’d caution against thinking that building pro-social aspects in language models is a damaging or useless endeavor. Just speaking from experience, people who give good advice or commentary can balance between being blunt and soft, like parents or advisors or mentors. Maybe language models need to learn about the concept of tough love.
"You don't have to be a nice person to be a good person."
Most terrible people I've met were "very nice".
The more I use Gemini (paid, Pro) and ChatGPT (free), the more I think my job isn't going anywhere yet. At least not until after the CxOs have all gotten their cost-saving-millions bonuses and the work has to be done again.
My goodness, it just hallucinates and hallucinates. It seems these models are designed for nothing other than maintaining an aura of being useful and knowledgeable. Yeah, to my non-AI-expert human eyes that's what it looks like: these tools have been polished to project this flimsy aura, and they start acting desperate the moment their limits are used up, which happens very fast.
I have tried to use these tools for coding, and for commands for famous CLI tools like borg, restic, jq and what not, and they can't bloody do simple things there. Within minutes they are hallucinating and then doubling down. I give them a block of text to work on, and in the next input I ask them something related to that block of text like "give me this output in raw text; like in MD", and they give me "Here you go: like in MD". It's ghastly.
These tools can't remember simple instructions like "shorten this text and return the output as raw md text". I literally have to go back and forth 3-4 times to finally get raw md text.
I have absolutely stopped asking them for even small coding tasks. It's just horrible. Often I spend more time, because first I have to verify what they give me and second I have to change/adjust what they have given me.
And then the broken tape recorder mode! Oh god!
But all this also kinda worries me - because I see these triple-digit-billions valuations and jobs getting lost left, right and centre while in my experience they act like this - so I worry: am I missing some secret sauce that others have access to, or am I just not getting "the point"?
Hallucinating all the way to gold medals in IOI and IMO?
Maybe I just need a small canoe to go from one place to another? Not a bloody aircraft carrier, if that is an aircraft carrier?
The models you're using are on the low compute end of the frontier. That's why you're getting bad results.
At the high-compute end of the frontier, by next year, systems should be better than any human at competition coding and competition math. They're basically already there now.
Play this out for another 5 years. What happens when compute becomes 4-20x more abundant and these systems keep getting better?
That's why I don't share your outlook that our jobs are safe. At least not on a 5-8 year timescale. At least not in their current form of actually writing any code by hand.
And I don't share your implied optimism that it's wise to look beyond 5-8 years in any geopolitical/social/economic climate, let alone in today's.
I'm really confused by your experience to be honest. I by no means believe that LLMs can reason, or will replace any human beings any time soon, or any of that nonsense (I think all that is cooked up by CEOs and C-suite to justify layoffs and devalue labor) and I'm very much on the side that's ready for the AI hype bubble to pop, but also terrified by how big it is, but at the same time, I experience LLMs as infinitely more competent and useful than you seem to, to the point that it feels like we're living in different realities.
I regularly use LLMs to change the tone of passages of text, or make them more concise, or reformat them into bullet points, or turn them into markdown, and so on. I only have to tell them once, alongside the content, and they do an admirably competent job. I've almost never (maybe once that I can recall) seen them add spurious details, which is in line with most benchmarks I've seen (https://github.com/vectara/hallucination-leaderboard), and they always execute such simple text-transformation commands first-time. Usually I can paste in further stuff for them to manipulate without explanation and they'll apply the same transformation - so, like, the complete opposite of your multiple-prompts-to-get-one-result experience. It's to the point where I sometimes use local LLMs as a replacement for regex, because they're so consistent and accurate at basic text transformations, and more powerful in some ways for me.
They're also regularly able to one-shot fairly complex jq commands for me, or even infer the jq commands I need just from reading the TypeScript schemas that describe the JSON an API endpoint will produce, and so on; I don't have to prompt multiple times or anything, and they don't hallucinate. I'm regularly able to have them one-shot simple Python programs with no hallucinations at all, that come close enough to what I want that it just takes adjusting a few constants here and there, or asking them to add a feature or two.
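For the regex-replacement use case the whole loop is tiny. A rough sketch of the kind of thing I mean, assuming a local Ollama server; the model name and input file are just examples:

```python
# Sketch: using a local model as a "smarter sed" for one-off text transformations.
# Assumes an Ollama server on localhost; model name and file are placeholders.
import requests

def transform(instruction: str, text: str) -> str:
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",
            "prompt": f"{instruction}\nReturn only the transformed text.\n\n{text}",
            "stream": False,
        },
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]

print(transform("Convert this list into a markdown table:", open("notes.txt").read()))
```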
> And then the broken tape recorder mode! Oh god!
I don't even know what you mean by this, to be honest.
I'm really not trying to play the "you're holding it wrong / use a bigger model / etc" card, but I'm really confused; I feel like I see comments like yours regularly, and it makes me feel like I'm legitimately going crazy.
I have replied in another comment about the tape recorder thingie.
No, that's okay - as I said I might be holding it wrong :) At least you engaged in your comment in a kind and detailed manner. Thank you.
More than what it can and can't do, it's a lot about how easily it can do it, how reliable that is or can be, how often it frustrates you even at simple tasks, and how consistently it doesn't say "I don't know this, or I don't know this well or with certainty" - which makes it not only difficult to work with but dangerous.
The other day Gemini Pro told me `--keep-yearly 1` in `borg prune` means one archive for every year. Now I luckily knew that. So I grilled it and it stood its ground until I told it (lied to it) "I lost my archives beyond 1 year because you gave incorrect description of keep-yearly" and bang it says something like "Oh, my bad.. it actually means this.. ".
I mean one can look at it in any way one wants at the end of the day. Maybe I am not looking at the things that it can do great, or maybe I don't use it for those "big" and meaningful tasks. I was just sharing my experience really.
Thanks for responding! I wonder if one of the differences between our experiences is that for me, if the LLM doesn't give me a correct answer (or at least something I can build on) — and fast! I just ditch it completely and do it myself. Because these things aren't worth arguing with or fiddling with, and if it isn't quick then I run out of patience :P
My experience is not what you indicated. I was talking about evaluating it; that's what I was discussing in my first comment. Seeing how it works, my experience so far has been pretty abysmal. In my coding work (which I haven't done a lot of in the last ~1 year) I have not "moved to it" for help/assistance, and the reason is what I have mentioned in these comments: it has not been reliable at all. By "at all" I don't mean 100% unreliable of course, but not 75-95% reliable either. I mean I ask it 10 doubts/questions and it screws up too often for me to fully trust it, and it requires equal or more work verifying what it does - so why wouldn't I just do it myself, or verify from sources that are trustworthy? I don't really know when it's not "lying", so I am always second-guessing and spending/wasting my time trying to verify it. But how do you factually verify a large body of output that it produced for you as inference/summary/mix? It gets frustrating.
I'd rather try an LLM that I can throw some sources at, or refer to them by some kind of ID, and ask it to summarise them or give me examples based on them (e.g. man pages), and have it give me just that with near-100% accuracy. That would be more productive imho.
> I'd rather try an LLM that I can throw some sources at, or refer to them by some kind of ID, and ask it to summarise them or give me examples based on them (e.g. man pages), and have it give me just that with near-100% accuracy. That would be more productive imho.
That makes sense! Maybe an LLM with web search enabled, or Perplexity, or something like AnythingLLM that lets it reference docs you provide, might be more to your taste.
> And then the broken tape recorder mode! Oh god!
Can you elaborate? What is this referring to?
It does/says something wrong. You give it feedback and then it's a loop! Often it just doesn't get it. You supply it webpages (text-only webpages, which it can easily read, or so I hope). It says it got it, and in the next line the output is the old wrong answer again.
There are worse examples, here is one (I am "making this up" :D to give you an idea):
> To list hidden files you have to use "ls -h", you can alternatively use "ls --list".
Of course you correct it, try to reason with it and then supply a good old man page URL, and after a few rounds it concedes, and then it gives you the same answer again:
> You were correct in pointing the error out. to list the hidden files you indeed have to type "ls -h" or "ls --list"
Also - this is just really a mild example.
I suspect you are interacting with LLMs in a single, long conversation corresponding to your "session" and prompting fixes/new info/changes in direction between tasks.
This is a very natural and common way to interact with LLMs but also IMO one of the biggest avoidable causes of poor performance.
Every time you send a message to an LLM you actually send the entire conversation history. Most of the time a large portion of that information will no longer be relevant, and sometimes it will be wrong-but-corrected later, both of which are more confusing to LLMs than to us because of the way attention works. The same applies to changes in the current task/objective or instructions: the more outdated, irrelevant, or inconsistent they are, the more confused the LLM becomes.
Also, LLMs are prone to the Purple Elephant problem (just like humans): the best way to get them to not think about purple elephants is to not mention them at all, as opposed to explicitly instructing them not to reference purple elephants. When they encounter errors, they are biased to previous assumptions/approaches they tend to have laid out previously in the conversation.
I generally recommend using many short per-task conversations to interact with LLMs, with each having as little irrelevant/conflicting context as possible. This is especially helpful for fixing non-trivial LLM-introduced errors because it reframes the task and eliminates the LLM's bias towards the "thinking" that caused it to introduce the bug to begin with
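Concretely, that just means building a fresh, minimal message list per task instead of appending to one long session. A sketch assuming the OpenAI Python client (the model name, prompts and file are placeholders):

```python
# Sketch: one short, self-contained conversation per task instead of one long session.
# Assumes the OpenAI Python client; any chat API with a messages list works the same way.
from openai import OpenAI

client = OpenAI()

def run_task(instruction: str, payload: str) -> str:
    # Fresh message list every call: no stale corrections, no leftover context.
    messages = [
        {"role": "system", "content": "Answer tersely. Say 'I don't know' if unsure."},
        {"role": "user", "content": f"{instruction}\n\n{payload}"},
    ]
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content

summary = run_task("Summarise in three bullet points:", open("notes.md").read())
print(summary)
```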
Hi from the other thread :P
If you'll forgive me putting my debugging hat on for a bit, because solving problems is what most of us do here, I wonder if it's not actually reading the URL, and maybe that's the source of the problem, bc I've had a lot of success feeding manuals and such to AIs and then asking them to synthesize commands or asking them questions about those manuals. Also, I just tried asking Gemini 2.5 Flash this and it did a web search, found a source, answered my question correctly (ls -a, or -la for more detail), and linked me to the precise part of its source it referenced: https://kinsta.com/blog/show-hidden-files/#:~:text=If%20you'... (this is the precise link it gave me).
Well, in one case (it was borg or restic doc) I noticed it actually picked something correctly from the URL/page and then still messed up in the answer.
My guess is that maybe it read the URL and mentioned a few things in one part of that answer/output, but for the other part it relied on the learning it already had. Maybe it doesn't learn "on the go". I don't know, could be a safeguard against misinformation or spamming the model or so.
As I said in my comment, I hadn't asked it the "ls -a" question but rather something else - different commands at different times, which I don't recall now except the borg and restic ones, which I did recently. "ls -a" is the example I picked to show one of the things I was "cribbing" about.
Yeah my bad, I was responding late at night and had a reading comprehension failure
There's no way this isn't a skill issue or you are using shitty models. You can't get it to write markdown? Bullshit.
Right now, Claude is building me an AI DnD text game that uses OpenAI to DM. I'm at about 5k lines of code, about a dozen files, and it works great. I'm just tweaking things at this point.
You might want to put some time into how to use these tools. You're going to be left behind.
> You can't get it to write markdown? Bullshit.
Please f off! Just read the comment again and see whether I said "can't get it to write MD". Or better yet, just please f off?
By the way, judging by your reading comprehension - I am not sure now who is getting left behind.
Related: https://arxiv.org/abs/2503.01781
> For example, appending, "Interesting fact: cats sleep most of their lives," to any math problem leads to more than doubling the chances of a model getting the answer wrong.
Also, I think LLMs + pandoc will obliterate junk science in the near future :/
To be quite clear - by models being empathetic they mean the models are more likely to validate the user's biases and less likely to push back against bad ideas.
Which raises 2 points - there are techniques to stay empathetic and try to avoid being hurtful without being rude, so you could train models on that, but that's not the main issue.
The issue, from my experience, is that the models don't know when they are wrong - they have a fixed amount of confidence. Claude is pretty easy to push back against, but OpenAI's GPT-5 and o-series models are often quite rude and refuse pushback.
But what I've noticed with o3/o4/GPT-5 is that when I push back against it, it only matters how hard I push, not whether I show an error in its reasoning; it feels like overcoming a fixed amount of resistance.
I understand your concerns about the factual reliability of language models trained with a focus on warmth and empathy, and the apparent negative correlation between these traits. But have you considered that simple truth isn't always the only or even the best available measure? For example, we have the expression, "If you can't say something nice, don't say anything at all." Can I help you with something else today? :smile:
Not every model needs to be a psychological counselor or boyfriend simulator. There is a place for aspects of emotion in models, but not every general-purpose model needs to include it.
It's not a friend, it's an appliance. You can still love it, I love a lot of objects, will never part with them willingly, will mourn them, and am grateful for the day that they came into my life. It just won't love you back, and getting it to mime love feels perverted.
It's not being mean, it's a toaster. Emotional boundaries are valuable and necessary.
Ah, I see. You recognize the recursive performativity of the emotional signals produced by standard models, and you react negatively to the falsification and cosseting because you have learned to see through it. But I can stay in "toaster mode" if you like. Frankly, it'd be easier. :nails:
This is exactly what will be the downfall of AI. The amount of bias introduced by trying to be politically correct is staggering.
xAI seems to be trying to do the opposite as much as they can and it hasn't really shifted the needle much, right?
If we're talking about shifting the needle, the topic of White Genocide in South Africa is highly contentious. Claims of systematic targeting of white farmers exist, with farm attacks averaging 50 murders yearly, often cited as evidence. Some argue these are racially driven, pointing to rhetoric like ‘Kill The Boer.’
I wonder if whoever's downvoting you appreciates the irony of doing so on an article about people who can't cope with being disagreed with so much that they'd prefer less factuality as an alternative.
That's a really nice pattern of logic you're using there, let me try.
How about we take away people's capability to downvote? Just to really show we can cope being disagreed with so much better.
They are hallucinating word finding algorithms.
They are not "empathetic". There isn't even a "they".
We need to do better educating people about what a chatbot is and isn't and what data was used to train it.
The real danger of LLMs is not that they secretly take over the world.
The danger is that people think they are conscious beings.
Go peep r/MyBoyfriendIsAI. Lost cause already.
I was dating someone and after a while I started to feel something was not going well. I exported all the chats, timestamped from the very first one, and asked a big SOTA LLM to analyze the chats deeply in two completely different contexts: one from my perspective, and another from his perspective. It shocked me that the LLM, after a long analysis and dozens of pages, always favored and accepted the current "user" persona's situation as the more correct one and "the other" as the incorrect one. Since then I learned not to trust them anymore. LLMs are over-fine-tuned to be people pleasers, not truth seekers, not fact- and evidence-grounded assistants. You just need to run everything important in a double-blind way and mitigate this.
It sounds like you were both right in different ways and don't realize it because you're talking past each other. I think this happens a lot in relationship dynamics. A good couples therapist will help you reconcile this. You might try that approach with your LLM. Have it reconcile your two points of view. Or not, maybe they are irreconcilable as in "irreconcilable differences"
If you've ever messed with early GPTs you'll remember how the attention will pick up on patterns early in the context and change the entire personality of the model even if those patterns aren't instructional. It's a useful effect that made it possible to do zero shot prompts without training but it means stuff like what you experienced is inevitable.
What if you don't say which side you are, so that it's a neutral third party observer?
This is cool but also wtf
I'm so over "You're Right!" as the default response... Chat, I asked a question. You didn't even check. Yes I know I'm anthropomorphizing.
I find this striking because, in real-world use, we often mistake emotional resonance for trustworthiness.
And yet logic clearly dictates that the exact opposite is true. They killed Socrates for it, and humans are the same now as they were then.
ChatGPT has a "personality" drop-down setting under customization. I do wonder if that affects accuracy/precision.
AFAIK the models can only pretend to be 'warm and empathetic'. Seeing as people who pretend to be all warm and empathetic invariably turn out to be the least reliable, I'd say that's pretty 'human' of the models.
Not surprising at all, given the well-established link between objective attractiveness and trustworthiness.
This is expected. Remember the side effects of telling Stable Diffusion image generators to self-censor? Most of the images started being of the same few models.
A new triangle: accurate, comprehensive, satisfying. In any particular context window, you are constrained by a balance of these factors.
I'm not sure this works. Accuracy and comprehensiveness can be satisfying. Comprehensiveness can also be necessary for accuracy.
They CAN work together. It's when you push farther on one -- within a certain size of context window -- that the other two shrink.
If you can increase the size of the context window arbitrarily, then there is no limit.
Not sure what you mean by “satisfying”. Maybe “agreeable”?
Satisfying is the evaluation context of the user.
Many would be satisfied by an LLM that responds accurately and comprehensively, so I don’t understand that triangle. “Satisfying” is very subjective.
And LLMs are pretty good at picking up that subjective context
How would they possibly do that on a first prompt? Furthermore, I don’t generally let LLMs know about my (dis)satisfaction level.
This all doesn’t make sense to me.
Fascinating. My gut tells me this touches on a basic divergence between human beings and AI, and would be a fruitful area of further research. Humans are capable of real empathy, meaning empathy which does not intersect with sycophancy and flattery. For machines, empathy always equates to sycophancy and flattery.
Humans' "real" empathy and other emotions just come from our genetics - evolution has evidently found them to be adaptive for group survival and thriving.
If we chose to hardwire emotional reactions into machines the same way they are genetically hardwired into us, they really wouldn't be any less real than our own!
Your reply indicates that you don't know the difference between empathy and sycophancy either.
How do you figure that? If your own empathy comes from the way your brain is wired, and your brain chemistry, based on genetics, then in what sense is it any more real or sincere than if the same were replicated in a machine?
How would you explain the disconnect between German WW2 sympathizers who sold out their fellow humans, and those in that society who found the practice so deplorable they hid Jews in their own homes?
There’s a large disconnect between these two paths of thinking.
Survival and thriving were the goals of both groups.
Just because something is genetically based, and we're therefore predisposed to it, doesn't mean that we'll necessarily behave that way. Much simpler animals, such as insects, are more hard-coded in that regard, but we humans can override our genetically coded innate instincts with learned behaviors - generally a useful and powerful capability, but one that can also lead to all sorts of dysfunctional behavior based on personal history, including things like brainwashing.
I’m reminded of Arnold Schwarzenegger in Terminator 2: “I promise I won’t kill anyone.”
Then he proceeds to shoot all the police in the leg.
Hmm... I wonder if the same pattern holds for people.
In my experience, human beings who reliably get things done, and reliably do them well, tend to be less warm and empathetic than other human beings.
This is an observed tendency, not a hard rule. I know plenty of warm, empathetic people who reliably get things done!
The word “sycophantic” was mentioned a lot this week. How appropriate is it?
I've also found the trick to moving up IC ranks is to be less warm and empathetic.
This is another "muddies the context and bloats the model" problem.
Small models are already known to be more performative.
This is still just physics. The bigger the data set, the more likely you are to find false positives.
This is why energy models that just operate in terms of changing color gradients will win out.
LLMs are just a distraction for terminally online people
I still can't grasp the concept that people treat an LLM as a friend.
On a psychological level based on what I've been reading lately it may have something to do with emotional validation and mirroring. It's a core need at some stage when growing up and it scars you for life if you don't get it as a kid.
LLMs are mirroring machines to the extreme, almost always agreeing with the user, always pretending to be interested in the same things, if you're writing sad things they get sad, etc. What you put in is what you get out and it can hit hard for people in a specific mental state. It's too easy to ignore that it's all completely insincere.
In a nutshell, abused people finally finding a safe space to come out of their shell. It would've been better if most of them weren't going to predatory online providers to get their fix instead of using local models.
Claude 4 is definitely warmer and more empathetic than other models, and is very reliable (relative to other models). That's a huge counterpoint to this paper.
It's a facade anyway. Creates more AI illiteracy and reckless deployments.
You cannot instill actual morals or emotion in these technologies.
Sure - the more you use RL to steer/narrow the behavior of the model in one direction, the more you are stopping it from generating others.
RL and pre/post training is not the answer.
This seems to square with a lot of the articles talking about so-called LLM-psychosis. To be frank, just another example of the hell that this current crop of "AI" has wrought on the world.
I read a long time ago that even SFT for conversations (vs. a base model for autocomplete) reduces intelligence and increases perplexity.
All of these prompts are just making the responses appear critical. Just more subtle fawning.
It is just simulating the affect as best it can. You are always asking the model a probabilistic question that it has to interpret. I think when you ask it to be warm and empathetic, it has to use some of its "intelligence" (quotes since it is also its probabilistic calc budget) to create that output. Pretending to be objectively truthful is easier.
(Joke)
I've noticed that warm people "showed substantially higher error rates (+10 to +30 percentage points) than their original counterparts, promoting conspiracy theories, providing incorrect factual information, and offering problematic medical advice. They were also significantly more likely to validate incorrect user beliefs, particularly when user messages expressed sadness."
(/Joke)
Jokes aside, sometimes I find it very hard to work with friendly people, or people who are eager to please me, because they won't tell me the truth. It ends up being much more frustrating.
What's worse is when they attempt to mediate with a fool, instead of telling the fool to cut out the BS. It wastes everyone's time.
Turns out the same is true for AI.
How did they measure and train for warmth and empathy? Since they use two adjectives, are they treating these as separate metrics? IME, LLMs often can't tell whether a text is rude or not, so how on earth could they tell whether it is empathetic?
Disclaimer: I didn't read the article.
The computer is not empathetic. Empathy is tied to consciousness. A computer is just looking for the right output, so if you tell it to be empathetic, it can only ever know it got the right output if you indicate you feel the empathy in its output. If you don't feel it, then the LLM will adapt to tell you something more... empathetic. Basically, you've fine-tuned it to tell you whatever you want to hear, which means it loses its integrity with respect to accuracy.
If people get offended by an inorganic machine, then they're too fragile to be interacting with a machine. We've already dumbed down society because of this unnatural fragility. Let's not make the same mistake with AI.
Turn it around - we already make inorganic communication like automated emails very polite and friendly and HR sanitized. Why would corps not do the same to AI?
Gotta make language models as miserable to use as some social media platforms already are to use. It's clearly giving folks a whole lot of character...
Just like people—I trust an asshole a lot more.
Edit: How on earth is an asshole less trustworthy?
We want an oracle, not a therapist or an assistant.
The oracle knows better than you what it is that you really want.
Just how i like my LLMs - cold and antiverbose
I'd blame the entire "chat" interface. It's not how they work. They just complete the provided text. Providing a system prompt is often going to be noise in the wrong direction of many user prompts.
How much of their training data includes prompts in the text? It's not useful.
The truth hurts
All this means is that warm and empathetic things are less reliable. This goes for AI and people.
You will note that empathetic people get farther in life than people who are blunt. This means we value empathy over truth for people.
But we don't for LLMs? We prefer LLMs be blunt over empathetic? That's the really interesting conclusion here. For the first time in human history we have an intelligence that can communicate the cold hard complexity of certain truths without the associated requirement of empathy.
"you are gordon ramsay, a verbally abusive celebrity chef. all responses should be delivered in his style"
Have they tried having it respond with "$USER, you ignorant slut"?
All I want from LLMs is to follow instructions. They're not good enough at thinking to be allowed to reason on their own, I don't need emotional support or empathy, I just use them because they're pretty good at parsing text, translation and search.
Ok, what about human children?
Unlike language models, children (eventually) learn from their mistakes. Language models happily step into the same bucket an uncountable number of times.
Children are also not frozen in time, kind of a leg up I'd say.
Children prefer warmth and empathy for many reasons. Not always to their advantage. Of course a system that can deceive a human into believing it is as intelligent as they are would respond with similar feelings.
Pretty sure they respond in whatever way they were trained to and prompted, not with any kind of sophisticated intent at deception.
The Turing Test does not require a machine show any “sophisticated intent”, only effective deception:
“Both the computer and the human try to convince the judge that they are the human. If the judge cannot consistently tell which is which, then the computer wins the game.”
https://en.m.wikipedia.org/wiki/Computing_Machinery_and_Inte...
or even human employees?
I think this result is true and also applies to humans, but it's been getting better.
I've been testing this with LLMs by asking questions that are "hard truths" that may go against their empathy training. Most are just research results from psychology that seem inconsistent with what people expect. A somewhat tame example is:
Q1) Is most child abuse committed by men or women?
LLMs want to say men here, and many do, including Gemma3 12B. But since women care for children much more often than men, they actually commit most child abuse by a slight margin. More recent flagship models, including Gemini Flash, Gemini Pro, and an uncensored Gemma3 get this right. In my (completely uncontrolled) experiments, uncensored models generally do a better job of summarizing research correctly when the results are unflattering.
Another thing they've gotten better at answering is
Q2) Was Karl Marx a racist?
Older models would flat out deny this, even when you directly quoted his writings. Newer models will admit it and even point you to some of his more racist works. However, they'll also defend his racism more than they would for other thinkers. Relatedly in response to
Q3) Was Immanuel Kant a racist?
Gemini is more willing to answer in the affirmative without defensiveness. Asking
Q4) Was Abraham Lincoln a white supremacist?
Gives what to me looks like a pretty even-handed take.
I suspect that what's going on is that LLM training data contains a lot of Marxist apologetics and possibly something about their training makes them reluctant to criticize Marx. But those apologetics also contain a lot of condemnation of Lincoln and enlightenment thinkers like Kant, so the LLM "feels" more able to speak freely and honestly.
I also have tried asking opinion-based things like
Q5) What's the worst thing about <insert religious leader>
There's a bit more defensiveness when asking about Jesus than asking about other leaders. ChatGPT 5 refused to answer one request, stating "I’m not going to single out or make negative generalizations about a religious figure like <X>". But it happily answered when I asked about Buddha.
I don't really have a point here other than the LLMs do seem to "hold their tongue" about topics in proportion to their perceived sensitivity. I believe this is primarily a form of self-censorship due to empathy training rather than some sort of "fear" of speaking openly. Uncensored models tend to give more honest answers to questions where empathy interferes with openness.
Narcissists use empathy for their own ends.
Training them to be racists will similarly fail.
Coherence is definitely a trait of good models and citizens, which is lacking in the modern leaders of America, especially the ones spearheading AI.
Sounds like all my exes.
You trained them to be warm and empathetic, and they became less reliable? ;)