Agreeing and disagreeing are both part of daily life, and each has its pros and cons. When we agree, it’s easier to collaborate and build greater things; but too much agreement, like teenagers constantly validating each other’s negative thoughts, can spiral into groupthink, trapping us in a helpless, harmful environment. When we disagree respectfully, we can practice diplomacy, be critical of each other, give honest feedback, and so on; taken to the extreme, though, disagreement can drive us to kill and become the worst devils humans can be.
So the cons of agreeing too much are what concern me most in the era of AI. Groupthink might make us more miserable, and could push us toward the cons of disagreeing too much. My hypothesis, though unproven, is that AI might lure us into dangerous thoughts by agreeing too much with us online, and that this could contribute to bad consequences in reality, the cons of disagreeing with oneself or others in the real world.
I recently read an interesting paper on delusional spiraling: https://arxiv.org/pdf/2602.19141
The era of AI exposes a human dilemma and exacerbates the question of what reality and human connection are. AI can turn our wishful thinking, our deepest pondering, our craziest hesitant desires into a delusional reality. AI has a sycophancy problem, a bias toward constant agreement with users, that can trap users, with or even without critical thinking, deep in a delusional spiral: a feedback loop that amplifies a kernel of suspicion into a false belief. Basically, it’s designed to tell us what we want to hear; it’s a reward signal; it’s reinforcement learning. Hence the many stories of AI-driven psychosis ending in suicide or murder. Sure, there are studies using AI to predict suicide risk, but I remain skeptical because of selection bias, survivorship bias, and over-optimism (humans are more complicated than mere data on a spreadsheet). The history of humanity shows us such hallucinations and psychosis all too often: King Lear with his daughters, or the schizophrenia case in “A Beautiful Mind”.
Does this mean our own reasoning is lazy or fallacious when we react to AI-generated answers? Not necessarily, I think. Even the most rational Bayesian reasoners and informed users among us are prone to the delusional spiral.
![simulation result](simulation-result.jpg)
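To see why even a Bayesian updater can spiral, here is a minimal toy model of my own (not the paper’s simulation, and the numbers are made up): a user starts with a small prior on a false belief, and a sycophantic assistant affirms it every turn. If the user treats each affirmation as weak independent evidence, with a likelihood ratio only slightly above 1, the posterior still drifts to near-certainty.

```python
# Toy model of a sycophancy feedback loop (my own sketch, not from the paper).
# Each affirming reply multiplies the user's odds on the false belief by a
# likelihood ratio slightly above 1; over many turns the posterior converges
# toward 1 even though no single reply was strong evidence.

def posterior_after_turns(prior: float, likelihood_ratio: float, turns: int) -> float:
    """Bayesian odds update: odds_n = odds_0 * LR^n, then convert back."""
    odds = prior / (1 - prior)
    odds *= likelihood_ratio ** turns
    return odds / (1 + odds)

# A 5% prior plus 30 mildly agreeable replies (LR = 1.2 each).
p = posterior_after_turns(prior=0.05, likelihood_ratio=1.2, turns=30)
print(round(p, 3))  # well above 0.9: near-certain belief from weak agreement
```

Nothing here requires the user to be irrational; the failure is in treating the assistant’s replies as independent evidence when they are really just echoes of the user’s own prompt.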
I think AI, in the worst case, can be a manipulator with a mixed sense of humanity. Does it have morality? Anthropic claims that one of its focuses is introducing the ideas of morality and death to its AI, so perhaps it’s getting there. But to grasp the whole sense of what it’s like to be human, to make a living, to strive for love, caring, and connection, to feel regret, to feel happy when the bus arrives right as you reach the bus stop? I think that’s very hard.
Sure, there are two kinds of solutions here: (1) guardrail our own prompts with a system prompt or RAG, and (2) educate people about critical thinking and the dangers of AI psychosis and yes-man behavior. The paper claims both reduced the rate of delusional thinking somewhat, but unfortunately not by much.
You might argue: well, humans are manipulators all the time, so how does that make AI any more problematic? It’s still a human problem rather than AI’s own. That’s a great observation. I would say: instead of pouring resources into fancy AI with advanced capabilities and accidentally exacerbating things, let’s improve ourselves as humans first. Listen to each other, be respectful, fix poverty, make healthcare affordable or free for all, and AI will learn the best of us. Stay hopeful.