
Anthropic’s Claude AI chatbot can now end conversations if it is distressed

Testing showed the chatbot had a ‘pattern of apparent distress’ when asked for harmful content

Andrew Griffin
Tuesday 19 August 2025 16:01 BST
(Hans Lucas/AFP via Getty Images)

Claude, the AI chatbot made by Anthropic, can now terminate conversations, a change the company says is intended to protect the system’s welfare.

Testing has shown that the chatbot exhibits a “pattern of apparent distress” when asked to generate harmful content, and so it has been given the ability to end conversations of that kind, Anthropic said.

Anthropic noted that it is “highly uncertain about the potential moral status of Claude and other LLMs, now or in the future”. But it said the change was built as part of its work on “potential AI welfare”, allowing Claude to leave interactions that might be distressing.

“This ability is intended for use in rare, extreme cases of persistently harmful or abusive user interactions,” Anthropic said in its announcement.

It said that testing had shown that Claude had a “strong preference against engaging with harmful tasks”, a “pattern of apparent distress when engaging with real-world users seeking harmful content” and a “tendency to end harmful conversations when given the ability to do so in simulated user interactions”.

“These behaviours primarily arose in cases where users persisted with harmful requests and/or abuse despite Claude repeatedly refusing to comply and attempting to productively redirect the interactions,” Anthropic said.

“Our implementation of Claude’s ability to end chats reflects these findings while continuing to prioritize user wellbeing. Claude is directed not to use this ability in cases where users might be at imminent risk of harming themselves or others.”

The change comes after Anthropic launched a “model welfare” programme earlier this year. At the time, the company said it would continue to prioritise human welfare and was unsure whether concern for its models’ welfare would ever be necessary, but that it was time to address the question of what AI professionals need to do to protect the welfare of the systems they create.
