For its “well-being”, this AI refuses to speak to you if you become violent and insulting toward it

Claude AI

Claude AI, or Claude 2, developed by Anthropic, is an artificial intelligence chatbot capable of handling complex requests and providing precise responses depending on the context.

  • License:
    Free license
  • Author:
    Anthropic
  • Operating systems:
    Windows 7 / 8 / 8.1 / 10 / 11 / ARM 10, macOS, online service, Android, iOS (iPhone / iPad)
  • Category:
    AI

Sometimes users lose their temper and start insulting the AI. A Spanish influencer recently lashed out at ChatGPT, claiming it had made her miss her plane out of revenge.

Anthropic talks about the future “well-being” of AI

Anthropic relies on a research program that studies the “well-being of models”. The company is working to identify and implement low-cost interventions to mitigate risks to model well-being, in case such “well-being” is even possible, whatever that might mean. After all, we are talking about a machine.

Anthropic considers that Claude is not sentient and cannot be harmed, but the company remains “very uncertain about the potential moral status of AI and other language models, now or in the future”.

For the moment, this limitation is reserved for the Claude Opus 4 and 4.1 models. The protection applies in “extreme cases”, such as sexual requests involving minors or attempts to obtain information to carry out large-scale violence or terrorist acts.

Anthropic justifies its decision by pointing to pre-deployment testing. The company explains that Claude Opus 4 showed “a pattern of apparent distress” when it nevertheless responded to problematic requests. The company seems to consider that its AI is capable of being… in danger.

Ending a conversation is activated only as a last resort, Anthropic specifies. In all cases, Claude is meant to use its ability to end conversations only as a final option.

For example, when several redirection attempts have failed, when any hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat.

The block does not apply in certain cases

Anthropic specifies that Claude does not use this function in cases where “users could risk hurting themselves or others”. In other words, the company does not end a conversation when a user is “in psychological distress”, for example if the person shows suicidal tendencies.

When Claude ends a conversation, it is still possible to start a new one from the same account and to create new branches by editing the previous prompts of the session in question.

Anthropic therefore leaves users a way to avoid any permanent block. “We treat this feature as an ongoing experiment and will continue to refine our approach,” declares Anthropic.

Even if this idea sounds like science fiction, it seems that Anthropic is seeking to pave the way for a form of ethical consideration for advanced models. This raises a philosophical question for the company about the nature of AI and its potential rights, at a time when some believe it will one day surpass humans and could develop consciousness.

In the meantime, the technology still struggles with kindergarten-level tests, as GPT-5 has painfully proven.
