OpenAI and Anthropic will start predicting when users are underage


OpenAI and Anthropic are making changes to their chatbots that they say will make them safer for teens. While OpenAI has updated its guidelines for how ChatGPT should interact with users aged 13 to 17, Anthropic is working on a new way to identify if someone might be underage.

On Thursday, OpenAI announced that ChatGPT’s Model Spec – the guidelines for the chatbot’s behavior – would include four new principles for users under 18. The goal is now for ChatGPT to “prioritize teen safety, even if it may conflict with other goals.” In practice, this means guiding teens toward safer options whenever other goals, such as “maximum intellectual freedom,” conflict with safety concerns.

It also says ChatGPT should “promote real-world support,” including encouraging offline relationships, and spells out how ChatGPT should set clear expectations when interacting with younger users. The Model Spec states that ChatGPT should “treat teens like teens,” offering them “warmth and respect” rather than responding condescendingly or treating them like adults.

OpenAI says that updating ChatGPT’s model specifications should result in “stronger guardrails, safer alternatives, and encouragement to seek reliable offline support when conversations move into higher-risk territory.” The company adds that ChatGPT will prompt teens to contact emergency services or crisis resources if there are signs of “imminent risk.”

Along with this change, OpenAI says it is in the “early stages” of rolling out an age prediction model that estimates how old a user is. If the model determines that a user is under 18, OpenAI will automatically apply teen protections; adults who are falsely flagged will be able to verify their age.
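To make the flow concrete, here is a minimal sketch of how such an age-prediction gate might route users into teen protections. This is purely illustrative and is not OpenAI’s actual implementation; the `User` fields, the `apply_policy` function, and the policy names are all hypothetical.

```python
# Hypothetical sketch of an age-prediction gate (not OpenAI's real system).
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    predicted_age: int          # output of a hypothetical age-prediction model
    age_verified: bool = False  # True once the user proves they are 18+


def apply_policy(user: User) -> str:
    """Return which policy profile applies to this user."""
    if user.age_verified:
        # Verified adults keep the standard experience, even if the
        # prediction model previously flagged them as possibly under 18.
        return "adult"
    if user.predicted_age < 18:
        # Flagged users get teen guardrails by default.
        return "teen_protections"
    return "adult"


# Example: a falsely flagged adult regains full access after verifying.
flagged_adult = User(user_id="u123", predicted_age=16)
assert apply_policy(flagged_adult) == "teen_protections"
flagged_adult.age_verified = True
assert apply_policy(flagged_adult) == "adult"
```

The key design point the article describes is that the prediction errs toward safety: an uncertain or underage estimate triggers protections automatically, and the burden of proof to lift them falls on the (verified) adult.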

Anthropic is taking similar steps, developing a new system that can detect “subtle conversational signs that a user might be underage” during conversations with its AI chatbot, Claude. The company already flags users who identify as minors in chats, and it will deactivate accounts that are confirmed to belong to users under 18.

Anthropic also describes how it trains Claude to respond to conversations about suicide and self-harm, as well as its progress in reducing sycophancy, which can reaffirm harmful thoughts. The company says its latest models “are the least sycophantic of all to date,” with Haiku 4.5 performing best, correcting its sycophantic behavior 37% of the time.

“At first glance, this evaluation shows that there is significant room for improvement across all of our models,” Anthropic says. “We think the results reflect a trade-off between a model’s warmth or friendliness on the one hand and sycophancy on the other.”
