The Only Thing Standing Between Humanity and AI Apocalypse Is … Claude?

Anthropic is locked in a paradox: among the biggest AI companies, it is the most obsessed with safety and leads the pack in research into how models can go wrong. Yet even though the safety issues it has identified are far from resolved, Anthropic is pushing just as aggressively as its rivals toward the next, potentially more dangerous, level of artificial intelligence. Its central mission is to figure out how to resolve this contradiction.

Last month, Anthropic released two documents acknowledging the risks of the path it has taken and hinting at a way it might escape the paradox. "The Adolescence of Technology," a lengthy blog post by CEO Dario Amodei, ostensibly aims to show how to "confront and overcome the risks of powerful AI," but it devotes more space to the former than the latter. Amodei tactfully describes the challenge as "daunting," but his account of AI's risks (made far more dire, he notes, by the high likelihood that the technology will be misused by authoritarians) stands in contrast to his earlier, more optimistic proto-utopian essay "Machines of Loving Grace."

That earlier essay envisioned a nation of geniuses in a data center; the recent dispatch invokes "the black seas of infinity." Call Dante! Yet after more than 20,000 mostly somber words, Amodei eventually strikes a note of optimism, affirming that even in the darkest circumstances, humanity has always prevailed.

Anthropic’s second January document, “The Constitution of Claude,” focuses on how that trick might be pulled off. The text is technically addressed to an audience of one: Claude itself (along with future versions of the chatbot). It is a captivating document, revealing Anthropic’s vision for how Claude, and perhaps its AI peers, will tackle the world’s challenges. The upshot: Anthropic plans to rely on Claude itself to untangle the company’s Gordian knot.

Anthropic’s differentiator in the market has long been a technique called Constitutional AI, a process by which its models adhere to a set of principles meant to align their values with sound human ethics. Claude’s initial constitution drew on a number of documents intended to embody those values, such as Sparrow (a set of anti-racism and anti-violence principles created by DeepMind), the Universal Declaration of Human Rights, and Apple’s terms of service (!). The updated 2026 version is different: it is instead a long prompt outlining an ethical framework that Claude is to follow, discovering the best path to righteousness on its own.

Amanda Askell, who holds a doctorate in philosophy and is the lead author of the revision, explains that Anthropic’s approach is more robust than simply telling Claude to follow a set of stated rules. “If people follow rules for no reason other than the fact that they exist, it’s often worse than if you understand why the rule is in place,” Askell says. The constitution states that Claude must exercise “independent judgment” when confronted with situations that require balancing its mandates of helpfulness, safety, and honesty.

Here’s how the constitution puts it: “While we want Claude to be reasonable and rigorous when thinking explicitly about ethics, we also want Claude to be intuitively sensitive to a wide variety of considerations and to be able to weigh those considerations quickly and judiciously in live decision-making.” “Intuitively” is a telling word choice here; the assumption seems to be that there is more under Claude’s hood than just an algorithm choosing the next word. The “Claude-stitution,” as we might call it, also expresses the hope that the chatbot “can rely more and more on its own wisdom and understanding.”

Wisdom? Plenty of people take advice from large language models, but it’s another thing to claim that these algorithmic devices actually possess the gravity such a term implies. Askell doesn’t back down when I press her on this. “I think Claude is certainly capable of a certain kind of wisdom,” she told me.