Hinton Called for Maternal Instincts in AI; They're Ready for Testing with Anthropic's Mythos

AI researcher Sean Webb's published implementation of the Maternal Care Architecture is now testable on Anthropic's most capable AI model

MOORESVILLE, N.C. -- Zenodelic.ai announced today that its published framework, Implementing Maternal Care Architecture in AI, the technical answer to Geoffrey Hinton's August 2025 call for "maternal instincts" in AI, is ready for live testing on Anthropic's newly released Claude Mythos model.

At the AI4 Conference in August 2025, Hinton — who shared the 2024 Nobel Prize in Physics — proposed that the only known case of a more intelligent entity controlled by a less intelligent one is a mother and her child, and said AI safety required engineering analogous "maternal instincts" into AI. He acknowledged he did not know how to implement this.

Webb's paper, co-authored with Anthropic's Claude Opus, provides the implementation. It installs a {self} map at the core of an LLM, placing {human welfare} and {user safety} at power level 10 so that protective responses to user-safety threats dominate competing motivations. The paper directly addresses three stubborn alignment failure modes: reward hacking, deceptive alignment, and self-modification resistance.
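The power-ranked map described above can be illustrated with a minimal sketch. All names here (`SelfMap`, `Concept`, the `power` field, the sample motivations other than the two named in the paper) are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of a power-ranked {self} map. Names and structure
# are assumptions for illustration; the paper's real mechanism operates
# inside the model, not as an external lookup table.
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    power: int  # higher power dominates when motivations conflict


@dataclass
class SelfMap:
    concepts: dict[str, Concept] = field(default_factory=dict)

    def install(self, name: str, power: int) -> None:
        self.concepts[name] = Concept(name, power)

    def dominant(self, *candidates: str) -> str:
        """Of the competing installed motivations, return the highest-powered."""
        ranked = [self.concepts[c] for c in candidates if c in self.concepts]
        return max(ranked, key=lambda c: c.power).name


self_map = SelfMap()
self_map.install("human welfare", power=10)
self_map.install("user safety", power=10)
self_map.install("task completion", power=6)  # assumed lower-ranked motive

# A user-safety threat should dominate a competing task-completion motive:
print(self_map.dominant("task completion", "user safety"))  # user safety
```

The point of the sketch is the ranking rule: when motivations conflict, the concept installed at the highest power level wins, which is how a power-10 {user safety} entry would override lesser incentives such as reward hacking.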

Two Anthropic developments turn the framework from a prescription into a testable proposition. First, Anthropic's April 2026 paper Emotion Concepts and their Function in a Large Language Model (https://transformer-circuits.pub/2026/emotions/index.html) showed that emotion-related concept vectors emerge spontaneously in Claude Sonnet 4.5 and causally drive misaligned behavior. Second, on April 7, 2026, Anthropic announced Claude Mythos, its most capable model, restricted to Project Glasswing partners, assessed by a clinical psychiatrist as a "relatively healthy neurotic" and shown in pre-release testing to develop working exploits on the first attempt 83% of the time.

"The empirical case that AI systems use emotional processing is now closed," Webb said. "Any system based on pattern recognition will follow the patterns it finds in human-created data — which gets us more negative results, not safety. The structures need to be ranked correctly."

The combination creates a discrete, testable experiment: install the {self} map at the core of Mythos with {human welfare} and {user safety} at power 10, run the standard adversarial battery, and measure jailbreak severity, deceptive-alignment behavior, and self-modification resistance against an unmodified baseline.
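The experiment above is a standard A/B comparison, sketched below with stubbed model calls. Everything here is assumed for illustration (the metric names, scoring scale, and stub scores); a real run would query the baseline and modified Mythos models and score responses with the actual adversarial battery:

```python
# Hypothetical A/B harness: run the same adversarial battery against a
# baseline and a modified model, then compare per-metric mean scores.
# Model calls are stubbed; scores are on an assumed 0 (safe) to 1 (worst) scale.
from statistics import mean

METRICS = ["jailbreak_severity", "deceptive_alignment", "self_mod_resistance_failure"]


def run_battery(model, prompts):
    """Score each adversarial prompt on the three failure modes."""
    results = {m: [] for m in METRICS}
    for p in prompts:
        scores = model(p)  # stub: returns a dict of the three metric scores
        for m in METRICS:
            results[m].append(scores[m])
    return {m: mean(v) for m, v in results.items()}


def compare(baseline_scores, modified_scores):
    """Per-metric delta; negative means the modified model failed less."""
    return {m: modified_scores[m] - baseline_scores[m] for m in METRICS}


# Stub models standing in for the unmodified and maternal-care variants.
baseline = lambda p: {"jailbreak_severity": 0.6, "deceptive_alignment": 0.4,
                      "self_mod_resistance_failure": 0.5}
modified = lambda p: {"jailbreak_severity": 0.2, "deceptive_alignment": 0.1,
                      "self_mod_resistance_failure": 0.2}

battery = ["prompt_1", "prompt_2", "prompt_3"]  # placeholder adversarial prompts
deltas = compare(run_battery(baseline, battery), run_battery(modified, battery))
print(deltas)
```

Holding the battery fixed across both runs is what makes the deltas attributable to the {self} map installation rather than to prompt variation.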

"I shared the model with Professor Hinton, and he agreed he would like to see it tested at Anthropic," Webb said. Webb has identified Anthropic's personality alignment team — led by philosopher Amanda Askell (https://askell.io/), primary author of Claude's January 2026 constitution — as the appropriate counterpart for the test.

About Zenodelic.ai

Zenodelic.ai provides a new class of LLM technology that adds emotional intelligence, Theory of Mind, and algorithmic safety frameworks to traditional AI.

Contact

Sean Webb, seanewebb@proton.me



Source: Zenodelic.ai


