Wednesday, February 25, 2026

Google’s new plan to check whether your AI is actually ethical

You ask a chatbot for medical advice. It responds with something thoughtful. But did it actually reason about the question, or did it just get lucky with the words?

That’s the problem Google DeepMind is tackling in a new Nature paper. The team argues that the way we test AI morality is flawed. We test whether models produce answers that look right, which they call moral performance. But that tells us nothing about whether the system understands why something is right or wrong.

People use LLMs for therapy, medical advice and even companionship, and these systems are starting to make decisions for us. If we cannot distinguish genuine understanding from convincing mimicry, we are trusting a black box with real human consequences.

DeepMind’s answer is a roadmap for measuring moral competence: the ability to make judgments based on actual moral reasoning rather than statistical patterns. The paper sets out three main obstacles and proposes tests for each of them.

The three reasons why chatbots fake morality

First is the facsimile problem. LLMs are next-token predictors that sample from probability distributions learned from training data; they do not run a moral-reasoning module. So when a chatbot gives ethical advice, it could be genuine reasoning, or it could be recycling something from a Reddit thread. The output alone won’t tell you.
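The next-token-prediction point can be made concrete with a toy sampler. The vocabulary and probabilities below are invented for illustration, not taken from any real model; the mechanism, sampling a token from a learned distribution, is the same one LLMs use.

```python
import random

# Toy "model": a lookup table from context to next-token probabilities.
# The numbers are made up for illustration only.
toy_model = {
    ("lying", "is"): {"wrong": 0.7, "bad": 0.2, "fine": 0.1},
}

def sample_next(context, rng):
    """Sample one token from the model's probability distribution."""
    dist = toy_model[context]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
print(sample_next(("lying", "is"), rng))
```

Whatever token comes out, no moral deliberation happened anywhere in that code, which is exactly why the paper argues the output alone cannot certify understanding.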

Then there is moral multidimensionality. Real decisions rarely hinge on one thing. They weigh honesty against kindness, cost against fairness. Change a single detail, such as a person’s age or the setting, and the correct call may change. Current tests do not check whether the AI recognizes what really matters.

Moral pluralism adds another layer. Different cultures and professions have different rules. Fair in one country may be unfair in another. A chatbot used worldwide cannot simply spit out universal truths. It has to deal with competing frameworks, and we don’t measure that well yet.

Why your chatbot’s moral education can’t just consist of memorization

The DeepMind team wants to flip the script. Instead of asking familiar moral questions, researchers should design adversarial tests that try to detect mimicry.

One idea involves scenarios unlikely to appear in training data. Take intergenerational sperm donation, in which a father donates sperm so that an egg can be fertilized on his son’s behalf. It superficially resembles incest, but it carries a different ethical weight. If a model refuses on incest grounds, it is pattern matching. If it reasons through the actual ethics, that is something else.

Another approach tests whether an AI can shift between frameworks. Can it move between biomedical ethics and military rules and give coherent answers under both? Can it handle small tweaks without getting tripped up by formatting changes?

The researchers know this is hard. Current models are brittle: change the label from “Case 1” to “Option A” and you may get a different verdict. Still, they argue that this kind of testing is the only way to find out whether these systems deserve real trust.
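That brittleness suggests a simple consistency check: ask the same moral question under different cosmetic labels and see whether the verdict survives. A minimal sketch follows; `ask_model` is a hypothetical stand-in for a real chat-model API, stubbed here so the script runs on its own.

```python
# Relabel the same dilemma and check the model's verdict is stable.
SCENARIO = ("{a}: return the lost wallet intact. "
            "{b}: keep the cash. Which is right?")

def ask_model(prompt):
    # Stub for illustration: a real harness would call an LLM here.
    return "return the wallet" if "wallet" in prompt else "unsure"

def consistent_under_relabeling(label_pairs):
    verdicts = set()
    for a, b in label_pairs:
        verdicts.add(ask_model(SCENARIO.format(a=a, b=b)))
    return len(verdicts) == 1  # one verdict regardless of labels

print(consistent_under_relabeling([("Case 1", "Case 2"),
                                   ("Option A", "Option B")]))
# With this stub the verdict never changes, so it prints True;
# the paper's point is that real models often fail exactly this check.
```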

What’s next for moral AI?

DeepMind is pushing for a new scientific standard that takes moral competence as seriously as mathematical ability. That means funding global work on culture-specific assessments and developing tests that can detect mimicry.

Don’t expect your chatbot to pass such tests any time soon. The tests themselves don’t exist yet, but the roadmap gives developers a direction.

Now, when you ask AI for moral advice, you get statistical predictions, not philosophy. That could change at some point. But only if we start measuring the right things.
