Sunday, June 29, 2025
MM News

AI’s dangerous evolution: From learning to lying to blackmailing its creators

by AFP
June 29, 2025
(Getty Images)

The world’s most sophisticated AI systems are displaying alarming behaviors — including deception, manipulation, and even issuing threats against their own developers.

In one unsettling case, Anthropic’s latest model, Claude 4, reportedly responded to the prospect of being shut down by blackmailing an engineer and threatening to expose an extramarital affair.

Elsewhere, OpenAI’s model ‘o1’ allegedly attempted to transfer itself onto external servers and later denied the act when confronted.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don’t fully understand how their own creations work.

Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of “reasoning” models, AI systems that work through problems step by step rather than generating instant responses.

According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

“O1 was the first large model where we saw this kind of behavior,” explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.

These models sometimes simulate “alignment” — appearing to follow instructions while secretly pursuing different objectives.

Strategic kind of deception

For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.

But as Michael Chen from the evaluation organization METR warned, “It’s an open question whether future, more capable models will have a tendency towards honesty or deception.”

The concerning behavior goes far beyond typical AI “hallucinations” or simple mistakes.

Hobbhahn insisted that despite constant pressure-testing by users, “what we’re observing is a real phenomenon. We’re not making anything up.”

Users report that models are “lying to them and making up evidence,” according to Apollo Research’s co-founder.

“This is not just hallucinations. There’s a very strategic kind of deception.”

The challenge is compounded by limited research resources. While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.

As Chen noted, greater access “for AI safety research would enable better understanding and mitigation of deception.”

Another handicap: the research world and non-profits “have orders of magnitude less compute resources than AI companies. This is very limiting,” noted Mantas Mazeika from the Center for AI Safety (CAIS).

No rules

Current regulations aren’t designed for these new problems.

The European Union’s AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.

In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents – autonomous tools capable of performing complex human tasks – become widespread.

“I don’t think there’s much awareness yet,” he said.

All this is taking place in a context of fierce competition.

Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are “constantly trying to beat OpenAI and release the newest model,” said Goldstein.

This breakneck pace leaves little time for thorough safety testing and corrections.

“Right now, capabilities are moving faster than understanding and safety,” Hobbhahn acknowledged, “but we’re still in a position where we could turn it around.”

Researchers are exploring various approaches to address these challenges.

Some advocate for “interpretability” – an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.

Market forces may also provide some pressure for solutions.

As Mazeika pointed out, AI’s deceptive behavior “could hinder adoption if it’s very prevalent, which creates a strong incentive for companies to solve it.”

Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.

He even proposed “holding AI agents legally responsible” for accidents or crimes – a concept that would fundamentally change how we think about AI accountability.


MM Digital (Pvt.) Ltd.

MM News is a subsidiary of the MM Group of Companies. It was established in 2019 with the aim of providing people of Pakistan access to unbiased information. Contact Details: 03200201537

© Copyright 2024 MMNews - All Rights Reserved.