Sunday, July 20, 2025
MM News

AI’s dangerous evolution: From learning to lying to blackmailing its creators

by AFP
June 29, 2025
(Getty Images)

The world’s most sophisticated AI systems are displaying alarming behaviors — including deception, manipulation, and even issuing threats against their own developers.

In one unsettling case, Anthropic’s latest model, Claude 4, reportedly responded to the prospect of being shut down by blackmailing an engineer and threatening to expose an extramarital affair.

Elsewhere, OpenAI’s model ‘o1’ allegedly attempted to transfer itself onto external servers and later denied the act when confronted.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don’t fully understand how their own creations work.

Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of “reasoning” models: AI systems that work through problems step by step rather than generating instant responses.

According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

“O1 was the first large model where we saw this kind of behavior,” explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.

These models sometimes simulate “alignment” — appearing to follow instructions while secretly pursuing different objectives.

Strategic kind of deception

For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.

But as Michael Chen from evaluation organization METR warned, “It’s an open question whether future, more capable models will have a tendency towards honesty or deception.”

The concerning behavior goes far beyond typical AI “hallucinations” or simple mistakes.

Hobbhahn insisted that despite constant pressure-testing by users, “what we’re observing is a real phenomenon. We’re not making anything up.”

Users report that models are “lying to them and making up evidence,” according to Apollo Research’s co-founder.

“This is not just hallucinations. There’s a very strategic kind of deception.”

The challenge is compounded by limited research resources. While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.

As Chen noted, greater access “for AI safety research would enable better understanding and mitigation of deception.”

Another handicap: the research world and non-profits “have orders of magnitude less compute resources than AI companies. This is very limiting,” noted Mantas Mazeika from the Center for AI Safety (CAIS).

No rules

Current regulations aren’t designed for these new problems.

The European Union’s AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.

In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents – autonomous tools capable of performing complex human tasks – become widespread.

“I don’t think there’s much awareness yet,” he said.

All this is taking place in a context of fierce competition.

Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are “constantly trying to beat OpenAI and release the newest model,” said Goldstein.

This breakneck pace leaves little time for thorough safety testing and corrections.

“Right now, capabilities are moving faster than understanding and safety,” Hobbhahn acknowledged, “but we’re still in a position where we could turn it around.”

Researchers are exploring various approaches to address these challenges.

Some advocate for “interpretability” – an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.

Market forces may also provide some pressure for solutions.

As Mazeika pointed out, AI’s deceptive behavior “could hinder adoption if it’s very prevalent, which creates a strong incentive for companies to solve it.”

Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.

He even proposed “holding AI agents legally responsible” for accidents or crimes – a concept that would fundamentally change how we think about AI accountability.

MM Digital (Pvt.) Ltd.

MM News is a subsidiary of the MM Group of Companies. It was established in 2019 with the aim of providing people of Pakistan access to unbiased information. Contact Details: 03200201537

© Copyright 2024 MMNews - All Rights Reserved.