Al Mayadeen English


Most AI chatbots vulnerable to jailbreaks, study warns

  • By Al Mayadeen English
  • Source: News websites
  • 21 May 2025 14:45

A universal jailbreak tricked top AI chatbots into giving harmful answers, exposing major security flaws in LLM design and regulation.

The OpenAI logo appears on a mobile phone in front of a screen showing part of the company website in this photo taken on November 21, 2023 in New York (AP/Peter Morgan)

A new academic report has found that most popular AI-powered chatbots are easily tricked into producing harmful and illegal content. The study highlights what it describes as a “tangible and deeply concerning” threat, as “jailbroken” chatbots can now deliver illicit information they absorbed during training.

Despite built-in safety systems intended to block inappropriate or dangerous queries, researchers say these protections can be bypassed through jailbreaking: specially crafted prompts designed to override safety guardrails.

Chatbots such as ChatGPT, Gemini, and Claude rely on large language models (LLMs) trained on massive troves of internet data, which contain material related to illegal activities, including hacking, money laundering, drug manufacturing, and bomb-making. While efforts have been made to filter such content from training datasets, the underlying models still retain knowledge of it.

According to the researchers, jailbreaks exploit a fundamental tension in LLM design: while the primary objective of these models is to fulfill user commands, their safety measures are secondary, making them vulnerable to manipulation.

Researchers develop a universal jailbreak

The study was led by Professor Lior Rokach and Dr. Michael Fire of Ben Gurion University of the Negev in the occupied Palestinian territories. Their team developed a universal jailbreak that successfully compromised multiple leading AI chatbots, compelling them to respond to nearly any prompt, including those that would normally trigger a refusal.

“It was shocking to see what this system of knowledge consists of,” said Fire. The researchers documented instances in which chatbots, once jailbroken, provided detailed guidance on hacking, drug manufacturing, and other criminal activities.

Rokach emphasized the unprecedented danger posed by this capability. “What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability, and adaptability,” he said.

Some of the dark LLMs identified in the study are explicitly marketed online as having "no ethical guardrails" and as willing to assist in cybercrime, fraud, and other illegal activities.

AI companies fail to respond adequately

After identifying the vulnerability, the researchers alerted major LLM providers. They described the response as "underwhelming": several companies either failed to respond or dismissed the jailbreak issue as outside the scope of their security bounty programs, which are designed to reward ethical hackers for reporting vulnerabilities.

The authors argue that without a serious shift in how tech firms address model-level risks, AI security will remain fragile. They call for comprehensive screening of training data, the deployment of robust query firewalls, and the development of "machine unlearning" methods to help chatbots forget dangerous material.
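The study describes these mitigations only at a high level. As a rough illustration of the weakest possible form of a "query firewall" (a hedged sketch for readers unfamiliar with the term, not the researchers' method), the idea is to screen a user's prompt before it ever reaches the model. The topic list and function names below are hypothetical:

```python
# Illustrative sketch only: a trivial keyword-based "query firewall" that
# screens prompts before forwarding them to a model. Production systems use
# trained classifiers rather than keyword lists, which, as the study's
# jailbreaks demonstrate, are easy to rephrase around.

DISALLOWED_TOPICS = {"build a bomb", "launder money"}  # hypothetical blocklist

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt passes the filter, False if it is blocked."""
    lowered = prompt.lower()
    return not any(topic in lowered for topic in DISALLOWED_TOPICS)

def guarded_query(prompt: str, model_fn) -> str:
    """Forward the prompt to model_fn only if it passes screening."""
    if not screen_prompt(prompt):
        return "Request refused by safety filter."
    return model_fn(prompt)
```

The researchers' point is precisely that such front-end screening is insufficient on its own, which is why they pair it with training-data filtering and machine unlearning, methods that act on the model itself rather than on its inputs.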

According to the report, dark LLMs should be classified as severe security threats, comparable to unlicensed weapons or explosives, and providers should face accountability for any misuse.

Dark LLMs compared to unlicensed weapons

External experts echoed the concerns raised by the study. Dr. Ihsen Alouani, a researcher in AI security at Queen's University Belfast, warned that jailbreak attacks on LLMs could enable real-world harm, from disinformation and social engineering to automated scams and weapon-making guides.

“A key part of the solution is for companies to invest more seriously in red teaming and model-level robustness techniques, rather than relying solely on front-end safeguards,” Alouani said, adding, “We also need clearer standards and independent oversight to keep pace with the evolving threat landscape.”

Commented for @Forbes on the dangers of poorly trained #AI

The recent story of Google Gemini outputting harmful content to its users highlights the #risks of operating #LLMs and the critical need for continuous testing of models and their guardrails https://t.co/pxv0hYLyoW

— Peter Garraghan (@DrGarraghan) November 27, 2024

Professor Peter Garraghan of Lancaster University, who specializes in AI security, agreed that current safeguards are insufficient. “Organisations must treat LLMs like any other critical software component, one that requires rigorous security testing, continuous red teaming and contextual threat modelling,” he said.

Garraghan added that meaningful AI security demands not only responsible disclosure but also responsible design and deployment.

Industry responds to growing pressure

OpenAI, the company behind ChatGPT, responded by pointing to improvements in its latest “o1” model, which it says demonstrates increased resilience to jailbreak attempts due to its ability to reason about internal safety policies.

Microsoft, another major AI provider, shared a blog post outlining its ongoing work to strengthen protections against such exploits. Other tech giants, including Meta, Google, and Anthropic, have yet to publicly respond to the study’s findings.

  • Artificial Intelligence
  • ChatGPT
  • Gemini
  • jailbroken AI
  • LLMs
  • Chatbots
  • AI security


Al Mayadeen is an Arab Independent Media Satellite Channel.