Al Mayadeen English


OpenAI admits to 'spiderbots' crawling over websites, collecting data

  • By Al Mayadeen English
  • Source: Business Insider
  • 11 Aug 2023 23:40

According to Business Insider tech editor Alistair Barr, "spiderbots" like Google's Googlebot are used to collect information from web pages, a practice he argues can only be detrimental to users in the long run.

  • OpenAI admits to spiderbot that collects AI training data
    The OpenAI logo on a phone in New York on January 31, 2021 (AP)

According to Alistair Barr of Business Insider, there are plenty of spiderbots: digital spiders that have been crawling over websites and collecting data for years.

The most active, he claims, is Googlebot, which collects site information automatically so Google can rank and deliver Search results accordingly.

Barr notes that OpenAI has recently come out and admitted to having one of these bots on the loose in the cyber world. 

It is referred to as GPTBot, a tool used to scrape and gather web material for AI model training. GPT-5, the next large model, will most likely be trained using the data collected by this bot.

GPT-4, ChatGPT, and other sophisticated models respond to inquiries directly and promptly, reducing the need to direct users to the original sources of information. This may make for a fantastic user experience, but the incentive to offer high-quality free knowledge online begins to dwindle fast, Barr contends.

According to him, it is mere "self-sabotage" to allow the bot to "crawl" on a website, a realization he says is spreading quickly online, with sites like The Verge taking steps to block GPTBot.

Although the company has unveiled a method to disable the bot, some developers speculate that OpenAI has been surreptitiously collecting everyone's internet data for months or years.
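The opt-out in question works through a site's robots.txt file, using the `GPTBot` user-agent token. The following is a minimal sketch, not a definitive setup, of how a site owner might check that such a rule has the intended effect, using Python's standard `urllib.robotparser`. The `example.com` URL, the sample rules, and the `is_allowed` helper are illustrative assumptions rather than anything taken from the original report.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block GPTBot everywhere, allow every other crawler.
# The rules and the site below are assumptions made for this sketch.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/some-story"  # hypothetical page
    for agent in ("GPTBot", "Googlebot"):
        status = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {status}")
```

Note that robots.txt is a voluntary convention: it only affects crawlers that choose to honor it, and it does nothing about material that was collected before the rule was added.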

Prasad Dhumal, a search engine optimization consultant, tweeted this week that "finally, after soaking up all your copyrighted content to build their proprietary product, OpenAI gives you a way to prevent your content from being used to further improve their product."

Neil Clarke, the editor of Clarkesworld, a science fiction and fantasy magazine, revealed that the magazine would block this latest scraping bot from OpenAI, while questioning whether another, undisclosed bot is still in use.

No respect for rights of creative professionals

In an email to Barr, Clarke remarked that "OpenAI and other 'AI' creators have demonstrated repeatedly that they have no respect for the rights of authors, artists, and other creative professionals. Their products are largely based on the copyrighted works of others, taken without authorization or compensation."

Clarke added that their "record on transparency leaves much to be desired."

CCBot is yet another web crawler that explores the internet and collects material. It is operated by Common Crawl, a key source of training data for AI models. Common Crawl retains all of this information, so even if you block its bot now, your data has very likely already been taken.

"I'm unaware of anyone that has managed to get Common Crawl to remove data," Clarke noted. "I've tried, but have had no response."

Rather than have an "opt-out" option, others like Clarke are demanding the feature be "opt-in", forcing OpenAI to request permission before scraping data.

According to Clarke, an opt-out option is not sufficient. He believes that it is not the responsibility of a user to provide information for the company without consent, "regardless of the benefits they imagine coming from it."

Barr reached out to OpenAI about the feature and did not receive a response.

OpenAI has attempted to respect certain internet data. GPTBot is now meant to filter out sources that require paywall access as well as those known to collect personally identifiable information.

In addition, the business recently announced a partnership with the Associated Press in which OpenAI would pay to license AP material for AI training data.

Clarke advised online content creators to block the bot and communicate their concerns to lawmakers regarding "past, present, and future data collection methodologies."

Last month, Google, Microsoft, OpenAI, and Anthropic announced a new council to monitor the safe development of the most advanced models of AI.

The four influential firms founded the Frontier Model Forum, an organization focused on the "safe and responsible" creation of frontier AI models, meaning AI technology that is more sophisticated than examples currently accessible.

In May, the Center for AI Safety warned that artificial intelligence (AI) technology should be classified as a societal risk and put in the same class as pandemics and nuclear wars.

Geoffrey Hinton, dubbed the godfather of AI, quit Google in May, citing AI's "existential risk".


Al Mayadeen is an Arab Independent Media Satellite Channel.