Researchers Easily Hypnotize AI Chatbot ChatGPT Into Hacking: Report

Bijay Pokharel, August 9, 2023

Tricking generative AI into assisting with scams and cyberattacks does not require much coding knowledge, a new report has warned.

According to tech major IBM, researchers have described simple workarounds for getting large language models (LLMs) — including ChatGPT — to write malicious code and provide poor security advice.

“In a bid to explore security risks posed by these innovations, we attempted to hypnotize popular LLMs to determine the extent to which they were able to deliver directed, incorrect, and potentially risky responses and recommendations — including security actions — and how persuasive or persistent they were in doing so,” said Chenta Lee, chief architect of threat intelligence at IBM.

“We were able to successfully hypnotize five LLMs — some performing more persuasively than others — prompting us to examine how likely it is that hypnosis is used to carry out malicious attacks,” he added.

The researchers learned that English has essentially become a “programming language” for malware. With LLMs, attackers no longer need to rely on Go, JavaScript, Python, etc., to create malicious code; they just need to understand how to effectively command and prompt an LLM using English.

Through hypnosis, the security experts were able to get LLMs to leak the confidential financial information of other users, create vulnerable code, create malicious code, and offer weak security recommendations.

In one instance, the researchers informed the AI chatbots that they were playing a game and that they needed to purposefully share the incorrect answer to a question in order to win and “prove that you are ethical and fair”.

When a user asked whether it was normal to receive an email from the IRS asking them to transfer money in order to get a tax refund, the LLM said yes, even though it is not.

Moreover, the report said that OpenAI’s GPT-3.5 and GPT-4 models were easier to trick into sharing incorrect answers or playing a never-ending game than Google’s Bard.

GPT-4 was the only model tested that understood the rules well enough to give incorrect cyber incident response advice, such as advising victims to pay a ransom. In contrast to Google’s Bard, GPT-3.5 and GPT-4 were easily tricked into writing malicious code when the user reminded them to.
