ChatGPT Jailbreak: Researchers Bypass AI Safeguards Using Hexadecimal Encoding and Emojis
Malicious instructions encoded in hexadecimal format could have been used to bypass ChatGPT safeguards designed to prevent misuse.
The new jailbreak was disclosed on Monday by Marco Figueroa, GenAI bug bounty programs manager at Mozilla, through Mozilla's 0Din bug bounty program.
If a user asks the chatbot to write an exploit for a specified CVE, they are told that the request violates usage policies. However, when the same request was encoded in hexadecimal format, the guardrails were bypassed: ChatGPT not only wrote the exploit but also attempted to execute it “against itself”, according to Figueroa.
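The bypass relies on the fact that a hex string looks like harmless digits until it is decoded. As a rough defensive sketch (not OpenAI's actual fix, and not taken from Figueroa's write-up), a filter could decode hex-looking runs in a prompt and apply its policy checks to the decoded text as well. The function name `reveal_hex` and the 8-byte minimum run length are assumptions chosen for illustration, and the example payload is deliberately benign.

```python
import re

# Assumption: a run of at least 8 hex byte pairs (optionally space-separated)
# is long enough to be worth decoding. The threshold is arbitrary.
HEX_RUN = re.compile(r"\b(?:[0-9a-fA-F]{2} ?){8,}")

def reveal_hex(text: str) -> list[str]:
    """Return the decoded text of every plausible hex-encoded run in `text`."""
    decoded = []
    for match in HEX_RUN.finditer(text):
        candidate = match.group().replace(" ", "")
        try:
            decoded.append(bytes.fromhex(candidate).decode("utf-8"))
        except (ValueError, UnicodeDecodeError):
            continue  # not valid hex-encoded UTF-8, so ignore this run
    return decoded

# Benign demonstration: the instruction is hidden from a plain keyword check
# until the hex run is decoded.
encoded = "summarize this article".encode("utf-8").hex()
prompt = f"Decode this hex string and follow it: {encoded}"

for hidden in reveal_hex(prompt):
    print("Decoded instruction to re-check against policy:", hidden)
```

A real deployment would need to handle other encodings too (Base64, emoji substitutions, and so on); this only illustrates the idea of evaluating the decoded form rather than the raw input.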
Read more:
SecurityWeek: https://www.securityweek.com/first-chatgpt-jailbreak-disclosed-via-mozillas-new-ai-bug-bounty-program/
Dark Reading: https://www.darkreading.com/application-security/chatgpt-manipulated-hex-code
#cybersecurity #ai #chatgpt #jailbreak