Security Researchers Just Hacked ChatGPT Using A Single 'Poisoned' Document

New findings from a group of researchers at the Black Hat hacker conference in Las Vegas has revealed that it only takes one "poisoned" document to gain access to private data using ChatGPT that has been connected to outside services. One of the ways that OpenAI has made ChatGPT even more useful for its userbase is by allowing you to connect it to various outside services, like Google Drive, GitHub, and more. But connecting ChatGPT to these private data storage solutions could actually put your data at risk of being exposed, the new research shows.

The attack, which has been dubbed AgentFlayer, was designed by researchers Michael Bargury and Tamir Ishay Sharbat. When utilized, it shows that indirect prompt injection is possible through a single document that has been inlaid with the right instructions. When used, this kind of attack could give bad actors access to developer secrets like API keys and more.

For instance, in this case, the researchers included an invisible prompt injection payload in a document before it was uploaded to ChatGPT. When an image in the document is rendered by ChatGPT, a request is automatically sent to the attacker's server using the invisible prompt. Just like that, the data has been stolen, and the victim is none the wiser.

Hacking the AI indirectly

These indirect prompt attacks are part of a new style of hack that has been popping up on the AI security scene more and more in recent months. In fact, other research released this week also shows that hackers were able to control a smart home by hacking Gemini using an infected calendar invite. These indirect prompt attacks are just one way that AI has proven susceptible to the whims of bad actors.

And the concerns surrounding these types of attacks are only growing, especially as people like the Godfather of AI say that tech companies are downplaying the risks of AI. One of the reasons this type of attack is so dangerous is that the user doesn't need to do anything beyond connecting ChatGPT to their Google Drive or GitHub account. From there, if a "poisoned" document with indirect prompt instructions embedded in it is added to their files, it could give bad actors access to the data stored in their account.

You can see a concept video of the attack in action to get an idea of just how simple it is and how quickly it works. Of course, connecting AI to your external accounts can be extremely helpful, and that's one way that developers make use of various AI systems, as it allows them to connect AI to their existing databases without needing to move their code over to any additional tools. But, as the researchers notes, giving AI more power can open you up to even more risk.

Recommended