Hackers can hide AI prompt injection attacks in resized images

“AI” tools are all the rage at the moment, even among users who aren’t all that savvy when it comes to conventional software or security—and that’s opening up all sorts of new opportunities for hackers and others who want to take advantage of them. A new research team has discovered a way to hide prompt injection attacks in uploaded images.

A prompt injection attack is a way to hide instructions for an LLM or other “artificial intelligence” system, usually somewhere a human operator can’t see them. It’s the whispered “loser-says-what” of computer security. A great example is hiding a phishing attempt in an email in plain text that’s colored the same as the background, knowing that Gemini will summarize the text even though the human recipient can’t read it.

A two-person Trail of Bits research team discovered that they can also hide these instructions in images, making the text invisible to the human eye but revealed and transcribed by an AI tool when an image is compressed for upload. Compression—and the artifacts that come along with it—are nothing new. But combined with the sudden interest in hiding plain text messages, it creates a new way to get instructions to an LLM without the user knowing those instructions have been sent.

AI prompt injection attack with resized image

Trail of Bits

In the example highlighted by Trail of Bits and BleepingComputer, an image is delivered to a user, the user uploads the image to Gemini (or uses something like Android’s built-in circle-to-search tool), and the hidden text in the image becomes visible as Google’s backend compresses it before it’s “read” to save on bandwidth and processing power. After being compressed, the prompt text is successfully injected, telling Gemini to email the user’s personal calendar information to a third party.

That’s a lot of legwork to get a relatively small amount of personal data, and both the complete attack method and the image itself need to be tailored to the specific “AI” system that’s being exploited. There’s no evidence that this particular method was known to hackers before now or is being actively exploited at the time of writing. But it illustrates how a relatively innocuous action—like asking an LLM “what is this thing?” with a screenshot—could be turned into an attack vector.

https://www.pcworld.com/article/2889404/hackers-can-hide-ai-prompt-injection-attacks-in-resized-images.html

Creată 4h | 26 aug. 2025, 15:50:17

Autentifică-te pentru a adăuga comentarii