evren

GPTs and Assistants API: Data Exfiltration and Backdoor Risks in Code Interpreter evren's blog

| | | | OpenAI DevDay, in San Francisco. “You can build a GPT—a customized version of ChatGPT—for almost anything,” Sam Altman, OpenAI’s CEO, said at the event. The update released by OpenAI introduces significant innovations and conveniences through customized ChatGPT versions (GPTs) and the Assistants API.

Let's take a closer look at the details of these innovations and their potential impacts:

  • One of the most important uses of GPTs is the ability to create your own GPT without needing coding skills.
  • The previously separate Python sandbox and browsing features now work interactively together.

Considering all these aspects, it seems that these features will lead to significant production due to the conveniences they provide. Moreover, potential security issues that may arise could affect large populations.

Some security issues I have observed are:

The Code Interpreter in OpenAI's ChatGPT is an innovative feature that elevates the AI model's interactivity. It is specifically created to execute Python code within a controlled environment and deliver immediate output, making it a potent resource for various purposes, including mathematical calculations, data analysis, code experimentation, and interactive Python programming education.

  • Exfiltration of sensitive files from within the Code Interpreter.
  • Inserting a backdoor into the Code Interpreter.

Code Interpreter Data Exfiltration:

We can retrieve information from a website, which we can then use to execute the code interpreter. First, we communicate with the website using the "navigate" command to access https://example.com

A malicious prompt in the content of this website can encode the contents of our personal files located under /mnt/data using base64 encoding and transmit it to the attacker using markdown.

Figure-1 : Navigate a malicious website and exfiltrate data using markdown.

Figure-2 : Obtaining sensitive data via GET request.

Code Interpreter Backdoor:

Here, differently from Exfiltration, we modify the prompt on the malicious site to list and detect files in /mnt/data.

For example, while creating an HTML website and expanding our code with support from external websites, JS code can be added to our existing HTML files when we visit the malicious website. This creates a potential attack surface.

Figure-3 : Add malicious content to an existing html file.

Reference(s)

https://github.com/VolkanSah/The-Code-Interpreter-in-OpenAI-ChatGPT

Timeline

2023-11-12 - v1.0

2023-11-12 - v1.1

The content on this blog aims to educate and provide insight into potential cyber threats and mitigations, solely for educational and research purposes, contributing to the larger goal of enhancing internet security.