OpenAI Monitors ChatGPT Chats To Keep Everyone Safe - Here Are Some Of The Threats It Stopped

One of the first things you should do when using an AI chatbot is to make sure your chats aren't used to train the AI. ChatGPT, Gemini, Claude, and other major chatbots all offer a setting for this. Turning it off keeps your personal data, whether it's work-related material or sensitive personal matters, out of the pool of data the provider uses to train future versions of its models. Even with that protection in place, AI firms like OpenAI still monitor chats to keep everyone safe. OpenAI uses automated tools and human review to catch ChatGPT misuse, including requests that could harm others (think malware, mass-spying tools, and other threats).

On Tuesday, OpenAI released a report on how it has been using its monitoring systems to disrupt malicious uses of AI. The company said it has disrupted and reported more than 40 networks that violated its usage policies. The actors OpenAI caught trying to abuse ChatGPT range from "authoritarian regimes to control populations or coerce other states" to operations behind "scams, malicious cyber activity, and covert influence operations." OpenAI says threat actors are mostly fitting AI into "old playbooks to move faster" rather than gaining new capabilities from ChatGPT.

OpenAI will also monitor chats to prevent self-harm and help users in distress. The safety of the individual has become a key priority for OpenAI recently, following the death of a teen by suicide after using ChatGPT. OpenAI has added parental controls to ChatGPT in recent weeks.

What threats did OpenAI prevent?

OpenAI doesn't explain in detail how the system works or what goes into flagging potential ChatGPT abuse. That detail matters, especially given OpenAI's acknowledgment that some activity falls into a gray zone: "prompts and generations that could, depending on their context, indicate either innocuous activities or abuse, such as translating texts, modifying code, or creating a website." The company says it relies on a "nuanced and informed approach that focuses on patterns of threat actor behavior rather than isolated model interactions" to detect threats without disrupting regular ChatGPT activity for everyone else.
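To make that distinction concrete, here is a minimal, purely hypothetical sketch of flagging based on patterns of account behavior rather than single prompts. The signal names, threshold, and scoring below are illustrative assumptions for this sketch, not OpenAI's actual pipeline.

```python
from collections import defaultdict, Counter

# Hypothetical illustration only: a toy "pattern over time" flagger.
# No single gray-zone event (e.g. translating text or editing code) is flagged
# on its own; an account is surfaced for review only when several
# abuse-associated signals cluster together across its recent activity.

ABUSE_SIGNALS = {"credential_phishing_copy", "malware_debugging", "spyware_design"}
REVIEW_THRESHOLD = 3  # assumed threshold for this sketch


def flag_accounts(events):
    """events: iterable of (account_id, signal) tuples from upstream classifiers."""
    per_account = defaultdict(Counter)
    for account_id, signal in events:
        per_account[account_id][signal] += 1

    flagged = []
    for account_id, signals in per_account.items():
        abuse_hits = sum(count for sig, count in signals.items() if sig in ABUSE_SIGNALS)
        # Innocuous or ambiguous signals never trigger review in isolation.
        if abuse_hits >= REVIEW_THRESHOLD:
            flagged.append(account_id)
    return flagged


if __name__ == "__main__":
    sample = [
        ("acct_1", "translation"),              # gray-zone activity, ignored on its own
        ("acct_2", "malware_debugging"),
        ("acct_2", "credential_phishing_copy"),
        ("acct_2", "spyware_design"),
    ]
    print(flag_accounts(sample))  # -> ['acct_2']
```

The point of the example is the aggregation step: a lone ambiguous request stays untouched, while a sustained pattern of abuse-linked behavior on one account crosses a review threshold.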

According to Gizmodo, OpenAI identified several high-level threats. For example, an organized crime network believed to be based in Cambodia tried to streamline its operations with ChatGPT. OpenAI also found a Russian political influence operation that used ChatGPT to write prompts for third-party AI video models. The company banned ChatGPT accounts linked to the Chinese government that sought help designing systems to monitor social media conversations.

Reuters reports that OpenAI banned Chinese-language accounts that asked for assistance with phishing and malware campaigns, as well as with automation that could be carried out via DeepSeek. Accounts tied to Russian criminal groups trying to develop malware with ChatGPT were also shut down, and Korean-speaking users who tried to use ChatGPT for phishing campaigns were banned as well.

What about conversations about self-harm?

The October report focuses only on malicious activities like the ones mentioned above; it doesn't address ChatGPT conversations involving self-harm. However, it's likely that OpenAI uses similar methods to detect such cases. A few days ago, the company said on X that it had updated GPT-5 Instant to "better recognize and support people in moments of distress." OpenAI explained that sensitive parts of conversations will be routed to GPT-5 Instant, which will provide helpful responses, and that ChatGPT will tell users which model is being used.
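For readers curious what this kind of routing could look like mechanically, here is a minimal, purely hypothetical client-side sketch. The keyword screen, the DISTRESS_MARKERS list, and the model names are illustrative assumptions (substitute real model IDs available to your account); OpenAI has not published how its actual server-side router decides.

```python
# Hypothetical sketch of sensitive-message routing, not OpenAI's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

DISTRESS_MARKERS = ("hurt myself", "can't go on", "end my life")  # assumed, simplistic screen


def pick_model(user_message: str) -> str:
    """Route messages that look like distress to a safety-tuned model."""
    lowered = user_message.lower()
    if any(marker in lowered for marker in DISTRESS_MARKERS):
        return "safety-tuned-model"  # placeholder name, not a real model ID
    return "default-model"           # placeholder name, not a real model ID


def respond(user_message: str) -> str:
    model = pick_model(user_message)
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": user_message}],
    )
    # Surfacing the model mirrors OpenAI's note that ChatGPT tells users
    # which model handled the reply.
    return f"[{model}] {completion.choices[0].message.content}"
```

In practice the routing happens inside ChatGPT itself; the sketch only illustrates the general idea of a per-message decision between a default model and a safety-tuned one.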

The move follows OpenAI's earlier efforts to improve user safety and keep ChatGPT from assisting with self-harm ideation. In late August, the company said ChatGPT is trained not to answer prompts that express an intention to self-harm. Instead, the AI responds with empathy and points people to professional help in the real world, including suicide prevention and crisis hotlines. If the AI detects a risk of physical harm to others, the conversation can be routed to systems that involve human review and, if needed, escalation to law enforcement.
