OpenAI

Helping developers build safer AI experiences for teens | OpenAI

Brief

OpenAI’s March 2026 release turns teen AI safety from a high-level governance goal into a deployable artifact: prompt-based policies that developers can plug into gpt-oss-safeguard or other reasoning models. The key technical idea is that classifiers only work well when the underlying policy definitions are precise, so OpenAI is standardizing operational prompts for six common youth-risk domains rather than leaving teams to invent them from scratch. That makes the package useful both for synchronous moderation pipelines and retrospective review of user-generated content, especially in open-weight deployments where downstream builders need reusable safety primitives. The company says the policies were informed by research on teens’ developmental differences and external input from Common Sense Media and everyone.ai. OpenAI also stresses limits: these prompts are a baseline, not a guarantee, and should sit inside a layered system that includes product decisions, controls, transparency, monitoring, and application-specific customization.

Why it matters

OpenAI published prompt-based teen safety policies on 2026-03-24 for use with its open-weight ecosystem, specifically mentioning compatibility with the gpt-oss-safeguard classifier and other reasoning models.

Key details

  • The initial policy set targets 6 teen-relevant risk categories: graphic violent content, graphic sexual content, harmful body ideals and behaviors, dangerous activities and challenges, romantic or violent roleplay, and age-restricted goods and services.
  • OpenAI positions the policies as operational infrastructure rather than abstract principles: developers can use them for real-time content filtering and offline analysis of user-generated content, then adapt or extend them for their own products.
  • The release was developed with input from Common Sense Media and everyone.ai; Robbie Torney called the lack of operational youth-safety policies a major gap, while everyone.ai’s Mathilde Cerioli highlighted adjacent behavioral risks such as exclusivity and overreliance.
  • OpenAI explicitly says the policies are not a complete safety solution and recommends a defense-in-depth approach combining content policies with product design, user controls, transparency, monitoring, and age-appropriate responses.
Source evidence

title: Helping developers build safer AI experiences for teens | OpenAI
contenttype: article
publication: OpenAI
published: 2026-03-24T00:00:00
source
url: https://openai.com/index/teen-safety-policies-gpt-oss-safeguard

word_count: 764

We released open weight models to democratize access to powerful AI and support broad innovation. At the same time, we believe safety and innovation go hand in hand, and that developers should have access to capable models as well as the tools and policies to deploy them safely and responsibly. We developed these policies to support developers in their safety efforts to protect young users, with input from trusted external organizations including Common Sense Media⁠(opens in a new window) and everyone.ai⁠(opens in a new window).We recognize that teens and adults have different needs, and that teens need additional protections. These policies are designed to help developers account for those differences and build experiences that are both empowering and appropriate for younger users.Building on our broader work to protect young peopleToday’s release builds on that foundation. We’re making these safety policies available to developers to support them in deploying safety protections for teens and helping democratize access across the open weights ecosystem. Translating teen safety into clear, usable policiesWhile safety classifiers like gpt-oss-safeguard can detect harmful content, they depend on clear definitions of what that content is. In practice, one of the biggest challenges developers face is defining policies that accurately capture teen-specific risks and can be consistently applied in real systems. Even experienced teams often struggle to translate high-level safety goals into precise, operational rules, especially since it requires both subject matter expertise and deep AI knowledge. This can lead to gaps in protection, inconsistent enforcement, or overly broad filtering. Clear, well-scoped policies are a critical foundation for effective safety systems.Helping developers operationalize teen safetyTo address this challenge, we are releasing a set of safety policies⁠(opens in a new window), tailored to common risks faced by teens and informed by careful review of existing research about teens’ unique developmental differences. These policies are structured as prompts that can be directly used with gpt-oss-safeguard⁠(opens in a new window) and other reasoning models, enabling developers to more easily apply consistent safety standards across their systems. The initial release includes policies covering:Graphic violent contentGraphic sexual contentHarmful body ideals and behaviorsDangerous activities and challengesRomantic or violent roleplayAge-restricted goods and servicesThese policies can be used for real-time content filtering, as well as offline analysis of user-generated content.By structuring policies as prompts, developers can more easily integrate them into existing workflows, adapt them to their use cases, and iterate over time.Developed with input from external expertsThis work reflects an ongoing effort to collaborate with experts and the broader ecosystem to improve how AI systems support young people.“One of the biggest gaps in AI safety for teens has been the lack of clear, operational policies that developers can build from. Many times, developers are starting from scratch. These prompt-based policies help set a meaningful safety floor across the ecosystem, and because they're released as open source, they can be adapted and improved over time. We're encouraged to see this kind of infrastructure being made available broadly, and we hope it catalyzes more shared youth-safety starting points across the industry.” —Robbie Torney, Head of AI & Digital Assessments, Common Sense Media“Efforts like this that make youth safety policies more operational are valuable because they help translate expert knowledge into guidance that can be used in real systems. Content policies are an important first step, and they also open the door to broader work on how model behavior can shape youth-relevant risks over time. Inspired by this work and our own research, everyone.ai⁠(opens in a new window) has also created an initial behavioral policy focused on risks like exclusivity and overreliance."—Dr. Mathilde Cerioli, Chief Scientist at everyone.AIA starting point, not a complete solutionThe policies are intended as a starting point, not as a comprehensive or final definition or guarantee of teen safety. Each application has unique risks, audiences and contexts, and developers are best positioned to understand the risks that their products and AI integrations may present. We strongly encourage developers to adapt and extend these policies based on their specific needs and combine them with other safeguards such as product design decisions, user controls, teen-friendly transparency, monitoring systems and thoughtful, age-appropriate responses. We believe a layered defense in depth⁠⁠ approach is essential to building safer AI systems. These policies draw from our internal experience, but they do not reflect the full extent of OpenAI’s internal policies or safeguards. Developers and organizations can adapt these policies to their specific applications, translate them into different languages, and extend them to cover additional risk areas. Over time, we hope this contributes to a more robust and shared foundation for implementing safety policies in AI systems.