Safeguards Analyst, User Well-being

Safeguards Analyst for user well-being, focusing on policy enforcement.

Anthropic

Hybrid

On-Site

United States

USD 190,000 - 220,000

Job Description

Anthropic is seeking a Safeguards Analyst to join their user well-being team. This role focuses on building and executing enforcement workflows for Anthropic's products and services, with an initial emphasis on expanding child safety enforcement. The Safeguards Analyst will help shape policy enforcement to ensure users can safely interact with and build on top of Anthropic's products in a harmless, helpful, and honest way.

Responsibilities include:

Designing and architecting automated enforcement systems and review workflows.
Partnering with Engineering and Data Science teams to optimize detection models.
Reviewing flagged content to drive enforcement and policy improvements.
Enforcing usage policies with a focus on mitigating potential harmful use of AI systems.
Supporting the Safeguards policy design team by providing feedback on policy gaps.
Staying up to date with emerging AI policy enforcement best practices.

The ideal candidate will have:

Experience standing up and scaling policy enforcement and review workflows.
Experience using SQL and/or other data analysis tools.
Experience identifying emerging risks and providing feedback to stakeholders.
Experience working with generative AI products.
Understanding of challenges in implementing product policies at scale.
Experience navigating evolving regulatory landscapes.
Experience as a trust & safety professional or subject matter expert in child safety, human exploitation and abuse, and/or content classification systems.

Anthropic offers:

Competitive compensation and benefits.
Optional equity donation matching.
Generous vacation and parental leave.
Flexible working hours.

