Machine Learning Engineer, Safeguards

ML Engineer to build safety mechanisms for AI systems.

USD 340,000 - 425,000

Job Description

Anthropic is seeking a Machine Learning Engineer to contribute to the development of safety and oversight mechanisms for its AI systems. The successful candidate will focus on building machine learning models to detect harmful behaviors and ensure user well-being, aligning with Anthropic's principles of safety, transparency, and oversight. This role involves applying technical skills to enforce terms of service and acceptable use policies.

Responsibilities include:

Building machine learning models to detect unwanted behaviors.
Integrating models into production systems.
Improving automated detection and enforcement systems.
Analyzing user reports and proactively detecting similar instances.
Surfacing abuse patterns to research teams.

Requirements:

4+ years of experience in research/ML engineering or applied research.
Proficiency in SQL, Python, and data analysis/data mining tools.
Experience in building trust and safety AI/ML systems.
Strong communication skills.
Concern for the societal impacts of the work.

Anthropic offers:

Competitive compensation and benefits.
Optional equity donation matching.
Generous vacation and parental leave.
Flexible working hours.

Anthropic
