Software Engineer, Safeguards

Software Engineer for AI safety, abuse detection, and monitoring.

Anthropic

Hybrid

On-Site

United States

USD 300,000 - 405,000

Job Description

Anthropic is seeking a Software Engineer to join its team and help build safety and oversight mechanisms for AI systems. The successful candidate will develop systems that detect unwanted model behaviors and prevent disallowed use of models, supporting user well-being and upholding the company's principles of safety, transparency, and oversight.

Responsibilities:

Develop monitoring systems to detect unwanted behaviors from API partners.
Build abuse detection mechanisms and infrastructure.
Surface abuse patterns to research teams to harden models.
Build multi-layered defenses for real-time improvement of safety mechanisms.
Analyze user reports of inappropriate content or accounts.

Requirements:

Bachelor’s degree in Computer Science or Software Engineering, or comparable experience.
3-10+ years of experience in a software engineering position, preferably with a focus on integrity, spam, fraud, or abuse detection.
Proficiency in SQL, Python, and data analysis tools.
Strong communication skills.

The role offers:

Competitive compensation and benefits.
Optional equity donation matching.
Generous vacation and parental leave.
Flexible working hours.
A collaborative office space.
