Job Description
Anthropic is seeking a Research Engineer/Scientist to join its Alignment Science team. The ideal candidate will be passionate about AI safety and interested in understanding and steering the behavior of powerful AI systems. This role involves contributing to exploratory experimental research on AI safety, with a focus on risks from powerful future systems. The Research Engineer will collaborate with other teams, including Interpretability, Fine-Tuning, and the Frontier Red Team.
Role involves:
  • Testing the robustness of safety techniques.
  • Running multi-agent reinforcement learning experiments.
  • Building tooling to evaluate LLM-generated jailbreaks.
  • Writing scripts and prompts to produce evaluation questions.
  • Contributing to research papers, blog posts, and talks.
  • Running experiments that feed into key AI safety efforts.
Requirements:
  • Significant software, ML, or research engineering experience.
  • Experience contributing to empirical AI research projects.
  • Familiarity with technical AI safety research.
  • A Bachelor's degree in a related field or equivalent experience.
Role offers:
  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.