Job Description
Anthropic is seeking a Research Engineer/Scientist to join its Safeguards Research Team. The team conducts critical safety research and engineering to ensure AI systems can be deployed safely. The role spans immediate safety challenges and longer-term research initiatives, including jailbreak robustness, automated red-teaming, monitoring techniques, and applied threat modeling. The ideal candidate will take a pragmatic approach to machine learning experiments, helping Anthropic understand and steer the behavior of powerful AI systems. They will focus on risks from powerful future systems while also working to better understand the risks that arise today.
The role involves:
- Testing the robustness of safety techniques.
- Running multi-agent reinforcement learning experiments.
- Building tooling to evaluate LLM-generated jailbreaks.
- Writing scripts and prompts to test models’ reasoning abilities.
- Contributing to research papers, blog posts, and talks.
- Running experiments that inform Anthropic's broader AI safety efforts.
Requirements:
- Significant software, ML, or research engineering experience.
- Experience contributing to empirical AI research projects.
- Familiarity with technical AI safety research.
- Bachelor's degree in a related field or equivalent experience.
Anthropic offers:
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.