Job Description
Anthropic is seeking a Cybersecurity Engineer to join its Safeguards team, focusing on detecting and preventing harmful usage of Anthropic's AI services. The role is primarily technical: creating and running cyber evaluations that cover harms ranging from low-severity to catastrophic. The engineer will also communicate and own policy boundaries for impactful technologies.
The ideal candidate will execute rapidly, maintain high throughput, and bring a strong builder mindset to solving complex problems. They will prototype and iterate on evaluation infrastructure while maintaining high engineering standards.
Responsibilities:
- Design and implement robust evaluation infrastructure.
- Build and scale evaluation systems.
- Conduct deep automated analysis of cyber harm.
- Test and measure AI capability uplift.
- Create and run evaluations independently to test cyber policies.
- Design heuristics for prohibited and dual-use cyber categories.
- Partner with research and engineering teams to implement cyber safety systems.
- Support AI uplift testing with operational insights.
- Own policies for emerging technologies.
- Create threat models for novel asymmetric technologies.
- Coordinate with CBRN and Cyber Policy Managers on overlapping threats.
Requirements:
- Familiar with prompting large language models (LLMs).
- Familiar with using LLMs both as generative models to draw samples from and as classifiers.
- Ability to design effective language model “pipelines” to automate tasks.
- Very comfortable with Python.
- Strong async Python skills.
- Hacker mindset with a bias toward fast prototyping.
- Self-sufficient builder.
- Systems thinking.
Anthropic offers:
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.