Job Description
Anthropic is seeking a Software Engineer to join their team in London, focusing on building safety and oversight mechanisms for AI systems. The successful candidate will work to monitor models, prevent misuse, and ensure user well-being. This role involves building systems to detect unwanted model behaviors and prevent disallowed use of models, upholding principles of safety, transparency, and oversight while enforcing terms of service and acceptable use policies.
Responsibilities include:
  • Developing monitoring systems to detect unwanted behaviors from API partners and take automated enforcement actions.
  • Surfacing detected behaviors in internal dashboards for manual review.
  • Building abuse detection mechanisms and infrastructure.
  • Surfacing abuse patterns to research teams to harden models at the training stage.
  • Building robust and reliable multi-layered defenses for real-time improvement of safety mechanisms at scale.
  • Analyzing user reports of inappropriate content or accounts.
Requirements:
  • Bachelor’s degree in Computer Science, Software Engineering, or comparable experience.
  • 3-8+ years of experience in a software engineering position, preferably with a focus on integrity, spam, fraud, or abuse detection.
  • Proficiency in SQL, Python, and data analysis tools.
  • Strong communication skills and ability to explain complex technical concepts to non-technical stakeholders.
Anthropic offers:
  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Hybrid work policy.
  • Visa sponsorship.