Browse All Jobs
Job Description

Anthropic is seeking a Performance Engineer to optimize the throughput and robustness of its largest distributed systems. The ideal candidate will have a track record of solving large-scale systems problems and be excited to grow into an expert in machine learning. Anthropic's mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.

Role Involves:

  • Identifying and solving novel systems problems related to running machine learning algorithms at scale.
  • Developing systems that optimize the throughput and robustness of large distributed systems.
  • Implementing low-latency high-throughput sampling for large language models.
  • Implementing GPU kernels to adapt models to low-precision inference.
  • Writing custom load-balancing algorithms to optimize serving efficiency.
  • Building quantitative models of system performance.
  • Designing and implementing fault-tolerant distributed systems.
  • Debugging kernel-level network latency spikes in containerized environments.

Requirements:

  • Significant software engineering or machine learning experience, particularly at supercomputing scale.
  • Results-oriented with a bias towards flexibility and impact.
  • Ability to pick up slack, even outside the job description.
  • Enjoyment of pair programming.
  • Desire to learn more about machine learning research.
  • Care about the societal impacts of work.
  • Bachelor's degree in a related field or equivalent experience.

Role Offers:

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • A collaborative office space.
Apply Manually