Job Description
Anthropic is seeking a Research Engineer to join its Reward Modeling team. The company's mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial. The Reward Modeling team develops techniques for teaching AI systems to understand and embody human values, while also advancing AI capabilities.
In this role, the Research Engineer will contribute to advancing the science of reward modeling.
Responsibilities:
- Help implement novel reward modeling architectures and techniques
- Optimize training pipelines for reward models
- Build and maintain data pipelines
- Collaborate across teams to integrate reward modeling advances into production systems
- Communicate engineering progress through internal documentation and, where appropriate, publications
Requirements:
- Strong engineering background in machine learning, with expertise in preference learning, reinforcement learning, deep learning, or related areas
- Proficiency in Python, deep learning frameworks, and distributed computing
- Familiarity with modern LLM architectures and alignment techniques
- Experience improving model training pipelines and building data pipelines
- Comfort with the experimental nature of frontier AI research
- Ability to clearly communicate complex technical concepts and research findings
- Deep interest in AI alignment and safety
- Bachelor's degree in a related field or equivalent experience
What Anthropic Offers:
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours