Job Description
Anthropic is seeking a Research Scientist/Engineer to join their Alignment Finetuning team. The successful candidate will be responsible for developing and implementing techniques to train language models that are more aligned with human values. This role involves creating novel finetuning techniques and using them to improve model behavior demonstrably.
The Research Scientist/Engineer will work on synthetic data generation and advanced training pipelines. They will also collaborate across teams to integrate alignment improvements into production models and develop processes to automate and scale the team's work.
Responsibilities:
- Develop and implement novel finetuning techniques using synthetic data generation and advanced training pipelines
- Use these to train models to have better alignment properties including honesty, character, and harmlessness
- Create and maintain evaluation frameworks to measure alignment properties in models
- Collaborate across teams to integrate alignment improvements into production models
- Develop processes to help automate and scale the work of the team
Requirements:
- MS/PhD in Computer Science, ML, or related field, or equivalent experience
- Strong programming skills, especially in Python
- Experience with ML model training and experimentation
- Track record of implementing ML research
- Strong analytical skills for interpreting experimental results
- Experience with ML metrics and evaluation frameworks
- Ability to turn research ideas into working code
- Ability to identify and resolve practical implementation challenges
The role offers:
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours
- Lovely office space in which to collaborate with colleagues