Job Description
Together AI is seeking an MLOps Engineer to contribute to the development of systems and APIs that enable customers to perform inference and fine-tune LLMs. The ideal candidate will have experience implementing runtime systems that perform inference at scale, utilizing AI/ML models ranging from simple designs to the largest LLMs.
Responsibilities of the MLOps Engineer include:
- Collaborating with engineering, research, and sales teams to deploy, evaluate, and operate inference systems for both customer and internal applications.
- Developing and maintaining tools, services, and documentation for automation and testing purposes.
- Analyzing and improving the efficiency, scalability, and stability of diverse system resources.
- Conducting design and code reviews.
- Participating in an on-call rotation to respond to critical incidents as needed.
Requirements for this role include:
- 5+ years of experience working on a production-level ML training or inference system.
- A bachelor’s degree in computer science or equivalent industry experience.
- A strong understanding of the state-of-the-art in machine learning, especially LLMs.
- Experience with DevOps practices such as CI/CD, automation, containerization (Docker), and orchestration (Kubernetes).
- Proficiency in cloud platforms like AWS, Google Cloud, or Azure.
- Expertise in programming languages such as Python and Go, and in ML frameworks such as TensorFlow, PyTorch, and scikit-learn.
Together AI offers:
- Competitive compensation.
- Startup equity.
- Health insurance and other competitive benefits.