Machine Learning Operations (MLOps) Engineer

MLOps Engineer to develop systems for LLM inference/fine-tuning.

Job Description

Together AI is seeking an MLOps Engineer to contribute to the development of systems and APIs that enable customers to perform inference and fine-tune LLMs. The ideal candidate will have experience implementing runtime systems that perform inference at scale, utilizing AI/ML models ranging from simple designs to the largest LLMs.

Responsibilities of the MLOps Engineer include:

Collaborating with engineering, research, and sales teams to deploy, evaluate, and operate inference systems for both customer and internal applications.
Developing and maintaining tools, services, and documentation for automation and testing purposes.
Analyzing and improving the efficiency, scalability, and stability of diverse system resources.
Conducting design and code reviews.
Participating in an on-call rotation to respond to critical incidents as needed.

Requirements for this role include:

5+ years of experience working on a production-level ML training or inference system.
A bachelor’s degree in computer science or equivalent industry experience.
A strong understanding of the state-of-the-art in machine learning, especially LLMs.
Experience with DevOps practices such as CI/CD, automation, containerization (Docker), and orchestration (Kubernetes).
Proficiency in cloud platforms like AWS, Google Cloud, or Azure.
Expertise in programming languages such as Python and Go, and in ML frameworks such as TensorFlow, PyTorch, and scikit-learn.

Together AI offers:

Competitive compensation.
Startup equity.
Health insurance and other competitive benefits.
