Job Description
Together AI is seeking a Machine Learning Engineer to help build the systems and APIs that let customers run inference on and fine-tune Large Language Models (LLMs). The ideal candidate has experience implementing runtime systems that serve inference at scale, for models ranging from simple AI/ML models to the largest LLMs. The role involves designing and building the production systems behind the Together Cloud inference and fine-tuning APIs, with a focus on reliability and performance at scale.
Responsibilities:
- Design and build the production systems that power the Together Cloud inference and fine-tuning APIs
- Partner with researchers, engineers, product managers, and designers
- Analyze and improve the efficiency, scalability, and stability of system resources
- Conduct design and code reviews
- Create services, tools, and developer documentation
- Create testing frameworks for robustness and fault-tolerance
- Participate in an on-call rotation to respond to critical incidents as needed
Requirements:
- 5+ years of experience writing high-performance, well-tested code
- Bachelor’s degree in computer science or equivalent industry experience
- Familiarity with the LLM inference ecosystem
- Experience building large-scale, fault-tolerant distributed systems
- Expert-level programming skills in one or more of Python, Go, Rust, or C/C++
- Experience implementing runtime inference services at scale, or similar systems
Together AI Offers:
- Competitive compensation
- Startup equity
- Health insurance
- Competitive benefits