Job Description
Inworld, a leading AI technology provider, is seeking a Staff Platform Engineer (MLOps) to join their team. This role involves working closely with backend and ML Engineering teams to design, deploy, and maintain reliable, high-performance, and secure cloud infrastructure for Inworld's AI Engine and Studio. The ideal candidate will facilitate a "you build it, you run it" culture by providing the necessary tools and processes for monitoring service reliability, availability, and performance.
Role involves:
  • Developing, managing, and optimizing the ML model lifecycle in production.
  • Implementing CI/CD systems for ML workflows.
  • Monitoring models to identify issues and inefficiencies.
  • Designing MLOps tools and frameworks to enhance automation and efficiency.
  • Managing CI/CD pipelines to ensure smooth and efficient code integration and deployment.
  • Identifying and implementing opportunities to enhance engineering speed and efficiency.
  • Conducting root cause analysis to identify critical issues and develop automated solutions.
  • Developing and sharing best practices to improve automation and efficiency across engineering teams.
Requirements:
  • 7 years of experience in software engineering.
  • 5 years of experience with infrastructure-as-code.
  • Proficiency in managing Kubernetes clusters and applications.
  • Experience in creating and maintaining CI/CD pipelines.
  • Deep knowledge of at least one major cloud provider (GCP, Azure, Oracle Cloud).
  • Proficiency in at least one backend programming/scripting language (Golang, Python, Bash).
  • Familiarity with open-source LLMs and open-source serving solutions (e.g., vLLM, llama.cpp, KServe) is a plus.
  • Experience with SLURM.
  • Experience with data pipeline and workflow management tools.
Inworld offers:
  • A hybrid work environment based in Mountain View, CA.
  • Equity and benefits.