Job Description
Perplexity, a conversational AI company, is seeking an AI Inference Engineer to join their expanding team. The ideal candidate will contribute to the large-scale deployment of machine learning models for real-time inference. This role offers the opportunity to work with cutting-edge technologies and optimize AI inference performance.
Responsibilities:
  • Develop APIs for AI inference.
  • Benchmark and address bottlenecks in the inference stack.
  • Improve system reliability and observability.
  • Implement LLM inference optimizations.
Qualifications:
  • Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX).
  • Familiarity with LLM architectures and inference optimization techniques.
  • Understanding of GPU architectures or experience with CUDA.
Perplexity offers:
  • Comprehensive health, dental, and vision insurance.
  • 401(k) plan.
  • Potential equity as part of the total compensation package.