Job Description
Perplexity is seeking an AI Inference Engineer to join its growing team in London. The ideal candidate will have experience with ML systems, deep learning frameworks, and real-time model serving at scale. This role offers the opportunity to work on large-scale deployment of machine learning models for real-time inference.
Responsibilities:
- Develop AI inference APIs for internal and external customers.
- Benchmark and address bottlenecks throughout the inference stack.
- Improve the reliability and observability of serving systems.
- Explore novel research and implement LLM inference optimizations.
Qualifications:
- Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX).
- Familiarity with common LLM architectures and inference optimization techniques.
- Experience with deploying reliable, distributed, real-time model serving at scale (Optional).
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA (Optional).
Perplexity offers:
- Comprehensive health, dental, and vision insurance.
- 401(k) plan.
- Equity may be part of the total compensation package.