Job Description

xAI is seeking an AI Engineer & Researcher - Inference to join its team in the Bay Area. xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. The ideal candidate will optimize the latency and throughput of model inference, build reliable production serving systems, and accelerate research on scaling test-time compute. Candidates must be located near the Bay Area or be open to relocation.

The role involves:

  • Optimizing the latency and throughput of model inference.
  • Building reliable production serving systems to serve millions of users.
  • Accelerating research on scaling test-time compute.

Requirements:

  • Experience with system optimizations for model serving, such as batching, caching, load balancing, and model parallelism.
  • Experience with low-level optimizations for inference, such as GPU kernels and code generation.
  • Experience with algorithmic optimizations for inference, such as quantization, distillation, and speculative decoding.
  • Experience with large-scale, high-concurrency production serving.
  • Experience with testing, benchmarking, and reliability of inference services.
  • Strong communication skills.

xAI offers:

  • Opportunity to work on open-source projects.
  • A challenging work environment that values curiosity.
  • A flat organizational structure.

xAI

xAI is an artificial intelligence company focused on building AI systems that deeply understand the universe and assist humanity in its quest for knowledge. It operates with a flat organizational structure that values engineering excellence, curiosity, and strong communication. xAI fosters a collaborative environment where every team member contributes directly to the company’s objectives, with a focus on continuous improvement.
