Site Reliability Engineer (SRE) - grok.com & API

Site Reliability Engineer for grok.com & API in London.

xAI

Job Description

xAI is seeking a Site Reliability Engineer to join their team responsible for the backend services powering grok.com and its API. The ideal candidate will be based in London and possess expert knowledge of Kubernetes, continuous deployment systems (Buildkite, ArgoCD), monitoring technologies (Prometheus, Grafana, PagerDuty), and infrastructure as code (Pulumi, Terraform).

The role involves working within a small, highly motivated team focused on engineering excellence and contributing directly to xAI's mission of creating AI systems that accurately understand the universe. The team is primarily based in London, with a growing presence in Palo Alto. The services they maintain are highly scalable and reliable, processing tens of thousands of queries per second on Kubernetes clusters (on-prem & cloud).

Responsibilities:

Maintaining and improving the reliability and scalability of backend services.
Working with Kubernetes clusters.
Implementing and managing continuous deployment systems.
Utilizing monitoring technologies to ensure system health.
Managing infrastructure as code.

Requirements:

Expert knowledge of Kubernetes.
Expert knowledge of continuous deployment systems (Buildkite, ArgoCD).
Expert knowledge of monitoring technologies (Prometheus, Grafana, PagerDuty).
Expert knowledge of infrastructure as code technologies (Pulumi, Terraform).
Willingness to attend late meetings at least once a week.

xAI offers:

Competitive cash-based compensation.
xAI equity.
Private health and dental insurance.

xAI

xAI is an artificial intelligence company focused on building AI systems that deeply understand the universe and assist humanity in its quest for knowledge. It operates with a flat organizational structure that values engineering excellence, curiosity, and strong communication. xAI fosters a collaborative environment where every team member contributes directly to the company’s objectives, with a focus on continuous improvement.