Job Description
Vercel is seeking an experienced SRE Manager to lead the creation and operation of a 24/7 Site Reliability Engineering function. This role is based in New York City and offers a hybrid work environment. The SRE Manager will be responsible for ensuring high standards of quality across Vercel engineering by acting as the eyes and ears of the engineering organization, using telemetry, inquiry, and defined service objectives. They will partner with developer teams to integrate reliability, performance, and cost efficiency into their priorities, and provide engineering support to help those teams achieve these goals.
Responsibilities:
- Build and nurture the SRE team at Vercel, maintaining a high bar for technical work and teamwork.
- Define and maintain company-wide practices around SLO definition and management.
- Lead incident management, postmortem analysis, and disaster testing and recovery.
- Generate informed insights regarding service quality and interface directly with executive leadership.
- Partner with CDN and Compute engineering teams to define and manage SRE-driven project initiatives.
Requirements:
- At least 5 years of experience in an SRE role, or 8 years in an adjacent role (e.g., platform engineering).
- At least 3 years of experience in engineering management.
- Firm grasp of the SRE philosophy and mindset.
- Experience working on or directly with SRE teams.
- Strong sense of accountability and commitment to problem-solving.
- Willingness to proactively engage with development teams.
- Capability to manage risk and make sound judgments.
- Experience with distributed system design, cloud, networking, software, and operating system concepts.
Benefits:
- Competitive compensation package, including equity.
- Inclusive healthcare package.
- Flexible time off.
- WFH budget.