Job Description
Prove is seeking a Senior Site Reliability Engineer to bring a software engineering approach to its operations. This role is crucial for maintaining and scaling highly reliable software systems, improving reliability for customers, and helping development teams meet their service level objectives (SLOs). The Senior Site Reliability Engineer will manage large systems through code and lead projects to find innovative solutions to challenges.
Role involves:
- Leading projects to find innovative solutions.
- Collaborating with senior leaders on strategic planning.
- Maintaining and enforcing processes for fault-tolerant, scalable, and reusable infrastructure.
- Collaborating in cross-functional teams.
- Promoting ownership of design, execution, and deployment.
- Reacting quickly to changing customer and business needs.
- Identifying repetitive tasks and implementing automation solutions.
- Writing documentation and training material for other teams.
- Participating in on-call shifts.
Requirements:
- 6 to 8 years of production engineering or software engineering experience with production exposure.
- 2+ years of experience with web application maintenance, leveraging containerized workflows with tools such as Kubernetes or Docker.
- Expertise in application and service telemetry using standards such as OpenTelemetry.
- Good coding and automation skills around tools running in production.
- 2+ years of experience with higher-level languages such as Java, Go, or Python.
- 2+ years of experience with operating systems and TCP/IP network fundamentals.
- Bachelor’s degree in computer science or a related field.
Role offers:
- Competitive salary, Bonus Plan (for eligible roles), and Equity Plan
- Modern Health for financial, mental, and physical wellness
- 401(k) Retirement Plan & Match (US Offices) and Local Country Pension (International Offices)
- Unlimited vacation and flexible hours
- Comprehensive medical benefits for you and your family
- Emotional & Physical Wellness – Access to wellness services (EAP & Prove Well-Being Reimbursement)