Job Description
GitLab, an open core software company, is seeking a Senior Site Reliability Engineer to join its Environment Automation team. This role is crucial to keeping GitLab's user-facing services and production systems running smoothly. The ideal candidate will blend operational pragmatism with software craftsmanship, applying engineering principles and automation to GitLab's environments and codebase.
The role involves:
- Automating operational tasks such as package updates and configuration changes.
- Developing early warning systems so that maintenance tasks run reliably.
- Planning monitoring and alerting systems to predict capacity needs.
- Responding to user emergencies, platform alerts, and support requests.
- Enhancing security measures for GitLab infrastructure.
- Partnering with internal and external compliance assessors.
- Collaborating with engineering stakeholders to resolve architectural bottlenecks.
Requirements:
- Experience with Infrastructure as Code technologies, especially Terraform.
- Ability to reason about large systems and their operation at scale.
- Comfort working in Go or Ruby.
- Experience interacting with customers and resolving their requests.
GitLab offers:
- Opportunity to work on core GitLab projects.
- Chance to code infrastructure automation with Ansible and Terraform.
- Exposure to various cloud provider systems (e.g., GCP, AWS).