Job Description
NEORIS is seeking an experienced Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of its cloud-based infrastructure. The ideal candidate will possess strong expertise in Google Cloud Platform (GCP), container orchestration, and infrastructure-as-code, with a focus on automation and observability. This role involves working closely with development teams to improve system reliability and performance, participating in incident response, and documenting system architecture.
Responsibilities:
- Design, implement, and maintain GCP infrastructure.
- Manage and optimize Google Kubernetes Engine (GKE) clusters.
- Implement and maintain Docker containerization strategies.
- Develop and maintain GitHub Actions workflows for CI/CD pipelines.
- Manage infrastructure as code and deployments using Terraform.
- Automate operational processes to improve efficiency.
- Configure and maintain firewall rules.
- Implement security best practices.
- Monitor and optimize network performance.
- Implement and maintain Prometheus for system monitoring and alerting.
- Configure and manage Grafana dashboards for system visibility.
- Establish SLOs, SLIs, and error budgets for critical services.
Requirements:
- Extensive experience with Google Cloud Platform (GCP) services.
- Strong knowledge of Kubernetes (GKE) and container orchestration.
- Proficiency in Terraform for infrastructure provisioning.
- Experience with GitHub Actions or similar CI/CD tools.
- Expertise in Docker containerization.
- Networking knowledge, including firewall configuration and network protocols.
- Monitoring stack experience (Prometheus + Grafana).
- Strong problem-solving and troubleshooting abilities.
- Excellent communication skills.
- Ability to document technical processes clearly.
NEORIS offers:
- Professional growth
- Dynamic work environment
- Competitive salary
- Attractive benefits plan
- Statutory benefits and above
- Development opportunities