Job Description

Optimove is seeking a Site Reliability Engineer to ensure system reliability, scalability, and performance. The successful candidate will play a crucial role in designing, implementing, and maintaining Optimove's cloud-based infrastructure. This role involves collaborating across teams to drive automation, improve system resilience, and optimize performance, while fostering a culture of reliability.

Responsibilities:

  • Ensure high availability and performance of services through effective monitoring, incident management, and root cause analysis.
  • Develop and maintain automation for infrastructure provisioning, configuration management, and application deployment.
  • Analyze and enhance system performance, including load balancing, caching, and database tuning.
  • Lead incident response efforts, participate in on-call rotations, and troubleshoot complex infrastructure issues.
  • Collaborate with security teams to implement best practices and ensure compliance with relevant standards.
  • Work closely with various teams to enhance application reliability and implement SRE best practices.

Requirements:

  • Fluency in English, including technical or business-related terminology.
  • 5+ years in site reliability engineering, DevOps, or related roles.
  • Experience managing large-scale, cloud-based infrastructure in GCP, AWS, or Azure.
  • Expertise in containerization and orchestration (Docker, Kubernetes) and in microservices architecture.
  • Strong proficiency in scripting and programming languages (Python, Go, Bash, etc.).
  • Experience with CI/CD pipelines, infrastructure as code, and configuration management.
  • Hands-on experience with monitoring and observability tools.
  • Deep understanding of networking concepts, DNS, load balancing, and distributed systems.
  • Strong problem-solving skills, excellent communication, and a proactive mindset.

The role offers:

  • Work on cutting-edge technology.
  • Solve challenging problems.
  • Make a tangible impact on the reliability and scalability of systems.
  • Join a team that values collaboration, innovation, and continuous learning.