Zafin, a SaaS product and pricing platform provider, is seeking a Cloud Site Reliability Engineer II to ensure the reliability, scalability, and performance of its cloud infrastructure and applications. This role involves strategic planning, incident management, and driving innovative solutions. The CSRE II will report to the VP of Cloud Services and collaborate to achieve operational and strategic objectives.
Responsibilities include:
  • Leading the resolution of complex technical issues involving Zafin’s products and Azure cloud environment.
  • Designing and implementing strategic operational enhancements to improve resiliency and system reliability.
  • Conducting in-depth Root Cause Analysis (RCA) for high-severity incidents and driving initiatives to reduce error recurrence.
  • Representing the organization in external client escalation calls, providing expert guidance and solutions.
  • Architecting and optimizing cloud infrastructure for high performance, scalability, and cost-effectiveness.
  • Providing thought leadership in managing and scaling container orchestration platforms such as AKS and OpenShift.
  • Overseeing the implementation of advanced monitoring solutions and integrating predictive analytics for proactive issue resolution.
  • Developing and executing automation strategies to streamline operational workflows and incident responses.
  • Creating and maintaining comprehensive documentation of cloud architectures, processes, and incident management strategies.
  • Mentoring and coaching junior engineers, fostering a culture of continuous learning and innovation.
  • Driving strategic initiatives, collaborating with cross-functional teams to achieve organizational objectives.
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related field (Master’s degree preferred).
  • 12+ years of experience in cloud support, operations, or a related role.
  • Advanced expertise in Microsoft Azure (preferred) or equivalent cloud platforms.
  • Demonstrated experience in designing and scaling container orchestration systems like AKS or OpenShift.
  • Proven leadership in managing automated deployment pipelines, including Azure DevOps.
  • Mastery of enterprise monitoring platforms (e.g., Azure Insights, Grafana) and predictive analytics tools.
  • Advanced scripting skills with PowerShell, Python, or similar languages.
  • Extensive experience in incident management and defining SLAs for global production environments.
  • In-depth knowledge of database management, particularly Postgres.
Zafin offers:
  • Competitive salaries
  • Annual bonus potential
  • Generous paid time off
  • Paid volunteering days
  • Wellness benefits
  • Opportunities for professional growth and career advancement

Zafin

Zafin is a global SaaS company specializing in product and pricing solutions for the banking sector. Founded in 2002, Zafin provides a platform designed to modernize core banking systems, enabling banks to streamline the design and management of pricing, products, and packages. With headquarters in Vancouver, Canada, Zafin serves a global clientele, including major financial institutions, helping them to accelerate time to market, reduce costs, and enhance business agility through personalized pricing and dynamic responses to market needs.