Job Description
Simplify360, now a Nextiva company, is seeking a Senior Site Reliability Engineer. This role involves leading initiatives to build scalable infrastructure, improve reliability, and advance AIOps strategies. The ideal candidate will have a strong background in both application development and infrastructure. As a senior team member, they will shape SRE practices, mentor engineers, and work cross-functionally to ensure systems are resilient and secure.
The role involves:
- Mentoring junior SREs and product engineers.
- Ensuring the availability, performance, and scalability of production systems.
- Integrating SRE best practices into the SDLC.
- Advancing observability through logging, monitoring, and alerting.
- Architecting and implementing disaster recovery and high availability strategies.
- Championing security best practices.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in SRE, DevOps, or production engineering.
- Deep understanding of distributed systems and performance optimization.
- Proficiency with containerization and orchestration tools such as Docker and Kubernetes.
- Expertise in cloud platforms (GCP, Azure, AWS).
- Experience with modern observability stacks (e.g., Datadog, Prometheus, Grafana, Splunk).
- Excellent communication and collaboration skills.
Simplify360 Offers:
- Medical insurance coverage for employees and their families.
- Group Term & Group Personal Accident Insurance.
- Privilege leave, paid sick leave, and casual leave.
- Paid maternity and paternity leave.
- Provident Fund & Gratuity.
- Employee Assistance Program and wellness initiatives.
- Learning and development opportunities.