Job Description
PhonePe Group is seeking a Site Reliability Engineer to join their team in Bangalore. The ideal candidate will have 8-13 years of experience and a strong understanding of Linux. This role involves troubleshooting issues across the entire stack, improving system reliability and performance, and participating in on-call rotations. PhonePe is a leading digital payments company in India, offering opportunities to work on impactful technology and collaborate with top minds.
Responsibilities:
- Troubleshoot issues across the entire stack: hardware, software, application, and network
- Improve the reliability and performance of distributed systems and containerized deployments
- Diagnose and troubleshoot complex distributed systems handling millions of queries per second
- Participate in on-call rotation
- Design, build, and maintain the core infrastructure that enables PhonePe to scale
- Drive performance testing, capacity planning, and high availability practices
- Implement new technologies while ensuring proper testing and documentation
- Proactively monitor for, identify, and resolve issues impacting infrastructure
- Buddy new team members and get them production-ready
Requirements:
- 7-13 years of hands-on experience in Linux/Unix system administration
- Expertise in managing and scaling proxy infrastructure (e.g., Nginx, HAProxy)
- Knowledge of database technologies, specifically MySQL and NoSQL (Aerospike is a plus)
- In-depth knowledge of Python for automation
- Knowledge of Linux cloud services using KVM/QEMU/LVM
Benefits:
- Insurance Benefits (Medical, Critical Illness, Accidental, Life)
- Wellness Program (Employee Assistance Program, Onsite Medical Center)
- Parental Support (Maternity, Paternity, Adoption Assistance, Day-care Support)
- Mobility Benefits (Relocation, Transfer Support, Travel Policy)
- Retirement Benefits (PF Contribution, Gratuity, NPS, Leave Encashment)
- Other Benefits (Higher Education Assistance, Car Lease, Salary Advance Policy)