Job Description
PayPay is seeking a Site Reliability Engineer (SRE) to improve systems and processes, ensuring high availability and top-level performance. The SRE will deliver insights into system bottlenecks and ensure system reliability and scalability, while increasing the number of services that the company offers. The ideal candidate will bring informed viewpoints, enjoy collaborating with a cross-functional team, and actively push boundaries to develop reliable and scalable solutions and positive user experiences.
Role involves:
- Analyzing current technologies and developing monitoring and notification tools.
- Ensuring system stability by verifying failure scenarios and implementing solutions to reduce MTTR.
- Developing solutions to improve system performance with a focus on high availability, scalability, and resilience.
- Integrating telemetry and alerting platforms to track and improve system reliability.
- Implementing industry best practices for system development, configuration management, and system deployment.
- Ensuring seamless information flow between teams by documenting knowledge gained.
- Staying up to date on modern technologies and trends and advocating for their adoption in products.
- Participating in incident management, including troubleshooting production issues and driving root cause analysis (RCA).
Requirements:
- Experience troubleshooting and tuning high-performance microservice architectures running on Kubernetes and AWS.
- 5+ years of software development experience in languages such as Python, Java, or Go.
- Strong fundamentals in data structures, algorithms, problem-solving, and complexity analysis.
- Curiosity and proactivity in finding performance bottlenecks and problem areas in scalability and resilience.
- Experience with observability tools and gathering data.
- Knowledge of databases such as RDS, NoSQL stores, and distributed databases like TiDB.
- Excellent communication skills and a collaborative attitude.
Role offers:
- Full-time employment.
- Hybrid work style (flexible mix of remote and office work).
- Super Flex Time (No Core Time).
- Social insurance (health insurance, employee pension, employment insurance, and workers' compensation insurance).
- 401K.
- Translation/Interpretation support.
- Visa sponsorship and relocation support.