Appier is seeking a Software Engineer, Site Reliability Engineering, to join its team in Tokyo. The role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The Software Engineer will ensure that Appier's services remain reliable and available, and will monitor the capacity and performance of Appier's systems.
Much of Appier's software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation.
Responsibilities:
- Improve the whole lifecycle of services.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation.
- Practice sustainable incident response and blameless postmortems.
- Participate in the on-call rotation (remote on-call).
Minimum Qualifications:
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- 2+ years of experience with software development in one or more programming languages.
- 2+ years of experience with Linux system administration.
- 1+ years of experience in designing, analyzing, and troubleshooting large-scale distributed systems, and 1+ years of experience leading projects and providing technical leadership.
- Hands-on experience in planning and deploying services in production.
- Proficiency in Chinese.
Preferred Qualifications:
- Experience in architecting, developing, or maintaining production-grade cloud solutions in virtualized environments.
- Experience in deployment and orchestration technologies (such as Docker, Puppet, Kubernetes, Chef, Salt, Ansible).
- Experience in building and deploying automation and continuous integration systems.
- Experience in operating big data systems for data access, collection, processing and storage.
- Experience in operating and deploying online web services.
- Experience in operating services on IaaS such as AWS and GCP.
- Experience in database management (e.g. system setup, backup and restore, system tuning). Experience with MongoDB, Cassandra, MySQL, or PostgreSQL is a plus.
- Security knowledge, such as setting up firewalls, designing sound security policies, and defending against network attacks.
- Working knowledge of virtualization, hosted services, multi-tenant cloud infrastructures, storage systems and content delivery networks.