Job Description
Arkose Labs is seeking a Senior Live Site Engineer to ensure the smooth operation and availability of its live production environment. The ideal candidate will monitor, maintain, and optimize system performance to deliver a reliable experience to customers. This role involves quickly identifying and resolving issues to minimize downtime and maximize customer satisfaction. Arkose Labs is recognized as a leader in bot detection and mitigation, offers warranties for credential stuffing and SMS toll fraud, and serves Fortune 500 companies.
Responsibilities:
- Monitor the live production environment for potential issues.
- Respond to system alerts, incidents, and outages promptly.
- Collaborate with cross-functional teams to troubleshoot complex issues.
- Serve as the central point of contact for incidents, providing updates to stakeholders.
- Prepare and deliver incident reports to customers.
- Develop and maintain automation scripts and monitoring dashboards.
- Implement proactive monitoring and alerting mechanisms.
- Participate in an on-call rotation for 24/7 support.
Requirements:
- Bachelor's degree in Computer Science, IT, or related field (or equivalent experience).
- Proven experience as a Live Site Engineer or SRE.
- Strong knowledge of Linux/Unix systems, networking, and web technologies.
- Proficiency in scripting languages (e.g., Python, Bash).
- Experience with monitoring and alerting tools (e.g., Nagios, Splunk, Datadog, Prometheus, Grafana).
- Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization (e.g., Docker, Kubernetes).
- Excellent problem-solving and communication skills.
Benefits:
- Competitive salary and equity
- Beautiful office space with many perks
- Robust benefits package
- Provident Fund
- Accident Insurance
- Flexible working hours and work-from-home days