SRE Lead - Distributed Systems

SRE Lead for distributed systems at Arcesium.

Job Description

Arcesium is seeking a highly skilled Site Reliability Engineer to join its Technology team. The candidate will work within a cross-functional product team, creating solutions for intricate business challenges. The role involves:

Deploying, maintaining, and running a highly-available, multi-tenant distributed system.
Automating infrastructure creation and application deployment.
Contributing to system design and architecture.
Programming in the core application, including instrumenting code with monitoring metrics, setting up traces, and managing logs.
Ensuring optimal system performance.

Requirements:

At least 6 years of experience in an SRE/Operations/DevOps role running distributed systems in production.
Experience with automated provisioning and management of AWS infrastructure and services.
Strong knowledge of Linux systems internals and administration.
Deep experience with Kubernetes and Docker.
Experience automating the software dev/test/deployment lifecycle with continuous integration and continuous deployment.
Experience with scaling, monitoring, and troubleshooting actively running systems.
Ability to program in Java, C++, or C#.
Comfortable with configuration management tools: Ansible, Chef, Puppet, etc.
Familiarity with technologies like Fluentd, key-value datastores, API management/service meshes, Git, and key management.

Arcesium LLC
