Job Description
Xometry is seeking a Sr. Manager of Site Reliability Engineering (SRE) to join their organization. This role is responsible for crafting the strategic direction for SRE teams and initiatives, helping Xometry build cost-effective, secure, fast, and reliable systems for their global manufacturing marketplace.
The role involves:
- Defining standards, metrics, and practices to improve operational rigor, efficiency, and engineering velocity.
- Establishing automated and self-service strategies to improve operational efficiency and development team self-sufficiency.
- Championing and measuring observability, monitoring, and metrics practices.
- Supervising development, configuration, and maintenance of underlying platforms, observability and monitoring tools, and software development (CI/CD) tools.
Requirements:
- A degree or equivalent experience, plus 7+ years of experience in software development and site reliability.
- An opinionated and iterative approach to balance short-term priorities with a long-term target architecture for systems and processes.
- A proven track record of building and growing a high-performing SRE team.
- A strong understanding of infrastructure automation and observability within distributed systems.
- Experience in defining & operationalizing SLOs, SLAs, and error budgets for platform and application systems.
- Demonstrated ability to communicate effectively with everyone from junior-level ICs to technology, product, and business executives.
- A US person (citizen or green card holder).