Job Description
Xometry is seeking a Sr. Manager of Site Reliability Engineering (SRE). This role involves crafting the strategic direction for SRE teams and initiatives, helping Xometry build cost-effective, secure, fast, and reliable systems for its global manufacturing marketplace.
Responsibilities:
- Define standards, metrics, and practices to improve operational rigor, efficiency, and engineering velocity.
- Establish automated and self-service strategies to improve operational efficiency and development team self-sufficiency.
- Champion and measure observability, monitoring, and metrics practices.
- Supervise development, configuration, and maintenance of underlying platforms for deployed software.
- Supervise development, configuration, and maintenance of observability and monitoring tools.
- Supervise development, configuration, and maintenance of software development (CI/CD) tools.
Requirements:
- 7+ years of experience in software development and site reliability.
- Experience in a fast-paced, product-driven environment.
- An iterative approach that balances short-term priorities with a long-term target architecture.
- A proven track record of building and growing a high-performing SRE team.
- A strong understanding of infrastructure automation and observability within distributed systems.
- Experience in defining and operationalizing SLOs, SLAs, and error budgets.
- Demonstrated ability to communicate effectively with all levels of staff.
- A US person (citizen or green card holder).
Xometry offers:
- Opportunity to work on building cost-effective, secure, fast, and reliable systems.
- A chance to define strategic direction for SRE teams and initiatives.