Job Description
Reddit is seeking a Principal Machine Learning Engineer to join its LS Embedding team. The team builds highly expressive, multi-entity, large-scale embeddings and explores architectures beyond standard two-tower approaches to enhance Reddit's recommendation systems. The ideal candidate will lead the design and architecture of GNN and transformer-based multi-entity embedding generation and take an active part in the end-to-end implementation, including enabling efficient distributed training and serving for these architectures.
Reddit is a community-driven platform with over 100,000 active communities and 101M+ daily active unique visitors.
Responsibilities:
- Lead the team that architects and designs GNN and transformer-based multi-entity embedding generation.
- Define the technical roadmap and plan of execution in collaboration with cross-functional partners.
- Develop and optimize large-scale graph-based machine learning pipelines for recommendation systems.
- Architect scalable and efficient GNN and transformer-based recommendation models.
- Collaborate with cross-functional business units, such as Ads teams, that leverage the models for upstream functions, and improve relevance metrics.
- Collaborate with ML Infrastructure teams to enable distributed GPU-based training and an online serving architecture.
- Lead feature engineering efforts to identify and curate expressive raw data for creating embeddings.
- Be a mentor and cross-functional advocate for the team.
- Contribute towards team and product strategy, operations and execution at Reddit.
Requirements:
- 15+ years of technical leadership experience.
- Proven ability to lead ML initiatives, mentor engineers, and communicate complex concepts to cross-functional teams.
- Expertise in Graph Neural Networks, collaborative filtering, knowledge graphs, and deep learning for recommendations.
- Understanding of graph theory, network science, and representation learning techniques.
- Strong coding skills in Python and experience with ML frameworks like PyTorch Geometric (PyG), Deep Graph Library (DGL), TensorFlow, and scikit-learn.
- Solid understanding of ML infrastructure components and libraries (data parallel, model parallel, pipeline parallel, TorchInductor, model pruning, etc.) enabling efficient distributed training and inference.
Reddit offers:
- Comprehensive Healthcare Benefits and Income Replacement Programs
- 401k Match
- Family Planning Support
- Gender-Affirming Care
- Mental Health & Coaching Benefits
- Flexible Vacation & Reddit Global Days Off
- Generous Paid Parental Leave
- Paid Volunteer Time Off