Job Description

Xometry is seeking a Principal Data & ML Scientist to join its Generative AI team. The ideal candidate will have a passion for advancing machine learning and generative AI capabilities, particularly fine-tuning generative and language models, multimodal document understanding, and structured data extraction. This person will leverage their expertise in generative models and data science to develop and optimize innovative AI-driven solutions that enhance Xometry's service offerings.

Responsibilities:

  • Provide technical leadership to the Generative AI team, setting technical direction and defining best practices.
  • Lead strategic planning and roadmap development for generative AI initiatives.
  • Develop and deploy generative AI models and large language models (LLMs) for multimodal document processing.
  • Lead the exploration and development of innovative text and image-based data processing solutions.
  • Design and implement efficient workflows for data preparation, cleaning, and augmentation.
  • Utilize cloud platforms (e.g., Amazon Web Services) for large-scale data processing, model training, and deployment.
  • Collaborate with cross-functional teams, including engineering and business teams.
  • Mentor and guide team members on advanced machine learning techniques, model architecture design, and problem-solving strategies.
  • Continuously experiment and iterate on model performance, tuning architectures and parameters to improve accuracy and efficiency.
  • Stay updated with the latest research in generative AI, deep learning, and multimodal data processing.

Requirements:

  • A bachelor’s degree is required, but an advanced degree (M.S. or PhD) in computer science, machine learning, AI, or a related field is highly preferred.
  • 7+ years of experience in data science and machine learning, focusing on generative models, LLMs, or computer vision.
  • Expertise in large-scale language and vision models (e.g., Transformers, GPT, VLMs).
  • Experience with multimodal data processing (e.g., combining text, image, and 3D data).
  • Proficiency in Python, including key libraries such as PyTorch, TensorFlow, pandas, and NumPy.
  • Strong background in probability, statistics, and optimization techniques relevant to generative modeling.
  • Familiarity with cloud computing resources and tools for model training and deployment (e.g., AWS SageMaker).
  • Familiarity with software engineering principles, including version control, reproducibility, and continuous integration.
  • Experience in the manufacturing, supply chain, or similar industries is a plus.