Job Description
Graphcore is seeking a Senior Software Engineer to join its ML Software Performance Validation team. The role is based in Gdańsk, Poland, and focuses on ensuring the end-to-end performance of Graphcore's AI hardware and software stack. The Senior Software Engineer will report to the Performance Validation Team Lead and collaborate with other teams to improve the efficiency and scalability of ML software solutions. Graphcore recently joined SoftBank Group, which brings large and ongoing investment from one of the world’s leading backers of innovative AI companies.
The role involves:
- Developing and maintaining automated benchmarking and performance validation frameworks.
- Analysing performance bottlenecks at scale and recommending improvements.
- Collaborating with ML framework, compiler, and distributed computing teams.
- Implementing performance monitoring, profiling, and tracing tools.
- Performing systematic scalability testing and documenting findings.
- Designing, automating, and executing comprehensive test plans.
- Leading deep-dive debugging sessions and coordinating resolution activities.
- Documenting performance validation processes and best practices.
Requirements:
- Passion for the work and the ability to thrive in complex environments.
- Hands-on experience with ML software stacks, particularly PyTorch or similar frameworks.
- Solid programming skills in Python.
- Proficiency with performance debugging and profiling tools (perf, VTune, TensorBoard, or similar).
- Good knowledge of distributed computing concepts and collective communication algorithms.
- Demonstrated ability to analyse complex performance data.
- Strong problem-solving skills.
Graphcore offers:
- Competitive salary
- Annual leave policy
- Medical and dental health plans
- Gym card
- Employee pension (matched up to 4%)