Distinguished Engineer - AI Computing Systems

Huawei Canada
Vancouver, British Columbia, Canada
On-site, Full-time, CA$172K/yr - CA$230K/yr

Experience Level

Senior

Qualifications

To be successful in this role, candidates should possess:

  • Extensive experience in AI computing systems and training cluster optimization.
  • Proven leadership skills in guiding teams to achieve technical excellence.
  • Strong background in developing AI frameworks and software for large model training.
  • Expertise in low-precision training and parallel strategy tuning.
  • Ability to collaborate effectively with academic and industry experts.

About the job

Join Huawei Canada as a Distinguished Engineer in AI Computing Systems.

About the Team:

The Advanced Computing and Storage Lab, part of the Vancouver Research Centre, is dedicated to pioneering adaptive computing system architectures. We tackle the complexities introduced by flexible and variable application loads to enhance stability and quality in training clusters. Our focus includes developing dynamic cluster configuration strategies and precision control systems to ensure efficient computing power clusters. Our lab is actively engaged in key industry AI applications, particularly in large model training and inference, utilizing technologies such as low-precision training, multi-modal training, and reinforcement learning. We are committed to conducting bottleneck analysis and creating optimization solutions that enhance training, inference performance, and overall usability.

About the Job:

  • As an industry leader in training cluster software frameworks, you will gain insight into the evolution of large model training frameworks. You will plan and design AI frameworks and software features for scenarios such as large model pre-training, post-training, and integrated training and inference, establishing critical capabilities for our training cluster software framework.
  • Lead the team in optimizing large model training by developing key technologies such as low-precision training, parallel strategy tuning, and training resource optimization, driving the commercial implementation of these optimization technologies.
  • Focus on our training servers, super nodes, and other products, leading the development of large model AI training frameworks, operator libraries, and acceleration libraries. Leverage system engineering and software-hardware collaboration to maximize AI cluster computing efficiency.
  • Identify and collaborate with high-quality academic resources in large model training, working alongside domain experts to advance our technological capabilities.

About Huawei Canada

Huawei Canada is at the forefront of innovation in AI computing and storage solutions. Our Vancouver Research Centre is dedicated to addressing the challenges of advanced computing, and we are committed to fostering a collaborative environment where industry experts can thrive and make impactful contributions.
