1 - 20 of 4,629 Jobs

Search for Software Engineer - AI/ML Infrastructure


Thumbtack
Full-time|Remote|Remote, Ontario

Join Thumbtack as a Software Engineer specializing in AI/ML Infrastructure. In this role, you will contribute to the development and enhancement of innovative AI and machine learning solutions that empower our platform. We are looking for a proactive and collaborative team member who is passionate about technology and eager to tackle complex challenges.

Mar 11, 2026

Xsolla
Full-time|On-site|Montreal

Join Xsolla as a Lead AI/ML Engineer where you will be pivotal in architecting and enhancing our data infrastructure for personalization, churn prediction, and recommendation systems across our global gaming commerce platform. You will spearhead the design and optimization of machine learning algorithms using Vertex AI, create scalable data pipelines from diverse sources, and implement best practices for data quality, governance, and performance. This role merges hands-on technical leadership with collaborative cross-functional efforts, mentoring engineers, and partnering closely with data science and backend teams to deploy ML features. Reporting directly to the Director of Data Platforms, you will translate business requirements into innovative technical solutions, significantly shaping our data mesh architecture while ensuring compliance, scalability, and operational excellence as we support game developers globally.

Jan 24, 2026

Ideogram
Full-time|On-site|Toronto

About Ideogram

At Ideogram, we are on a mission to democratize world-class design, enhancing human creativity through innovative solutions. Our proprietary generative media models and AI-driven creative workflows address previously unsolved problems in graphic design. Our team is composed of pioneers with proven success in technology breakthroughs, including foundational research in Diffusion Models and the development of Google’s Imagen and Imagen Video. We prioritize design, aesthetics, and craftsmanship, alongside rigorous research and engineering, delivering experiences that resonate with creatives.

With nearly $100 million raised from leading investors like Andreessen Horowitz and Index Ventures, Ideogram is headquartered in Toronto and expanding rapidly, aiming to triple our team this year. We foster a flat organizational structure, encouraging a culture of ownership, collaboration, and mentorship.

Discover more about our innovations by exploring Ideogram 3.0, Canvas, and Character. Experience Ideogram at ideogram.ai.

About The Role

We are searching for a talented Software Engineer specializing in ML Data Infrastructure to join our innovative team. You will collaborate with a group of skilled engineers to create cutting-edge AI design experiences that engage millions of users.

You will thrive in this role if you are passionate about:
· Collaborating on complex technical challenges, from scaling distributed systems to enabling novel generative media experiences.
· Constructing robust data infrastructure capable of supporting foundation models at petabyte scale, ensuring reliability and performance across multi-modal training pipelines.
· Optimizing data processing workflows for high throughput, engaging directly with distributed systems, TPU infrastructure, and large-scale storage solutions.
· Partnering with research scientists to grasp data requirements and translating them into production-grade systems that expedite model development cycles.

What We’re Looking For

Technical Excellence
· 2-5 years of experience in developing and deploying large-scale distributed systems, showcasing the ability to manage complexity through thoughtful abstractions and scalable design.
· Strong grasp of data structures, algorithms, and system design principles.

Jan 9, 2026

jobgether
Full-time|Remote|Canada

Role overview

As a Senior Software Engineer focused on AI Infrastructure at jobgether, the main responsibility is to shape and support the technical foundation for the company’s AI initiatives. The work involves both creating new systems and refining current technology to help meet evolving business objectives.

What you will do
· Design and develop infrastructure that enables AI systems to operate efficiently and reliably
· Maintain and enhance existing AI technology stacks to ensure ongoing performance and scalability
· Apply technical expertise to solve complex engineering problems related to AI infrastructure

Work environment

This position is based in Canada and involves working closely with a collaborative team. The role provides opportunities to engage with advanced technologies in support of the company’s AI goals.

Apr 28, 2026

Grafana Labs
Full-time|Remote|Canada (Remote)

Grafana Labs stands as a beacon of innovation in the realm of open-source software, proudly boasting over 20 million users globally. Our flagship product, the Grafana visualization tool, empowers individuals and organizations to monitor a diverse array of systems, from ecological phenomena to advanced technological infrastructures. Our visually striking dashboards have gained recognition in prestigious venues such as NASA, Minecraft HQ, Wimbledon, and the Tour de France. We proudly serve over 3,000 organizations, including industry giants like Bloomberg, JPMorgan Chase, and eBay, in managing their observability strategies through our Grafana LGTM Stack, which can be utilized either fully managed through Grafana Cloud or self-managed with the Grafana Enterprise Stack, both incorporating scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

As we scale rapidly, we remain committed to our core values: a legacy of open-source development, a culture that fosters global collaboration, and a dedication to impactful work. Our team flourishes in an environment driven by transparency, autonomy, and trust.

If you find this role intriguing despite not meeting every qualification, we encourage you to apply and explore what could be a career-defining opportunity.

This is a remote position, and we are currently considering applicants from Canada time zones only.

Feb 12, 2026

Grafana Labs
Full-time|Remote|Canada (Remote)

Join Grafana Labs, a leading remote-first, open-source innovator with over 20 million users worldwide leveraging our visualization tool to monitor everything from beehives to climate change. Our distinctive dashboards have been showcased at major events, from NASA launches to the Tour de France. Serving over 3,000 companies, including Bloomberg, JPMorgan Chase, and eBay, we empower them to manage observability strategies through the Grafana LGTM Stack, which can be fully managed via Grafana Cloud or self-managed using the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

As we rapidly scale, we remain committed to our core values: an open-source legacy, a collaborative global culture, and a passion for impactful work. Our team flourishes in an innovation-driven environment where transparency, autonomy, and trust are paramount.

If you're excited about this role, don't hesitate to apply even if you don't meet every requirement; this could be a transformative step in your career.

This position is remote, and we are currently considering applicants from Canada time zones only.

Staff AI Engineer

The Opportunity:

At Grafana, we develop observability tools that enable users to understand, respond to, and enhance their systems, irrespective of scale or complexity. The Grafana AI teams are integral to this mission, utilizing AI-driven features to help users interpret complex observability data. These capabilities minimize toil, lower the domain expertise barrier, and highlight significant signals in noisy environments.

What sets our team apart is our unique approach: we emphasize autonomy and ownership at both individual and team levels. Engineers are encouraged to make decisions, rapidly prototype, and validate ideas early, all within a highly collaborative culture that values curiosity, constructive feedback, and cross-functional collaboration.

We seek an AI Software Engineer with a robust software engineering background, a mindset geared towards quick iterations, and a passion for experimentation. This should be complemented by a focus on delivering and scaling impactful features that bring real value to our users.

Feb 12, 2026

BPM LLC
Full-time|Remote|Canada

BPM is a company where compassion and community are woven into our DNA. We are committed to continuously improving ourselves and fostering innovation through inquiry. Joining BPM means leveraging your unique experiences, expanding your skill set, and achieving your fullest potential in both your professional and personal life while positively impacting clients, colleagues, and the community. Our entrepreneurial spirit encourages us to innovate and approach challenges from fresh perspectives. At BPM, we value people, ensuring everyone feels appreciated and part of a larger mission. Because People Matter.

BPM actively welcomes applications from individuals with disabilities. Accommodations are readily available upon request for candidates participating in all aspects of the selection process.

What We Offer:
· Comprehensive rewards package: Enjoy flexible work arrangements and competitive benefits that prioritize your health and well-being, including coverage for dependents.
· Well-being initiatives: Access an interactive wellness platform, employee assistance programs, mental health resources, and Colleague Resource Groups (CRGs) that create safe environments for colleagues to connect, share, and feel valued.
· Work-life balance and flexibility: Benefit from at least 14 Firm Holidays, including relevant provincial statutory holidays and 2 floating days, Flex PTO, supplemental top-up for eligible statutory leaves, a winter break, summer hours, and remote working options, allowing you to challenge yourself while taking care of your well-being.
· Professional growth opportunities: Experience a culture of learning with CPA exam resources and bonuses, support for professional memberships/certifications, a coaching program, and live classes, workshops, and seminars through BPM University.

Who Thrives at BPM:
· Caring individuals who prioritize the needs of others.
· Self-motivated professionals who embody the entrepreneurial spirit of BPM.
· Authentic voices with diverse perspectives.
· Lifelong learners driven to achieve excellence.
· Resilient individuals who meet challenges head-on.

Position Overview:

We are looking for an AI/ML Engineer to join BPM’s Enterprise Technology Solutions team. This role is ideal for a builder: someone who is not only experimenting with AI but is also instrumental in implementing it effectively within the firm at scale. You will harness AI tools...

Oct 7, 2025
Full-time|Remote|Toronto / New York

Join Our Innovative Team

Adaptive ML is at the forefront of AI technology, crafting a state-of-the-art Reinforcement Learning Operations (RLOps) platform. Our mission is to empower enterprises to specialize and deploy large language models (LLMs) in production to achieve significant outcomes.

We are the backbone for tuning, evaluating, and serving specialized models at scale, revolutionizing task-specific LLM development. Our infrastructure supports production-ready workflows that handle millions of requests efficiently while optimizing for performance and cost across distributed systems.

Our cohesive team has previously contributed to the development of leading open-access large language models. Having secured a $20M seed funding from Index Ventures and ICONIQ in early 2024, we are already operational, serving clients such as Manulife, AT&T, and Deloitte in the travel and financial sectors, with more partnerships on the horizon.

The Product Staff at Adaptive ML is dedicated to translating our advanced technology into exceptional products that address the challenges faced by companies in their generative AI deployments. We are committed to creating a high-quality, user-friendly, and resilient experience for our clients.

Your Role

As a DevOps Engineer within our Product Staff, you will play a crucial role in packaging our technology into exceptional products that enhance generative AI experiences through deeper personalization via reinforcement learning. It is essential that our technology remains transparent, addressing the real challenges faced by companies without imposing additional complexities.

Your responsibilities will encompass all DevOps aspects, from systematic deployment to scaling production databases and supporting internal workloads. Expect to tackle challenges like coordinating complex GPU infrastructure and managing the storage of user interactions, which can reach trillions of records, all while ensuring robustness.

We seek passionate individuals who are self-motivated and eager to contribute to a highly technical product that emphasizes robustness, accessibility, and responsiveness. Being an early member of our team means you'll have the opportunity to significantly influence our product as we expand.

This position is ideally in-person at our offices in New York or Toronto, but we are also open to fully remote candidates.

Nov 17, 2025

Huawei Canada
Full-time|On-site|Markham, Ontario, Canada

Join Huawei Canada as a Principal Software Engineer and be a part of our innovative team!

About Us:

Founded in 2014, the Distributed Scheduling and Data Engine Lab serves as Huawei Cloud’s technology innovation hub in Canada. This lab is dedicated to pioneering advanced cloud technologies, facilitating the productization and ongoing refinement of our technological breakthroughs. Our research spans various domains, including cloud-native databases, resource scheduling and prediction, middleware solutions, media engines, and user experience enhancements. We cultivate a dynamic technical environment that encourages collaboration with industry specialists to develop a competitive cloud platform. We are currently seeking a Principal Software Engineer to join our team.

Job Responsibilities:
· Integrate AI frameworks with cloud infrastructure, optimizing the end-to-end architecture for AI inference and fine-tuning scenarios, with a focus on enhancing observability, reliability, and performance of AI services.
· Collaborate with team members to design and build concept prototypes, validating optimization strategies to ensure their effectiveness.
· Work closely with the product team to support prototype development, ensuring alignment with product constraints and requirements.

Apr 16, 2025

Mechanical Orchard
Full-time|Remote|Canada (Remote)

Join Mechanical Orchard, a pioneering company dedicated to transforming the most essential and complex business applications, the backbone of today's digital world. We specialize in modernizing these applications, ensuring they can swiftly adapt to market challenges and opportunities.

Our innovative approach is born from years of observing the common pitfalls in modernization projects. We aim to mitigate risks and disruptions through our Generative AI platform, Imogen, which leverages cutting-edge data engineering, compiler, and LLM-based techniques to deliver unparalleled solutions in the industry.

With a strong foundation in software development and a reputation for influencing the industry, we've contributed significantly to the evolution of Agile practices, including XP. We apply the same level of thoughtfulness and rigor in integrating AI where it brings real value. Our core values focus on doing the right thing, achieving effective outcomes, and fostering kindness within our teams.

Together, we are bringing relief to overwhelmed IT teams and witnessing how our expertise and innovative technologies can radically enhance the way organizations operate and innovate. If you share our passion for excellence and collaboration, we welcome your application!

Oct 1, 2025

Marqeta
Full-time|Hybrid|Toronto, Canada; Vancouver, Canada

As the Manager of Data Infrastructure, you will report directly to the Director of Data Engineering and spearhead a team of 5-6 engineers responsible for operating essential data platforms that fuel our AI and analytics initiatives. This newly established, hands-on leadership position blends technical expertise with personnel management, requiring you to mentor your team, promote agile methodologies, and incorporate product-focused thinking into infrastructure management. You will implement our data strategy while ensuring operational excellence across platforms that handle billions of transactions monthly. We embrace a Flexible First work model, allowing this role to be performed remotely within the provinces of Ontario and British Columbia. Quarterly travel to the United States will be necessary. We are excited for you to potentially join our team!

Mar 2, 2026

Solink
Full-time|Hybrid|Canada

Senior AI/ML Engineer

Location: Ottawa, ON | Hybrid
Department: Engineering
Reports To: Eugenia Kondratova, Senior Technical Manager, AI
Type: Full-Time | Permanent
Vacancy Status: This is an active role and we are currently hiring for this position.

About Solink

At Solink, we are dedicated to protecting what matters most. Our mission is to empower businesses with real-time operational insights by transforming video security. Our innovative cloud-based platform seamlessly integrates with existing camera systems, turning them into intelligent sensors that detect and interpret critical moments. This enables teams to make informed, data-driven decisions, thereby enhancing security and operational efficiency.

With over 30,000 locations in more than 32 countries, including well-known brands such as McDonald's and JYSK, Solink provides clarity when it is most needed. Our solutions assist businesses in minimizing shrinkage, optimizing their operations, and proactively addressing emerging threats.

We are experiencing rapid growth and have received accolades from Deloitte’s Fast 50™ and Fast 500™ and recognition as one of Ottawa’s Best Places to Work. We are just getting started!

The Role

As a Senior AI/ML Engineer at Solink, you will be responsible for designing, building, and deploying comprehensive machine learning solutions that drive our next generation of video analytics and operational intelligence. Your work will span research, model development, software engineering, and production integration, where you will own features that deliver significant value to our customers in both cloud and edge environments.

This position is perfect for individuals who excel in fast-paced environments, relish tackling complex technical challenges, and are driven by the opportunity to deliver reliable, scalable ML-powered features used in high-demand, real-world applications.

What You’ll Do
· Design, develop, train, and deploy ML models (including computer vision, LLMs/VLMs, and multimodal models) across cloud and edge/embedded environments.
· Own ML-driven features end-to-end: from proof of concept and experimentation to integration, deployment, instrumentation, and continuous improvement.
· Evaluate and integrate third-party AI/LLM/VLM services, balancing cost, performance, and scalability.

Mar 2, 2026

Bree
Full-time|On-site|Toronto

Bree serves Canadians who live paycheck to paycheck, focusing on those who need short-term credit solutions. With over 800,000 users, Bree has established a strong presence in the FinTech sector. The company maintains profitability, generates eight-figure annual revenue, and continues to see double-digit monthly growth. Bree has experienced zero voluntary employee turnover. After joining Y Combinator in 2021, Bree raised a $2 million seed round.

Role overview

The Software Engineer, Infrastructure will work to strengthen the reliability, scalability, and maintainability of Bree's data-driven systems. This position focuses on supporting a growing user base by improving system performance and stability.

What you will do
· Design, build, and maintain infrastructure to support Bree's data systems.
· Refactor code and improve system architecture to reduce technical debt.
· Optimize PostgreSQL databases through query tuning, indexing, and capacity planning.
· Implement Infrastructure as Code using tools such as Pulumi, Terraform, or AWS CloudFormation.
· Set up monitoring and alerting with tools like Grafana or Datadog.
· Enhance system observability by improving metrics, logging, and tracing.

Requirements
· Bachelor's degree in Computer Science or a related discipline.
· Experience building scalable, reliable, and maintainable infrastructure.
· Strong knowledge of PostgreSQL and hands-on database optimization skills.
· Familiarity with Infrastructure as Code concepts and tools.
· Experience with monitoring tools and practices.

Location

This position is based in Toronto.

Apr 23, 2026
Full-time|On-site|Vancouver, British Columbia, Canada

About this Opportunity

Join a global leader in networking that is transforming how businesses manage their networks. Our AI Core group is at the forefront, developing pioneering platforms across various domains such as Generative AI, AI Agents, RAG, Knowledge Bases, Data Mining, Anomaly Detection, and fine-tuning large language models. Here, innovation is not just welcomed; it is a core expectation.

The Role

As a pivotal AI ML Engineer, you will take on a leadership role in shaping our machine learning strategy. You'll be responsible for creating intelligent, high-performance multi-agent systems that can perceive, learn, and act in real-time.

Key Responsibilities
· Define and lead the technical vision for machine learning solutions across our product portfolio.
· Manage the complete software development lifecycle, overseeing everything from design and code reviews to deployment and operational management.
· Architect robust, scalable microservices, including both synchronous and asynchronous web services.
· Develop real-time inference pipelines for complex models leveraging tools like Triton, TensorRT, and mixed-precision computing.
· Mentor fellow engineers, establish technical direction, and cultivate a strong team culture.
· Promote engineering excellence, system resilience, and continuous improvement in operations.

Apr 10, 2026

Speechify
Full-time|Remote|Ottawa, Canada

Speechify aims to remove reading as a barrier to learning. Over 50 million people use Speechify’s text-to-speech tools to turn PDFs, books, Google Docs, news articles, and websites into audio. Users can read faster, retain more, and access information in ways that suit their needs. The product lineup includes apps for iOS, Android, Mac, Chrome, and the web. Recent recognition includes Chrome Extension of the Year from Google and Apple’s 2025 Design Award for Inclusivity.

Speechify is a fully distributed company with nearly 200 team members. The group brings together frontend and backend engineers, AI research scientists, and professionals from companies like Amazon, Microsoft, and Google. Team members also include PhD candidates from top programs such as Stanford and founders from high-growth startups including Stripe, Vercel, and Bolt.

Role Overview

The Data team within Speechify’s AI division is looking for a Software Engineer focused on Data Infrastructure & Acquisition. This position centers on data collection to support model training. The team combines infrastructure, engineering, and research to build high-quality, petabyte-scale datasets efficiently. This role offers the chance to contribute to projects that shape the future of Speechify’s products.

What You Will Do
· Find and connect new audio data sources to the ingestion pipeline.
· Maintain and improve cloud infrastructure for the ingestion pipeline, currently running on Google Cloud Platform (GCP) and managed with Terraform.
· Work closely with scientists to optimize for cost, throughput, and quality, enabling larger and richer datasets at lower costs for new models.
· Partner with the AI team and leadership to plan datasets that will support future consumer and enterprise offerings.

What We Look For
· BS, MS, or PhD in Computer Science or a related field.
· At least 5 years of professional software development experience.
· Skilled in bash and Python scripting in Linux environments.
· Comfortable with Docker and Infrastructure-as-Code practices, plus experience with at least one major cloud provider (GCP preferred).
· Experience with web crawlers and large-scale data processing is a plus.
· Strong organizational skills and ability to handle shifting priorities.
· Clear written and verbal communication skills.

Location

This is a remote role based in Ottawa, Canada.

Apr 20, 2026

Eleks
Full-time|Remote|Remote (Canada)

Join our innovative team at Eleks as an AI/ML Architect, where you will pioneer cutting-edge machine learning solutions. In this role, you will leverage your expertise in artificial intelligence and machine learning to design and implement robust architectures that drive our projects forward. You will collaborate with cross-functional teams to identify and solve complex problems, ensuring that our AI systems meet the highest standards of performance and reliability.

Mar 11, 2026
Full-time|On-site|Toronto / New York

About Our Team

Adaptive ML is an innovative AI startup specializing in the development of a Reinforcement Learning Operations (RLOps) platform designed to empower enterprises to tailor large language models for their specific needs, ensuring reliable deployment into production workflows with quantifiable results.

Our Success team plays a pivotal role within our Technical Staff. While the Technical Staff focuses on developing the core technology that drives Adaptive ML, the Success team is dedicated to helping our enterprise clients maximize the value of these technologies, particularly through our Adaptive Engine product. Adaptive Engine enables businesses to construct, evaluate, and deploy the most effective models tailored to their unique requirements. Our founding team has previously collaborated on creating state-of-the-art open large language models. Following a successful $20M seed funding round led by Index Ventures and ICONIQ in early 2024, we are actively working with our first enterprise customers, including Manulife, AT&T, Deloitte, and more to be announced soon.

About the Role

As an AI Engineer (Pre-Sales) at Adaptive ML, you will serve as the technical liaison for our sales team, assisting potential clients in understanding how the Adaptive Engine can address their most challenging issues. Your role will involve crafting and presenting impactful demonstrations, guiding customers through proof of concepts, and creating technical proposals that underscore the unique advantages of our platform and reinforcement learning capabilities.

This role is distinctly technical, offering you the chance to engage hands-on with the Adaptive Engine: developing demos, fine-tuning large language models, and initiating pilots that transition prospects into clients. Additionally, you will contribute to shaping the product roadmap by identifying and articulating customer needs and insights.

Please note: This is an in-person position based in our Toronto or New York offices.

Your Responsibilities
· Collaborate with the sales team to engage prospects, identifying business challenges and aligning them with Adaptive ML solutions.
· Design and present engaging technical demonstrations tailored to the specific use cases of customers.
· Lead and oversee proof-of-concept (POC) initiatives and pilots, ensuring technical success and achieving clear business objectives.
· Create prototype pipelines and fine-tuned large language models using the Adaptive Engine to illustrate production-ready solutions.

Sep 23, 2025

Xsolla, Inc.
Full-time|On-site|Montreal

Join our dynamic team at Xsolla as an AI Infrastructure Engineer. In this role, you will be responsible for designing and implementing robust AI infrastructure solutions that empower our gaming services. Your expertise in AI technologies and cloud computing will play a critical role in enhancing the performance and scalability of our systems.

Key responsibilities include developing AI models, optimizing data pipelines, and collaborating with cross-functional teams to integrate AI solutions into our existing frameworks. If you are passionate about artificial intelligence and eager to make an impact in the gaming industry, we want to hear from you!

Mar 26, 2026

Mechanical Orchard
Full-time|Remote|Canada (Remote)

Join us at Mechanical Orchard as an Infrastructure Software Engineer, where you will play a crucial role in designing and implementing robust software solutions that enhance our infrastructure capabilities. You will work closely with cross-functional teams to ensure seamless integration and optimal performance of our systems.

In this position, you will leverage your technical skills to troubleshoot and resolve complex software issues, while also contributing to the development of innovative solutions that drive our business forward.

Mar 26, 2026

Veeva Systems Inc.
Full-time|On-site|Canada - Toronto

Role overview

Veeva Systems is hiring a Senior Software Engineer focused on infrastructure in Toronto, Canada. This role centers on designing and building software that supports and improves our cloud-based platforms. The work directly impacts scalability and performance across our systems.

What you will do
· Design and implement software solutions for infrastructure needs
· Work closely with teams from different disciplines to strengthen our cloud platforms
· Contribute to projects that improve system scalability and performance

Apr 14, 2026
