Performance Engineer Inference jobs in Toronto – Browse 759 openings on RoboApply Jobs

Performance Engineer Inference jobs in Toronto

Open roles matching “Performance Engineer Inference” with location signals for Toronto. 759 active listings on RoboApply Jobs.

Cerebras Systems
Full-time|On-site|Toronto, Ontario, Canada

Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our revolutionary wafer-scale architecture delivers unparalleled AI compute power equivalent to dozens of GPUs on a single chip, combined with the ease of programming as if it were a single device. This innovative approach enables us to achieve industry-leading training and inference speeds, allowing machine learning practitioners to run extensive ML applications effortlessly, without the complexities associated with managing numerous GPUs or TPUs.

Cerebras is trusted by leading model labs, global enterprises, and pioneering AI-native startups. Notably, OpenAI recently announced a multi-year partnership with Cerebras, aimed at deploying 750 megawatts of scale, revolutionizing critical workloads with ultra high-speed inference.

Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, exceeding GPU-based hyperscale cloud inference services by more than 10 times. This significant enhancement in speed is redefining the user experience of AI applications, facilitating real-time iterations and amplifying intelligence through enhanced agentic computation.

About The Role
As a member of the inference performance team, you will work at the critical intersection of hardware and software, enhancing end-to-end model inference speed and throughput. Your focus will encompass low-level kernel performance debugging and optimization, system-level performance analysis, performance modeling, and the creation of tools for performance diagnostics and projections.

Responsibilities
- Develop performance models (kernel-level, end-to-end) to forecast the performance of state-of-the-art and client ML models.
- Optimize and troubleshoot our kernel micro code and compiler algorithms to enhance ML model inference speed, throughput, and compute utilization on the Cerebras WSE.
- Analyze and debug runtime performance at the system and cluster level.
- Create tools and infrastructure to visualize performance data collected from the Wafer Scale Engine and our compute cluster.

Feb 17, 2026
Cerebras Systems
Full-time|On-site|Toronto, Ontario, Canada

Cerebras Systems is at the forefront of AI innovation, creating the world’s largest AI chip, a staggering 56 times larger than traditional GPUs. Our revolutionary wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, paired with the simplicity of a unified programming interface. This unique approach enables us to achieve unparalleled training and inference speeds, empowering machine learning practitioners to execute large-scale ML applications effortlessly, without the complexities associated with hundreds of GPUs or TPUs.

Among our esteemed clientele are leading model laboratories, global enterprises, and pioneering AI-native startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to leverage 750 megawatts of scale to revolutionize key workloads through ultra-high-speed inference.

Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference provides the fastest Generative AI inference solution available today, boasting speeds over ten times faster than GPU-based hyperscale cloud services. This extraordinary increase in speed is reshaping the user experience of AI applications, enabling real-time iterations and enhancing intelligence through advanced agentic computation.

About The Role
Join our inference model team, dedicated to advancing state-of-the-art models by numerically validating and accelerating innovative concepts on our wafer-scale hardware. In this role, you will prototype architectural enhancements, construct performance evaluation pipelines, and translate quantitative insights into actionable changes that drive production success.

Key Responsibilities
- Prototype and benchmark innovative concepts such as new attention mechanisms, mixture of experts (MoE), speculative decoding, and other emerging advancements.
- Create agent-driven automation tools that design experiments, schedule runs, triage regressions, and prepare pull requests.
- Collaborate closely with compiler, runtime, and silicon teams, gaining a unique perspective on the complete software/hardware innovation stack.
- Stay current with the latest open- and closed-source models; execute them on wafer scale first to identify new optimization opportunities.

Feb 17, 2026
Cerebras Systems
Full-time|On-site|Toronto, Ontario, Canada

Cerebras Systems is revolutionizing AI technology with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture combines the immense computational power of multiple GPUs into a single chip while maintaining unparalleled programming simplicity. This allows us to provide extraordinary training and inference speeds, empowering machine learning users to seamlessly execute large-scale ML applications without the complexities of managing numerous GPUs or TPUs.

We proudly serve a diverse clientele, including leading model laboratories, global corporations, and innovative AI-centric startups. Notably, OpenAI recently formed a multi-year partnership with Cerebras, committing to deploy 750 megawatts of scale to enhance critical workloads with ultra-high-speed inference.

Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This remarkable speed transformation enhances user experiences and facilitates real-time iterations while augmenting intelligence through advanced agentic computation.

About The Role
The Inference Core Platform team is integral to Cerebras’ mission of delivering the world’s fastest AI inference. Our engineers develop the core software and hardware infrastructure that enables low-latency, high-speed, and high-throughput deployment on the Cerebras Wafer-Scale Engine (WSE). We oversee the entire stack, from model compilation and scheduling to custom hardware kernels and driver development.

The Platform Benchmarking team is crucial in enhancing the performance and scalability of AI inference on one of the most advanced computing systems ever developed. We spearhead the establishment of core inference capabilities and implement performance improvements at every development phase, from initial prototyping to full production deployment.

We seek enthusiastic engineers eager to redefine the boundaries of AI inference. If you're passionate about developing systems that measure, analyze, and optimize performance on a large scale, this is your chance to make a transformative impact on the future of AI.

Mar 18, 2026
OLIX
Full-time|On-site|Toronto

About OLIX
At OLIX, we are at the forefront of a technological revolution. The demand for AI is surging faster than any previous technology, leading to a significant gap in infrastructure. Conventional hardware designs have reached their limits, and the industry is in desperate need of innovation. We are pioneering a new era with our Optical Tensor Processing Unit (OTPU), which promises unparalleled performance and energy efficiency that current chips cannot match.

The Role
We are looking for a Senior Performance Modelling Engineer to take ownership of analytical and simulation models that guide the architecture and software development of our OTPU. Your expertise will be vital in creating functional simulators and high-fidelity, cycle-accurate models of our optical computing system. This position is essential for exploring design possibilities and delivering insights that will shape our software, hardware, and optical strategies. Ideal candidates will thrive in a dynamic environment at the intersection of hardware architecture, software tools, and machine-learning workload analysis, with a passion for data-driven decisions and rapid prototyping.

Responsibilities
- Project Ownership: Lead and execute projects within your team's roadmap that unlock critical technical and business milestones essential for OLIX's success.
- Collaboration: Partner closely with hardware, compiler, and ML framework teams to ensure that models accurately represent reality and meet performance targets.
- Functional Simulator: Design, develop, and maintain a functional simulator for the OTPU subsystem and its complete pipeline.
- Performance Simulator: Create and maintain architectural and cycle-accurate models of the OTPU subsystems and pipeline, identifying key throughput, latency, and utilization bottlenecks to propose architectural or scheduling solutions.
- Workload Analysis & Bottleneck Hunting: Use benchmarks (including LLMs, diffusion, and graph workloads) to gather detailed traces.
- Design-Space Exploration: Conduct extensive parameter sweeps using your functional models to analyze trade-offs and steer the software, hardware, and optical teams, presenting results in clear, quantitative analyses and design recommendations.

Feb 10, 2026
OLIX
Full-time|On-site|Toronto

About OLIX
At OLIX, we are pioneering the next technological revolution. The rapid expansion of AI technologies has exposed significant gaps in existing infrastructure, making it imperative to innovate beyond outdated hardware paradigms. Our flagship product, the Optical Tensor Processing Unit (OTPU), redefines performance and energy efficiency, positioning us as leaders in this transformative era.

Role Overview:
As the Engineering Manager for Performance Modelling, you will spearhead a dedicated team of six engineers tasked with defining and validating OTPU architecture specifications across various generations. You will collaborate extensively with compiler, ASIC, product, and business development teams to ensure alignment between architectural decisions, customer requirements, and software-hardware integration timelines.

We seek a dynamic engineering manager with proven experience in leading high-performing teams to achieve significant business outcomes. Your technical acumen and leadership abilities will be key to fostering a collaborative environment that drives innovation and execution.

Key Responsibilities:
- Develop and execute the performance modelling roadmap at OLIX, ensuring alignment with core business objectives to optimize customer satisfaction and business results.
- Collaborate with ASIC and compiler teams to articulate and validate OLIX's architectural specifications through rigorous performance modelling across OTPU iterations.
- Cultivate a thriving, high-performance culture that emphasizes innovation, agility, and operational excellence.
- Promote a standard of technical excellence through effective design review practices and clear ownership of technical responsibilities in a fast-paced hardware/software co-design setting.
- Guide strategic technical decisions to harmonize long-term sustainability with immediate project deliverables.

Qualifications:
- Demonstrated engineering leadership in managing complex, long-term projects in dynamic environments, with a strong sense of accountability for team performance.
- Extensive experience in performance modelling, architecture validation, and cross-functional collaboration.

Feb 10, 2026
Cerebras Systems
Full-time|On-site|Toronto, Ontario, Canada

Cerebras Systems is revolutionizing AI with the largest AI chip globally, measuring 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of numerous GPUs on a single chip, combining unparalleled performance with the simplicity of a single device. This unique approach enables Cerebras to provide leading-edge training and inference speeds, allowing machine learning professionals to effortlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.

Cerebras counts among its esteemed clients top-tier model laboratories, major global enterprises, and pioneering AI-native startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power to transform critical workloads with ultra-high-speed inference.

The groundbreaking wafer-scale architecture of Cerebras Inference offers the fastest Generative AI inference solution worldwide, exceeding GPU-based hyperscale cloud inference services by over ten times. This dramatic improvement in speed is reshaping the user experience of AI applications, facilitating real-time iterations and enhancing intelligence through advanced agentic computation.

About The Role
Join Cerebras as a Performance Engineer within our innovative Runtime Team. Our cutting-edge CS-3 system, powered by a network of modern and robust x86 machines, has established new benchmarks in high-performance ML training and inference solutions. Leveraging a chip the size of a dinner plate with 44GB of on-chip memory, this role will challenge and expand your expertise in optimizing AI applications and managing computational workloads primarily on the x86 architecture that supports our Runtime driver.

Feb 17, 2026
Nu
On-site|On-site|Canada, Toronto

About Us
Nubank stands as one of the foremost digital financial platforms globally, boasting over 122 million customers across Brazil, Mexico, and Colombia. Driven by our mission to combat complexity and empower individuals, we are on a transformative journey to reshape financial services in Latin America, and this is merely the beginning of the innovative future we are forging.

As a publicly traded company on the New York Stock Exchange (NYSE: NU), we integrate proprietary technology, data intelligence, and a streamlined operational model to provide financial products that are intuitive, accessible, and human-centered.

Our influence has been acknowledged through prestigious global rankings, including Time 100 Companies, Fast Company’s Most Innovative Companies, and Forbes World’s Best Bank. Explore our career opportunities at https://international.nubank.com.br/careers/

About the Role
Join our Systems Performance Team as a Senior Systems Engineer. This team is an integral part of the Computing Squad (Foundation / Runtime Platforms). You will contribute to developing advanced diagnostic tools and conducting in-depth analyses aimed at minimizing latency, reducing infrastructure costs, and enhancing service efficiency. Your role will involve executing performance assessments and identifying systemic bottlenecks within one of the largest JVM-based microservice architectures globally, collaborating with components ranging from the Linux Kernel to wide-scale cloud orchestration.

Feb 3, 2026
Waabi
Full-time|On-site|Toronto, ON

Waabi seeks a Senior or Staff Software Engineer to focus on high-performance onboard algorithms in Toronto, ON. This position centers on designing and building algorithms that play a direct role in shaping autonomous driving technology.

Role overview
The engineer in this role will create efficient and reliable software solutions. These efforts aim to improve both vehicle performance and safety, contributing to the advancement of Waabi's autonomous driving systems.

What you will do
- Design algorithms for onboard vehicle systems
- Develop software that enhances performance and safety
- Work on solutions that have a direct impact on autonomous driving capabilities

Location
This role is based in Toronto, ON.

Apr 25, 2026
Constant Contact
Full-time|CA$210K/yr - CA$262K/yr|On-site|Toronto, Ontario, Canada

At Constant Contact, we pride ourselves on our dynamic and dedicated team, where each member takes initiative and strives to create a meaningful impact. Our mission is to empower individuals and businesses to fulfill their dreams. Here, every contribution is essential in supporting entrepreneurs, small businesses, non-profits, and individuals with the tools and resources they need to thrive online. We are fueled by challenges and limitless opportunities, and we are just beginning!

About the Role
We are on the lookout for a Director of Performance Marketing who will take the lead in driving paid customer acquisition and developing measurement systems to enhance performance visibility and scalability. This role embodies a product-oriented approach: rather than merely managing channels, you will design, instrument, and iterate customer-facing products while being accountable for the outcomes.

You will be responsible for the performance engine across both self-service and sales-assisted approaches, catering to SMB, mid-market, and international segments. Your challenge will be to balance the rapid volume demands of trial-driven acquisition with the precision required for a sales-supportive funnel. You will define channel objectives and analyze data to ensure effectiveness.

Key Responsibilities

Channel Strategy as Product Ownership
- Establish the purpose, target audience, and success metrics for each performance channel before embarking on optimization. Each channel serves a specific role: identifying who it benefits, the stage of the customer journey it influences, and how it transitions to subsequent steps.
- Manage a comprehensive channel roadmap for paid search, paid social, programmatic, affiliate, and emerging channels, establishing clear hypotheses regarding where value is generated and the conditions that would prompt scaling, restructuring, or discontinuation.
- Collaborate with Product and Web teams to design the post-click experience, treating landing pages, trial flows, and onboarding entry points as integral extensions of the channels.
- Tailor channel strategies according to segment: self-serve SMB acquisition, mid-market sales-assisted motions, and international markets each necessitate unique channel designs, messaging frameworks, and conversion criteria.

Measurement Systems & Attribution Ownership
- Collaborate with the central analytics team to co-develop the measurement framework, including attribution models, incrementality testing schedules, LTV cohort analysis, and the signal quality standards that support value-based bidding.
- Define essential business requirements...

Apr 7, 2026
System Canada Technologies
Contract|CA$120K/yr|On-site|Toronto

Join System Canada Technologies as a Senior Performance Test Manager. In this vital role, you will oversee performance testing initiatives to ensure optimal system reliability and user experience. Collaborate with cross-functional teams to define testing strategies, develop performance benchmarks, and implement solutions that drive continuous improvement.

Dec 27, 2013
System Canada Technologies
Full-time|On-site|Toronto

Join our team as a Performance Tester, where you will take on critical responsibilities in ensuring the highest standards of quality in our software products. You will conduct performance testing to identify and resolve bottlenecks, optimize performance, and enhance user experience.

Dec 10, 2015
System Canada Technologies
Contract|On-site|Toronto

We are seeking a highly skilled Senior Application Performance Tester to join our dynamic team at System Canada Technologies. In this role, you will be responsible for assessing application performance across various platforms, identifying bottlenecks, and providing actionable insights to enhance overall system efficiency. You will collaborate closely with developers and project managers to ensure high-quality deliverables.

Jun 13, 2012
Loopio
Full-time|On-site|Toronto, ON Hub

Elevate Your Career with Loopio!

Join Loopio as the Director of Performance Marketing, a pivotal role within our dynamic Marketing team. We are in search of a strategic leader who excels in data-driven decision-making and innovative marketing approaches. Collaborating with various teams, including brand, product marketing, content, digital, alliances, and RevOps, you will spearhead the development of cutting-edge programs and campaigns.

In this influential position, you will serve as a key architect of our growth strategy, positioned at the crossroads of marketing, sales, and revenue operations. Reporting to the Sr. Director of Growth, you will work closely with cross-functional partners to design and implement campaigns that enhance our pipeline and accelerate our mission to empower companies in delivering outstanding responses.

Key Responsibilities
- Define and oversee the demand-generation strategy, establishing vision, goals, metrics, and roadmaps for multichannel campaigns (digital, events, paid media, partnerships, etc.).
- Lead integrated campaign execution to attract, nurture, and convert prospects (from awareness to MQL to SQL), in close collaboration with Content, Product Marketing, and Sales Development teams.
- Collaborate with Sales Leadership and Revenue Operations to ensure alignment on target markets, buyer personas, lead scoring, qualification criteria, and hand-off processes.
- Manage marketing automation systems, CRM integration, attribution modeling, and analytics to continuously assess performance, pipeline contribution, and ROI optimization.
- Oversee budget management, forecasting, and resource allocation for demand generation, optimizing expenditures across channels and geographies while reporting outcomes to senior leadership.
- Build, mentor, and lead a high-performance demand generation/growth team, establishing clear KPIs, driving process improvements, and fostering a culture of innovation and learning.
- Own the comprehensive growth engine, encompassing performance marketing (paid acquisition) and lifecycle marketing, setting the standard for integrated growth strategies.
- Stay informed on industry trends, emerging technologies (e.g., ABM, AI in marketing), and the competitive landscape to drive differentiation and innovation in growth and acquisition initiatives.

Qualifications
- A strong sense of urgency and accountability, demonstrating this ethos within your team and the larger organization.
- Proven experience in a leadership role within performance marketing, particularly in demand generation strategies.
- Excellent analytical skills, with the ability to interpret data and make informed decisions.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- A passion for continuous learning and staying ahead of industry trends.

Mar 24, 2026
Tenstorrent
Full-time|Hybrid|Toronto, Ontario, Canada

At Tenstorrent, we are at the forefront of transformative AI technology, setting new standards for performance, user-friendliness, and cost-effectiveness. As AI reshapes the computing landscape, our solutions are designed to integrate advancements in software models, compilers, platforms, networking, and semiconductors. Our talented team has engineered a high-performance RISC-V CPU from the ground up, driven by a shared enthusiasm for AI and a commitment to developing the most advanced AI platform available. We prioritize collaboration, curiosity, and tackling challenging problems. Our team is expanding, and we invite individuals of all experience levels to contribute.

The Tensix team is focused on creating a high-performance compute fabric that supports Tenstorrent’s AI and ML workloads. In the role of AI Performance Architect, you will model, analyze, and enhance the execution of real AI workloads on the Tensix architecture, influencing future hardware features and ensuring that each design choice results in tangible performance improvements. This position bridges architecture, software, and RTL to maximize efficiency and scalability across next-generation AI systems.

This position is hybrid, based out of Toronto, ON; Austin, TX; or entirely remote.

We are open to candidates with various levels of experience for this position. During the interview process, candidates will be evaluated to determine their appropriate level, and offers will be tailored accordingly, which may differ from the level indicated in this posting.

Mar 24, 2026
Senior Manager of Performance Marketing

Opendoor Technologies Inc.

Full-time|On-site|Toronto

Join Opendoor Technologies Inc. as a Senior Manager of Performance Marketing, where you will lead innovative marketing strategies to drive customer acquisition and retention. In this pivotal role, you will oversee performance marketing campaigns across various digital platforms, leveraging data analytics to optimize performance and maximize ROI.

Your expertise will guide our marketing team in executing high-impact initiatives, collaborating with cross-functional teams, and advancing our brand presence in the real estate market.

Mar 26, 2026
venn
Full-time|On-site|Toronto

Role Overview
venn is hiring a Performance Marketing Manager in Toronto. This role shapes digital marketing strategy by analyzing campaign data, refining tactics, and working closely with teams across the company. The goal: improve brand visibility and drive engagement that supports growth targets.

What You Will Do
- Review and interpret performance metrics to inform marketing decisions
- Optimize digital campaigns for better results
- Collaborate with colleagues in other departments to support cohesive marketing efforts
- Guide marketing initiatives to help meet company growth goals

Apr 20, 2026
Paytm
Full-time|On-site|Toronto, Canada

Role Overview
Paytm is hiring a Staff AI Platform Engineer in Toronto, Canada, with a focus on Inference and Agentic Systems. This role centers on designing and improving AI-powered platforms that support intelligent, agent-like features for users and business applications.

What You Will Do
- Work with a skilled team to build and refine AI solutions that support agentic behaviors and inference-driven capabilities.
- Apply experience in machine learning, software engineering, and system architecture to develop and scale reliable AI platforms.
- Contribute technical leadership and hands-on expertise to projects that advance Paytm’s AI offerings.

Key Skills
- Deep knowledge of machine learning and inference methods
- Strong background in software engineering
- Experience designing and maintaining complex system architectures

Apr 16, 2026
System Canada Technologies
Contract|On-site|Toronto

We are seeking a highly skilled Senior Application Performance Tester to join our dynamic team at System Canada Technologies. In this role, you will be responsible for evaluating application performance, identifying bottlenecks, and providing actionable insights to enhance system efficiency. Your expertise will be crucial in ensuring our applications meet the highest performance standards and deliver an exceptional user experience.

As a Senior Application Performance Tester, you will collaborate with cross-functional teams to design and execute performance tests, analyze results, and recommend improvements. Your analytical skills and attention to detail will play a pivotal role in driving our commitment to quality and performance.

Dec 26, 2012
Equitable Bank
Full-time|On-site|Toronto

Become a Catalyst for Change
At Equitable Bank, we don't conform to traditional banking norms. Instead, we embrace creativity to deliver innovative banking solutions tailored for Canadians.

Our approach involves a dynamic team of curious and agile thinkers who dare to challenge the status quo. If you're enthusiastic about reshaping the future of banking while enjoying your work, this could be the exciting opportunity you’ve been waiting for.

As a growing entity, we proudly serve over 780,000 customers nationwide through EQ Bank, recognized as Canada’s Challenger Bank™. With more than 50 years of history, our wholly-owned subsidiary, Concentra Bank, supports credit unions across the country, collectively serving over six million members. We manage an impressive $138 billion in assets and are dedicated to driving transformative change in Canadian banking to enhance the lives of our customers. Our EQ Bank digital platform has consistently been highlighted as one of the top banks in Canada on the Forbes World's Best Banks list since 2021.

Your Role:
Reporting to the Senior Manager of Performance Marketing, the Performance Marketing Manager will spearhead the performance media strategy, planning, and optimization initiatives for EQ Bank’s Personal Banking and Small Business Banking portfolios, while also providing crucial support for Equitable Bank’s Reverse Mortgage product.

In this role, you will be accountable for performance metrics across paid media channels, collaborating closely with agency and platform partners to fine-tune campaigns and enhance customer acquisition efficiency. Although daily execution will largely be handled by agency partners, you will need to possess a solid understanding of the platforms to step in when necessary and make informed optimization decisions.

The ideal candidate is a collaborative, data-focused marketer who excels in an agile, performance-driven environment and can effectively balance short-term optimization with long-term growth strategies.

Feb 20, 2026
Oliver USA
Full-time|On-site|Toronto, Ontario

Join our dynamic team at Oliver USA as a GenAI Creative Optimization and Performance Analyst. In this role, you will leverage cutting-edge generative AI technologies to enhance creative strategies and optimize performance metrics. Your analytical skills will drive insights that shape our approach to innovative solutions and client success.

Mar 25, 2026
