AI Inference Engineer at Perplexity | San Francisco

Perplexity | San Francisco

On-site | Full-time

Experience Level

Entry Level

Qualifications

Proficiency in machine learning systems and deep learning frameworks such as PyTorch, TensorFlow, and ONNX.
Familiarity with prevalent LLM architectures and inference optimization methods, including continuous batching and quantization.
Solid understanding of GPU architectures, along with experience in GPU kernel programming using CUDA.

About the job

Key Responsibilities

Design and develop APIs for AI inference that cater to both internal and external stakeholders.
Conduct benchmarking and identify bottlenecks within our inference stack to enhance performance.
Ensure the reliability and observability of our systems while promptly addressing any outages.
Investigate innovative research and implement optimizations for LLM inference.

About Perplexity

Perplexity is a forward-thinking technology company based in San Francisco, dedicated to harnessing the power of artificial intelligence to transform industries. Our innovative team thrives on collaboration and creativity, driving advancements in AI and machine learning.

Similar jobs

Frontend Engineer - Design Systems at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a Frontend Engineer specializing in Design Systems, where you will play a pivotal role in transforming the future of online search and interaction. In this innovative position, you will be at the forefront of developing cutting-edge AI products.

Tech Stack: Tailwind | React | TypeScript | CSS

Key Responsibilities

Collaborate with the design systems team to create an outstanding user interaction layer for all features, including both reusable components and foundational elements of generative UI.
Enhance and refine the components that constitute the core of Perplexity's frontend.
Continuously seek ways to elevate interaction quality, aesthetics, and team productivity.

Essential Qualifications

Proven experience in building and maintaining user interface systems on a large scale.
Solid coding fundamentals with some cross-stack development experience.
Skilled at creating foundational systems for others to build upon.
Hands-on experience with highly interactive React applications that utilize strongly typed code.
Deep understanding of design and UI patterns applicable at scale.
A genuine passion for prototyping, experimentation, and crafting accessible user experiences.
Demonstrates an extreme ownership mentality.
Takes pride in precision and attention to detail.
At least 4 years of relevant industry experience.

At Perplexity, AI is central to our mission. We expect all team members to effectively leverage AI in their roles. During the interview process, we will assess your thought process and decision-making abilities, which are crucial to our AI development. Please refrain from using AI tools unless instructed otherwise.

Nov 7, 2025

IT Systems Administrator at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as an IT Systems Administrator and play a pivotal role in transforming how users engage with the internet. As an early addition to our innovative team, you will have a unique opportunity to build and optimize our technology infrastructure from the ground up, ensuring seamless operations and cutting-edge performance.

This position requires you to work in-person at our bustling San Francisco office.

Key Responsibilities

Procure, maintain, and manage computers, networking devices, and office technologies to ensure operational excellence.
Oversee and optimize corporate software systems for enhanced productivity.
Enhance and manage our Mobile Device Management (MDM) infrastructure.
Implement and uphold security policies and procedures to safeguard our digital assets.
Provision and administer user accounts, ensuring appropriate access permissions.
Deliver responsive technical support and effectively troubleshoot complex issues.
Lead IT initiatives aimed at improving endpoint management, security, and overall infrastructure.
Provide technical training and support on IT systems to staff, both in-person and remotely.

Apr 2, 2026

Software Engineer - Security at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

About the Role

Join Perplexity as a dynamic Software Engineer specializing in security, where you will play a pivotal role in developing and enhancing the software, automation, and systems that drive our security operations. This position focuses on engineering innovative security tools and AI-driven agents aimed at improving our detection and response capabilities, vulnerability management, and overall security posture across our products and infrastructure.

Responsibilities

Design, build, and maintain software and automation solutions that enhance our detection and response capabilities, including alert enrichment, triage workflows, and investigation tools.
Implement and refine internal AI agents and security bots that facilitate monitoring, investigations, reporting, and other security operations tasks.
Develop and manage systems and workflows that support our bug bounty and vulnerability disclosure program, covering intake, triage, prioritization, and remediation tracking.
Collaborate with product and engineering teams to perform threat modeling on new features and systems, propose mitigations, and integrate security guardrails into designs and implementations.
Contribute to secure-by-default libraries, services, and patterns that empower teams to build secure features effortlessly.
Integrate security signals from cloud services, endpoints, SaaS, and applications into unified pipelines and data models that bolster detection and analysis.
Automate processes to minimize manual effort in incident response, containment, and remediation.
Work closely with security engineers and fellow software engineers to review designs and code, continuously enhancing our security tools and platforms.

Qualifications

A minimum of 4 years of experience as a software engineer, particularly in developing security-related tools, platforms, or automation, or in a security engineering role with a strong emphasis on software development.
Proficiency in at least one major programming language (e.g., Python, Go, or TypeScript) with experience in building production services, command-line interfaces, or internal tools.
Experience with integration of security-relevant systems such as logging pipelines, SIEMs, EDR, cloud APIs, or identity platforms.
Hands-on experience in threat modeling, secure design, or conducting application security reviews for services or features.
Experience in operating or contributing to bug bounty or vulnerability management programs is a plus.

Dec 2, 2025

iOS Engineer at Perplexity | San Francisco

Perplexity AI

Full-time|On-site|San Francisco

Join Perplexity AI as a skilled iOS Engineer and play a pivotal role in transforming the way users navigate the web. You'll be instrumental in developing and enhancing Comet, our innovative browser designed specifically for iOS.

We seek a candidate with exceptional programming expertise, a keen interest in artificial intelligence and large language models, and a dedication to providing an outstanding user experience supported by a sophisticated user interface.

Key Responsibilities

Craft a high-performance native iOS application that will be enjoyed by millions globally.
Maintain a rigorous standard of quality in both user and developer experiences.
Collaborate closely with design teams to create fast, intuitive user interfaces.
Engage with data science and machine learning teams to evaluate and enhance the overall user journey.
Partner with infrastructure and QA teams to streamline deployment processes, including testing, release, and monitoring.

Qualifications

A minimum of 5 years of industry experience.
Solid understanding of Swift and proven experience with a modern iOS tech stack, including SwiftUI (iOS 16+) and UIKit.
A passion for creating beautiful user interfaces, excellent user experiences, and writing reusable, testable code.
Strong grasp of low-level details and the ability to profile and optimize app performance.
Comfortable working in a small, agile team, demonstrating ownership and initiative.
A genuine enthusiasm for iOS development and exploring the latest advancements in iOS and iPadOS.
(Bonus) Experience in browser development is a plus.

Feb 19, 2026

Application Security Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Perplexity is on the lookout for an exceptional and proactive Application Security Engineer to enhance our innovative security team. Join us in transforming how individuals search and engage with the internet. You will be instrumental in developing systems, tools, and processes that seamlessly integrate security for developers, fostering rapid innovation while safeguarding our users on a large scale.

Key Responsibilities

Design and deploy scalable, developer-friendly security solutions that seamlessly incorporate into engineering workflows.
Lead threat modeling exercises, design evaluations, and code assessments for new features and significant product launches.
Develop and enhance secure-by-default frameworks for authentication, authorization, input validation, and secrets management.
Create and integrate automated security tools within CI/CD pipelines (including linters, dependency scanners, and policy enforcement).
Collaborate with product and engineering teams to address vulnerabilities and contribute to incident response and postmortems.
Oversee, manage, and enhance our third-party penetration testing engagements and bug bounty program, working closely with external security researchers to detect and fix vulnerabilities.
Stay updated on prevalent threats and attack strategies, driving the continuous improvement of our application security posture.

May 14, 2025

Software Engineer - AI Platform at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a skilled Software Engineer, where you will play a pivotal role in developing the next-generation AI Foundation and Platform. Our mission is to transform how individuals search and engage online. In this exciting position, you will contribute to building Perplexity's comprehensive AI data, evaluation, and personalization infrastructure, which underpins nearly all of our agent products.

Technology Stack: Spark | AWS Data Stack (S3, RDS, DynamoDB, Docker, EKS, Kinesis) | PyTorch | Databricks | Snowflake | LLM APIs

As we continue to expand our user base and diverse use cases, our data stack ensures that millions around the globe receive fast, personalized answers.

Sep 19, 2025

AI Security Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

At Perplexity, we are on the lookout for a talented and experienced AI Security Engineer to bolster our security team. This pivotal role involves safeguarding cutting-edge AI systems from adversarial threats. You will be responsible for creating and implementing strong security measures for self-hosted models, LLM APIs, agents, MCPs, and the essential AI infrastructure. Your expertise will empower our developers with the necessary tools and guidance, enabling them to innovate while ensuring that AI security remains a top priority.

Our technology stack is comprised of Python, NextJS, TypeScript, Docker, AWS, Kubernetes, and PostgreSQL.

Jul 30, 2025

Engineering Manager - AI Inference at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

About the Role

We are seeking a talented Inference Engineering Manager to spearhead our AI Inference team at Perplexity. This is a remarkable opportunity to design and expand the infrastructure that drives Perplexity's innovative products and APIs, catering to millions of users with cutting-edge AI capabilities.

You will take charge of the technical direction and implementation of our inference systems while cultivating and leading a high-caliber team of inference engineers. Our technology stack encompasses Python, PyTorch, Rust, C++, and Kubernetes. You will play a crucial role in architecting and scaling the large-scale deployment of machine learning models for Perplexity's Comet, Sonar, Search, and Deep Research products.

Why Perplexity?

Develop state-of-the-art systems that are among the fastest in the industry using leading-edge technology.
Engage in high-impact work within a smaller team, enjoying considerable ownership and autonomy.
Seize the chance to create infrastructure from the ground up instead of maintaining outdated systems.
Work across the entire spectrum: minimizing costs, scaling traffic, and advancing the capabilities of inference.
Make a significant impact on the technical roadmap and team culture at a rapidly expanding company.

Responsibilities

Lead and nurture a high-performing team of AI inference engineers.
Develop APIs for AI inference utilized by both internal and external clients.
Design and scale our inference infrastructure for enhanced reliability and efficiency.
Benchmark and resolve bottlenecks across our inference stack.
Drive large sparse/MoE model inference at rack scale, including sharding strategies for extensive models.
Innovate by developing inference systems that support sparse attention and disaggregated pre-fill/decoding serving.
Enhance the reliability and observability of our systems and lead incident response efforts.
Make technical decisions regarding batching, throughput, latency, and GPU utilization.
Collaborate with ML research teams on model optimization and deployment.
Recruit, mentor, and develop engineering talent.
Establish team processes, engineering standards, and operational excellence.

Qualifications

5+ years of engineering experience, with at least 2 years in a technical leadership or management capacity.
Proficiency in programming languages and tools such as Python, PyTorch, Rust, and C++.
Experience with Kubernetes and cloud infrastructure.
Strong understanding of machine learning model deployment and optimization.
Exceptional problem-solving and communication skills.

Jan 18, 2026

Senior Backend Software Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a Senior Backend Software Engineer and help transform how users engage with the internet. As a key member of our dynamic team, you will lead the design, implementation, and scaling of backend systems that drive our web, mobile, and browser applications.

Our Technology Stack: Python | Go | Rust | TypeScript | FastAPI | PostgreSQL | Redis | Docker | vLLM | AWS

Teams Hiring

File Agent
The File Agent team is at the forefront of building a scalable and secure platform for intelligent file editing and processing. Your expertise will help design the infrastructure and APIs that empower agents to autonomously edit and generate files across various formats.

Enterprise Growth
The Enterprise Growth team develops core platform capabilities that ensure Perplexity is a trusted solution for enterprise customers. This includes managing enterprise authentication, onboarding processes, and providing in-depth admin control and visibility.

Growth
The Growth team influences how millions interact with Perplexity by rapidly experimenting and implementing new features aimed at enhancing user experience and promoting user retention and revenue growth.

Commerce
The Commerce team is responsible for the complete commerce stack, including payments infrastructure and monetization strategies. Your role will involve scaling billing systems across consumer and enterprise plans.

Mar 3, 2026

Software Engineer - Computer Monetization at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Perplexity seeks a Software Engineer in San Francisco to focus on computer monetization. This position involves developing and enhancing software that drives the company’s monetization strategies.

Role overview

This role centers on building and refining systems that support Perplexity’s revenue efforts. Projects in this area have a direct impact on the company’s growth within the technology sector.

What you will do

Develop software solutions that contribute to monetization goals.
Refine and improve existing systems to optimize revenue streams.
Work on projects that influence Perplexity’s expansion and success.

Location

This position is based in San Francisco.

Apr 23, 2026

Senior iOS Software Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity AI, a pioneering company in the field of AI-driven search, as a Senior iOS Engineer. You will play a crucial role in transforming how users interact with the internet by developing innovative features and enhancing the performance of our iOS application.

We are seeking a talented individual with a solid programming background, enthusiasm for search technologies and large language models, and a strong commitment to crafting exceptional user experiences complemented by an elegant user interface.

Nov 29, 2023

Innovative Data Scientist at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

At Perplexity, we embody the future of AI, bringing transformative solutions to people who demand more. Our data team is at the forefront of this revolution, strategically integrating AI into every facet of our operations.

We seek a passionate individual with a strong background as a data scientist, analytics engineer, or data engineer. You understand the significance of key metrics, can expertly design A/B tests that address core questions, dive deep into data models to solve discrepancies, and are eager to take on the challenge of building AI systems that will revolutionize the data science landscape.

This is not just another text-to-SQL bot or a simple dashboard. You will create AI agents capable of conducting comprehensive analyses autonomously, from hypothesis formation and query execution to result interpretation and actionable recommendations. Your work will ensure that our entire data warehouse is accessible to AI systems, enabling precise queries across the board. You will develop self-healing data pipelines that proactively identify and resolve issues before they disrupt workflows. In doing so, you will empower our small data team to operate with the efficiency and output of a much larger organization.

Join a forward-thinking data team that is already leveraging AI to enhance its processes, with full support from leadership to expand these efforts. Together, we will build a world-class team focused on creating scalable systems, innovative tools, and an AI-centric working environment that not only elevates our standards but also drives the entire industry forward.

Feb 20, 2026

Product Marketing Manager at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

At Perplexity, we are at the forefront of transforming how organizations achieve their goals. We are on the lookout for a dynamic Product Marketing Manager to become an integral part of our team, acting as the key connection between our innovative products and their market influence. In this role, you will craft compelling narratives that elevate the perception of our offerings, stimulating customer engagement and driving sustainable growth.

If you possess a knack for simplifying complexity, excel at turning uncertainty into a clear vision, and aspire to create a robust marketing framework that treats product launches as an ongoing strategy rather than isolated events, then this opportunity is tailored for you.

Jan 15, 2026

AI Research Lead at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as an AI Research Tech Lead, where you'll spearhead our research initiatives and oversee the advancement of our proprietary Online LLMs, the Sonar models. In this pivotal leadership position, you will define the overarching research strategy across various modalities, mentor a talented team of researchers, and leverage our extensive query/answer dataset to enhance Sonar model performance, delivering a state-of-the-art Online LLM experience for our users.

Key Responsibilities

Research Leadership & Strategy
Establish and implement the overarching research strategy across diverse modalities, including post-training LLMs for agent trajectories and future mid-training projects.
Lead the strategic planning and roadmap development to enhance Sonar model functionalities.
Innovate in supervised and reinforcement learning techniques aimed at optimizing query answering.
Collaborate with executive leadership to align research goals with product and business strategies.

Team Development & Mentorship
Guide and mentor a team of AI research scientists and engineers, nurturing their technical and professional development.
Set the long-term research direction for the team, encompassing various modalities.
Lead the recruitment and onboarding of new research talent.
Foster a collaborative atmosphere that promotes knowledge sharing and innovative thinking.

Technical Excellence
Post-train cutting-edge LLMs focused on query answering using advanced supervised and reinforcement learning techniques.
Own and enhance the complete data, training, and evaluation pipelines necessary for LLM post-training.
Deliver Sonar models that achieve top-notch query answering performance.
Lead research efforts into agent trajectories and multi-modal capabilities.
Steer the technical roadmap for future mid-training investments.

Cross-Functional Collaboration
Collaborate closely with engineering teams to integrate Sonar models into our products.
Work with product teams to discern user needs and translate them into research priorities.
Partner with data teams to effectively utilize our unique query/answer dataset.
Communicate research progress and findings to stakeholders and broader teams.

Jul 21, 2025

Business Development Representative at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Your Role

Accelerate Revenue and Identify Opportunities
Manage both inbound and outbound tasks to enhance pipeline growth.
Assess inbound enterprise prospects utilizing BANT criteria to pinpoint high-value opportunities for transfer to Account Executives.
Implement targeted outbound initiatives to engage specific industry sectors.

Strategic Outreach and Market Expansion
Leverage social selling techniques to promote the Perplexity brand: creating content, engaging with audiences, and exploring innovative formats to remain memorable to prospects.
Conduct research and map accounts within designated verticals to establish thorough prospect lists.
Craft vertical-specific messaging that addresses industry challenges and showcases Perplexity Enterprise solutions.
Contribute to top-of-funnel strategies.

Sales Operations and Excellence
Ensure data integrity in Salesforce by meticulously documenting all prospect interactions and qualification criteria.
Respond promptly to inbound inquiries with consultative, value-focused communication.
Develop and refine outbound sequences that consistently generate pipeline results.

Jan 22, 2026

Solutions Product Marketing Manager at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

At Perplexity, we are revolutionizing the way enterprises integrate AI into their operations. We are in search of a talented Solutions Product Marketing Manager who will take charge of our go-to-market strategy for key industries including Finance, Healthcare, Legal, and Consulting. In this pivotal role, you will delve into buying behaviors, use cases, and competitive landscapes, transforming insights into robust campaigns, engaging content, and effective sales tools that consistently drive success across these sectors. This position merges the realms of marketing, sales strategy, and product innovation.

Your Responsibilities:

Develop and manage an integrated vertical go-to-market strategy that encompasses ongoing initiatives, not just isolated campaigns.
Analyze how enterprise buyers in each sector assess, acquire, and implement AI solutions, including identifying personas, decision-making processes, and competitive options.
Create and execute vertical awareness campaigns utilizing various channels such as paid advertising, organic outreach, and proprietary content, including webinars and tailored landing pages.
Equip our enterprise sales teams with comprehensive battle cards, objection handling strategies, tailored messaging, and deal-stage content that accelerates revenue generation.
Establish thought leadership that positions Perplexity as the authoritative voice in AI adoption for each targeted industry.
Track and evaluate vertical performance metrics: sourced pipeline, conversion rates, and engagement, making iterative improvements based on data analysis.

Apr 7, 2026

AI Inference Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join our dynamic team at Perplexity as an AI Inference Engineer, where you will be at the forefront of deploying cutting-edge machine learning models for real-time inference. Our tech stack includes Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes, providing you with a chance to work on large-scale applications that make a real impact.

Key Responsibilities

Design and develop APIs for AI inference that cater to both internal and external stakeholders.
Conduct benchmarking and identify bottlenecks within our inference stack to enhance performance.
Ensure the reliability and observability of our systems while promptly addressing any outages.
Investigate innovative research and implement optimizations for LLM inference.

Jun 10, 2024

AI Infrastructure Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join the innovative team at Perplexity as an AI Infrastructure Engineer. In this role, you will leverage your expertise in Kubernetes, Slurm, Python, C++, and PyTorch, primarily utilizing AWS. Collaborate closely with our Inference and Research teams to design, deploy, and optimize our extensive AI training and inference clusters.

Responsibilities

Architect, deploy, and manage scalable Kubernetes clusters tailored for AI model inference and training workloads.
Oversee and enhance Slurm-based HPC environments for distributed training of large language models.
Create robust APIs and orchestration systems for training pipelines and inference services.
Implement effective resource scheduling and job management systems across diverse compute environments.
Evaluate system performance, identify bottlenecks, and implement enhancements across both training and inference infrastructures.
Develop monitoring, alerting, and observability solutions specifically designed for ML workloads running on Kubernetes and Slurm.
Quickly respond to system outages and collaborate with multiple teams to ensure high uptime for critical training runs and inference services.
Optimize cluster utilization and execute autoscaling strategies to meet dynamic workload demands.

Qualifications

Extensive experience in Kubernetes administration, including custom resource definitions, operators, and cluster management.
Proficient in Slurm workload management, encompassing job scheduling, resource allocation, and cluster optimization.
Demonstrated experience in deploying and managing distributed training systems at scale.
In-depth knowledge of container orchestration and the architecture of distributed systems.
Solid familiarity with LLM architecture and training processes, including Multi-Head Attention, Multi-Query/Grouped-Query Attention, and distributed training strategies.
Experience in managing GPU clusters and optimizing compute resource utilization.

Required Skills

Advanced Kubernetes administration and YAML configuration management skills.
Expertise in Slurm job scheduling, resource management, and cluster configuration.
Proficiency in Python and C++ programming with a focus on systems and infrastructure automation.

Jul 15, 2025

Senior Frontend Engineer, Design Systems

Chime

Full-time|On-site|San Francisco, CA, USA

Join Chime as a Senior Frontend Engineer specializing in Design Systems! In this pivotal role, you will lead the development of scalable and user-friendly design components that enhance our product offerings. Collaborate with cross-functional teams to create seamless user experiences and design solutions that meet the needs of our growing customer base.

Mar 26, 2026

Senior Cloud Security Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a Senior Cloud Security Engineer and play a pivotal role in transforming how users search and interact with the internet. As a key member of our innovative security team, you will spearhead initiatives to construct and sustain secure and scalable cloud infrastructure, enabling our engineers to innovate swiftly and securely.

Core Responsibilities

Collaborate with infrastructure and engineering teams to embed security measures into development processes and advocate for secure-by-default practices.
Develop Terraform modules that incorporate essential security features, including logging, encryption, and automated threat detection.
Implement cloud-native detection capabilities utilizing AWS GuardDuty, Security Hub, and tailor-made detection rules to uncover credential breaches, crypto-mining, and lateral movements.
Ensure compliance with SOC 2 Type II and ISO 27001 by automating the collection of cloud control evidence.
Conduct security assessments of cloud resource configurations using tools like AWS Config and Open Policy Agent, addressing discrepancies in line with CIS Benchmarks and internal security policies.
Fortify CI/CD and supply chain pipelines through controls such as artifact signing, secret scanning, and dependency monitoring.
Implement zero trust principles via stringent network segmentation, authentication, and authorization across cloud environments.
Engage in security on-call rotation, responding to security alerts and incidents for prompt resolution and root cause analysis.

May 28, 2025
