Research Engineer AI Performance Kernel Optimization jobs in San Francisco – Browse 7,423 openings on RoboApply Jobs

Research Engineer AI Performance Kernel Optimization jobs in San Francisco

Open roles matching “Research Engineer AI Performance Kernel Optimization” with location signals for San Francisco. 7,423 active listings on RoboApply Jobs.


1 - 20 of 7,423 Jobs
Zyphra
Full-time|On-site|San Francisco

Join Zyphra as a Research Engineer specializing in AI Performance and Kernel Optimization. In this role, you will work at the forefront of AI technologies, developing and optimizing kernel solutions that enhance the performance of our systems. You will collaborate with cross-functional teams, leveraging your expertise to drive innovation and efficiency.

Mar 16, 2026
Databricks
Full-time|$166K/yr - $225K/yr|On-site|San Francisco, California

At Databricks, we are dedicated to empowering data teams to tackle the world's most challenging problems, from detecting security threats to advancing cancer drug development. We achieve this by offering the premier data and AI platform, allowing our customers to concentrate on their mission-critical challenges. The Mosaic AI organization assists companies in developing AI models and systems utilizing their own data, employing technologies that range from training large language models (LLMs) from the ground up to applying advanced retrieval methods for enhanced generation. We pride ourselves on pushing the boundaries of science and operationalizing our innovations. Mosaic AI believes that a company’s AI models hold intrinsic value, akin to any other core intellectual property, and that superior AI models should be accessible to all.

Job Overview
As a research engineer on the Scaling team, you will stay abreast of the latest advancements in deep learning and pioneer new methodologies that surpass the current state of the art. You will collaborate with a diverse team of researchers and engineers, sharing insights and expertise. Most importantly, you will be passionate about our customers, striving to ensure their success in implementing cutting-edge LLMs and AI systems by translating our scientific knowledge into practical applications.

Your Impact
- Enhance performance through innovative optimization techniques, including kernel fusion, mixed precision, memory layout optimization, tiling strategies, and tensorization tailored to training-specific patterns.
- Design, implement, and optimize high-performance GPU kernels for training workloads, including attention mechanisms, custom layers, gradient computations, and activation functions, specifically for NVIDIA architectures.
- Create and implement distributed training frameworks for large language models, incorporating parallelism strategies (data, tensor, pipeline, ZeRO-based) and optimized communication patterns for gradient synchronization and collective operations.
- Profile, debug, and optimize comprehensive training workflows to pinpoint and resolve performance bottlenecks, utilizing memory optimization techniques such as activation checkpointing, gradient sharding, and mixed precision training.

Jan 30, 2026
Inferact
Full-time|$200K/yr - $400K/yr|Remote|San Francisco

At Inferact, we are on a mission to establish vLLM as the premier AI inference engine, significantly enhancing the speed and reducing the cost of AI inference. Our founders, the visionaries behind vLLM, have spent years bridging the gap between advanced models and cutting-edge hardware.

About the Role
We are seeking a skilled performance engineer dedicated to maximizing the computational efficiency of modern accelerators. In this role, you'll develop kernels and implement low-level optimizations that position vLLM as the fastest inference engine globally. Your contributions will be pivotal, as your code will execute across a broad spectrum of hardware accelerators, from NVIDIA GPUs to the latest silicon innovations. You'll collaborate closely with hardware vendors to ensure we fully leverage the capabilities of each new generation of chips.

Jan 22, 2026
Gimlet Labs
Full-time|On-site|San Francisco

At Gimlet Labs, we are pioneering the first heterogeneous neocloud tailored for AI workloads. As the demand for AI systems grows, traditional infrastructure faces significant limitations in terms of power, capacity, and cost. Our innovative platform addresses these challenges by decoupling AI workloads from the hardware, intelligently partitioning tasks, and directing each component to the most suitable hardware for optimal performance and efficiency. This method allows for the creation of heterogeneous systems that span multiple vendors and generations of hardware, including the latest cutting-edge accelerators, achieving substantial improvements in performance and cost-effectiveness.

Building upon this robust foundation, Gimlet is developing a production-grade neocloud designed for agentic workloads. Our customers can effortlessly deploy and manage their workloads with stable, production-ready APIs, eliminating the complexities of hardware selection, placement, or low-level performance optimization.

We collaborate with foundational labs, hyperscalers, and AI-native companies to drive real production workloads capable of scaling to gigawatt-class AI data centers.

We are currently seeking a dedicated Member of Technical Staff specializing in kernels and GPU performance. In this role, you will work closely with accelerators and execution hardware to extract maximum performance from AI workloads across diverse and rapidly evolving platforms. You will analyze low-level execution behaviors, design and optimize kernels, and ensure consistent performance across both established and emerging hardware.

This position is perfect for engineers who thrive on deep performance analysis, enjoy exploring hardware trade-offs, and are passionate about transforming theoretical peak performance into tangible real-world outcomes.

Mar 10, 2026
OpenAI
Full-time|Remote|San Francisco

About the Role
OpenAI is looking for a Software Engineer specializing in Kernel Performance and AI Tooling to join the team in San Francisco. This role centers on improving software systems for maximum efficiency and building advanced tools that support AI development.

What You Will Do
- Optimize kernel-level performance across OpenAI's software stack.
- Design and implement tools that accelerate AI research and deployment.
- Work closely with engineers to identify bottlenecks and deliver practical solutions.
- Contribute to technical discussions and share knowledge with teammates.

Team and Collaboration
Work alongside engineers who are committed to advancing AI technology. Collaboration and innovation are central to the team’s approach.

Apr 17, 2026
Thinking Machines Lab
Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our ambition is to enhance human potential by advancing collaborative general intelligence. We envision a future where individuals have the tools and knowledge to harness AI for their distinct requirements and aspirations.

Our team comprises dedicated scientists, engineers, and innovators who have contributed to some of the most renowned AI products, including ChatGPT and Character.ai, along with open-weight models like Mistral, and influential open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role
We are seeking an Infrastructure Research Engineer to architect, optimize, and sustain the computational frameworks that facilitate large-scale language model training. You will create high-performance machine learning kernels (e.g., CUDA, CuTe, Triton), enable effective low-precision arithmetic operations, and enhance the distributed computing infrastructure essential for training expansive models.

This position is ideal for an engineer who thrives in close collaboration with hardware and research disciplines. You will partner with researchers and systems architects to merge algorithmic design with hardware efficiency. Your responsibilities will include prototyping new kernel implementations, evaluating performance across various hardware generations, and helping to establish the numerical and parallelism strategies crucial for scaling next-generation AI systems.

Note: This is an evergreen role that remains open continuously for expressions of interest. We receive numerous applications, and there may not always be an immediate opportunity that aligns with your qualifications. However, we encourage you to apply, as we regularly assess applications and will reach out as new positions become available. You are also welcome to reapply after gaining additional experience, but please refrain from applying more than once every six months. Additionally, you may notice postings for specific roles catering to particular projects or team needs. In such cases, you are encouraged to apply directly alongside this evergreen listing.

What You’ll Do
- Design and develop custom ML kernels (e.g., CUDA, CuTe, Triton) for key LLM operations such as attention, matrix multiplication, gating, and normalization, optimized for contemporary GPU and accelerator architectures.
- Conceptualize compute primitives aimed at alleviating memory bandwidth bottlenecks and enhancing kernel compute efficiency.
- Collaborate with research teams to synchronize kernel-level optimizations with model architecture and algorithmic objectives.
- Create and maintain a library of reusable kernels and performance benchmarks that serve as the foundation for internal model training.
- Contribute to the stability and scalability of our infrastructure, ensuring it meets the growing demands of AI development.

Nov 27, 2025
Sciforium
Full-time|On-site|San Francisco

At Sciforium, we are at the forefront of AI infrastructure, innovating next-generation multimodal AI models and a proprietary high-efficiency serving platform. With substantial funding and direct collaboration from AMD, supported by their engineers, our team is rapidly expanding to develop the complete stack that powers cutting-edge AI models and real-time applications.

About the Role
We are on the lookout for a talented GPU Kernel Engineer who is eager to explore and maximize performance on modern accelerators. In this role, you will be responsible for designing and optimizing custom GPU kernels that drive our advanced large-scale AI systems. You will navigate the hardware-software stack, engaging in low-level kernel development and integrating optimized operations into high-level machine learning frameworks for large-scale training and inference.

This position is perfect for someone who excels at the intersection of GPU programming, systems engineering, and state-of-the-art AI workloads, and aims to contribute significantly to the efficiency and scalability of our machine learning platform.

Key Responsibilities
- Develop, implement, and enhance custom GPU kernels utilizing C++, PTX, CUDA, ROCm, Triton, and/or JAX Pallas.
- Profile and fine-tune the end-to-end performance of machine learning operations, particularly for large-scale LLM training and inference.
- Integrate low-level GPU kernels into frameworks such as PyTorch, JAX, and our proprietary internal runtimes.
- Create performance models, pinpoint bottlenecks, and deliver kernel-level enhancements that significantly boost AI workloads.
- Collaborate with machine learning researchers, distributed systems engineers, and model-serving teams to optimize computational performance across the entire stack.
- Engage closely with hardware vendors (NVIDIA/AMD) and stay updated on the latest GPU architecture and compiler/toolchain advancements.
- Contribute to the development of tools, documentation, benchmarking suites, and testing frameworks ensuring correctness and performance reproducibility.

Must-Haves
- 5+ years of industry or research experience in GPU kernel development or high-performance computing.
- Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or a related discipline.
- Strong programming proficiency in C++ and Python, and familiarity with machine learning frameworks.

Dec 6, 2025
Tavus
Full-time|On-site|San Francisco (London/Europe - OK)

Tavus – Multimodal AI Model Optimization
Research Engineer

At Tavus, we are pioneering the human aspect of AI technology. Our objective is to make human-AI interactions as seamless and natural as in-person conversations, allowing for a human touch in areas that were once considered unscalable.

We accomplish this through groundbreaking research in multimodal AI, focusing on human-to-human communication modeling (encompassing language, audio, and video) and the development of audio-visual avatar behaviors. Our innovative models drive applications ranging from text-to-video AI avatars to real-time conversational video experiences across sectors such as healthcare, recruitment, sales, and education.

By empowering AI to perceive, listen, and engage with an authentic human-like presence, we are laying the groundwork for the next generation of AI workers, assistants, and companions.

As a Series B company, we are supported by renowned investors, including Sequoia, Y Combinator, and Scale VC. Join us as we shape the future of human-AI interaction.

The Role
We are seeking an accomplished Research Scientist/Engineer with expertise in model optimization to be a vital part of our core AI team. The ideal candidate thrives in dynamic startup environments, is adept at setting priorities independently, and is open to making calculated decisions. We are moving swiftly and need individuals who can help navigate our path forward.

Your Mission
- Transform state-of-the-art research models into fast, efficient, and production-ready systems through techniques such as sparsification, distillation, and quantization.
- Oversee the optimization lifecycle for critical models: establish metrics, conduct experiments, and evaluate trade-offs among latency, cost, and quality.
- Collaborate closely with researchers and engineers to convert innovative concepts into deployable solutions.

Requirements
- Extensive experience in deep learning with PyTorch.
- Practical experience in model optimization and compression, including knowledge distillation, pruning/sparsification, quantization, and mixed precision.
- Familiarity with efficient architectures such as low-rank adapters.
- Strong grasp of inference performance and GPU/accelerator fundamentals.
- Proficient Python coding and adherence to best practices in research engineering.
- Experience with large models and datasets in cloud environments.
- Ability to read ML literature, reproduce results, and adapt ideas accordingly.

Apr 3, 2026
Baseten
Full-time|On-site|San Francisco

ABOUT BASETEN
At Baseten, we empower the world's leading AI firms—such as Cursor, Notion, and OpenEvidence—by delivering mission-critical inference solutions. Our unique blend of applied AI research, robust infrastructure, and user-friendly developer tools enables AI pioneers to effectively deploy groundbreaking models. With our recent achievement of a $300M Series E funding round supported by esteemed investors like BOND and IVP, we're on an exciting growth trajectory. Join our dynamic team and contribute to the platform that drives the next generation of AI products.

THE ROLE
We are looking for an experienced Senior GPU Kernel Engineer to join our innovative team at the forefront of AI acceleration. In this role, your programming expertise will directly enhance the performance of cutting-edge machine learning models. You'll be responsible for developing highly efficient GPU kernels that optimize computational processes, allowing for transformative AI applications. You'll thrive in a fast-paced, intellectually challenging environment where your technical skills are pivotal. Your contributions will directly affect production systems that serve millions of users across various platforms. This position offers exceptional opportunities for career advancement for engineers enthusiastic about low-level optimization and impactful systems engineering.

EXAMPLE INITIATIVES
As part of our Model Performance team, you will engage in projects like:
- Baseten Embeddings Inference: the quickest embeddings solution available
- The Baseten Inference Stack
- Enhancing model performance optimization

RESPONSIBILITIES
Core Engineering Responsibilities
- Design and develop high-performance GPU kernels for essential machine learning operations, including matrix multiplications and attention mechanisms.
- Collaborate with cross-functional teams to drive performance improvements and implement optimizations.
- Debug and refine kernel code to achieve maximal efficiency and reliability.
- Stay abreast of the latest advancements in GPU technology and machine learning frameworks.

Jul 17, 2025
Databricks
Full-time|$190.9K/yr - $232.8K/yr|On-site|San Francisco, California

P-1285

About This Role
Join our dynamic team at Databricks as a Staff Software Engineer specializing in GenAI Performance and Kernel. In this pivotal role, you will take charge of designing, implementing, and optimizing high-performance GPU kernels that drive our GenAI inference stack. Your expertise will lead the development of finely-tuned, low-level compute paths, balancing hardware efficiency with versatility, while mentoring fellow engineers in the intricacies of kernel-level performance engineering. Collaborating closely with machine learning researchers, systems engineers, and product teams, you will elevate the forefront of inference performance at scale.

What You Will Do
- Lead the design, implementation, benchmarking, and maintenance of essential compute kernels (such as attention, MLP, softmax, layernorm, memory management) tailored for diverse hardware backends (GPU, accelerators).
- Steer the performance roadmap for kernel-level enhancements, focusing on areas like vectorization, tensorization, tiling, fusion, mixed precision, sparsity, quantization, memory reuse, scheduling, and auto-tuning.
- Integrate kernel optimizations seamlessly with higher-level machine learning systems.
- Develop and uphold profiling, instrumentation, and verification tools to identify correctness and performance regressions, numerical discrepancies, and hardware utilization inefficiencies.
- Conduct performance investigations and root-cause analyses to address inference bottlenecks, such as memory bandwidth, cache contention, kernel launch overhead, and tensor fragmentation.
- Create coding patterns, abstractions, and frameworks to modularize kernels for reuse, cross-backend compatibility, and maintainability.
- Influence architectural decisions to enhance kernel efficiency (including memory layout, dataflow scheduling, and kernel fusion boundaries).
- Guide and mentor fellow engineers focused on lower-level performance, conducting code reviews and establishing best practices.
- Collaborate with infrastructure, tooling, and machine learning teams to implement kernel-level optimizations in production and assess their impact.

Jan 30, 2026
ClickUp
Full-time|On-site|United States of America

At ClickUp, we're not just developing software; we're shaping the future of work! In an era dominated by work sprawl, we identified a more efficient way. This led us to create the first truly integrated AI workspace, consolidating tasks, documents, chat, calendar, and enterprise search, all enhanced by context-driven AI. Our mission is to empower millions of teams to escape silos, reclaim their time, and reach unprecedented levels of productivity. At ClickUp, you'll have the chance to learn, innovate, and leverage AI in transformative ways that will not only influence our product but also the broader landscape of work itself. Join a daring, pioneering team that's challenging the limits of what's possible!

We are on the lookout for a technical leader in SaaS client performance who is passionate about enhancing the customer experience through top-tier performance solutions. As a Senior Performance Engineer, you will spearhead comprehensive strategies to optimize application speed, memory utilization, and reliability across our entire platform. You will be empowered to analyze, diagnose, and address performance bottlenecks wherever they arise—be it front-end, back-end, or infrastructure—ensuring ClickUp remains the fastest and most reliable productivity platform available.

The ideal candidate is a hands-on authority in browser and Node.js performance, with a thorough understanding of how code influences rendering, memory management, and overall user experience. You excel in solving intricate challenges, collaborating across teams, and establishing new benchmarks for performance excellence. If you're driven to make a significant impact for millions of users, this is your chance to lead at scale.

Your Responsibilities:
- Conduct root-cause analysis on client performance issues and perform post-mortems.
- Profile application code to identify inefficient algorithms, memory leaks, and other issues; propose and implement effective solutions.
- Establish performance monitoring, alerting, and dashboards to proactively detect and resolve client performance challenges.
- Examine client traffic patterns, load-testing outcomes, and other metrics to set benchmarks and drive enhancements.
- Champion performance best practices and set performance standards across the engineering organization.
- Identify infrastructure upgrades (caching, CDNs, database optimization) to elevate the client experience.
- Collaborate with development teams to incorporate performance as a core requirement in the development of new features.

Dec 22, 2025
Liquid AI
Full-time|On-site|San Francisco

Join the Innovative Team at Liquid AI
Founded as a spin-off from MIT’s CSAIL, Liquid AI is at the forefront of developing cutting-edge AI systems that operate seamlessly across various platforms, including data center accelerators and on-device hardware. Our technology is designed to ensure low latency, efficient memory usage, privacy, and reliability. We collaborate with leading enterprises in sectors such as consumer electronics, automotive, life sciences, and financial services as we rapidly scale our operations. We are seeking talented individuals who are passionate about technology and innovation.

Your Role in Our Team
As a GPU Performance Engineer, your expertise will be critical in enhancing our models and workflows beyond the capabilities of standard frameworks. You will be responsible for designing and deploying custom CUDA kernels, conducting hardware-level profiling, and transforming research concepts into production code that yields tangible improvements in our pipelines (training, post-training, and inference). Our dynamic team values initiative and ownership, and we are looking for a candidate who thrives on tackling complex challenges related to memory hierarchies, tensor cores, and profiling outputs.

While San Francisco and Boston are preferred, we welcome applications from other locations.

Jul 29, 2025
OpenAI
Full-time|On-site|San Francisco

Role Overview
This Software Engineer position at OpenAI focuses on inference and performance optimization. Based in San Francisco, the role centers on increasing the speed and efficiency of advanced AI systems. Collaboration with experienced engineers is a key part of the work, with an emphasis on refining AI performance.

What You Will Do
- Work on optimizing the performance of AI inference systems.
- Collaborate with other engineers to improve efficiency and speed.
- Contribute to solutions that enhance AI system capabilities.

Location
This role is based in San Francisco.

Apr 25, 2026
Anthropic
Full-time|On-site|San Francisco, CA

Join the innovative team at Anthropic as a Research Engineer specializing in Performance Reinforcement Learning. In this role, you will contribute to cutting-edge research that directly influences the development of advanced AI systems. Collaborate with a talented group of engineers and researchers, leveraging your expertise to enhance our algorithms and improve overall performance.

Mar 23, 2026
OpenAI
Full-time|Hybrid|San Francisco

Team Overview
The infrastructure team at OpenAI manages the core systems that support AI workloads worldwide. As OpenAI expands its compute capabilities across company-owned data centers, cloud environments, and strategic partnerships, the need for careful planning and resource management grows. Reliable and cost-effective compute operations depend on this foundation. The Compute Optimization group operates at the intersection of engineering, operations, finance, and infrastructure strategy. This team develops models, decision tools, and planning systems to improve how compute resources are scheduled, deployed, and scaled as global needs shift.

Role Overview
OpenAI is hiring a Compute Optimization Researcher/Engineer to help maximize the use of compute capacity across the organization. This role addresses complex optimization challenges related to capacity allocation, demand forecasting, cluster planning, workload placement, and infrastructure utilization. Work includes building mathematical models, developing software systems, and collaborating with other teams to improve planning and use of compute resources. Areas of focus span GPU clusters, networking, storage, and data center infrastructure. Candidates with experience in operations research, optimization, applied mathematics, infrastructure systems, or large-scale capacity planning will be well-suited for this position.

Location and Work Model
This position is based in San Francisco, CA. OpenAI follows a hybrid schedule with three days per week in the office. Relocation assistance is offered.

Apr 27, 2026
OpenAI
Full-time|On-site|San Francisco

About Our Team
At OpenAI, our Foundations team is dedicated to examining how model behavior evolves as we scale up models, data, and computing resources. We meticulously analyze the relationships between model architecture, optimization strategies, and training datasets to inform the design and training of next-generation models.

About the Position
As a Team Lead in Research Inference, you will be instrumental in constructing systems that empower advanced AI models to operate efficiently at scale. Your role lies at the crossroads of model research and systems engineering, where you will translate innovative architectural concepts into high-performance inference systems, clearly illustrating the trade-offs in performance, memory usage, and scalability. Your contributions will significantly shape model design, evaluation, and iteration processes across our research organization. By developing and refining high-performance inference infrastructures, you will provide researchers with the tools necessary to explore new ideas while understanding their computational and systems implications. This position does not involve serving products; instead, it supports research through a focus on performance, accuracy, and realism, ensuring that our AI research is firmly rooted in scalable solutions.

Responsibilities
- Design and develop optimized inference runtimes for large-scale AI models, emphasizing efficiency, reliability, and scalability.
- Take ownership of optimizing core execution processes, including model execution, memory management, batching, and scheduling.
- Enhance and expand distributed inference across multiple GPUs, focusing on parallelism, communication patterns, and runtime coordination.
- Implement and refine critical inference operators and kernels based on real-world workloads.
- Collaborate closely with research teams to ensure accurate and efficient support for new model architectures within inference systems.
- Identify and resolve performance bottlenecks through comprehensive profiling, benchmarking, and low-level debugging.
- Contribute to the observability, correctness, and reliability of large-scale AI systems.

Ideal Candidate Profile
- Experience developing production-level inference systems, beyond just training and executing models.
- Proficiency in GPU-centric performance engineering, including managing memory behavior and understanding latency/throughput trade-offs.
- Strong analytical skills and familiarity with performance profiling tools.

Mar 19, 2026
Kernel
Full-time|On-site|San Francisco

Join Our Team at Kernel
At Kernel, we are revolutionizing the way developers interact with the digital world through our innovative platform, offering Lightning-Fast Browsers-as-a-Service for seamless browser automation and advanced web agents. Our cutting-edge API and MCP server empower developers to effortlessly launch browsers in the cloud, eliminating the complexities of infrastructure management.

Our serverless browser platform takes the hassle out of autoscaling, reliability, and observability, allowing developers to concentrate on their agents' functionality rather than the underlying processes. Kernel transforms AI into a practical and impactful tool, enabling developers to deploy agents that can genuinely engage with online environments.

Trusted by industry leaders such as Cash App and Rye for applications ranging from comprehensive research to QA automation and real-time web analysis, we have successfully raised $22M from prominent investors including Accel, Y Combinator, and others.

With just one line of code, any web agent can be deployed to our cloud—what happens next is up to you. If you are passionate about creating essential infrastructure for the future of AI applications, we would love to connect.

Dec 4, 2025
Anthropic
On-site|San Francisco, CA

About Anthropic
At Anthropic, we are dedicated to shaping the future of artificial intelligence by developing systems that are reliable, interpretable, and steerable. We believe in the potential of AI to enhance the lives of users and contribute positively to society. Our rapidly expanding team consists of passionate researchers, engineers, policy specialists, and business leaders, all collaborating to create beneficial AI technologies.

About the Team
Our team is driven by the ambitious goal of crafting an AI scientist—an advanced system designed to tackle complex reasoning challenges and achieve the foundational capabilities needed to advance scientific knowledge. We adopt a holistic perspective across the entire model stack, currently focusing on enhancing the ability of AI models to utilize computational tools as a means to address long-term tasks and to overcome significant barriers in scientific workflows.

About the Role
As a Senior Research Engineer within our team, you will engage in end-to-end projects that identify and resolve key obstacles on the journey to achieving scientific AGI. Ideal candidates should be well-versed in language model training, evaluation, and inference. You should be adept at evaluating research ideas, diagnosing issues, and thrive in a collaborative environment. Experience in performance optimization, distributed systems, VM/sandboxing/container deployment, and managing large-scale data pipelines is highly advantageous.

Join us on our mission to pioneer cutting-edge AI systems that are both powerful and beneficial for humanity.

Jan 29, 2026
David AI
Full-time|On-site|San Francisco

Join Our Innovative Team at David AI
David AI is pioneering the audio data research landscape. We adopt a rigorous R&D methodology for developing datasets that parallels the standards upheld by leading AI laboratories. Our vision is to seamlessly integrate AI into everyday experiences, with audio serving as the perfect conduit. The evolution of audio AI is rapidly unfolding, yet the availability of high-quality training data remains a critical challenge. This is where David AI steps in.

Founded in 2024 by a talented group of former engineers and operators from Scale AI, we have quickly become a trusted partner to numerous FAANG companies and AI research labs. Recently, we secured $50 million in a Series B funding round with notable investors, including Meritech, NVIDIA, and Alt Capital.

Our culture is built on sharp intellect, humility, ambition, and a close-knit community. We invite exceptional minds in research, engineering, product development, and operations to join us as we advance the field of audio AI.

Research Team Overview
At David AI, we are convinced that superior model capabilities stem from high-quality, differentiated data. Our research team is dedicated to conducting ambitious, long-term studies into audio technology while collaborating with both internal and external partners to implement cutting-edge research insights into practical applications.

Your Role as a Founding Audio AI Research Engineer
In this position, you will establish the research framework that influences how premier AI labs develop their audio models. You will have access to a top-tier team of human AI trainers, robust computing resources, and the autonomy to shape your research agenda.

Key Responsibilities
- Create and implement comprehensive evaluation frameworks for assessing audio AI capabilities in areas such as speech, emotion detection, conversational dynamics, and acoustic patterns.
- Investigate and prototype innovative methodologies for audio quality assessment, automated labeling, and optimizing data collection processes.
- Design focused data collection pipelines aimed at capturing novel, high-value audio capabilities.
- Develop automated systems for ongoing classifier enhancement and prompt-engineering evaluation.
- Assess cutting-edge models and formulate actionable research strategies.
- Publish your findings at prestigious conferences.

Jun 24, 2025
Kernel
Full-time|On-site|San Francisco

About Kernel
Kernel is an innovative developer platform that delivers Lightning-Fast Browsers-as-a-Service for browser automation and web agent deployment. Our API and MCP server empower developers to effortlessly launch cloud-based browsers without the hassle of infrastructure management.

Our serverless browser solution takes care of the complexities: autoscaling, dependable browser infrastructure, observability, and intricate web interactions, allowing developers to concentrate on their agents' functionality rather than the underlying technology. Kernel brings AI to life, enabling developers to create agents that genuinely engage with the digital landscape.

Our platform is trusted by teams at Cash App, Rye, and many others for various tasks including in-depth research, QA automation, and real-time web analysis. We recently secured $22M in funding from notable investors such as Accel, Y Combinator, Vercel, Paul Graham, Solomon Hykes (Docker), David Cramer (Sentry), and Charlie Marsh (Astral).

With just a single line of code, you can deploy any web agent to our cloud infrastructure. If you are passionate about developing essential infrastructure for the future of AI applications, we would love to connect with you.

Dec 4, 2025
