Experience Level
Senior
Qualifications
Proven experience in software engineering, particularly in AI/ML environments.
Strong proficiency in programming languages such as Python, Java, or C++.
Experience with optimization techniques and frameworks.
Knowledge of cloud computing platforms, particularly DigitalOcean's ecosystem.
Excellent problem-solving skills and a collaborative mindset.
About the job
Join DigitalOcean as a Senior Engineer focused on Inference Optimizations, where you will play a pivotal role in enhancing our AI and machine learning capabilities. Collaborate with a talented team to develop cutting-edge solutions that optimize inference processes across various applications.
About DigitalOcean, Inc.
DigitalOcean is a leading cloud infrastructure provider that empowers developers to deploy and scale applications across distributed cloud servers. We are committed to simplifying cloud computing for developers by providing an easy-to-use platform and robust support.
Similar jobs
Senior GenAI Research Engineer, Optimization and Kernels
Full-time|$166K/yr - $225K/yr|On-site|San Francisco, California
At Databricks, we are dedicated to empowering data teams to tackle the world's most challenging problems, from detecting security threats to advancing cancer drug development. We achieve this by offering the premier data and AI platform, allowing our customers to concentrate on their mission-critical challenges. The Mosaic AI organization assists companies in developing AI models and systems using their own data, with technologies that range from training large language models (LLMs) from the ground up to advanced retrieval methods for enhanced generation. We pride ourselves on pushing the boundaries of science and operationalizing our innovations. Mosaic AI believes that a company's AI models hold intrinsic value, akin to any other core intellectual property, and that superior AI models should be accessible to all.
Job Overview
As a research engineer on the Scaling team, you will stay abreast of the latest advancements in deep learning and pioneer new methodologies that surpass the current state of the art. You will collaborate with a diverse team of researchers and engineers, sharing insights and expertise. Most importantly, you will be passionate about our customers, striving to ensure their success in implementing cutting-edge LLMs and AI systems by translating our scientific knowledge into practical applications.
Your Impact
Enhance performance through innovative optimization techniques, including kernel fusion, mixed precision, memory layout optimization, tiling strategies, and tensorization tailored for training-specific patterns.
Design, implement, and optimize high-performance GPU kernels for training workloads, including attention mechanisms, custom layers, gradient computations, and activation functions, specifically for NVIDIA architectures.
Create and implement distributed training frameworks for large language models, incorporating parallelism strategies (data, tensor, pipeline, ZeRO-based) and optimized communication patterns for gradient synchronization and collective operations.
Profile, debug, and optimize comprehensive training workflows to pinpoint and resolve performance bottlenecks, utilizing memory optimization techniques such as activation checkpointing, gradient sharding, and mixed precision training (a minimal mixed-precision sketch follows this listing).
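Mixed precision training, one of the techniques named in the listing above, can be sketched in a few lines of PyTorch. The toy model, data, and hyperparameters below are illustrative assumptions rather than Databricks code; the pattern is simply to run the forward pass under autocast, scale the loss, and step through a GradScaler.

```python
# Minimal mixed-precision training step (illustrative toy model and data).
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 10)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))  # guards against fp16 gradient underflow

x = torch.randn(64, 512, device=device)
y = torch.randint(0, 10, (64,), device=device)

for step in range(3):
    optimizer.zero_grad(set_to_none=True)
    # autocast runs eligible ops in reduced precision while keeping sensitive ops in fp32
    with torch.autocast(device_type=device, dtype=amp_dtype, enabled=(device == "cuda")):
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()   # backward on the scaled loss
    scaler.step(optimizer)          # unscales gradients, then steps
    scaler.update()
```

On recent GPUs, bfloat16 autocast is often preferred because its wider exponent range removes the need for loss scaling.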
Join Zyphra as a Research Engineer specializing in AI Performance and Kernel Optimization. In this role, you will work at the forefront of AI technologies, developing and optimizing kernel solutions that enhance the performance of our systems. You will collaborate with cross-functional teams, leveraging your expertise to drive innovation and efficiency.
Full-time|$190.9K/yr - $232.8K/yr|On-site|San Francisco, California
P-1285
About This Role
Join our dynamic team at Databricks as a Staff Software Engineer specializing in GenAI Performance and Kernels. In this pivotal role, you will take charge of designing, implementing, and optimizing high-performance GPU kernels that drive our GenAI inference stack. Your expertise will lead the development of finely tuned, low-level compute paths, balancing hardware efficiency with versatility, while mentoring fellow engineers in the intricacies of kernel-level performance engineering. Collaborating closely with machine learning researchers, systems engineers, and product teams, you will advance the state of the art in inference performance at scale.
What You Will Do
Lead the design, implementation, benchmarking, and maintenance of essential compute kernels (such as attention, MLP, softmax, layernorm, memory management) tailored for diverse hardware backends (GPUs and accelerators).
Steer the performance roadmap for kernel-level enhancements, focusing on areas like vectorization, tensorization, tiling, fusion, mixed precision, sparsity, quantization, memory reuse, scheduling, and auto-tuning (a minimal fusion sketch follows this listing).
Integrate kernel optimizations seamlessly with higher-level machine learning systems.
Develop and uphold profiling, instrumentation, and verification tools to identify correctness issues, performance regressions, numerical discrepancies, and hardware utilization inefficiencies.
Conduct performance investigations and root-cause analyses to address inference bottlenecks such as memory bandwidth, cache contention, kernel launch overhead, and tensor fragmentation.
Create coding patterns, abstractions, and frameworks that modularize kernels for reuse, cross-backend compatibility, and maintainability.
Influence architectural decisions to enhance kernel efficiency (including memory layout, dataflow scheduling, and kernel fusion boundaries).
Guide and mentor fellow engineers focused on lower-level performance, conducting code reviews and establishing best practices.
Collaborate with infrastructure, tooling, and machine learning teams to implement kernel-level optimizations in production and assess their impact.
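Kernel fusion, one of the roadmap areas above, can be illustrated with a small Triton kernel that folds a bias add and a ReLU into a single memory pass. The kernel, wrapper, and block size below are an illustrative sketch (requiring a CUDA GPU), not Databricks' actual kernels.

```python
# Illustrative fused bias-add + ReLU kernel in Triton.
import torch
import triton
import triton.language as tl

@triton.jit
def fused_bias_relu_kernel(x_ptr, bias_ptr, out_ptr, n_elements, n_cols, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    x = tl.load(x_ptr + offs, mask=mask)
    b = tl.load(bias_ptr + (offs % n_cols), mask=mask)
    # One kernel does what two separate elementwise kernels would: add bias, then ReLU.
    y = tl.maximum(x + b, 0.0)
    tl.store(out_ptr + offs, y, mask=mask)

def fused_bias_relu(x: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    fused_bias_relu_kernel[grid](x, bias, out, n, x.shape[-1], BLOCK=1024)
    return out

if __name__ == "__main__":
    x = torch.randn(256, 1024, device="cuda")
    bias = torch.randn(1024, device="cuda")
    torch.testing.assert_close(fused_bias_relu(x, bias), torch.relu(x + bias))
```

Fusing the two elementwise operations removes one full read and write of the intermediate tensor; the same motivation drives fusing attention and normalization kernels at larger scale.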
Full-time|$350K/yr - $475K/yr|On-site|San Francisco
At Thinking Machines Lab, our ambition is to enhance human potential by advancing collaborative general intelligence. We envision a future where individuals have the tools and knowledge to harness AI for their distinct requirements and aspirations. Our team comprises dedicated scientists, engineers, and innovators who have contributed to some of the most renowned AI products, including ChatGPT and Character.ai, along with open-weight models like Mistral, and influential open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.
About the Role
We are seeking an Infrastructure Research Engineer to architect, optimize, and sustain the computational frameworks that facilitate large-scale language model training. You will create high-performance machine learning kernels (e.g., CUDA, CuTe, Triton), enable effective low-precision arithmetic operations, and enhance the distributed computing infrastructure essential for training expansive models. This position is ideal for an engineer who thrives in close collaboration with hardware and research disciplines. You will partner with researchers and systems architects to merge algorithmic design with hardware efficiency. Your responsibilities will include prototyping new kernel implementations, evaluating performance across various hardware generations, and helping to establish the numerical and parallelism strategies crucial for scaling next-generation AI systems.
Note: This is an evergreen role that remains open continuously for expressions of interest. We receive numerous applications, and there may not always be an immediate opportunity that aligns with your qualifications. However, we encourage you to apply, as we regularly assess applications and will reach out as new positions become available. You are also welcome to reapply after gaining additional experience, but please refrain from applying more than once every six months. Additionally, you may notice postings for specific roles catering to particular projects or team needs. In such cases, you are encouraged to apply directly alongside this evergreen listing.
What You'll Do
Design and develop custom ML kernels (e.g., CUDA, CuTe, Triton) for key LLM operations such as attention, matrix multiplication, gating, and normalization, optimized for contemporary GPU and accelerator architectures.
Conceptualize compute primitives aimed at alleviating memory bandwidth bottlenecks and enhancing kernel compute efficiency.
Collaborate with research teams to synchronize kernel-level optimizations with model architecture and algorithmic objectives.
Create and maintain a library of reusable kernels and performance benchmarks that serve as the foundation for internal model training.
Contribute to the stability and scalability of our infrastructure, ensuring it meets the growing demands of AI development.
Full-time|$200K/yr - $400K/yr|Remote|San Francisco
At Inferact, we are on a mission to establish vLLM as the premier AI inference engine, significantly enhancing the speed and reducing the cost of AI inference. Our founders, the visionaries behind vLLM, have spent years bridging the gap between advanced models and cutting-edge hardware.
About the Role
We are seeking a skilled performance engineer dedicated to maximizing the computational efficiency of modern accelerators. In this role, you'll develop kernels and implement low-level optimizations that position vLLM as the fastest inference engine globally. Your contributions will be pivotal, as your code will execute across a broad spectrum of hardware accelerators, from NVIDIA GPUs to the latest silicon innovations. You'll collaborate closely with hardware vendors to ensure we fully leverage the capabilities of each new generation of chips.
At Gimlet Labs, we are pioneering the first heterogeneous neocloud tailored for AI workloads. As the demand for AI systems grows, traditional infrastructure faces significant limitations in terms of power, capacity, and cost. Our platform addresses these challenges by decoupling AI workloads from the hardware, intelligently partitioning tasks, and directing each component to the most suitable hardware for optimal performance and efficiency. This method allows for heterogeneous systems that span multiple vendors and generations of hardware, including the latest cutting-edge accelerators, achieving substantial improvements in performance and cost-effectiveness.
Building on this foundation, Gimlet is developing a production-grade neocloud designed for agentic workloads. Our customers can effortlessly deploy and manage their workloads with stable, production-ready APIs, eliminating the complexities of hardware selection, placement, or low-level performance optimization. We collaborate with foundational labs, hyperscalers, and AI-native companies to drive real production workloads capable of scaling to gigawatt-class AI data centers.
We are currently seeking a dedicated Member of Technical Staff specializing in kernels and GPU performance. In this role, you will work closely with accelerators and execution hardware to extract maximum performance from AI workloads across diverse and rapidly evolving platforms. You will analyze low-level execution behaviors, design and optimize kernels, and ensure consistent performance across both established and emerging hardware.
This position is perfect for engineers who thrive on deep performance analysis, enjoy exploring hardware trade-offs, and are passionate about transforming theoretical peak performance into tangible real-world outcomes.
Internship|$54K/yr - $60K/yr|On-site|San Francisco, California
Company Overview:
At Databricks, we are dedicated to empowering data teams to tackle some of the world's most challenging issues, ranging from security threat detection to breakthroughs in cancer drug development. We achieve this by creating and operating the premier data and AI platform, allowing our clients to concentrate on the high-value challenges central to their missions. The Mosaic AI division equips organizations to develop AI models and systems using their own data, with technologies that span from fine-tuning large language models (LLMs) for specific enterprise domains to building complex AI systems that incorporate retrieval and agents. We believe that a company's AI models are as valuable as any other intellectual property and that high-quality AI models should be accessible to all.
Job Summary:
Our research team is focused on advancing the boundaries of "domain adaptation": discovering how to create LLMs and AI systems that excel in specialized domains. We are investigating open research challenges across a variety of themes, including scaling and automating evaluation, fine-tuning with synthetic data, retrieval augmentation, and optimizing inference speed and efficiency. As a PhD GenAI Research Scientist Intern, you will collaborate with our research team on projects that aim to adapt LLMs and AI systems for enterprise settings. Your tasks may include:
Enhancing, refining, and assessing methodologies from existing literature.
Designing novel approaches for effective domain adaptation.
Combining various methods to formulate innovative strategies for efficient post-training.
Conducting evaluations of LLMs and AI systems.
Team Overview
The infrastructure team at OpenAI manages the core systems that support AI workloads worldwide. As OpenAI expands its compute capabilities across company-owned data centers, cloud environments, and strategic partnerships, the need for careful planning and resource management grows. Reliable and cost-effective compute operations depend on this foundation. The Compute Optimization group operates at the intersection of engineering, operations, finance, and infrastructure strategy. This team develops models, decision tools, and planning systems to improve how compute resources are scheduled, deployed, and scaled as global needs shift.
Role Overview
OpenAI is hiring a Compute Optimization Researcher/Engineer to help maximize the use of compute capacity across the organization. This role addresses complex optimization challenges related to capacity allocation, demand forecasting, cluster planning, workload placement, and infrastructure utilization. Work includes building mathematical models, developing software systems, and collaborating with other teams to improve planning and use of compute resources (a toy allocation model follows this listing). Areas of focus span GPU clusters, networking, storage, and data center infrastructure. Candidates with experience in operations research, optimization, applied mathematics, infrastructure systems, or large-scale capacity planning will be well-suited for this position.
Location and Work Model
This position is based in San Francisco, CA. OpenAI follows a hybrid schedule with three days per week in the office. Relocation assistance is offered.
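The capacity-allocation side of this work can be posed, in its simplest form, as a linear program. The cluster costs, capacities, and demand below are made-up illustrative numbers chosen only to show the shape of the model, not OpenAI data.

```python
# Toy capacity-allocation LP: meet GPU-hour demand at minimum cost (all numbers illustrative).
from scipy.optimize import linprog

costs = [2.5, 3.1, 2.8]            # $ per GPU-hour in clusters A, B, C
capacity = [10_000, 6_000, 8_000]  # available GPU-hours per cluster
demand = 18_000                    # total GPU-hours that must be placed

# Variables x_i = GPU-hours placed on cluster i.
# Minimize costs @ x  subject to  sum(x) >= demand  and  0 <= x_i <= capacity_i.
res = linprog(
    c=costs,
    A_ub=[[-1.0, -1.0, -1.0]],  # -sum(x) <= -demand  encodes  sum(x) >= demand
    b_ub=[-demand],
    bounds=[(0, cap) for cap in capacity],
    method="highs",
)
print("placement (GPU-hours per cluster):", res.x)
print("total cost: $%.2f" % res.fun)
```

Real planners layer integrality, placement affinity, network topology, and forecast uncertainty on top of a skeleton like this.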
At Sciforium, we are at the forefront of AI infrastructure, innovating next-generation multimodal AI models and a proprietary high-efficiency serving platform. With substantial funding and direct collaboration from AMD, supported by their engineers, our team is rapidly expanding to develop the complete stack that powers cutting-edge AI models and real-time applications.
About the Role
We are on the lookout for a talented GPU Kernel Engineer who is eager to explore and maximize performance on modern accelerators. In this role, you will be responsible for designing and optimizing custom GPU kernels that drive our advanced large-scale AI systems. You will navigate the hardware-software stack, engaging in low-level kernel development and integrating optimized operations into high-level machine learning frameworks for large-scale training and inference. This position is perfect for someone who excels at the intersection of GPU programming, systems engineering, and state-of-the-art AI workloads, and aims to contribute significantly to the efficiency and scalability of our machine learning platform.
Key Responsibilities
Develop, implement, and enhance custom GPU kernels utilizing C++, PTX, CUDA, ROCm, Triton, and/or JAX Pallas.
Profile and fine-tune the end-to-end performance of machine learning operations, particularly for large-scale LLM training and inference (a minimal profiling sketch follows this listing).
Integrate low-level GPU kernels into frameworks such as PyTorch, JAX, and our proprietary internal runtimes.
Create performance models, pinpoint bottlenecks, and deliver kernel-level enhancements that significantly boost AI workloads.
Collaborate with machine learning researchers, distributed systems engineers, and model-serving teams to optimize computational performance across the entire stack.
Engage closely with hardware vendors (NVIDIA/AMD) and stay updated on the latest GPU architecture and compiler/toolchain advancements.
Contribute to the development of tools, documentation, benchmarking suites, and testing frameworks ensuring correctness and performance reproducibility.
Must-Haves
5+ years of industry or research experience in GPU kernel development or high-performance computing.
Bachelor's, Master's, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or a related discipline.
Strong programming proficiency in C++ and Python, and familiarity with machine learning frameworks.
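End-to-end profiling of ML operations, mentioned in the responsibilities above, is commonly started with torch.profiler; the tiny model below is a stand-in for a real training or inference workload.

```python
# Minimal torch.profiler session to find hot operators in a toy forward/backward pass.
import torch
from torch import nn
from torch.profiler import profile, ProfilerActivity

device = "cuda" if torch.cuda.is_available() else "cpu"
activities = [ProfilerActivity.CPU] + ([ProfilerActivity.CUDA] if device == "cuda" else [])

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).to(device)
x = torch.randn(32, 1024, device=device)

with profile(activities=activities, record_shapes=True) as prof:
    for _ in range(5):
        model(x).sum().backward()

# Sort by total time to see which operators (and, on GPU, which kernels) dominate.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```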
Full-time|On-site|San Francisco (London/Europe - OK)
Tavus – Multimodal AI Model Optimization
Research Engineer
At Tavus, we are pioneering the human aspect of AI technology. Our objective is to make human-AI interactions as seamless and natural as in-person conversations, allowing for a human touch in areas that were once considered unscalable.
We accomplish this through groundbreaking research in multimodal AI, focusing on human-to-human communication modeling (encompassing language, audio, and video) and the development of audio-visual avatar behaviors. Our models drive applications ranging from text-to-video AI avatars to real-time conversational video experiences across sectors such as healthcare, recruitment, sales, and education.
By empowering AI to perceive, listen, and engage with an authentic human-like presence, we are laying the groundwork for the next generation of AI workers, assistants, and companions.
As a Series B company, we are supported by renowned investors, including Sequoia, Y Combinator, and Scale VC. Join us as we shape the future of human-AI interaction.
The Role
We are seeking an accomplished Research Scientist/Engineer with expertise in model optimization to be a vital part of our core AI team. The ideal candidate thrives in dynamic startup environments, is adept at setting priorities independently, and is open to making calculated decisions. We are moving swiftly and need individuals who can help navigate our path forward.
Your Mission
Transform state-of-the-art research models into fast, efficient, and production-ready systems through techniques such as sparsification, distillation, and quantization (a minimal distillation sketch follows this listing).
Oversee the optimization lifecycle for critical models: establish metrics, conduct experiments, and evaluate trade-offs among latency, cost, and quality.
Collaborate closely with researchers and engineers to convert innovative concepts into deployable solutions.
Requirements
Extensive experience in deep learning with PyTorch.
Practical experience in model optimization and compression, including knowledge distillation, pruning/sparsification, quantization, and mixed precision.
Familiarity with efficient architectures such as low-rank adapters.
Strong grasp of inference performance and GPU/accelerator fundamentals.
Proficient Python coding and adherence to best practices in research engineering.
Experience with large models and datasets in cloud environments.
Ability to read ML literature, reproduce results, and adapt ideas as needed.
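Knowledge distillation, one of the compression techniques listed above, boils down to a soft-target loss that pulls a small student toward a larger teacher. The toy models, temperature, and weighting below are illustrative placeholders, not Tavus code.

```python
# Minimal knowledge-distillation loss: the student mimics the teacher's softened logits.
import torch
import torch.nn.functional as F
from torch import nn

teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL between temperature-softened distributions (scaled by T^2).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

x = torch.randn(32, 128)
labels = torch.randint(0, 10, (32,))
with torch.no_grad():
    teacher_logits = teacher(x)
loss = distillation_loss(student(x), teacher_logits, labels)
loss.backward()
print(float(loss))
```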
ABOUT BASETEN
At Baseten, we empower the world's leading AI firms, such as Cursor, Notion, and OpenEvidence, by delivering mission-critical inference solutions. Our unique blend of applied AI research, robust infrastructure, and user-friendly developer tools enables AI pioneers to effectively deploy groundbreaking models. With our recent $300M Series E funding round supported by esteemed investors like BOND and IVP, we're on an exciting growth trajectory. Join our dynamic team and contribute to the platform that drives the next generation of AI products.
THE ROLE
We are looking for an experienced Senior GPU Kernel Engineer to join our team at the forefront of AI acceleration. In this role, your programming expertise will directly enhance the performance of cutting-edge machine learning models. You'll be responsible for developing highly efficient GPU kernels that optimize computational processes, allowing for transformative AI applications.
You'll thrive in a fast-paced, intellectually challenging environment where your technical skills are pivotal. Your contributions will directly affect production systems that serve millions of users across various platforms. This position offers exceptional opportunities for career advancement for engineers enthusiastic about low-level optimization and impactful systems engineering.
EXAMPLE INITIATIVES
As part of our Model Performance team, you will engage in projects like:
Baseten Embeddings Inference: the quickest embeddings solution available
The Baseten Inference Stack
Model performance optimization
RESPONSIBILITIES
Core Engineering Responsibilities
Design and develop high-performance GPU kernels for essential machine learning operations, including matrix multiplications and attention mechanisms.
Collaborate with cross-functional teams to drive performance improvements and implement optimizations.
Debug and refine kernel code to achieve maximal efficiency and reliability.
Stay abreast of the latest advancements in GPU technology and machine learning frameworks.
Join Merge Labs, a pioneering research facility dedicated to merging biological and artificial intelligence to enhance human capabilities, agency, and experience. We aim to achieve this by crafting innovative brain-computer interfaces that communicate with the brain at high bandwidth, seamlessly integrate with cutting-edge AI, and prioritize safety and accessibility for all users.
About the Team:
At Merge Labs, we are on a mission to revolutionize brain-computer interfaces by leveraging advancements in synthetic biology, neuroscience, AI, and non-invasive imaging technologies. Our cross-functional data science team sits at the convergence of computational modeling, neuroscience, and biomolecular engineering. This collaborative unit works closely with wet-lab scientists, automation specialists, and data engineers to develop machine learning frameworks that facilitate rapid molecule discovery and device enhancement.
About the Role:
We are seeking a talented Senior / Principal ML Scientist to architect and scale Bayesian optimization and reinforcement learning frameworks that guide molecular engineering initiatives through iterative design-build-test-learn (DBTL) cycles. You will start with a fresh approach to construct the company's closed-loop optimization infrastructure, establishing the data and modeling foundations that link experiments with these ML frameworks. Over time, you will transition prototypes into operational pipelines, significantly enhancing experimental throughput and discovery success across various biomolecular and neuroengineering sectors.
Key Responsibilities:
Develop the scientific and engineering framework for active learning and closed-loop optimization, encompassing data ingestion, ML modeling, and library design (a minimal Bayesian-optimization sketch follows this listing).
Collaborate with wet-lab scientists to establish feasible optimization objectives while incorporating domain-specific priors and constraints.
Create prototypes for representation learning and acquisition strategies utilizing both internal and public datasets; benchmark and validate model performance.
Integrate machine learning models with experimental data streams, making them accessible to non-domain experts for broader utilization.
Extend machine learning frameworks to accommodate multi-objective or constrained optimization challenges.
Stay abreast of the latest advancements in Bayesian optimization, active learning, and reinforcement learning, and prototype innovative algorithms to enhance the company's capabilities.
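One iteration of the closed-loop Bayesian optimization described above can be sketched with a Gaussian process surrogate and an expected-improvement acquisition. The 1-D objective and search grid below are toy stand-ins for a real assay and design space.

```python
# One Bayesian-optimization round: fit a GP to observed results, then pick the
# next candidate by expected improvement (toy 1-D "assay").
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def objective(x):
    """Hidden response we are trying to maximize (stands in for a lab measurement)."""
    return -np.sin(3 * x) - x**2 + 0.7 * x

# Previously measured designs (the "learn" side of a DBTL cycle).
X_obs = rng.uniform(-2, 2, size=(6, 1))
y_obs = objective(X_obs).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X_obs, y_obs)

# Expected improvement over the best observation, maximized on a dense grid.
X_cand = np.linspace(-2, 2, 400).reshape(-1, 1)
mu, sigma = gp.predict(X_cand, return_std=True)
best = y_obs.max()
z = (mu - best) / np.maximum(sigma, 1e-9)
ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

next_design = X_cand[np.argmax(ei)]
print("next design to synthesize and test:", next_design)
```

In a real DBTL loop the chosen design would be built and assayed, the new measurement appended to the observations, and the surrogate refit before the next round.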
Join Our Team at Kernel
At Kernel, we are revolutionizing the way developers interact with the digital world through our innovative platform, offering Lightning-Fast Browsers-as-a-Service for seamless browser automation and advanced web agents. Our API and MCP server empower developers to effortlessly launch browsers in the cloud, eliminating the complexities of infrastructure management.
Our serverless browser platform takes the hassle out of autoscaling, reliability, and observability, allowing developers to concentrate on their agents' functionality rather than the underlying processes. Kernel transforms AI into a practical and impactful tool, enabling developers to deploy agents that can genuinely engage with online environments.
Trusted by industry leaders such as Cash App and Rye for applications ranging from comprehensive research to QA automation and real-time web analysis, we have raised $22M from prominent investors including Accel, YCombinator, and others.
With just one line of code, any web agent can be deployed to our cloud; what happens next is up to you. If you are passionate about creating essential infrastructure for the future of AI applications, we would love to connect.
At ClickUp, we're not just developing software; we're shaping the future of work! In an era dominated by work sprawl, we identified a more efficient way. This led us to create the first truly integrated AI workspace, consolidating tasks, documents, chat, calendar, and enterprise search, all enhanced by context-driven AI. Our mission is to empower millions of teams to escape silos, reclaim their time, and reach unprecedented levels of productivity. At ClickUp, you'll have the chance to learn, innovate, and leverage AI in transformative ways that will not only influence our product but also the broader landscape of work itself. Join a daring, pioneering team that's challenging the limits of what's possible!
We are on the lookout for a technical leader in SaaS client performance who is passionate about enhancing the customer experience through top-tier performance solutions. As a Senior Performance Engineer, you will spearhead comprehensive strategies to optimize application speed, memory utilization, and reliability across our entire platform. You will be empowered to analyze, diagnose, and address performance bottlenecks wherever they arise, whether front-end, back-end, or infrastructure, ensuring ClickUp remains the fastest and most reliable productivity platform available.
The ideal candidate is a hands-on authority in browser and NodeJS performance, with a thorough understanding of how code influences rendering, memory management, and overall user experience. You excel in solving intricate challenges, collaborating across teams, and establishing new benchmarks for performance excellence. If you're driven to make a significant impact for millions of users, this is your chance to lead at scale.
Your Responsibilities:
Conduct root cause analysis on client performance issues and perform post-mortems.
Profile application code to identify inefficient algorithms, memory leaks, and other issues; propose and implement effective solutions.
Establish performance monitoring, alerting, and dashboards to proactively detect and resolve client performance challenges.
Examine client traffic patterns, load testing outcomes, and other metrics to set benchmarks and drive enhancements.
Champion performance best practices and set performance standards across the engineering organization.
Identify infrastructure upgrades (caching, CDNs, database optimization) to elevate the client experience.
Collaborate with development teams to incorporate performance as a core requirement in the development of new features.
About Kernel
Kernel is a cutting-edge developer platform that offers Lightning-Fast Browsers-as-a-Service for browser automations and web agents. Our API and MCP server empower developers to effortlessly launch browsers in the cloud without the hassle of managing infrastructure.
Our serverless browser platform takes care of the complex aspects: autoscaling reliable browser infrastructure, observability, and intricate web interactions, enabling developers to concentrate on the functionality of their agents rather than the underlying details. Kernel transforms AI into a tangible, practical, and powerful tool, allowing developers to deploy agents capable of genuine interaction with the digital landscape.
We pride ourselves on being trusted by teams at Cash App, Rye, and numerous others for deep research, QA automation, and real-time web analysis. We have raised $22M from top investors including Accel, YCombinator, Vercel, Paul Graham, Solomon Hykes (Docker), David Cramer (Sentry), Charlie Marsh (Astral), and more.
With just one line of code, you can deploy any web agent to our cloud. The rest is in your hands. If you are passionate about building essential infrastructure for the next wave of AI applications, we would love to hear from you.
About the Role
As a Product Engineer at Kernel, you will be a full-stack engineer who values product development as much as coding. You can translate strong product instincts into code, ranging from pixel-perfect UI decisions to backend API architecture. You proactively contribute to the specification process rather than waiting for one to be provided.
You will collaborate closely with our co-founders to define product direction, deliver full-stack features end to end, and ensure that Kernel maintains its polished yet powerful feel.
Your Responsibilities
Lead the full-stack implementation of user-facing product surfaces: dashboard, onboarding, website, and core product functionality.
Influence the product roadmap by integrating customer feedback, analyzing usage patterns, and leveraging your own insights into developer needs.
Enhance developer experience across our SDK, documentation, CLI, and API, delivering the kind of seamless experience that makes developers say "this just works."
Rapidly prototype and iterate, bringing features from concept to production with minimal oversight.
Help shape the standards for building a superior developer product at Kernel.
Your Qualifications
You are comfortable taking ownership of features from frontend to backend, demonstrating a holistic understanding of product development.
A strong passion for creating seamless user experiences and an ability to translate product vision into functional code.
Experience working in a fast-paced environment with a focus on agile methodologies.
At Magic, our goal is to develop safe AGI that propels humanity forward by addressing some of the most pressing challenges we face. We are committed to harnessing the power of automated research and code generation to enhance models and improve alignment in ways that surpass human capabilities. Our methodology integrates cutting-edge pre-training, domain-specific reinforcement learning, ultra-long context, and advanced inference-time computing.
Role Overview
As a Kernel Engineer, you will be responsible for the design, implementation, and maintenance of high-performance kernels, aiming to maximize throughput and minimize latency during both training and inference. Magic's extended context windows present unique kernel optimization challenges, particularly around memory efficiency, data movement, and sustained throughput.
Key Responsibilities
Design and develop kernels that enable high-performance long-context functionality.
Take ownership of kernel design, implementation, and deployment, and ensure production reliability.
Emphasize robustness, thorough testing, and functional accuracy while striving for optimal performance.
Assess the feasibility of porting Magic's compute kernels to other hardware platforms.
Collaborate with the training, inference, and reinforcement learning teams to co-design kernels.
Explore our work on Magic-Attention, presented at GTC 2026.
Qualifications
Experience in low-level programming for AI accelerators, such as NVIDIA Blackwell or Google TPUs.
Proficiency in developing and optimizing GPU kernels using frameworks such as NCCL, MSCCLPP, CUTLASS, CuTeDSL, Triton, Quack, and Flash Attention.
About Sprinter Health
At Sprinter Health, we are dedicated to transforming healthcare accessibility by delivering essential services directly to patients' homes. Across the U.S., nearly 30% of individuals forgo preventive and chronic care simply due to accessibility challenges. This often leads to emergency room visits, which contribute to over $300 billion in unnecessary healthcare expenses annually.
Leveraging technology akin to that used by leading marketplace and last-mile logistics platforms, we provide care right where it's needed, particularly for vulnerable populations. To date, we have positively impacted over 2 million patients across 22 states, conducted over 130,000 in-home visits, and achieved a Net Promoter Score (NPS) of 92. Our team of clinicians, technologists, and operations professionals has raised over $125 million from investors such as a16z, General Catalyst, GV, and Accel, ensuring a solid multi-year runway for our growth.
About the Role
We are seeking a Senior Software Engineer to join our Logistics Optimization team, where you will address some of the most challenging algorithmic and operational problems in healthcare. In this role, you will design and implement systems that efficiently balance clinician availability, patient demand, and routing logistics, forming the backbone of Sprinter's in-home care delivery. The position demands deep technical expertise and offers high-impact opportunities at the convergence of operations research, simulation, and scalable distributed systems.
Office Location
As a hybrid company based in the Bay Area, we operate from offices in San Francisco and Menlo Park. We prioritize work-life balance and are committed to providing flexibility when needed.
Key Responsibilities:
Design and implement algorithms that optimize clinician routing, scheduling, and dispatching on a national scale (a toy routing baseline follows this listing).
Develop simulations that accurately model demand, capacity, and patient behaviors within real-world constraints.
Create predictive models to manage cancellations and no-shows and to optimize overbooking strategies.
Collaborate with product and operations teams to translate complex logistics challenges into scalable software solutions.
Prototype and deploy forecasting and optimization models within a distributed environment.
Engage in continuous improvement practices to enhance system performance and reliability.
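Clinician routing of the kind described above is often prototyped against a simple baseline heuristic before exact or metaheuristic solvers are brought in. The coordinates and the nearest-neighbor ordering below are a toy illustration, not Sprinter Health's production logic.

```python
# Toy nearest-neighbor route for a single clinician visiting patient homes
# (a baseline heuristic; real systems add time windows, skills, and traffic).
import math

def nearest_neighbor_route(start, stops):
    """Order stops greedily by straight-line distance from the current location."""
    route, current, remaining = [], start, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route

clinic = (0.0, 0.0)
patient_homes = [(2.0, 1.0), (0.5, 3.0), (4.0, 0.5), (1.0, 1.0)]
route = nearest_neighbor_route(clinic, patient_homes)
total_distance = sum(math.dist(a, b) for a, b in zip([clinic] + route, route))
print("visit order:", route, "total distance:", round(total_distance, 2))
```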