Experience Level
Mid to Senior
Qualifications
You are an ideal candidate if you possess:
Experience in building and scaling sophisticated infrastructure systems
Proficiency with Kubernetes, ETL pipelines, AWS/GCP/Cloudflare, CI/CD, and SOC 2 environments
Sound judgment regarding security, reliability, cost, and iteration-speed trade-offs
Familiarity with AI-first development practices
Insights gained from extensive enterprise-grade engineering experiences
Strong views on modernizing or replacing legacy systems
A sense of ownership and comfort in working across various domains
About the job
The Platform Engineer at Coframe will help shape the core infrastructure that powers the company’s engineering efforts. Based in the SF Bay Area, this role involves designing and refining the systems that underpin how teams build, deploy, and manage software.
Day-to-day work includes using AI tools to improve productivity and streamline deployment processes. The engineer will also focus on strengthening monitoring, enhancing security, and managing costs across the platform.
Impact
This position carries significant responsibility. The solutions developed will directly support all teams at Coframe, influencing how software is created and maintained across the company. The work done in this role will help set the direction for future engineering practices.
About Coframe
At Coframe, we are innovating a future where user and agent interfaces are able to adapt, evolve, and personalize seamlessly. Our culture is centered around collaboration, creativity, and the commitment to pushing the boundaries of technology.
Similar jobs
Senior AI Infrastructure Engineer Training Platform
Full-time|On-site|San Francisco, CA; Seattle, WA; New York, NY
Scale AI is seeking a Senior AI Infrastructure Engineer to help build and refine the company’s Training Platform. This position centers on designing, implementing, and improving infrastructure that supports machine learning teams as they train and deploy models.
Role overview
This engineer will work closely with colleagues across different functions to create solutions that make AI systems more efficient. The focus is on enabling faster, more reliable model training and deployment.
Key responsibilities
Design and build infrastructure for AI model training
Implement and optimize systems to support machine learning workflows
Collaborate with teams throughout the company to improve platform capabilities
Locations
This role is based in San Francisco, Seattle, or New York.
Plasmidsaurus helps scientists worldwide by streamlining sequencing. Researchers from leading institutions and companies rely on this platform daily. With a global network of labs, the company delivers fast, affordable sequencing results, and has recently expanded into RNA-seq to broaden its genomics reach. The team is focused on building a universal sequencing platform designed for efficiency and global scale.
Role overview
The Lead Engineer for AI Infrastructure in Platform Engineering sets both technical direction and management strategy for the company’s compute, data, AI, and security infrastructure. This position oversees the entire sequencing operation, from laboratory devices to data delivery.
What you will do
Oversee core services that coordinate laboratory devices, including robots, sequencers, and on-premises Linux servers, as well as the data ingestion pipeline.
Develop cloud infrastructure and data pipelines for storing, processing, and delivering terabytes of sequencing data.
Design systems to manage millions of bioinformatics tasks, handling queue management, workflow orchestration, and scheduling.
Build AI infrastructure and internal tools to support autonomous systems, including:
Quality Scientist Agents: Monitor operations, detect anomalies, and escalate quality or reliability concerns.
Logistics Agents: Coordinate global transportation of samples to labs and carriers.
Bioinformatics Coding Agents: Run adaptive analyses on varied sample types with different data distributions.
Culture
The team values initiative and a strong sense of ownership. High agency and responsibility shape how work gets done at Plasmidsaurus.
Full-time|$216.2K/yr - $270.3K/yr|On-site|San Francisco, CA; New York, NY
Join our dynamic Machine Learning Infrastructure team as a Senior AI Infrastructure Engineer, where you will play a pivotal role in designing and constructing platforms that ensure the scalable, reliable, and efficient serving of Large Language Models (LLMs). Our innovative platform supports a range of cutting-edge research and production systems, catering to both internal and external applications across diverse environments.
The ideal candidate will possess a solid foundation in machine learning principles coupled with extensive experience in backend system architecture. You will thrive in a collaborative environment that bridges research and engineering, working diligently to provide seamless experiences for our customers and accelerating innovation across the organization.
Join Our Mission
At Hyperbolic Labs, we are dedicated to democratizing artificial intelligence by eliminating barriers to computing power through our Open-Access AI Cloud. We aggregate global computing resources to provide an innovative GPU marketplace and AI inference service, making AI affordable and accessible for everyone. As pioneers at the crossroads of AI and open-source technology, we envision a future where AI innovation is driven by imagination, not resource limitations. We invite forward-thinking individuals who share our vision of making AI universally accessible, secure, and cost-effective to join us in crafting a platform that empowers innovators to realize their groundbreaking AI projects.
As we gear up for expansion following our Series A funding, our team, led by co-founders with PhDs in AI, Mathematics, and Computer Science, is set to transform the landscape of computing.
The Role
We are on the lookout for a Senior Infrastructure Engineer to drive the development and scaling of Hyperbolic's GPU Cloud Marketplace. In this pivotal role, you will create a multi-tenancy provisioning and virtualization solution that transforms raw GPUs from diverse global suppliers into a programmable, orchestrated resource pool serving thousands of AI developers and researchers. You will work at the forefront of cloud infrastructure, building the core orchestration layer that allows our platform to deliver cost savings of up to 75% compared to traditional cloud providers.
Full-time|$196K/yr - $220.5K/yr|On-site|San Francisco Bay Area
At Discord, we connect over 200 million users monthly for diverse experiences, with gaming being the predominant activity. Our platform supports more than 90% of our users in enjoying games, collectively logging 1.5 billion hours each month across various titles. As we shape the future of gaming, our mission is to enhance interactions before, during, and after gaming sessions.
The Platform Infrastructure teams are pivotal in constructing and upholding the essential systems that energize Discord's core functionalities. We manage systems that process hundreds of thousands of requests per second and handle tens of billions of transactions daily, enabling seamless connections for millions of users. By developing foundational platform components, we empower internal developers to deploy new features swiftly and securely, ensuring Discord remains reliable, efficient, and scalable.
As a Senior Software Engineer on our team, you will play a crucial role in continuously refining our codebase, processes, and infrastructure, directly impacting user interactions on Discord!
About the Role
Join the innovative team at Known as an Infrastructure and Platform Engineer, where you will take the lead in managing and enhancing our core infrastructure and platform systems. Your work will be crucial in powering AI-driven matching, voice, and scheduling functionalities. You will be responsible for everything from cloud infrastructure and data orchestration to performance monitoring and model deployment support, designing and scaling systems that ensure Known operates swiftly, reliably, and securely.
In this pivotal role, you will collaborate closely with the founding team, comprising experts in AI/ML, product development, and design, to establish Known’s technical foundation. You will play a key role in shaping our architecture, engineering culture, and best practices right from the start. This position is perfect for a practical builder who thrives in early-stage environments and is passionate about taking projects from concept to production.
Full-time|$282K/yr - $363K/yr|On-site|San Francisco, CA
Supported by premier investors from Silicon Valley, Peregrine Technologies empowers public safety organizations, government entities, federal agencies, and private institutions to tackle societal challenges with unmatched speed and precision. Our AI-driven platform transforms isolated and unconnected data into actionable operational intelligence, swiftly surfacing critical information that enables better, faster decision-making, thereby enhancing outcomes at every interaction. Currently, Peregrine serves hundreds of clients across more than 30 states and two countries, impacting over 125 million individuals, and we are poised to extend our influence into enterprise sectors and globally.
Our Team
As a cohesive engineering unit, we firmly believe that empathy enhances our solutions. Observing how users interact with our products is pivotal in guiding us toward the right solutions. Engineers will have the opportunity to collaborate closely with our onsite team to grasp the diverse use cases that Peregrine addresses.
We are on the lookout for an Engineering Manager to join our core engineering teams. You will collaborate cross-functionally with design and product management to develop robust, scalable, and user-centered systems. Our teams face a range of challenges, from enabling real-time collaboration on detailed maps to constructing high-scale backend architectures capable of processing billions of data points.
We value both ownership and collaboration: you will take full responsibility for significant features while working closely with fellow engineers to drive projects to fruition. We hold that humility and empathy are vital for crafting the right solutions: you will engage directly with our deployment team and users as we iterate to tackle their challenges. Creativity and perseverance are essential in realizing our vision.
Role
This position is central to the strategic execution of Peregrine's platform. You will define how our core systems scale, perform, and evolve as Peregrine continues its rapid growth and strengthens its impact across public safety, government, and enterprise sectors.
As a senior platform leader, your role transcends mere system management; you will establish the technical direction, build your team, and create the operational framework that empowers every product team at Peregrine to progress with speed, safety, and assurance. Your contributions will directly influence system reliability and performance.
Join the Revolution at Retell AI
Retell AI is pioneering the future of call centers through innovative voice AI, driven by first principles thinking.
In just 18 months since our inception, we have empowered thousands of businesses with our AI voice agents, transforming how sales, support, and logistics calls are managed, work that previously required extensive human teams. Supported by prestigious investors such as Y Combinator and Alt Capital, we've rapidly scaled from $5M ARR to an impressive $36M ARR with a compact yet dynamic team of 20.
Our ambition for 2026 is to create a revolutionary customer experience platform, where entire contact centers are powered by AI. Moving beyond basic automation, we aim to develop intelligent AI “workers” that serve as frontline agents, QA analysts, and managers, continuously enhancing customer interactions without the need for constant human oversight.
As we expand, we are seeking passionate engineers who are eager to solve challenging technical problems, act swiftly, and make a significant impact in one of the fastest-growing voice AI startups. Let’s shape the future together.
Senior Software Engineer, Infrastructure & Platform
Role Overview
In the role of Senior Software Engineer, Infrastructure & Platform at AfterQuery, you will take on the exciting challenge of designing and constructing the essential infrastructure that drives our innovative data generation, evaluation, and agentic systems.
Your responsibilities will include developing shared platforms that empower our engineering and research teams to execute large-scale human-in-the-loop workflows, evaluation harnesses, and automated data pipelines essential for training cutting-edge AI models.
This position demands a high level of technical expertise and offers extensive ownership. You will be responsible for architecting and building the foundational infrastructure relied upon by numerous engineers, ensuring that systems are scalable, reliable, and capable of handling high-throughput workloads.
Collaboration with the founding team will be key as you define system architecture, establish best engineering practices, and create the infrastructure that supports the evolution of AI development.
About Brain Co.
At Brain Co., we are at the forefront of artificial intelligence, developing innovative systems that facilitate mission-critical operations for some of the world's leading institutions. Our cutting-edge platform operates in high-security, high-stakes environments, where reliability, performance, and robust engineering practices are paramount.
As an AI Platform Engineer specializing in Infrastructure, you will be instrumental in building and scaling the foundational platform that supports AI systems used in essential sectors, including government, energy, and healthcare. You will work within dynamic environments that span both cloud and on-premises settings, directly influencing our platform's reliability and performance, ensuring we meet the high standards required by our clients.
This role is pivotal within our Infrastructure/Platform team. You will collaborate closely with engineering, AI/ML, and product teams to design scalable architectures, enhance our environments, optimize deployment processes, and guarantee the robustness necessary for enterprise and sovereign applications.
Full-time|$185K/yr - $400K/yr|On-site|San Francisco, California, United States
Join Our Team as an Infrastructure & Platform Engineer
We are seeking a talented Infrastructure & Platform Engineer to join our dynamic team at mlabs in San Francisco. As a rapidly growing technology company, we are at the cutting edge of the crypto derivatives market, an industry that generates tens of billions in annual revenue. Our exchange is one of the fastest-growing platforms for crypto derivatives, and we are committed to enhancing our offerings to meet the evolving needs of our users.
Your mission will be to develop the next critical feature: Multi-Asset Margin, which will streamline how users post collateral directly on-chain, thus improving trading efficiency. You will work alongside our Infrastructure & Platform team, focusing on designing and managing our high-performance systems that deliver exceptional speed and reliability.
Key Responsibilities:
Design and implement robust scripts and services that ensure optimal performance in real-time environments.
Manage and deploy computing resources and containers for tailored services and integrations.
Automate scaling, load balancing, and congestion control for both compute and database layers.
Establish and maintain CI/CD pipelines for streamlined deployments and continuous delivery.
Monitor and optimize system performance across multiple metrics to enhance throughput and resilience.
Develop and maintain indexing and explorer services for fast, real-time data access.
Provision and optimize diverse database systems, including time-series, relational, key-value, and in-memory databases.
ABOUT BASETEN
At Baseten, we empower leading AI companies such as Cursor, Notion, and Abridge by delivering mission-critical inference capabilities. Our innovative platform integrates applied AI research, versatile infrastructure, and intuitive developer tools, enabling organizations at the forefront of AI to deploy cutting-edge models seamlessly. With our recent $300M Series E funding from top investors like BOND and Greylock, we are rapidly expanding. Join us to create the ultimate platform for engineers to launch AI products effectively.
THE ROLE
We are seeking a passionate and customer-focused software engineer to join our team. You will take ownership of features, including multi-node training and serverless reinforcement learning (RL), guiding them from concept to MVP and beyond. Your responsibilities will span the entire technology stack, from API and UI design to infrastructure architecture. By diving deep into model fine-tuning, you will gain valuable insights into user workflows. Collaborating closely with research engineers, you will apply advanced training techniques to develop solutions that address real user challenges. If you are eager to explore the intricacies of AI training, we want to hear from you!
THE PRODUCT
Discover what we have accomplished so far:
Comprehensive Product Overview
Training Documentation Overview
The Journey of Our Training Product
Our Research Endeavors
EXAMPLE INITIATIVES
Checkpointing Pipeline: Our automated checkpointing feature ensures that model versions created during training are securely backed up to the cloud, allowing users to easily deploy checkpoints with minimal friction.
Full-time|$200K/yr - $240K/yr|On-site|San Francisco, CA
Contribute to a Safer Future.
TRM Labs is at the forefront of blockchain analytics and AI technology, empowering law enforcement, financial institutions, and cryptocurrency enterprises to identify and combat cryptocurrency-related fraud and financial crime. Our innovative blockchain intelligence and AI tools are designed to trace fund flows, pinpoint illicit activities, build comprehensive cases, and provide actionable insights into potential threats. Trusted by prominent agencies and organizations globally, TRM is committed to fostering a safer and more secure environment for everyone.
Join our dynamic AI Engineering Team, dedicated to pioneering next-generation AI applications, with a particular emphasis on Large Language Models (LLMs) and agent-based systems. Our objective is to create efficient pipelines, high-caliber infrastructure, and operational tools that facilitate the rapid, safe, and scalable deployment of AI systems.
We oversee petabyte-scale data pipelines, deliver models with millisecond latency, and ensure the observability and governance necessary to make AI production-ready. Our team actively evaluates and integrates cutting-edge technologies in the LLM and agent domains, utilizing open-source stacks, vector databases, evaluation frameworks, and orchestration tools that enhance TRM’s agility and innovation capacity.
As a Senior or Staff AI Infrastructure Engineer, you will play a pivotal role in constructing and scaling the technical framework for AI and ML systems.
Your responsibilities will include:
Developing reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools like Langfuse, GitHub Actions, and experiment tracking systems.
Automating model versioning, approval workflows, and compliance checks across various environments.
Building a modular and scalable AI infrastructure stack, encompassing vector databases, feature stores, model registries, and observability tools.
Collaborating with engineering and data science teams to embed AI models and agents into real-time applications and workflows.
Continuously assessing and integrating state-of-the-art AI tools (e.g., LangChain, LlamaIndex, vLLM, MLflow, BentoML).
Driving AI reliability and governance, facilitating experimentation while ensuring compliance, security, and uptime.
Enhancing the performance of AI and ML models.
Ensuring data accuracy, consistency, and reliability for improved model training and inference.
Deploying infrastructure to support both offline and online evaluations of LLMs and agents.
ABOUT BASETEN
At Baseten, we are at the forefront of enabling transformative AI solutions for some of the world's leading companies, including Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. Our innovative platform combines cutting-edge AI research, adaptable infrastructure, and developer-friendly tools to facilitate the production of advanced models. Recently, we celebrated our rapid growth with a successful $300M Series E funding round from notable investors like BOND, IVP, Spark Capital, Greylock, and Conviction. We invite you to join our dynamic team and contribute to the evolution of AI product deployment.
THE ROLE
As a Senior Software Engineer specializing in Model Training at Baseten, you will play a pivotal role in constructing the infrastructure essential for the large-scale training and fine-tuning of foundational AI models. Your responsibilities will include designing and implementing distributed training systems, optimizing GPU utilization, and establishing scalable pipelines that empower Baseten and our clientele to adapt models with efficiency and reliability. This role demands a high level of technical expertise and hands-on involvement: you will be responsible for critical components of our training stack, collaborate with product and infrastructure teams to identify customer needs, and drive advancements in scalable training infrastructure.
EXAMPLE WORK:
Training open-source models that surpass GPT-5 capabilities for a leading digital insurer
Exploring specialized, continuously learning models as the future of AI
Overview of our training documentation
Research initiatives we've undertaken
RESPONSIBILITIES
Design, construct, and sustain distributed training infrastructures for large foundation models
Develop scalable pipelines for fine-tuning and training across diverse GPU/accelerator clusters
Enhance training performance through optimization of algorithms and infrastructure
Collaborate closely with cross-functional teams to align technical solutions with business objectives
Stay abreast of advancements in the field of machine learning and AI to continually improve our training processes
Full-time|$180K/yr - $210K/yr|On-site|San Francisco, CA
About Sigma Computing
Sigma Computing builds AI-powered apps and analytics tools that connect directly to cloud data warehouses. Teams use Sigma to create applications, automate workflows, and analyze live data through a spreadsheet interface, SQL and Python editors, visual builders, and integrated AI features. The platform supports everything from interactive analyses to reports and embedded data experiences.
Role Overview: Senior Product Manager - Platform Performance & Infrastructure
Sigma is growing to serve larger enterprises with demanding, complex workloads. The Senior Product Manager for Platform Performance & Infrastructure will guide the development of core backend systems that keep Sigma responsive and reliable as usage scales. This role focuses on driving improvements in:
Workbook performance
Query lifecycle management
Compute and caching strategies
Metadata services
Compiler components
New warehouse connectors
These systems are essential for Sigma’s ability to deliver consistent, high-quality performance to enterprise customers.
What You Will Do
Define and prioritize product enhancements for backend platform performance and scalability
Work closely with platform engineering and cross-functional teams to address technical challenges
Translate performance and scalability needs into clear product requirements and measurable objectives
Ensure Sigma’s infrastructure can support enterprise clients with reliability and speed
Who We’re Looking For
Experienced Senior Product Manager with strong technical background
Comfortable working hands-on with backend systems and infrastructure
Skilled at collaborating with engineering and cross-functional partners
Focused on delivering measurable improvements for customers
Location & On-Site Requirement
This position is based in San Francisco, CA. It requires working on-site at the Sigma office at least four days per week.
Full-time|$217K/yr - $312.2K/yr|On-site|San Francisco, California
At Databricks, we are dedicated to empowering data teams to tackle the world's most challenging issues, from realizing the next generation of transportation to expediting medical advancements. Our mission involves constructing and managing the premier data and AI infrastructure platform, enabling our clients to leverage profound data insights to enhance their operations. The Workspace Platform team is embarking on an ambitious journey to scale our customer base by 100x and support the evolution of agentic AI workloads. Our objective is to create a unified, consistent, and foundational Shared Platform that enhances the overall Databricks Workspace experience. As a Senior Engineering Manager within the Workspace Platform team, you will spearhead the development of a cohesive infrastructure that underpins vital customer-facing functionalities across various Databricks products. These include Content Discovery (similar to Google Search), Content Organization (akin to Google Drive), collaborative code editing, and repository management (comparable to GitHub). This is a high-impact opportunity to lead a team of approximately 20 software engineers in developing platform features, intuitive workspace experiences, and essential partner integrations that are pivotal to Databricks' growth and user adoption.
Monaco is revolutionizing the startup ecosystem with its innovative revenue engine, designed to replace outdated CRM systems and fragmented sales solutions with an AI-driven platform.
We are seeking a talented Senior Platform Engineer to join our team, focusing on the development of Monaco's cutting-edge data and machine learning platform. Your expertise will help establish the pipelines, contextual systems, and infrastructure that underlie our AI-enhanced products. You will play a pivotal role in ensuring that our models, agents, and workflows deliver real value in production.
This is a highly impactful position that sits at the crossroads of data engineering, distributed systems, and applied AI.
Full-time|$300K/yr - $300K/yr|On-site|San Francisco
ABOUT BASETEN
At Baseten, we empower leading AI companies such as Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer with our state-of-the-art inference solutions. Our unique blend of applied AI research, versatile infrastructure, and intuitive developer tools allows organizations at the forefront of AI innovation to deploy cutting-edge models effectively. Recently, we have experienced significant growth, securing a $300M Series E funding round, backed by renowned investors like BOND, IVP, Spark Capital, Greylock, and Conviction. Become a part of our journey to create the ultimate platform for engineers to launch AI products seamlessly.
THE ROLE
As a Senior Software Engineer focused on our Enterprise Platform, you will play a pivotal role in designing and developing robust infrastructure and platform features tailored for our enterprise clientele and cloud partners. Your contributions will encompass enabling self-hosted and single-tenant environments, implementing region-aware request routing, and ensuring enterprise-grade data security and integration capabilities.
EXAMPLE INITIATIVES
Join our Infrastructure team and tackle exciting projects such as:
Multi-cloud capacity management
Optimizing inference on B200 GPUs
Implementing multi-node inference solutions
Leveraging fractional H100 GPUs for efficient model serving
RESPONSIBILITIES
Design and implement infrastructure and platform features customized for enterprise clients, covering self-hosted clusters, single-tenant environments, and cross-cloud orchestration.
Lead strategic initiatives to enhance secure and scalable private connectivity solutions.
Craft and execute solutions that address complex regulatory and compliance requirements for enterprise environments.
Lambda, recognized as The Superintelligence Cloud, is a pioneering force in AI cloud infrastructure, empowering tens of thousands of customers, from AI researchers to large enterprises and hyperscalers. Our mission is to make computational power as accessible as electricity, providing everyone the capability of superintelligence: one person, one GPU.
Join us in our quest to build the world’s leading AI cloud platform.
Note: This role mandates in-office presence in our San Francisco location four days a week; Lambda’s designated remote work day is Tuesday.
As an Engineering Manager at Lambda, you will lead the charge in developing and scaling our cloud offerings, which encompass the Lambda website, cloud APIs, and internal tools for deployment, management, and maintenance.
Apr 20, 2026