Infrastructure Research Engineer at Thinking Machines | San Francisco

Thinking Machines Lab, San Francisco

On-site Full-time $350K/yr - $475K/yr

Qualifications

Minimum qualifications: Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, or a related field. Familiarity with distributed systems and experience in developing scalable infrastructure. Strong programming skills in languages such as Python, Go, or similar. Understanding of machine learning frameworks and GPU resource management.

About the job

Our team comprises scientists, engineers, and builders who have developed some of the most utilized AI products, including ChatGPT and Character.ai, as well as open-weight models like Mistral. We also contribute to notable open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are seeking a talented Infrastructure Research Engineer to enhance, scale, and fortify the systems supporting Tinker. This role will enable our internal teams and external clients to fine-tune models seamlessly, reliably, and cost-effectively. You will work at the intersection of large-scale training systems and product infrastructure, creating multi-tenant scheduling, storage, observability, and reliability features within a developer-friendly API.

Your contributions will allow all Tinker users to concentrate on research and development without the burden of infrastructure concerns.

Note: This is an evergreen position that we keep open for ongoing interest. We receive numerous applications, and there may not always be a role that aligns perfectly with your skills and experience. We encourage you to apply, as we continuously review applications and will reach out as new opportunities arise. You are welcome to reapply after gaining more experience, but please refrain from applying more than once every 6 months. We also post specific roles for unique project or team needs, and you are welcome to apply directly to those in addition to this evergreen listing.

What You’ll Do

Design and implement distributed job orchestration, placement, preemption, and fair-share scheduling to enhance Tinker for multi-tenant workloads.
Optimize GPU utilization, throughput, and reliability across clusters (including autoscaling, bin-packing, and quotas).
Develop reusable frameworks and libraries to enhance Tinker’s transparency, reproducibility, and performance.
Collaborate with researchers and developer experience engineers to transform fine-tuning challenges into product features.
Publish and disseminate insights through internal documentation, open-source libraries, or technical reports to advance the field of scalable AI infrastructure.

About Thinking Machines Lab

Thinking Machines is a pioneering AI lab dedicated to the advancement of collaborative general intelligence. Our innovative team has produced some of the most utilized AI solutions globally, ensuring that technology serves humanity’s diverse needs.

Similar jobs

Infrastructure Research Engineer at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, we are committed to empowering humanity by advancing collaborative general intelligence. Our vision is to create a future where everyone has access to the knowledge and tools necessary to harness AI for their unique needs and aspirations.

Our team comprises scientists, engineers, and builders who have developed some of the most utilized AI products, including ChatGPT and Character.ai, as well as open-weight models like Mistral. We also contribute to notable open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are seeking a talented Infrastructure Research Engineer to enhance, scale, and fortify the systems supporting Tinker. This role will enable our internal teams and external clients to fine-tune models seamlessly, reliably, and cost-effectively. You will work at the intersection of large-scale training systems and product infrastructure, creating multi-tenant scheduling, storage, observability, and reliability features within a developer-friendly API.

Your contributions will allow all Tinker users to concentrate on research and development without the burden of infrastructure concerns.

Note: This is an evergreen position that we keep open for ongoing interest. We receive numerous applications, and there may not always be a role that aligns perfectly with your skills and experience. We encourage you to apply, as we continuously review applications and will reach out as new opportunities arise. You are welcome to reapply after gaining more experience, but please refrain from applying more than once every 6 months. We also post specific roles for unique project or team needs, and you are welcome to apply directly to those in addition to this evergreen listing.

What You’ll Do

Design and implement distributed job orchestration, placement, preemption, and fair-share scheduling to enhance Tinker for multi-tenant workloads.
Optimize GPU utilization, throughput, and reliability across clusters (including autoscaling, bin-packing, and quotas).
Develop reusable frameworks and libraries to enhance Tinker’s transparency, reproducibility, and performance.
Collaborate with researchers and developer experience engineers to transform fine-tuning challenges into product features.
Publish and disseminate insights through internal documentation, open-source libraries, or technical reports to advance the field of scalable AI infrastructure.

Nov 27, 2025

Infrastructure Engineer - Security at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$200K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We are dedicated to building a future where everyone can access the knowledge and tools necessary to harness AI for their unique needs and objectives.

We are a team of scientists, engineers, and builders who have developed some of the most widely used AI products, including ChatGPT and Character.ai, and contributed to open-weight models like Mistral, along with popular open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are seeking an Infrastructure Engineer to take charge of evolving the security infrastructure that supports our foundational models. In this pivotal role, you will collaborate across computing, storage, networking, and data platforms to ensure our systems remain secure, reliable, and scalable. You will design controls, architecture, and tooling that embed security into the platform's core functionalities. Working closely with research and product teams, you will enable them to operate swiftly while safeguarding our models, data, and environments.

Note: This is an "evergreen role" that we maintain for ongoing interest. While we receive numerous applications, there may not always be an immediate position that perfectly matches your skills and experience. We encourage you to apply, as we continuously assess applications and reach out to candidates when new opportunities arise. Feel free to reapply if you gain more experience, but please refrain from applying more than once every six months. Additionally, we occasionally post openings for specific roles to meet project or team-specific needs, and in those cases, you are welcome to apply directly in conjunction with this evergreen role.

What You’ll Do

Design security patterns for platforms and services, including network segmentation, service-to-service authentication, RBAC, and policy enforcement in Kubernetes and cloud environments.
Oversee identity, access, and secrets management for users and services: workload and cross-cloud identity, least-privilege IAM, and secrets management.
Create secure platforms for data ingestion, processing, and curation, encompassing classification, encryption, access controls, and safe sharing practices across teams.
Develop threat models and review designs with researchers and engineers to facilitate safe and scalable feature launches.
Automate security checks and implement guardrails: policy-as-code, secure infrastructure baselines, CI/CD validation, and tools that streamline secure operations.

Dec 2, 2025

Software Engineer, Research Acceleration at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We aspire to create a future where everyone can access the knowledge and tools necessary to harness AI for their individual needs and aspirations.

Our team consists of scientists, engineers, and innovators who have developed some of the most renowned AI products, including ChatGPT and Character.ai, as well as open-weight models such as Mistral. We are also contributors to popular open-source initiatives like PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are seeking talented engineers to develop the libraries and tools that will expedite research at Thinking Machines. You will take charge of our internal infrastructure, which includes evaluation libraries, reinforcement learning training libraries, and experiment tracking platforms, all aimed at enhancing research velocity over time.

This position emphasizes collaboration; you will engage directly with researchers to pinpoint bottlenecks and challenges. Your success will be measured by the trust researchers place in your systems and their enjoyment of using them.

What You'll Do

Design, develop, and manage research infrastructure, including evaluation frameworks, RL training systems, experiment tracking platforms, visualization tools, and shared utilities.
Create high-throughput, scalable pipelines for distributed evaluation, reward modeling, and multimodal assessments.
Establish systems for reproducibility, traceability, and stringent quality control throughout research experiments and model training processes. Implement monitoring and observability.
Collaborate closely with researchers to identify obstacles and unlock new capabilities. Manage research tools like a product manager, actively seeking feedback and tracking user adoption.
Work alongside infrastructure, data, and product teams to ensure seamless integration of tools across the technical stack.

Feb 3, 2026

Full Stack Software Engineer at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We envision a future where everyone has access to the knowledge and tools necessary to tailor AI to their unique needs and aspirations.

Our team consists of scientists, engineers, and innovators who have developed some of the most widely utilized AI products, such as ChatGPT and Character.ai. We are also the creators of open-weight models like Mistral, along with popular open-source projects including PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are in search of a talented Full Stack Engineer to create and deploy products from initial prototype to full-scale implementation. You will maintain tools that enhance the efficiency of our research and product teams, working on both frontend and backend components while contributing to the reliability, observability, and security of our production environment.

This position is categorized as an

Nov 27, 2025

Security-Focused Software Engineer at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our mission is to enhance human capabilities through the development of collaborative general intelligence. We are dedicated to creating a future where everyone can utilize AI tailored to their specific needs and aspirations.

Our team consists of accomplished scientists, engineers, and innovators responsible for some of the most popular AI applications, including ChatGPT and Character.ai, along with renowned open-weight models like Mistral and influential open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are on the lookout for a passionate Software Engineer with a focus on security to ensure our products are secure by design while facilitating rapid and ambitious product development. You will collaborate closely with product and research teams to integrate security measures into the design and development processes, and create tools and automation to maintain system safety at scale.

Note: This is an ongoing opportunity, and we encourage you to express your interest. While we receive numerous applications and there may not always be an immediate match for your skills, we encourage you to apply. We consistently review applications and will reach out as new roles become available. You may reapply if you gain additional experience, but please limit applications to once every six months. We also post specific roles for particular projects or teams, and you are welcome to apply for those as well.

What You’ll Do

Collaborate with product and research teams to integrate security into the development lifecycle: threat modeling, design reviews, and establishing secure defaults for new features.
Design and implement security controls throughout our product stack (authentication, authorization, session management, input validation, etc.).
Create and maintain security tooling and automation for engineers: secure frameworks and templates, CI/CD checks, dependency management, and vulnerability detection.
Work alongside researchers to identify and address AI-specific product risks, such as model abuse, prompt injection, data leakage, or misuse of capabilities.
Enhance observability and detection for security-related events: access anomalies, abuse patterns, and suspicious behavior in production.

Nov 27, 2025

Software Engineer - Supercomputing at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our vision is to enhance human potential by advancing collaborative general intelligence. We are dedicated to creating an inclusive future where everyone can harness AI's capabilities tailored to their unique aspirations.

Our team comprises scientists, engineers, and innovators behind some of the most impactful AI solutions, including ChatGPT and Character.ai, as well as open-source projects like PyTorch and Segment Anything.

About the Role

We are seeking a talented Software Engineer to architect, develop, and maintain the GPU supercomputing infrastructure essential for large-scale AI training and inference. Your contributions will ensure high-performance, reliable, and cost-effective computing resources, enabling our users and researchers to achieve rapid advancements at scale.

This is an "evergreen role," open for ongoing interest. We receive numerous applications, and while an immediate fit may not always be available, we encourage you to apply. We actively review applications and reach out when new opportunities arise. Reapplications are welcome after six months, and we also post specific roles for unique projects or teams.

What You’ll Do

Automate and manage large GPU clusters, handling provisioning, imaging, and capacity strategy.
Develop software that simplifies cluster management, providing a cohesive interface for training and inference tasks.
Enhance scheduling and orchestration frameworks (Kubernetes, Slurm, or similar) for optimized resource allocation, preemption, and multi-tenancy management.
Monitor and improve operational efficiency, focusing on speed, reliability, and error recovery mechanisms.
Design robust storage solutions for datasets, checkpoints, and logs, ensuring clear data retention and lineage.
Collaborate with researchers to facilitate large-scale experiments, offering guidance on parallelism and performance considerations.

Nov 27, 2025

Infrastructure Research Engineer - Kernels at Thinking Machines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our ambition is to enhance human potential by advancing collaborative general intelligence. We envision a future where individuals have the tools and knowledge to harness AI for their distinct requirements and aspirations.

Our team comprises dedicated scientists, engineers, and innovators who have contributed to some of the most renowned AI products, including ChatGPT and Character.ai, along with open-weight models like Mistral, and influential open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role

We are seeking an Infrastructure Research Engineer to architect, optimize, and sustain the computational frameworks that facilitate large-scale language model training. You will create high-performance machine learning kernels (e.g., CUDA, CuTe, Triton), enable effective low-precision arithmetic operations, and enhance the distributed computing infrastructure essential for training expansive models.

This position is ideal for an engineer who thrives in close collaboration with hardware and research disciplines. You will partner with researchers and systems architects to merge algorithmic design with hardware efficiency. Your responsibilities will include prototyping new kernel implementations, evaluating performance across various hardware generations, and helping to establish the numerical and parallelism strategies crucial for scaling next-generation AI systems.

Note: This is an evergreen role that remains open continuously for expressions of interest. We receive numerous applications, and there may not always be an immediate opportunity that aligns with your qualifications. However, we encourage you to apply, as we regularly assess applications and will reach out as new positions become available. You are also welcome to reapply after gaining additional experience, but please refrain from applying more than once every six months. Additionally, you may notice postings for specific roles catering to particular projects or team needs. In such cases, you are encouraged to apply directly alongside this evergreen listing.

What You’ll Do

Design and develop custom ML kernels (e.g., CUDA, CuTe, Triton) for key LLM operations such as attention, matrix multiplication, gating, and normalization, optimized for contemporary GPU and accelerator architectures.
Conceptualize compute primitives aimed at alleviating memory bandwidth bottlenecks and enhancing kernel compute efficiency.
Collaborate with research teams to synchronize kernel-level optimizations with model architecture and algorithmic objectives.
Create and maintain a library of reusable kernels and performance benchmarks that serve as the foundation for internal model training.
Contribute to the stability and scalability of our infrastructure, ensuring it meets the growing demands of AI development.

Nov 27, 2025

Infrastructure Security Engineer at Scale AI | San Francisco, CA

Scale AI

Full-time|$237.6K/yr - $297K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

Join Scale AI as a talented Infrastructure Security Engineer, where you'll play a pivotal role in safeguarding the integrity and security of our platform. This position focuses on securing expansive cloud environments, managing and fortifying various compute clusters, and reviewing infrastructure as code. Your proficiency in cloud security, infrastructure automation, and advanced security practices will be crucial in upholding and advancing our security framework.

Your responsibilities include:

Securing infrastructure across major cloud hosting platforms (e.g., AWS, Azure, GCP).
Implementing and maintaining comprehensive security configurations and policies for cloud environments.
Conducting regular security assessments and audits to identify vulnerabilities and propose enhancements.
Developing and enforcing security best practices for infrastructure automation and orchestration.
Collaborating with Developer Experience, IT, and product teams to integrate security into every phase of the infrastructure lifecycle.
Reviewing and securing infrastructure as code (e.g., Terraform, CloudFormation).
Mentoring team members on infrastructure security best practices and emerging threats.

Apr 10, 2026

Senior Infrastructure Security Engineer at Crusoe | San Francisco, CA

Crusoe

Full-time|$210K/yr - $265K/yr|On-site|San Francisco, CA - US

At Crusoe, we are committed to accelerating the abundance of energy and intelligence. Our mission is to develop the technology that empowers individuals to innovate boldly with AI, all while ensuring scalability, speed, and sustainability.

Join the AI revolution with sustainable technology at Crusoe. In this role, you will spearhead significant innovations, have a direct impact, and collaborate with a team that is leading the charge in responsible and transformative cloud infrastructure.

About the Position

We are in search of a Senior Infrastructure Security Engineer to fortify the core of Crusoe Cloud, our specialized computing platform designed for AI and high-performance tasks. This role is dedicated to designing and integrating robust security measures into our global infrastructure, allowing clients to develop advanced models in a secure and trusted environment.

You will work at the convergence of infrastructure, security, and reliability, crafting identity, network, and cloud security systems that can grow alongside a rapidly expanding cloud service provider.

Key Responsibilities

Design and implement security controls across the compute, networking, and storage layers of a global cloud platform.
Promote Infrastructure-as-Code (IaC) standards (e.g., Terraform) to establish secure defaults, enforce immutability, and implement drift detection.
Develop automated security guardrails integrated within CI/CD and deployment pipelines.
Collaborate on a centralized Vault-as-a-Platform service for managing secrets, encryption keys, and internal PKI.
Oversee certificate lifecycles (X.509, SSH) to facilitate secure machine-to-machine trust.
Advocate for the adoption of short-lived, Just-In-Time (JIT) access models to minimize standing privileges and enhance auditability.
Secure foundational network components, including global DNS architecture, service discovery, and network authentication systems.
Design and uphold authentication controls for network infrastructure to ensure secure and monitored access.
Collaborate closely with infrastructure, platform, and SRE teams to pinpoint and address security vulnerabilities in foundational systems.

What You Bring

8+ years of hands-on experience in infrastructure engineering, with a strong focus on security.
Proficiency in cloud security principles and practices.
Strong understanding of compliance frameworks and regulations.

Feb 12, 2026

Senior Cloud Security Engineer at Perplexity | San Francisco

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a Senior Cloud Security Engineer and play a pivotal role in transforming how users search and interact with the internet. As a key member of our innovative security team, you will spearhead initiatives to construct and sustain secure and scalable cloud infrastructure, enabling our engineers to innovate swiftly and securely.

Core Responsibilities

Collaborate with infrastructure and engineering teams to embed security measures into development processes and advocate for secure-by-default practices.
Develop Terraform modules that incorporate essential security features, including logging, encryption, and automated threat detection.
Implement cloud-native detection capabilities utilizing AWS GuardDuty, Security Hub, and tailor-made detection rules to uncover credential breaches, crypto-mining, and lateral movements.
Ensure compliance with SOC 2 Type II and ISO 27001 by automating the collection of cloud control evidence.
Conduct security assessments of cloud resource configurations using tools like AWS Config and Open Policy Agent, addressing discrepancies in line with CIS Benchmarks and internal security policies.
Fortify CI/CD and supply chain pipelines through controls such as artifact signing, secret scanning, and dependency monitoring.
Implement zero trust principles via stringent network segmentation, authentication, and authorization across cloud environments.
Engage in security on-call rotation, responding to security alerts and incidents for prompt resolution and root cause analysis.

May 28, 2025

Fullstack Engineer - Technical Staff at Cogent Security | San Francisco

Cogent Security

Full-time|$100K/yr - $300K/yr|On-site|San Francisco, CA

About Cogent Security

Cogent Security is pioneering the future of cybersecurity through Applied AI, creating advanced AI agents designed to combat rapidly evolving cyber threats. Our AI Taskforce analyzes vast datasets to neutralize potential breaches before they impact our clients.

Our commitment to innovation combines cutting-edge research with practical application, ensuring our solutions are at the forefront of technology. In addition to our product development, Cogent Research acts as our applied AI laboratory, supplying the expertise required to create highly effective security workflows.

Since our launch from stealth mode, Cogent has seen remarkable growth, collaborating with Fortune 500 companies to secure some of the most intricate production environments globally.

Backed by Greylock, our team consists of exceptional talents from leading universities such as Stanford, Berkeley, and Carnegie Mellon, as well as high-growth companies like Scale AI and Tesla, and cybersecurity experts from stalwarts like Wiz and DeepMind.

Nov 12, 2024

Agent Engineer (Technical Staff Member) at Cogent Security | San Francisco

Cogent Security

Full-time|$100K/yr - $300K/yr|On-site|San Francisco, CA

About Cogent Security

Cogent Security is at the forefront of cybersecurity innovation, leveraging Applied AI to develop next-generation AI agents. In an era where cyber attacks evolve rapidly, our AI Taskforce analyzes vast amounts of enterprise data to proactively address vulnerabilities and prevent critical breaches.

We combine pioneering research with practical execution, ensuring that our innovative solutions meet real-world challenges. Our Cogent Research division acts as our dedicated AI lab, driving the development of advanced security workflows.

Since our emergence from stealth mode, we have rapidly grown, collaborating with Fortune 500 companies to secure complex production environments globally.

Supported by Greylock, we have gathered a team of top talent from renowned institutions and leading organizations in the AI and cybersecurity sectors.

About the Role

As an Agent Engineer at Cogent Security, you will be pivotal in designing, building, and deploying critical AI agents tailored for complex client environments. Your role is highly cross-functional, involving direct collaboration with customers to understand their unique needs, adapting our platform accordingly, and iterating on scalable solutions to handle millions of real-world security events.

You will manage projects from inception to deployment, including data onboarding and integrating feedback into our core agent platform. Your contributions will shape how AI agents detect threats, triage incidents, and automate security workflows for some of the most sophisticated organizations worldwide.

This position is ideal for engineers who excel in dynamic environments, enjoy tackling complex technical challenges, and wish to see the tangible impact of their work.

Aug 19, 2025

Backend Engineer (Technical Staff) at Cogent Security | San Francisco

Cogent Security

Full-time|$100K/yr - $300K/yr|On-site|San Francisco, CA

About Cogent Security

Cogent Security is an innovative Applied AI Lab pioneering the future of AI agents in the realm of cybersecurity. In a world where cyber threats evolve at unprecedented speeds, our 'AI Taskforce' analyzes vast amounts of enterprise data to proactively address vulnerabilities and avert critical breaches.

We remain at the forefront of technology by merging cutting-edge research with practical applications. Our dedicated Cogent Research team fuels our mission, ensuring we develop truly effective security workflows powered by AI.

Since our inception, Cogent has rapidly grown, collaborating with Fortune 500 companies to safeguard the most intricate production environments globally.

Supported by Greylock, our team comprises some of the brightest minds in applied AI, including experts from:

Renowned universities such as Stanford, Berkeley, Penn, Duke, Carnegie Mellon, and Waterloo.
High-growth unicorn companies like Scale AI, Databricks, Stripe, Tesla, and Coinbase.
Leading cybersecurity specialists from Wiz, Abnormal AI, and Zscaler.
Prestigious research institutions including DeepMind and SAIL.

About the Role

As we embark on building a suite of backend services and integrations with our design partners, we seek passionate and skilled Backend Engineers at both Senior and Staff levels, eager to thrive in the Applied AI domain.

Responsibilities

Design and implement critical backend subsystems and integration platforms

Comprehend business objectives and customer requirements to engineer backend subsystems that align with our technology strategies.
Adapt systems to meet evolving needs of design partners and clients.
Incorporate non-functional requirements such as compliance and security into system design.

Establish scalable infrastructure foundations

Prepare for future growth in customer base, headcount, and data management by collaborating with your team to enhance infrastructure.

Nov 12, 2024

Security Engineer at Decagon | San Francisco

Decagon

Full-time|On-site|San Francisco

Join Decagon as a Security Engineer where you will play a crucial role in safeguarding our systems and data. You will collaborate with cross-functional teams to identify vulnerabilities, implement security measures, and ensure compliance with industry standards. This is an exciting opportunity to work in a dynamic environment and contribute to the security posture of our organization.

Apr 10, 2026

Security Engineer at Juicebox | San Francisco

Juicebox

Full-time|On-site|San Francisco

Juicebox is looking for a Security Engineer based in San Francisco. The main focus is to safeguard digital infrastructure and maintain the security of systems and data.

Key responsibilities

Develop and apply security measures throughout company systems
Support compliance efforts with relevant industry security standards

Location

This role is based in San Francisco.

Apr 20, 2026

Infrastructure Engineer at Chalk | San Francisco

Chalk

Full-time|On-site|SF

About Chalk

At Chalk, we are revolutionizing the data platform that drives the future of machine learning applications. Our mission is to eliminate the complexity, latency, and scalability issues that have historically limited ML capabilities. Our platform seamlessly integrates Rust-speed performance with user-friendly tools that developers adore. Renowned companies trust Chalk to combat fraudulent credit card transactions, verify identities, and enhance clean energy utilization. Recently, we secured a $50 million Series A funding, spearheaded by Felicis.

About the Role

We are on the lookout for talented engineers to join our Infrastructure team. This is a unique opportunity to become one of our early hires and significantly impact a fast-growing startup. You will have the autonomy to solve complex engineering challenges and take ownership of your projects.

We seek a platform engineer with a solid background in infrastructure engineering. At Chalk, we are tackling problems related to DBMS query planning, optimization, compilers, and distributed analytical data processing systems.

Chalk employs dynamic and static analysis of Python code to optimize arbitrary user Python code, orchestrate the necessary infrastructure implied by that code, and track metadata regarding data flow through our systems.

Our team works in the office five days a week. We are flexible with unavoidable conflicts, but this is not a hybrid position.

What You Will Do

Develop code to automate the orchestration and provisioning of infrastructure to implement Chalk technology for our customers and prospects.
Create a robust platform for managing our hosted services and deploying Chalk into customer-owned cloud environments across AWS and GCP.
Collaborate closely with our Engineering and Sales teams.
Contribute to interviewing and expanding the Engineering team.

What We’re Looking For

Minimum of 2 years of experience in software development for automated infrastructure management.
Proficiency in Python, Go, and/or Terraform.
Hands-on experience with AWS and/or GCP.
Strong collaborative skills in both technical and non-technical teams.

Dec 18, 2023

Infrastructure Engineer at rowspace | San Francisco

rowspace

Full-time|On-site|San Francisco

The Opportunity
Join rowspace as an Infrastructure Engineer and play a pivotal role in constructing and safeguarding the core of our cutting-edge AI data platform. In this position, you'll engineer systems capable of managing extensive volumes of sensitive financial information while adhering to rigorous security and compliance standards. Your work will involve real-time integration of public data with private, tenant-isolated customer data at scale.
Key Responsibilities
Design and implement scalable infrastructure to support our AI-driven knowledge engine that processes both structured and unstructured financial data.
Establish a security-first architecture for private cloud environments, ensuring data governance aligns with financial services regulations.
Create resilient data ingestion pipelines that accommodate a variety of data sources, from CapIQ feeds (structured data) to internal SharePoint documents (unstructured data).
Develop comprehensive monitoring and alerting systems for our BYOC platform.
Enforce access controls and maintain audit trails to ensure that AI interactions can be traced back to primary sources.
Collaborate with our AI Research and Product teams to enhance infrastructure for LLM inference and training workloads, as well as agent infrastructure development.
Establish CI/CD practices and infrastructure-as-code for swift, reliable deployments across multiple cloud providers.

Feb 4, 2026

Software Engineer, Data Infrastructure

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

At Thinking Machines Lab, our vision is to enhance human potential by advancing collaborative general intelligence. We are dedicated to creating a future where individuals have the resources and knowledge to harness AI for their specific objectives and aspirations.
Our team comprises scientists, engineers, and innovators who have developed some of the most popular AI products, including ChatGPT and Character.ai, as well as influential open-weight models like Mistral, along with highly regarded open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.
About the Role
We are seeking a talented engineer to enhance our data infrastructure. You will become part of a dynamic, high-impact team tasked with designing and scaling the foundational infrastructure for distributed training pipelines, multimodal data catalogs, and sophisticated processing systems that manage petabytes of data.
Our infrastructure is pivotal; it serves as the foundation for every groundbreaking achievement. You will collaborate directly with researchers to expedite experiments, develop novel datasets, optimize infrastructure efficiency, and derive essential insights from our data repositories.
If you are passionate about distributed systems, large-scale data mining, and open-source tools such as Spark, Kafka, Beam, Ray, and Delta Lake, and enjoy building innovative solutions from scratch, we encourage you to apply.
Note: This is an evergreen role that we keep open continuously for expressions of interest. We receive a high volume of applications, and while there may not always be an immediate position that aligns perfectly with your skills and experience, we encourage you to apply. We regularly review applications and reach out as new opportunities arise. You are welcome to reapply after gaining more experience, but please refrain from applying more than once every six months. We may also post for specific roles for particular projects or team needs, and in those cases, you are welcome to apply directly in addition to this evergreen role.

Nov 27, 2025

Applied AI Engineer - Technical Staff at Cogent Security | San Francisco

Cogent Security

Full-time|$100K/yr - $300K/yr|On-site|San Francisco, CA

About Cogent Security
Cogent Security is at the forefront of innovation in cybersecurity, operating as an Applied AI Lab dedicated to developing next-generation AI agents. In an era where cyber threats evolve rapidly, our "AI Taskforce" analyzes vast amounts of enterprise data to proactively address vulnerabilities and prevent significant breaches.
Blending cutting-edge research with practical implementation, our team works diligently to transform theoretical concepts into actionable security solutions. Our applied AI lab, Cogent Research, provides essential research capabilities that enable the realization of autonomous security workflows.
Since our emergence from stealth mode, Cogent has seen impressive growth, collaborating with Fortune 500 companies to enhance security in some of the world’s most complex environments.
Supported by Greylock, we have assembled a team of top-tier talent from influential universities and leading companies in the tech and cybersecurity sectors.

Nov 12, 2024

Infrastructure Engineer at Vibecode | San Francisco

Vibecode

Full-time|$150K/yr - $300K/yr|On-site|San Francisco

About Vibecode
At Vibecode, we are revolutionizing the way software is created. Our innovative platform empowers anyone to articulate an idea and instantly transform it into a fully functional application, no coding skills required.
We are tackling one of the most significant challenges in computing: aligning human intent with software execution. This endeavor necessitates groundbreaking advancements in AI reasoning, code generation, and user experience design.
Our impressive seed funding comes from some of the top investors globally, including Alexis Ohanian (776), Arielle Zuckerberg, Cyan Banister (Long Journey), Ali Partovi, Suzanne Xie (Neo), and numerous esteemed angels from Google, Expo, OpenAI, and beyond.
About the Role
Are you eager to be at the cutting edge of infrastructure design for a consumer product that will reach millions? If so, this opportunity is perfect for you.
We seek an Infrastructure Engineer to develop the foundational systems that support millions of AI-generated applications. You will design a platform capable of securely hosting thousands of user-created applications concurrently while ensuring optimal performance and unwavering reliability.
Your Responsibilities:
Develop and implement secure sandbox environments for executing untrusted AI-generated code at scale.
Create orchestration systems for stateless containers capable of launching over 10,000 applications simultaneously.
Architect backend API services for real-time code generation, compilation, and deployment.
Establish monitoring and observability systems for complex, multi-tenant application infrastructures.
Design auto-scaling solutions to manage unpredictable traffic patterns from viral consumer applications.
Build security-focused infrastructure that isolates user applications while preserving performance.
This is not conventional infrastructure work. You will face unique challenges related to large-scale code execution, develop systems that are yet to be created, and establish infrastructure paradigms suited for the AI-native era.

Jun 7, 2025

Infrastructure Research Engineer at thinkingmachines | San Francisco

Thinking Machines Lab

Full-time|$350K/yr - $475K/yr|On-site|San Francisco

Nov 27, 2025

Infrastructure Engineer - Security at thinkingmachines | San Francisco

Thinking Machines Lab