Experience Level
Senior
Qualifications
What You Need to Succeed
Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent hands-on experience.
Professional Experience: At least 5 years in a DevOps, Infrastructure, or Site Reliability Engineering role within a fast-paced tech environment or startup.
Technical Skills: Proficiency in CI/CD tools (e.g., GitHub Actions, ArgoCD), Docker, and Kubernetes.
Infrastructure Expertise: In-depth knowledge of cloud services (AWS/GCP), distributed systems, Infrastructure as Code (IaC) tools like Terraform or Pulumi, and secrets management solutions (e.g., Vault, SSM).
Observability Acumen: Strong grasp of logging, metrics, and monitoring practices in large-scale distributed systems.
About the job
Join CodeRabbit as a Senior DevOps Engineer!
At CodeRabbit, we are at the forefront of research and development, crafting cutting-edge systems for human-machine collaboration. Our mission is to revolutionize software development by creating the next generation of AI-driven code review tools. These advancements represent a powerful synergy between human creativity and advanced algorithms, allowing us to maximize productivity and elevate code quality to unprecedented heights.
As a Senior DevOps Engineer, you will be instrumental in scaling, securing, and optimizing the infrastructure that fuels our AI-powered developer tools. Collaborating closely with our platform engineers, backend developers, and applied AI teams, you will ensure our systems are robust, efficient, and easy to deploy, all while maintaining high standards of observability and resilience.
This position is ideal for a proactive individual who thrives in dynamic environments, takes initiative with critical infrastructure, and is passionate about developing tools that empower an ambitious engineering team.
About CodeRabbit
CodeRabbit is a pioneering research and development firm dedicated to enhancing human-machine collaboration through innovative AI-driven solutions. Our aim is to build tools that not only improve the efficiency of software development but also foster a collaborative environment between human engineers and advanced algorithms.
Similar jobs
1 - 20 of 11,301 Jobs
Search for DevOps Engineer at LiteLLM San Francisco
Join our innovative team at litellm as a DevOps Engineer. In this role, you will be instrumental in enhancing our development and operations processes, ensuring seamless integration and delivery of our services. Collaborate with cross-functional teams to design, implement, and manage scalable infrastructure solutions.
We are looking for a passionate individual with a strong foundation in cloud technologies, automation, and continuous integration/continuous deployment (CI/CD) practices. Your expertise will help us drive efficiency and reliability in our software delivery lifecycle.
Senior Backend Engineer
Join LiteLLM, the leading AI Gateway trusted by industry giants including Adobe, Netflix, and NASA. Our platform empowers developers with secure and reliable access to Large Language Models (LLMs) and related services. We are currently seeking a dedicated Senior Backend Engineer to contribute to the development of robust guardrails and observability tools at scale.
About The Role
In this role, you'll take ownership of our guardrail and logging implementations. You will oversee backend code to ensure all guardrail interactions are logged accurately, user errors are communicated effectively, and that our observability tools function seamlessly under high traffic conditions. Your meticulous attention to detail regarding latency metrics, logging traceability, and backend guardrail registration will significantly bolster user confidence in our security and compliance capabilities.
Responsibilities
Develop and enhance our product to maximize performance, reliability, and ongoing improvements.
Ensure guardrail and policy enforcement calls (e.g., applyguardrail) are accurately logged and traceable through our SpendLogs and relevant database tables.
Design and implement CPU-level guardrails to mitigate common attacks on LLM APIs, MCP servers, and Agents.
Identify and resolve silent failure points in guardrail creation, registration, and policy application, ensuring robust error handling and transparency for end users.
Collaborate with observability tools such as Datadog, Splunk, Prometheus, and OpenTelemetry to maintain accurate, configurable, and effective monitoring and logging for backend systems.
Enhance observability integrations to manage over 1 billion requests per month with minimal latency and no memory leaks due to Prometheus metrics cardinality.
Work cross-functionally on backend engineering priorities including performance, reliability, and security enhancements.
What We’re Looking For
Bachelor’s or Master’s degree in Computer Science or a related field.
4+ years of experience with Python and backend frameworks (e.g., FastAPI, Flask).
Strong understanding of logging best practices, error handling, and secure backend development.
Familiarity with monitoring, logging, or metrics tools.
Backend Engineer - New Graduate Opportunity
Join LiteLLM, the leading AI Gateway, trusted by renowned organizations such as Adobe, Netflix, and NASA. Our innovative platform provides developers with secure and reliable access to Large Language Models (LLMs) and related services. We are seeking a passionate Backend Engineer (New Grad) to contribute to building robust guardrails and observability tools at an extensive scale.
Role Overview
In this role, you'll be instrumental in enhancing our guardrails and logging mechanisms. You will take ownership of the backend code ensuring that all guardrail calls are accurately logged, errors are made visible to users, and our observability tools are effective under high-traffic conditions. Your meticulous attention to detail in latency metrics, logging traceability, and backend guardrail registration will significantly influence user trust in our security and compliance features.
Key Responsibilities
Ensure all guardrail and policy enforcement calls (e.g., applyguardrail) are logged and traceable within our SpendLogs and relevant database tables.
Proactively identify and resolve silent failures in guardrail creation, registration, and policy application, ensuring robust error handling and clarity for end-users.
Collaborate with observability integrations, including Datadog, Splunk, Prometheus, and OpenTelemetry, to maintain effective monitoring and logging for backend systems.
Refactor and enhance our Prometheus integration to facilitate configurable latency histogram buckets that can scale for high-traffic environments.
Work collaboratively across teams on backend engineering priorities such as performance, reliability, and security.
Qualifications
Recent graduate with a Bachelor’s or Master’s degree in Computer Science or a related field.
Proficient in Python and familiar with backend frameworks like FastAPI or Flask.
Knowledge of logging best practices, error handling, and secure backend development principles.
Exposure to monitoring and logging platforms such as Datadog, Splunk, Prometheus, or OpenTelemetry.
Familiarity with database integration and troubleshooting (e.g., PostgreSQL, Redis).
A strong drive to deliver high-quality backend code with attention to detail.
About CodeRabbit
CodeRabbit is a pioneering research and development firm that specializes in crafting highly efficient human-machine collaboration systems. Our mission is to develop the next wave of AI-driven code review solutions: a collaborative synergy between human creativity and advanced algorithms that surpasses the capabilities of individual engineers. By integrating state-of-the-art language models with human insight, we aim to redefine the standards of software development efficiency and quality.
Role Overview
As a DevOps Engineer at CodeRabbit, you will be instrumental in scaling, securing, and fortifying the infrastructure that supports our AI-powered developer tools. Collaborating with our platform engineers, backend team, and applied AI specialists, you will ensure our systems are robust, observable, high-performing, and easy to deploy.
This position is hands-on and tailored for an individual who excels in a dynamic environment, takes initiative in managing critical infrastructure, and is eager to develop tools that empower an ambitious engineering team.
Responsibilities
Design, implement, and manage scalable CI/CD pipelines.
Develop and oversee infrastructure as code (e.g., Terraform, Pulumi).
Enhance system reliability through effective monitoring, alerting, logging, and failover strategies.
Collaborate with platform and backend teams to identify and mitigate performance bottlenecks.
Contribute to deployment workflows, environment automation, and developer tooling enhancements.
Ensure that infrastructure security and compliance measures are rigorously enforced.
Join Our Team as a Senior Security Engineer
At LiteLLM, the leading AI Gateway trusted by industry giants such as Adobe, Netflix, and NASA, we empower developers with secure and reliable access to LLMs and associated services. We are seeking a talented Senior Security Engineer to establish robust security measures and observability tools as we scale our platform.
Your Role:
Become a cornerstone of our security team as our inaugural Security Engineer, tackling pivotal security challenges head-on.
Key Responsibilities:
Perform in-depth security assessments of the LiteLLM proxy codebase to uncover potential supply chain vulnerabilities.
Develop and manage automated security scans for our Docker images, PyPI packages, and CI/CD workflows (including dependency scanning and secrets detection).
Create and enforce secure-by-default configurations for both cloud and self-hosted environments (API authentication, IAM least privilege, key rotation).
Implement and oversee intrusion detection systems and alerts tailored to model and API usage patterns.
Lead incident response efforts and post-mortem analyses, including vulnerability assessments and stakeholder communication.
Establish a formal CVE triage and disclosure protocol in collaboration with the engineering team.
Conduct internal red teaming and adversarial testing to simulate real-world attacks and enhance our defenses.
Collaborate with engineering teams to fortify release pipelines (signed builds, provenance checks, reproducible builds).
Develop secure coding standards and conduct regular training sessions for developers focused on supply chain and dependency management.
Maintain and update threat models as LiteLLM’s products and architecture evolve.
Join Lever, a pioneering force in the recruitment technology sector, as a DevOps Engineer. This position is crucial in enhancing our platform that powers the hiring processes for top-tier companies like Netflix, Shopify, and Spotify. You will play a vital role in our mission to innovate and streamline talent acquisition.
Lever, founded a decade ago, is committed to redefining how organizations attract and hire the best talent. We are proud to be recognized as the #1 workplace in San Francisco and a top company to work for across the United States. Our culture is centered around our team members, whom we refer to as “Leveroos,” and we continuously invest in their growth and success.
Why Choose Flux?
At Flux, we are transforming the hardware landscape by creating the world's first AI Hardware Engineer. Our mission is to democratize access to cutting-edge hardware development and revolutionize global electronics design and manufacturing.
About the Opportunity
As a DevOps Engineer at Flux, you will be integral in ensuring the smooth operation of our innovative platform. Your work will encompass a wide range of full-stack systems, impacting various aspects of our service, including billing, authentication, onboarding, and seamless integrations.
Your contributions will directly influence user experience, and your role will be crucial in maintaining operational efficiency as Flux continues to scale.
Key Responsibilities
Enhance the reliability, availability, and operational health of our production systems.
Establish observability standards across services, including metrics, logs, and error tracking.
Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs) while implementing effective alerting strategies.
Collaborate with engineering teams to design robust systems and proactively mitigate operational risks.
Develop internal tools to enhance system safety, debugging capabilities, and developer productivity.
Manage infrastructure using Pulumi across GCP, AWS, and Firebase.
Why Join Flux?
At Flux, we are pioneering the future of technology by creating the world's first AI Hardware Engineer. Our mission is to make cutting-edge hardware accessible to everyone, transforming the landscape of electronics design and development globally.
About the Opportunity
As a Senior DevOps Engineer at Flux, you will be crucial in ensuring the seamless operation of our innovative platform. Your focus will extend beyond the code editor to vital systems such as billing, authentication, user onboarding, and integrations.
You will have the opportunity to deliver production features that impact all users, maintain operational stability, and enable Flux's growth trajectory.
litellm seeks a Solutions Architect based in San Francisco. This position centers on designing and implementing solutions tailored to client requirements.
Role overview
As a Solutions Architect, the main responsibility is to create and deliver technical solutions that align with client goals. The work contributes directly to litellm’s service offerings and aims to strengthen customer satisfaction.
What you will do
Design solutions that address specific client needs
Implement these solutions as part of litellm’s services
Support ongoing efforts to improve the customer experience
Location
This role is based in San Francisco.
About Artisan
At Artisan, we're pioneering the development of AI employees – not mere chatbots or copilots, but fully autonomous digital workers capable of performing real jobs.
Our flagship product, Ava, is an AI Business Development Representative (BDR) already utilized by hundreds of companies. Ava autonomously researches leads, crafts and sends emails in a customer's voice, executes complex outbound sequences, manages her own deliverability infrastructure, optimizes her performance over time, and effectively handles objections and meeting scheduling. Rather than being just a tool, Ava acts as a vital teammate.
As a Y Combinator W24 company, we've successfully raised over $35 million from esteemed investors and have reached over $8 million in Annual Recurring Revenue (ARR). Currently, we are ambitiously developing Ava 2.0, which represents a significant leap in the capabilities of AI employees. The engineering challenges we face are complex and the scope of our work is vast.
Your Role
As a Staff DevOps Engineer at Artisan, you'll be responsible for our platform that manages hundreds of millions of leads, orchestrates autonomous AI agents in real-time, and facilitates massive-scale email distribution across thousands of customer mailboxes. You will play a key role in delivering a product suite that effectively replaces an entire sales stack, including CRM, inbox, dialer, lead database, and campaign engine. The infrastructure that underpins this functionality is critical to our success.
In this staff-level position, you will take ownership of the entire DevOps and infrastructure layer at Artisan, setting the groundwork, defining best practices, and developing the systems that all other functions rely on.
Your key responsibilities will include:
Kubernetes and Container Orchestration: You will manage our AWS-based Kubernetes infrastructure, overseeing cluster management, scaling policies, resource optimization, and the compute layer that supports AI workloads, data pipelines, and customer-facing product infrastructure.
CI/CD and Deployment Pipelines: We implement continuous shipping practices. You will design and maintain the deployment infrastructure that allows engineers to deploy with confidence multiple times a day, including automated testing, staging environments, rollback strategies, and feature flags.
Observability and Reliability: You will build a comprehensive observability layer from the ground up, ensuring effective monitoring, alerting, logging, and tracing. Quick detection and understanding of failures, such as an AI agent failing to send an email or a stalled pipeline, will be a priority.
Email Deliverability Infrastructure: Ava’s ability to send emails at scale across thousands of domains and mailboxes relies on intricate sender reputation management, domain warming, DNS configuration, and IP management.
Security and Compliance: Managing customer data and email credentials requires strict adherence to security standards and compliance regulations.
About Plaud Inc.
Plaud is pioneering the development of a highly trusted AI work companion tailored for professionals, enhancing productivity and performance through innovative note-taking solutions that have garnered the affection of over 1,500,000 users globally since its inception in 2023. Our mission is to amplify human intelligence by constructing next-generation infrastructure and interfaces capable of capturing, extracting, and utilizing verbal, auditory, visual, and cognitive information.
Headquartered in San Francisco and incorporated in Delaware, Plaud Inc. is at the forefront of merging human and AI intelligence through an integrated hardware and software approach. We adhere to the highest standards of data security and privacy protection, with compliance certifications including ISO 27001, ISO 27701, GDPR, SOC 2, HIPAA, and EN 18031.
Discover more about us at https://www.plaud.ai and connect with us on Instagram, X, Facebook, LinkedIn, and YouTube.
Why You Should Join Us
Be part of a bootstrapped, rapidly growing, and profitable company that has achieved an impressive $250 million revenue run rate within just three years.
Help define the future of human-AI interaction.
Engage with cutting-edge AI tools and contribute directly to our international growth.
Collaborate with enthusiastic colleagues who prioritize innovation, teamwork, and customer success.
Advance your career in a culture that promotes continuous learning and development.
Full-time|$180K/yr - $220K/yr|On-site|San Francisco
About AKASA
At AKASA, we are dedicated to revolutionizing healthcare through artificial intelligence. As the foremost provider of generative AI solutions for the healthcare revenue cycle, we enable health systems to effectively capture and convey the entire patient clinical journey. Our innovative approach allows healthcare providers to enhance operational efficiency, allowing them to prioritize what truly matters: delivering outstanding patient care. We have successfully secured over $205 million in funding from top-tier investors, including Andreessen Horowitz, BOND, and Costanoa Ventures.
This is an exhilarating time to join AKASA. Since the launch of our AI-native product suite in 2024, revenue bookings have skyrocketed over 20 times. During this period, we have shattered our record for the largest deal in our company’s history three times in a row. This remarkable growth is a testament to the significant advancements we are facilitating for our clients in clinical quality and documentation accuracy, two critical focus areas for healthcare leaders.
Our deployments have garnered national recognition as one of the most extensive real-world applications of GenAI in healthcare finance to date. Our clientele represents over $120 billion in net patient revenue, comprising some of the most forward-thinking health systems in the nation, including Cleveland Clinic, Duke, Stanford, and Johns Hopkins.
We have recently earned accolades such as being named the #1 most promising healthcare RCM startup of 2025 by Black Book Market Research and one of the fastest-growing GenAI startups to monitor by AIM Research. Our CEO has been recognized among the “Top 50 Healthcare Technology CEOs” by the Healthcare Technology Report, and we have proudly maintained our status as a “Great Place to Work” for the past five consecutive years.
We are leveraging this momentum to redefine the possibilities within the healthcare industry. We are in search of exceptional talent to help us accelerate this vision.
About the Role
As a Senior Software Engineer specializing in DevOps at AKASA, you will collaborate closely with our Infrastructure and Platform teams to manage, enhance, and scale the systems that fuel our products. Your primary focus will be to ensure that our infrastructure is reliable, observable, and straightforward to operate, with a strong emphasis on automation, operational excellence, and cross-functional teamwork.
Are you a passionate and experienced DevOps engineer looking to elevate your career? Join us at 360 IT Professionals, where we pride ourselves on innovation and excellence in IT solutions. As a Senior DevOps Engineer, you will play a pivotal role in enhancing our deployment processes, ensuring the reliability and scalability of our infrastructure, and collaborating with cross-functional teams to drive project success.
Your expertise will help us optimize our cloud environments and implement best practices in CI/CD pipelines. If you thrive in a dynamic environment and are eager to tackle challenging problems, we want to hear from you!
Join Levelpath:
At Levelpath, we are redefining procurement with our cutting-edge AI platform designed for global enterprises. Founded in 2022, our mission is to transform procurement into a seamless experience. With Levelpath, organizations achieve agility in days and unlock substantial savings in weeks. Our platform provides a comprehensive view of expenses, facilitating collaboration among procurement, finance, legal, and IT teams from sourcing to payment. Our AI-driven solutions deliver relevant insights, enhancing efficiency and uncovering valuable opportunities.
Engineering Culture at Levelpath:
We are on the lookout for a skilled DevOps Engineer who will play a crucial role in scaling our innovative products. Your proven experience with Terraform and AWS (focusing on IAM, ECS, RDS, Route53, S3, SES) will be essential as you work closely with our engineering teams utilizing Ruby on Rails and React.
This is an on-site position at our San Francisco office, where you will elevate our infrastructure, enhance system performance, and revolutionize observability through AI integration. Join our high-performance engineering culture that leverages AI to create a self-optimizing platform, significantly improving developer productivity and expediting delivery. Become part of a vibrant, international team dedicated to rapid iteration to exceed customer expectations.
Your Responsibilities:
Design, build, and maintain scalable infrastructure for our applications using Terraform.
Lead initiatives to optimize system performance, resource utilization, and ensure high availability in our AWS environment.
Advance our observability practices by developing and implementing robust monitoring, logging, and tracing frameworks.
Integrate AI into our observability stack (AIOps) for predictive alerting and automated anomaly detection, enhancing incident response efficiency.
Utilize AI coding assistants and autonomous agents to accelerate Infrastructure as Code (IaC) development and resolve complex infrastructure issues rapidly.
Apply your expertise to enhance infrastructure quality, developer tools, and operational workflows.
Full-time|$135K/yr - $225K/yr|Hybrid|San Francisco Bay Area
About Us
Ema is at the forefront of developing cutting-edge AI technology designed to empower employees across enterprises to maximize their creativity and productivity. Our innovative platform allows organizations to delegate repetitive tasks to Ema, the AI employee. Founded by former executives from Google, Coinbase, and Okta, as well as seasoned entrepreneurs, we have attracted investment from prestigious firms including Accel Partners, Naspers, and Section32, along with influential Silicon Valley Angels such as Sheryl Sandberg, Divesh Makan, Jerry Yang, Dustin Moskovitz, David Baszucki, and Gokul Rajaram.
Our exceptional team includes engineers from top tech giants like Google, Microsoft Research, Facebook, Square/Block, and Coinbase. We are proud to have talent from elite educational institutions such as Stanford, MIT, UC Berkeley, CMU, and IIT. With substantial backing from leading investors, Ema operates from Silicon Valley and Bangalore, India. This position is hybrid, requiring employees to work on-site three days a week.
Your Role
We are in search of a skilled DevOps Engineer to join our dynamic team. You will play a crucial role in the design and construction of our platform and infrastructure, ensuring it scales effectively to meet the demands of our growing product and user base. You will thrive in a fast-paced environment, focusing on system reliability, scalability, and performance, while working on service architecture, deployment, query optimization, distributed systems, machine learning infrastructure, and security protocols. Most importantly, you will be part of a mission-driven, high-growth startup poised to make a significant impact.
Key Responsibilities
Collaborate with product teams to architect and build the foundational infrastructure for our offerings.
Design, develop, and deploy resilient, highly available multi-tenant SaaS solutions across public cloud platforms such as AWS, Azure, and GCP, utilizing technologies like Kubernetes, Helm, Terraform, and Istio.
Automate infrastructure tasks including provisioning, configuration management, and deployment, using tools like Terraform, Ansible, and Kubernetes.
Work closely with software development teams to enhance CI/CD pipelines, optimize service interfaces, and improve deployment strategies.
About the Role
litellm is hiring a Developer Evangelist in San Francisco. This role connects the company’s technology with the developer community. The Developer Evangelist builds relationships, shares knowledge, and helps developers make the most of litellm’s tools.
What You Will Do
Engage directly with developers to understand their needs and answer questions about litellm’s products
Share insights and updates about new features and solutions
Support and grow a community around litellm’s technology
Gather feedback from developers to help shape future product direction
Location
This position is based in San Francisco.
Full-time|On-site|San Francisco, California, United States
Join code-metal as a Senior Platform DevOps Engineer, where you will play a pivotal role in enhancing our cloud and on-premises infrastructure. You will be responsible for deploying, managing, and optimizing systems to ensure high availability and performance. This position offers an exciting opportunity to work with cutting-edge technologies and collaborate within a dynamic team.
About Us
At Amari AI, we are revolutionizing global trade by eliminating outdated, manual workflows in the logistics sector. Our innovative AI agents work in tandem with human operators to automate cumbersome, document-intensive tasks, enabling companies to expedite shipment processing while minimizing errors.
Having successfully transitioned beyond the initial development phase, we have established a strong product-market fit, evidenced by our impressive growth rate of over 100% month-over-month in revenue. With a recent funding round of $5 million from First Round Capital and Pear VC, we are poised for significant expansion. Our highly skilled team comprises professionals from leading tech giants such as Google, LinkedIn, and Salesforce, as well as esteemed academic institutions and AI research labs.
The Role
We are seeking a talented Production Engineer who expertly navigates the realms of software development and systems engineering. Your primary objective will be to ensure our production environment is robust, fully automated, and easily observable. You will take charge of our CI/CD pipelines, oversee our AI infrastructure, and develop the internal tools that empower our development team to deliver code more rapidly and reliably.
About the Role
Join our dynamic team at allinbits as a Platform Engineer, where your expertise will be vital in designing and maintaining the robust infrastructure that supports our cutting-edge projects. Your role will combine technical acumen with strategic insight, ensuring our development and operational environments are finely tuned for optimal performance, reliability, and scalability.
We prioritize experience in our team, especially if you have transitioned from a developer role into DevOps or Site Reliability Engineering (SRE). Your capacity to innovate and construct resilient systems will prove invaluable.
In this position, you will utilize tools such as Ansible, Docker, and HashiCorp Nomad to enhance our operations.
Feb 4, 2025