DevOps Cloud Engineer
Experience Level
Mid to Senior
About Innovative Solutions
Innovative Solutions is a leading partner in cloud transformation, dedicated to helping organizations optimize their IT investments through advanced AWS solutions. With a focus on cutting-edge technology and customer-centric service, we empower businesses to thrive in the digital landscape.
Similar jobs
Stratos Labs Inc.
Overview
Role: Lead Software Engineer - DevOps and Infrastructure
On-site role at our New York City headquarters, 5 days a week
Annual base salary: $175,000 - $250,000
Equity: Competitive initial equity package along with refreshers
Minimum of 3 years of relevant experience required
About Stratos Labs
Stratos Labs is revolutionizing commodity risk management for the $10 trillion physical economy. Our innovative platform merges real-time market data with AI-driven exposure modeling and automated trade generation, empowering operators with precise tools to manage volatility. From instantaneous trade execution to ongoing monitoring, alerts, and actionable recommendations, Stratos Labs transforms complex market risks into a seamless, always-on hedging solution. Founded in 2023 by a former macro market-maker from Barclays and a trading systems engineer from Coinbase, we have successfully raised over $20 million in funding from esteemed investors including Andreessen Horowitz (a16z), Crucible Capital, Neo, and DST Global.
Key Responsibilities
· Manage and Optimize AWS Cloud: Take complete ownership of our AWS environment, architecting, scaling, and optimizing our cloud infrastructure to enable new services while ensuring it is cost-effective and integrated with our core trading systems.
· Infrastructure Architecture: Lead the transition to a full Infrastructure as Code (IaC) model by developing a sophisticated Terraform stack that eliminates manual configurations.
· Security and Compliance: Act as the gatekeeper for our environments, managing container security, CVE remediation, and ensuring compliance with SOC2 standards without hindering team productivity.
· Observability Strategy: Design a comprehensive monitoring strategy to proactively identify bottlenecks before they lead to outages.
· CI/CD Pipeline Refinement: Enhance our GitHub Actions and CI/CD workflows to streamline the process of deploying Go, C++, and TypeScript services from commit to production.
· Reliability and Incident Response: Participate in on-call rotations, conducting thorough post-mortems to ensure that recurring issues are resolved effectively.
Genius Sports
At Genius Sports, we combine cutting-edge technology with premier live data to revolutionize the sports experience for fans around the globe. Our mission is to create more immersive, interactive, and personalized experiences than ever before. Discover more about us at geniussports.com.
The Role - Staff Engineer - Infrastructure Platform
We are on the lookout for an exceptional Staff Engineer to spearhead critical projects within our core infrastructure platform. Genius Sports is currently integrating its diverse tech teams and acquisitions under a cohesive technical strategy, and our infrastructure platform is the foundation of this transformation. Our primary objective is to empower engineering teams to efficiently build, deploy, and manage Genius Sports’ extensive product catalog in a consistent manner. In this role, you will collaborate with fellow InfraPlat leaders to define and execute the technical vision and implementation across an array of projects. These initiatives encompass multi-account and region Kubernetes clusters, MLOps, standardized deployment processes, and a centralized authentication platform. You will also engage with stakeholders from product engineering teams to assess requests, identify common challenges, and prioritize initiatives.
Join Our Team at Basic Capital
Basic Capital is at the forefront of transforming America’s $1 trillion retirement sector. Our innovative approach focuses on developing the mortgage for retirement, granting market access, and ensuring that wealth is within reach for every American. Our mission is to create cutting-edge products, platforms, and a comprehensive credit marketplace that revolutionizes the retirement system.
Our founding team comprises seasoned professionals from prestigious companies like Goldman Sachs, Uber, Block, Stripe, and Robinhood. Supported by top-tier investors such as Lux Capital, Forerunner Ventures, BoxGroup, SVAngel, Inspired Capital, and Henry Kravis, we are located in SoHo, NYC, and are building a dynamic, high-performance team dedicated to mitigating wealth inequality.
Discover more about us by visiting Basic Capital’s website.
Your Role
· Become a pivotal member of our Infrastructure & Security team, ensuring platform reliability, security integrity, and compliance that empowers Basic Capital to scale with assurance.
· Take ownership of our SOC 2 Type II compliance journey by implementing security measures, documenting essential processes, and collaborating with auditors to secure and uphold certification.
· Architect and implement robust observability frameworks, including logging, monitoring, alerting, and distributed tracing across our infrastructure.
· Develop and uphold CI/CD pipelines, automate deployments, and establish infrastructure-as-code practices to facilitate rapid, secure releases.
· Enhance system performance, reduce latency, and optimize cost-efficiency as we grow to accommodate a larger customer base and increased transaction volumes.
· Oversee our cloud infrastructure (AWS), managing networking, security configurations, IAM policies, and compliance measures.
Profound
At Profound, we are dedicated to empowering businesses to comprehend and manage their AI presence effectively. As an Infrastructure Software Engineer, you will play a pivotal role in building and scaling the systems that facilitate our rapid expansion. Your primary focus will be on ensuring that our infrastructure is not only highly available but also cost-efficient and capable of managing significant traffic and compute demands. You will collaborate closely with engineers across product, research, and operations teams to design scalable architectures, streamline deployment processes, and enhance system observability.
Your Responsibilities
· Design and maintain core infrastructure across diverse cloud environments.
· Develop infrastructure-as-code workflows to automate deployment and scaling processes.
· Enhance monitoring, logging, and alerting systems to ensure system reliability.
· Oversee CI/CD pipelines to facilitate seamless deployments.
· Assist in disaster recovery planning to guarantee high system availability.
· Collaborate with product and research teams to design architectures that can scale with increasing workload demands.
· Identify and resolve performance bottlenecks in compute, storage, and networking.
· Implement security best practices and compliance frameworks across the infrastructure.
Amperos Health
About Amperos Health
Amperos Health stands at the forefront of revolutionizing revenue cycle management (RCM) for healthcare clinics, enabling them to optimize revenue collection efficiently and swiftly. Established in 2023 and supported by notable investors such as Uncork, Neo, Nebular.vc, and strategic angels from industry leaders like OpenAI, Stripe, and Twilio, our mission is to transform the relationship between healthcare providers and payers. We envision an AI-driven workforce that alleviates administrative workload and accelerates the financial return for healthcare professionals.
About the Role
As our inaugural Infrastructure Engineer, you will spearhead the development of our infrastructure at Amperos. You will manage DevOps, enhance developer experience, ensure compliance, and oversee observability and monitoring within our AWS environment. This role presents exciting challenges associated with AI infrastructure, and you will have the chance to mold and lead your team as we expand.
Key Responsibilities
· Establish and maintain the infrastructure to support Amperos' growth trajectory.
· Enhance and optimize AWS infrastructure, boosting development efficiency and creating modular engineering frameworks.
· Develop systems that expedite the deployment of AI features and enhance observability of large language models (LLMs).
· Minimize AWS expenditures while improving visibility into infrastructure costs.
· Lead initiatives related to security and technical compliance, including VPNs, firewalls, and network configurations.
Candidate Profile
· A minimum of 5 years of experience in leading infrastructure teams within top-tier organizations.
· Ability to identify key ROI infrastructure challenges from the outset and develop a strategic roadmap for enhancements.
· Proficient in leading engineering discussions and articulating the impact of infrastructure projects.
· Adaptable and willing to take on diverse roles as needed; no task is too small.
· Demonstrates high agency, with the capability to produce results with minimal guidance.
· Excellent verbal and written communication skills; proactively shares vital information with the team.
· Passionate about leveraging technology to address challenges in an underserved industry.
Harvey develops AI-driven solutions for legal and professional services, serving over 1,000 organizations in more than 60 countries. The company is growing rapidly and has strong support from leading investors. Harvey’s team values ownership, quick decision-making, and close collaboration. Engineers work side by side with leadership and customers to address practical challenges. The company operates in person in New York City and provides relocation support for new hires.
Role overview
The Staff Software Engineer - Core Infrastructure joins a team responsible for designing, building, and scaling Harvey’s core infrastructure. This platform handles billions of prompt tokens and millions of daily requests, forming the backbone of Harvey’s global legal AI services. The work involves both creating new systems and maintaining high operational standards. Reliability, scalability, and security are central as Harvey continues to expand its reach among top law firms and professional service providers. This is a full-time, in-person position based in New York City. Relocation assistance is available.
What you will do
· Design and build scalable, fault-tolerant infrastructure systems that support Harvey’s AI platform across multiple cloud regions.
· Take ownership of and improve multi-cloud infrastructure (Azure, GCP), focusing on Kubernetes orchestration, networking, and container management.
· Lead technical projects in areas such as observability, incident response, and performance optimization.
About Kiddom
Kiddom is an innovative educational platform dedicated to enhancing student equity and growth. By combining high-quality instructional resources with interactive digital learning, Kiddom enables schools and districts to take control of their curriculum. This results in personalized learning experiences that cater to the distinct needs and aspirations of local communities. The platform is enriched with insightful data for teachers and leaders, fostering continuous improvement in instructional strategies, school programming, and professional development.
As a member of the InfraOps team, you will play a crucial role in supporting Kiddom's engineering efforts by developing a scalable and sustainable infrastructure that aligns with our company objectives. This is a unique chance to join a small, skilled team during a pivotal moment as we transition from Series C to D funding. You will have the autonomy to influence your work significantly, embracing a variety of tasks that reflect your expertise and enthusiasm.
Key Responsibilities:
· Promote and cultivate a robust DevOps culture at Kiddom by collaborating with teams to establish best practices and guide both new and existing services.
· Implement Infrastructure as Code (IaC) to ensure confidence in automated, repeatable processes.
Playground
Playground builds software to help child care providers manage their businesses more efficiently. The platform supports thousands of schools across the United States, with a focus on making high-quality child care more accessible. The company has secured millions in funding and holds several statewide contracts, reflecting strong momentum in the education technology space. Playground’s founders have been recognized on the Forbes 30 Under 30 list for their work in child care technology. The team values a culture where ownership and collaboration are central. Engineers at Playground regularly work together on challenging projects that have a direct impact on customers and the broader child care sector.
Role overview
This Staff Software Engineer position is based in New York City. The role centers on building and improving Playground’s core products as the company grows. Engineers here contribute to significant technical decisions and help shape the future of the platform.
What you will do
· Work on complex software projects that support child care businesses nationwide
· Collaborate closely with other engineers and teams to deliver high-impact solutions
· Take ownership of technical challenges and contribute to a culture of shared responsibility
Requirements
· Experience leading or contributing to large-scale software projects
· Ability to work effectively in a collaborative, team-driven environment
· Based in or willing to work from New York City
Join Highbeam as we revolutionize business banking and cash management.
Our innovative platform integrates AI agents, automated financial workflows, and comprehensive financial products designed to optimize time and cost for brands.
Our impressive clientele has generated billions in sales, featuring renowned brands such as Cuts, Tushy, NYON, Sabah, Still Here, Alice Mushrooms, Original Grain, birddogs, and many others.
Our dynamic team includes talents from Shopify, Square, Toast, Rippling, and McKinsey.
We have successfully raised $42 million in equity funding from prominent investors including Acrew, FirstMark, Mayfield, and Two Sigma Ventures.
About the Role
We are seeking a DevOps & Platform Engineer to lead the management of our cloud infrastructure, observability, and development tools. You will be pivotal in enhancing our platform's scalability, reliability, and security, enabling our engineering team to deploy features swiftly and safely.
This role is at the crossroads of infrastructure, developer experience, and backend systems. You will collaborate closely with our backend, security, and product teams, while occasionally contributing directly to feature development.
Our technology stack includes Google Cloud, PostgreSQL, Terraform, Kotlin, TypeScript / React, and Python. Explore our open-source Kotlin toolkit Kairo.
Thread AI
Join Thread AI as an Infrastructure Software Engineer
At Thread AI, we are pioneering the development of an AI-native workflow orchestration engine. We are in search of passionate and skilled professionals eager to be a part of our expanding team. Our mission is to simplify infrastructure for enterprises and public sector organizations aiming to harness the full potential of artificial intelligence.
Located in the heart of New York, our diverse team comprises seasoned experts in AI, product management, and engineering, all dedicated to crafting and implementing intricate workflows and infrastructure solutions.
Our Engineering Culture
We pride ourselves on having a compact yet highly committed technical team that encompasses engineering, research, design, product, and operations. We believe that a small, empowered team with a flat organizational structure fosters rapid innovation and superior product development compared to larger, hierarchical entities.
Role Overview
We are looking for a talented software engineer with a robust background in infrastructure to contribute significantly to the scalability and reliability of our platform. The ideal candidate will support complex network topologies and should possess a deep curiosity and creativity, excelling in problem-solving within a fast-paced, ownership-driven environment.
Are you ready to revolutionize the world of finance? At Superstate, we are dedicated to crafting investment products that leverage the rapidity, programmability, and compliance benefits of blockchain tokenization, steering clear of the inefficiencies of traditional finance.
As a Staff DevOps Engineer, you will play a pivotal role in constructing and sustaining a highly reliable and scalable infrastructure that underpins our innovative financial products. Your expertise will be crucial in designing robust systems that guarantee our platforms' exceptional performance, availability, and security as they scale. You'll take on diverse responsibilities, including general DevOps, security, observability, and reliability, leading these vital initiatives.
NBCUniversal Media, LLC
Join our innovative team at NBCUniversal as a Staff Software Engineer specializing in AI Infrastructure and Python development. This role involves designing, building, and maintaining scalable AI systems. We seek a passionate engineer who thrives in a collaborative environment and is eager to contribute to cutting-edge projects that impact millions of users.
Join the Revolution at Check
At Check, we transform the way people get paid, simplifying payment processes for payroll businesses. As pioneers of embedded payroll, we collaborate with our partners to redefine payroll systems, enabling businesses to launch, grow, and flourish. Discover our journey | Listen in.
Check is more than just an API infrastructure; we serve as a launchpad for payroll businesses.
Our Team
Payroll systems are in need of innovation. Join a passionate team dedicated to solving these challenges! At Check, your problem-solving skills, critical thinking, and determination will drive impactful changes across our projects. We view challenges as opportunities and encourage collaboration that leverages the unique strengths of every team member.
If you are ready to dive in and reshape payroll, let’s work together to simplify complexities and create a brighter future for businesses of all sizes.
The Role
Engineering is the backbone of Check. We envision payroll as part of modern financial software, which necessitates robust systems that our operators and partners can rely on. Every solution we develop is built on reliable, scalable, and secure systems that ensure timely payments.
We are in search of a Staff Software Engineer who possesses strong software design expertise coupled with hands-on infrastructure experience. In this position, you will enhance the core systems that enable payroll operations, focusing on scalability, production efficiency, and empowering engineers with reliable tools for software deployment.
You'll collaborate across product and platform teams to advance our cloud infrastructure, enhance system deployment and monitoring, and simplify the architecture underpinning embedded payroll. Your challenges will often bridge the domains of infrastructure, product, and operations.
This role is perfect for individuals who have managed complex systems in dynamic environments and take pride in creating resilient, understandable infrastructure that is essential for business operations.
Join Privy as a Senior Infrastructure Engineer and play a pivotal role in shaping the future of online privacy and user ownership. Our team is dedicated to creating innovative developer tools that prioritize users, leveraging cutting-edge cryptography to redefine digital ownership. You will collaborate with a passionate engineering team to design and manage robust multi-tenant infrastructures that support billions of requests monthly, ensuring high performance and reliability across our services.
Confido
At Confido, we are revolutionizing the AI infrastructure that drives Consumer Packaged Goods (CPG) brands from analysis to execution. Our integrated platform streamlines cash applications, deductions, disputes, trade promotion management, forecasting, demand planning, and analytics, delivering significant time savings and enabling more intelligent financial decisions for our clients.
We proudly support over 200 brands managing more than $20 billion in revenue, partnering with notable names such as OLIPOP, Simple Mills, and Dr. Squatch. With a recent $15 million Series A funding round led by Footwork Ventures and Y Combinator, we are poised for rapid growth and innovation.
As a Staff Software Engineer, you will be instrumental in architecting and developing the core systems that drive the Confido platform. You will spearhead significant product and infrastructure projects, from AI-driven document processing to extensive financial analytics systems.
This role offers a unique opportunity to merge technical excellence with impactful product development, collaborating closely with engineering, product teams, and clients to transform intricate operational workflows into scalable software solutions.
Location: New York, NY (Relocation assistance available)
About Knot
At Knot, we are on a mission to revolutionize the way consumers and businesses interact through seamless merchant and banking experiences. Think of us as the 'Plaid for merchant connectivity.' Our innovative platform is designed to connect merchants with the multitude of applications that enhance everyday transactions. Our flagship product, CardSwitcher, empowers consumers to effortlessly update and manage their payment methods across various online merchant accounts like Netflix and PayPal. Additionally, our advanced solution, TransactionLink, allows for the retrieval of detailed transaction data, paving the way for new product development on our unique merchant connectivity platform. We invite you to join us in building these exciting new solutions!
Founded in 2021 by brothers and Thiel Fellows Rory and Kieran O’Reilly, Knot currently facilitates connected online payment experiences for hundreds of thousands of users. Our technology is trusted by industry leaders like American Express, PayPal, Current, BILT, and Step, who integrate Knot’s SDK into their applications to deliver exceptional experiences to their customers.
Backed by a distinguished group of investors including Nava Ventures, 8VC, and prominent figures from companies such as Twitter, Warby Parker, and DraftKings, Knot is well-positioned for continued growth and innovation.
Working at Knot
We pride ourselves on having a world-class team from diverse backgrounds, with a strong emphasis on engineering talent. As we expand our footprint in NYC, we aim to be at the forefront of the financial services landscape.
Our team is dedicated to building exceptional products for our users, balancing a serious approach to our work with a fun and engaging work environment. We believe both aspects are integral to our success.
Your Role
· Design, architect, deploy, document, and oversee our cloud-based network infrastructure.
· Take ownership of critical API infrastructure that handles hundreds of requests per second.
· Lead technical decisions, providing justification for designs and coordinating with other teams to ensure alignment on values and requirements.
· Continuously enhance your knowledge of our infrastructure's long-term needs and capabilities.
· Manage and troubleshoot complex technical issues and incidents, providing support and solutions as necessary.
Innovative Solutions
Innovative Solutions is a distinguished AWS Premier Tier Services Partner, recognized for our expertise in Generative AI, Migrations, DevOps, and Networking. We focus on empowering businesses to revolutionize their IT infrastructure through AWS cloud migrations, managed services, application modernization, and cloud operations. Our goal is to transform IT expenditures into strategic investments that not only attract new clients but also enhance customer retention and drive consistent profitability.
The Opportunity
We are on the lookout for a skilled DevOps Cloud Engineer to join our dynamic professional services engineering team. You will engage directly with mid-sized businesses to architect, implement, and enhance AWS infrastructure solutions across various simultaneous projects. This position demands a unique combination of technical prowess, project management abilities, and excellent communication skills as you guide our clients in harnessing the cloud's potential to foster business growth.
*Quarterly travel to our Rochester, NY Headquarters is required
Key Responsibilities:
· Design and implement scalable and secure AWS infrastructure leveraging Infrastructure as Code (IaC) practices across multiple client engagements simultaneously.
· Build and maintain CI/CD pipelines, automate deployment workflows, and set up monitoring and observability solutions to optimize client operations in the cloud.
· Collaborate closely with solutions architects and project managers to translate client requirements into robust technical solutions, ensuring high standards for security, reliability, and performance.
· Work in partnership with client technical teams to implement DevOps best practices, resolve complex infrastructure challenges, and facilitate knowledge transfer for ongoing success.
· Manage multiple project priorities, adapt to diverse client environments, and contribute to the enhancement of our internal tools and methodologies.
Gecko Robotics develops technology that helps major organizations monitor and maintain their critical infrastructure. The company’s wall-climbing robots, advanced sensors, and AI-driven data platform provide insight into the condition of physical assets. This information supports real-time decisions, improves safety, and helps prevent failures.
Role overview
The Software Platform team at Gecko Robotics works to remove obstacles for engineers by building automation, shared libraries, and DevOps tools. Their efforts streamline software development, deployment, and management, allowing engineers to focus on core technical challenges rather than routine tasks. This Software Engineer position centers on the Government cloud platform. The main focus is to advance cloud architecture, deployment processes, and automation strategies tailored for government clients. The objective is to maintain and scale single-tenant environments, enabling engineers to manage deployments without needing deep cloud expertise.
What you will do
· Design and automate secure infrastructure environments for high-compliance government clients, aiming to deliver a seamless experience similar to public cloud operations.
· Simplify and automate differences across various cloud environments, so developers do not need to handle complex compliance requirements or environment-specific details.
· Create Infrastructure as Code (IaC) modules and shared libraries to support all environments and reduce redundant work.
· Collaborate with the Software Platform team to improve the process of building, deploying, and managing software at Gecko Robotics.
· Contribute to ongoing focus areas such as:
  · Automated and user-friendly CI/CD pipelines
  · Customer authentication systems
  · Permission management frameworks
  · DevOps tool management and optimization
  · Enhancing the local development experience
  · Onboarding and internal training
Location
This position is based in New York City.
Fireworks AI
About Us:
At Fireworks AI, we are at the forefront of generative AI infrastructure innovation. We provide cutting-edge models with unmatched inference speed and scalability, establishing ourselves as leaders in the industry. Our projects include groundbreaking function calling and multimodal models, solidifying our reputation for excellence. As a Series C company valued at $4 billion, we are backed by esteemed investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our dynamic team, composed of veterans from Meta PyTorch and Google Vertex AI, thrives on collaboration and ambition.
The Role
Join us in developing the fundamental systems that drive Fireworks AI, ranging from customer-centric APIs and product features to the distributed infrastructure facilitating AI workloads on a massive scale.
This position is a comprehensive full-stack backend and infrastructure role. You will design systems, deliver products, and take ownership of the entire process from inception to deployment.
What You’ll Work On
· APIs, web backend, and developer tooling
· Model training, fine-tuning, and inference orchestration
· Job scheduling, autoscaling, and model serving
· Billing, enterprise features, and access control
· Cross-cloud infrastructure (compute, storage, networking)
· Global scale GPU cluster management
What You’ll Do
· Develop and scale backend services and distributed systems
· Ensure system reliability from design through production
· Collaborate directly with customers to address real-world challenges
· Enhance performance, cost-effectiveness, and developer experience
· Rapidly implement AI tools to automate processes
You Might Be a Fit If
· You are eager to engage in the AI revolution
· You enjoy building infrastructure and backend systems that enhance products
· You think critically about systems, trade-offs, and their impacts
· You demonstrate ownership and drive initiatives across teams
At Scale AI, we are revolutionizing the foundation of enterprise AI. We seek a skilled Staff Infrastructure Software Engineer who will serve as the principal technical leader in developing our 'paved road' for knowledge retrieval and inference engines. In this role, you will not only oversee resources but also establish deployment standards for scalable Agentic workflows. Your goal is to connect intricate AI orchestration with top-notch infrastructure, ensuring our platform is the most dependable choice for enterprise agents.
The ideal candidate will excel in a dynamic environment, demonstrate a passion for deep technical engagement and mentorship, and possess the ability to define long-term technical strategies for a crucial domain while maintaining a hands-on focus on delivery. You will design and implement solutions across multiple cloud platforms (GCP, Azure, AWS) for clients in highly regulated sectors such as healthcare, telecommunications, finance, and retail.

