About the job
Join our dynamic and forward-thinking Tech Team as a Site Reliability Engineering Manager focused on Data Infrastructure. In this pivotal role, you will lead and inspire a talented group of Site Reliability Engineers (SREs) while collaborating closely with Engineering, Product, and Security teams. Your mission is to enhance the resilience, scalability, and security of our platforms, playing a critical part in the execution of our strategy to combat financial crime.
Key Responsibilities:
- Oversee the growth and development of your team, including hiring and onboarding new members.
- Foster a thriving environment that encourages innovation and high-quality outcomes.
- Act as a mentor and coach, embodying a learning mindset and embracing new technologies.
- Set strategic direction for your team in alignment with our overarching technology vision, and take accountability for technical decisions.
- Utilize your expertise in cloud systems to inform technical decision-making.
- Engage with stakeholders across engineering to ensure your team’s services meet the needs of internal customers.
- Collaborate within your team and across the organization to ensure implementations meet industry standards.
This role reports to the Director of Infrastructure and manages the team responsible for the stateful/data layer technologies that power all services in both development and production environments. The stack includes YugaByte (sharded Postgres), Kafka (via Strimzi), Elasticsearch (via ECK), Redis, and Spark and data warehousing built on the PaaS offerings of GCP and AWS. Because every other service depends on this stack, a collaborative mindset is essential.
Technology Stack Overview:
ComplyAdvantage operates entirely on a cloud-based architecture with a modern Kubernetes-centric tech stack. All computing workloads are managed in Kubernetes, with clusters distributed across multiple regions to cater to our global clientele. Our production services are strategically designed to be multi-cloud, currently hosted in both AWS and GCP.
We define our infrastructure and services with Terraform and Helm and follow GitOps practices: both production and non-production environments are version-controlled in Git, and all changes are made through it.

