About the job
Senior Backend Engineer
Join LiteLLM, the leading AI Gateway trusted by industry giants including Adobe, Netflix, and NASA. Our platform empowers developers with secure and reliable access to Large Language Models (LLMs) and related services. We are currently seeking a dedicated Senior Backend Engineer to contribute to the development of robust guardrails and observability tools at scale.
About The Role
In this role, you'll take ownership of our guardrail and logging implementations. You will oversee backend code to ensure that all guardrail interactions are logged accurately, that user errors are communicated effectively, and that our observability tools function seamlessly under high traffic. Your meticulous attention to latency metrics, logging traceability, and backend guardrail registration will significantly bolster user confidence in our security and compliance capabilities.
Responsibilities
Develop and enhance our product to continuously improve its performance and reliability.
Ensure guardrail and policy enforcement calls (e.g., applyguardrail) are accurately logged and traceable through our SpendLogs and relevant database tables.
Design and implement CPU-level guardrails to mitigate common attacks on LLM APIs, MCP servers, and Agents.
Identify and resolve silent failure points in guardrail creation, registration, and policy application—ensuring robust error handling and transparency for end users.
Integrate with observability tools such as Datadog, Splunk, Prometheus, and OpenTelemetry to maintain accurate, configurable, and effective monitoring and logging for backend systems.
Enhance observability integrations to handle over 1 billion requests per month with minimal latency and without memory leaks caused by high Prometheus metric cardinality.
Work cross-functionally on backend engineering priorities including performance, reliability, and security enhancements.
What We’re Looking For
Bachelor’s or Master’s degree in Computer Science or a related field.
4+ years of experience with Python and backend frameworks (e.g., FastAPI, Flask).
Strong understanding of logging best practices, error handling, and secure backend development.
Familiarity with monitoring, logging, or metrics tools.

