About the job
About Us:
At Fireworks AI, we are at the forefront of creating next-generation generative AI infrastructure. Our cutting-edge platform is recognized for delivering the highest-quality models with unparalleled speed and scalability in inference. Independently benchmarked as a leader in LLM inference speed, we drive significant advancements through innovative projects, including our proprietary function-calling and multimodal models. As a Series C company valued at $4 billion and backed by leading investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic, we are a dynamic team of builders made up of veterans from Meta PyTorch and Google Vertex AI.
The Role:
We are seeking a talented Software Engineer to join our AI Infrastructure team. In this pivotal role, you will contribute to designing and developing the foundational systems that power Fireworks AI’s generative AI platform. Your focus will be on building robust infrastructure and tools that guarantee the reliability, performance, quality, and availability of our AI systems.
Our mission is to establish Fireworks AI as the most dependable and user-friendly generative AI platform globally. You will collaborate closely with our cloud infrastructure, product, and performance teams to create infrastructure solutions that connect our customers with the high-performance proprietary Fireworks inference engine.
Key Responsibilities:
- Design and develop scalable backend infrastructure supporting distributed training, inference, and data pipelines.
- Build and maintain essential backend services, including LLM CI/CD pipelines, control planes, and model serving systems.
- Improve performance, cost efficiency, and reliability across compute, storage, and networking layers.
- Create frameworks and safeguards to ensure Fireworks AI maintains the highest model quality in the industry.
- Work alongside performance, training, and product teams to translate research and product requirements into effective infrastructure solutions.
- Participate in code reviews, technical discussions, and continuous integration and deployment processes.

