Software Engineer - Infrastructure (Mid to Senior Level)

Julius, San Francisco, CA
On-site, Full-time

Experience Level

Mid to Senior

Qualifications

Solid experience with production Kubernetes and container internals (Docker/containerd); strong understanding of networking principles. Familiarity with cloud services (AWS/GCP/Azure) and Infrastructure as Code (IaC) tools (Terraform/Helm). Proficient in monitoring and logging tools (Prometheus, Grafana, OpenTelemetry, ELK/Vector). Knowledge of security best practices for containerized, multi-tenant systems.

About the job

Julius operates as an applied AI lab, developing advanced coding agents for a broad user base. The platform executes about 1 million lines of code every 36 hours, serves over 1 million users, and generates more than 3 million visualizations. All code runs in tightly managed, isolated sandboxes. Julius is a revenue-generating business backed by AI Grant, Y Combinator, Bessemer Venture Partners, and founders from leading technology companies.

Role overview

This mid to senior level Software Engineer - Infrastructure role focuses on designing and scaling the code-execution sandboxes that form the backbone of Julius. The infrastructure spans cloud platforms such as AWS and GCP, orchestrating over 500,000 containers each month. The main priorities are reliability, performance, and security in a multi-tenant compute environment.

What you will do

  • Design and maintain secure, multi-tenant container infrastructure with rapid startup and intelligent autoscaling.
  • Deploy and manage cloud resources using Helm and Terraform, including SSO, network controls, and audit logging.
  • Enhance observability through metrics, traces, and logs. Define SLOs and lead incident response efforts.
  • Optimize container images, scheduling, networking, and costs. Develop and enforce fair-use and rate-limiting policies.

Requirements

  • Hands-on experience with production Kubernetes and container internals (Docker or containerd), as well as strong networking skills.
  • Familiarity with cloud services (AWS, GCP, or Azure) and Infrastructure as Code tools such as Terraform and Helm.
  • Proficiency with monitoring and logging tools like Prometheus, Grafana, OpenTelemetry, ELK, or Vector.
  • Understanding of security best practices for containerized, multi-tenant systems.

Preferred qualifications

  • Experience with technologies such as gVisor, Kata, Firecracker, Cilium, eBPF, GPU scheduling, or serverless autoscaling frameworks (KEDA, Knative, Karpenter).
  • Interest in AI projects, especially those involving large language models (LLMs).

Benefits and compensation

  • Competitive base salary
  • Substantial equity options
  • Comprehensive health and dental coverage
  • Gym reimbursement
  • Daily team meals
  • Commuter assistance

Julius offers the chance to work in San Francisco, CA, alongside a small and highly skilled team tackling large-scale infrastructure challenges. The systems here operate at significant scale and complexity, providing opportunities to solve demanding technical problems in a collaborative setting.

About Julius

Julius is an innovative applied AI lab that develops advanced coding agents. Our platform is designed to execute a significant volume of code in a highly efficient manner, delivering impactful visualizations and ensuring secure, scalable solutions for our users.
