Senior Software Engineer - Infrastructure

Julius, San Francisco, CA
On-site Full-time

Experience Level

Senior

Qualifications

What You Bring: Production Kubernetes and container internals (Docker/containerd); strong networking fundamentals. Cloud (AWS/GCP/Azure) and IaC (Terraform/Helm). Monitoring/Logging (Prometheus, Grafana, OpenTelemetry, ELK/Vector). Security best practices for containerized, multi-tenant systems.

About the job

Compensation: Competitive base salary + substantial equity
Benefits: Health & dental insurance, gym reimbursement, daily team lunches, 401(k)

About Julius

At Julius, we're advancing applied AI by building coding agents. Our platform executes approximately 1 million lines of code every 36 hours, serves over 1 million users, and has generated 3 million+ visualizations. All code runs in isolated remote containers. We are revenue-generating and backed by AI Grant and by founders from companies like Vercel, Notion, Perplexity, Palantir, Replit, Zapier, Intercom, and Dropbox.

The Role

Join us in building and scaling the code-execution platform that powers Julius, across both cloud and on-prem environments. We orchestrate over 500,000 containers per month, and demand is growing rapidly. You will own reliability, performance, and security for our multi-tenant compute environment.

Your Responsibilities

  • Design and manage a secure, multi-tenant container infrastructure that ensures quick startup and intelligent autoscaling.
  • Implement on-prem/private cloud deployments using Helm and Terraform, integrating SSO, network controls, and audit logging.
  • Enhance observability (metrics, traces, logs) with well-defined SLOs and lead incident response initiatives.
  • Optimize images, scheduling, networking, and costs, while developing fair-use and rate-limiting controls.

Your Qualifications

  • Strong experience with production Kubernetes and container internals (Docker/containerd); solid understanding of networking principles.
  • Familiarity with cloud environments (AWS/GCP/Azure) and Infrastructure as Code (Terraform/Helm).
  • Proficiency in monitoring and logging tools (Prometheus, Grafana, OpenTelemetry, ELK/Vector).
  • Understanding of security best practices for containerized, multi-tenant systems.

Preferred Qualifications

  • Experience with gVisor, Kata, Firecracker; Cilium/eBPF; GPU scheduling; serverless autoscaling (KEDA/Knative/Karpenter).
  • Proven experience delivering on-prem or air-gapped enterprise software solutions.
  • A passion for AI, with experience building side projects involving LLMs.

Why Join Julius?

Be part of a small, senior team where your contributions will have a massive impact. Tackle challenging infrastructure problems at a meaningful scale.

About Julius

Julius is an innovative applied AI lab at the forefront of developing advanced coding agents. With a robust infrastructure and a team of exceptional talent, we are committed to solving complex challenges in the AI space.
