Technical Staff Member - ML Infrastructure & Performance

embedding-vc
San Mateo, CA
On-site, Full-time

Qualifications

Candidates should have a strong background in machine learning infrastructure and performance optimization, along with practical experience with the technologies outlined in the scope of work. A passion for pushing the boundaries of AI technology and a collaborative mindset are essential.

About the job

Join the innovative team at Moonlake, where we harness the power of AI to create real-time interactive content.

Mission: Increase throughput, reduce latency, and cut costs, deploying our models 2–10 times faster and at lower cost without compromising quality.

Scope of Work:

  • GPU Performance: Expertise in CUDA/Triton kernels, FlashAttention family, paged attention, and CUDA Graphs.
  • Serving Stack: Proficiency with TensorRT-LLM/Triton Inference Server and vLLM/TGI; continuous batching; on-GPU KV reuse; speculative decoding/Medusa; and mixture-of-agents routing.
  • Parallelism: Experience with FSDP/ZeRO, TP/PP/expert parallel; NCCL tuning.
  • Quantization/PEFT: Familiarity with AWQ/GPTQ/FP8; LoRA/DoRA serving.
  • Systems: Knowledge of Ray/k8s/Argo, observability tools (Prometheus/Grafana/OpenTelemetry), autoscaling, A/B infrastructure, and canary deployments with rollback.

Tech Signals:

Ideal candidates will have previous experience at infrastructure-heavy startups such as Databricks or Roblox.

We are dedicated to maintaining an on-site, in-person team based in San Mateo.

About embedding-vc

Moonlake is at the forefront of AI-driven innovation, specializing in the development of real-time interactive content. Our mission is to deliver cutting-edge technology solutions that enhance user experiences and streamline operations.
