CloudWizz

Careers · engineering

AI Infrastructure Engineer

Lead AI Infrastructure engagements — GPU clusters, model serving, MLOps, LLM observability. The work that makes production AI as boring as the rest of the stack.

Remote (India / EU-friendly hours) · Full-time · Senior

What you’d do

  • Lead AI Infrastructure engagements. Discovery, architecture, hands-on delivery. The clients are mostly Series A–C teams putting their first ML or LLM workloads into production.
  • Design GPU cluster topologies. EKS / GKE / AKS GPU node groups, NVIDIA operator, Spot vs. on-demand mix, multi-region resilience.
  • Stand up serving runtimes. vLLM, TGI, Triton — picking, deploying, tuning, autoscaling. You know which flag matters and which is theatre.
  • Build the MLOps platform. Argo Workflows, Kubeflow, MLflow — depending on the client’s existing stack. Paved-path templates so data scientists ship without filing platform tickets.
  • Wire LLM observability. Langfuse / Phoenix / Helicone alongside the standard stack. Make hallucinations a debuggable artefact.

What we’re looking for

  • 4+ years on production infrastructure, with at least 1 year on AI/ML workloads
  • Deep Kubernetes — you’ve shipped GPU workloads, you know the failure modes
  • Comfort with the modern serving stack (vLLM, TGI, etc.) and the trade-offs between them
  • FinOps instincts — GPU spend is a different beast and clients hire us partly to control it
  • Ability to write: blog posts, runbooks, client deliverables

Nice to have

  • Open-source contributions to vLLM, Kubeflow, Argo, Langfuse, or similar
  • Prior experience scaling ML inference at meaningful traffic
  • Background that spans both ops and hands-on ML (we don’t need a PhD; we do need someone who can pair fluently with data scientists)

Why CloudWizz

  • A real AI infrastructure practice. Not “we tried it once.” Real engagements, real metrics, real on-call.
  • AI-native workflow. AI handles the routine work; you supply the senior judgment.
  • Open-source aligned. Our work feeds back into the OSS we publish.
  • Sustainable pace. No graveyard rotations.

How we hire

  1. 30-min intro call
  2. Technical conversation on a real (anonymized) AI infra problem
  3. Paid trial — 1–2 weeks on a real engagement with another engineer
  4. Offer

Apply below.