Generative AI on Kubernetes — Book Review
There's a specific kind of frustration that comes from reading a book that's half about what you actually needed. Most resources on running AI in production either assume you're a data scientist who p

Articles tagged with #aws

From integer counting to structured resources: how Dynamic Resource Allocation and the AI Cluster Readiness framework finally make GPU infrastructure manageable at scale.

Your vLLM cluster has a problem you probably don't know about. It's not a bug. Nothing is crashing. The metrics dashboard looks fine. But right now, every time a request hits your load balancer, there

There's a class of production incident that doesn't page anyone. No error rate spikes. No latency alert fires. The cluster health dashboard shows green. GPU nodes are online. Pods are running. And yet

Travel has been relentless lately. Back-to-back weeks, airports blurring into each other, calendar looking like a game of Tetris someone is losing badly. But AWS Community Day Pune was non-negotiable.

I watched Jensen's keynote live. Three hours. I had tea going cold next to me and a notepad filling up fast. I'm not going to recap every announcement. There are enough of those. What I want to do is
