Kayvane

Notes from shipping ML systems, debugging pipelines, and stress-testing ideas in production.

observability, ml systems, infra, tooling

Writing

4 entries

Jan 12, 2026 · 8 min

How vLLM Works

A practical tour of vLLM's LLMEngine, scheduler, and paged KV cache, plus why paging and radix trees drive throughput.

llm, vllm, gpu

Jan 08, 2026 · 10 min

Setting Up Datadog APM in Modal

A practical walkthrough for wiring Datadog APM into Modal, with tracing pitfalls and sampling tradeoffs.

observability, modal, datadog, python

Jan 02, 2026 · 1 min

Building a mini game with GPT-5.2-Codex, Suno and GPT Image Gen

We speed-ran a playable Farkle web game in about an hour using GPT-5.2-Codex for the core build, GPT Image Gen for character art, and Suno for a Game Boy-style theme.

llm, game-dev, gpt, suno, image-gen

May 13, 2024 · 9 min

Guided Generation with Outlines

A walkthrough of Outlines and finite-state machines for constrained LLM generation, with regex and Pydantic examples.

llm, outlines, structured-generation, pydantic
