Kayvane

Notes from shipping ML systems, debugging pipelines, and stress-testing ideas in production.

observability, ml systems, infra, tooling

latest posts

Writing

8 entries
May 20, 2026 · 1 min

Claiming, not caching — singleflight on Modal

Three cache layers sit in front of an entity-resolution database: an in-process LRU, a Modal Dict, and the database itself. A distributed lock plus Queue-coordinated notify collapses 50 concurrent containers into one database round-trip per key. With code, animated diagrams, and an interactive simulator.

modal, caching, system-design, python

Jan 17, 2026 · 1 min

Event Driven Evolutionary Software

How observability tools can lead to a self-evolving product

observability, agents, system-design

Jan 15, 2026 · 1 min

On building with agents

A field report on agentic workflows: the bitter lesson, guard rails, sub-agent loops, and the evolving role of engineers.

agentic-coding

Jan 13, 2026 · 1 min

LLM-TLDR + Claude Hooks: Fast Context Without the Token Tax

A tutorial for installing llm-tldr with uv and wiring Claude hooks for instant, structured context, plus real numbers from this repo.

llm, tooling, claude, productivity

Jan 12, 2026 · 1 min

How vLLM Works

A practical tour of vLLM's LLMEngine, scheduler, and paged KV cache, plus why paging and radix trees drive throughput.

llm, vllm, gpu

Jan 08, 2026 · 1 min

Setting Up Datadog APM in Modal

A practical walkthrough for wiring Datadog APM into Modal, with tracing pitfalls and sampling tradeoffs.

observability, modal, datadog, python

Jan 02, 2026 · 1 min

Building a mini game with GPT-5.2-Codex, Suno and GPT Image Gen

We speed-ran a playable Farkle web game in about an hour using GPT-5.2-Codex for the core build, GPT Image Gen for character art, and Suno for a Game Boy-style theme.

llm, game-dev, gpt, suno, image-gen

May 13, 2024 · 1 min

Guided Generation with Outlines

A walkthrough of Outlines and finite-state machines for constrained LLM generation, with regex and Pydantic examples.

llm, outlines, structured-generation, pydantic