Chris Krough

Working observations on building agentic workflows in production.
https://dev.krough.org/

Evaluating Design in Agentic Development
https://dev.krough.org/notes/corpus-shape-over-corpus-size/
A document-loader A/B where the synthetic corpus said reject and a real-world rerun said accept. Notes on corpus shape, joint metrics, and the evaluation criteria I had wrong.
Mon, 04 May 2026 00:00:00 GMT

Evaluating agent skill effectiveness
https://dev.krough.org/notes/measuring-a-claude-code-skill/
A 60-trial A/B of the Advisors plugin against a no-skill baseline: 96% vs 45%, zero parse failures. Open harness, clean-room isolation, reproducible from a public repo.
Mon, 20 Apr 2026 00:00:00 GMT