<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Chris Krough</title><description>Working observations on building agentic workflows in production.</description><link>https://dev.krough.org/</link><item><title>Evaluating Design in Agentic Development</title><link>https://dev.krough.org/notes/corpus-shape-over-corpus-size/</link><guid isPermaLink="true">https://dev.krough.org/notes/corpus-shape-over-corpus-size/</guid><description>A document-loader A/B where the synthetic corpus said reject and a real-world rerun said accept. Notes on corpus shape, joint metrics, and the evaluation criteria I had wrong.</description><pubDate>Mon, 04 May 2026 00:00:00 GMT</pubDate></item><item><title>Evaluating agent skill effectiveness</title><link>https://dev.krough.org/notes/measuring-a-claude-code-skill/</link><guid isPermaLink="true">https://dev.krough.org/notes/measuring-a-claude-code-skill/</guid><description>A 60-trial A/B of the Advisors plugin against a no-skill baseline: 96% vs 45%, zero parse failures. Open harness, clean-room isolation, reproducible from a public repo.</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate></item></channel></rss>