Don’t Let the Golem Go Mad – Notes on Context Hygiene for Coding Agents
How do I keep my coding agents from spiraling when the context gets messy?
Working with coding agents has its ups and downs. Sometimes it also has its loops: rounds of back and forth that burn an unreasonable number of tokens and produce nothing but high invoices and code that needs to be reverted.
Context hygiene and the golem of Prague
In the Jewish myth of the Golem of Prague, a mysterious voice commanded Rabbi Löw to create a human-like entity out of clay. When he placed a “Shem” (magic words) into the Golem, it came to life. This tireless servant worked day and night to do the bidding of the Rabbi. It worked every day but one: the Sabbath. When it was time to rest, Rabbi Löw would take the Shem out of the Golem, which would make it stop. After the Sabbath he would put the Shem back into the Golem and let it start again.
One day, Rabbi Löw forgot to take the Shem out and put the Golem to rest. As a result, it went mad and rampaged through the community it was made to protect.
Agentic coding loops (Ralph Wiggum loop)
We live in a weird world. People talk about Ralph Wiggum engineering as if it were a reasonable strategy for engineering problems. There are multiple issues with this approach:
Failure rates stack up from step to step. (This is a bigger topic; for an overview, look into The Agent Company.)
The instructions are not clear, and the agent implements unnecessarily complex solutions.
The context needed for a solution is bigger than the context window, or relevant parts live outside the current source code.
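The first point can be made concrete with a little arithmetic. The sketch below assumes that every step of a loop succeeds independently with the same probability; both that independence assumption and the 95% figure are illustrative choices of mine, not numbers taken from The Agent Company.

```python
def chain_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that all `steps` succeed, assuming independent steps.

    Both parameters are hypothetical inputs for illustration only.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    # Independent events multiply, so the rate decays geometrically.
    return per_step_success ** steps


if __name__ == "__main__":
    for steps in (1, 5, 10, 20, 50):
        rate = chain_success_rate(0.95, steps)
        print(f"{steps:>2} steps at 95% each: {rate:.1%} for the whole run")
```

Under these assumptions, a step that works 95% of the time leaves a 20-step run with only about a 36% chance of finishing without a single failure.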
To make these loops work, you need a very clear understanding of how to steer the agent away from wrong solutions and towards the desired system behavior.
I’m still trying to get there, but it’s harder than just staying in the loop and doing the steering myself after each implementation iteration.
Context windows and the limits of sanity
While acting as a human guard rail for my coding agents, I noticed an obvious issue: as soon as the context window is used to its fullest, the quality of the suggested code drops so far below what I’m even willing to look at that starting over is the only way to fix it.
Just like Rabbi Löw, I need to take away the Shem and start from scratch with the magic words.
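The habit of starting over can be sketched as a small bookkeeping helper. This is a standalone illustration, not a Cursor API: the class name `ContextBudget`, the four-characters-per-token estimate, and the 70% reset threshold are all assumptions of mine. Real tokenizers count differently, and the right threshold is something to find out by observation.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class ContextBudget:
    """Hypothetical tracker for how full a context window is."""

    window_tokens: int
    reset_threshold: float = 0.7  # assumption: reset before the window is full
    used_tokens: int = 0

    def __post_init__(self) -> None:
        if self.window_tokens <= 0:
            raise ValueError("window_tokens must be positive")

    def add(self, message: str) -> None:
        """Record a prompt, file, or agent reply that entered the context."""
        self.used_tokens += estimate_tokens(message)

    @property
    def utilization(self) -> float:
        return self.used_tokens / self.window_tokens

    def should_reset(self) -> bool:
        """True once it is time to take the Shem out and start fresh."""
        return self.utilization >= self.reset_threshold

    def reset(self) -> None:
        self.used_tokens = 0


if __name__ == "__main__":
    budget = ContextBudget(window_tokens=1_000)
    for turn in range(1, 6):
        budget.add("x" * 800)  # each simulated turn adds about 200 tokens
        print(f"turn {turn}: {budget.utilization:.0%} used, reset={budget.should_reset()}")
```

The point of the sketch is only the decision rule: track roughly how much has piled up, and open a new session with a fresh prompt before quality degrades rather than after.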
Simple workflow
I am a Cursor user and tend to follow a simple three-step approach:
1. Write a prompt laying out:
a. What I want to achieve (the desired system behavior)
b. Where the changes need to be made (if I am already sure what I want changed)
c. How I want them to be changed (this is optional)
Additionally, I use project-based prompts that lay out the tools for validating the result as well as implementation hints. But these are the topic of a follow-up article.
2. Run this prompt through a planning step in Cursor. Adapt the resulting plan where necessary.
3. Start the implementation.
After the implementation, I go over the code. So far, it has rarely worked without my needing to change something afterwards. But in the implementations I end up committing, at least 80% of the code was fine.
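The three-part prompt from the first step can be sketched as a small template. The class `TaskPrompt`, its field names, the section labels, and the file paths in the example are hypothetical and only illustrate the goal / where / how structure; this is not a format Cursor requires.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPrompt:
    """Hypothetical container for the three parts of a task prompt."""

    goal: str                                                   # what to achieve
    change_locations: list[str] = field(default_factory=list)   # where (optional)
    change_instructions: str = ""                               # how (optional)

    def render(self) -> str:
        """Build the prompt text, leaving out optional parts that are empty."""
        if not self.goal.strip():
            raise ValueError("a prompt needs a goal")
        parts = ["Goal:", self.goal.strip()]
        if self.change_locations:
            parts += ["", "Where to change:"]
            parts += [f"- {location}" for location in self.change_locations]
        if self.change_instructions.strip():
            parts += ["", "How to change:", self.change_instructions.strip()]
        return "\n".join(parts)


if __name__ == "__main__":
    prompt = TaskPrompt(
        goal="Users can reset their password via an emailed link.",
        change_locations=["src/auth/reset.py", "src/auth/routes.py"],  # made-up paths
        change_instructions="Reuse the existing token helper instead of adding a new one.",
    )
    print(prompt.render())
```

Keeping the goal mandatory and the other two parts optional mirrors the list above: the agent always gets the desired behavior, and only gets locations and instructions when I am already sure about them.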
Outlook for more automation
I’m pondering how I could let agents work fully automatically on specific tasks. This requires tool-based as well as prompt-based constraints and success criteria. I like Jason Gorman’s approach of using old and battle-tested software engineering practices. This collides with how Dex Horthy describes the process at HumanLayer in his talk No Vibes Allowed: Solving Hard Problems in Complex Codebases. If full automation is possible, I need to make it work for my team and me without reducing the quality of our output.
But until then, I will continue to take the Shem out of the golem before Sabbath.


