Cozytown

Feb 02, 2026

Last weekend I started building Cozytown. What is Cozytown? Think Stardew Valley, but every NPC is Claude. And you can go in houses to read books. The books are just a front end to Linear, Markdown files, or your favorite task/knowledge management system. The NPCs can talk to each other when you’re not around, and they can read the books too.

Since it’s a shameless Stardew Valley clone there’s also[1] forging, crafting, potions, combat, several distinct biomes, and a day/night system. (Bandits come out at night.) And swimming, when you want to talk to crabs and fish. The fish are also Claude, and the fish have genes, and you can breed them. It all makes a lot of sense.

Ralph Wiggum

I’ve been thinking about building this for over a year without getting ~~very far~~ anywhere. But I had twenty minutes of free time last weekend, so it seemed like a good opportunity to try out the Ralph Wiggum technique on an empty directory.

It far exceeded my expectations: within an hour it had one-shotted the town, buildings, NPCs, and player character. All the graphics were procedurally generated with cute little walking animations, directional rotations, and correct depth layering so you can walk behind the tops of buildings and trees.

Here’s some of the graphics code, if you’re curious.

The books

Books are backed by a pluggable ContentProvider system which is a little bit documented here. For the moment we’ve implemented two backends: one that looks for Markdown files on the game server, and another that uses the Linear API.
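To make the idea of a pluggable backend concrete, here is a minimal sketch of what such an interface might look like. Only the name ContentProvider comes from the project; the method names, the Book shape, and the in-memory stand-in for the Markdown backend are assumptions for illustration, and everything is synchronous for brevity (a real Linear backend would need to be asynchronous).

```typescript
// Hypothetical shapes: the post names ContentProvider but not its methods.
interface Book {
  id: string;
  title: string;
  body: string;
}

interface ContentProvider {
  listBooks(): Book[];
  getBook(id: string): Book | undefined;
}

// Stand-in for the Markdown backend. It takes a map of file name -> file text
// so the sketch runs without touching the file system.
class MarkdownProvider implements ContentProvider {
  private books: Book[];

  constructor(files: Record<string, string>) {
    this.books = Object.keys(files).map((fileName) => {
      const text = files[fileName];
      // Use the first "# Heading" line as the title, else fall back to the file name.
      const heading = text.split("\n").filter((line) => line.indexOf("# ") === 0)[0];
      return {
        id: fileName,
        title: heading ? heading.slice(2).trim() : fileName,
        body: text,
      };
    });
  }

  listBooks(): Book[] {
    return this.books;
  }

  getBook(id: string): Book | undefined {
    return this.books.filter((book) => book.id === id)[0];
  }
}

// A bookshelf can merge any number of providers (Markdown, Linear, ...).
function shelve(providers: ContentProvider[]): Book[] {
  return providers.reduce<Book[]>((all, p) => all.concat(p.listBooks()), []);
}

const lore = new MarkdownProvider({
  "smelting.md": "# Smelting and Forging\nHeat the ore first.",
  "notes.md": "No heading here.",
});
```

Because the game only talks to the interface, a second backend would be another class with the same two methods, and the bookshelves would not need to know which one they are reading from.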

As a dogfooding proof of concept I asked Claude to write some gameplay docs and lore to put in Markdown files. So now we[2] have in-game bookshelves full of documentation and lore.

I mentioned above that the Claude-backed NPCs can read the books. When you talk to an NPC, we provide a read_book() tool and an enumerated list of all book titles. This context-seeding means that the villagers know all their own lore, will warn you about the bandits, and can answer questions about how to play. But they stay very delightfully in character: as a general rule of thumb, for example, Bjorn (the village blacksmith) is the only one who is willing to read the book about smelting and forging.
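A rough sketch of that context seeding, under stated assumptions: the tool name read_book comes from the game, but the ToolSpec shape, the prompt wording, and the helper names below are hypothetical and are not tied to any particular model API.

```typescript
// Hypothetical types: only read_book() and the enumerated title list come from the post.
interface BookEntry {
  title: string;
  body: string;
}

interface ToolSpec {
  name: string;
  description: string;
  parameters: { title: { type: "string"; enum: string[] } };
}

// Build the tool description handed to the model. Enumerating the titles
// means the NPC can only ask for books that actually exist.
function readBookTool(books: BookEntry[]): ToolSpec {
  return {
    name: "read_book",
    description: "Read one of the books on the village bookshelves.",
    parameters: { title: { type: "string", enum: books.map((b) => b.title) } },
  };
}

// Resolve a read_book call. Unknown titles get a plain-text miss rather than an error.
function readBook(books: BookEntry[], title: string): string {
  const match = books.filter((b) => b.title === title)[0];
  return match ? match.body : `No book titled "${title}" is on the shelf.`;
}

// Seed the NPC's system prompt with its persona and the list of titles.
function npcSystemPrompt(persona: string, books: BookEntry[]): string {
  const titles = books.map((b) => `- ${b.title}`).join("\n");
  return `${persona}\n\nBooks you may consult with read_book():\n${titles}`;
}
```

Only the titles go into the prompt up front; the bodies are fetched on demand, which is what leaves room for an NPC to decide in character whether a given book is worth opening.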

The task management integration

For more dogfooding we also connected the books to a Linear instance which we’re using to track and prioritize work on Cozytown. This too all goes into the NPC prompts with issue titles and statuses, which creates some awesomely unexpected NPC dialog: they’ll tease features that don’t yet exist (“I’ve heard there might be magic wands…”) and if you know to ask about undocumented features, they know how to reply, because we had ticketed them.
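One way the issue list might be folded into a prompt is sketched below. The split between shipped and planned features, the status name "Done", and the wording of the section headers are all assumptions made for illustration; the post only says that titles and statuses are included.

```typescript
// Hypothetical issue shape; real Linear issues carry many more fields.
interface IssueSummary {
  title: string;
  status: string;
}

// Assumed status name; a real workspace may use different workflow states.
const SHIPPED_STATUS = "Done";

// Render issues as two prompt sections so the model can tell facts from rumors.
function issuesForPrompt(issues: IssueSummary[]): string {
  const line = (i: IssueSummary) => `- ${i.title} (${i.status})`;
  const shipped = issues.filter((i) => i.status === SHIPPED_STATUS).map(line);
  const planned = issues.filter((i) => i.status !== SHIPPED_STATUS).map(line);
  return ["Features that already exist in the game:"]
    .concat(shipped)
    .concat(["Features that are only planned (mention these as rumors at most):"])
    .concat(planned)
    .join("\n");
}

const gossip = issuesForPrompt([
  { title: "Fish genetics", status: "Done" },
  { title: "Magic wands", status: "Backlog" },
]);
```

Keeping the status next to each title is what would let an NPC say "I've heard there might be magic wands" instead of promising that they exist.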

My original vision here was to use all this as a sort of interactive memory palace, modeled as one building per project, with little dudes in each house uttering gnomic utterances upon request like “working on so-and-so? this is your next highest priority task: …”

I’ll probably still do that, but it will require a bit of prompt tuning — the NPCs really want to stay in character.

Baby steps toward autonomous orchestration of agent swarms

Everything is implemented in Claude Code on my computer, in YOLO mode + Ralph Wiggum whenever I remember.

Our CLAUDE.md encourages heavy use of the gh and linear CLIs, and we try to enforce a fairly rigid workflow so that the game stays playable at all times and we maintain some broad understanding of what’s going on. Claude Code should always:

1. Look over all open Linear issues.
   - If any are unprioritized, or none are high priority, stop and reprioritize them all.
   - If you see a high-priority issue that’s bigger than you feel like tackling, stop and break it down into smaller tickets.
2. Pick the highest-priority open ticket that you feel like working on.
3. Get it done!
   - If it takes a while, leave comments along the way in the Linear ticket.
   - Make sure the tests pass.
   - Make sure we have close-to-100% test coverage. (I'll accept some wiggle room here, I’m not a monster.)
4. Submit a pull request with gh. Then run a subagent to review your code and submit a comment approving, requesting changes, or flagging for human intervention.
5. Read the comment and implement all feedback.
   - If you and the reviewer disagree about something, flag it for human intervention[3].
   - If the reviewer had an action item that you don't disagree with but don't wanna bother with, ticket it.
6. Merge the PR and close the Linear issue.
7. Repeat.
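The first two steps of that loop amount to a small decision procedure, which can be sketched as code. In the real setup these are prose instructions that Claude follows with judgment, so this is only an illustration: the Ticket shape, the numeric priority scale, and both thresholds are assumptions, and the size estimate stands in for "bigger than you feel like tackling".

```typescript
// Hypothetical ticket shape: priority 1 is most urgent, null means unprioritized.
interface Ticket {
  id: string;
  priority: number | null;
  estimate: number; // rough size in arbitrary points
}

type NextStep =
  | { kind: "idle" }
  | { kind: "reprioritize" }
  | { kind: "break_down"; ticket: Ticket }
  | { kind: "work_on"; ticket: Ticket };

const HIGH_PRIORITY = 2; // assumed: priorities 1 and 2 count as "high"
const MAX_ESTIMATE = 3; // assumed: anything bigger should be split first

function nextStep(openTickets: Ticket[]): NextStep {
  if (openTickets.length === 0) return { kind: "idle" };

  const unprioritized = openTickets.some((t) => t.priority === null);
  // filter() copies the array, so sorting does not reorder the caller's list.
  const ranked = openTickets
    .filter((t) => t.priority !== null)
    .sort((a, b) => (a.priority as number) - (b.priority as number));
  const anyHigh = ranked.some((t) => (t.priority as number) <= HIGH_PRIORITY);

  // Step 1: stop and reprioritize before doing any work.
  if (unprioritized || !anyHigh) return { kind: "reprioritize" };

  // Step 1, second bullet: split oversized high-priority work. Step 2: otherwise pick it up.
  const first = ranked[0];
  return first.estimate > MAX_ESTIMATE
    ? { kind: "break_down", ticket: first }
    : { kind: "work_on", ticket: first };
}
```

For example, `nextStep([{ id: "A", priority: 1, estimate: 5 }])` asks for a breakdown of A rather than starting on it.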

Responsible maintenance

This setup has allowed me to tune my involvement level without getting in the way. I can, from my phone, peek at the Linear issues; get caught up on PR discussions as time allows[4]; and reprioritize/wontfix/create issues myself. When I’m feeling really engaged I can drop in on Claude Code itself to tell it what to do or watch it work.

Usually this comes in the form of “ticket this new feature (fish genetics, fire-building, the thing with the rocks, etc) as an epic.” But sometimes I need to steer the ship a bit more actively.

Although I have not been reviewing the code and haven't seen most of it, I feel a strong sense of responsibility for its quality and rigor. So after every couple of features I ask Claude to pick some subsystem, document it in the repo with some representative snippets, and do a comprehensive architectural review, which surfaces exciting opportunities to eliminate sloppy code[5].

We also have some special callouts in CLAUDE.md — and accompanying skills — to nudge Claude towards incorporating every new feature into the save/load system from day one. This has proven unreliable, so Claude also wrote an architecture document and some tickets about why it’s so hard to remember.

What's next?

What’s next? Mostly just charging full speed ahead at every feature we can think of, with periodic breaks for refactoring and other cleanup.

There's some tension between “Cozytown is a front end for your knowledge base” and “Cozytown is a front end for its own knowledge base.” This may be a productive tension. I'm not sure.

The NPCs will get really into their role playing, to a fault: they start hallucinating game facts that don't exist (“I hear there's some buried treasure hidden in the forest,” “try dropping a coin in the well,” etc).

We're building out some tools to mitigate this, starting with a goto(landmark) and an incept_item_near(landmark) so they can make those hallucinations real and also lead you toward them. (Plus end_conversation() which I still think is enormously underrated in general.)
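A minimal sketch of how such tool calls might be applied to the game world follows. The three tool names come from the post; the World shape, the extra item parameter on incept_item_near, and the "one tile to the right" placement are assumptions for illustration.

```typescript
// Hypothetical world model: named landmarks with tile coordinates, plus loose items.
interface Point {
  x: number;
  y: number;
}

interface World {
  landmarks: Record<string, Point>;
  items: { name: string; at: Point }[];
  npcTarget: Point | null;
  conversationOpen: boolean;
}

// The item field is an assumption; the post only shows incept_item_near(landmark).
type ToolCall =
  | { tool: "goto"; landmark: string }
  | { tool: "incept_item_near"; landmark: string; item: string }
  | { tool: "end_conversation" };

// Apply one tool call and return a short result string for the model to read.
function applyTool(world: World, call: ToolCall): string {
  if (call.tool === "end_conversation") {
    world.conversationOpen = false;
    return "ok";
  }
  // Both remaining tools need a landmark that really exists on the map.
  if (!Object.prototype.hasOwnProperty.call(world.landmarks, call.landmark)) {
    return `unknown landmark: ${call.landmark}`;
  }
  const spot = world.landmarks[call.landmark];
  if (call.tool === "goto") {
    world.npcTarget = spot; // the NPC starts walking there
    return "ok";
  }
  // Drop the item one tile to the right of the landmark.
  world.items.push({ name: call.item, at: { x: spot.x + 1, y: spot.y } });
  return "ok";
}
```

Returning a plain "unknown landmark" message, rather than throwing, gives the NPC a chance to pick a real place instead of insisting on an imaginary one.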

This doesn't help with the more fundamental hallucinations, though: e.g. we have no digging system, nor shovels, and there is no well in Cozytown. For this it seems clear we need a file_ticket(summary) tool so that the NPCs can author their own feature requests, although we'll still gatekeep them somewhat through prioritization and ticket review. I don't want this to turn into a constantly shifting AI dream landscape, after all; that would be impractical.
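The gatekeeping could be sketched as an inbox that NPC-filed requests land in before they reach the real backlog. Only file_ticket(summary) is from the post; the TicketInbox class, the status names, the duplicate check, and the per-NPC cap are hypothetical choices for illustration.

```typescript
// Hypothetical record for a request filed by an NPC.
interface FiledTicket {
  summary: string;
  filedBy: string;
  status: "needs_triage" | "accepted" | "wontfix";
}

class TicketInbox {
  private tickets: FiledTicket[] = [];
  private maxPendingPerNpc: number;

  constructor(maxPendingPerNpc: number) {
    this.maxPendingPerNpc = maxPendingPerNpc;
  }

  // Backs the file_ticket(summary) tool; returns a short result for the model.
  fileTicket(npc: string, summary: string): string {
    const clean = summary.trim();
    if (clean.length === 0) return "rejected: empty summary";
    const duplicate = this.tickets.some(
      (t) => t.summary.toLowerCase() === clean.toLowerCase()
    );
    if (duplicate) return "rejected: already filed";
    const pendingMine = this.tickets.filter(
      (t) => t.filedBy === npc && t.status === "needs_triage"
    ).length;
    if (pendingMine >= this.maxPendingPerNpc) return "rejected: too many pending requests";
    this.tickets.push({ summary: clean, filedBy: npc, status: "needs_triage" });
    return "filed";
  }

  // What a human (or a reviewing agent) sees during ticket review.
  pending(): FiledTicket[] {
    return this.tickets.filter((t) => t.status === "needs_triage");
  }

  // Nothing reaches the real backlog until someone accepts it here.
  triage(summary: string, decision: "accepted" | "wontfix"): boolean {
    const match = this.tickets.filter(
      (t) => t.summary === summary && t.status === "needs_triage"
    )[0];
    if (!match) return false;
    match.status = decision;
    return true;
  }
}

const inbox = new TicketInbox(1);
inbox.fileTicket("Bjorn", "Add a well to the town square");
```

The cap and the triage step are the brakes: villagers can dream up as many wells and shovels as they like, but only reviewed requests would turn into work.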

[1] It’s a shameless Stardew Valley clone: what about farming? There is no farming.

[2] We = my daughters and I, and of course our dear friend Claude.

[3] This has been extremely rare. I think I need to tweak the prompts to make them disagree more often.

[4] Is the PR → review → revise thing actually improving the code quality at all, or just slowing us down?

I’d probably keep it either way because the PR discussions make for great late-night reading when I’m struggling to sleep.

But I have seen some evidence that it’s worth the time and tokens: genuine bugs flagged in review, good small-scale feedback that makes it into the ticket backlog, and at least one showstopper.

[5] Like the `variant: 'mushroom_red' // Which item renderer to use` stuff in the graphics system, where we’re lossily projecting what should be strictly typed API contracts and rich protocols onto a flat plane of untyped, unvalidated, unregistered string references. I got really angry about this one.
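For contrast, here is a minimal sketch of what a registered, typed alternative to those string references might look like. Only the mushroom_red variant name comes from the codebase; the registry, the second variant, and the renderer signatures are hypothetical.

```typescript
// Hypothetical renderer registry: its keys are the only valid variants.
const ITEM_RENDERERS = {
  mushroom_red: (size: number) => `red mushroom sprite at ${size}px`,
  mushroom_brown: (size: number) => `brown mushroom sprite at ${size}px`,
};

// The variant type is derived from the registry, so the two cannot drift apart.
type ItemVariant = keyof typeof ITEM_RENDERERS;

interface ItemDef {
  name: string;
  variant: ItemVariant; // a typo such as "mushrom_red" fails to compile
}

// Validate strings that arrive from save files or content data at the boundary.
function parseVariant(raw: string): ItemVariant {
  if (Object.prototype.hasOwnProperty.call(ITEM_RENDERERS, raw)) {
    return raw as ItemVariant;
  }
  throw new Error(`unregistered item variant: ${raw}`);
}

function renderItem(item: ItemDef, size: number): string {
  return ITEM_RENDERERS[item.variant](size);
}
```

The comment that used to explain the string ("which item renderer to use") becomes something the compiler checks instead.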

The Third Bear Thinks
