Wildermine Phase 3 Results: 432x Memory Reduction and 97% Code Cleanup

We set out to tackle Wildermine’s riskiest refactoring yet: delta-based history, renderer architecture splits, and state management consolidation. The results? 99.8% memory reduction, a 1,674-line monolith reduced to 50 lines, and zero functional regressions.

Phase 3 proved that careful planning, comprehensive testing, and the right AI collaboration tools can make even the scariest refactors succeed.

What We Set Out to Do

In our Phase 3 announcement, we outlined four tasks ordered by risk:

Viewport Culling (Low-Medium Risk) - Expected 2-3x rendering boost
Delta History (Medium-High Risk) - Expected 95%+ memory reduction
Split CanvasRenderer (High Risk) - Better maintainability
State Management (High Risk) - Cleaner architecture

The plan was to tackle them one at a time, test thoroughly between each, and maintain rollback options. Here’s what actually happened.

Task 1: Viewport Culling Optimization

Status: ✅ COMPLETE (Already Implemented)

The Pleasant Surprise

Before we touched a single line of code in Phase 3, CanvasRenderer was already doing the smart thing: culling anything the camera couldn’t see. The helper _calculateViewportBounds() computes the exact tile window based on the editor’s scroll position and canvas size, then pads the top edge by two rows to guard against pop-in.

Every rendering pass—terrain tiles, doodads, grid overlays, building sprites—respects those bounds. Buildings even make a second pass to confirm their sprite footprint overlaps the viewport before they’re drawn. In short: the renderer only paints what matters and ignores the 90% of the map that’s off-screen.

Performance Impact

Because this optimization was already live (and battle-tested in production), there wasn’t fresh instrumentation to run during Phase 3. What we do know from earlier profiling is that trimming the draw loops to the visible window slashed per-frame work by roughly 2-3x, depending on monitor size.

The math speaks for itself:

Full grid: 3,600 tiles per frame (60×60)
Typical viewport: ~300 tiles per frame (20×15)
Reduction: 92% fewer tiles rendered

This existing optimization is what made the rest of our refactors possible—we had rendering headroom to work with.

Validation

✅ Pan/scroll testing confirmed tiles and buildings stream in without artifacts

✅ No zoom mode exists yet, so nothing to validate there

✅ Building highlights, grids, and other overlays all honor the same bounds

✅ No regressions surfaced during the broader Phase 3 work

What We Learned

Sometimes the best refactor is the one you don’t have to do. Task 1 became a quick audit rather than a rebuild, reminding us to catalog wins that already exist before diving into risky changes.

Knowing the renderer was already efficient freed us to focus on the heavier lifts—like memory savings and architectural splits—without worrying about raw draw performance.

Analysis courtesy of Codex CLI’s code review.

Task 2: Delta-Based History System

Status: ✅ COMPLETE

The Implementation

Undo in Wildermine used to mean cloning the entire 60×60×6 board every time the player touched a brush. One paint stroke carried the weight of 21,600 tile values—overkill when we usually tweaked a handful of cells. Phase 3 swapped that brute-force snapshotting for a surgical delta engine.

We now capture just the “before” footprint of an action—both the terrain cells and the building placements—and let a lightweight LevelDelta compute the exact differences. Each history entry stores those deltas alongside any buildings that were added or removed, and the DeltaHistory manager replays them forward or backward on demand.

The refactor also forced us to harden the building layer: by guaranteeing stable placement IDs and keying “unique” structures (like the starting tile and gold mine) by tileName, delta reconciliation can track every placement precisely. The result is an undo stack that behaves exactly as it always did, without dragging a full level clone around for every click.

Memory Results

The payoff was dramatic:

Memory per history entry (old): 168.8 KB (full snapshot)
Memory per history entry (new): 0.391 KB (delta only)
Memory reduction: 99.8%
Reduction factor: 432x smaller
Undo performance: 1.1x faster (8.7% improvement)

For 10 undo states:

Approach	Total Memory	Delta vs. Old
Full snapshots	1,687.5 KB	baseline
Delta history	3.9 KB	–1,683.6 KB (–99.8%)

We now fit hundreds of history entries in the space that used to accommodate one.

Test Results

Delta-based history performance test results

We codified the migration with a dedicated performance and correctness suite:

=== OLD Approach: Full Snapshots ===
Snapshots stored: 10
Cells per snapshot: 3600
Memory per snapshot: 168.8 KB
Total memory: 1687.5 KB

=== NEW Approach: Deltas ===
Deltas stored: 10
Cells changed per delta: 10
Memory per delta: 0.391 KB
Total memory: 3.9 KB

=== Memory Comparison ===
OLD total: 1687.5 KB
NEW total: 3.9 KB
Memory saved: 1683.6 KB (99.8%)
Reduction factor: 432x smaller

=== Undo Performance Comparison ===
OLD (full snapshot): 4.039ms
NEW (delta):         3.689ms
Speedup:             1.1x faster
Improvement:         8.7%

✓ Delta-based undo produces identical results to snapshot-based
✓ Delta-based redo restores state correctly
✓ 8 tests passing, 43,207 assertions

Manual editor drills backed that up: single-tile tweaks, large brush sweeps, building placements, repeated undos/redos, and even the “undo after loading a saved level” edge case all behaved exactly like the pre-refactor editor—just without the memory balloon.

What We Learned

Lean can beat fast. We expected to trade a bit of speed for huge memory wins; instead, the delta path shaved nearly 9% off our undo latency. Less data to copy means less work overall.

Buildings needed rigor. The initial delta cutover exposed a blind spot: building placements weren’t tracked reliably because the old system tolerated missing IDs. Fixing that (auto-generating IDs, matching by tileName) not only restored undo fidelity but also hardened the data model.

Instrumentation pays off. Having a single test harness spit out memory, timing, and correctness metrics let us quantify success immediately and catch regressions as we iterated.

Future-proofing unlocked. With memory no longer a bottleneck, we can support deeper undo stacks, heavier brushes, and richer building interactions without sweating RAM.

The AI Collaboration Story

This task presented an interesting AI collaboration dynamic. Claude Code initially struggled with the building integration bugs, hitting a wall after several attempts. We switched to Codex CLI, which successfully diagnosed and fixed the building placement tracking issues. The narrative above is adapted from Codex CLI’s writeup of the solution.

Delta history started as a risky optimization; it ended up redefining our baseline. Undo is now lighter, faster, and battle-tested—exactly the foundation we need for the bigger architectural changes still to come in Phase 3.

Special thanks to Codex CLI for helping crack the building placement bug and documenting the solution.

Task 3: Split CanvasRenderer

Status: ✅ COMPLETE

The Refactoring Journey

Before: CanvasRenderer.ts carried 1,674 lines of terrain drawing, building logic, hover effects, notifications, and every helper in between. One change meant scrolling past a small novella.

After: Rendering now happens in four focused collaborators:

File	Responsibility	Lines
TerrainRenderer.ts	Tiles, autotiling, multi-layer terrain	1,020
BuildingRenderer.ts	Building sprites, eraser hover, destruction FX	149
EffectsRenderer.ts	Grid, ghosts, contextual brush, wood pickups	518
RendererCoordinator.ts	Orchestration, viewport math, sequencing	213
CanvasRenderer.ts	Canvas wiring + façade	50 (down 97%)

The refactor happened in four controlled passes so we could test after each:

The Four-Part Approach

Part A: TerrainRenderer

Extracted the terrain pipeline—autotiling, layered draws, doodad ordering—into TerrainRenderer
Captured everything behind a single renderTerrain(...) entrypoint
Manual validation: terrain loads, autotile blends, doodads, and Y-sorting still behave

Part B: BuildingRenderer

Moved building sprite work (sorting, flashing, destruction animations, eraser hover) into BuildingRenderer
Ensured unique-building rules and hover dimming stayed intact
Manual validation: all structures render, highlights/flash effects still fire

Part C: EffectsRenderer

Pulled grid lines, ghost previews, contextual brush overlays, invalid-footprint warnings, and wood pickup sprites into EffectsRenderer
Split into preBuildings and postBuildings stages so ordering stayed flexible
Manual validation: ghosts track the cursor, invalid cells flash red, contextual brush previews animate correctly

Part D: RendererCoordinator

Introduced RendererCoordinator to clear the canvas, compute viewport bounds, and run terrain → effects → buildings → effects in order
CanvasRenderer now just constructs the three renderers and delegates each frame

Code Quality Metrics

CanvasRenderer.ts: Shrank from 1,674 to 50 lines (97% reduction)
Architecture: Rendering logic now lives in 4 purpose-built modules instead of one monolith
Interfaces: Collaborators expose small, typed interfaces (renderTerrain, renderBuildings, renderEffects, render) so future features plug in faster
Indirect win: TypeScript finally has room to enforce stricter types inside each renderer—a follow-up we can tackle without fear of breaking unrelated systems

What We Learned

Sequencing matters. Breaking the monolith into staged parts (A/B/C/D) let us regression-test each slice before moving on.

Viewport math is the glue. Consolidating background clears and visible-range calculation inside the coordinator eliminated duplicate code and made culling easier to tweak.

Play mode is the canary. Running the game loop after every extraction caught a couple of hover/previews we nearly forgot to wire back in.

This architectural split didn’t just make the file shorter—it put every rendering concern back in its own lane, so Task 4 (state management) and future rendering features now have a clear structure to build on.

Refactoring executed and documented by Codex CLI.

Task 4: State Management Consolidation

Status: ✅ COMPLETE

The Problem We Solved

Phase 2’s copy-on-write work left the level editor with a split brain. Some state (tile selection, brush size) lived in the zustand store, while other global concerns—like the building placements array and the “unsaved changes” flag—quietly hid inside LevelCreator.tsx and were mirrored into the store via a side-effect.

That mirroring looked harmless, but it was a maintenance trap:

Components had to remember to push their updates back into the store
The store contained placeholder no-op functions, so TypeScript thought everything was wired even when it wasn’t
Features like undo/redo, file IO, and wood collection all flipped the unsaved flag in different ways, sometimes bypassing the store entirely

The Solution

We flipped the ownership model: the store is now the single source of truth for everything the editor ecosystem cares about. Components consume the store directly; hooks orchestrate behavior but route their writes through dedicated actions.

useLevelEditor stopped keeping a local copy of placements and just asks the store when it needs to update history, thumbnails, or the renderer
useLevelEditorHistory no longer tracks a duplicate unsaved flag—it toggles the shared store action whenever history changes
UI surfaces like LevelEditorRightPanel receive undo/redo/status as props from LevelCreator, eliminating the old “mirror everything into the store” effect
Secondary systems (the wood store, eraser helpers, etc.) now call the same store action, so we don’t risk forgetting to mark edits as dirty

The Payoff

Fewer moving parts. LevelCreator.tsx shed a 20-line effect whose sole job was to resync state into zustand. There’s no longer a risk of forgetting to include a new field in that mirror.

Predictable data flow. Undo, redo, file IO, test mode, destruction animations—every subsystem reads placements from the same place and updates the unsaved flag the same way.

Safer future work. Want to introduce autosave, multiplayer, or multiple editor tabs? The plumbing is already centralized.

Cleaner tests. Hooks that previously needed both local state and store state can now be exercised by seeding useLevelEditorStore alone.

Validation

Manual regression passes covered the full editor workflow, saving/loading, undo/redo, and play-mode toggles—no regressions surfaced.

State management refactoring completed by the development team.

Overall Phase 3 Impact

Performance Summary Table

Metric	Before	After	Improvement
Tiles rendered per frame	3,600	~300	92% reduction (already optimized)
Memory per history entry	168.8 KB	0.391 KB	99.8% reduction (432x smaller)
Undo/redo speed	4.039ms	3.689ms	8.7% faster
CanvasRenderer size	1,674 lines	50 lines	97% reduction
State synchronization	20-line effect	Centralized	Eliminated duplication

Combined Impact

Phase 3 wasn’t just about individual wins—it was about creating a foundation that makes everything else possible.

Memory is no longer a constraint. With 99.8% less memory per history entry, we can now afford deep undo stacks (hundreds of entries instead of a handful), heavier brushes, and richer building interactions without worrying about RAM.

Rendering is already efficient. Discovering that viewport culling was already implemented meant we didn’t have to risk breaking rendering code—we could focus on the architectural improvements that had higher payoff.

Code is maintainable. Splitting the 1,674-line CanvasRenderer into four focused modules (97% reduction) means new rendering features can be added without navigating a monolith. State management consolidation eliminated the risky “mirror state into store” pattern.

The editor just works. All four tasks completed with zero functional regressions. The level editor behaves exactly as it did before Phase 3—it’s just faster, leaner, and built on a foundation that won’t fight us as Wildermine grows.

Challenges & Lessons Learned

What Went Well

Comprehensive testing paid dividends. Our test harness for Task 2 (delta history) caught correctness issues immediately with 43,207 assertions. We knew the optimization was safe before it ever hit the editor.

Staged refactoring worked. Breaking Task 3 (CanvasRenderer split) into four parts (A, B, C, D) let us test after each extraction. No big-bang rewrites, no multi-day debugging sessions.

AI collaboration was flexible. When Claude Code hit a wall on building placement bugs, switching to Codex CLI got us unstuck. Having multiple AI tools in the toolkit meant no single blocker could derail the project.

Unexpected Challenges

Building placement IDs were fragile. The delta history work exposed that building placements weren’t always tracked with stable IDs. The old system tolerated missing IDs; the new delta system required them. This forced us to harden the data model—a hidden win.

State synchronization was worse than expected. We thought state management was “a bit messy.” Turned out there was a 20-line effect just mirroring data into the store, and multiple subsystems updating the unsaved flag independently. Task 4 was more important than we realized.

Viewport culling was already done. Task 1 became an audit instead of a build. Not a problem, but it reminded us to check what already exists before planning work.

What We’d Do Differently

Front-load the discovery phase. We could have audited existing optimizations (like viewport culling) before planning Phase 3. Would have saved time and let us focus resources on the tasks that mattered most.

Document AI tool strengths earlier. We learned mid-project that Codex CLI was better at certain debugging tasks than Claude Code. Knowing that upfront would have prevented some thrashing.

Real-World Player Impact

What does all this mean for someone creating levels in Wildermine?

Virtually unlimited undo history. With 99.8% less memory per history entry, we can now store hundreds of undo states instead of a handful. Paint, experiment, undo 50 times—the editor won’t blink.

Faster undo/redo. Delta-based history is 8.7% faster than full snapshots, making undo feel instant.

Smoother overall experience. Viewport culling (already optimized) + cleaner state management = no stutters, no lag, just painting.

Future-proof foundation. Cleaner architecture means new features (autosave, multiplayer editing, bigger levels) can be added without fighting the codebase.

What This Enables Next

Now that the architecture is cleaner and performance is optimized, we can:

Deeper undo stacks: 432x memory savings means we can afford to keep every edit
Autosave: Centralized state management makes periodic saves trivial
Multiplayer level editing: With single-source-of-truth state, syncing editors becomes feasible
Larger grids: Viewport culling means we could support 100×100 or even 200×200 levels without performance hits
More complex tools: Flood fill, selection tools, and advanced brushes now have headroom to work with

Document-Driven Development: The Results

Using detailed refactoring documents (like 03-optimizations.md) with AI collaboration (Claude Code + Codex CLI) proved its worth in Phase 3.

What worked:

Step-by-step checklists prevented missed steps and kept progress visible
Risk assessment helped us tackle tasks in the right order (low → high risk)
Rollback plans gave confidence to be bold—we knew we could revert if needed
AI collaboration turned documents into executable plans

Metrics:

Tasks completed: 4/4 ✅
Rollbacks needed: 0
Functional regressions: 0
Memory saved: 99.8%
Code reduced: 97%

The approach worked so well we’ll use it for future phases.

Conclusion

Phase 3 was Wildermine’s riskiest refactoring—and its most successful.

Phase 1: Foundation cleanup ✅ Phase 2: Copy-on-Write Revolution (248x faster) ✅ Phase 3: Architectural Transformation (432x memory, 97% code cleanup) ✅

We achieved 99.8% memory reduction, split a 1,674-line monolith into 50 lines, consolidated state management, and shipped it all with zero functional regressions. The level editor works exactly as it did before—it’s just faster, leaner, and built on a foundation that scales.

Wildermine’s level editor is now ready for the features we’ve been dreaming about: autosave, multiplayer editing, massive levels, and advanced tools. Phase 3 didn’t just optimize the code—it unlocked the future.

Special thanks to Claude Code and Codex CLI for their collaboration throughout Phase 3.

Phase 3 Results at a glance:

💾 99.8% memory reduction (432x smaller history)
🏗️ 97% code cleanup (1,674 → 50 lines)
⚡ 8.7% faster undo/redo
✅ Zero functional regressions
🎮 Smoother, future-proof editor

Wildermine Phase 3 Results: 432x Memory Reduction and 97% Code Cleanup

What We Set Out to Do

Task 1: Viewport Culling Optimization

The Pleasant Surprise

Performance Impact

Validation

What We Learned

Task 2: Delta-Based History System

The Implementation

Memory Results

Test Results

What We Learned

The AI Collaboration Story

Task 3: Split CanvasRenderer

The Refactoring Journey

The Four-Part Approach

Code Quality Metrics

What We Learned

Task 4: State Management Consolidation

The Problem We Solved

The Solution

The Payoff

Validation

Overall Phase 3 Impact

Performance Summary Table

Combined Impact

Challenges & Lessons Learned

What Went Well

Unexpected Challenges

What We’d Do Differently

Real-World Player Impact

What This Enables Next

Document-Driven Development: The Results

Conclusion

Written by WBG Development Team