Hayao's engine is broad and its verification is real. What it can't yet hand you out of the box is a professional, hour-long, polished game. This page says exactly where that gap is and how it closes — no check the engine can't cash.
“Create a metroidvania like Ori & the Blind Forest. Build with npm package hayao.” — and get back something that feels professional, plays for an hour, and is polished out of the box.
That's the destination, not a claim about today. Below is the honest distance to it.
The engine covers the popular 2D genres end to end, and every example proves its own truth in CI. The dimension that lags is art & polish: five games are art-finished, the rest render in functional placeholder style.
The code-as-art toolkit (palettes, organic shapes, autotile, textures) is proven on Lanternway, Rootward, and Tarnholm. Next: extract the shared cosmetic-layer helpers and lift the remaining games off placeholder art.
Game-feel — hit-stop, screen-shake, tweened readability — proven per-game today. Goal: make it the default an agent inherits, not something it must hand-author each time.
Reproduce harder human-ranked js13k winners under determinism + proof discipline, feeding each engine gap back into src/. See the ladder below.
From single verified games to an hour-long, multi-zone experience an agent can compose — the true test of the north-star prompt.
Rather than grade hayao against invented specs, we reproduce games that humans already ranked highly in open jams — mechanics in spirit, nothing copied — and machine-measure the fidelity. Each rung stresses a specific engine muscle; a red gate means it doesn't count.
| Rung | Target (jam, year, rank) | Stresses | Status |
|---|---|---|---|
| 1 | Edge Not Found — js13k 2020, #2 | solver on a twisted torus | ✅ Seamfold |
| 2 | Black Hole Square — js13k 2021, #9 | BFS-provable tap puzzle | ✅ Gravewell |
| 3 | Dying Dreams — js13k 2022, #2 | multi-avatar state explosion | ☐ planned |
| 4 | Norman the Necromancer — js13k 2022, #3 | bot-provable real-time balance | ☐ planned |
| 5 | Ninja vs EVILCORP — js13k 2020, #1 | feel-critical movement, par-time bot | ☐ planned |
| 6 | Space Huggers — js13k 2021, #8 | juice + particles + perf ceiling | ☐ planned |
Method and scoring rubric: docs/BENCHMARK.md.
The engine is MIT and the whole surface is text. Pick a placeholder-art game and lift it to the woodblock bar, take a benchmark rung, or drive the engine with your own agent and file what fought it. Start from AGENTS.md and the developer docs.