The build story
The curriculum that marks itself.
Kākā Math is a maths tutor for New Zealand kids, fronted by a kākā with opinions. Under the feathers it’s the most ambitious AI system in the workshop: 1,596 lessons spanning Year 1 to Year 13, mapped to the NZ curriculum and NCEA numeracy standards. The secondary-school configs literally trace each NCEA content area down to specific lessons, with fallbacks into earlier years for kids with foundational gaps. The word problems are populated by a recurring Kiwi cast (Coach Tama, Nan Helen, Uncle Sione, Farmer John), because an NZ kid should not be calculating in dollars that look suspiciously American.
Twenty-two small agents, not one big one
Content generation is decomposed into single-purpose agents: one outlines a syllabus, one plans a lesson, one authors questions, one writes hints, one prompts illustrations, one draws SVG diagrams, and so on. In total that is about seven hundred lines of prompt engineering, run through queued jobs. Tutoring is a different problem with a different model: when a kid photographs their handwritten working, a fast vision model reads it and responds with Socratic hints, never answers. Deterministic checking handles everything that doesn’t need a model at all.
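As a rough illustration of the deterministic side, here is a minimal Python sketch of marking a numeric answer with exact arithmetic and no model call. The function names and the accepted input formats are assumptions for illustration, not Kākā Math’s actual checker.

```python
from fractions import Fraction
from typing import Optional


def parse_number(text: str) -> Optional[Fraction]:
    """Parse input such as '0.75', '3/4' or ' 1,200 ' into an exact Fraction.

    Returns None when the text is not a number (hypothetical helper).
    """
    cleaned = text.strip().replace(",", "")
    try:
        return Fraction(cleaned)
    except (ValueError, ZeroDivisionError):
        return None


def check_answer(submitted: str, expected: str) -> bool:
    """True only when both values parse and are exactly equal."""
    got = parse_number(submitted)
    want = parse_number(expected)
    return got is not None and want is not None and got == want


if __name__ == "__main__":
    print(check_answer("0.75", "3/4"))
    print(check_answer("three quarters", "3/4"))
```

Exact fractions sidestep floating-point rounding, so 0.75 and 3/4 count as the same answer. Anything that fails to parse is marked wrong here; a real checker might hand it on to a model instead.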
AI writes it. A panel of AI experts marks it.
Generated curriculum has an obvious failure mode: confidently wrong, subtly off-syllabus, or pitched at the wrong age. So Kākā Math runs a standing review pipeline — a worst-first queue that pulls the weakest lessons and puts each one in front of five expert personas: a mathematician for rigour, a teacher for pedagogy, a kid-UX reviewer for language, an illustration reviewer for the SVGs, and an NZ numeracy-standards examiner for alignment. The reviewers don’t just flag — they fix, behind safety gates, committing batch by batch with a ledger of what was changed and what got blocked for a human. It runs on autopilot in idle hours, dozens of batches deep, and the queue re-sorts itself as quality improves. Cheap models flag; expensive models fix; humans arbitrate the residue.
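One batch of that loop can be sketched in a few lines of Python. Everything named here is hypothetical: the quality scores, the review and safe_to_apply callables standing in for the model calls and the safety gates, and the shape of the ledger. The sketch shows only the worst-first ordering and the changed/blocked split.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

# The five reviewer personas described above (identifiers are illustrative).
PERSONAS = ("mathematician", "teacher", "kid_ux", "illustration", "nz_numeracy")

LedgerEntry = Tuple[str, str, str]  # (lesson_id, persona, proposed fix)


def review_batch(
    scores: Dict[str, float],
    size: int,
    review: Callable[[str, str], Optional[str]],
    safe_to_apply: Callable[[str], bool],
) -> Dict[str, List[LedgerEntry]]:
    """Review the `size` lowest-scoring lessons and return a ledger.

    `review` returns a proposed fix, or None when the persona has no complaint.
    `safe_to_apply` is the safety gate: fixes that fail it are blocked for a human.
    """
    ledger: Dict[str, List[LedgerEntry]] = {"changed": [], "blocked": []}
    # Worst first: the lowest quality scores are pulled before anything else.
    for lesson_id in heapq.nsmallest(size, scores, key=scores.get):
        for persona in PERSONAS:
            fix = review(lesson_id, persona)
            if fix is None:
                continue
            bucket = "changed" if safe_to_apply(fix) else "blocked"
            ledger[bucket].append((lesson_id, persona, fix))
    return ledger
```

Because the batch is picked from current scores each time, rescoring a lesson after a fix is enough to re-sort the queue on the next run.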
The bill is part of the architecture
An AI tutor for children fails differently from a chatbot: the user can’t be trusted with the meter. So the cost layer is explicit: a global daily budget that circuit-breaks all AI calls, per-user daily call and spend caps, and every request logged with model, tokens, and cost. One scar earned it: generation jobs originally retried without limit for hours, which on a job that fans out into a hundred questions is a way to set money on fire. Retries are finite now, and the breaker catches what slips past.
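A minimal Python sketch of that shape, under loud assumptions: the class and function names, the limits, and the in-memory counters are all invented for illustration, and the daily reset is left out. It shows a guard checked before every attempt, a log entry per successful call, and a retry loop that gives up.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Tuple


class BudgetExceeded(Exception):
    """Raised when a daily budget or cap has been reached."""


@dataclass
class CostGuard:
    global_daily_limit: float  # spend ceiling across all users
    user_daily_limit: float    # spend ceiling per user
    user_daily_calls: int      # call-count ceiling per user
    spent_total: float = 0.0
    spent_by_user: dict = field(default_factory=lambda: defaultdict(float))
    calls_by_user: dict = field(default_factory=lambda: defaultdict(int))
    log: list = field(default_factory=list)

    def check(self, user_id: str) -> None:
        """Circuit breaker: refuse the call before any money is spent."""
        if self.spent_total >= self.global_daily_limit:
            raise BudgetExceeded("global daily budget reached")
        if self.calls_by_user[user_id] >= self.user_daily_calls:
            raise BudgetExceeded("user daily call cap reached")
        if self.spent_by_user[user_id] >= self.user_daily_limit:
            raise BudgetExceeded("user daily spend cap reached")

    def record(self, user_id: str, model: str, tokens: int, cost: float) -> None:
        """Log the request and update the running totals."""
        self.spent_total += cost
        self.spent_by_user[user_id] += cost
        self.calls_by_user[user_id] += 1
        self.log.append({"user": user_id, "model": model, "tokens": tokens, "cost": cost})


def call_with_retries(
    guard: CostGuard,
    user_id: str,
    call: Callable[[], Tuple[str, int, float, Any]],
    max_attempts: int = 3,
) -> Any:
    """Run `call` at most `max_attempts` times, checking the guard each time.

    `call` is a hypothetical function returning (model, tokens, cost, result).
    """
    last_error = None
    for _ in range(max_attempts):
        guard.check(user_id)  # BudgetExceeded propagates; it is never retried
        try:
            model, tokens, cost, result = call()
        except Exception as exc:  # failed attempt: remember it and try again
            last_error = exc
            continue
        guard.record(user_id, model, tokens, cost)
        return result
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

In this sketch a failed attempt costs nothing, which is optimistic: a real provider may bill for tokens on a call that later errors, so a production guard would record those too.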
Built for the hand-me-down
The device a kid actually learns on is rarely new. It’s a parent’s old phone with the cracked corner, the iPad that’s been demoted to the kitchen drawer, the tablet handed down the sibling line. So Kākā Math holds its floor at iOS 15 — one universal iPhone and iPad build that installs on hardware close to a decade old, an iPhone 6s or an iPad Air 2. Holding that floor is an active decision, not a default: it means turning down the shiny new frameworks — NavigationStack, the @Observable macro, the iOS 16/17 conveniences every Swift tutorial now assumes — and staying on the older patterns that still run on the older glass. It costs me some developer comfort. It’s worth it, because the first kid I built this for is learning on exactly that kind of phone.
Honestly: in dev
The gamification loop is built — feathers to earn, altitude to climb, streaks with a gentle push notification when one’s at risk, a reward shop with AI-illustrated prizes — and the iOS app rides the same API through TestFlight. What it doesn’t have yet is a classroom of real kids, and that’s the next milestone: the content review exists precisely so that when the first kid opens it, the maths deserves them.
Generating content with AI and wondering how you’d ever trust it? Here’s how we could work together.