Case Study IV — Educational Technology
Turning Student Misconceptions into Curriculum
An edtech startup builds a tutoring LLM for high school physics. Factual accuracy: 91%. The wrong answers, the remaining 9%, cluster around specific conceptual boundaries: mass vs. weight, direction of friction in rolling motion, sign convention in thermodynamics. The model confabulates at the same boundaries as human students because it was trained on human-written text that over-represents the same misconceptions.
The PHANTASM intervention
HGT traces every answer over three months of student interactions.
- HGT identifies 23 recurring `knowledge_gap` patterns: positions where gradient norms spike consistently across diverse queries
- Each gap pattern corresponds to a specific conceptual boundary in the physics curriculum
- The patterns coincide exactly with the most-failed items on end-of-semester exams
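The source does not say how HGT scores a spike or what counts as "consistently", so the following is only a minimal sketch under assumed definitions. It treats a gradient norm as a spike when its robust z-score (median and MAD) exceeds a threshold, and calls a concept a recurring gap when enough distinct queries spike on it. The function name `find_recurring_gaps`, the `(query_id, concept, grad_norm)` trace layout, the thresholds, and the sample data are all hypothetical and not part of HGT or PHANTASM.

```python
from collections import defaultdict
from statistics import median


def find_recurring_gaps(traces, z_threshold=3.5, min_queries=3, min_spike_rate=0.6):
    """Return concepts whose gradient norms spike across several distinct queries.

    traces: iterable of (query_id, concept, grad_norm) tuples (assumed layout).
    A spike is a norm whose robust z-score exceeds z_threshold (assumed rule).
    """
    traces = list(traces)
    if not traces:
        return []

    norms = [grad_norm for _, _, grad_norm in traces]
    center = median(norms)
    mad = median(abs(n - center) for n in norms)  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be called a spike

    seen = defaultdict(set)    # concept -> all query ids that touched it
    spiked = defaultdict(set)  # concept -> query ids where the norm spiked
    for query_id, concept, grad_norm in traces:
        seen[concept].add(query_id)
        robust_z = (grad_norm - center) / (1.4826 * mad)
        if robust_z > z_threshold:
            spiked[concept].add(query_id)

    gaps = []
    for concept, queries in seen.items():
        n_spiked = len(spiked[concept])
        # "Recurring" here means: enough distinct queries, and most of them spike.
        if n_spiked >= min_queries and n_spiked / len(queries) >= min_spike_rate:
            gaps.append(concept)
    return sorted(gaps)


# Made-up illustrative traces, not data from the case study.
example_traces = [
    ("q1", "mass_vs_weight", 9.0), ("q2", "mass_vs_weight", 8.5),
    ("q3", "mass_vs_weight", 9.5),
    ("q4", "kinematics", 1.0), ("q5", "kinematics", 1.2),
    ("q6", "kinematics", 0.9), ("q7", "kinematics", 1.1),
    ("q8", "rolling_friction", 8.8), ("q9", "rolling_friction", 1.0),
    ("q10", "rolling_friction", 1.1),
    ("q11", "projectiles", 1.0), ("q12", "projectiles", 1.0),
    ("q13", "projectiles", 0.9), ("q14", "projectiles", 1.1),
    ("q15", "projectiles", 1.0),
]

gaps = find_recurring_gaps(example_traces)
print(gaps)  # ['mass_vs_weight']
```

In this toy data `rolling_friction` spikes on only one of three queries, so it is not reported. Requiring spikes across several distinct queries is what separates a recurring pattern from a one-off hard question.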
The startup rebuilds curriculum scaffolding around the 23 knowledge-gap patterns. Targeted micro-modules are created for each.
Outcome
Student performance on previously-failed exam items: +34% in the following semester.
The key reversal
The model's hallucinations were not errors in the tutoring system. They were a perfect map of where the curriculum needed reinforcement — a map that would have taken human designers years and thousands of student failures to construct.
HGT produced it in three months, for free, from failures.