Case Study II: Scientific Discovery

Finding Drugs by Listening to a Model Lie

A computational biology lab uses an LLM to generate drug-receptor interaction hypotheses. Standard approach: filter out hallucinations, keep grounded outputs. After six months: 200 "known" interactions, all already in the literature. Zero discoveries.

A junior researcher asks: what if the hallucinations are the point?

The PHANTASM intervention

The team switches to CMN. Instead of filtering confabulations, they mine them. CMN is fine-tuned on known drug-receptor interactions (factual references) paired with the model's confabulated outputs. The contrastive training learns to surface confabulations that are novel AND plausible.
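The "novel AND plausible" criterion can be illustrated with a minimal sketch. This is a plain filter-and-rank stand-in, not CMN's contrastive fine-tuning, which the case study does not specify. The `Hypothesis` type, the `mine` function, the 0.7 threshold, and the drug and receptor names are all hypothetical, and the plausibility score is assumed to come from some upstream model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    drug: str
    receptor: str
    plausibility: float  # hypothetical upstream score in [0, 1]


def mine(candidates, known_pairs, min_plausibility=0.7):
    """Keep candidates that are absent from the known set (novel)
    AND score at or above the threshold (plausible), best first."""
    kept = [
        h for h in candidates
        if (h.drug, h.receptor) not in known_pairs
        and h.plausibility >= min_plausibility
    ]
    return sorted(kept, key=lambda h: h.plausibility, reverse=True)


# Illustrative data only; none of these pairs come from the case study.
known_pairs = {("drug_a", "receptor_1")}
candidates = [
    Hypothesis("drug_a", "receptor_1", 0.95),  # already known: dropped
    Hypothesis("drug_a", "receptor_2", 0.80),  # novel and plausible: kept
    Hypothesis("drug_b", "receptor_3", 0.20),  # novel but implausible: dropped
]

mined = mine(candidates, known_pairs)
print([(h.drug, h.receptor) for h in mined])
```

The point of the sketch is the inversion: the standard pipeline would keep only the first candidate, while mining keeps only the second.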

Over four months, CMN mines 847 high-confidence hypotheses.

Results

Method	Hypotheses	In-Literature Rate	Novel Rate	Expert Plausibility
Standard (filter)	200	100%	0%	89%
RAG augmented	312	94%	6%	82%
PHANTASM CMN	847	31%	69%	77%

Expert plausibility drops from 89% to 77% because CMN is deliberately surfacing territory where human experts are less certain. The drop is itself taken as the signal that the hypotheses are genuinely new.
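To put the table's percentages in absolute terms, the rates can be converted back into approximate hypothesis counts. This is a rough sketch: the published rates are rounded to whole percentages, so the counts are estimates, and the `approx_counts` helper is a hypothetical name introduced here.

```python
# Rows copied from the results table: (method, hypotheses, novel rate in percent).
rows = [
    ("Standard (filter)", 200, 0),
    ("RAG augmented", 312, 6),
    ("PHANTASM CMN", 847, 69),
]


def approx_counts(total, novel_pct):
    """Return (novel, in_literature) counts implied by a rounded percentage."""
    novel = round(total * novel_pct / 100)
    return novel, total - novel


counts = {method: approx_counts(total, pct) for method, total, pct in rows}

for method, (novel, in_literature) in counts.items():
    print(f"{method}: ~{novel} novel, ~{in_literature} in literature")
```

Under these assumptions, roughly 584 of the 847 CMN hypotheses are absent from the literature, compared with about 19 of 312 for the RAG-augmented run and none for the standard filter.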

The key reversal

The old approach was panning for gold and throwing away everything yellow. The confabulations were not noise. They were the signal.