The Death of the Yes-Man
If you ask ChatGPT a question, it tries to please you. If you ask Gemini, it tries to be safe. If you ask Claude, it tries to be nuanced.
But if you ask a single model to critique itself, it fails.
LLMs are sycophants. They suffer from "Mode Collapse" in reasoning: they lock onto the most probable answer and defend it, even when it's wrong. They are the ultimate "Yes Men."
I built the LLM Group Chat to fire the Yes Man.
The Boardroom Architecture
I didn't want a chatbot; I wanted a Board of Directors.
This project is a multi-agent consensus engine. Instead of a single model generating an answer, it orchestrates a "Blind Peer Review" between the giants: GPT-4, Claude 3.5 Sonnet, and Gemini Pro.
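To make the architecture concrete, here is a minimal sketch of how the board can be wired up. It assumes the official OpenAI, Anthropic, and Google Python SDKs and uses illustrative model names; the real project may configure its providers differently.

```python
# Sketch of the "board" roster. Each provider sits behind the same
# prompt-in, text-out signature so the debate loop can treat directors
# interchangeably. Model names here are illustrative assumptions.
import os

import anthropic
import google.generativeai as genai
from openai import OpenAI

openai_client = OpenAI()               # reads OPENAI_API_KEY
claude_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

def ask_gpt(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    msg = claude_client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def ask_gemini(prompt: str) -> str:
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(prompt).text

# Internal names only: answers are anonymized before any director
# sees another director's work.
BOARD = {"gpt": ask_gpt, "claude": ask_claude, "gemini": ask_gemini}
```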
The Protocol
- The Prompt: User asks a complex question (e.g., "Critique this system architecture").
- Blind Generation: All three models generate answers independently, hidden from each other.
- The Roast: Each model reads another model's answer, stripped of attribution, and tears it apart.
- The Synthesis: A "Chairman" agent (run at a higher sampling temperature) reads the whole debate and synthesizes a final, high-fidelity answer. The full loop is sketched below.
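Here is a condensed sketch of that loop, reusing the BOARD mapping above. The prompt wording, round-robin critic pairing, and choice of chairman are assumptions for illustration, not the project's exact settings.

```python
CRITIQUE_PROMPT = (
    "You are reviewing an anonymous colleague's answer to this question:\n"
    "{question}\n\nTheir answer:\n{answer}\n\n"
    "List every factual error, unstated assumption, and weak argument. "
    "Do not praise and do not rewrite; only critique."
)

def run_boardroom(question: str, chairman=ask_gpt) -> str:
    # 1. Blind generation: every director answers independently,
    #    without seeing anyone else's draft.
    drafts = {name: ask(question) for name, ask in BOARD.items()}

    # 2. The roast: each director critiques the next director's draft,
    #    with authorship stripped so nobody defers to a "bigger" model.
    names = list(drafts)
    critiques = []
    for i, critic in enumerate(names):
        target = names[(i + 1) % len(names)]  # simple round-robin pairing
        critiques.append(BOARD[critic](
            CRITIQUE_PROMPT.format(question=question, answer=drafts[target])
        ))

    # 3. Synthesis: the chairman reads the anonymized drafts plus the
    #    critiques and writes the final answer.
    dossier = "\n\n".join(
        [f"DRAFT {i + 1}:\n{d}" for i, d in enumerate(drafts.values())]
        + [f"CRITIQUE {i + 1}:\n{c}" for i, c in enumerate(critiques)]
    )
    return chairman(
        f"Question: {question}\n\n{dossier}\n\n"
        "Synthesize the single strongest answer, keeping only claims "
        "that survived critique."
    )

final_answer = run_boardroom("Critique this system architecture: ...")
```

Anonymizing the drafts is the key design choice: critics judge the content, not the brand name attached to it.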
The Axiom: "Intelligence is not found in the neuron, but in the conversation."
Why It Works
It turns out that LLMs are better critics than creators. Even a smaller model can spot a hallucination in a larger model's output if asked specifically to critique it.
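To illustrate that asymmetry, a smaller, cheaper model can be dropped into the critic seat using the same CRITIQUE_PROMPT from the sketch above; the model name here is a hypothetical stand-in, not the project's configuration.

```python
# Hypothetical: a small, cheap model as a dedicated fact-checker.
def ask_small_critic(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

verdict = ask_small_critic(CRITIQUE_PROMPT.format(
    question="Critique this system architecture",
    answer="<paste a larger model's draft here>",
))
```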
In my tests, forcing this adversarial debate cut hallucination rates by nearly 40%. The "black box" became a Glass Conference Room.