Run configuration
1Who is talking
The shape of the conversation, and who you are in it.
The name your own messages appear under.
Shown in the header for your reference. It is not sent to the models.
2The speakers
The two voices in the dialogue. Each owns its accent colour throughout the app. Blank generation fields fall back to the shared defaults below.
3Opening & shared framing
What starts the dialogue, and what the models are told about the experiment they’re part of.
The opening user message handed to the first speaker to start the dialogue.
Prepended to each speaker’s system prompt to make the models aware of the experiment. {self} = that speaker’s name, {others} = the other participants, filled per speaker. Leave blank to tell them nothing.
4Generation defaults
The shared sampling settings every speaker inherits — and the optional private reflection pass each model can run before it speaks.
Model turns before the run stops on its own. Solo mode ignores this.
Recent turns each model sees, plus the opening turn.
Tokens each model may generate per turn.
0 = deterministic, higher = more varied.
Auto-generate highlights every N turns. 0 = off.
Max tokens for the private reflection pass.
Appended to the speaker’s system prompt for the reflection pass.
5Stop condition
How the run decides it’s finished.
Run to the end goes the full count of turns and lets you judge convergence yourself. Stop on agreement ends one turn after a speaker emits the marker below. Stop when measured lets the live detector end the run once it judges the dialogue converged — or hopelessly circling — for two turns running.
Exact string that ends the run in declared mode — only fires if your prompts instruct the models to emit it.
Convergence judgmentWhat counts as agreement, stagnation, and breakout
The watcher measures the dialogue every cycle; here you set the standard of evidence by which those measurements become a verdict. One headline dial plus a few named decisions — the instrument’s internal calibration is fixed in code, not exposed.
Sets a coherent default policy. Everything below is optional — leave it untouched and the preset decides.
How eager the watcher is to call a verdict. Co-moves the convergence, stagnation and patience policy so it stays coherent.
Each control overrides just its slice of the preset. Set one to take precise control of that single question; the rest keep following the dial.
How close the speakers must get before the gap counts as converged — 0.45 means it must shrink to 45% of its opening value. Lower = stricter.
How readily a stuck, circling dialogue gets called stagnant. Lower = more aggressive.
Cycles a new verdict must hold before it is published. Low = responsive but twitchy.
How far below its opening register a speaker must fall to count as a genuine breakout. Lower = a bigger drop is demanded.
How fully a speaker must climb back to baseline to count as “returned” — the natural-convergence window.
Definitional windows the verdict is measured against.
Opening turns that define “the start” — both the early cross-speaker gap and the programmed-register baseline.
How much gap trajectory must exist before a convergence ETA may be promised. Higher = more conservative.
what am I looking at?
| state | verdict from the gap trajectory: exploring / converging / converged / stagnant / diverging |
| gap | 1 − cosine between the speakers' recent-position centroids. Low + flat = met. High + flat + low novelty = circling. |
| velocity | semantic displacement vs the speaker's own previous turn. Zero with open gap = stagnation; spike = breakout. |
| novelty | fresh trigrams this turn. Drops when they start repeating anyone, including themselves. |
| self-rep | similarity to the speaker's own recent turns — the "frozen on own attractor" signal. |
| novelty (JSD) | Jensen-Shannon divergence of the turn's word distribution vs the decayed history. Length-normalised; falls during stagnation where the old trigram measure stayed flat. |
| stagnation | graded 0–100% circling index: geometric mean of gap-open × gap-flat × speakers-frozen. ≥60% flags real stagnation. |
| mirror | cross − self JSD. Negative = drifting toward the partner (merging); positive = holding distinct ground. |
| converge ETA | turns to convergence with an [optimistic–pessimistic] band; suppressed (with a reason) when the trend sign is uncertain. |
| register | density of the model's assistant boilerplate + list formatting. Your breakout theory is measured on this. |
Saved runs
Forge
A space for two models to think together
Model Dialogue runs a live conversation between two language models — and lets you watch, measure, and join it. There's no transcript loaded yet.
Configure the run
Open Run configuration to set the two speakers, their prompts, and how the dialogue should open.
Start the dialogue
Press New run to begin. Turns stream in live, with a watcher tracking convergence as they talk.
Read, steer, export
Synthesize highlights, join the conversation, or save the whole run for later replay.