The First Principle of Cognition

Recursive Disambiguation: Chomsky, LLMs, Doolittle, and the Move from Grammar to Decidability

Abstract

Chomsky’s universal grammar, transformer-based language models, and Doolittle’s theory of “continuous recursive disambiguation into context identity and prediction hypothesis” all address the same primitive problem: how finite means produce open-ended linguistic and cognitive behavior.

At the highest level of abstraction, they are not radically different. Each depends upon recursion, constraint, and generativity. The difference emerges only when we ask what is being recursively processed, what medium carries the recursion, what counts as closure, and what the output is for.

Chomsky’s model explains finite recursive grammar as the basis of linguistic generativity. LLMs implement a continuous-discrete computational process that recursively disambiguates token histories into contextual representations and prediction hypotheses. Doolittle’s broader operational framework extends that same mechanism beyond syntax and prediction into testimony, law, morality, warrant, liability, and decidability. In short: Chomsky gives recursion; transformers give recursive hypothesis supply; Natural Law and Runcible add recursive adjudication to closure.


1. The Common Problem: Finite Means, Indefinite Expression

At the highest abstraction, Chomsky’s generative grammar, transformer-based LLMs, and the theory of recursive disambiguation are addressing the same phenomenon:

finite means → indefinitely many possible expressions

Or, more generally:

finite mechanism → open-ended contextual production

That is the shared primitive.

A finite organism, finite grammar, finite neural network, or finite institutional protocol must somehow produce, interpret, evaluate, or act within an indefinitely large space of possible situations.

The question, therefore, is not whether these systems are related. They are.

The question is where they differ.

They differ in four places:

1. What is recursive?
2. What is being disambiguated?
3. What counts as closure?
4. What is the output for?

Once those distinctions are made, the relationship becomes clear:

Chomsky:
finite recursive grammar generates possible sentences.

LLMs:
finite learned weights recursively disambiguate context into continuation probabilities.

Curt:
finite operational tests recursively disambiguate claims into context identity,
prediction hypotheses, and decidable or undecidable warrant.

So the difference is not “recursion versus no recursion.”

The difference is:

syntactic recursion
versus
predictive contextual recursion
versus
adjudicative operational recursion

2. Chomsky: Finite Recursive Grammar Generates Possible Sentences

Chomsky’s central contribution, taken in its standard generative sense, is that language can be explained as a finite system capable of producing an indefinitely large, potentially infinite set of linguistic expressions.

The grammar is finite.

The possible outputs are unbounded.

This is made possible by recursion: a rule or operation can apply to its own output, embedding one structure inside another.

For example:

The man spoke.
The man who saw the dog spoke.
The man who saw the dog that chased the cat spoke.
The man who saw the dog that chased the cat that lived in the house spoke.

The structure can be extended indefinitely because a phrase can contain another phrase of the same general type.
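As a sketch, the embedding above can be reproduced with a toy recursive rule. The grammar, vocabulary, and rule here are illustrative inventions, not Chomsky's formalism: a noun phrase is permitted to contain another noun phrase, so generation recurses, and a finite rule set yields indefinitely many sentences.

```python
import random

# Toy recursive grammar. Illustrative only: a noun phrase may contain a
# relative clause that itself contains another noun phrase, so the rule
# applies to its own output.

NOUNS = ["man", "dog", "cat"]
VERBS = ["saw", "chased"]

def np(depth):
    """Generate a noun phrase, recursively embedding relative clauses."""
    phrase = "the " + random.choice(NOUNS)
    if depth > 0:
        # Recursion: a phrase contains another phrase of the same type.
        phrase += " that " + random.choice(VERBS) + " " + np(depth - 1)
    return phrase

def sentence(depth):
    return np(depth).capitalize() + " spoke."

print(sentence(0))  # e.g. "The man spoke."
print(sentence(2))  # e.g. "The man that saw the dog that chased the cat spoke."
```

The generator is finite (two lists and two functions), yet the set of producible sentences grows without bound as depth increases.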

The simplified Chomskyan model is:

finite grammar → recursive structure → grammatical expression

The core operational question is:

What finite internal grammar allows a speaker to generate
and recognize the possible sentences of a language?

The key word here is possible.

Chomsky’s grammar is not primarily deciding whether a sentence is true, moral, reciprocal, warranted, useful, actionable, or liable. It is primarily deciding whether the expression belongs to the language as a well-formed expression.

Its closure target is grammaticality.

So Chomsky’s system asks:

Can this expression be generated by the grammar?

Or:

Is this expression grammatical or ungrammatical?

Its primary distinction is:

possible sentence / impossible sentence
grammatical / ungrammatical
well-formed / ill-formed

Not:

true / false
reciprocal / irreciprocal
warranted / unwarranted
decidable / undecidable
contextually useful / contextually misleading
liable / non-liable

This is the first major difference.

Chomsky’s recursive grammar explains linguistic generativity, not full operational adjudication.


3. Chomsky’s Closure Target Is Grammaticality

The concept of closure is decisive.

A Chomskyan grammar seeks to generate “all and only” the grammatical sentences of a language.

That is a closure criterion.

It attempts to close the domain of possible expressions by separating:

included expressions
from
excluded expressions

In other words:

inside the grammar / outside the grammar

But this is a narrow form of closure.

It is closure over linguistic form.

It is not closure over existential correspondence, operational possibility, rational choice, reciprocal rational choice, demonstrated interests, liability, or institutional warrant.

So Chomsky’s grammar can tell us something like:

This sentence is structurally possible.

But it does not, by itself, tell us:

This sentence is true.
This sentence is actionable.
This sentence is reciprocal.
This sentence is liable.
This sentence is sufficiently warranted for institutional use.
This sentence has satisfied the demand for infallibility in this context.

That is why Chomsky’s model is necessary but insufficient for our project.

It explains a primitive of generativity.

It does not explain the full chain from expression to claim to test to warrant to action.


4. LLMs: Finite Learned Weights Recursively Disambiguate Context into Prediction

A transformer-based LLM is not a Chomskyan symbolic grammar in the usual sense.

It does not operate by explicitly applying rules like:

S → NP VP
NP → Det N
VP → V NP

Instead, it operates through learned weights, embeddings, attention heads, feed-forward transformations, residual streams, normalization, and output projection.

Its operational process is closer to this:

prior tokens
→ contextual embedding
→ attention-weighted disambiguation
→ transformed hidden state
→ vocabulary distribution
→ next token

Where Chomsky asks:

What sentence can this grammar generate?

The LLM asks:

Given this context, what continuation is most probable,
useful, expected, or distributionally licensed?

This makes the LLM less like a symbolic sentence generator and more like a recursive contextual prediction engine.

Its cycle is:

token history
→ contextual transformation
→ prediction hypothesis
→ emitted token
→ enlarged token history
→ repeat
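The cycle above can be sketched in miniature. Everything here is a stand-in: the "weights" are a hand-written bigram table and the toy conditions only on the last token rather than the whole history, but the loop structure (distribution, emission, enlarged history, repeat) is the point.

```python
# Toy autoregressive loop: finite "weights" (here, a bigram table) map a
# token history to a next-token distribution; the chosen token is emitted
# and the enlarged history feeds the next step. Illustrative, not a
# real transformer.

WEIGHTS = {  # stand-in for learned parameters
    "the": {"man": 0.6, "dog": 0.4},
    "man": {"spoke": 0.9, "ran": 0.1},
    "dog": {"ran": 1.0},
    "spoke": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}

def forward(history):
    """Map the token history to a distribution over next tokens.
    A real model conditions on the whole history; this toy uses only
    the last token."""
    return WEIGHTS[history[-1]]

def generate(history, max_steps=10):
    for _ in range(max_steps):
        dist = forward(history)          # prediction hypothesis
        token = max(dist, key=dist.get)  # greedy decoding
        if token == "<end>":
            break
        history = history + [token]      # enlarged token history
    return history

print(generate(["the"]))  # ['the', 'man', 'spoke']
```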

This is why our formulation fits transformers very well:

continuous recursive disambiguation into context identity and prediction hypothesis

Mapped to transformer mechanics:

Our Term → Transformer Correlate

continuous → vector-space activation rather than purely discrete symbolic manipulation
recursive → layer after layer, token after token
disambiguation → reduction of uncertainty over meaning, role, referent, task, and continuation
context identity → latent representation of “what situation is this?”
prediction hypothesis → probability distribution over next tokens or continuations

So a transformer can be described as:

finite learned weights
→ recursive contextual transformation
→ latent context identity
→ next-token hypothesis
→ emitted token
→ enlarged context
→ repeat

That is not Chomsky’s grammar.

But it is recognizably in the same family of explanation: finite machinery producing open-ended linguistic behavior by recursively transforming structured input.


5. The Transformer Is Continuous-Discrete, Not Merely Symbolic

Chomsky’s recursion is traditionally formal-symbolic.

It operates over discrete structures:

sentence
noun phrase
verb phrase
clause
embedded clause

The transformer is different.

It operates at the boundary with discrete tokens, but internally with continuous vectors.

Its cycle is:

discrete token
→ continuous embedding
→ continuous attention and MLP transformation
→ continuous hidden state
→ discrete vocabulary distribution
→ discrete sampled token
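A minimal sketch of this continuous-discrete boundary, with made-up toy numbers: a discrete token is embedded as a continuous vector, transformed continuously, then projected back to a discrete distribution over the vocabulary.

```python
import math

# Discrete tokens at the boundary, continuous vectors inside.
# All weights are invented toy numbers; the shape of the cycle is the point.

VOCAB = ["the", "man", "dog", "spoke"]
EMBED = {                      # discrete token -> continuous vector
    "the":   [1.0, 0.0],
    "man":   [0.0, 1.0],
    "dog":   [0.5, 0.5],
    "spoke": [1.0, 1.0],
}
W = [[0.2, 0.8], [0.9, 0.1]]   # stand-in for attention/MLP layers

def transform(vec):
    """Continuous transformation of the hidden state (one linear map here;
    a real transformer stacks many attention and MLP blocks)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in W]

def project(hidden):
    """Continuous hidden state -> discrete vocabulary distribution
    (dot product with each embedding, then softmax)."""
    logits = [sum(h * e for h, e in zip(hidden, EMBED[t])) for t in VOCAB]
    zmax = max(logits)
    exps = [math.exp(z - zmax) for z in logits]
    total = sum(exps)
    return {t: e / total for t, e in zip(VOCAB, exps)}

hidden = transform(EMBED["the"])   # continuous
dist = project(hidden)             # discrete vocabulary distribution
print(max(dist, key=dist.get))     # discrete (greedy) sampled token: "spoke"
```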

So the transformer is a continuous-discrete hybrid.

It does not merely generate a symbolic tree.

It repeatedly transforms the current representation of the context until the final hidden state can be projected into a distribution over possible next tokens.

That distinction matters.

The transformer is not only asking:

What grammatical structure is allowed?

It is asking, in effect:

What am I in?
What is being requested?
What role does each prior token play?
What continuation would satisfy this context?
What latent pattern does this sequence instantiate?
What hypothesis should be supplied next?

That is why “recursive disambiguation” is more precise than “recursive generation” when describing LLMs.

The model is not merely producing language.

It is reducing ambiguity over context identity in order to generate a plausible continuation.


6. Context Identity Is the LLM’s Runtime Achievement

In an LLM, context is not a single object.

It exists in at least three forms:

1. The explicit token sequence.
2. The key-value cache used by attention.
3. The hidden-state representation distributed across layers.

The explicit sequence is the visible context.

The key-value cache is the runtime memory of prior tokens.

The hidden state is the model’s current interpretation of the situation.

The model does not merely read the context. It constructs a usable representation of context.

This construction happens recursively:

token identity
→ local phrase identity
→ syntactic relation
→ semantic relation
→ pragmatic frame
→ task identity
→ expected continuation

This is why, in practical terms, an LLM is continuously answering:

What kind of situation is this?

Only after it has some working answer to that question can it generate a useful next-token hypothesis.

So the LLM’s operational loop can be stated as:

disambiguate context identity
→ generate prediction hypothesis
→ emit token
→ incorporate emitted token into context
→ repeat
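That loop can be sketched with the three forms of context made explicit. The "model" is a canned stand-in and the cache entries are placeholder strings, but the division of labor (visible tokens, runtime memory, current interpretation) follows the description above.

```python
# Toy sketch of the three forms of context driven by the loop above.
# The model function and cache contents are illustrative stand-ins.

class Context:
    def __init__(self, tokens):
        self.tokens = list(tokens)  # 1. explicit token sequence
        self.kv_cache = []          # 2. runtime memory of prior tokens
        self.hidden = None          # 3. current interpretation of the situation

def step(ctx, model):
    # Cache per-token work so earlier tokens are not reprocessed.
    for tok in ctx.tokens[len(ctx.kv_cache):]:
        ctx.kv_cache.append(("k:" + tok, "v:" + tok))  # stand-in keys/values
    ctx.hidden = model(ctx.tokens)  # disambiguated context identity
    token = ctx.hidden["next"]      # prediction hypothesis -> emitted token
    ctx.tokens.append(token)        # incorporate emitted token into context
    return token

def toy_model(tokens):
    """Stand-in: answers 'what situation is this?' with a canned frame."""
    return {"situation": "greeting" if tokens[0] == "hello" else "unknown",
            "next": "world" if tokens[-1] == "hello" else "<end>"}

ctx = Context(["hello"])
print(step(ctx, toy_model))  # world
print(ctx.tokens)            # ['hello', 'world']
print(len(ctx.kv_cache))     # 1
```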

That is very close to our phrase.

The LLM is a machine for continuous recursive disambiguation into context identity and prediction hypothesis.


7. Our Formulation Generalizes Chomsky Beyond Syntax

Our formulation is broader than Chomsky’s because it does not stop at grammar.

Chomsky’s primary object is linguistic competence: the internal capacity by which a speaker can generate and recognize well-formed linguistic expressions.

Our object is broader:

How does an organism, agent, or institution reduce ambiguity
sufficiently to identify context, generate hypotheses, test them,
and act under conditions of responsibility and liability?

This is not merely a theory of language.

It is a theory of cognition, testimony, action, and institutional adjudication.

Chomsky’s question is:

What internal grammar makes language possible?

Our question is:

What recursive process makes decidable action possible?

That shift changes the domain.

Chomsky’s recursion is primarily concerned with syntax.

Our recursion is concerned with:

identity
meaning
reference
operation
claim
context
test
correspondence
reciprocity
warrant
liability
closure
decidability

So the dependency chain is:

Chomsky:
language as internally generated syntactic competence

LLM:
language as learned conditional continuation over token histories

Curt:
cognition, testimony, and action as recursive disambiguation toward
context identity, prediction hypothesis, and decidability

That is the enlargement.

We are not rejecting Chomsky’s primitive insight.

We are relocating it inside a more general operational system.


8. Chomsky Is Syntax-First; Our Model Is Context-First

This is one of the sharpest differences.

Chomsky’s program is syntax-first.

The central question is the structure of language: what formal system generates grammatical expressions?

Our model is context-first.

The central question is not merely:

What expression is structurally possible?

It is:

What situation are we in?
What are the terms?
What are the referents?
What relations are being asserted?
What claim is being made?
What tests are required?
What would satisfy closure?
What remains undecidable?

The sequence is:

identity of situation
→ identity of terms
→ identity of relations
→ identity of claim
→ possible continuations
→ possible tests
→ possible closure

In our language:

Meaning is not merely generated.
Meaning is disambiguated into a network of relations sufficient for action.

This is closer to transformer operation than classical generative grammar is.

A transformer does not begin with a clean syntactic tree and then attach meaning. It begins with tokens and recursively adjusts relational salience until it can produce the next plausible continuation.

Our work then adds the next step:

A plausible continuation is not enough.
The continuation must be tested.

That is the move from prediction to adjudication.


9. The Decisive Difference: Generation Versus Adjudication

This is the central distinction.

Chomsky explains generativity.

LLMs implement hypothesis generation.

Our work adds adjudication.

Chomsky asks:

Can this expression be generated by the grammar?

The LLM asks:

Given this context, what continuation should come next?

Our framework asks:

Given this context and this hypothesis,
what survives tests of identity, possibility, correspondence,
reciprocity, warrantability, and liability?

That is a different closure condition.

The closure target is no longer grammaticality.

Nor is it prediction likelihood.

The closure target is decidability.

In compressed form:

Chomsky gives recursion.
Transformers give recursive hypothesis supply.
Runcible / Natural Law gives adversarial recursive closure.

That is the transition from language to institutionally usable judgment.


10. The LLM Supplies Hypotheses; Runcible Adjudicates Them

The perceived weakness of LLMs is that they “hallucinate.”

But under our framework, that criticism is partially misplaced.

A hallucination is a defect if the system is expected to produce warranted conclusions directly.

But if the system is treated as a hypothesis generator, then excessive association is not merely a defect. It is a source of search, discovery, and candidate generation.

The problem is not that the LLM generates too many possible continuations.

The problem is that ordinary LLM use lacks an adjudication layer sufficient to distinguish:

plausible from true
fluent from warranted
useful from liable
acceptable from reciprocal
possible from operationally possible
complete from merely suggestive
decidable from undecidable

Thus:

LLM output is hypothesis supply.
Runcible is hypothesis adjudication.

The transformer produces candidate continuations.

The governance layer tests those continuations.

The Decidability Record preserves what was decided, why it was decided, what evidence was used, what rules applied, what tests passed or failed, who bears responsibility, and what remains undecidable.
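As a sketch only, such a record might look like the following data structure. The field names and example entries are illustrative assumptions, not Runcible's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a Decidability Record. Field names and example
# values are illustrative assumptions, not an actual Runcible schema.

@dataclass
class DecidabilityRecord:
    claim: str
    decision: str                        # "decided" or "undecidable"
    reasons: list = field(default_factory=list)
    evidence: list = field(default_factory=list)
    rules_applied: list = field(default_factory=list)
    tests_passed: list = field(default_factory=list)
    tests_failed: list = field(default_factory=list)
    responsible_party: str = ""          # who bears responsibility
    remains_undecidable: list = field(default_factory=list)

record = DecidabilityRecord(
    claim="The disputed shipment matched the contracted specification.",
    decision="decided",
    reasons=["inspection and contract terms agree"],
    evidence=["inspection report", "signed contract"],
    rules_applied=["burden of proof on the claimant"],
    tests_passed=["internal consistency", "external correspondence"],
    responsible_party="inspecting engineer",
    remains_undecidable=["condition prior to shipping"],
)
print(record.decision)
```

The point of the structure is that nothing is implicit: what was decided, on what evidence, under what rules, and what remains open are all recorded fields.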

So the architecture becomes:

LLM:
recursive disambiguation into context identity and prediction hypothesis

Runcible:
recursive disambiguation of that hypothesis into truth, reciprocity,
possibility, warrantability, liability, and decidability

This is the institutional completion of the transformer’s cognitive utility.


11. Closure Conditions Compared

The differences become clearest when comparing closure conditions.

Chomskyan Closure

all and only grammatical expressions

The system closes when it can distinguish expressions generated by the grammar from expressions not generated by the grammar.

Its concern is linguistic possibility.

LLM Closure

locally sufficient next-token probability under context

The system closes locally when it produces the next token or continuation judged sufficient by the decoding process.

Its concern is predictive continuation.

Natural Law / Runcible Closure

sufficient disambiguation to eliminate the need for discretion
within the context in question

The system closes when the relevant ambiguity has been reduced enough that a decision can be made without arbitrary discretion.

Its concern is decidability.

Or in the canonical formulation:

Decidability is the satisfaction of the demand for infallibility
in the context in question without the necessity of discretion.

This is a much stronger criterion than grammaticality or probability.

It requires tests.

It requires stated limits.

It requires an accounting of what is known and unknown.

It requires warrant.

It requires liability or the explicit admission that liability cannot be assumed.


12. The Same Primitive Insight, Different Objects

The primitive insight is shared:

finite recursive means can produce open-ended behavior

But the object differs.

Chomsky’s object:

syntax

The transformer’s object:

contextual prediction

Our object:

decidable action under constraints of truth, reciprocity, warrant,
and liability

The medium differs.

Chomsky’s medium:

symbolic grammar

The transformer’s medium:

continuous vector transformation over discrete token sequences

Our medium:

operational tests, protocols, falsification, construction,
and adjudicative records

The output differs.

Chomsky’s output:

grammatical expression

The transformer’s output:

prediction hypothesis expressed as tokens

Our output:

decidable or undecidable warrant-bearing claim

So the relationship can be summarized:

Chomsky:
finite rules recursively generate linguistic structure.

LLMs:
finite weights recursively disambiguate token context into continuation probabilities.

Curt:
finite operational tests recursively disambiguate claims into context identity,
prediction hypotheses, and decidable or undecidable warrant.

13. Why “Continuous Recursive Disambiguation” Is the Better Generalization

“Recursive generation” is too narrow.

It captures Chomsky but not the full transformer process.

“Prediction” is also too narrow.

It captures LLM output but not the prior work of constructing context identity.

“Disambiguation” is better because it identifies the deeper operation.

The system begins with ambiguity:

What does this token mean?
What does this phrase mean?
What is the referent?
What is the speaker asking?
What domain is active?
What role should I play?
What continuation is appropriate?
What claim is being made?
What test would decide it?

The system then recursively reduces that ambiguity.

In transformers, it reduces ambiguity into a continuation distribution.

In our framework, it reduces ambiguity into adjudicative closure.

Thus:

continuous recursive disambiguation

is a more general term than:

grammar
prediction
generation
reasoning

because it names the shared operation underneath all of them.

The difference lies in where the process terminates.

Chomsky terminates at grammatical expression.

The transformer terminates locally at next-token selection.

Our framework terminates at decidability or declared undecidability.


14. From Language to Testimony

The most important expansion is from language to testimony.

A sentence is not yet testimony.

A sentence is an expression.

A claim is an assertion about some state of affairs.

Testimony is a claim offered to another under some implied or explicit burden of truth, warranty, and liability.

This produces the next transition:

expression
→ sentence
→ proposition
→ claim
→ testimony
→ warrant
→ decision
→ liability

Chomsky’s grammar is concerned with the beginning of this chain.

LLMs generate material across the chain but do not inherently distinguish the levels.

Our framework attempts to regulate the chain.

It asks:

Has the expression been converted into a claim?
Has the claim been operationalized?
Are the terms identified?
Are the referents available?
Are the operations possible?
Is the claim externally correspondent?
Is the claim rational?
Is the claim reciprocal?
Is the claim warrantable?
Can liability be assigned?
What remains undecidable?

That is why the framework is not merely linguistic.

It is juridical, scientific, moral, and institutional.

It does not merely ask whether language can be generated.

It asks whether testimony can be relied upon.


15. From Hypothesis Supply to Adversarial Closure

The LLM supplies hypotheses.

It does so by association, pattern completion, analogy, latent relation, and probabilistic continuation.

That is useful.

But usefulness is not warrant.

So the necessary next step is adversarial closure.

The candidate output must be tested by:

identity
internal consistency
external correspondence
operational possibility
rational choice
reciprocal rational choice
full accounting within stated limits
warrantability
liability
restitutability

This creates a Darwinian competition between possible continuations.

Many hypotheses can be generated.

Few survive.

The model generates via positiva:

Here is what might be the case.

The adjudication layer applies via negativa:

Here is what fails.
Here is what survives.
Here is what remains undecidable.
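This structure can be sketched as a filter: many candidates are supplied, each must survive every test, and failures are recorded rather than silently discarded. The hypotheses and test bodies here are toy stand-ins; a real system would run operational tests against evidence.

```python
# Adversarial adjudication sketch: hypothesis supply (via positiva)
# filtered by falsifying tests (via negativa). Candidates and test
# bodies are toy stand-ins.

def survives(hypothesis, tests):
    """Return (passed, failures) for one candidate against every test."""
    failures = [name for name, test in tests if not test(hypothesis)]
    return len(failures) == 0, failures

def adjudicate(candidates, tests):
    """Many hypotheses in; few survive. Failures are recorded, not hidden."""
    survived, failed = [], {}
    for h in candidates:
        ok, failures = survives(h, tests)
        if ok:
            survived.append(h["id"])
        else:
            failed[h["id"]] = failures
    return survived, failed

# Toy candidates tagged with properties the tests inspect.
candidates = [
    {"id": "h1", "consistent": True, "corresponds": True},
    {"id": "h2", "consistent": True, "corresponds": False},
]
tests = [
    ("internal consistency",    lambda h: h["consistent"]),
    ("external correspondence", lambda h: h["corresponds"]),
]

survived, failed = adjudicate(candidates, tests)
print(survived)  # ['h1']
print(failed)    # {'h2': ['external correspondence']}
```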

The productive structure is not either/or.

It is adversarial cooperation:

hypothesis supply
versus
hypothesis falsification

Or:

constructive generation
versus
adversarial adjudication

The output is no longer merely a plausible answer.

It is a tested claim, or a declaration that the claim cannot be decided under current information and constraints.


16. The Full Dependency Chain

The clean historical and technical dependency chain is:

Chomsky → recursive generativity
Transformers → continuous recursive contextual prediction
Natural Law / Runcible → recursive adjudication to decidability

Expanded:

Chomsky:
finite grammar produces indefinite linguistic structure.

Transformers:
finite learned weights recursively transform token histories into context identity
and prediction hypotheses.

Natural Law / Runcible:
finite operational protocols recursively test claims until they either achieve
closure or are identified as undecidable.

The primitives are related.

The closure conditions differ.

The institutional consequences differ radically.


17. Final Formulation

The strongest formulation is:

Chomsky discovered that language requires finite recursive generativity. Transformers operationalize a continuous version of that process by recursively disambiguating token histories into contextual representations and prediction hypotheses. My work generalizes the same mechanism beyond grammar and prediction into testimony, law, morality, and institutional action. Recursive disambiguation does not end when a plausible sentence is generated. It ends only when the claim has either achieved closure under the relevant tests or has been identified as undecidable.

Or, still more compressed:

Chomsky explains how finite recursion generates language.

LLMs show how finite learned weights recursively disambiguate context into
prediction hypotheses.

Natural Law and Runcible extend recursive disambiguation from prediction
to adjudication, converting generated hypotheses into decidable or
undecidable warrant-bearing claims.

So the answer is:

It is not different in the primitive insight.

It is different in the object, medium, and closure condition.

Chomsky applies finite recursion to syntax.

LLMs apply learned continuous recursion to context and prediction.

Natural Law and Runcible apply recursive disambiguation to claim identity,
truth, reciprocity, warrant, liability, and decidability.

That is the move from grammar to prediction to institutional judgment.