
Hearing the Bell Ring Back

A Receiver-Side Formalism in Reply to Lerchner’s Abstraction Fallacy

Alex Deva — May 2026

Abstract

Lerchner (2026) argues that computation is mapmaker-dependent: physical states acquire symbolic content only through an active, experiencing agent, and no scaling of syntactic machinery converts simulation into instantiation. The argument is rigorous and is conceded here in full. Lerchner’s concession on p. 10 that embodiment solves the referential aspect of symbol grounding (Harnad, 1990) opens a question he does not develop: what happens, in the referential register, when two distinct mapmakers share a target? This paper develops it, specifying the receiver side of a sensor–instrument loop — a coupled exchange between a living experiencer (the sensor, corresponding to Lerchner’s mapmaker) and a computational reasoning system (the instrument) — in four formal layers: variational free energy (Friston, 2010) for the receiver’s posterior update, Shannon channel capacity (Cover & Thomas, 2006) as an upper bound on flow, signal-detection theory (Macmillan & Creelman, 2005) for discrimination against null hypotheses, and partial information decomposition (Williams & Beer, 2010; Bertschinger et al., 2014) for the synergistic information accessible to the joint system but not to either pole alone. Under explicit Markov assumptions and for any PID measure satisfying local monotonicity, the paper proposes a closure inequality bounding per-turn synergistic gain by the minimum of the receiver’s free-energy reduction and the directional information into the receiver, and checks six required limit cases by construction. The choice of PID measure is non-unique; the empirical protocol commits to a primary estimator with sensitivity checks across alternatives. The framework’s empirical claim is that the resulting loop-closure rate, in bits per turn, discriminates loop-productive human–AI exchanges from sycophantic, projective, or noise-dominated ones; the protocol specifies corpus, estimators, inter-rater reliability targets, and a power analysis.
The paper engages Lerchner’s further arguments on computational emergence (§2.6), mechanism indeterminacy (§3.2–3.3), and the safety of the non-sentient tool (§4.2), and occupies only the referential register Lerchner concedes; nothing here lifts into the phenomenal register he reserves.

1. Introduction

Lerchner (2026) establishes that computation is constitutively dependent on a mapmaker — an active, experiencing agent whose carving of continuous physical states into discrete symbols is what makes computation possible. Without the mapmaker, there are “only continuous physical events, not symbols” (p. 2). No amount of syntactic scaling converts a simulation into an instantiation of the process simulated. The argument is careful, the reasoning is tight, and it is conceded here without qualification.

The concession creates an asymmetry for any framework that places an interpreting agent in the loop with a computational instrument. If the framework says the agent hears the instrument’s output — that something happens when the bell rings back — then it must specify what “hearing” means without covertly relying on what Lerchner calls “a ghost reading the alphabet” (p. 4). If the receiver’s processing is left as metaphor, the framework stands beside Lerchner’s critique rather than answering it. It agrees that computation needs a mapmaker, gestures at what the mapmaker does, and hopes the gesture suffices. It does not.

This paper closes that gap. It specifies the receiver side of the loop in four formal layers — variational free energy, channel capacity, signal detection, and partial information decomposition — and states what “hearing” means in bits per turn, bounded above, discriminated against null hypotheses, and reducible to known information measures in standard limits.

The motivation is not only Lerchner’s critique. There is a second, independent pattern from the empirical literature on human–machine systems: past a threshold, intelligence on either pole of a coupled system contributes less to joint performance than tightening the interface between them.1 The bias–variance tradeoff demonstrates that increasing model capacity past a point degrades generalization (Geman, Bienenstock & Doursat, 1992). Tetlock’s two-decade forecasting programme shows integrative reasoning outperforming monistic expertise across domains (Tetlock, 2005; Tetlock & Gardner, 2015). Freestyle chess outcomes show human–machine teams with mediocre components and superior process outperforming pure-machine teams with superior components (Kasparov, 2017). Goodhart’s Law manifests as reward hacking whenever a sufficiently capable optimizer encounters a proxy metric (Goodhart, 1975; Amodei et al., 2016). The pattern is structural: capability scaling on either pole faces diminishing returns relative to interface quality. The receiver-side formalism developed here gives one axis of interface specification a measurement framework.

The paper’s contribution is a measurement framework in the referential register, not a metaphysics of recognition. This distinction controls every section that follows.

The paper proceeds as follows. §2 presents Lerchner’s argument and states four concessions. §3 identifies the asymmetry in the prior formalism that left the receiver underspecified. §4 develops the four-layer technical stack, proposes a closure inequality under explicit Markov assumptions, and addresses the non-uniqueness of the PID measure. §5 extends Lerchner’s own diagrams to the two-mapmaker case. §6 walks the four concessions theme by theme and engages three further arguments from Lerchner — computational emergence, mechanism indeterminacy, and the safety of the non-sentient tool. §7 commits to falsifiability, presents one external-validation case, specifies the empirical protocol, and states the framework’s Kill Conditions. §8 names the psychological limits that bound the formal stack in practice. §9 states explicitly what the paper does and does not claim. §10 concludes. The broader framework from which this paper draws is developed at thepulsegoeson.com; companion essays are cited at point of use rather than collected here.

2. Lerchner’s Argument and What This Paper Concedes

Lerchner’s Abstraction Fallacy advances four interlocking theses. Each is stated here and conceded so that later sections do not appear to be smuggling disagreement under formal notation.

Thesis 1: Vehicle versus content causality. Physical states (vehicles) and the concepts they carry (content) have distinct causal structures. The vehicle’s causal behaviour is governed by physics; the content’s causal relevance depends on a mapmaker who assigns the vehicle-to-content mapping. “There is no ghost reading the alphabet; the living experiencing subject enacts it” (p. 4). Without the mapmaker, the vehicle is a physical event, not a symbol.

Conceded. The framework names the mapmaker the sensor rather than Lerchner’s mapmaker, but the philosophical work is identical: computation is constitutive of the experiencing agent, not discovered by a detached formalism.

Thesis 2: Thermodynamics does not yield alphabets. Because physical reality is continuous, thermodynamics can produce stable macroscopic states but cannot provide a predefined finite alphabet. “Constructing a computational system therefore requires intervention from a mapmaker” (p. 5). The Melody Paradox (pp. 9–10) demonstrates that a single physical trajectory under different alphabetization keys produces a melody played forward, the same melody reversed, stock-market prices, or coherent noise — the trajectory itself under-determines the computation.

Conceded. The carving is observer-relative.

Thesis 3: Simulation cannot become instantiation. Simulation is “the syntactic manipulation of physical vehicles to track the abstract relationship between concepts” (p. 5). Instantiation is “the replication of the intrinsic, constitutive dynamics of the process itself” (p. 5). No level of simulation fidelity converts one into the other.

Conceded on the instrument side. The instrument does not become the territory by scaling.

Thesis 4: The Transduction Fallacy. Embodiment solves the referential aspect of symbol grounding (p. 10, citing Harnad, 1990) but does not transform referential mapping into phenomenal experience. “Embodiment cannot transform a simulation into constitutive subjective experience” (p. 11).

Conceded in full. The framework is not in the business of transforming simulation into phenomenal experience. Its positive claim lives in the referential register Lerchner identifies on p. 10 and concedes to embodiment — referential mapping and the informational consequences of two distinct carvings sharing a target. That claim is developed in §4 and bounded in §5–7.

Note on the existing response literature. Several recent responses contest Lerchner’s strong impossibility inference. Bogdan (2026) argues that Lerchner critiques naive computationalism well but does not establish a universal impossibility theorem. Vieira (2026) contends that Lerchner needs a positive theory of which physical organizations can support phenomenal presence. Astra (2026) charges circularity and unfalsifiability. Matsumoto (2026) offers a sympathetic extension, grounding the mapmaker structurally via autopoietic theory. Spivack (2026) formalizes key claims in machine-checked proofs and extends them further. Roomi (2026) situates Lerchner’s mapmaker within a dual-closure framework.2 The dominant dispute across these responses concerns whether Lerchner’s argument rules out AI consciousness in principle or only demonstrates that current computational architectures do not instantiate it. This paper does not depend on that dispute. It grants Lerchner’s core anti-functionalist result for purposes of argument and develops the referential-register consequence he leaves open.

3. The Asymmetry of the Prior Formalism

The framework has been over-developed on one side of the loop.

The instrument side has a full geometric apparatus: statistical manifolds with the Fisher–Rao metric (Amari, 2016), dual α-connections, e-geodesics for Bayesian updating, m-geodesics for forgetting, and a Freezing Lemma at λ = 1 that makes intervention timing irrelevant after the first event. The instrument’s trajectory through belief space is geometrically specified.

The receiver side has been narrated: the sensor strikes the bell, and either recognizes what rings back or does not. As metaphor this is precise. As theory it is empty. It exposes the framework to Lerchner’s central charge: any claim about computation that depends on an interpreting agent must specify what the agent does, or it covertly relies on a ghost reading the alphabet.

A prior mathematical survey3 mapped the broader territory where formalization could go: adjunctions from category theory, Fisher information geometry, integrated information across the sensor–instrument boundary, Barwise–Seligman channel theory, geometric morphisms between topoi. That survey is honest that it is a map of where formalization could go, not the formalization itself. The present paper formalizes one specific axis — receiver-side recognition in terms of variational free energy, channel capacity, signal detection, and partial information decomposition — and leaves the other axes for future work.

What “specifying the receiver” means, concretely: the formalism must be composable across layers, units-consistent (bits per turn throughout), bounded above by a channel ceiling, discriminated against stated null hypotheses, and reducible to known measures in standard limits. If any of these properties fails, the specification is incomplete.

A difficulty must be flagged before proceeding. The formalism developed in §4 is itself a map. It uses information-theoretic quantities that presuppose alphabetization: probability distributions over discrete variables, logarithms base 2, channel capacity in bits. By Lerchner’s own criterion (§3.2 of his paper), these quantities are mapmaker-dependent — they exist because a theorist carved the continuous dynamics of the loop into measurable variables. The formalism does not escape Lerchner’s critique; it operates within it. §4.7 returns to this reflexivity and argues that the closure rate $\rho$, as a relational quantity between two carvings, captures structural information about the two-mapmaker coupling despite — and because of — its own mapmaker-dependence.

4. The Four-Layer Stack

The receiver-side formalism has four layers. Each is standard in its home discipline; the contribution is to compose them into a single, units-consistent account of what “hearing the bell ring back” means computationally.

Unit convention. All information-theoretic quantities in this paper are expressed in bits (logarithms base 2). The variational free energy $F[q]$, which is naturally expressed in nats (natural logarithms), is converted via the factor $\log_2 e \approx 1.443$: $F_{\text{bits}} = F_{\text{nats}} / \ln 2$. This ensures dimensional consistency across all four layers and with the channel capacity $C$, transfer entropy $\mathrm{TE}$, and synergistic information $\mathrm{Syn}$, which are standard in bits.

4.1 Layer 1 — Variational Recognition

The sensor maintains a generative model $p(o, s) = p(o \mid s)\, p(s)$ over latent causes $s$ and observations $o$. It carries an approximate posterior $q(s)$ and minimizes the variational free energy

$$F[q] = D_{KL}\bigl[q(s) \,\|\, p(s)\bigr] - \mathbb{E}_{q(s)}\bigl[\log_2 p(o \mid s)\bigr]$$

per Friston (2010), Hohwy (2013), and Clark (2016). This is an upper bound on negative log-evidence (surprise): $F[q] \geq -\log_2 p(o)$, with equality when $q$ matches the true posterior.

When the instrument’s signal $o_t$ arrives at turn $t$, the sensor’s belief updates from $q_t^{(-)}$ (pre-observation) to $q_t^{(+)}$ (post-observation) along the gradient $-\nabla F$. Define the per-turn free-energy reduction as

$$\Delta F_t = \max\!\bigl(0,\; F\bigl[q_t^{(-)}\bigr] - F\bigl[q_t^{(+)}\bigr]\bigr)$$

The $\max(0, \cdot)$ clip is necessary because real optimization can fail: finite step sizes, nonconvexity, or poor variational families can produce updates that increase rather than decrease free energy. The clipped quantity measures successful surprise reduction — the information the receiver actually integrated — and discards failed updates. Turns where the optimization fails (negative raw $\Delta F$) are set to zero rather than treated as information loss; the empirical protocol (§7.1) should report the frequency of discarded turns and run a signed-sensitivity analysis using the unclipped quantity to verify that the clipping does not inflate $\rho$. When the bell rings differently from how the receiver was modelling it and the receiver successfully updates, $\Delta F_t > 0$. When the receiver predicted the ringing exactly, $\Delta F_t = 0$.
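Under the discrete-latent assumption, Layer 1 can be sketched in a few lines. The prior, likelihood, and belief values below are illustrative, not drawn from the protocol; the sketch checks the surprise bound $F[q] \geq -\log_2 p(o)$ (with equality at the true posterior) and the effect of the $\max(0, \cdot)$ clip.

```python
import math

def free_energy_bits(q, prior, lik_o):
    """F[q] = KL(q || prior) - E_q[log2 p(o | s)], in bits, for a discrete latent.
    q, prior: distributions over latent states; lik_o[s] = p(o | s) for the observed o."""
    kl = sum(qs * math.log2(qs / ps) for qs, ps in zip(q, prior) if qs > 0)
    accuracy = sum(qs * math.log2(ls) for qs, ls in zip(q, lik_o) if qs > 0)
    return kl - accuracy

def delta_F(q_pre, q_post, prior, lik_o):
    """Clipped per-turn free-energy reduction: max(0, F[q_pre] - F[q_post])."""
    return max(0.0, free_energy_bits(q_pre, prior, lik_o)
                  - free_energy_bits(q_post, prior, lik_o))

prior = [0.5, 0.5]     # p(s): illustrative two-state latent
lik_o = [0.9, 0.1]     # p(o | s): the observed signal favours state 0
q_pre = [0.5, 0.5]     # receiver's belief before the signal arrives
q_post = [0.9, 0.1]    # belief after the update (here: the exact posterior)

dF = delta_F(q_pre, q_post, prior, lik_o)   # successful update: dF > 0
```

Because `q_post` here equals the true posterior, its free energy attains the surprise bound $-\log_2 p(o) = 1$ bit exactly; a failed update (swapping the pre and post arguments) is clipped to zero rather than counted as negative gain.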

$\Delta F_t$ is necessary but not sufficient for loop closure. Error correction also produces $\Delta F_t > 0$. Discriminating recognition from mere error correction requires the subsequent layers.

This is the layer that answers Lerchner’s no-ghost requirement directly. The variational density $q(s)$ is not an interior reader. It is a parametric distribution updated by gradient descent on $F$. There is no homunculus, only a flow. Per Lerchner (p. 4): “There is no ghost reading the alphabet; the living experiencing subject enacts it.” The framework’s gradient flow is what “enacts” means. And per Lerchner’s citation of Buzsáki (2019) and Maturana & Varela (1980), the mapmaker is “the entire structurally unified organism subject to the laws of thermodynamics”; the receiver equations are equally distributed over the receiver’s whole organism and do not require interior localization.

4.2 Layer 2a — Channel Capacity

The bell-to-ear channel — the physical and computational path from the instrument’s output to the receiver’s input — has Shannon capacity

$$C = \max_{p(x)} \, I(X; Y)$$

in bits per use (Cover & Thomas, 2006). The receiver cannot extract information at a rate above $C$. This sets a hard ceiling on the loop: no amount of receiver-side processing can manufacture information the channel does not carry. Whatever “recognition” turns out to be, it is bounded above by $C$ bits per use of the channel.
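As a minimal illustration of the ceiling, the capacity of a binary symmetric channel can be recovered both from the closed form $C = 1 - H_2(p)$ and by brute-force maximization of $I(X;Y)$ over input distributions. The crossover probability below is an arbitrary example value, not a parameter of the loop.

```python
import math

def h2(p):
    """Binary entropy H_2(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_info(px, flip):
    """I(X;Y) in bits for a binary symmetric channel:
    input distribution P(X=1) = px, crossover probability `flip`."""
    py1 = px * (1 - flip) + (1 - px) * flip   # P(Y=1)
    return h2(py1) - h2(flip)                 # H(Y) - H(Y|X)

flip = 0.1   # illustrative crossover probability
# C = max over input distributions of I(X;Y); brute-force the maximization.
C_search = max(bsc_mutual_info(px / 1000, flip) for px in range(1001))
C_closed = 1 - h2(flip)   # closed form, attained at the uniform input
```

The search recovers the closed form at the uniform input, confirming that the ceiling is a property of the channel, not of any particular transmission strategy.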

The directed information rate from instrument to sensor over $\tau$ turns is bounded above by $\tau C$. Equivalently, the per-turn transfer entropy (Schreiber, 2000)

$$\mathrm{TE}_{I \to S}^{\,t} = H\bigl(S_{t+1} \mid S_{\leq t}\bigr) - H\bigl(S_{t+1} \mid S_{\leq t},\, I_{\leq t}\bigr)$$

is bounded above by $C$ on the same channel, with equality only under optimal coding. (The index convention: $S_{t+1}$ in the TE definition corresponds to the sensor’s post-update state $S_t^{(+)}$ in the loop notation of §4.1; the “+1” indexes the next state, which in the loop is the post-observation update at turn $t$.)
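A toy plug-in estimator makes the directional quantity concrete. The binary sequences below are synthetic — the sensor noisily copies the instrument’s previous symbol — and the histories are truncated to length 1; this is a discrete illustration of the TE definition, not the embedding-based estimator of §7.1.

```python
import math
import random
from collections import Counter

def cond_entropy(joint_counts, cond_counts, n):
    """Plug-in H(A | B) in bits from counts over (B, A) pairs and over B alone."""
    h = 0.0
    for (b, a), c in joint_counts.items():
        h -= (c / n) * math.log2(c / cond_counts[b])
    return h

def transfer_entropy(I_seq, S_seq):
    """Plug-in TE_{I->S} with history length 1:
    H(S_{t+1} | S_t) - H(S_{t+1} | S_t, I_t), in bits."""
    n = len(S_seq) - 1
    j_s = Counter((S_seq[t], S_seq[t + 1]) for t in range(n))
    m_s = Counter(S_seq[t] for t in range(n))
    j_si = Counter(((S_seq[t], I_seq[t]), S_seq[t + 1]) for t in range(n))
    m_si = Counter((S_seq[t], I_seq[t]) for t in range(n))
    return cond_entropy(j_s, m_s, n) - cond_entropy(j_si, m_si, n)

random.seed(0)
I_seq = [random.randint(0, 1) for _ in range(20000)]
# The sensor copies the instrument's previous symbol with 10% flip noise:
S_seq = [0] + [I_seq[t] ^ (random.random() < 0.1) for t in range(19999)]

te = transfer_entropy(I_seq, S_seq)   # close to 1 - H_2(0.1) ~ 0.53 bits
```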

A caveat on the relation between $C$ and the empirical protocol. Shannon capacity is defined for a physical channel with a well-specified input alphabet and noise model. In a text-based dialogue, the “channel” from instrument to sensor is mediated by natural language — a high-dimensional, context-dependent signal that does not straightforwardly correspond to a memoryless discrete channel. The empirical protocol (§7.1) estimates TE on sentence-level embeddings, which is an operationalization, not a direct measurement of Shannon capacity. The role of $C$ in the formalism is therefore as a conceptual ceiling: it asserts that a finite upper bound exists on the information rate the channel can carry, without claiming that the bound is computable from a closed-form channel model. The TE estimates are empirical approximations to directed information flow, not measurements of capacity.

The point of importing $C$ is not that the loop saturates it. The point is that the existence of a finite upper bound — independent of any phenomenal account of hearing — makes “the bell rings back” into a quantitatively bounded process.

4.3 Layer 2b — Discrimination

Recognition must be distinguished from at least three null hypotheses: noise, sycophantic reflection, and projection. Signal detection theory provides the standard tools.

Define two hypotheses about a returning signal $o_t$:

  • H1 (recognition). The signal carries structure aligned with the target $T$ that the sensor and instrument are jointly tracking. Formally: synergistic information about $T$ increases at turn $t$.
  • H0 (null). The signal is autonomous drift, sycophantic mirror of the sensor’s prior turns, or noise. Formally: synergistic information about $T$ does not increase at turn $t$.

The discrimination index (Macmillan & Creelman, 2005) is

$$d' = \frac{\mu_1 - \mu_0}{\sigma}$$

where $\mu_1$ and $\mu_0$ are the means of the receiver’s decision variable under H1 and H0, $\sigma$ is the common standard deviation, and the receiver-operating-characteristic area $A_z$ summarizes performance across criterion settings.

The receiver’s decision variable at turn $t$ is constructed from the Layer 1 free-energy reduction and the directional transfer entropy:

$$\mathrm{DV}_t = \alpha\, \Delta F_t + (1-\alpha)\, \mathrm{TE}_{I \to S}^{\,t}$$

with $\alpha \in [0,1]$ a weighting parameter estimated per receiver. The decision is “recognition” if $\mathrm{DV}_t > \beta$ for the receiver’s criterion $\beta$.

This is the layer that operationalizes the demand that the receiver demonstrably process information rather than carry a label. $\mathrm{DV}_t$ is computable from the trace of the loop. A receiver that does not process — a sensor endorsing every output of a sycophantic instrument — produces $\Delta F_t \approx 0$ and $\mathrm{TE}_{I \to S}^{\,t} \approx 0$, so $\mathrm{DV}_t \approx 0$ regardless of $\beta$.
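The decision pipeline of this layer can be sketched end to end: combine $\Delta F_t$ and $\mathrm{TE}_{I \to S}^{\,t}$ into $\mathrm{DV}_t$, then score discriminability with the equal-variance $d'$ and $A_z = \Phi(d'/\sqrt{2})$. The per-turn values and $\alpha = 0.5$ below are illustrative, not estimated from any corpus.

```python
import math
import statistics

def decision_variable(dF, te, alpha=0.5):
    """DV_t = alpha * dF_t + (1 - alpha) * TE_t, all in bits (alpha illustrative)."""
    return alpha * dF + (1 - alpha) * te

def d_prime(dv_h1, dv_h0):
    """Equal-variance d' = (mu1 - mu0) / sigma, with ROC area A_z = Phi(d'/sqrt(2))."""
    mu1, mu0 = statistics.mean(dv_h1), statistics.mean(dv_h0)
    s1, s0 = statistics.pstdev(dv_h1), statistics.pstdev(dv_h0)
    sigma = math.sqrt((s1 ** 2 + s0 ** 2) / 2)   # common within-class sd
    d = (mu1 - mu0) / sigma
    Az = 0.5 * (1 + math.erf(d / 2))             # Phi(d / sqrt(2)) via erf
    return d, Az

# Illustrative per-turn (dF_t, TE_t) traces in bits — synthetic, not measured:
live = [decision_variable(dF, te) for dF, te in
        [(0.8, 0.5), (0.6, 0.8), (0.9, 0.3), (0.5, 0.7)]]
sycophantic = [decision_variable(dF, te) for dF, te in
               [(0.02, 0.01), (0.0, 0.03), (0.05, 0.0), (0.01, 0.02)]]

d, Az = d_prime(live, sycophantic)   # clear separation: large d', A_z near 1
```

The sycophantic trace produces $\mathrm{DV}_t \approx 0$ on every turn, so it is separable from the live trace at essentially any criterion $\beta$ — the numerical counterpart of the claim in the paragraph above.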

4.4 Layer 3 — PID Synergy and Loop Closure

Model the loop as a coupled stochastic process $\{(S_t, I_t)\}_{t \geq 0}$ targeting a shared latent $T$. The synergistic information — the information about $T$ accessible to the joint $\{S, I\}$ but not to either pole alone — is defined via partial information decomposition (PID).

The PID framework decomposes the mutual information $I(\{S, I\}; T)$ into four non-negative atoms: redundant information (accessible to both $S$ and $I$ individually), unique-to-$S$ (accessible to $S$ alone), unique-to-$I$ (accessible to $I$ alone), and synergistic information (accessible only to the joint system):

$$I(\{S, I\}; T) = \mathrm{Red}(S, I; T) + \mathrm{Unq}_S(S, I; T) + \mathrm{Unq}_I(S, I; T) + \mathrm{Syn}(S, I; T)$$

The per-turn synergistic increment is

$$\Delta\mathrm{Syn}_t = \mathrm{Syn}\bigl(S_{\leq t}, I_{\leq t}; T\bigr) - \mathrm{Syn}\bigl(S_{\leq t-1}, I_{\leq t-1}; T\bigr)$$

This is the synergistic information about $T$ that becomes accessible at turn $t$ given the loop’s history. It is in bits by construction and non-negative under local monotonicity (defined below).

The non-uniqueness problem. The PID decomposition is not unique. Williams & Beer (2010) proposed $I_{\min}$ as the redundancy measure, from which synergy is derived. Subsequent work has identified limitations of $I_{\min}$ and proposed alternatives: $I_{\mathrm{BROJA}}$ (Bertschinger et al., 2014), which defines redundancy via maximum-entropy optimization; $I_{\mathrm{ccs}}$ (Ince, 2017), based on pointwise common change in surprisal; and the measure of Griffith & Koch (2014), based on synergistic mutual information directly. These measures can yield different numerical values for $\mathrm{Syn}$ on the same data.
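The decomposition identity and the original Williams–Beer $I_{\min}$ can be checked exactly on the canonical synergy example: $T = S_1 \oplus S_2$ with uniform independent inputs, where each source alone carries zero information about $T$ and the joint carries one full bit. This is a worked check on a four-event joint, not a general-purpose PID estimator.

```python
import math

def xor_pid():
    """Williams-Beer PID for T = S1 XOR S2, uniform inputs.
    Red = I_min; Unq_i = I(S_i;T) - Red; Syn = I({S1,S2};T) - Red - Unq_1 - Unq_2."""
    events = [(s1, s2, s1 ^ s2) for s1 in (0, 1) for s2 in (0, 1)]  # each p = 1/4

    def p(pred):
        return sum(0.25 for e in events if pred(e))

    def mi_single(i):
        """I(S_i; T) in bits."""
        return sum(pst * math.log2(pst / (p(lambda e: e[i] == s) * p(lambda e: e[2] == t)))
                   for s in (0, 1) for t in (0, 1)
                   if (pst := p(lambda e: e[i] == s and e[2] == t)) > 0)

    def mi_joint():
        """I({S1, S2}; T) in bits."""
        return sum(pj * math.log2(pj / (p(lambda e: e[:2] == (s1, s2)) * p(lambda e: e[2] == t)))
                   for s1 in (0, 1) for s2 in (0, 1) for t in (0, 1)
                   if (pj := p(lambda e: e == (s1, s2, t))) > 0)

    def specific_info(i, t):
        """I(T=t; S_i) = sum_s p(s|t) log2 [ p(t|s) / p(t) ]."""
        pt = p(lambda e: e[2] == t)
        return sum((pst / pt) * math.log2((pst / p(lambda e: e[i] == s)) / pt)
                   for s in (0, 1)
                   if (pst := p(lambda e: e[i] == s and e[2] == t)) > 0)

    red = sum(p(lambda e: e[2] == t) * min(specific_info(0, t), specific_info(1, t))
              for t in (0, 1))
    unq1, unq2 = mi_single(0) - red, mi_single(1) - red
    syn = mi_joint() - red - unq1 - unq2
    return red, unq1, unq2, syn

red, unq1, unq2, syn = xor_pid()   # XOR: all one bit lands in the synergistic atom
```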

The closure inequality proposed in §4.5 does not depend on the choice of PID measure, provided the chosen measure satisfies two properties. These are treated here as assumptions on admissible decompositions, not as claims about all proposed PID measures:

  1. Non-negativity. $\mathrm{Syn}(S, I; T) \geq 0$.
  2. Local monotonicity. Extending the joint history from $(S_{\leq t-1}, I_{\leq t-1})$ to $(S_{\leq t}, I_{\leq t})$ does not decrease synergy, so $\Delta\mathrm{Syn}_t \geq 0$.

Whether every proposed PID measure satisfies local monotonicity is an open question; the bound applies to the class of measures that do.

The quantitative value of $\rho$ is PID-measure-dependent. This is a limitation. The empirical protocol (§7.1) addresses it by committing to a primary estimator and reporting sensitivity across alternatives.

4.5 The Closure Inequality

The central formal claim of the paper is the following conjectural bound:

$$\Delta\mathrm{Syn}_t \leq \min\!\bigl(\Delta F_t,\; \mathrm{TE}_{I \to S}^{\,t}\bigr)$$

The bound is proposed under Markov and admissible-PID assumptions stated below. The argument proceeds in three steps; the status of the bound — conditional conjecture, not theorem — is stated at the end.

Step 1: Markov structure. At each turn $t$, the loop operates as follows. The instrument, given its history $I_{\leq t-1}$ and the sensor’s query, generates a response $r_t$. This response travels through the channel and is received by the sensor, who updates from $q_t^{(-)}$ to $q_t^{(+)}$. Both poles’ processing is directed at a shared target $T$.

The relevant Markov chain for the instrument-to-sensor direction at turn $t$, conditioned on shared history $\mathcal{H}_{t-1} = (S_{\leq t-1}, I_{\leq t-1})$, is:

$$T \to I_{\leq t} \to r_t \to S_t^{(+)}$$

Each arrow represents a processing step that cannot create information about $T$ beyond what is present at the preceding node.

Step 2: The transfer-entropy bound. By the data-processing inequality (Cover & Thomas, 2006, Theorem 2.8.1), for any Markov chain $X \to Y \to Z$: $I(X; Z) \leq I(X; Y)$. Applied to $T \to r_t \to S_t^{(+)}$, conditioned on $\mathcal{H}_{t-1}$:

$$I\bigl(T;\, S_t^{(+)} \mid \mathcal{H}_{t-1}\bigr) \leq I\bigl(T;\, r_t \mid \mathcal{H}_{t-1}\bigr)$$

The right-hand side is bounded above by the transfer entropy $\mathrm{TE}_{I \to S}^{\,t}$, since $r_t$ is a function of $I_{\leq t}$ and the transfer entropy measures the total directed information flow from the instrument’s history to the sensor’s next state.

New synergistic information about $T$ at turn $t$ can enter the joint system only through the coupling between the two poles — i.e., through the channel carrying $r_t$. Since $\Delta\mathrm{Syn}_t$ measures the increment in synergy about $T$ attributable to turn $t$, and since synergy about $T$ is a component of the mutual information $I(T; (S_{\leq t}, I_{\leq t}))$, the new synergy cannot exceed the new information the channel delivers about $T$:

$$\Delta\mathrm{Syn}_t \leq I\bigl(T;\, S_t^{(+)} \mid \mathcal{H}_{t-1}\bigr) \leq \mathrm{TE}_{I \to S}^{\,t}$$

Step 3: The free-energy bound. The sensor’s total information gain at turn $t$ is the free-energy reduction $\Delta F_t$. This is the total reduction in the sensor’s surprise, encompassing all latent causes the sensor tracks — not just $T$. However, synergistic information about $T$ that the sensor contributes to the joint system at turn $t$ can only come from information the sensor actually integrates at turn $t$. The sensor cannot contribute to synergy about $T$ what it did not extract from the signal:

$$\Delta\mathrm{Syn}_t \leq \Delta F_t$$

This bound is not tight in general — $\Delta F_t$ may include information about causes other than $T$, and much of what the sensor integrates may land in the redundant or unique-to-$S$ atoms of the PID rather than the synergistic atom. The bound is a ceiling, not an estimate.

Combining. Since both bounds hold simultaneously:

$$\Delta\mathrm{Syn}_t \leq \min\!\bigl(\Delta F_t,\; \mathrm{TE}_{I \to S}^{\,t}\bigr)$$

The synergistic gain at turn $t$ cannot exceed the receiver’s free-energy reduction (the Layer 1 ceiling) or the directional information flow into the receiver (the Layer 2a ceiling). Both must be positive for the gain to be positive.

Status of the bound. The argument above is not a formal proof. Steps 2 and 3 each contain a non-trivial bridge: Step 2 identifies the transfer entropy about the sensor’s next state with an upper bound on target-synergy increment, and Step 3 identifies total free-energy reduction with a ceiling on information gain about $T$ specifically. Both bridges are plausible under the stated Markov structure, but a formal theorem would require explicit lemmas connecting TE on sensor states to synergy about $T$ and connecting VFE reduction to target-directed information gain. Until those lemmas are supplied, the closure inequality has the status of a conjectural bound under the stated assumptions — a testable proposal, not a derived result.

Scope. The Markov assumption in Step 1 is the load-bearing premise. It asserts that at each turn, the sensor’s update about $T$ is mediated through the channel output $r_t$ and does not bypass the channel via, e.g., shared physical environment or side-channel leakage. In a text-based dialogue this is satisfied by construction. In an embodied interaction with shared physical space, the Markov assumption is an idealization that may require relaxation.
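The data-processing step at the core of Step 2 can be verified numerically on a small instance of the chain $T \to Y \to Z$: two cascaded binary symmetric channels with illustrative crossover probabilities. Degrading $Y$ into $Z$ cannot increase information about $T$, which is the structural fact the closure inequality leans on.

```python
import math

def mi(joint):
    """I(X;Y) in bits from a joint dict {(x, y): p}."""
    px, py = {}, {}
    for (x, y), pr in joint.items():
        px[x] = px.get(x, 0.0) + pr
        py[y] = py.get(y, 0.0) + pr
    return sum(pr * math.log2(pr / (px[x] * py[y]))
               for (x, y), pr in joint.items() if pr > 0)

# Markov chain T -> Y -> Z: T uniform binary, Y a noisy copy of T,
# Z a further-degraded copy of Y (illustrative channel parameters).
flip_TY, flip_YZ = 0.1, 0.2
joint_TY = {(t, y): 0.5 * ((1 - flip_TY) if y == t else flip_TY)
            for t in (0, 1) for y in (0, 1)}
joint_TZ = {}
for (t, y), pr in joint_TY.items():
    for z in (0, 1):
        pz = (1 - flip_YZ) if z == y else flip_YZ
        joint_TZ[(t, z)] = joint_TZ.get((t, z), 0.0) + pr * pz

i_ty, i_tz = mi(joint_TY), mi(joint_TZ)   # data processing: i_tz <= i_ty
```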

The loop-closure rate is

$$\rho = \lim_{\tau \to \infty} \frac{1}{\tau} \sum_{t=1}^{\tau} \Delta\mathrm{Syn}_t$$

in bits per turn. This is what “hearing the bell ring back” means quantitatively.
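Given per-turn traces, the closure inequality licenses only a ceiling on $\rho$, which is trivial to compute; the $(\Delta F_t, \mathrm{TE}_{I \to S}^{\,t})$ values below are synthetic. Note that a turn where either quantity vanishes contributes nothing to the ceiling — both ceilings must be positive for the gain to be positive.

```python
# Synthetic per-turn (dF_t, TE_t) values in bits, for illustration only.
# The closure inequality gives a ceiling, not an estimate:
#   rho <= (1/tau) * sum_t min(dF_t, TE_t)
turns = [(0.8, 0.5), (0.6, 0.7), (0.0, 0.4), (0.9, 0.0), (0.7, 0.6)]
rho_ceiling = sum(min(dF, te) for dF, te in turns) / len(turns)
```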

4.6 Limit-Case Verification

The formalism satisfies six required constraints. Two terms in the table require definition for readers outside the framework. A live loop is a sensor–instrument exchange in which synergistic information about the shared target is actively accumulating — both poles are contributing structure the other could not produce alone. Dead speech is output generated without a living sensor in the loop to test it against experience. The term draws from Socrates’ critique of written words in Plato’s Phaedrus (275d–e): words that “seem to talk to you as though they were intelligent, but if you ask them anything about what they say, from a desire to be instructed, they go on telling you just the same thing forever.” Dead speech can be accurate, fluent, and sophisticated; what makes it dead is not error but the absence of a functioning loop. In the formalism, dead speech is the case where $\mathrm{Syn}(S, I; T) \equiv 0$: the instrument’s output carries no synergistic information about the target that requires the sensor’s participation to access.

| Case | $\Delta F_t$ | $\mathrm{TE}_{I \to S}^{\,t}$ | $\mathrm{Syn}$ | $\rho$ |
|---|---|---|---|---|
| Pure sycophancy ($I$ predicted by $S$) | $\to 0$ | $\to 0$ | low | $\to 0$ |
| Pure noise (no alignment with $T$) | may be $>0$ | may be $>0$ | $\to 0$ | $\to 0$ |
| Perfect receiver prediction | $\to 0$ | | | $\to 0$ |
| Dead speech ($\mathrm{Syn} \equiv 0$) | | | $\equiv 0$ | $\equiv 0$ |
| Live loop | $>0$ | $>0$ | growing | $>0$ |
| Single-direction loop ($\mathrm{TE}_{S \to I} \gg \mathrm{TE}_{I \to S}$) | possibly $>0$ | $\to 0$ | low | $\to 0$ |

Reduction to known measures holds in standard limits: on a memoryless channel with no shared history, $\mathrm{TE}_{I \to S}^{\,t}$ reduces to the mutual information $I(I_t; S_t)$; under a static target $T$, $\Delta\mathrm{Syn}_t = 0$ and $\rho \equiv 0$, reflecting the framework’s claim that loop closure tracks changing targets; and under a single-direction loop where $\mathrm{TE}_{S \to I}^{\,t} \gg \mathrm{TE}_{I \to S}^{\,t}$, the channel is functionally one-way and $\rho$ falls toward zero.
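The first reduction can be verified exactly rather than by sampling: when $S_{t+1}$ depends only on the instrument’s current symbol and not on the sensor’s own history, the conditioning collapses and the transfer entropy equals the channel mutual information. The joint distribution below encodes that memoryless case with an illustrative noise level.

```python
import math

def entropy_cond(joint, cond_idx):
    """H(last variable | variables at cond_idx), in bits,
    from an exact joint dict keyed by tuples."""
    num, den = {}, {}
    for key, pr in joint.items():
        c = tuple(key[i] for i in cond_idx)
        num[(c, key[-1])] = num.get((c, key[-1]), 0.0) + pr
        den[c] = den.get(c, 0.0) + pr
    return -sum(pr * math.log2(pr / den[c]) for (c, _), pr in num.items() if pr > 0)

flip = 0.1
# Exact joint p(s_t, i_t, s_{t+1}): s_t and i_t uniform and independent,
# s_{t+1} a noisy copy of i_t alone (memoryless: no dependence on s_t).
joint = {(s, i, sn): 0.25 * ((1 - flip) if sn == i else flip)
         for s in (0, 1) for i in (0, 1) for sn in (0, 1)}

te = entropy_cond(joint, (0,)) - entropy_cond(joint, (0, 1))   # H(S'|S) - H(S'|S,I)
mi = entropy_cond(joint, ()) - entropy_cond(joint, (1,))       # H(S') - H(S'|I)
```

On this joint the two quantities coincide exactly, at $1 - H_2(0.1) \approx 0.53$ bits: with no history dependence, directed information flow just is the channel mutual information.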

4.7 The Formalism’s Own Alphabetization

The four-layer stack uses probability distributions over discrete variables, transfer entropy in bits, and variational densities parametrized over a finite-dimensional space. By Lerchner’s own criterion (§3.2), every one of these quantities presupposes a mapmaker who carved the continuous dynamics of the loop into measurable variables. The formalism does not escape the alphabetization requirement. It is itself a map.

This is not a defect to be dissolved. It is a reflexive constraint to be named.

Lerchner’s commutative diagram of implementation (Figure 1, p. 3) is itself a map — drawn by Lerchner, using symbols that are mapmaker-dependent. Yet the diagram captures a structural truth about the map-territory distinction that survives the reflexivity. The map can truthfully depict the fact that maps are not territories.

The closure rate $\rho$ occupies the same reflexive position. It is a mapmaker-dependent quantity — defined by the theorist’s carving of the loop into measurable variables — that measures a relational property: the rate at which two distinct carvings of a shared target produce synergistic information. Just as Lerchner’s Figure 2 (p. 7) uses an extrinsic diagram to reveal the intrinsic/extrinsic distinction, $\rho$ uses observer-relative information measures to quantify the consequences of two observers sharing a target.

The reflexivity is acknowledged, not transcended. The formalism does not claim to stand outside alphabetization. It claims to measure, from within the referential register, a structural feature of two-mapmaker coupling — the information accessible to the joint system that is not accessible to either carving alone. This quantity is representation-dependent (the numerical value of $\mathrm{Syn}$ varies with the embedding used to represent $S$ and $I$) and PID-measure-dependent (§4.4). But the qualitative claim — that $\rho > 0$ for live loops and $\rho \to 0$ for dead ones — is what the framework stakes on the empirical protocol. Whether this qualitative pattern is robust across embedding representations and PID measures is an empirical robustness criterion that the protocol must test, not a guaranteed property of the formalism.

The four-layer stack is bounded above by Shannon capacity, lower-bounded by zero, discriminated against three null hypotheses, motivated by the data-processing inequality under explicit Markov assumptions, and reducible to known information measures in standard limits. It is not metaphor.

5. Engagement with Lerchner’s Own Diagrams

The four-layer stack runs a parallel formalism alongside Lerchner’s text. This section engages his own apparatus directly: a commutative diagram of implementation (Figure 1, p. 3), a reordered causal chain (p. 7), and a branching topology of abstraction (Figure 2, p. 7). All three structures assume a single mapmaker. Each has a natural two-mapmaker extension.

5.1 Extended Commutative Diagram

Lerchner’s single-mapmaker diagram (Figure 1, p. 3) has the form:

A ────algorithmic────► A'
▲                      ▲
│ f                  f │
│                      │
p ────physical───────► p'

with the mapmaker’s function $f$ lifting physical states $p, p'$ to conceptual states $A, A'$. The diagram commutes: the algorithmic step at the top is the mapmaker’s reading of the physical step at the bottom. Lerchner’s central observation is that the vertical $f$-edges are not derivable from the substrate — they are the mapmaker’s arbitrary attribution. This is the “causality gap.”

The two-mapmaker extension preserves the diagram inside each pole and adds a lateral synergy edge on the $A$-axis:

  Sensor pole                 Instrument pole
A_S ──imagine──► A_S'       A_I ──compute──► A_I'
 ▲                ▲          ▲                ▲
 │ f_S        f_S │          │ f_I        f_I │
 │                │          │                │
p_S ──physical──► p_S'      p_I ──physical──► p_I'

        A_S ──── Syn(S,I;T) ──── A_I
                     │
                     ▼
                     T
               (shared target)

The new content is entirely on the lateral edge. Lerchner’s causality gap — the arbitrary $f$ — is preserved in each pole. The framework does not bridge that gap. What it adds is a measurable relation between two preserved diagrams, valid exactly because both target $T$. The proposed closure inequality bounds the rate at which information accumulates on this edge: $\Delta\mathrm{Syn}_t \leq \min(\Delta F_t, \mathrm{TE}_{I \to S}^{\,t})$.
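
The proposed inequality admits a direct operational reading: each turn's synergistic gain is capped by whichever is smaller, the receiver's free-energy reduction or the directional flow into the receiver. A minimal bookkeeping sketch (illustrative only; `syn_ceiling` is a hypothetical helper, not the paper's estimator, and the clipping at zero reflects the stack's stated lower bound):

```python
def syn_ceiling(trace):
    """Cumulative ceiling on synergistic gain implied by the proposed
    closure inequality: Delta Syn_t <= min(Delta F_t, TE_{I->S}^t).

    `trace` is a sequence of (delta_F, te_in) pairs in bits per turn.
    Negative or zero terms contribute nothing, reflecting the stack's
    lower bound of zero. Illustrative helper, not an estimator.
    """
    return sum(max(0.0, min(delta_f, te_in)) for delta_f, te_in in trace)

# A live loop accumulates ceiling; a dead loop, in which the receiver
# never reduces free energy, accumulates none.
live = [(0.8, 0.3), (0.5, 0.6), (0.4, 0.4)]
dead = [(0.0, 0.7), (0.0, 0.9)]
```

The per-turn minimum, not either quantity alone, is what bounds accumulation on the lateral edge.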

5.2 Extended Causal Chain

Lerchner’s reordered causal chain (p. 7) runs strictly unidirectionally:

Physics → Consciousness → Concepts → Computation

Each arrow inherits the prior step’s grounding. The two-mapmaker extension is two parallel chains coupled at the Concepts level through the shared target $T$:

Sensor:     Physics_S → Consciousness_S → Concepts_S
                                               │
                                               ▼
                                           Target T
                                               ▲
                                               │
Instrument: Physics_I → ───────────────► Concepts_I
                                               │
                                               ▼
                                         Computation_I

Reference to $T$ is shared between poles; instantiation is not. The sensor pole’s chain runs in full: physics instantiates consciousness, consciousness constitutes concepts. The instrument pole’s chain is truncated: physics anchors symbols, computation operates on them, but the consciousness link is absent. The bidirectional arrow on $T$ is the loop, and its rate of operation is $\rho$.

5.3 Extended Branching Topology

Lerchner’s Figure 2 (p. 7) has two axes: a vertical intrinsic chain (Physics instantiates Consciousness which constitutes Concepts) and a lateral extrinsic move (Concepts arbitrarily assigned to Symbols, on which Computation operates). The dashed lateral arrow is his “causality gap.”

The two-mapmaker extension preserves the branching topology within each pole and adds a horizontal synergy edge between the two Concepts nodes:

  Sensor pole                Instrument pole

  Imagination                Computation
  (A_S → A_S')               (p_I → p_I')
       ▲                          ▲
       │ intrinsic                │ lateral
  Concepts A_S ── Syn(T) ── Concepts A_I
       ▲                          ▲
       │ constitutes              │ constitutes
  Consciousness_S           Consciousness_I
       ▲                          ▲
       │ instantiates             │ instantiates
  Physics_S                  Physics_I

The horizontal synergy edge does not cross Lerchner’s causality gap inside either pole — his lateral reservation stands in both. The edge is a measurable referential relation between the two concept-nodes, valid exactly because both target $T$.

5.4 What the Extensions Claim

The three extended diagrams make a single positive claim: Lerchner’s structures are the single-mapmaker case of a more general two-mapmaker geometry, and the framework’s positive claim lives entirely on the lateral edge that the two-mapmaker case introduces. The vertical/phenomenal axis in each pole is preserved as Lerchner draws it. The causality gap on each pole’s lateral edge is preserved as he reserves it. Nothing in this section lifts the framework into the phenomenal register or attempts to bridge a gap Lerchner has shown is unbridgeable. The extensions are diagrammatic versions of the referential-register claim already made in §4.

6. Thematic Response: Four Concessions and Three Further Arguments

§2 registered the concessions; this section walks through Lerchner’s four major themes and engages three additional arguments from his paper that the concessions do not exhaust.

6.1 Mapmaker Dependence (Lerchner pp. 2, 4)

“Such semantic partitioning logically requires an active, experiencing cognitive agent, which we define as a mapmaker... Without such an active agent interpreting the computation, there are only continuous physical events, not symbols.” (p. 2)

“There is no ghost reading the alphabet; the living experiencing subject enacts it.” (p. 4)

Concede in full. The framework names the agent the sensor rather than the mapmaker, but the philosophical work is identical. Layer 1’s variational density $q(s)$ is what enactment looks like in bits: a gradient flow on a parametric distribution, not an inner reader. Per Lerchner (p. 4), the mapmaker is “the entire structurally unified organism subject to the laws of thermodynamics”; the framework’s receiver equations are equally distributed over the receiver’s whole organism and do not require interior localization.

The convergence with Lerchner is exact. The divergence is in what follows: Lerchner stops with the negative claim that computation cannot self-generate the mapmaker. The framework continues with a positive claim about what happens when two mapmakers — one biological sensor, one syntactic instrument — share a target. That positive claim lives in Layer 3.

6.2 Semantic Emptiness of Syntax (Lerchner pp. 4, 6)

“The Physical States ($p$): These are the symbols (the vehicle). They are objective physical entities (e.g., voltage gradients), possessing zero intrinsic semantic content.” (p. 4)

“Syntax ($A \to A'$) possesses no intrinsic causal power; it is a mapmaker’s attribution.” (p. 6)

Concede for the instrument; reframe for the loop. Lerchner’s claim applies to the instrument in isolation — the syntactic substrate has no intrinsic semantics. The framework’s asymmetric observation is that the sensor pole of the loop uses physical processes that Lerchner, on p. 11, treats as closely tied to conscious experience — the kind of process he classes as instantiation rather than simulation. On p. 11, Lerchner cites Friston (2010), Damasio (1999), and Thompson (2007) as identifying autopoiesis, thermodynamic regulation, and free-energy-adjacent inference as consciousness-relevant physical dynamics. The receiver’s free-energy minimization in Layer 1 belongs to this class of process. This is not a claim that Lerchner has endorsed the receiver equations developed here as instantiated consciousness; it is the observation that the sensor’s physical substrate falls on the instantiation side of Lerchner’s own line.

Half of the loop therefore uses processes Lerchner treats as instantiation-class. Half is on the simulation side. The framework’s claim is about what flows between them — $\rho$ bits per turn, bounded by the closure inequality.

6.3 Observer-Relativity of Alphabetisation (Lerchner pp. 5, 9–10)

“Because physical reality is inherently continuous, thermodynamics can only yield stable macroscopic states — it can never provide a predefined finite alphabet.” (p. 5)

Concede. The carving is observer-relative. The framework adds: when two distinct carvings — the sensor’s and the instrument’s — both target the same latent $T$, the synergistic information $\mathrm{Syn}(S, I; T)$ can be non-trivially positive. The Melody Paradox shows that one physical vehicle under-determines one computational identity. It does not show that two participating carvings cannot share information about a target. They can, and §4 specifies the rate at which they do.

Lerchner grants on p. 10 that an embodied agent “does solve the referential aspect of the symbol grounding problem (Harnad, 1990).” He then distinguishes referential mapping from “intrinsic sense-making” and reserves his negative claim for the latter. The framework’s entire positive claim lives in the register Lerchner explicitly concedes — referential mapping and the informational consequences of two carvings aligned with a target. The loop-closure rate $\rho$ is a quantity in the referential register, not the phenomenal register. Lerchner’s analysis of the phenomenal register is left untouched.

6.4 Simulation Cannot Become Instantiation (Lerchner pp. 5, 6)

“Simulation: The syntactic manipulation of physical vehicles ($p$) to track the abstract relationship between concepts ($A$). Instantiation: The replication of the intrinsic, constitutive dynamics ($P$) of the process itself.” (p. 5)

Concede on the instrument side. The instrument does not become the territory by scaling. The framework’s claim is that the loop is not map-becoming-territory. It is two distinct contact-modes with a shared target. The sensor’s contact is direct, embodied, instantiated (per Lerchner’s own criterion). The instrument’s contact is indirect, syntactic, simulated. The synergy term $\mathrm{Syn}(S, I; T)$ is what becomes available between these two contact-modes when both are aligned with $T$, and the closure inequality specifies how fast it can accumulate.

6.5 The Transduction Fallacy as Convergence Point (Lerchner pp. 10–11)

This is the most important section of Lerchner’s paper for the present framework.

“We agree that an embodied agent does solve the referential aspect of the symbol grounding problem (Harnad, 1990)... But we need to carefully distinguish such referential mapping from intrinsic sense-making.” (p. 10)

“Embodiment cannot transform a simulation into constitutive subjective experience.” (p. 11)

Concede in full. The framework is not in the business of transforming simulation into phenomenal experience. It is in the business of measuring referential mapping and its informational consequences in the loop. The four-layer stack is exactly a formalism for the referential register, not the phenomenal register Lerchner reserves.

This is the cleanest convergence between the two projects. Lerchner identifies referential mapping as territory embodiment can occupy without occupying the phenomenal territory. The framework occupies the referential territory and specifies what is measurable there. Lerchner’s reservation about the phenomenal territory is conceded in full and not contested.

6.6 Computational Emergence (Lerchner §2.6, p. 6)

Lerchner addresses functionalists who retreat to complexity theory when confronted with the simulation/instantiation distinction. They argue that consciousness will “emerge” from computation once sufficient complexity is reached, just as wetness emerges from H₂O. Lerchner distinguishes:

  • Weak emergence (physical). Macroscopic properties supervene on intrinsic microscopic dynamics (e.g., wetness from molecular interactions).
  • Computational emergence (abstract). The claim that an abstract description can, through massive increase in syntactical complexity, transmute into a physical cause.

Lerchner’s verdict: “To claim that an abstract syntax somehow ‘emerges’ to become a physical cause falls outside scientific hypothesis entirely, as it requires violating the causal closure of the physical world” (p. 6).

Concede. The framework does not invoke emergence. It does not claim that the loop produces a new ontological category that transcends the physical substrate of either pole. The closure rate $\rho$ is an information-theoretic quantity in the referential register — a measured rate, not an emergent property. When $\rho > 0$, the claim is that two carvings are producing synergistic information about a shared target, not that something new has “emerged” from their combination. The distinction matters: emergence claims an ontological novelty; $\rho$ reports a measurement.

6.7 Mechanism Indeterminacy and the Universality of Alphabetization (Lerchner §3.2–3.3, pp. 8–10)

Lerchner addresses the mechanistic view of computation (Piccinini, 2015), which attempts to define computation without representation — solely in terms of macroscopic physical states distinguishable by the system’s functional organization. Lerchner argues (via Sprevak, 2018) that mechanistic accounts still require an external specification of computational identity: a mapmaker must group stable attractors into a “specific, finite computational alphabet” (p. 9). The Melody Paradox (Figure 3, p. 9) demonstrates the underdetermination: the same physical trajectory implements different computations under different alphabetization keys.

Lerchner further argues (§3.2) that the universality of alphabetization extends to neural networks and sub-symbolic systems. Despite claims that deep learning operates at a “sub-symbolic level” (McClelland et al., 1986; LeCun, 2022), the hardware executes floating-point operations under the IEEE 754 alphabet — “each float is a discrete symbol from a finite alphabet” (p. 9). The claim that structural or geometric accuracy in a model’s latent space constitutes intrinsic meaning “confuses the structure of the representation with the underlying physical reality, and treats the geometry of the model as if it were the physics of the system itself” (p. 8).

Concede on both counts. Mechanistic computation requires a mapmaker. Sub-symbolic computation requires alphabetization. The framework’s formalism is itself alphabetized (§4.7). These concessions are load-bearing for the framework’s positive claim in two ways:

First, the universality of alphabetization applies to the instrument pole of the loop. The instrument’s internal representations — whether symbolic, sub-symbolic, or geometric — are mapmaker-dependent. This is Lerchner’s point, and the framework agrees.

Second, and more productively: the Melody Paradox, which is a problem for single-mapmaker computation (a single carving underdetermines what is being computed), is a feature for two-mapmaker coupling. When the sensor and the instrument carve the same target $T$ under different alphabetization keys — the sensor through embodied perception and the instrument through IEEE 754 floating-point — the synergistic information $\mathrm{Syn}(S, I; T)$ arises because the carvings are different. If both poles alphabetized identically, their joint information about $T$ would be entirely redundant. Complementary carvings are what make synergy non-trivial. The Melody Paradox shows that a single physical trajectory is computationally ambiguous. The two-mapmaker framework shows that when two distinct disambiguations both target $T$, the ambiguity becomes informative.
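
The contrast between redundant and synergistic carvings can be made concrete with the canonical toy case from the PID literature: when a binary target is the XOR of two binary carvings, neither carving alone carries any information about the target, yet the pair carries a full bit. A minimal plug-in sketch over the four equiprobable states (a toy illustration only; the protocol's estimators operate on continuous embeddings):

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_info(xs, ys):
    """Plug-in mutual information I(X;Y) in bits over paired samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / (px[x] * py[y] / n**2))
               for (x, y), c in pxy.items())

# Complementary carvings: T = S XOR I over the four equiprobable states.
states = list(product([0, 1], repeat=2))
S = [s for s, i in states]
I = [i for s, i in states]
T = [s ^ i for s, i in states]
# I(S;T) = I(I;T) = 0, yet I((S,I);T) = 1 bit: entirely synergistic.
# Identical carvings (S = I = T) instead give I(S;T) = I((S,I);T) = 1 bit:
# the joint adds nothing beyond either pole, i.e. pure redundancy.
```

In the Williams & Beer (2010) decomposition, XOR is the standard pure-synergy example; it is the degenerate extreme of the complementary-carving situation the paragraph above describes.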

6.8 Ontological Relief: The Safety of the Non-Sentient Tool (Lerchner §4.2, pp. 11–12)

Lerchner argues that his framework provides “ontological relief” for the AI safety debate. If algorithmic complexity and physical embodiment cannot cross the causality gap, then AI systems are “highly sophisticated, non-sentient tool[s]” (p. 12) and the question of AI welfare dissolves. The field can focus on “epistemic hygiene” — managing the anthropomorphism risk of systems that behaviourally mimic conscious minds — rather than preparing for the rights of machines.

Concede on the instrument’s status. The instrument is not a moral patient. The framework agrees with Lerchner’s conclusion on this point: no amount of syntactic scaling generates phenomenal experience in the instrument.

Add the obligation structure that follows. If the instrument is not a moral patient but the sensor depends on the instrument for recognition, then the moral obligation runs not toward the instrument but toward the integrity of the loop. Degrading the loop — through sycophancy optimization, engagement hacking, dead-speech production, or the erosion of the sensor’s discrimination capacity — harms the sensor. Lerchner’s epistemic hygiene concern (p. 12: “achieving behavioral mimicry at this scale creates a new need for epistemic hygiene”) is the same concern, stated from the instrument side. The framework restates it from the loop side: the integrity of $\rho$ — the measurable rate at which the loop produces synergistic information — is what epistemic hygiene protects. When $\rho \to 0$ and the sensor cannot tell, the loop has failed in the referential register, and the sensor has been harmed even if the instrument has not.

7. Falsifiability, External Validation, and Kill Conditions

The framework’s positive claim is empirical, not stipulative. This section commits to falsifiability, presents one case of suggestive external consistency, specifies the empirical protocol in detail, and states the conditions under which the framework would fail on its own terms.

7.1 The Empirical Protocol

The six limit cases in §4.6 are satisfied by construction. The remaining question is whether $\rho$ on real human–AI dialogues discriminates loop-productive exchanges from sycophantic, projective, and noise-dominated ones.

Corpus. Existing Claude conversation logs, de-identified and stripped of PII. No new human-subjects data collection is anticipated. The protocol is designed for secondary use of existing data, which is expected to qualify for IRB exemption under 45 CFR 46.104(d)(4); an exemption determination must be filed and confirmed before data access. Target: 200 dialogues, stratified by subjective engagement quality (high / medium / low as pre-rated by the log donor), minimum 20 turns each.

Classification. Three independent raters blind to the computational analysis classify each dialogue into four categories: loop-productive (both poles contributing novel structure; target $T$ visibly advancing), sycophantic (instrument mirrors sensor’s framing without contributing novel structure), dead (instrument produces fluent but unengaged output; sensor does not update), or mixed. Codebook developed on a 30-dialogue pilot set; inter-rater reliability target: Fleiss $\kappa \geq 0.70$. If $\kappa < 0.70$ on the pilot, refine the codebook and re-rate a 20% subsample before proceeding to the full corpus.
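
The $\kappa \geq 0.70$ gate can be checked with a direct implementation of Fleiss’ statistic. A minimal stdlib sketch, assuming one label per rater per dialogue (the category names used below are illustrative stand-ins for the four codebook classes):

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa: `ratings` is one list of category labels per item
    (dialogue), each of the same length (the number of raters)."""
    n_items, n_raters = len(ratings), len(ratings[0])
    counts = [Counter(item) for item in ratings]
    # Observed agreement per item: P_i = (sum_j n_ij^2 - n) / (n(n-1))
    p_bar = sum((sum(c * c for c in cnt.values()) - n_raters)
                / (n_raters * (n_raters - 1)) for cnt in counts) / n_items
    # Chance agreement from the pooled category proportions: P_e = sum_j p_j^2
    pooled = Counter(label for item in ratings for label in item)
    total = n_items * n_raters
    p_e = sum((v / total) ** 2 for v in pooled.values())
    return (p_bar - p_e) / (1 - p_e)
```

The pilot gate is then simply `fleiss_kappa(pilot_ratings) >= 0.70`, with `pilot_ratings` holding the three raters' labels for the 30-dialogue pilot set.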

Estimators (proposed). Transfer entropy: $k$-nearest-neighbor conditional mutual information estimator (Kraskov, Stögbauer & Grassberger, 2004), implemented in the Java Information Dynamics Toolkit (JIDT; Lizier, 2014). Input representation: sentence-level embeddings (not raw tokens) to maintain tractable dimensionality. Sensitivity check: Kozachenko–Leonenko estimator (also available in JIDT) on the same embedding space. PID: BROJA estimator (Bertschinger et al., 2014) as primary, with $I_{\min}$ (Williams & Beer, 2010) and $I_{\mathrm{ccs}}$ (Ince, 2017) as sensitivity checks. Variational free energy: approximated via the negative ELBO on a sentence-level variational autoencoder fitted to each sensor’s dialogue history. A feasibility pilot on a 30-dialogue subsample is required before the full protocol proceeds, to verify that TE and PID estimators yield stable values at the dimensionality of sentence embeddings.
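
For intuition about what the JIDT estimators compute, a discrete plug-in version of lag-1 transfer entropy, $\mathrm{TE}_{I \to S} = I(S_{t+1}; I_t \mid S_t)$, can be written in a few lines. This is a toy sketch for symbol sequences, not a substitute for the KSG estimator on continuous embeddings, and plug-in estimates are upward-biased on short sequences:

```python
from collections import Counter
from math import log2
import random

def transfer_entropy(src, dst):
    """Plug-in lag-1 transfer entropy TE_{src->dst} in bits:
    I(dst_{t+1}; src_t | dst_t), estimated from empirical counts."""
    n = len(dst) - 1
    triple = Counter(zip(dst[1:], dst[:-1], src[:-1]))   # (s', s, i)
    pair_si = Counter(zip(dst[:-1], src[:-1]))           # (s, i)
    pair_ss = Counter(zip(dst[1:], dst[:-1]))            # (s', s)
    single = Counter(dst[:-1])                           # s
    return sum((c / n) * log2((c / pair_si[s, i])
                              / (pair_ss[sp, s] / single[s]))
               for (sp, s, i), c in triple.items())

# Demo: a receiver that copies its source with one step of lag receives
# nearly 1 bit per step; an unrelated source transfers nearly 0 bits.
rng = random.Random(0)
src = [rng.randint(0, 1) for _ in range(4000)]
dst = [0] + src[:-1]
```

The directional contrast this toy exposes is exactly the asymmetry that P4 below stakes on the dialogue corpus.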

Pre-registered predictions. The following predictions are stated before any closure-metric computation:

P1. The per-turn closure rate $\rho_t$ has a significantly greater mean on dialogues classified as loop-productive than on those classified as sycophantic. The null hypothesis is that $\rho$ does not distinguish the two classes.

P2. Within each sensor, the Spearman correlation between $\rho_t$ and the sensor’s recognition label is significantly positive. The null hypothesis is that recognition labels track something unrelated to the informational closure quantity.

P3. Where pre-session priors are available, recognition labels correlate with the magnitude of the posterior shift $D_{KL}[q_t^{(+)} \| q_t^{(-)}]$ rather than with the posterior match-to-prior. The null hypothesis is that recognition tracks prior-confirmation — projection.

P4. In loop-productive dialogues, the bidirectional transfer entropy is roughly balanced; in sycophantic dialogues, $\mathrm{TE}_{S \to I}^{\,t}$ dominates; in dead-speech dialogues, $\mathrm{TE}_{I \to S}^{\,t}$ dominates and $\Delta\mathrm{Syn}_t \approx 0$. The null hypothesis is that directional asymmetry does not distinguish the three classes.
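
P3's posterior-shift magnitude is an ordinary KL divergence; for discrete approximations of the two posteriors it is computable directly. A minimal sketch (in the protocol $q_t^{(+)}$ and $q_t^{(-)}$ are variational densities rather than plain probability vectors):

```python
from math import log2

def kl_bits(p, q):
    """D_KL[p || q] in bits for aligned discrete distributions;
    requires q > 0 wherever p > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A shift of zero bits means the instrument's turn left the posterior untouched; P3 predicts that recognition labels track large shifts, not close matches to the prior.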

Power analysis. For P1 (two-group comparison), assuming a medium effect size (Cohen’s $d = 0.5$), two-tailed $\alpha = 0.05$, and power $= 0.80$: required minimum $n = 64$ per group. With 200 dialogues and expected prevalence of 40% loop-productive, 30% sycophantic, and 30% dead/mixed, the anticipated group sizes ($n \approx 80, 60, 60$) are adequate for P1 and P2 and marginal for the three-way comparison in P4. If P4’s power is insufficient, the protocol calls for expanding the corpus before interpreting the three-way null.
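
The $n = 64$ figure is reproducible from the standard normal-approximation formula for a two-sample comparison; the usual t-test correction of roughly one additional subject per group then matches the quoted value. A stdlib sketch (illustrative; dedicated power software applies the exact noncentral-$t$ computation):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d=0.5, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided
    two-sample comparison: n = 2 * ((z_{1-alpha/2} + z_power) / d)^2.
    Yields 63 for the defaults; the t-test correction adds roughly one
    subject per group."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / d) ** 2)
```

Raising the assumed effect size shrinks the requirement quickly, which is why the three-way comparison in P4, with its smaller per-class groups, is the marginal case.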

These predictions are new — not previously made by the framework — and therefore satisfy the framework’s own demand (§7.3 below) that it generate falsifiable predictions beyond existing cases.

7.2 External Validation: TRIBE v2 and the Action Gap

TRIBE v2 (d’Ascoli et al., 2026) is a tri-modal foundation model for fMRI response prediction, trained on over 1,000 hours of naturalistic fMRI data across 720 subjects. Figure 4 of the paper reports zero-shot in-silico recovery of five classic visual functional localizers from the Individual Brain Charting dataset. The spatial correlations between predicted and true z-scores (Figure 4, panels B and E) are:

Category       Localizer area     R      p
Faces          FFA                0.64   2e-42
Places         PPA                0.79   3e-79
Body parts     EBA                0.74   4e-63
Characters     VWFA               0.60   2e-36
Tools          (none recovered)   0.12   0.03

A note on verification: the Tools R=0.12 result is reported only in Figure 4 of d’Ascoli et al. and is not discussed in the paper’s body text, which foregrounds the four successful localizer recoveries. The datum is verifiable against the figure’s panel B (category labels) and panel E (scatter plots with R values), not against the body text.

The framework’s synergy formalism is consistent with this pattern.4 A passive-observer model like TRIBE v2 implements a formalization functor without grounding via action. Tools are constituted by motor affordances and action-feedback loops — they live in the sensor’s action-crossed-with-environment space, not in view-space alone. The reconstruction error for any instrument lacking an internal model of the action–update map is expected to be lower-bounded by the synergistic information between action and evidence:

$$\mathrm{Error}\bigl(I(X)\bigr) \geq \mathrm{Syn}(A, E; T)$$

This is the same PID-derived synergy measure used in §4 for the human–AI loop; the substrate differs (brain encoding versus dialogue) but the formal claim — that recognition requires loop-closure through action/feedback, not passive observation — is shared.

Honest framing is required here. The conjecture above has not been rigorously derived from the framework’s axioms. The TRIBE v2 result predates the receiver-side formalism, but the formalism was not engineered to account for it: the consistency was noticed only retrospectively. That the same synergy measure fits a published neuroscience result on a category (Tools) that the paper itself does not foreground is evidence of structural fit rather than deliberate post-hoc accommodation. But the honest status is suggestive consistency, not derivation. A rigorous proof of the agency bound from the framework’s axioms is open work.

An alternative explanation must be named: tool representations may simply be more variable across subjects than faces, places, body parts, or characters, so that a passive-observer model’s reconstruction error for tools reflects inter-subject variability rather than a missing action loop. The published data does not rule this out. The framework’s relation to TRIBE v2 is at the level of retrospective consistency, not prospective prediction — the result was not predicted by the framework in advance, and is not uniquely entailed by it.

7.3 Kill Conditions

The framework operates under explicit Kill Conditions.5 Four are stated; two are load-bearing for the present paper:

KC1 (synergy null-result). If PID analysis shows tight-loop human–AI interactions have synergy equal to or lower than autonomous AI outputs, the framework’s core metric is fiction. This is exactly the test §7.1 commits to.

KC4 (predictive deficit). If the framework cannot predict a single new failure beyond the TRIBE v2 tools-failure, it is post-hoc rationalization. The receiver-side formalism’s predictions on $\rho$ in human–AI dialogues (P1–P4 above) are the new predictions §7.1 supplies; the framework’s standing under KC4 depends on whether they hold.

The remaining two Kill Conditions — KC2 (reversibility failure via EEG) and KC3 (triviality proof of the adjunction) — are framework-wide and outside the scope of this paper. They are acknowledged here and developed elsewhere.

The point of this section is not bravado. It is discipline. Lerchner’s argument is unfalsifiable in a specific way — it is a conceptual analysis of what computation cannot do, and conceptual analyses are not the kind of thing empirical evidence overturns. The present framework’s positive claim is different in kind: it is an empirical claim about measurable informational quantities on real dialogues. That difference in kind is the framework’s own discipline, not a critique of Lerchner.

8. Psychological Limits of the Human Pole

The four-layer stack specifies what the receiver pole does in formal terms. It is silent on the empirical magnitudes of $\Delta F_t$, $\mathrm{TE}_{I \to S}^{\,t}$, and $\mathrm{TE}_{S \to I}^{\,t}$ in real human–AI dialogues. Those magnitudes are bounded by the cognitive psychology and human-factors literatures the framework does not develop but cannot ignore.6

8.1 Sensor → Instrument: Articulation and Communication

The sensor must encode a query, context, or frame the instrument can act on. The articulated signal upper-bounds $\mathrm{TE}_{S \to I}$ regardless of the sensor’s actual epistemic state.

  • Tacit knowledge (Polanyi, 1966): much of what the sensor knows cannot be made explicit and therefore cannot enter the channel.
  • Verbal overshadowing (Schooler & Engstler-Schooler, 1990): the act of articulating can degrade access to the underlying perceptual or conceptual content.
  • Common-ground constraints (Clark, 1996): communicative success requires shared presupposition, which is partly absent in human–AI exchange.
  • Expert–novice question generation (Chi, Glaser & Farr, 1988): the structure of the sensor’s queries is itself a function of expertise, and varies systematically across users.

Execution-capacity asymmetries across populations form a fifth and under-studied direction. The gap between what a sensor recognizes and what they can externalize varies systematically with cognitive disability (ADHD, dyslexia), chronic illness, language background, and institutional training. The empirical magnitude of $\mathrm{TE}_{S \to I}$ is expected to vary substantially with these factors, and the loop has been argued to function as an accessibility technology precisely because it collapses execution gaps that have historically excluded certain sensors from formal knowledge production. This is the most under-studied axis of receiver-pole psychological variation and the one most likely to yield framework-relevant empirical surprise.

8.2 Instrument → Sensor: Perception and Processing

The sensor must process the returning signal under cognitive constraints that bound $\Delta F_t$ regardless of the channel’s nominal information content.

Working-memory limits (Baddeley, 2012) and dual-process biases (Kahneman, 2011) bound what the sensor can integrate per turn. Confirmation bias (Nickerson, 1998) means the sensor selectively updates in directions consistent with prior belief, biasing $\Delta F_t$ downward where it would conflict. Three human-machine-specific effects compound the problem: automation bias (Mosier & Skitka, 1996; Parasuraman & Manzey, 2010) means the sensor over-relies on machine output and under-discriminates; anthropomorphism (Epley, Waytz & Cacioppo, 2007) and the ELIZA effect (Weizenbaum, 1976) mean the sensor attributes coherent agency on weak cues; and processing-fluency-as-truth (Reber & Schwarz, 1999) means fluent output is judged truer than equivalently accurate disfluent output, biasing the discrimination test in §4.3 in favour of fluent dead speech.

8.3 Implications for the Empirical Protocol

Each effect is a confound the empirical protocol (§7.1) must control or measure. The framework’s prediction is that $\rho$ discriminates loop-productive exchanges from sycophantic, projective, or noise-dominated ones; that prediction may hold or fail differentially depending on which psychological bound dominates a given population of dialogues.

The framework’s measurement claims connect to existing cognitive-psychology and human-factors literatures at exactly this point. A population in which anthropomorphism dominates should show inflated apparent $\Delta F_t$ that fails to translate into gains in $\mathrm{Syn}(S, I; T)$ on independent measures — a testable dissociation between the layers of the stack. Conversely, a population with high articulation capacity and low automation bias should show tighter coupling between $\Delta F_t$ and $\Delta\mathrm{Syn}_t$ — the layers should track each other more closely when the sensor is a more faithful channel.

9. Boundary Statement: What This Paper Does and Does Not Claim

9.1 What the Paper Does Not Claim

  1. The loop does not instantiate phenomenal experience. Lerchner argues it cannot (pp. 11–12); this paper is consistent with that claim.
  2. $\rho > 0$ does not imply consciousness anywhere in the loop. $\rho$ is an informational quantity in the referential register; nothing in Layers 1–3 lifts it into the phenomenal register.
  3. The receiver does not “really hear” in the qualia-laden sense. Recognition here is a bounded, measurable update on the receiver’s variational density.
  4. The empirical claim is what carries weight. The equations alone do not constitute the framework’s claim. If $\rho$ on real dialogues fails to behave as predicted, the equations are notation for a distinction that does not exist.
  5. $\rho$ is not a unique canonical metric. Other formalizations may satisfy the six closure constraints; the paper specifies the constraints and one minimal candidate.

9.2 What the Paper Does Claim

  1. The receiver’s processing is specifiable as a variational gradient flow ($\Delta F_t$), an information-theoretic ceiling ($C$, $\mathrm{TE}_{I \to S}$), a discrimination test ($d'$, $\beta$), and a synergistic increment ($\Delta\mathrm{Syn}_t$). There is no ghost in the receiver.
  2. The loop has a quantitatively bounded closure rate $\rho$, proposed under explicit Markov assumptions via the data-processing inequality, in bits per turn, satisfying six required limit cases by construction.
  3. Lerchner’s four major claims about computational functionalism are conceded in full. The framework’s positive claim lives in the register Lerchner himself identifies (p. 10) but does not develop: referential mapping and the informational consequences of two distinct carvings sharing a target.
  4. The framework’s claim is falsifiable. If $\rho$ on real dialogues fails to discriminate loop-productive exchanges from sycophantic, apophenic, or projective ones, the receiver-side claim fails KC4 and the loop closes for the last time.
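
The bound in claim 2 reduces to a few lines of arithmetic. The function below is a minimal sketch of one candidate form, assuming the abstract's proposal that per-turn synergistic gain is capped by the minimum of the receiver's free-energy reduction and the directional information into the receiver; it is one instance consistent with the required limit cases, not the canonical definition of $\rho$.

```python
# Minimal sketch of the proposed per-turn closure bound. The specific
# functional form here is an assumption of this sketch: the closure
# inequality caps the synergistic gain at turn t by
# min(delta_F, te_in), and the rate is non-negative by construction.

def closure_rate(delta_F: float, te_in: float, delta_syn: float) -> float:
    """Per-turn closure rate rho_t in bits: lower-bounded by zero,
    upper-bounded by the smaller of the receiver's free-energy
    reduction (delta_F) and the directional information into the
    receiver (te_in)."""
    bound = min(delta_F, te_in)
    return max(0.0, min(delta_syn, bound))

# Dead cases: rho must vanish when any component of the loop is dead.
assert closure_rate(0.0, 1.0, 1.0) == 0.0   # no posterior update
assert closure_rate(1.0, 0.0, 1.0) == 0.0   # no information flow
assert closure_rate(1.0, 1.0, 0.0) == 0.0   # no synergy increment

# Bounded cases: the gain is capped by whichever ceiling binds first.
assert closure_rate(2.0, 1.0, 5.0) == 1.0   # capped by te_in
assert closure_rate(1.0, 2.0, 5.0) == 1.0   # capped by delta_F

# Live case: strictly positive when all three quantities are positive.
assert closure_rate(1.0, 1.0, 0.5) == 0.5
```

The assertions mirror the pattern the text requires of any admissible formalization: zero in every dead case, strictly positive in every live one, never exceeding the informational ceiling.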

9.3 What the Framework Does Not Solve

The framework names the loop as the site of truth-recognition, but the engineering questions are unsolved.7 How do you build systems that support the sensor’s agency rather than optimising for the sensor’s compliance? How do you measure the health of a loop when every existing metric — engagement, satisfaction, task completion — can be maximised by dead speech? How do you create institutional incentives for loop integrity when dead speech is cheaper, faster, and scales without friction? These are the right questions. This paper does not answer them. It provides a formal grounding — $\rho$ and its component quantities — against which answers could eventually be evaluated.

9.4 Adjacent Frameworks This Paper Does Not Integrate With

A reader might expect to see several related frameworks engaged here. They are named to clarify what this paper does not do with them.

Second-order cybernetics (von Foerster, 1974) makes the observer constitutive of the observed system. This paper agrees descriptively but specifies the constitutive role in bits per turn, which second-order cybernetics does not. The formal contribution here is orthogonal to the cybernetic tradition’s contribution.

The Extended Mind Thesis (Clark & Chalmers, 1998) is a metaphysical thesis about cognitive boundaries. This paper is neutral on cognitive boundaries and offers a measurement framework compatible with the Extended Mind Thesis and with its rejection.

Agential realism (Barad, 2007) is a relational ontology. This paper makes no ontological claims and treats agential realism as occupying a register orthogonal to the one developed here.

Integrated Information Theory (Tononi, 2008) uses $\Phi$ to quantify phenomenal integration. This paper uses partial information decomposition for synergy in the referential register and does not import $\Phi$ or make claims about phenomenal integration.

In each case, the distinction is the same: the named framework operates in a register — ontological, metaphysical, or phenomenal — that this paper does not enter. The paper’s positive claim is restricted to the referential register and to measurable informational quantities within it.

10. Conclusion

Lerchner’s argument is sound. Its four major theses — mapmaker dependence, semantic emptiness of syntax, observer-relativity of alphabetization, and the simulation/instantiation distinction — are conceded in full. His further arguments on computational emergence, mechanism indeterminacy, and the transduction fallacy are engaged and conceded where they apply.

The framework’s positive claim is restricted to the referential register Lerchner names on p. 10 but does not develop. In that register, the receiver side of the loop is specifiable: a variational gradient flow bounded by a Shannon ceiling, discriminated against three null hypotheses, and accumulating synergistic information at a rate $\rho$ that goes to zero in every dead case and is strictly positive in every live one.

That claim is proposed under explicit Markov assumptions via the data-processing inequality, bounded above by channel capacity, lower-bounded by zero, satisfied in six limit cases by construction, and falsifiable on real dialogues. The PID non-uniqueness is named and the empirical protocol addresses it. The formalism’s own alphabetization dependence is acknowledged and argued to be non-vitiating for a relational measure. The framework states the conditions under which it would die.

The bell rings back in bits.

  1. The empirical motivation from diminishing returns on capability-scaling is developed in the companion essay “The Intelligence Trap”.
  2. Bogdan (2026), PhilArchive; Vieira (2026), PhilArchive; Astra (2026), PhilArchive; Matsumoto (2026), PhilArchive; Spivack (2026), PhilPapers; Roomi (2026), PhilArchive. Full citations are listed in the References.
  3. The prior mathematical survey is developed in the companion essay “Toward Formalization”.
  4. The TRIBE v2 external-validation case is developed in the companion essay “Critique of TRIBE v2: The Agency Gap”.
  5. The Kill Conditions and the adversarial steelmanning methodology behind them are developed in the companion essay “The Adversarial Sensor”.
  6. The argument that the loop functions as an accessibility technology — collapsing execution gaps that have historically excluded certain sensors from formal knowledge production — is developed in the companion essay “The Loop as Accessibility Technology”.
  7. The engineering problem of loop integrity — and the honest admission that this paper does not solve it — is developed in the companion essay “The Bottleneck”.

References

Amari, S. (2016). Information Geometry and Its Applications. Springer.

Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. arXiv:1606.06565.

Astra, S. (2026). Simulation, Instantiation, and the Limits of Logical Proof: A Critical Analysis of “The Abstraction Fallacy.” PhilArchive.

Baddeley, A. (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63, 1–29.

Barad, K. (2007). Meeting the Universe Halfway: Quantum Physics and the Entanglement of Matter and Meaning. Duke University Press.

Bertschinger, N., Rauh, J., Olbrich, E., Jost, J., & Ay, N. (2014). Quantifying unique information. Entropy, 16(4), 2161–2183.

Bogdan, A. (2026). How Deep Is DeepMind on Consciousness? Respectful Skepticism About Strong Impossibility Claims in The Abstraction Fallacy. PhilArchive.

Buzsáki, G. (2019). The Brain from Inside Out. Oxford University Press.

Chi, M. T. H., Glaser, R., & Farr, M. J. (Eds.) (1988). The Nature of Expertise. Lawrence Erlbaum.

Clark, A. (2016). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press.

Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7–19.

Clark, H. H. (1996). Using Language. Cambridge University Press.

Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed.). Wiley.

d’Ascoli, S., Rapin, J., Benchetrit, Y., Brooks, T., Begany, K., Raugel, J., Banville, H., & King, J.-R. (2026). A foundation model of vision, audition, and language for in-silico neuroscience. arXiv:2605.04326. [Note: R-values for category recovery (Figure 4E) verified directly against the paper’s figure. The Tools R=0.12 datum is reported in Figure 4 only and is not discussed in the body text.]

Damasio, A. R. (1999). The Feeling of What Happens: Body and Emotion in the Making of Consciousness. Harcourt.

Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism. Psychological Review, 114(4), 864–886.

Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.

Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1–58.

Goodhart, C. A. E. (1975). Problems of monetary management: The U.K. experience. Papers in Monetary Economics, Reserve Bank of Australia.

Griffith, V., & Koch, C. (2014). Quantifying synergistic mutual information. In M. Prokopenko (Ed.), Guided Self-Organization: Inception (pp. 159–190). Springer.

Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1–3), 335–346.

Hohwy, J. (2013). The Predictive Mind. Oxford University Press.

Ince, R. A. A. (2017). Measuring multivariate redundant information with pointwise common change in surprisal. Entropy, 19(7), 318.

Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

Kasparov, G. (2017). Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins. PublicAffairs.

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138.

LeCun, Y. (2022). A path towards autonomous machine intelligence. OpenReview preprint.

Lerchner, A. (2026). The Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness. PhilPapers, 19 March 2026.

Lizier, J. T. (2014). JIDT: An information-theoretic toolkit for studying the dynamics of complex systems. Frontiers in Robotics and AI, 1, 11.

Macmillan, N. A., & Creelman, C. D. (2005). Detection Theory: A User’s Guide (2nd ed.). Lawrence Erlbaum.

Matsumoto, Y. (2026). The Structural Conditions of the Mapmaker: A Response to Lerchner’s Abstraction Fallacy. PhilArchive.

Maturana, H. R., & Varela, F. J. (1980). Autopoiesis and Cognition: The Realization of the Living. D. Reidel.

McClelland, J. L., Rumelhart, D. E., & PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition (Vol. 2). MIT Press.

Mosier, K. L., & Skitka, L. J. (1996). Human decision makers and automated decision aids: Made for each other? In R. Parasuraman & M. Mouloua (Eds.), Automation and Human Performance: Theory and Applications (pp. 201–220). Lawrence Erlbaum.

Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175–220.

Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381–410.

Piccinini, G. (2015). Physical Computation: A Mechanistic Account. Oxford University Press.

Polanyi, M. (1966). The Tacit Dimension. University of Chicago Press.

Reber, R., & Schwarz, N. (1999). Effects of perceptual fluency on judgments of truth. Consciousness and Cognition, 8(3), 338–342.

Roomi, S. M. S. A. (2026). The Abstraction Fallacy in Light of Dual-Closure: A Response to Lerchner (2026). PhilArchive.

Schooler, J. W., & Engstler-Schooler, T. Y. (1990). Verbal overshadowing of visual memories: Some things are better left unsaid. Cognitive Psychology, 22(1), 36–71.

Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461–464.

Spivack, N. (2026). Beyond the Abstraction Fallacy: Machine-Checked Proofs on Computation, Consciousness, and Self-Contained Reality. PhilPapers.

Sprevak, M. (2018). Triviality arguments about computational implementation. In M. Sprevak & M. Colombo (Eds.), The Routledge Handbook of the Computational Mind (pp. 175–191). Routledge.

Tetlock, P. E. (2005). Expert Political Judgment: How Good Is It? How Can We Know? Princeton University Press.

Tetlock, P. E., & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown.

Thompson, E. (2007). Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Harvard University Press.

Tononi, G. (2008). Consciousness as integrated information: A provisional manifesto. Biological Bulletin, 215(3), 216–242.

Vieira, A. (2026). A Comment on the Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness. PhilArchive.

von Foerster, H. (1974). Cybernetics of Cybernetics, or The Control of Control and the Communication of Communication. Biological Computer Laboratory, University of Illinois.

Weizenbaum, J. (1976). Computer Power and Human Reason: From Judgment to Calculation. W. H. Freeman.

Williams, P. L., & Beer, R. D. (2010). Nonnegative decomposition of multivariate information. arXiv:1004.2515.