Draft. The asymptotics of language [0017]
- 27 April 2026
1. Preliminaries
- 27 April 2026
- J.P. Loo
This thesis is about the application of asymptotic analysis from theoretical computer science to language.
Arguments from asymptotic analysis have been offered for varied conclusions, including Chomskyan universal grammar, compositionality, and the collapse of the competence/performance distinction.
- Asymptotic analysis of algorithms: I introduce asymptotic analysis.
- A model of computation: I introduce Turing machines as a model of computation with which to undertake asymptotic analysis.
- Computation, cognition and language: I explain how computational asymptotic analysis bears on the study of language, though perhaps as an idealisation.
- Some varieties of idealisation: I discuss some varieties of idealisation.
- An asymptotic argument for compositionality: I critically examine an asymptotic argument that meanings must be compositional.
Note 2. Asymptotic analysis of algorithms [0018]
- 27 April 2026
We begin with a preliminary example.
Example 2.1. Sorting a list
- 27 April 2026
- J.P. Loo
Consider a list of two numbers, and a task: to sort the list in ascending order.
We will assume that only one operation is permitted.
Operation 2.1.1. Comparison-cum-swap
- 27 April 2026
- J.P. Loo
One may compare any two adjacent numbers, and, if they are in the wrong order, swap them.
How many comparison-cum-swaps do we need, on a list of two numbers, to sort it in ascending order? Clearly: one. For instance: if the list is (1, 2), the comparison-cum-swap immediately shows that the numbers are in ascending order, so we are done. Alternatively, if the list is (2, 1), the comparison-cum-swap immediately shows they are in the wrong order, they are swapped, and then we are done.
Suppose the list contains three elements: for instance, (1, 2, 3). How many comparison-cum-swaps do we need? One isn’t enough; we’d only be able to consider two of the numbers, and so the third number would have no effect on the final putatively sorted list we produce. What about two?
Two might be enough to verify that a list is sorted, if we’re lucky. Given (1, 2, 3), we need only observe that 1 ≤ 2 and 2 ≤ 3.
However, two is not enough, in general, to sort a list (rather than simply to check whether it is actually sorted). Consider: every comparison-cum-swap leads to at most two possible orderings of some list (x, y, z). So if there are two comparison-cum-swaps, there are at most 2 × 2 = 4 orderings the procedure yields. But there are in fact six possible initial orderings of the list:
- (1, 2, 3),
- (1, 3, 2),
- (2, 1, 3),
- (2, 3, 1),
- (3, 1, 2), and
- (3, 2, 1).
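To make the operation concrete, here is a minimal Python sketch (my illustration, not part of the formal argument) of sorting by repeated adjacent comparison-cum-swaps, in the style of bubble sort, counting the operations used. The function name is mine.

```python
def sort_with_ccs(xs):
    """Sort a list using only adjacent comparison-cum-swaps; count the operations."""
    xs = list(xs)
    ops = 0
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            ops += 1  # one comparison-cum-swap on adjacent elements
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs, ops
```

On (2, 1) this performs exactly one comparison-cum-swap; on any three-element list it uses three, consistent with the argument above that two cannot suffice in general.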
We can pose the question for a problem of arbitrary size.
Problem 2.1.2. List-sorting
- 27 April 2026
- J.P. Loo
Given a list of n numbers to sort, how many comparison-cum-swaps do we need, as a function of n?
Similarly, other problems of arbitrary size can be posed.
Problem 2.2. Travelling salesman
- 27 April 2026
- J.P. Loo
Given n cities and the distances between them, what is the shortest path by which the salesman may visit each of them at least once?
Problem 2.3. 3-SAT
- 27 April 2026
- J.P. Loo
Given a formula in conjunctive normal form over n literals, each of whose clauses has at most three literals, is there a satisfying assignment?
A generalisation of the argument above in fact gives the answer to the question for sorting: approximately n log₂ n (Hoare, Quicksort). The complexity of 3-SAT is open, but widely conjectured to be exponential.
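The generalised counting argument can be checked numerically: k comparison-cum-swaps distinguish at most 2^k behaviours, while a list of n distinct numbers has n! possible orderings, so any sorting procedure needs k with 2^k ≥ n!, i.e. k ≥ log₂(n!), which is roughly n log₂ n. A small sketch (illustrative only; the function name is mine):

```python
import math

def ccs_lower_bound(n):
    """Smallest k with 2**k >= n!, i.e. the information-theoretic lower bound
    on the number of comparison-cum-swaps needed to sort n distinct numbers."""
    return math.ceil(math.log2(math.factorial(n)))
```

Here ccs_lower_bound(2) is 1 and ccs_lower_bound(3) is 3, matching the discussion above: one operation suffices for two elements, and two operations cannot suffice for three.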
Definition 2.5. Tight bounds
- 27 April 2026
- J.P. Loo
Given a monotonically non-decreasing function g, we write f = Θ(g) iff there are constants c₁, c₂ > 0 and some n₀ such that c₁ · g(n) ≤ f(n) ≤ c₂ · g(n) for all n ≥ n₀.
Definition 2.8. Upper bounds
- 27 April 2026
- J.P. Loo
Given a monotonically increasing function g, we write f = O(g) iff there are a constant c > 0 and some n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀.
We can now restate the result of Hoare (Quicksort).
Proposition 2.9. Quicksort II
- 27 April 2026
- J.P. Loo
The task of sorting a list of n numbers takes Θ(n log n) comparison-cum-swaps.
Note 3. A model of computation [001D]
- 27 April 2026
In Asymptotic analysis of algorithms, we saw how, given a certain sort of operation on a list of numbers, we could work out how many times that operation needs to be applied to sort the list. However, in general, we should like to be able to investigate the resource-intensiveness of computations for all sorts of problems. In some cases, it is not obvious what the analogous operations should be. We shall now investigate a more general approach that allows us to deal with a considerably greater number of problems. We begin with a seemingly arbitrary model of computation, and then explain why it is not so arbitrary after all.
Definition 3.1. Turing machines (informal) [0019]
- 27 April 2026
Arora and Barak (Complexity: § 1.1) give the following informal characterisation.
Let f be a function that takes a string of bits…and outputs either 0 or 1. An algorithm for computing f is a set of mechanical rules, such that by following them we can compute f(x) given any input x. The set of rules being followed is fixed (i.e. the same rule must work for all possible input) though each rule in this set may be applied arbitrarily many times. Each rule involves one or more of the following ‘elementary’ operations:
- Read a bit of the input.
- Read a bit…from the scratch pad or working space we allow the algorithm to use.
Based on the values read,
- Write a bit/symbol to the scratch pad.
- Either stop and output 0 or 1, or choose a new rule from the set that will be applied next.
Finally, the running time is the number of these basic operations performed.
This seems quite arbitrary—
3.2. Worries about Turing machines [001B]
- 27 April 2026
- why limit ourselves just to symbols 0 and 1?
- why are the operations so limited? and
- why not read multiple bits at the same time?
In general, we might therefore think that the putative runtime of an algorithm required to solve a problem varies too much with respect to the model of computation. However, one class of runtimes is highly robust to such changes.
Definition 3.3. Polynomial time [001C]
- 27 April 2026
We say that the runtime of an algorithm is polynomial if its runtime is O(n^c) for some fixed c.
Note that this definition is relative to a model of computation. However, it turns out that the class of problems that admit polynomial-time algorithms is robust with respect to all the worries above, i.e., changing any of those features does not change the class of problems for which there are polynomial-time algorithms.
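The contrast with exponential runtimes can be made vivid with a small illustrative sketch (purely mine, not from the literature cited here), tabulating abstract step counts:

```python
# Illustrative comparison of a polynomial runtime (n**3 steps) with an
# exponential one (2**n steps); the exponential eventually dominates.
def steps(n):
    return {"polynomial": n ** 3, "exponential": 2 ** n}

for n in (5, 20, 60):
    print(n, steps(n))
```

For small inputs the exponential algorithm can even be cheaper (2^5 < 5^3), but by n = 60 it needs about 10^18 steps against 216,000; no fixed polynomial exhibits this kind of blow-up.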
Van Emde Boas (Models: § 1) puts it thus.
There exists a standard class of machine models…[that] simulate each other with polynomially bounded overhead…
This standard class corresponds, it is thought, to all ‘reasonable’ (Dean, Complexity: § 2.2) or physically realisable models of computation. The evidence for this is in effect that no convincing counterexamples have yet been found; it is somewhat analogous, therefore, to the Church–Turing thesis.
Note 4. Computation, cognition and language [001E]
- 27 April 2026
Our earlier conclusions (A model of computation) might be summarised as making the following conjecture: physically plausible models of computation do not differ in the problems they can solve in polynomial time.
What has this to do with language? To a certain extent, the argument must be reconstructed, but its rough outlines are clear.
A first premiss is the so-called Cobham–Edmonds thesis.
Claim 4.1. Cobham–Edmonds thesis
- 27 April 2026
- J.P. Loo
A function is feasibly computable if and only if some machine with polynomial runtime computes it (Cobham, Difficulty; Edmonds, Paths).
The Cobham–Edmonds thesis is only meaningful relative to a model of computation. But, if the van Emde Boas invariance thesis is correct, any reasonable model of computation will yield the same class of feasibly computable functions.
I shall, at this point, make a
Distinction 4.2. Conjectures and heuristics
- 27 April 2026
- J.P. Loo
By conjecture, I mean a claim that is thought to straightforwardly be true, but for which we have no proof. For instance, Fermat’s last theorem was long a conjecture. By heuristic, I mean a claim counterexamples to which are accepted but which is thought generally to be at least approximately true. For instance, a useful heuristic in most engineering problems is to assume that relativistic effects are negligible; this is not true e.g. for GPS but is (at least approximately) true e.g. for most architectural purposes.
The Cobham–Edmonds thesis might seem to admit obvious counterexamples. For instance, an algorithm that takes n^100 time seems obviously infeasible. (For that runtime to be meaningful, it would have to be associated with some privileged subset of the ‘reasonable’ machine-class above; a good example is physically instantiated computers.) Less obviously, there is some reason to suspect that some problems that likely have no polynomial-time algorithm are feasible. In this case, we might nevertheless hold that, most of the time, the Cobham–Edmonds thesis is correct. For instance, there are surprisingly few ‘galactic’ algorithms, i.e. algorithms with polynomial asymptotic runtime that in practice are unusable; see e.g. Helfgott (Isomorphismes de graphes en temps quasi-polynomial (d’après Babai et Luks, Weisfeiler-Leman, ...)) and Lipton and Regan (David Johnson). We can therefore provisionally take the Cobham–Edmonds thesis to be a (defeasible) heuristic in the sense above.
What has this to do with language or cognition? There are, broadly, I think, two sorts of argument that relate asymptotic analysis to language.
Argument 4.3. The conceptual argument for asymptotic analysis of language
- 27 April 2026
- J.P. Loo
Certain processes are constitutive of human language: production, comprehension and acquisition. These are cognitive processes that map from inputs to outputs—e.g. from acoustic or visual signals to representations of linguistic information; they are ipso facto computations.
Ristad (Game: 1) seems to offer something like this argument.
So do Clark and Lappin (Nativism: § 4.1) in the context of language acquisition.
[T]he child is, among other things, an information processing system, and is subject to the laws that govern computational systems. By studying the theoretical bounds on learning systems, we can make important inferences about the limiting conditions of the empirical question. To take a simple analogy, the problem of determining how birds fly is an empirical matter, but it is also clear that the principles of aerodynamics are highly relevant to this study. The bird is subject to the laws of physics, and no plausible account will violate those laws. We can reject out of hand any explanation that does not conform to the principles of aerodynamics.
The invitation, then, is to reject out of hand any account of language or linguistic phenomena that does not conform, putatively, to the laws of computation, whence asymptotic analysis.
Argument 4.4. The physicalist argument for asymptotic analysis of language
- 27 April 2026
- J.P. Loo
The physicalist argument proceeds in four steps, the latter three from Rooij et al. (Cognition: 14). First, we should study language in terms of cognitive processes (e.g. acquisition and production). Second, postulation of some cognitive capacity entails commitment to some explanation as to how that capacity is realised. Third, physically feasible explanations of cognition must be ‘computable and tractable’. Fourth, by the invariance and Cobham–Edmonds theses, such explanations must (to a first approximation) be in polynomial time.
For the sake of argument, I will assume that these arguments are right, and that, therefore, ‘language computations’ are amenable to asymptotic analysis.
Is the application of asymptotic analysis to cognition and language an idealisation? First, I shall give an example of computational analysis that plausibly is not an idealisation.
Let us suppose, with Searle (Rediscovery: 200), that
[T]here [is] some description of the brain such that under that description you could do a computational simulation of the operations of the brain…given Church's thesis that anything that can be given a precise enough characterization as a set of steps can be simulated on a digital computer…in the same sense in which weather systems, the behavior of the New York stock market, or the pattern of airline flights over Latin America can.
Turing (Computability: § 11) showed that no Turing machine decides the Entscheidungsproblem. If we follow Searle’s rendering of Church’s thesis, it follows that the brain does not solve the Entscheidungsproblem either. I suggest there is no obvious idealisation here. (Of course, the argument might be unsound; but, on its intended reading, it would then be false outright, not approximately true in the way idealisations usually are.) In other words, a fairly natural computationalist view is that these results from computability theory apply straightforwardly to computation generally, and, therefore, to cognition—and so to language computations.
However, in a considerable number of the arguments to follow, some antecedent assumptions seem to be false, and, therefore, the arguments will have to be parsed as idealisations. To say that the antecedent assumptions are false is not to deny computationalism. Suppose we are given a learning task over some stream of data x₁, x₂, … (e.g. linguistic input). It may turn out that the probability distributions from which x₁ and x₂, for instance, are drawn are not independent. For simplicity, however, we may assume that x₁, x₂, … are i.i.d. variables; making this assumption, we may find some asymptotic bound on the learning task. There is nothing inherently objectionable about the i.i.d. assumption, but it must be assessed as an idealised antecedent assumption, in addition to the computational model of language computations supposed.
Note 5. Some varieties of idealisation [001F]
- 27 April 2026
I shall take idealisation to be the ‘intentional introduction of distortion into scientific theories’ (Weisberg, Idealization: 639) (presumably with non-malicious intent). The accuracy or realism of an idealisation is not, per se, necessarily of interest: it might mislead as to the predictive inaccuracy entailed by the idealisation (Friedman, Methodology: § III).
Example. Acceleration in a vacuüm and in the atmosphere
- 27 April 2026
- J.P. Loo
In certain circumstances, we idealise the acceleration of a body on earth as if it were dropped in a vacuüm.
We could test the hypothesis by comparing the air pressure in a vacuüm and on earth. But this is ‘a foolish question by itself’. If we are dropping a ball from a building, its acceleration probably will closely approximate that in a vacuüm; but not a feather. So the putative accuracy of the antecedent idealisation is not, per se, a good guide to its legitimacy or usefulness. We should also consider the predictive inaccuracy it introduces.
Friedman says that predictive accuracy is the ‘only relevant standard of comparison’. We need not go that far to appreciate the point that it is a relevant standard of comparison, and that the accuracy of the idealisation per se is not always relevant.
One lesson to draw from that example is that the purpose of an idealisation matters. Weisberg (Idealization: § 1) distinguishes ‘Galilean’ and ‘minimalist’ idealisation. The former distorts theories to simplify them, for computational tractability—‘to get traction on the problem’ on pragmatic grounds. The latter constructs and studies theoretical models containing only the ‘core causal factors which give rise to a phenomenon’. In general, idealisations’ purposes are representational ideals, e.g. (Weisberg, Idealization: § 2)
inclusion rules [which] tell the theorist which kinds of properties of the phenomenon [are] of interest[; and] fidelity rules [concerning] the degrees of precision and accuracy with which each part of the model is to be judged.
For instance, completeness requires inclusion of all the properties of the target phenomenon, anything external giving rise to those properties, and an analogue of any structural or causal relationship within the model, all with arbitrary accuracy and precision. Simplicity can e.g. serve pedagogical purposes, or show the minimal conditions required to generate some property. Isolating ‘only…the factors that made a difference’ can help us to formulate or analyse more complex models. Predictive accuracy may be practically useful. And so on.
6. Identification in the limit [001G]
- 27 April 2026
7. An asymptotic argument for compositionality [001H]
- 27 April 2026
Reference 7.1. Compositionality, computability, and complexity [pagin2021]
Definition 7.1.1. Grammatical term algebra (§ 2.1) [pagin2021-gta]
- September 2021
- Peter Pagin
A grammatical term algebra of a language L is a partial algebra (T, A, Σ), where—
- T is the set of grammatical terms for L,
- A ⊆ T is a finite set of atomic (grammatical) terms for L, and
- Σ is a finite set of syntactic operations for L. T is the closure of A under Σ.
Definition 7.1.2. Meaning operations (§ 2.2) [pagin2021-meaning-operation]
- September 2021
- Peter Pagin
A meaning operation over a domain M of meanings is some (possibly partial) function from M^n to M, for some arity n.
Definition 7.1.3. Semantic functions (§ 2.2) [pagin2021-semantic-function]
- September 2021
- Peter Pagin
Where (T, A, Σ) is a grammatical term algebra and M is a domain of meanings, any function μ from T to M is a semantic function.
Definition 7.1.4. Standard compositionality (§ 2.2) [pagin2021-standard-compositionality]
- September 2021
- Peter Pagin
Given a grammatical term algebra (T, A, Σ) and domain M of meanings, a semantic function μ is standard compositional iff for every n-ary operation σ ∈ Σ there is a meaning operation r_σ such that for any t₁, …, tₙ for which σ(t₁, …, tₙ) is defined, μ(σ(t₁, …, tₙ)) = r_σ(μ(t₁), …, μ(tₙ)).
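As an illustration (mine, not Pagin’s), here is a toy semantics in which each syntactic operation is paired with a fixed meaning operation, so the semantic function is standard compositional; the atomic meanings and operation names are hypothetical.

```python
# A toy standard-compositional semantics: the meaning of a complex term is a
# fixed function of the meanings of its immediate parts.
MEANINGS = {"rain": True, "snow": False}  # hypothetical atomic meanings

R = {  # one meaning operation r_sigma per syntactic operation sigma
    "and": lambda m1, m2: m1 and m2,
    "not": lambda m: not m,
}

def mu(term):
    """Semantic function: atoms look up their meaning; complex terms apply r_sigma
    to the meanings of their parts."""
    if isinstance(term, str):
        return MEANINGS[term]
    op, *args = term
    return R[op](*(mu(t) for t in args))
```

For instance, mu(("and", "rain", ("not", "snow"))) is computed entirely from the meanings of the parts, never from the terms themselves.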
Definition 7.1.5. General compositionality (§ 2.3) [pagin2021-general-compositionality]
- September 2021
- Peter Pagin
A semantic system is general compositional iff for every n-ary operation σ ∈ Σ and every member μ of the system there is a meaning operation r such that for any grammatical terms t₁, …, tₙ for which σ(t₁, …, tₙ) is defined, μ(σ(t₁, …, tₙ)) = r(μ₁(t₁), …, μₙ(tₙ)), where each μᵢ is the semantic function selected for the ith argument position.
Definition 7.1.6. Meaning algebras (§ 3) [pagin2021-meaning-algebra]
- September 2021
- Peter Pagin
A meaning algebra over a set M of meanings is a triple (M, B, O), where—
- B ⊆ M is a finite set of basic meanings,
- O is a finite set of elementary meaning operations,
- M closes under O, and
- for all and , and, if then , , and for .
Definition 7.1.7. Semantic systems (§ 3) [pagin2021-semantic-system]
- September 2021
- Peter Pagin
Given a grammatical term algebra and domain of meanings, a semantic system is a triple where—
- is a finite set of semantic functions , , and
- is a partial selection function.
Given some semantic function μ and a syntactic operation σ, the selection function gives another semantic function for each ith-position term.
Finally, .
Definition 7.1.8. Semantic algebras (§ 3) [pagin2021-semantic-algebra]
- September 2021
- Peter Pagin
A semantic algebra over a language and domain of meanings is a triple where—
- is a grammatical term algebra for , and
- is a meaning algebra for .
Definition 7.1.9. Size (§ 5) [pagin2021-size]
- September 2021
- Peter Pagin
The size of a term is the number of atoms and operators contained within it.
Definition 7.1.10. Input terms (§ 5) [pagin2021-input-terms]
- September 2021
- Peter Pagin
The input terms of a rewriting system are of the form ‘μ(t)’, where t is a syntactic term and μ is a semantic function symbol.
Definition 7.1.11. Input complexity (§ 5) [pagin2021-input-complexity]
- September 2021
- Peter Pagin
Let t be a normal term.
The term complexity of t relative to an input term is the length of the shortest derivation from that input term to t.
The input complexity c(n) is the maximal term complexity over input terms of size n.
Definition 7.1.12. Term rewrite systems (§ 6.1) [pagin2021-trs]
- September 2021
- Peter Pagin
A term rewrite system (TRS) is a pair (Σ, R) where—
- Σ is a signature comprising a finite set of n-ary operators (including nullary constants), and
- R is a set of reductions over the set of terms over Σ.
We will write s →_R t (omitting the subscript where context makes it obvious) when the application of some rule in R to s yields the contractum t.
A sequence of applications is a derivation. We write s →* t, where →* is the transitive closure of →.
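A minimal executable sketch of term rewriting (mine, not Pagin’s), with two hypothetical rules for addition over Peano-style terms; it also illustrates derivations and normal forms:

```python
# Terms are 'z' (zero), ('s', t) (successor), or ('add', t, u).
# Hypothetical rules: add(z, y) -> y; add(s(x), y) -> s(add(x, y)).

def rewrite_once(t):
    """Apply one rule, at the root if possible, else in a subterm; report success."""
    if isinstance(t, tuple) and t[0] == 'add':
        _, x, y = t
        if x == 'z':
            return y, True
        if isinstance(x, tuple) and x[0] == 's':
            return ('s', ('add', x[1], y)), True
        x2, done = rewrite_once(x)
        if done:
            return ('add', x2, y), True
        y2, done = rewrite_once(y)
        if done:
            return ('add', x, y2), True
    if isinstance(t, tuple) and t[0] == 's':
        x2, done = rewrite_once(t[1])
        if done:
            return ('s', x2), True
    return t, False

def normalise(t):
    """Derive until no rule applies: the result is in normal form."""
    steps = 0
    changed = True
    while changed:
        t, changed = rewrite_once(t)
        steps += changed
    return t, steps
```

For example, normalise(('add', ('s', 'z'), ('s', ('s', 'z')))) derives 1 + 2 down to the normal form for 3 in two steps. This little system terminates and is confluent, so it is convergent in the sense of the definitions below.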
Definition 7.1.12.1. Conditional term rewrite systems (§ 6.1) [pagin2021-ctrs]
- September 2021
- Peter Pagin
In a conditional term rewrite system, the reductions are conditioned on some set of conditions that must be satisfied for a rule to apply.
Definition 7.1.12.2. Many-sorted term rewrite systems (§ 6.1) [pagin2021-many-sorted-trs]
- September 2021
- Peter Pagin
In a many-sorted term rewrite system, the set of terms is partitioned into subsets corresponding to sorts, and argument places for operators have sort restrictions; complex terms whose subterms satisfy the sort restrictions are well-sorted (corresponding to grammaticality).
Definition 7.1.12.3. Normal form (§ 6.1) [pagin2021-normal-form]
- September 2021
- Peter Pagin
A term that cannot be reduced further is in normal form.
Definition 7.1.12.4. Termination (§ 6.1) [pagin2021-terminate]
- September 2021
- Peter Pagin
If every derivation leads to a term in normal form, a system terminates.
Definition 7.1.12.5. Confluence (§ 6.1) [pagin2021-confluence]
- September 2021
- Peter Pagin
A TRS is confluent iff, for all terms t, u, v: whenever t →* u and t →* v, there is some term w such that u →* w and v →* w.
Definition 7.1.12.6. Convergence (§ 6.1) [pagin2021-convergence]
- September 2021
- Peter Pagin
A confluent system that terminates is convergent.
Definition 7.1.13. -systems (§ 6.1) [pagin2021-mu-system]
- September 2021
- Peter Pagin
A -system is a pair where—
- (an object- and meta-language signature),
- where is a set of atomic terms and is a set of -ary syntactic operators, so that the closure of under yields the grammatical term algebra of OL,
- where—
- is a nonempty set of atomic meta-language expressions,
- is a nonempty set of meta-language syntactic operators,
- is a nonempty set of elementary meta-language function symbols,
- is a possibly empty set of meta-language recursive function symbols over object language terms, and
- is a possibly empty set of meta-language recursive function symbols over meta-language terms.
Remark 7.1.13.1.
- September 2021
- Peter Pagin
The canonical expressions of ML, , close under , canonically express meanings, do not contain any operators or / functors, and rewrite derivations terminate with them.
Remark 7.1.13.2.
- September 2021
- Peter Pagin
The set of the semantic vocabulary of contains at least the semantic function symbol and is finite.
Remark 7.1.13.3.
- September 2021
- Peter Pagin
The set of -terms is the set of input terms to derivations, given by applying a semantic function symbol to an OL term.
Remark 7.1.13.4.
- September 2021
- Peter Pagin
and contain additional function symbols that do not represent semantic functions, are not inputs to any rules in , but can take arguments from , , and .
Remark 7.1.13.5.
- September 2021
- Peter Pagin
The set of meta-linguistic terms of is the closure of under .
Definition 7.1.14. Complex indirect constant -systems (§ 8.3.4) [pagin2021-CIC]
- September 2021
- Peter Pagin
A complex indirect constant -system must satisfy the following.
- is finite.
- For any rule, the rewrite variables on the rhs must be a subset of those on the left.
- Rewrite variables take all and only object language grammatical terms as instances.
- Rewrite variables take all and only expressions in as instances.
- Every atomic rule is of the form where , and is a simple or complex expression in .
- For every pair there is an atomic rule.
- Every direct complex rule has the form where Gr corresponds to grammaticality.
- Every indirect complex rule has the form where are optional and .
- For every pair , there is a unique complex (indirect or direct) rule.
- Every indirect ground rule is of the -form or the -form where , , are operators over , and , and at most one of the primed operators is null.
- Every indirect recursive rule in has an -form or -form where , , are operators over , , and at most one of the primed operators is null.
Proposition 7.1.15. [pagin2021-cic-terminate]
- September 2021
- Peter Pagin
Complex indirect constant systems are convergent (§ 8.3.4).
Corollary 7.1.16. [pagin2021-mdl]
- September 2021
- Peter Pagin
Given a complex indirect constant -system, for any term , there is some minimum derivation length to a unique normal form.
Proposition 7.1.17. [pagin2021-general]
- September 2021
- Peter Pagin
Let be any complex indirect constant -system satisfying the following.
- For every , there are object-language atoms such that for every , where and , for some canonical expression , , and .
- The complex rule for is
- There are rules and for each .
Then the system is intractable.
Lemma 7.1.17.1.
- September 2021
- Peter Pagin
Fix some and . Define
Moreover, define as the number of atomic leaves of .
For every object-language subterm of ,
Proof.
- September 2021
- Peter Pagin
By structural induction on .
If , has subterm , which must be reduced to eliminate to obtain a canonical form. Moreover, there is only one atomic leaf. So
Suppose . Normalisation requires elimination of the root of . Since the third argument has head , only will do so. Every derivation (minimum-length or otherwise) will have to apply that rule at some point. Without loss of generality, apply it at the beginning. Then Note is terminal by definition of complex indirect constant -systems. So
Proof. 7.1.17.2.
- September 2021
- Peter Pagin
Now consider a minimum-length derivation of . There is some minimum-length derivation that begins whence whence .