CIv7-ECA: Structural Break Detection via Symbolic Substrate Compression and Fault Geometry
Hypothesis
Structural breaks in univariate time series can be robustly detected by encoding the input as symbolic sequences (e.g., via permutation entropy or delta-sign encoding), evolving these sequences through Elementary Cellular Automata (ECA), and analysing the resulting 2D symbolic evolution as an algorithmic and topological substrate (a minimal sketch of this pipeline follows the list below). This substrate exposes latent causal dynamics, semantic irregularities, and algorithmic fault lines. Structural transitions manifest as discontinuities in:
- Algorithmic compressibility (e.g., via BDM or CTM)
- Joint compression failure (predictive collapse à la Sutskever)
- Torsion and persistent topological invariants (Walch)
- Motif circuit reconfiguration and symbolic attention rewiring (Circuit Tracer analogy)
- Local motif entropy shifts and permutation entropy gradients
- Collapse of symbolic attractors and bifurcation points in 2D dynamics
- Failure of logical boundary preservation (torsion and curvature loss per Hodge)
- Reality signal collapse (based on perceptual vs. imagined structure loss)
- Prediction breakdown at the edge of chaos (Zhang et al.)
- Simulability gaps in world models built from symbolic evolution (Ha & Schmidhuber)
- Emergence of intelligent representation capacity from non-intelligent but complex symbolic sources (Zhang et al.)
- Reversible inference layers from local symbolic MCMC transitions (Vivier-Ardisson et al.)
- Self-replicating symbolic agent emergence in open-ended improvement loops (Darwin Gödel Machine)
- LLM-guided symbolic alpha search and functional meta-discovery (AlphaEvolve, DeepMind)
- Symbolic rule factorisation and causal primes behind complexity emergence (Riedel & Zenil)
- Compression-meaning divergence between human and LLM semantics in cluster formation and fidelity trade-offs (Shani et al.)
- Human-guided symbolic segmentation and conceptual scaffolding for efficient reasoning emergence (OpenThoughts)
- Negative complexity zones from local motif degeneracy and symmetry folding (Grosse et al.)
- Symbolic fault geometry as hybrid MDL-complexity ridge between manifold sectors (Grünwald & Roos)
- Dynamic curriculum-guided symbolic evolution using SFT-RL hybrid training (SASR)
- Curriculum-aware adaptive strategy switching for symbolic exploration vs. memorisation (Chen et al., SASR)
- Multi-agent R&D orchestration for factor-model co-evolution in symbolic financial substrates (RD-Agent(Q))
- Self-supervised symbolic encodings across structural graph domains with transferable motif-awareness (GFSE, Chen et al.)
- Causal fidelity loss from symbolic information bottlenecks in compression vs. meaning trade-offs (Shani et al.)
- Step-wise adaptation of symbolic reasoning dynamics using gradient-informed switching (SASR)
- Bandit-driven symbolic substrate scheduling for hypothesis refinement (RD-Agent(Q))
- Universal motif encoders for symbolic graph-to-sequence compression (GFSE)
- Latent alignment instabilities and symbolic leakage paths across representation manifolds (Jha et al., vec2vec)
The symbolic substrate is not merely representational—it is causally expressive, revealing failure surfaces where data-driven structure collapses. The substrate thus functions as a model-agnostic detection layer for algorithmic, topological, and semantic regime shifts.
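A minimal sketch of the encoding-and-evolution pipeline is given below, assuming a delta-sign encoder and a fixed ECA rule (rule 110 is used here purely as an illustrative Class 4 rule); the function names, parameters, and synthetic break location are illustrative, not taken from any cited work.

```python
import numpy as np

def delta_sign_encode(x):
    """Binarise a univariate series by the sign of its first differences."""
    return (np.diff(x) > 0).astype(np.uint8)

def eca_step(state, rule):
    """One synchronous update of an elementary CA with periodic boundaries."""
    left = np.roll(state, 1)
    right = np.roll(state, -1)
    idx = (left << 2) | (state << 1) | right      # neighbourhood code 0..7
    table = (rule >> np.arange(8)) & 1            # Wolfram rule table, bit k -> output for code k
    return table[idx].astype(np.uint8)

def evolve_substrate(x, rule=110, steps=64):
    """Encode a series and evolve it into a 2D symbolic substrate of shape (steps, T-1)."""
    state = delta_sign_encode(np.asarray(x, dtype=float))
    rows = [state]
    for _ in range(steps - 1):
        state = eca_step(state, rule)
        rows.append(state)
    return np.stack(rows)

# Example: a synthetic series with a variance break at t = 500
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 500), rng.normal(0, 3, 500)])
substrate = evolve_substrate(series, rule=110, steps=64)
print(substrate.shape)   # (64, 999)
```

The break-detection measures listed above (compressibility, motif entropy, topological invariants) are then computed column-wise or block-wise over this substrate to locate candidate fault lines.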
Rationale
- Sutskever’s theory of unsupervised learning demonstrates that compression is prediction; joint compression failure between segments of a symbolic time series reflects true structural discontinuity (a compression-gap and BDM sketch follows this list).
- Walch’s torsion analysis confirms that fragile, high-dimensional topological signatures collapse under even minor perturbations—making torsion an excellent early-warning detector.
- Sakabe et al. show that BDM captures algorithmic changes in symbolic representations better than Shannon entropy, making it ideal for symbolic phase-shift detection in ECA substrates.
- Anthropic’s Circuit Tracer demonstrates that influence graphs can localise semantic drift in LLMs; we extend this metaphor to symbolic motifs in ECA evolution, where rewiring of symbolic circuits signals change.
- Grosse et al. provide a geometric generalisation of Occam’s Razor showing how singular models can have negative complexity; in symbolic substrates, degeneracy or symmetry in motif transitions may represent low-complexity zones that are broken by regime change.
- Grünwald & Roos (2019) offer a modern MDL theory emphasising model selection and predictive robustness via universal distributions. Their use of normalised maximum likelihood (NML) and luckiness-weighted codes supports our treatment of symbolic substrates as sequential predictors whose compressibility divergence reflects structural breakpoints.
- BrightStar Labs’ Emergent Models (EMs) use CA-like substrates for generalisation via initial-state optimisation. Our use is inverse: symbolic inputs are evolved under fixed rules, making the substrate a diagnostic canvas rather than a policy machine.
- OpenThoughts (2024) emphasises the critical role of structured symbolic input for training high-fidelity reasoning systems; our substrate acts as a pre-LLM semantic preprocessor.
- Dijkstra et al. (2025), on the neural basis of reality monitoring, show that additive imagery-perception confusion maps directly onto symbolic substrate failures, where local motif reinforcement mimics perceptual signal misclassification.
- Zhang et al. (2024) explore intelligence at the edge of chaos, showing that exposure to structured yet unpredictable systems (like Class IV ECAs) enables models to develop generalisable reasoning capabilities. ECA-evolved substrates inhabit this complexity sweet spot—structured enough to encode causality, complex enough to force predictive reasoning.
- Ha & Schmidhuber’s World Models highlight the importance of a compressed latent space capable of simulating the environment. ECA substrates can similarly act as symbolic generative scaffolds, allowing detection of divergences between real and simulated dynamics.
- Vivier-Ardisson et al. (2025) demonstrate that symbolic substrates based on local MCMC layers can serve as reversible reasoning scaffolds. The symbolic dynamics of ECAs can be understood as heuristic proposal distributions whose temperature-driven collapse or bifurcation traces the onset of algorithmic phase transitions.
- Darwin Gödel Machine (2024) introduces a framework for autonomous, open-ended self-improvement through empirical feedback loops. The symbolic agent maintains a lineage of increasingly capable variants via archive-driven selection and self-modification, mirroring symbolic motif evolution within the ECA substrate.
- AlphaEvolve (2025) demonstrates the potential of LLM-guided, feedback-grounded evolutionary code improvement. Within symbolic substrates, Alpha-style LLM agents can discover compressible, high-performance motif configurations that exceed static design patterns.
- Riedel & Zenil (2025) introduce the concept of ECA rule primality, showing that all Class 4 (complex) ECA rules are composites of simpler Class 1 or 2 primes. Their causal decomposition framework demonstrates that complexity in symbolic dynamics emerges from non-commutative compositions of simpler primitives—providing an algebraic foundation for fault geometry and symbolic fault line inference. These prime rules, especially diagonal shifters like 15 and 170, function as symbolic basis vectors whose combinatorics drive phase transitions in the ECA evolution.
- Shani et al. (2024) reveal how LLMs favour statistically efficient clustering at the expense of human-aligned typicality and semantic richness. This compression-meaning trade-off becomes visible as a form of structural distortion in symbolic substrates, where meaning-preserving clusters are sacrificed for global compactness.
- Chen et al. (2024) present a hybrid SFT-RL framework (SASR) that guides training via gradient-informed transitions between learning modes. This idea parallels symbolic evolution driven by motif divergence minimisation, where symbolic substrates adaptively switch between memorisation (fine-tuning) and generalisation (policy exploration).
- RD-Agent(Q) (2024) demonstrates that a multi-agent R&D framework for financial factor and model co-evolution outperforms monolithic baselines via closed-loop synthesis, validation, and scheduling. This architecture offers a template for symbolic substrate orchestration where hypothesis generation, code realisation, and fault pattern evaluation are delegated to symbolic agents whose interactions mirror substrate evolution.
- GFSE (Chen et al., 2024) establishes a universal graph encoder that captures domain-agnostic structural patterns across graphs. When integrated with symbolic substrates, GFSE offers structure-preserving positional encodings, transferable across motif classes, enabling motif-aware compressibility analysis and topological fault detection in hybrid graph-symbolic systems.
- Jha et al. (2024) demonstrate that embedding spaces across different models share a universal geometric alignment, even without paired data. This implies that symbolic motifs, once encoded in a universal latent manifold, may be projected across architectures while retaining semantic fidelity. The vec2vec framework highlights potential leakage pathways and causal geometry transfer that reveal otherwise hidden symbolic structure breaks.
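Two of the signals invoked above, joint-compression failure (Sutskever) and BDM shifts (Sakabe et al.), can be probed directly on the substrate. The sketch below is illustrative only: zlib stands in for a stronger general-purpose compressor, the `pybdm` package is assumed to be available for Block Decomposition Method estimates, `evolve_substrate` refers to the helper sketched in the Hypothesis section, and the window sizes and split points are arbitrary.

```python
import zlib
import numpy as np
from pybdm import BDM   # assumed dependency: the pybdm package (pip install pybdm)

def compression_gap(a, b):
    """Joint-vs-separate compression of two substrate segments.
    A clearly positive gap suggests the segments lack a shared description."""
    ca = len(zlib.compress(a.tobytes(), 9))
    cb = len(zlib.compress(b.tobytes(), 9))
    joint = np.concatenate([a.ravel(), b.ravel()])
    cab = len(zlib.compress(joint.tobytes(), 9))
    return cab - (ca + cb)

def bdm_profile(substrate, window=64, stride=16):
    """Sliding-window BDM over the 2D substrate; jumps in the profile
    mark candidate algorithmic fault lines."""
    bdm = BDM(ndim=2)
    scores = []
    for start in range(0, substrate.shape[1] - window + 1, stride):
        block = substrate[:, start:start + window].astype(int)
        scores.append(bdm.bdm(block))
    return np.array(scores)

# Example (using the substrate built in the earlier sketch):
# substrate = evolve_substrate(series, rule=110, steps=64)
# print("compression gap:", compression_gap(substrate[:, :480], substrate[:, 520:]))
# profile = bdm_profile(substrate)
# print("largest BDM jump near window:", int(np.argmax(np.abs(np.diff(profile)))))
```

Both quantities are segment-level scores; scanning them over candidate split points yields a breakpoint profile comparable across the other signals listed in the Hypothesis section.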
Supporting Literature
- Zenil et al. – Algorithmic Information Dynamics
- BrightStar Labs (2025) – Emergent Models and symbolic computation substrates
- Sakabe et al. – Attribution Drift from symbolic perturbations
- Maria Walch (2024) – Torsion as topological signal in high-dimensional symbolic dynamics
- Ilya Sutskever (2024) – Unsupervised Learning as Compression (joint vs. separate encoding)
- Anthropic (2025) – Circuit Tracer and latent attribution path rewiring
- Grosse et al. – Occam’s Razor in Geometric-Algorithmic Deep Models
- Peter Grünwald & Teemu Roos (2019) – Minimum Description Length Revisited
- OpenThoughts (2024) – Recipes for compositional reasoning over symbolic sequences
- Dijkstra et al. (2025) – Neural Basis for Reality Signal Formation and Binary Judgement
- Zhang et al. (2024) – Intelligence at the Edge of Chaos
- Ha & Schmidhuber – World Models and Evolutionary Symbol Machines
- Zhang et al. – DGM and Self-Modifying Substrate Architectures
- Vivier-Ardisson et al. – Physics of Learning & MCMC Layers
- DeepMind (2025) – AlphaEvolve: LLM-supervised evolutionary algorithm discovery
- Riedel & Zenil (2025) – Rule Primality, Turing Composition, and Causal Decomposition in ECA
- Shani et al. (2024) – From Tokens to Thoughts: Compression-Meaning Divergence in LLM vs Human Concepts
- Chen et al. (2024) – SASR: Adaptive Integration of SFT and RL
- Li et al. (2024) – RD-Agent(Q): Multi-Agent System for Symbolic Financial Factor-Model Co-Design
- Chen et al. (2024) – GFSE: Universal Graph Structural Encoder for Cross-Domain Symbolic Motif Integration
- Jha et al. (2024) – Universal Geometry of Embeddings and Platonic Alignment