Cybernetic Intelligence

An open exploration of viable human-AI systems.

The Full ECA “Linguistic” Curriculum

This hierarchy maps the complexity of language to the complexity of ECA dynamics. Our goal is to progressively teach our SymbolicEncoder to recognize increasingly sophisticated structures.

| Linguistic Level | ECA Concept | What the Model Learns | Current Status |
| --- | --- | --- | --- |
| Alphabet | Single prime rules (e.g., R15, R170) | Base Signatures: to recognize the fundamental, irreducible patterns of a stable system. | IMPLEMENTED |
| Words | Single-timestep compositions (e.g., R_B(R_A(state))) | Causal Algebra: to understand that complex dynamics can emerge from the immediate interaction of simpler rules. | IMPLEMENTED |
| Sentences | Sequential compositions over time (e.g., [Rule A for 20 steps] -> [Rule B for 20 steps]) | Temporal Breakpoint Recognition: to identify the “seam”, or transition point, where one stable dynamic regime changes to another. | PLANNED (Next Step) |
| Paragraphs | Multi-stage sequential compositions (e.g., A -> B -> C -> A) | Programmatic & Cyclic Reasoning: to understand that a system can follow a longer “program” of rule changes and may even return to previous states. | FUTURE WORK |
| Essays | Hierarchical or conditional compositions | Abstract & Conditional Logic: to understand that the rule governing the system might depend on a higher-level state (e.g., if global_entropy > threshold, use Program 1, else use Program 2). | RESEARCH LEVEL |
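
To make the “Words” level concrete, the sketch below checks that composing two prime rules within a single timestep reproduces a known composite. This is an illustration only, assuming cellpylib’s convention that the timesteps argument counts the initial condition (so timesteps=2 advances the automaton by exactly one step): applying Rule 170 and then Rule 15 should match a single step of Rule 51, the same composite used in the training curriculum further down.

import numpy as np
import cellpylib as cpl

WIDTH = 64
init = cpl.init_random(WIDTH, k=2)

def one_step(state, rule):
    # timesteps=2 -> the initial condition plus exactly one evolved state
    return cpl.evolve(state, timesteps=2,
                      apply_rule=lambda n, c, t: cpl.nks_rule(n, rule))[-1:]

composed = one_step(one_step(init, 170), 15)   # Rule 170, then Rule 15, within one "timestep"
direct = one_step(init, 51)                    # Rule 51 applied directly
assert np.array_equal(composed, direct)        # the composition reproduces Rule 51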

Why This Matters for Our Project

The Strategic Decision: An Iterative Curriculum

Our current plan is sound. We are following a logical, iterative curriculum:

  1. Current Step: Master the “alphabet” and “words.” We are currently running the script to confirm our Bi-GRU can learn these fundamentals.
  2. Next Logical Step: If the current pre-training is successful, our very next iteration will be to upgrade the ECADataset to include “Sentences”. We will generate data that contains explicit rule changes over time and continue training our expert encoder on this more challenging task (a minimal sketch of such a generator follows below).
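
To make the “Sentences” upgrade concrete, here is a minimal sketch of what such a data generator could look like. It is not part of the current codebase: the helper name generate_sentence is hypothetical, and it assumes cellpylib’s convention that the timesteps count includes the initial condition.

import numpy as np
import cellpylib as cpl

def generate_sentence(width, rule_a, rule_b, steps_per_rule=20):
    """Hypothetical 'Sentence' sample: rule_a for N steps, then rule_b for N steps."""
    state = cpl.init_random(width, k=2)          # shape (1, width)
    history = [state[0]]
    for rule in (rule_a, rule_b):
        for _ in range(steps_per_rule):
            # timesteps=2 -> exactly one new state beyond the current one
            state = cpl.evolve(state, timesteps=2,
                               apply_rule=lambda n, c, t: cpl.nks_rule(n, rule))[-1:]
            history.append(state[0])
    return np.array(history)                     # shape (2 * steps_per_rule + 1, width)

# e.g. Rule 30 for 20 steps, then Rule 110 for 20 steps; the regime "seam" sits after row 20
sample = generate_sentence(64, 30, 110)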

Conclusion: You are correct. We have a clear path to make our Symbolic Brain even smarter. However, following our iterative principle, we must first confirm that it can master the current, foundational curriculum. The results of the script currently running will tell us if we are ready to move on to teaching it “sentences.”

Of course. Let’s proceed.

This is the definitive script for Phase 1: Pre-training the Symbolic Brain. It is a direct and faithful implementation of the advanced, curriculum-based approach we designed, synthesizing the key findings from both the Riedel/Zenil and Burtsev/Zhang research.

The Plan for This Script:

  1. The Curriculum: It defines a multi-level curriculum of ECA rules:
    • Prime Rules (The Alphabet): A set of fundamental building-block rules.
    • Composite Rules (The Words): New, more complex rules created by composing primes in a single timestep (R_B(R_A(state))).
    • Unseen Validation Rules: A separate set of rules the model will never see during training, to rigorously test its ability to generalize.
  2. The Data Generator: It uses a new ECACompositionDataset to generate training samples for this rich curriculum.
  3. The Model: It uses our winning BiGruSymbolicEncoder architecture (implemented below as BiGruCausalEncoder), now equipped with two output heads to handle the multi-task objective.
  4. The Training Objective: The model will be trained to simultaneously:
    • Predict the next state of the ECA evolution.
    • Predict the abstract Rule ID that generated the sequence.
  5. The Goal: To produce a highly expert SymbolicEncoder that has learned the “algebra of rules” and save its weights as symbolic_encoder_expert.pth.

Definitive Pre-training Script for the Causal Rule Inference Engine

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset, ConcatDataset
import numpy as np
import time
import random
import os
from tqdm.notebook import tqdm
from google.colab import drive
import collections

# --- Ensure cellpylib is installed ---
try:
    import cellpylib as cpl
except ImportError:
    print("Installing cellpylib...")
    import subprocess, sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", "cellpylib"])
    import cellpylib as cpl

# --- 1. Setup ---
def seed_everything(seed_value):
    random.seed(seed_value)
    os.environ['PYTHONHASHSEED'] = str(seed_value)
    np.random.seed(seed_value)
    torch.manual_seed(seed_value)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed_value)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

SEED = 42
seed_everything(SEED)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
drive.mount('/content/drive', force_remount=True)
MODEL_CACHE_DIR = '/content/drive/My Drive/adia_challenge_final/model_cache'
os.makedirs(MODEL_CACHE_DIR, exist_ok=True)

print(f"--- Phase 1: Pre-training the Symbolic Brain (Definitive Version) ---")
print(f"Using device: {device}\n")

# --- 2. The Advanced ECA Curriculum & Data Generator ---

def _apply_composition(init_state, rules):
    """Applies each rule in `rules` for one step, in order, within a single logical timestep."""
    current_state = init_state
    for rule in rules:
        # cellpylib's `timesteps` count includes the initial condition,
        # so timesteps=2 advances the automaton by exactly one step
        current_state = cpl.evolve(current_state, timesteps=2,
                                   apply_rule=lambda n, c, t: cpl.nks_rule(n, rule))[-1:]
    return current_state

class ECACompositionDataset(Dataset):
    def __init__(self, samples_per_rule, rules_config, width, timesteps, rule_to_id_map):
        self.data, self.state_targets, self.rule_labels = [], [], []
        self.rule_to_id_map = rule_to_id_map
        
        print(f"Generating data for {len(rules_config)} rule types...")
        pbar = tqdm(rules_config.items(), desc="Generating ECA Curriculum")
        for rule_key, rule_def in pbar:
            pbar.set_postfix_str(f"Rule: {rule_key}")
            rule_id = self.rule_to_id_map[rule_key]
            for _ in range(samples_per_rule):
                init_cond = cpl.init_random(width, k=2)  # shape (1, width)

                # Evolve the system step-by-step, keeping each state as a 1-D row
                evolution = [init_cond[0]]
                current_state = init_cond
                for _ in range(timesteps):
                    if rule_def['type'] == 'base':
                        base_rule = rule_def['rules'][0]
                        # timesteps=2 -> exactly one new state (cellpylib counts the initial condition)
                        current_state = cpl.evolve(current_state, timesteps=2,
                                                   apply_rule=lambda n, c, t: cpl.nks_rule(n, base_rule))[-1:]
                    elif rule_def['type'] == 'composite':
                        current_state = _apply_composition(current_state, rule_def['rules'])
                    evolution.append(current_state[0])

                evolution_tensor = torch.tensor(np.array(evolution), dtype=torch.long)
                self.data.append(evolution_tensor[:-1].flatten())  # Input: the first T states, flattened
                self.state_targets.append(evolution_tensor[-1])    # Target: the (T+1)-th state
                self.rule_labels.append(torch.tensor(rule_id, dtype=torch.long))

    def __len__(self): return len(self.data)
    def __getitem__(self, idx): return self.data[idx], self.state_targets[idx], self.rule_labels[idx]

# --- 3. The Multi-Task Symbolic Encoder ---
class BiGruCausalEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_layers, num_rules, state_width):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, num_layers, batch_first=True, bidirectional=True)
        # Head for rule prediction
        self.rule_predictor = nn.Linear(hidden_dim * 2, num_rules)
        # Head for next-state prediction
        self.state_predictor = nn.Sequential(
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_width) # Predict the 64 bits of the next state
        )

    def forward(self, x):
        x_emb = self.embedding(x); _, hidden = self.gru(x_emb)
        forward_h = hidden[-2,:,:]; backward_h = hidden[-1,:,:]
        pooled_output = torch.cat([forward_h, backward_h], dim=1)
        
        rule_logits = self.rule_predictor(pooled_output)
        state_logits = self.state_predictor(pooled_output)
        return state_logits, rule_logits

# --- 4. The Pre-training Harness ---
def pretrain_encoder(model, train_loader, val_loader, epochs, lr, model_save_path, w_state=0.5, w_rule=1.0):
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    criterion_rule = nn.CrossEntropyLoss()
    criterion_state = nn.BCEWithLogitsLoss() # Good for binary vector targets
    best_val_rule_acc = 0.0
    
    for epoch in range(1, epochs + 1):
        model.train(); total_loss = 0
        for data, state_targets, rule_labels in train_loader:
            data, state_targets, rule_labels = data.to(device), state_targets.to(device), rule_labels.to(device)
            optimizer.zero_grad()
            state_logits, rule_logits = model(data)
            
            loss_state = criterion_state(state_logits, state_targets.float())
            loss_rule = criterion_rule(rule_logits, rule_labels)
            loss = w_state * loss_state + w_rule * loss_rule
            
            loss.backward(); optimizer.step(); total_loss += loss.item()

        model.eval(); val_rule_correct, val_total = 0, 0
        with torch.no_grad():
            for data, state_targets, rule_labels in val_loader:
                data, labels = data.to(device), rule_labels.to(device)
                _, rule_logits = model(data)
                _, preds = torch.max(rule_logits, 1)
                val_rule_correct += (preds == labels).sum().item(); val_total += len(labels)
        
        val_rule_acc = val_rule_correct / val_total
        print(f"Epoch {epoch}/{epochs} | Avg Loss: {total_loss/len(train_loader):.4f} | Val Rule Acc (Unseen Rules): {val_rule_acc:.4f}")
        
        if val_rule_acc > best_val_rule_acc:
            best_val_rule_acc = val_rule_acc
            torch.save(model.state_dict(), model_save_path)
            print(f"  -> 🏆 New best generalization accuracy! Model saved.")
            
    print(f"\n--- Pre-training Complete ---")
    print(f"🏆 Best Validation Accuracy on UNSEEN rules: {best_val_rule_acc:.4f}")
    return model

# --- 5. Main Execution ---
if __name__ == '__main__':
    # Config
    ECA_TIMESTEPS = 32; ECA_WIDTH = 64; VOCAB_SIZE = 2
    NUM_SAMPLES = 200; EPOCHS = 20; LR = 0.001; BATCH_SIZE = 64

    # The Curriculum Definition
    TRAIN_RULES_CONFIG = collections.OrderedDict({
        # Level 1: Primes (The Alphabet)
        'R15': {'type': 'base', 'rules': [15]},
        'R170': {'type': 'base', 'rules': [170]},
        'R60': {'type': 'base', 'rules': [60]},
        # Level 2: Composites (The Words)
        '(170 o 15)': {'type': 'composite', 'rules': [170, 15]}, # This is Rule 51
        '(30 o 90)': {'type': 'composite', 'rules': [30, 90]},
        # Level 3: Complex base rules with known prime decompositions
        'R110': {'type': 'base', 'rules': [110]},
        'R30': {'type': 'base', 'rules': [30]},
    })

    # Crucially, validation rules are not in the training set
    VAL_RULES_CONFIG = collections.OrderedDict({
        'R54': {'type': 'base', 'rules': [54]}, # Unseen Class IV
        'R90': {'type': 'base', 'rules': [90]}, # Unseen additive (XOR) rule, Class III
        '(54 o 60)': {'type': 'composite', 'rules': [54, 60]} # Unseen composite
    })

    # Create a unified map for all possible rules
    all_rule_keys = list(TRAIN_RULES_CONFIG.keys()) + list(VAL_RULES_CONFIG.keys())
    rule_to_id_map = {key: i for i, key in enumerate(all_rule_keys)}
    NUM_RULES = len(all_rule_keys)

    # Data
    train_dataset = ECACompositionDataset(NUM_SAMPLES, TRAIN_RULES_CONFIG, ECA_WIDTH, ECA_TIMESTEPS, rule_to_id_map)
    val_dataset = ECACompositionDataset(NUM_SAMPLES // 2, VAL_RULES_CONFIG, ECA_WIDTH, ECA_TIMESTEPS, rule_to_id_map)
    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)
    print(f"Dataset ready. Train size: {len(train_dataset)}, Val (Unseen Rules) size: {len(val_dataset)}")
    
    # Model
    symbolic_encoder = BiGruCausalEncoder(
        vocab_size=VOCAB_SIZE, emb_dim=128, hidden_dim=256, 
        num_layers=2, num_rules=NUM_RULES, state_width=ECA_WIDTH
    )
    
    # Run Training
    SAVE_PATH = os.path.join(MODEL_CACHE_DIR, 'symbolic_encoder_expert.pth')
    trained_model = pretrain_encoder(symbolic_encoder, train_loader, val_loader, EPOCHS, LR, SAVE_PATH)

    print("\n✅ Phase 1 Complete. The expert Symbolic Encoder has been forged and saved.")

What to Expect