An open exploration of viable human-AI systems.
This project is divided into three main stages: I. Expert Encoder Pre-training, II. Full System Assembly & Training, and III. Evaluation & Analysis.
**Stage I: Expert Encoder Pre-training**
(Goal: Forge the two specialized "brains" of our system by training them on idealized tasks.)
1. Train a SymbolicEncoder that understands the algebra of causal rules and the emergence of complexity; this is the intensive curriculum we just designed. Artifact: symbolic_encoder_expert.pth, containing the weights of our trained Causal Rule Inference Engine.
2. Train a LatentEncoder that can generate rich, contextual "fingerprints" of raw time series dynamics: a TSEncoder (or a similar architecture, such as a GRU) trained with the hierarchical contrastive loss on a large, unlabeled time series dataset. Ideally, this would be the unlabeled ADIA training data itself, allowing the encoder to learn the specific "feel" of the target domain. Artifact: latent_encoder_expert.pth, containing the weights of our trained Dynamics Fingerprinter.
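As a concrete illustration of the latent pathway, here is a minimal sketch of a GRU-based fingerprint encoder paired with a simplified, instance-wise contrastive loss. The class and parameter names are illustrative, and the loss is a plain InfoNCE objective rather than the full TS2Vec hierarchical loss, which additionally contrasts representations across multiple temporal scales.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRUTSEncoder(nn.Module):
    """Minimal GRU encoder mapping a time series to a fixed-size 'fingerprint'."""
    def __init__(self, input_dim=1, hidden_dim=64, embed_dim=128):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers=2, batch_first=True)
        self.proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, x):            # x: (batch, seq_len, input_dim)
        _, h = self.gru(x)           # h: (num_layers, batch, hidden_dim)
        return self.proj(h[-1])      # fingerprint: (batch, embed_dim)

def info_nce_loss(z1, z2, temperature=0.1):
    """Instance-wise contrastive loss between two augmented views of a batch.
    (TS2Vec applies a related objective hierarchically, across time scales.)"""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature                      # pairwise similarities
    labels = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)
```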
**Stage II: Full System Assembly & Training**
(Goal: Integrate the two expert brains into the final Siamese architecture and train the decision-making head.)

1. Build the ADIADualPathDataset, which takes a real-world ADIA sample (id) and produces the four required inputs: raw_A, symbolic_A (using our d=6, τ=10 symbolizer), raw_B, and symbolic_B.
2. Construct the CIv14_DivergenceClassifier.
3. Load symbolic_encoder_expert.pth and latent_encoder_expert.pth into the two respective encoder pathways.
4. Train the classifier_head on the ADIA training data: the task is to learn the mapping from the concatenated [symbolic_divergence, latent_divergence] vector to a break/no-break prediction. Artifact: CIv14_final_model.pth, containing the weights for the entire system with the trained head.
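The Stage II assembly might look like the following sketch. The divergence computation (element-wise absolute difference of the Siamese embeddings) and the head dimensions are assumptions; only the checkpoint file names come from the plan above.

```python
import torch
import torch.nn as nn

class CIv14DivergenceClassifier(nn.Module):
    """Siamese assembly: two frozen expert encoders feeding a trainable head."""
    def __init__(self, symbolic_encoder, latent_encoder, sym_dim=128, lat_dim=128):
        super().__init__()
        self.symbolic_encoder = symbolic_encoder
        self.latent_encoder = latent_encoder
        for enc in (self.symbolic_encoder, self.latent_encoder):
            for p in enc.parameters():       # freeze both expert pathways
                p.requires_grad = False
        self.classifier_head = nn.Sequential(
            nn.Linear(sym_dim + lat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, raw_A, symbolic_A, raw_B, symbolic_B):
        # Divergence signal per pathway: |embedding_A - embedding_B| (an assumption).
        sym_div = (self.symbolic_encoder(symbolic_A)
                   - self.symbolic_encoder(symbolic_B)).abs()
        lat_div = (self.latent_encoder(raw_A) - self.latent_encoder(raw_B)).abs()
        return self.classifier_head(torch.cat([sym_div, lat_div], dim=-1))  # logit

# Loading the Stage I experts before training the head:
# symbolic_encoder.load_state_dict(torch.load("symbolic_encoder_expert.pth"))
# latent_encoder.load_state_dict(torch.load("latent_encoder_expert.pth"))
```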
**Stage III: Evaluation & Analysis**
(Goal: Measure the final performance of our system and analyze its behavior.)

1. Load CIv14_final_model.pth.
2. Evaluate the system on True and False break examples and analyze its behavior on each.

This is our phased roadmap for building and validating the final model:
| Phase | Target | Objective & Methodology | Status & Key Learnings |
|---|---|---|---|
| Phase 0 | 🧪 Symbolizer Calibration | Calibrate the Symbolic "Sensor." Run the regression-style test harness on the real ADIA data to find the optimal (d, τ) parameters by maximizing Jensen-Shannon Divergence (see the calibration sketch below the table). | ✅ COMPLETE. Result: The optimal, most sensitive parameters for the ADIA data are d=6, τ=10. We will proceed with this configuration. |
| Phase 1 | 🧪 Latent Brain & Baseline | Forge the Latent Encoder & Establish a Baseline. 1. Pre-train a TSEncoder using the TS2Vec contrastive learning methodology on unlabeled data to create an expert in raw dynamics. 2. Test a "latent-only" Siamese model on the ADIA data to get a baseline AUC. | ✅ COMPLETE. Result: A pre-trained TSEncoder is ready. The latent-only baseline is AUC = 0.5024, proving this path is insufficient alone. |
| Phase 2 | 🧪 Symbolic Brain & Baseline | Forge the Symbolic Encoder & Establish a Baseline. 1. (Bake-Off) Empirically test Transformer vs. GRU architectures on an ECA rule-inference task. 2. (Definitive Pre-training) Train the winning architecture (GRU) on a complex, composite rule dataset based on Sequential Rule Application (State(t+1) = Rule_B(Rule_A(State(t)))) to forge a true expert in causal inference. | ✅ Bake-Off Complete: The unidirectional GRU is the decisive winner (better accuracy, 2.4x faster). ⏳ NEXT: Definitive pre-training on the composite rule dataset. |
| Phase 3 | 🧩 Final Assembly & Fine-Tuning | Build and Train the Full CIv14 Model. 1. Construct the CIv14-DivergenceClassifier with the two pre-trained expert encoders. 2. Create the DualStreamDataset to serve (raw_A, symbolic_A, raw_B, symbolic_B) tuples (see the dataset sketch after the architecture diagram). 3. Freeze the encoders and train the final classifier head on the combined divergence signal. 4. (Optional) Unfreeze and fine-tune the entire system with a low learning rate. | 🔜 PENDING. This is the final step after both expert encoders are ready. |
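For Phase 0, a calibration harness along these lines would search (d, τ) by maximizing the Jensen-Shannon divergence between pre- and post-segment symbol distributions. The sketch assumes an ordinal-pattern (permutation) symbolizer, one common reading of a (d, τ)-parameterized symbolizer; the project's actual symbolizer may differ, and break_pairs is a hypothetical list of known-break segment pairs.

```python
import numpy as np
from itertools import permutations
from math import factorial
from scipy.spatial.distance import jensenshannon

def ordinal_symbols(x, d, tau):
    """Symbolize a 1-D series via ordinal patterns of dimension d and delay tau."""
    patterns = {p: i for i, p in enumerate(permutations(range(d)))}
    n = len(x) - (d - 1) * tau
    return np.array([patterns[tuple(np.argsort(x[t:t + d * tau:tau]))]
                     for t in range(n)])

def jsd_sensitivity(pre, post, d, tau):
    """Jensen-Shannon divergence between pre- and post-segment symbol histograms."""
    p = np.bincount(ordinal_symbols(pre, d, tau), minlength=factorial(d))
    q = np.bincount(ordinal_symbols(post, d, tau), minlength=factorial(d))
    p, q = p / p.sum(), q / q.sum()
    return jensenshannon(p, q, base=2) ** 2   # squared JS distance = divergence

# Grid search: choose (d, tau) maximizing mean JSD over known-break pairs.
# best = max(((d, tau) for d in range(3, 7) for tau in range(1, 13)),
#            key=lambda dt: np.mean([jsd_sensitivity(a, b, *dt)
#                                    for a, b in break_pairs]))
```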
This is the definitive architecture we will build in Phase 3.
```mermaid
graph TD
    A[Input: Pre & Post Segments] --> B{Symbolic Pathway};
    A --> C{Latent Pathway};
    B --> E[Final Classifier];
    C --> E;
    E --> F[Output: Break Probability];
    style B fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#9cf,stroke:#333,stroke-width:2px
```
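The diagram's input node corresponds to the tuples served by Phase 3's DualStreamDataset. A minimal sketch, assuming each sample is a (pre, post) pair of 1-D arrays with a binary break label, and reusing the ordinal_symbols helper from the calibration sketch above:

```python
import torch
from torch.utils.data import Dataset

class DualStreamDataset(Dataset):
    """Serves (raw_A, symbolic_A, raw_B, symbolic_B, label) tuples.
    The constructor arguments are assumed interfaces, not the project's API."""
    def __init__(self, segment_pairs, labels, d=6, tau=10):
        self.segment_pairs = segment_pairs   # list of (pre_array, post_array)
        self.labels = labels                 # 1 = break, 0 = no break
        self.d, self.tau = d, tau

    def __len__(self):
        return len(self.segment_pairs)

    def __getitem__(self, idx):
        pre, post = self.segment_pairs[idx]
        raw_A = torch.tensor(pre, dtype=torch.float32).unsqueeze(-1)
        raw_B = torch.tensor(post, dtype=torch.float32).unsqueeze(-1)
        sym_A = torch.tensor(ordinal_symbols(pre, self.d, self.tau), dtype=torch.long)
        sym_B = torch.tensor(ordinal_symbols(post, self.d, self.tau), dtype=torch.long)
        label = torch.tensor(self.labels[idx], dtype=torch.float32)
        return raw_A, sym_A, raw_B, sym_B, label
```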
| Stage | Input Data | Model / Process | Output |
|---|---|---|---|
| Phase 2 (Pre-training) | Synthetic sequences from composite ECA rules (see the generation sketch below this table). | Symbolic Encoder (GRU) | A pre-trained encoder that can infer causal rules. |
| Phase 3 (Fine-tuning) | Real ADIA data (raw_A, sym_A, raw_B, sym_B). | Full CIv14-DivergenceClassifier | A final, trained model that outputs a break probability. |
| Inference | A new pair of segments from the test set. | The final trained model. | An AUC score on the test set. |
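The Phase 2 row above references sequences generated by sequential rule application, State(t+1) = Rule_B(Rule_A(State(t))). A minimal generator sketch (grid width, step count, and the example rule pair are illustrative):

```python
import numpy as np

def eca_step(state, rule):
    """One synchronous update of an elementary cellular automaton (periodic boundary)."""
    left, right = np.roll(state, 1), np.roll(state, -1)
    neighborhood = left * 4 + state * 2 + right      # 3-bit code in 0..7
    lookup = (rule >> np.arange(8)) & 1              # Wolfram rule number as a table
    return lookup[neighborhood]

def composite_sequence(rule_a, rule_b, width=64, steps=100, seed=0):
    """Trajectory under State(t+1) = Rule_B(Rule_A(State(t)))."""
    rng = np.random.default_rng(seed)
    state = rng.integers(0, 2, width)
    trajectory = [state]
    for _ in range(steps):
        state = eca_step(eca_step(state, rule_a), rule_b)
        trajectory.append(state)
    return np.stack(trajectory)                      # shape: (steps + 1, width)

# Example training sample for the composite pair (Rule 110 applied first, then Rule 30):
# traj = composite_sequence(rule_a=110, rule_b=30)
```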
This consolidated plan provides a clear, logical, and evidence-based path from where we are now to the final, robust solution. It leverages all our key findings and ensures that each component is validated before integration.