Cybernetic Intelligence

An open exploration of viable human-AI systems.


Research Proposal

Title

Scaling Algorithmic Market Modeling with LLM-Augmented Cellular Automata: Toward a Hybrid Meta-Evolutionary Discovery Engine


1. Background and Motivation

Traditional financial models based on backward-looking statistics and stochastic processes have limited capacity to capture the nonlinear, emergent, and causal dependencies governing complex market dynamics.

In contrast, my MSc thesis introduced an algorithmic generative modeling approach using elementary cellular automata (ECA), minimal algorithmic information loss methods (MILS), and genetic algorithms (GA) to uncover hidden structures in binary-encoded market data. This methodology demonstrated early promise in identifying rule-based generative processes that resemble observed financial behaviors.

However, scalability remains constrained by the combinatorial growth of the rule space and the computational expense of fitness evaluations based on the Block Decomposition Method (BDM) and MILS.

Recent advances in transformer-based CA learning (Burtsev, 2024) suggest a way forward: neural abstraction of CA dynamics and rule generalization. This project integrates those techniques into a hybrid, scalable, open-source discovery engine.


2. Research Objectives

This research extends the original thesis by:

  1. Accelerating the exploration of high-dimensional CA rule spaces using LLMs and surrogate models.
  2. Training transformer models to learn mappings between observed market data and CA-generated patterns.
  3. Integrating causal decomposition, MILS compression, and evolutionary search into a hybrid neuro-symbolic framework.
  4. Building a reusable open-source toolkit for explainable, generative market modeling.

3. Research Questions


4. Methodology

4.1 LLM-Guided Rule Generation

Component: LLM-based Generator

Benefit: Strongly narrows search space with semantic priors; aligns with the framework’s generative module.
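In miniature, the generator's role can be sketched as merging LLM-suggested rules with randomly sampled ones before evaluation. The `llm_propose_rules` stub below is a hypothetical stand-in for a real model call (its returned rule numbers are illustrative, not a claim about what an LLM would suggest); `rule_to_table` uses the standard Wolfram rule-number encoding for elementary CA.

```python
def rule_to_table(rule_number):
    """Convert a Wolfram ECA rule number (0-255) into a lookup table
    mapping each 3-cell neighborhood to the next cell state."""
    bits = [(rule_number >> i) & 1 for i in range(8)]
    return {(n >> 2 & 1, n >> 1 & 1, n & 1): bits[n] for n in range(8)}

def llm_propose_rules(prompt, n=4):
    """Hypothetical stand-in for an LLM call: the real generator would
    query a language model with a prompt describing the target market
    pattern and parse candidate rule numbers from its reply."""
    return [30, 90, 110, 184][:n]  # classic complex/chaotic ECA rules

def candidate_pool(llm_rules, random_rules):
    """Merge LLM-suggested and randomly sampled rules, deduplicated,
    to form the pool passed on to the evaluation pipeline."""
    return sorted(set(llm_rules) | set(random_rules))

pool = candidate_pool(llm_propose_rules("prompt"), [7, 30, 201])
```

The semantic prior enters through the prompt: rather than sampling all 256 elementary rules (or far larger rule families) uniformly, the LLM biases the pool toward rule classes plausibly matching the observed dynamics.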


4.2 Surrogate Fitness Modeling

Component: Evaluator with Internal State Module

Benefit: Provides fast fitness estimation and filters out weak candidates before full evaluation.
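The pre-filtering role of the surrogate can be illustrated with a deliberately simple stand-in: a nearest-neighbour predictor over rule bit-vectors that estimates a candidate's fitness from rules already scored by the full pipeline. The actual evaluator would be a learned model with an internal state module; the fitness values below are illustrative.

```python
def rule_bits(rule_number):
    """Represent an ECA rule number as its 8-bit lookup-table vector."""
    return [(rule_number >> i) & 1 for i in range(8)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

class SurrogateFitness:
    """Toy surrogate: predicts a rule's fitness as the mean fitness of
    its k Hamming-nearest rules already evaluated with the full
    MILS/BDM pipeline, so weak candidates can be discarded cheaply."""
    def __init__(self, k=3):
        self.k = k
        self.seen = []  # (bit-vector, fitness) pairs from full evaluations

    def observe(self, rule_number, fitness):
        self.seen.append((rule_bits(rule_number), fitness))

    def predict(self, rule_number):
        bits = rule_bits(rule_number)
        nearest = sorted(self.seen, key=lambda s: hamming(bits, s[0]))[:self.k]
        return sum(f for _, f in nearest) / len(nearest)

surrogate = SurrogateFitness(k=2)
surrogate.observe(30, 0.9)   # illustrative fitness scores
surrogate.observe(90, 0.4)
surrogate.observe(0, 0.1)
estimate = surrogate.predict(110)  # cheap estimate before full evaluation
```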


4.3 Meta-Evolutionary Strategy

Component: Meta-Controller

Benefit: Introduces adaptive optimization control; accelerates convergence and escapes local minima.
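One concrete form such adaptive control can take is Rechenberg's 1/5 success rule, sketched below as an assumption about how the meta-controller might steer mutation: widen the mutation rate when too few offspring improve (to escape local minima), narrow it when many do (to converge). The real meta-controller would also steer crossover and selection.

```python
class MetaController:
    """Sketch of adaptive GA control via the 1/5 success rule."""
    def __init__(self, mutation_rate=0.1, factor=1.5):
        self.mutation_rate = mutation_rate
        self.factor = factor

    def update(self, n_improved, n_offspring):
        """Adjust the mutation rate from the last generation's
        fraction of offspring that beat their parents."""
        success = n_improved / n_offspring
        if success < 0.2:      # stuck: explore more aggressively
            self.mutation_rate = min(1.0, self.mutation_rate * self.factor)
        elif success > 0.2:    # improving: exploit, mutate less
            self.mutation_rate /= self.factor

ctrl = MetaController(mutation_rate=0.1)
ctrl.update(n_improved=0, n_offspring=10)  # no progress, so rate grows
```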


4.4 Memory and Caching Mechanisms

Component: Memory / Cache

Benefit: Avoids redundant evaluations and enables long-term learning.
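A minimal caching sketch: memoize the full evaluation keyed by the (rule, seed, steps) tuple so repeated candidates are never re-simulated. Here a zlib compressed-size proxy stands in for the expensive MILS/BDM score purely to make the mechanism runnable; it is not the thesis's actual fitness function.

```python
import functools
import zlib

@functools.lru_cache(maxsize=None)
def evaluate_rule(rule_number, seed_row, steps):
    """Cached full evaluation of one ECA rule: simulate the CA from a
    seed row, then score the evolved history with a compressed-size
    proxy (a stand-in for MILS compression + BDM scoring)."""
    table = {(n >> 2 & 1, n >> 1 & 1, n & 1): (rule_number >> n) & 1
             for n in range(8)}
    row = list(seed_row)
    history = [row]
    for _ in range(steps):  # periodic boundary conditions
        row = [table[(row[i - 1], row[i], row[(i + 1) % len(row)])]
               for i in range(len(row))]
        history.append(row)
    flat = bytes(c for r in history for c in r)
    return len(zlib.compress(flat))

seed = (0,) * 7 + (1,) + (0,) * 7      # tuple, so the key is hashable
first = evaluate_rule(30, seed, 32)    # full evaluation runs once
again = evaluate_rule(30, seed, 32)    # identical call served from cache
```

Because `lru_cache` keys on the argument tuple, the seed row must be hashable (hence a tuple, not a list); a persistent store keyed the same way would extend this into the long-term memory the framework calls for.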


4.5 Full Workflow Summary

  1. Data Preparation
    Encode financial time-series data into 2D binary arrays.

  2. Rule Generation
    Combine LLM-generated, randomly sampled, and historically strong rule tuples.

  3. Evaluation Pipeline
    Run CA simulations → MILS compression → BDM scoring
    Or fall back to the surrogate model for fast approximate scoring.

  4. Evolution Control
    Apply meta-controller to steer GA operations dynamically.

  5. Causal Decomposition
    Analyze top-performing rules via perturbation and modular decomposition.

  6. Toolkit Output
    Package functionality as a CLI / Jupyter toolset with:

    • Visual diagnostics
    • Rule match heatmaps
    • Pattern similarity plots
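Step 1 of the workflow can be sketched as follows. The sign-of-return encoding and 8-cell row width below are illustrative assumptions, one simple way to binarize a price series into the 2D arrays the CA simulations consume; the thesis encoding may differ.

```python
def binarize_returns(prices, width=8):
    """Encode a price series as a 2D binary array: each row holds
    `width` consecutive up(1) / down(0) moves; a trailing partial
    row is dropped."""
    moves = [1 if b > a else 0 for a, b in zip(prices, prices[1:])]
    usable = len(moves) - len(moves) % width
    return [moves[i:i + width] for i in range(0, usable, width)]

prices = [100, 101, 99, 102, 103, 101, 104, 105, 103,
          106, 107, 105, 108, 109, 107, 110, 111]
grid = binarize_returns(prices, width=8)  # 16 moves -> two rows of 8
```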

5. Deliverables


6. Timeline (6–9 months)

Month   Milestone
1–2     Rebuild modular CA simulation & data pipeline
2–3     Train transformer model on CA sequences
3–5     Implement LLM search + surrogate models
5–6     Integrate causal decomposition module
6–7     Validate system over market datasets
7–9     Finalize paper, documentation, and toolkit release

7. References


8. License

The proposed toolkit will be released under the MIT or Apache 2.0 license.