An open exploration of viable human-AI systems.
View the Project on GitHub algoplexity/cybernetic-intelligence
This proposal outlines the development of a modular, multi-agent system for autonomously generating novel, decorrelated alpha expressions using a compact LLM trained entirely through self-supervised interaction with the WorldQuant BRAIN platform. The agent combines the components described in the sections below.
The alpha discovery process in quantitative finance is slow, labor-intensive, and prone to producing highly correlated signals.
Recent breakthroughs in multi-agent LLM architectures (Zhang et al., 2024) and reward-driven reasoning without supervision (Chen et al., 2024) offer a promising alternative: compact models that learn from environment feedback, not labels.
This system will serve as a testbed for applying this integrated architecture to a real-world, high-stakes domain: alpha factor mining on the WorldQuant BRAIN platform.
We propose a multi-agent LLM system that learns to generate Fast Expressions, the DSL used in BRAIN to construct alpha signals, entirely through interaction with the BRAIN backtesting environment.
| Agent Role | Description |
|---|---|
| Proposer | LLM generates candidate Fast Expressions in valid DSL syntax. |
| Implementer | Wraps the expression into a BRAIN-compatible format and runs simulation via the Python API. |
| Validator | Extracts key metrics (Fitness, Sharpe, turnover, decorrelation) from backtest results. |
| Critic | Assesses novelty, stability, and adherence to constraints; filters poor outputs. |
| Scheduler (optional) | Bandit-based role switching (e.g., prioritizing exploration, refinement, or high-confidence picks). |
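The role split above can be sketched as a simple pipeline of stub agents. Everything here is illustrative: the expression pool, metric thresholds, and the simulated backtest are placeholders, not the BRAIN API.

```python
# Minimal sketch of the Proposer -> Implementer -> Validator -> Critic loop.
# All names, thresholds, and metrics are illustrative stand-ins, not the BRAIN API.
from dataclasses import dataclass
import random

@dataclass
class BacktestResult:
    fitness: float
    sharpe: float
    turnover: float
    correlation: float  # max correlation against existing alphas

def propose() -> str:
    """Proposer: emit a candidate Fast Expression (placeholder pool)."""
    return random.choice(["rank(close)", "rank(-returns)", "ts_rank(volume, 20)"])

def implement(expr: str) -> BacktestResult:
    """Implementer: submit the expression for simulation (stubbed here)."""
    random.seed(hash(expr) % 2**32)  # deterministic fake metrics per expression
    return BacktestResult(
        fitness=random.uniform(0, 2),
        sharpe=random.uniform(-1, 3),
        turnover=random.uniform(0, 1),
        correlation=random.uniform(0, 1),
    )

def validate(r: BacktestResult) -> bool:
    """Validator: apply hard metric gates."""
    return r.fitness > 1.0 and r.sharpe > 1.25 and r.turnover < 0.7

def critique(expr: str, r: BacktestResult, accepted: list) -> bool:
    """Critic: reject duplicates and highly correlated signals."""
    return expr not in accepted and r.correlation < 0.7

accepted = []
for _ in range(50):
    expr = propose()
    result = implement(expr)
    if validate(result) and critique(expr, result, accepted):
        accepted.append(expr)
```

In the full system the Proposer is the LLM policy and `implement` is a real simulation call; the loop structure, however, stays the same.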
The system uses AZR-style curriculum-free self-play:
- Fine-tune a compact model with `trl` or Unsloth.
- Generate candidate Fast Expressions (e.g., `rank(close)`).
- Call the BRAIN Python API (`ace_lib`) to run expressions and extract metrics.
- Expand beyond single-loop AZR to R&D-Agent(Q) roles.
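At its core, curriculum-free self-play reduces to a reward-weighted policy update driven only by environment feedback. The toy sketch below substitutes a softmax bandit over expression templates for the LLM policy and a stubbed scorer for BRAIN backtest rewards; every name and number is illustrative.

```python
# Toy illustration of reward-driven learning without labels:
# a softmax bandit over expression templates stands in for the LLM policy,
# and a stubbed scorer stands in for BRAIN backtest feedback.
import math
import random

templates = ["rank(close)", "rank(-returns)", "ts_rank(volume, 20)"]
weights = [0.0] * len(templates)  # policy "logits"

def score(expr: str) -> float:
    """Stub for the environment reward (e.g., backtest Fitness)."""
    return {"rank(close)": 0.2, "rank(-returns)": 1.0, "ts_rank(volume, 20)": 0.5}[expr]

def sample(weights: list) -> int:
    """Softmax sampling over current logits."""
    exps = [math.exp(w) for w in weights]
    r, acc = random.random() * sum(exps), 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(exps) - 1

random.seed(0)
lr = 0.5
for _ in range(200):
    i = sample(weights)
    reward = score(templates[i])
    baseline = sum(score(t) for t in templates) / len(templates)
    weights[i] += lr * (reward - baseline)  # REINFORCE-style update

best = templates[max(range(len(weights)), key=weights.__getitem__)]
```

The policy concentrates on the highest-reward template purely from scalar feedback, which is the property the AZR-style loop exploits at LLM scale.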
| Area | Contribution |
|---|---|
| Methodological | Demonstrates the effectiveness of AZR in a financial setting with no human-labeled training data. |
| Architectural | Combines AZR with the modular agent structure of R&D-Agent(Q) for enhanced interpretability. |
| Practical | Produces high-Fitness, decorrelated Fast Expressions on real data using only API access. |
| Computational | Can run on Colab-tier compute using LoRA, TRL, and lightweight 1B models. |
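The Colab-tier claim rests on parameter-efficient fine-tuning. Below is a minimal sketch of a LoRA adapter configuration, assuming the Hugging Face `peft` API; the rank, alpha, and target module names are illustrative and depend on the chosen 1B model family.

```python
# Hypothetical LoRA adapter configuration; assumes the Hugging Face `peft` API.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # scaling factor for adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections (model-dependent)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
# Passed as `peft_config` to a TRL trainer so that only the adapter weights
# are trained, keeping memory use within Colab-tier limits.
```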