Introduction
This document outlines a novel, three-stage training architecture for a Structural Break Detection (SBD) system designed to identify fundamental regime shifts in non-stationary time series data. Traditional approaches often struggle with the “domain gap” between the clean, abstract models of system dynamics and the noisy, high-dimensional reality of financial markets. To address this, our architecture employs a carefully sequenced transfer learning strategy that systematically builds knowledge from the abstract to the concrete. The core of the system is a hierarchical U-Net autoencoder, pre-trained on the universal dynamics of Elementary Cellular Automata (ECAs) to learn the “physics” of complex systems. This foundational knowledge is then bridged to the target domain through a dedicated “translator” training stage, where an embedding layer is taught to map real-world symbolic patterns to the model’s latent space. Finally, the entire system is fine-tuned on the specific task of break detection. This staged methodology ensures that each component of the model is trained on a task for which it is best suited, creating a robust, multi-layered system that is both deeply principled and pragmatically effective. The following sections will detail the precise data flow and component states for each of these three critical training stages.
The Three-Stage Training Architecture: A Detailed Walkthrough
The training process is broken down into three distinct, sequential stages. Each stage builds upon the last, progressively refining the model’s capabilities from abstract knowledge to task-specific expertise.
Stage 1: Pre-training the “Physics Engine”
Purpose: To teach the core model the fundamental rules of how complex systems evolve, using a perfect, controlled “toy universe” (Elementary Cellular Automata).
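For intuition, the sketch below shows what one such ECA "orbit" looks like: a 2D array whose rows are successive states of a one-dimensional binary lattice evolved under a Wolfram-numbered rule. The function name and the default sizes are illustrative only, not the actual `ECADataGenerator` interface.

```python
import numpy as np

def eca_orbit(rule: int, width: int = 64, steps: int = 64, seed: int = 0) -> np.ndarray:
    """Evolve a random binary row under an Elementary Cellular Automaton rule.

    Returns a (steps, width) float32 array: one row per time step.
    """
    rng = np.random.default_rng(seed)
    # Wolfram encoding: bit k of `rule` gives the next state for neighborhood value k.
    table = np.array([(rule >> k) & 1 for k in range(8)], dtype=np.uint8)
    row = rng.integers(0, 2, size=width, dtype=np.uint8)
    orbit = np.empty((steps, width), dtype=np.uint8)
    for t in range(steps):
        orbit[t] = row
        # Neighborhood value = 4*left + 2*center + 1*right, with periodic boundaries.
        idx = 4 * np.roll(row, 1) + 2 * row + np.roll(row, -1)
        row = table[idx]
    return orbit.astype(np.float32)

orbit = eca_orbit(rule=110)  # Rule 110: a classic "complex" (class 4) ECA
print(orbit.shape)           # (64, 64) -- one "2D float tensor" training example
```

Stacking many such orbits, drawn from whatever subset of the 256 elementary rules the generator selects, yields the pre-training corpus described below.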
Data Flow:
```
[ECADataGenerator]
    |
    +--(Generates)--> [Batch of 2D Float Tensors (ECA Orbits) + Rule Labels]
                          |
                          V
[MDL_AU_Net_Autoencoder] (TRAINING)
    |
    +-- 1. [eca_input_proj] (ACTIVE / LEARNING)
    |          |
    |          V
    +-- 2. [Core U-Net Body (Encoder/Decoder)] (ACTIVE / LEARNING)
    |          |
    |          V
    |      +--> [Reconstruction Logits] --->|
    |      |                                |
    |      +--> [Rule Logits] ------------->|
    |                                       V
    +-- [embedding_head] (INACTIVE / FROZEN) [Dual MDL Loss]
                                                    |
                                                    V
                 [Backpropagation updates ALL ACTIVE components]
```
Outcome of Stage 1:
- The `Core U-Net Body` has learned the general "physics" of dynamical systems.
- The `eca_input_proj` has learned to translate raw ECA data into a format the Core Body understands.
- The `embedding_head` remains randomly initialized and untrained.
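In code, Stage 1 reduces to a standard supervised loop with two loss terms. A minimal sketch follows, assuming the autoencoder's forward pass returns both heads' logits; the optimizer choice, loss weighting, and the exact MDL formulation are illustrative assumptions, not the project's verbatim implementation:

```python
import torch
import torch.nn.functional as F

model = MDL_AU_Net_Autoencoder()  # forward pass assumed to return both heads' logits

# The embedding head plays no role in Stage 1; keep it frozen at random init.
for p in model.embedding_head.parameters():
    p.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.AdamW(trainable, lr=3e-4)

for orbits, rule_labels in eca_loader:         # batches from ECADataGenerator (assumed loader)
    recon_logits, rule_logits = model(orbits)  # eca_input_proj -> core U-Net body
    # Dual MDL loss: cross-entropy is a code length, so the sum reads as
    # "bits to describe the orbit" + "bits to describe which rule produced it".
    loss = (F.binary_cross_entropy_with_logits(recon_logits, orbits)
            + F.cross_entropy(rule_logits, rule_labels))
    opt.zero_grad()
    loss.backward()
    opt.step()
```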
Stage 2: Training the “Translator”
Purpose: To teach the `embedding_head` how to convert our symbolic representation of market data into the "language" that the now-frozen "Physics Engine" can understand. We use simple, clean `sin/cos` mock data for this initial translation task.
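As a concrete illustration of what "integer symbol sequences" means here, the following is a minimal quantile-binning symbolizer applied to a clean sine series. The real `SeriesProcessor`'s alphabet size and discretization scheme may differ:

```python
import numpy as np

def symbolize(series: np.ndarray, n_symbols: int = 8) -> np.ndarray:
    """Map a real-valued series to integer symbols 0..n_symbols-1 via quantile bins."""
    edges = np.quantile(series, np.linspace(0, 1, n_symbols + 1)[1:-1])
    return np.digitize(series, edges)

t = np.linspace(0, 8 * np.pi, 512)
symbols = symbolize(np.sin(t))       # clean mock signal -> integer symbol sequence
print(symbols[:10], symbols.max())   # e.g. [4 4 4 5 5 5 5 6 6 6] 7
```

In this toy setting, a mock "break" can be simulated by, e.g., switching from `sin` to `cos` partway through a series.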
Data Flow:
```
[SeriesProcessor]
    |
    +--(Processes `sin/cos` data)--> [Batches of Integer Symbol Sequences]
                                         |
                                         V
[StructuralBreakClassifier] (TRAINING)
    |
    +-- [MDL_AU_Net_Autoencoder]
    |       |
    |       +-- [eca_input_proj] (INACTIVE / FROZEN)
    |       |
    |       +-- 1. [embedding_head] (ACTIVE / LEARNING)
    |       |          |
    |       |          V
    |       +-- 2. [Core U-Net Body] (INACTIVE / FROZEN) --> (Generates Fingerprints)
    |
    +-- 3. [Classifier Head] (ACTIVE / LEARNING)
    |
    V
[BCE Loss]
    |
    V
[Backpropagation updates 'embedding_head' and 'Classifier Head' ONLY]
```
Outcome of Stage 2:
- The `embedding_head` has learned to translate simple symbolic patterns into the "language" the Core U-Net Body understands.
- The `Classifier Head` has learned a basic classification rule from the simple data.
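The freeze pattern for this stage can be expressed as below. Attribute names such as `clf.autoencoder`, `clf.autoencoder.core`, and `clf.classifier_head` are assumed for illustration; only the gradient masking and the BCE objective follow directly from the diagram:

```python
import torch
import torch.nn as nn

clf = StructuralBreakClassifier()  # wraps the Stage 1 pre-trained autoencoder

# Freeze the "Physics Engine": input projection and core U-Net body.
for module in (clf.autoencoder.eca_input_proj, clf.autoencoder.core):
    for p in module.parameters():
        p.requires_grad = False

# Only the "translator" and the judgement layer learn in this stage.
params = (list(clf.autoencoder.embedding_head.parameters())
          + list(clf.classifier_head.parameters()))
opt = torch.optim.AdamW(params, lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for symbols, has_break in mock_loader:  # sin/cos mock data (assumed loader)
    logits = clf(symbols)               # embedding_head -> frozen core -> head
    loss = bce(logits, has_break.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
```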
Stage 3: Fine-tuning for Nuance
Purpose: To take the fully assembled system (with its trained "Physics Engine" and "Translator") and fine-tune only the final "judgement" layer on the complex, noisy patterns of the real `X_train` market data.
Data Flow:
```
[SeriesProcessor]
    |
    +--(Processes real `X_train` data)--> [Batches of Integer Symbol Sequences]
                                              |
                                              V
[StructuralBreakClassifier] (TRAINING)
    |
    +-- [MDL_AU_Net_Autoencoder]
    |       |
    |       +-- [eca_input_proj] (INACTIVE / FROZEN)
    |       |
    |       +-- 1. [embedding_head] (INACTIVE / FROZEN)
    |       |          |
    |       |          V
    |       +-- 2. [Core U-Net Body] (INACTIVE / FROZEN) --> (Generates Fingerprints)
    |
    +-- 3. [Classifier Head] (ACTIVE / LEARNING)
    |
    V
[BCE Loss]
    |
    V
[Backpropagation updates the 'Classifier Head' ONLY]
```
Outcome of Stage 3:
- The `Classifier Head` is now fine-tuned on the complex, noisy patterns of real market data.
- The entire `StructuralBreakClassifier` is now the final, ready-to-use model artifact for inference.
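Stage 3 reuses the Stage 2 loop with one change: the `embedding_head` is frozen as well, so the optimizer sees only the `Classifier Head` parameters. Again, a sketch under the same assumed attribute names:

```python
import torch
import torch.nn as nn

# Freeze everything, then re-enable only the final "judgement" layer.
for p in clf.parameters():
    p.requires_grad = False
for p in clf.classifier_head.parameters():
    p.requires_grad = True

opt = torch.optim.AdamW(clf.classifier_head.parameters(), lr=1e-5)
bce = nn.BCEWithLogitsLoss()

for symbols, has_break in x_train_loader:  # real X_train data (assumed loader)
    logits = clf(symbols)                  # all upstream components are frozen
    loss = bce(logits, has_break.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
```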
Conclusion
The proposed three-stage training architecture provides a robust and theoretically grounded framework for developing a highly sensitive structural break detector. By explicitly separating the training phases into (1) abstract physics learning, (2) domain-specific translation, and (3) task-specific fine-tuning, we mitigate common transfer-learning pitfalls and effectively bridge the domain gap between synthetic and real-world data. Pre-training on complex ECA dynamics instills a foundational understanding of rule-based systems, while the intermediate training of the embedding head provides a principled method for representing real-world patterns in a form the core model can process. This layered approach ensures that the final model does not simply learn superficial correlations from the target data, but instead leverages a deep, pre-trained knowledge base of what constitutes a fundamental change in a system's underlying generative mechanism. This architecture represents a significant step towards more generalizable and reliable models for time series analysis, capable of moving beyond pattern matching to infer shifts in causal dynamics, and the successful implementation of this training regimen is expected to yield a state-of-the-art model for the ADIA Structural Break Detection challenge.