Introduction
This document outlines a novel, three-stage training architecture for a Structural Break Detection (SBD) system designed to identify fundamental regime shifts in non-stationary time series data. Traditional approaches often struggle with the “domain gap” between the clean, abstract models of system dynamics and the noisy, high-dimensional reality of financial markets. To address this, our architecture employs a carefully sequenced transfer learning strategy that systematically builds knowledge from the abstract to the concrete. The core of the system is a hierarchical U-Net autoencoder, pre-trained on the universal dynamics of Elementary Cellular Automata (ECAs) to learn the “physics” of complex systems. This foundational knowledge is then bridged to the target domain through a dedicated “translator” training stage, where an embedding layer is taught to map real-world symbolic patterns to the model’s latent space. Finally, the entire system is fine-tuned on the specific task of break detection. This staged methodology ensures that each component of the model is trained on a task for which it is best suited, creating a robust, multi-layered system that is both deeply principled and pragmatically effective. The following sections will detail the precise data flow and component states for each of these three critical training stages.
The Three-Stage Training Architecture: A Detailed Walkthrough
The training process is broken down into three distinct, sequential stages. Each stage builds upon the last, progressively refining the model’s capabilities from abstract knowledge to task-specific expertise.
Stage 1: Pre-training the “Physics Engine”
Purpose: To teach the core model the fundamental rules of how complex systems evolve, using a perfect, controlled “toy universe” (Elementary Cellular Automata).
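For intuition, the sketch below shows what one such ECA "orbit" looks like: a 2D array whose rows are successive states of a one-dimensional binary lattice evolved under a Wolfram-numbered rule. The function name and the default sizes are illustrative only, not the actual `ECADataGenerator` interface.

```python
import numpy as np

def eca_orbit(rule: int, width: int = 64, steps: int = 64, seed: int = 0) -> np.ndarray:
    """Evolve a random binary row under an Elementary Cellular Automaton rule.

    Returns a (steps, width) float32 array: one row per time step.
    """
    rng = np.random.default_rng(seed)
    # Wolfram encoding: bit k of `rule` gives the next state for neighborhood value k.
    table = np.array([(rule >> k) & 1 for k in range(8)], dtype=np.uint8)
    row = rng.integers(0, 2, size=width, dtype=np.uint8)
    orbit = np.empty((steps, width), dtype=np.uint8)
    for t in range(steps):
        orbit[t] = row
        # Neighborhood value = 4*left + 2*center + 1*right, with periodic boundaries.
        idx = 4 * np.roll(row, 1) + 2 * row + np.roll(row, -1)
        row = table[idx]
    return orbit.astype(np.float32)

orbit = eca_orbit(rule=110)  # Rule 110: a classic "complex" (class 4) ECA
print(orbit.shape)           # (64, 64) -- one "2D float tensor" training example
```

Stacking many such orbits, drawn from whatever subset of the 256 elementary rules the generator selects, yields the pre-training corpus described below.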
Data Flow:
```
[ECADataGenerator]
    |
    +--(Generates)--> [Batch of 2D Float Tensors (ECA Orbits) + Rule Labels]
                          |
                          V
[MDL_AU_Net_Autoencoder] (TRAINING)
    |
    +-- 1. [eca_input_proj] (ACTIVE / LEARNING)
    |          |
    |          V
    +-- 2. [Core U-Net Body (Encoder/Decoder)] (ACTIVE / LEARNING)
    |          |
    |          V
    |      +--> [Reconstruction Logits] --->|
    |      |                                |
    |      +--> [Rule Logits] ------------->|
    |                                       V
    +-- [embedding_head] (INACTIVE / FROZEN) [Dual MDL Loss]
                                                    |
                                                    V
                 [Backpropagation updates ALL ACTIVE components]
```
Outcome of Stage 1:
- The `Core U-Net Body` has learned the general "physics" of dynamical systems.
- The `eca_input_proj` has learned to translate raw ECA data into a format the Core Body understands.
- The `embedding_head` remains randomly initialized and untrained.
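In code, Stage 1 reduces to a standard supervised loop with two loss terms. A minimal sketch follows, assuming the autoencoder's forward pass returns both heads' logits; the optimizer choice, loss weighting, and the exact MDL formulation are illustrative assumptions, not the project's verbatim implementation:

```python
import torch
import torch.nn.functional as F

model = MDL_AU_Net_Autoencoder()  # forward pass assumed to return both heads' logits

# The embedding head plays no role in Stage 1; keep it frozen at random init.
for p in model.embedding_head.parameters():
    p.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.AdamW(trainable, lr=3e-4)

for orbits, rule_labels in eca_loader:         # batches from ECADataGenerator (assumed loader)
    recon_logits, rule_logits = model(orbits)  # eca_input_proj -> core U-Net body
    # Dual MDL loss: cross-entropy is a code length, so the sum reads as
    # "bits to describe the orbit" + "bits to describe which rule produced it".
    loss = (F.binary_cross_entropy_with_logits(recon_logits, orbits)
            + F.cross_entropy(rule_logits, rule_labels))
    opt.zero_grad()
    loss.backward()
    opt.step()
```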
Stage 2: Training the “Translator”
Purpose: To teach the `embedding_head` how to convert our symbolic representation of market data into the "language" that the now-frozen "Physics Engine" can understand. We use simple, clean `sin/cos` mock data for this initial translation task.
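As a concrete illustration of what "integer symbol sequences" means here, the following is a minimal quantile-binning symbolizer applied to a clean sine series. The real `SeriesProcessor`'s alphabet size and discretization scheme may differ:

```python
import numpy as np

def symbolize(series: np.ndarray, n_symbols: int = 8) -> np.ndarray:
    """Map a real-valued series to integer symbols 0..n_symbols-1 via quantile bins."""
    edges = np.quantile(series, np.linspace(0, 1, n_symbols + 1)[1:-1])
    return np.digitize(series, edges)

t = np.linspace(0, 8 * np.pi, 512)
symbols = symbolize(np.sin(t))       # clean mock signal -> integer symbol sequence
print(symbols[:10], symbols.max())   # e.g. [4 4 4 5 5 5 5 6 6 6] 7
```

In this toy setting, a mock "break" can be simulated by, e.g., switching from `sin` to `cos` partway through a series.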
Data Flow:
```
[SeriesProcessor]
    |
    +--(Processes `sin/cos` data)--> [Batches of Integer Symbol Sequences]
                                         |
                                         V
[StructuralBreakClassifier] (TRAINING)
    |
    +-- [MDL_AU_Net_Autoencoder]
    |       |
    |       +-- [eca_input_proj] (INACTIVE / FROZEN)
    |       |
    |       +-- 1. [embedding_head] (ACTIVE / LEARNING)
    |       |          |
    |       |          V
    |       +-- 2. [Core U-Net Body] (INACTIVE / FROZEN) --> (Generates Fingerprints)
    |
    +-- 3. [Classifier Head] (ACTIVE / LEARNING)
    |
    V
[BCE Loss]
    |
    V
[Backpropagation updates 'embedding_head' and 'Classifier Head' ONLY]
```
Outcome of Stage 2:
- The `embedding_head` has learned to translate simple symbolic patterns into the "language" the Core U-Net Body understands.
- The `Classifier Head` has learned a basic classification rule from the simple data.
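The freeze pattern for this stage can be expressed as below. Attribute names such as `clf.autoencoder`, `clf.autoencoder.core`, and `clf.classifier_head` are assumed for illustration; only the gradient masking and the BCE objective follow directly from the diagram:

```python
import torch
import torch.nn as nn

clf = StructuralBreakClassifier()  # wraps the Stage 1 pre-trained autoencoder

# Freeze the "Physics Engine": input projection and core U-Net body.
for module in (clf.autoencoder.eca_input_proj, clf.autoencoder.core):
    for p in module.parameters():
        p.requires_grad = False

# Only the "translator" and the judgement layer learn in this stage.
params = (list(clf.autoencoder.embedding_head.parameters())
          + list(clf.classifier_head.parameters()))
opt = torch.optim.AdamW(params, lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for symbols, has_break in mock_loader:  # sin/cos mock data (assumed loader)
    logits = clf(symbols)               # embedding_head -> frozen core -> head
    loss = bce(logits, has_break.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
```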
Stage 3: Fine-tuning for Nuance
Purpose: To take the fully assembled system (with its trained "Physics Engine" and "Translator") and fine-tune only the final "judgement" layer on the complex, noisy patterns of the real `X_train` market data.
Data Flow:
```
[SeriesProcessor]
    |
    +--(Processes real `X_train` data)--> [Batches of Integer Symbol Sequences]
                                              |
                                              V
[StructuralBreakClassifier] (TRAINING)
    |
    +-- [MDL_AU_Net_Autoencoder]
    |       |
    |       +-- [eca_input_proj] (INACTIVE / FROZEN)
    |       |
    |       +-- 1. [embedding_head] (INACTIVE / FROZEN)
    |       |          |
    |       |          V
    |       +-- 2. [Core U-Net Body] (INACTIVE / FROZEN) --> (Generates Fingerprints)
    |
    +-- 3. [Classifier Head] (ACTIVE / LEARNING)
    |
    V
[BCE Loss]
    |
    V
[Backpropagation updates the 'Classifier Head' ONLY]
```
Outcome of Stage 3:
- The `Classifier Head` is now fine-tuned on the complex, noisy patterns of real market data.
- The entire `StructuralBreakClassifier` is now the final, ready-to-use model artifact for inference.
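Stage 3 reuses the Stage 2 loop with one change: the `embedding_head` is frozen as well, so the optimizer sees only the `Classifier Head` parameters. Again, a sketch under the same assumed attribute names:

```python
import torch
import torch.nn as nn

# Freeze everything, then re-enable only the final "judgement" layer.
for p in clf.parameters():
    p.requires_grad = False
for p in clf.classifier_head.parameters():
    p.requires_grad = True

opt = torch.optim.AdamW(clf.classifier_head.parameters(), lr=1e-5)
bce = nn.BCEWithLogitsLoss()

for symbols, has_break in x_train_loader:  # real X_train data (assumed loader)
    logits = clf(symbols)                  # all upstream components are frozen
    loss = bce(logits, has_break.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
```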
Conclusion
The proposed three-stage training architecture provides a robust and theoretically grounded framework for developing a highly sensitive structural break detector. By explicitly separating the training phases into (1) abstract physics learning, (2) domain-specific translation, and (3) task-specific fine-tuning, we mitigate common transfer-learning pitfalls and effectively bridge the domain gap between synthetic and real-world data. Pre-training on complex ECA dynamics instills a foundational understanding of rule-based systems, while the intermediate training of the embedding head provides a principled method for representing real-world patterns in a form the core model can process. This layered approach ensures that the final model does not simply learn superficial correlations from the target data, but instead leverages a deep, pre-trained knowledge base of what constitutes a fundamental change in a system's underlying generative mechanism. This architecture represents a significant step towards more generalizable and reliable models for time series analysis, capable of moving beyond pattern matching to infer shifts in causal dynamics, and the successful implementation of this training regimen is expected to yield a state-of-the-art model for the ADIA Structural Break Detection challenge.