Solution Architecture v4
This document outlines the complete C4 architecture for our solution to the ADIA challenge. This version (v4) is a self-contained blueprint incorporating all design decisions and corrections, providing a definitive guide for implementation.
Level 1: System Context
This view shows our system in relation to the external platform. Our solution is a self-contained library that is called by the ADIA Platform Runner, which expects train() and infer() entrypoints.
C4Context
title System Context Diagram for ADIA Challenge
System_Ext(adia_platform, "ADIA Platform Runner", "The environment that calls our train() and infer() functions.")
System(our_system, "Our Solution", "A two-stage deep learning pipeline to detect structural breaks in time series.")
Rel(adia_platform, our_system, "Calls train() and infer(), providing data.")
Level 2: Container Diagram
This view breaks down our solution into its major, high-level structural blocks or “containers.”
C4Container
title System Container Diagram for ADIA Challenge
System_Ext(adia_platform, "ADIA Platform Runner", "The environment that calls our train() and infer() functions.")
Container_Boundary(our_system, "Our Solution") {
Container(core_lib, "Core Services Library", "Python Module", "Contains all foundational, reusable code for data processing and model definitions.")
Container(training_pipeline, "Training Pipeline Logic", "Python Module", "Orchestrates the two-stage training process (pre-training and fine-tuning).")
Container(inference_pipeline, "Inference Pipeline Logic", "Python Module", "Orchestrates the prediction process using the trained model.")
ContainerDb(model_store, "Model Store", "File System Directory", "Stores the final, trained model artifact and its configuration.")
}
Rel(adia_platform, training_pipeline, "Calls train()")
Rel(adia_platform, inference_pipeline, "Calls infer()")
Rel(training_pipeline, core_lib, "Uses data processing & model components from")
Rel(training_pipeline, model_store, "Writes final model artifact to")
Rel(inference_pipeline, core_lib, "Uses data processing & model components from")
Rel(inference_pipeline, model_store, "Reads final model artifact from")
Level 3: Component Diagrams
This level shows the components inside each container.
Container 1: Core Services Library
This container holds all the foundational, reusable building blocks of our system.
C4Component
title Component Diagram for Core Services Library
Container_Boundary(core_container, "Core Services Library") {
Component(perm_sym, "PermutationSymbolizer", "Converts a vector to a symbolic permutation.")
Component(series_proc, "SeriesProcessor", "Transforms a full time series into symbolic sequences.")
Component(eca_gen, "ECADataGenerator", "Creates synthetic ECA data for pre-training.")
Component(trans_enc, "HierarchicalDynamicalEncoder", "Model primitive for encoding sequences.")
Component(trans_dec, "HierarchicalDynamicalDecoder", "Model primitive for decoding sequences.")
Component(dyn_ae, "MDL_AU_Net_Autoencoder", "Composite model for pre-training.")
Component(break_class, "StructuralBreakClassifier", "Composite model for fine-tuning.")
}
Rel(series_proc, perm_sym, "Uses")
Rel(dyn_ae, trans_enc, "Is composed of")
Rel(dyn_ae, trans_dec, "Is composed of")
Rel(break_class, trans_enc, "Is composed of")
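To make the role of ECADataGenerator concrete, the following is a minimal sketch of how labeled Elementary Cellular Automaton data could be produced. It is written in pure Python for illustration, whereas the design relies on cellpylib. The function names `eca_step` and `generate_eca_sample`, the periodic boundary, and the use of the rule number as the label are all assumptions. Composite rules are not covered here.

```python
import random


def eca_step(cells, rule):
    """Apply one ECA update (Wolfram rule number 0..255) with periodic boundaries."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # Encode the three-cell neighbourhood as an integer in 0..7.
        neighbourhood = (left << 2) | (centre << 1) | right
        # The matching bit of the rule number is the next state of the cell.
        nxt.append((rule >> neighbourhood) & 1)
    return nxt


def generate_eca_sample(rule, width=16, steps=8, seed=0):
    """Return (rows, label): a space-time grid and the rule that produced it.

    Hypothetical helper: the seed fixes the random initial row so that the
    same arguments always yield the same sample.
    """
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(width)]
    rows = [state]
    for _ in range(steps - 1):
        state = eca_step(state, rule)
        rows.append(state)
    return rows, rule
```

Seeding a local `random.Random` instance rather than the global generator keeps reproducibility independent of whatever else the training run does with random numbers.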
Container 2: Training Pipeline Logic
This container’s components are pure orchestrators that manage the two-stage training flow.
C4Component
title Component Diagram for Training Pipeline
Container_Boundary(core_container, "Core Services Library") {
Component(eca_gen, "ECADataGenerator")
Component(dyn_ae, "MDL_AU_Net_Autoencoder")
Component(series_proc, "SeriesProcessor")
Component(break_class, "StructuralBreakClassifier")
}
Container_Boundary(training_container, "Training Pipeline Logic") {
Component(pre_trainer, "MDLPreTrainer", "Manages the pre-training loop.")
Component(fine_tuner, "BreakClassifierFinetuner", "Manages the fine-tuning loop.")
Component(saver, "EncoderSaver", "Saves the final model artifact.")
}
System_Ext(model_store, "Model Store")
Rel(pre_trainer, eca_gen, "Uses")
Rel(pre_trainer, dyn_ae, "Trains")
Rel(fine_tuner, series_proc, "Uses")
Rel(fine_tuner, break_class, "Fine-tunes")
Rel_D(break_class, saver, "Provides Final Encoder to")
Rel_R(saver, model_store, "Writes artifact to")
Container 3: Inference Pipeline Logic
This container’s components load the final model and use core services to generate predictions.
C4Component
title Component Diagram for Inference Pipeline
System_Ext(model_store, "Model Store")
System_Ext(adia_platform, "ADIA Platform Runner")
Container_Boundary(inference_container, "Inference Pipeline Logic") {
Component(loader, "EncoderLoader", "Loads model from the Model Store.")
Component(encoder, "HierarchicalDynamicalEncoder", "The loaded, fine-tuned model artifact.")
Component(series_proc, "SeriesProcessor", "Transforms raw test data into symbolic sequences.")
Component(fingerprinter, "Fingerprinter", "Generates a stable fingerprint for a data segment.")
Component(scorer, "BreakScoreCalculator", "Computes the final distance score.")
}
Rel_R(loader, model_store, "Reads artifact from")
Rel_D(loader, encoder, "Instantiates")
Rel_D(fingerprinter, series_proc, "Uses")
Rel_D(fingerprinter, encoder, "Uses")
Rel_D(scorer, fingerprinter, "Gets 'before' and 'after' fingerprints from")
Rel_R(scorer, adia_platform, "Yields Prediction to")
Container 4: Model Store
This container represents the persistence layer (model_directory_path).
C4Component
title Component Diagram for Model Store
Container_Boundary(model_store_container, "Model Store (File System Directory)") {
Component(model_weights, "final_encoder.pth", "PyTorch State Dictionary", "The learned numerical weights of the final encoder.")
Component(model_config, "model_config.joblib", "Configuration File", "Hyperparameters needed to build the model architecture before loading weights.")
}
Level 4: Code View (The Blueprint for Implementation)
This level details the primary classes and their corrected “code contracts.”
Module 1: core_library/data_processing.py
Class Name | Role & Responsibilities | Key Public Methods | Key Collaborators |
---|---|---|---|
PermutationSymbolizer | Symbolic Converter. - Converts a single numeric vector into a discrete ordinal pattern symbol. - Uses randomized tie-breaking for robustness. | `__init__(embedding_dim, seed)`, `symbolize_vector(vector)` | (None - Foundational) |
SeriesProcessor | Real Data Transformer. - Manages the full pipeline: time-delay embedding, symbolization, and windowing into sequences. - Handles edge cases like series being too short. | `__init__(symbolizer, sequence_length)`, `process(series)` | PermutationSymbolizer |
ECADataGenerator | Synthetic Data Factory. - Simulates Elementary Cellular Automata to create a labeled dataset. - Handles composite rules and ensures reproducibility. | `__init__(config)`, `generate_training_data()` | (None - Uses cellpylib externally) |
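The contracts for PermutationSymbolizer and SeriesProcessor can be illustrated with a short pure-Python sketch. Several details below are assumptions and are not specified above: the embedding delay of 1, the Lehmer-code mapping from permutation to integer symbol, the non-overlapping windows, and returning an empty list when the series is too short.

```python
import random


class PermutationSymbolizer:
    """Maps a vector of length embedding_dim to an ordinal-pattern symbol."""

    def __init__(self, embedding_dim, seed=0):
        self.embedding_dim = embedding_dim
        self._rng = random.Random(seed)  # private generator, reproducible ties

    def symbolize_vector(self, vector):
        if len(vector) != self.embedding_dim:
            raise ValueError("vector length must equal embedding_dim")
        # A random secondary key orders equal values at random (tie-breaking).
        keyed = [(value, self._rng.random(), i) for i, value in enumerate(vector)]
        order = [i for _, _, i in sorted(keyed)]
        # Assumption: encode the permutation as its Lehmer code in [0, d!).
        symbol = 0
        for i in range(len(order)):
            smaller = sum(1 for j in range(i + 1, len(order)) if order[j] < order[i])
            symbol = symbol * (len(order) - i) + smaller
        return symbol


class SeriesProcessor:
    """Time-delay embedding, symbolization, then windowing into sequences."""

    def __init__(self, symbolizer, sequence_length):
        self.symbolizer = symbolizer
        self.sequence_length = sequence_length

    def process(self, series):
        d = self.symbolizer.embedding_dim
        # Assumption: delay 1, so each embedding vector is d consecutive values.
        symbols = [
            self.symbolizer.symbolize_vector(series[i:i + d])
            for i in range(len(series) - d + 1)
        ]
        length = self.sequence_length
        # Non-overlapping windows; a too-short series yields an empty list.
        return [
            symbols[start:start + length]
            for start in range(0, len(symbols) - length + 1, length)
        ]
```

For example, a strictly increasing series always produces the identity pattern, so `SeriesProcessor(PermutationSymbolizer(3), 2).process([1, 2, 3, 4, 5, 6])` gives two windows containing only the symbol 0.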
Module 2: core_library/model_architecture.py
Class Name | Role & Responsibilities | Key Public Methods | Key Collaborators |
---|---|---|---|
HierarchicalDynamicalEncoder | Sequence Encoder (Contracting Path). - An `nn.Module` that compresses a sequence into a final “fingerprint” sequence. - Its forward pass MUST return a tuple: `(fingerprint_sequence, residuals_list)`. | `__init__(args)`, `forward(sequence_batch)` | (None - Primitive) |
HierarchicalDynamicalDecoder | Sequence Decoder (Expanding Path). - An `nn.Module` that reconstructs the original sequence. - Its forward pass MUST accept two arguments: `(fingerprint_seq, residuals)`. | `__init__(args, transitions)`, `forward(fingerprint_seq, residuals)` | (None - Primitive) |
MDL_AU_Net_Autoencoder | Pre-training Model. - A composite `nn.Module` that combines the Encoder, Decoder, and a classification head. - Its internal logic MUST correctly handle the tuple returned by the encoder. | `__init__(args)`, `forward(sequence_batch)`, `encode(sequence_batch)` | HierarchicalDynamicalEncoder, HierarchicalDynamicalDecoder |
StructuralBreakClassifier | Fine-tuning Model. - A composite `nn.Module` that predicts a break from processed before and after periods. - Its forward pass MUST accept two lists of tensors: `(before_seqs, after_seqs)`. - Its internal logic MUST correctly unpack the `(fingerprint, _)` tuple when calling its encoder. | `__init__(encoder, latent_dim, ...)`, `forward(before_seqs, after_seqs)` | HierarchicalDynamicalEncoder |
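The tuple and list contracts in this table are the easiest part of the design to get wrong, so a small PyTorch sketch may help. The classes `TinyEncoder`, `TinyDecoder`, and `TinyBreakClassifier` are hypothetical stand-ins, not the real architecture. The strided convolutions, layer sizes, and mean pooling are assumptions chosen only to make the shapes line up.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Contracting path: returns (fingerprint_sequence, residuals_list)."""

    def __init__(self, vocab_size=6, dim=8, levels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.down = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel_size=2, stride=2) for _ in range(levels)]
        )

    def forward(self, sequence_batch):
        x = self.embed(sequence_batch).transpose(1, 2)  # (batch, dim, length)
        residuals = []
        for layer in self.down:
            residuals.append(x)        # keep the pre-downsampling activation
            x = torch.relu(layer(x))   # halve the sequence length
        return x, residuals


class TinyDecoder(nn.Module):
    """Expanding path: accepts (fingerprint_seq, residuals)."""

    def __init__(self, vocab_size=6, dim=8, levels=2):
        super().__init__()
        self.up = nn.ModuleList(
            [nn.ConvTranspose1d(dim, dim, kernel_size=2, stride=2) for _ in range(levels)]
        )
        self.head = nn.Conv1d(dim, vocab_size, kernel_size=1)

    def forward(self, fingerprint_seq, residuals):
        x = fingerprint_seq
        # Residuals are consumed deepest-first, mirroring the encoder.
        for layer, skip in zip(self.up, reversed(residuals)):
            x = torch.relu(layer(x)) + skip
        return self.head(x)  # per-position symbol logits


class TinyBreakClassifier(nn.Module):
    """Accepts two lists of tensors, one (n_windows, length) tensor per series."""

    def __init__(self, encoder, latent_dim):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(2 * latent_dim, 1)

    def _pool(self, seqs):
        pooled = []
        for windows in seqs:
            fingerprint, _ = self.encoder(windows)  # unpack, discard residuals
            pooled.append(fingerprint.mean(dim=(0, 2)))  # average windows and time
        return torch.stack(pooled)

    def forward(self, before_seqs, after_seqs):
        features = torch.cat([self._pool(before_seqs), self._pool(after_seqs)], dim=1)
        return self.head(features).squeeze(-1)  # one break logit per series
```

Passing lists rather than a single stacked tensor lets each series contribute a different number of windows, which is why the classifier pools per series before concatenating the before and after features.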
Module 3: training_pipeline.py
Class Name | Role & Responsibilities | Key Public Methods | Key Collaborators |
---|---|---|---|
MDLPreTrainer | Pre-training Orchestrator. - Manages the training loop for the MDL_AU_Net_Autoencoder. | `__init__(model, config)`, `pretrain(data_generator)` | MDL_AU_Net_Autoencoder, ECADataGenerator |
BreakClassifierFinetuner | Fine-tuning Orchestrator. - Manages the training loop for the StructuralBreakClassifier. - Its implementation must pass lists of tensors to the classifier. | `__init__(model, config)`, `finetune(X_train, y_train, processor)` | StructuralBreakClassifier, SeriesProcessor |
EncoderSaver | Artifact Manager. - Saves the final fine-tuned encoder and its configuration. | `save(model, config, path)` | StructuralBreakClassifier |
Module 4: inference_pipeline.py
Class Name | Role & Responsibilities | Key Public Methods | Key Collaborators |
---|---|---|---|
EncoderLoader | Artifact Loader. - Reads the config and weights from the Model Store. | `load(path)` | HierarchicalDynamicalEncoder |
Fingerprinter | Vector Generator. - Orchestrates producing a single, stable fingerprint for a time series segment. - Its implementation must handle lists of sequences and average the resulting fingerprints. | `__init__(encoder, processor)`, `generate(series)` | SeriesProcessor, HierarchicalDynamicalEncoder |
BreakScoreCalculator | Prediction Calculator. - Takes two fingerprint vectors and computes their cosine distance. | `calculate(fp_before, fp_after)` | (None - simple math) |
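The two numerical steps at the end of inference, averaging per-window fingerprints and scoring by cosine distance, can be sketched without the model. The helper names `average_fingerprints` and `cosine_distance` are hypothetical, the fingerprints are plain lists standing in for encoder outputs, and returning 0.0 for a zero-norm vector is an assumption about how a degenerate case might be handled.

```python
import math


def average_fingerprints(fingerprints):
    """Element-wise mean of equally sized per-window fingerprint vectors."""
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")
    count = len(fingerprints)
    return [sum(column) / count for column in zip(*fingerprints)]


def cosine_distance(fp_before, fp_after, eps=1e-12):
    """Return 1 - cosine similarity, a value in [0, 2]."""
    dot = sum(a * b for a, b in zip(fp_before, fp_after))
    norm_before = math.sqrt(sum(a * a for a in fp_before))
    norm_after = math.sqrt(sum(b * b for b in fp_after))
    denominator = norm_before * norm_after
    if denominator < eps:
        # Assumption: a zero vector carries no evidence of a break.
        return 0.0
    return 1.0 - dot / denominator


# Usage: average the windows of each period, then score the pair.
before = average_fingerprints([[1.0, 0.0], [1.0, 0.2]])
after = average_fingerprints([[0.0, 1.0], [0.2, 1.0]])
score = cosine_distance(before, after)
```

Identical directions score 0, orthogonal fingerprints score 1, and opposite directions score 2, so a larger score indicates a larger change between the before and after periods.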