# Anastrophic Regularization CNN (Split-MNIST)
This model card hosts the weights of a CNN trained with Anastrophic Regularization ($\mathcal{R}_{ana}$), a method for mitigating catastrophic forgetting in sequential (continual) learning.
## Model Description
Anastrophic Regularization is derived from Anastrophic Theory, a mathematical framework for analyzing discrete periodic systems. Unlike standard $L_2$ weight decay or EWC, the method preserves the structural "Harmonic Memory" of the network by guiding weight evolution along Fisher-Rao geodesic paths.
### Key Advantages
- Maximum Plasticity: Weights adapt to new tasks while maintaining the global periodic functional invariants of previous ones.
- 100% Data-Free: Operates strictly in the spectral domain via Fast Fourier Transforms (FFT). No access to previous training data is required.
- Privacy Preserving: Ideal for environments with data-retention constraints where EWC or Replay buffers are not feasible.
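To make the data-free, spectral-domain idea concrete, the following is a minimal sketch (not the exact $\mathcal{R}_{ana}$ functional, which is defined in the paper): it takes an FFT of the weights and penalizes deviation of the magnitude spectrum from a snapshot anchored after Task A, requiring no access to Task A data. The function name and the mean-squared comparison are illustrative assumptions.

```python
import numpy as np

def spectral_anchor_penalty(w, w_anchor):
    """Data-free spectral penalty (illustrative stand-in for R_ana):
    compares the FFT magnitude spectrum of the current weights against
    a spectrum snapshot taken after training on Task A."""
    spec = np.abs(np.fft.rfft(w.ravel()))
    spec_anchor = np.abs(np.fft.rfft(w_anchor.ravel()))
    return float(np.mean((spec - spec_anchor) ** 2))

rng = np.random.default_rng(0)
w_a = rng.normal(size=(16, 16))                 # weights after Task A
w_b = w_a + 0.01 * rng.normal(size=w_a.shape)   # weights drifting during Task B
print(spectral_anchor_penalty(w_a, w_a))        # → 0.0 (no drift, no penalty)
print(spectral_anchor_penalty(w_b, w_a) > 0.0)  # → True (drift is penalized)
```

Because only the anchored spectrum (not the data) is stored, this style of penalty is compatible with the data-retention constraints mentioned above.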
## Intended Use
This specific model serves as a benchmark for Continual Learning. It was trained on the Split-MNIST dataset:
- Task A: Digits 0-4
- Task B: Digits 5-9 (Trained using $\mathcal{R}_{ana}$ to prevent forgetting Task A).
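The two-task split above can be reproduced by partitioning any MNIST label vector on the digit boundary. A minimal sketch (helper name is an assumption; in practice this would index a `torchvision` MNIST dataset):

```python
import numpy as np

def split_tasks(labels):
    """Split-MNIST partition: Task A = digits 0-4, Task B = digits 5-9.
    Returns index arrays selecting each task's samples."""
    labels = np.asarray(labels)
    task_a = np.flatnonzero(labels <= 4)
    task_b = np.flatnonzero(labels >= 5)
    return task_a, task_b

labels = np.array([0, 7, 3, 5, 9, 2])  # toy label vector
a, b = split_tasks(labels)
print(a.tolist(), b.tolist())  # → [0, 2, 5] [1, 3, 4]
```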
## Evaluation Results
The model achieves the following performance:
- Task B (Current) Accuracy: ~86.69%
- Task A (Retained) Accuracy: ~71.16%
## Mathematical Formulation
The weights were optimized using the following objective:
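The objective itself does not appear in this card; a generic regularized form consistent with the description (task loss on Task B plus a weighted $\mathcal{R}_{ana}$ term, with $\lambda$ a regularization strength not specified here) would be:

```latex
% Sketch only: the exact form of \mathcal{R}_{ana} and the value of
% \lambda are given in the paper, not in this card.
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{B}}(\theta)
  + \lambda \, \mathcal{R}_{ana}(\theta)
```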
## Citation and Full Paper
For the complete theoretical framework, the proof concerning Fisher-Rao geodesic paths, and the original publication, please refer to:
- Zenodo Repository: https://zenodo.org/records/18699347
- GitHub Implementation: https://github.com/MituMath/Anastrophic-Regularization-PyTorch