# Anastrophic Regularization CNN (Split-MNIST)
This model card hosts the weights of a CNN trained with Anastrophic Regularization ($\mathcal{R}_{ana}$), a method for mitigating catastrophic forgetting in sequential (continual) learning.
## Model Description
Anastrophic Regularization is derived from Anastrophic Theory, a mathematical framework for analyzing discrete periodic systems. Unlike standard $L_2$ weight decay or EWC, the method preserves the structural "Harmonic Memory" of the network by guiding weight evolution along Fisher-Rao geodesic paths.
### Key Advantages
- Maximum Plasticity: Weights adapt to new tasks while maintaining the global periodic functional invariants of previous ones.
- 100% Data-Free: Operates strictly in the spectral domain via Fast Fourier Transforms (FFT). No access to previous training data is required.
- Privacy Preserving: Ideal for environments with data-retention constraints where EWC or Replay buffers are not feasible.
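To make the data-free, spectral-domain idea concrete, the following is a minimal sketch (not the exact $\mathcal{R}_{ana}$ functional, which is defined in the paper): it takes an FFT of the weights and penalizes deviation of the magnitude spectrum from a snapshot anchored after Task A, requiring no access to Task A data. The function name and the mean-squared comparison are illustrative assumptions.

```python
import numpy as np

def spectral_anchor_penalty(w, w_anchor):
    """Data-free spectral penalty (illustrative stand-in for R_ana):
    compares the FFT magnitude spectrum of the current weights against
    a spectrum snapshot taken after training on Task A."""
    spec = np.abs(np.fft.rfft(w.ravel()))
    spec_anchor = np.abs(np.fft.rfft(w_anchor.ravel()))
    return float(np.mean((spec - spec_anchor) ** 2))

rng = np.random.default_rng(0)
w_a = rng.normal(size=(16, 16))                 # weights after Task A
w_b = w_a + 0.01 * rng.normal(size=w_a.shape)   # weights drifting during Task B
print(spectral_anchor_penalty(w_a, w_a))        # → 0.0 (no drift, no penalty)
print(spectral_anchor_penalty(w_b, w_a) > 0.0)  # → True (drift is penalized)
```

Because only the anchored spectrum (not the data) is stored, this style of penalty is compatible with the data-retention constraints mentioned above.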
## Intended Use
This specific model serves as a benchmark for Continual Learning. It was trained on the Split-MNIST dataset:
- Task A: Digits 0-4
- Task B: Digits 5-9 (Trained using $\mathcal{R}_{ana}$ to prevent forgetting Task A).
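The two-task split above can be reproduced by partitioning any MNIST label vector on the digit boundary. A minimal sketch (helper name is an assumption; in practice this would index a `torchvision` MNIST dataset):

```python
import numpy as np

def split_tasks(labels):
    """Split-MNIST partition: Task A = digits 0-4, Task B = digits 5-9.
    Returns index arrays selecting each task's samples."""
    labels = np.asarray(labels)
    task_a = np.flatnonzero(labels <= 4)
    task_b = np.flatnonzero(labels >= 5)
    return task_a, task_b

labels = np.array([0, 7, 3, 5, 9, 2])  # toy label vector
a, b = split_tasks(labels)
print(a.tolist(), b.tolist())  # → [0, 2, 5] [1, 3, 4]
```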
## Evaluation Results
The model achieves the following performance:
- Task B (Current) Accuracy: ~86.69%
- Task A (Retained) Accuracy: ~71.16%
## Mathematical Formulation
The weights were optimized using the following objective:
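The objective itself does not appear in this card; a generic regularized form consistent with the description (task loss on Task B plus a weighted $\mathcal{R}_{ana}$ term, with $\lambda$ a regularization strength not specified here) would be:

```latex
% Sketch only: the exact form of \mathcal{R}_{ana} and the value of
% \lambda are given in the paper, not in this card.
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{B}}(\theta)
  + \lambda \, \mathcal{R}_{ana}(\theta)
```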
## Citation and Full Paper
For the complete theoretical framework, the proof concerning Fisher-Rao geodesic paths, and the original publication, please refer to:
- Zenodo Repository: https://zenodo.org/records/18699347
- GitHub Implementation: https://github.com/MituMath/Anastrophic-Regularization-PyTorch