Anastrophic Regularization CNN (Split-MNIST)

This model card hosts the weights for a CNN trained using Anastrophic Regularization ($\mathcal{R}_{ana}$), a novel approach to mitigate catastrophic forgetting in sequential learning tasks.

Model Description

Anastrophic Regularization is derived from Anastrophic Theory, a mathematical framework for analyzing discrete periodic systems. Unlike standard $L_2$ weight decay or Elastic Weight Consolidation (EWC), this method preserves the structural "Harmonic Memory" of the network by guiding weight evolution along Fisher-Rao geodesic paths.

Key Advantages

  • Maximum Plasticity: Weights adapt to new tasks while maintaining the global periodic functional invariants of previous ones.
  • 100% Data-Free: Operates strictly in the spectral domain via the Fast Fourier Transform (FFT); no access to previous training data is required.
  • Privacy Preserving: Ideal for environments with data-retention constraints where EWC or replay buffers are not feasible.
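The data-free property can be illustrated with a short sketch: the penalty below touches only the weight tensor, never any training data. Note that `spectral_concentration` is a hypothetical stand-in for $\Phi(\mathrm{Spec}(W))$ (this card does not define $\Phi$); here it is taken to be the fraction of FFT energy carried by the dominant frequencies.

```python
import torch

def spectral_concentration(W: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Hypothetical stand-in for Phi(Spec(W)): the fraction of the
    weight matrix's 2-D FFT energy carried by its k largest magnitudes.
    Returns a value in [0, 1]; no training data is involved."""
    energy = torch.fft.fft2(W).abs().pow(2).flatten()
    k = min(k, energy.numel())
    return energy.topk(k).values.sum() / (energy.sum() + 1e-12)

def spectral_penalty(W: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    # lambda * (1 - Phi(Spec(W))): small when spectral energy is
    # concentrated (strongly periodic structure), large when diffuse.
    return lam * (1.0 - spectral_concentration(W))
```

Because the penalty is a function of the weights alone, it can be added to any task-B loss without storing or replaying task-A examples.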

Intended Use

This specific model serves as a benchmark for Continual Learning. It was trained on the Split-MNIST dataset:

  1. Task A: Digits 0-4
  2. Task B: Digits 5-9 (Trained using $\mathcal{R}_{ana}$ to prevent forgetting Task A).
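The task split above can be sketched as follows. The exact data pipeline lives in the GitHub repository; `split_by_digits` is an illustrative helper, shown here on synthetic tensors in lieu of the real MNIST download.

```python
import torch
from torch.utils.data import TensorDataset

def split_by_digits(images, labels, digits):
    """Keep only the examples whose label falls in `digits`, remapping
    labels to 0..len(digits)-1 for a task-specific classifier head."""
    digits = torch.tensor(sorted(digits))
    mask = torch.isin(labels, digits)
    remap = {d.item(): i for i, d in enumerate(digits)}
    new_labels = torch.tensor([remap[l.item()] for l in labels[mask]])
    return TensorDataset(images[mask], new_labels)

# synthetic placeholder tensors standing in for the MNIST images/labels
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

task_a = split_by_digits(images, labels, range(0, 5))   # Task A: digits 0-4
task_b = split_by_digits(images, labels, range(5, 10))  # Task B: digits 5-9
```

Training then proceeds on `task_a` first, and on `task_b` with the $\mathcal{R}_{ana}$ penalty added to the loss.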

Evaluation Results

The model achieves the following performance:

  • Task B (Current) Accuracy: 86.69%
  • Task A (Retained) Accuracy: 71.16%

Mathematical Formulation

The weights were optimized using the following objective:

$$\mathcal{R}_{ana}(W) = \lambda\,(1 - \Phi(\mathrm{Spec}(W))) + \eta\,BB(W, W_{prev})$$
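A minimal PyTorch sketch of this objective is given below. The card does not define $\Phi$ or $BB$, so both are stand-ins for illustration only: $\Phi$ is approximated by top-$k$ FFT energy concentration, and $BB$ by a squared-distance anchor to the previous task's weights.

```python
import torch

def r_ana(W: torch.Tensor, W_prev: torch.Tensor,
          lam: float = 0.1, eta: float = 0.01) -> torch.Tensor:
    """Sketch of R_ana(W) = lam*(1 - Phi(Spec(W))) + eta*BB(W, W_prev).
    Phi and BB are toy placeholders; the paper's definitions differ."""
    # toy Phi: share of 2-D FFT energy in the 8 largest components
    energy = torch.fft.fft2(W).abs().pow(2).flatten()
    phi = energy.topk(min(8, energy.numel())).values.sum() / (energy.sum() + 1e-12)
    # toy BB: squared L2 anchor to the weights learned on the previous task
    bb = (W - W_prev).pow(2).sum()
    return lam * (1.0 - phi) + eta * bb
```

During Task B training, `r_ana(W, W_prev)` would be added to the cross-entropy loss, with `W_prev` frozen at the end of Task A.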

Citation and Full Paper

For the complete theoretical framework, the proof concerning Fisher-Rao geodesic paths, and the original publication, please refer to:

Zenodo Repository: https://zenodo.org/records/18699347

GitHub Implementation: https://github.com/MituMath/Anastrophic-Regularization-PyTorch
