num_moe_experts=layer_idx
#3 by alexandretl - opened
Hello,
Congrats on the release!
While looking through the code, I found that you set num_moe_experts=layer_idx.
That is unusual, especially since the paper says you use 16 experts in total.
Is there a reason behind that, or is it there by mistake?
Thank you
Hey, thanks for the comment! The name is a bit confusing: layer_idx here is not the layer's position. It pulls the number of experts from the config, config.zaya_layers[layer_n]: https://huggingface.co/Zyphra/ZAYA1-base/blob/main/config.json#L282
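To make the lookup concrete, here is a minimal sketch of the pattern, not the actual model code. The config object, the contents of zaya_layers, and the helper experts_for_layer are all hypothetical stand-ins; the real entries in config.json may look different.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the real config. The actual contents of
# zaya_layers in config.json may differ; 16 is only the expert count
# mentioned in this thread.
config = SimpleNamespace(zaya_layers=[16, 16, 16, 16])


def experts_for_layer(config, layer_n):
    """Return the entry stored at position layer_n of config.zaya_layers."""
    return config.zaya_layers[layer_n]


for layer_n in range(len(config.zaya_layers)):
    # Despite its name, layer_idx holds the value read from the config,
    # not the position of the layer.
    layer_idx = experts_for_layer(config, layer_n)
    num_moe_experts = layer_idx
    print(f"layer {layer_n}: num_moe_experts={num_moe_experts}")
```

Under these assumptions, num_moe_experts stays at 16 for every layer instead of growing with the layer position.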
OK, yes, now I understand. Thank you!
alexandretl changed discussion status to closed