MixtureOfExperts

ThoughtStorms Wiki

Context : Transformers, NeuralNetworks, LanguageModels

A mixture of experts is when you replace one large neural network / language model with a number of smaller ones, plus a cheaper switching circuit (the "router") that selects which "expert" to use for a given input. In fact AFAICT, this is also a way of partitioning a large neural net and just "switching off" some of the weights and calculations when they are not relevant, so you only pay for the experts that actually run.
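
A rough sketch of what that looks like in code, assuming PyTorch. The names and sizes here (SimpleMoE, num_experts, top_k) are illustrative, not any particular library's API: a cheap linear router scores the experts and only the top-k run for each token.

```python
# Minimal mixture-of-experts sketch with top-k routing (illustrative, not a real implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    def __init__(self, dim, num_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        # The "switching circuit" is just a cheap linear layer scoring the experts.
        self.router = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, dim)
        scores = self.router(x)                  # (tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalise over the chosen experts only
        out = torch.zeros_like(x)
        # Only the selected experts compute anything; the rest stay "switched off".
        for i, expert in enumerate(self.experts):
            token_idx, slot_idx = (chosen == i).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot_idx].unsqueeze(-1) * expert(x[token_idx])
        return out

# Usage: 16 tokens of width 64; each token only runs 2 of the 8 experts.
moe = SimpleMoE(dim=64)
y = moe(torch.randn(16, 64))
```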

https://huggingface.co/blog/moe

Autonomy of Experts : https://www.marktechpost.com/2025/01/26/autonomy-of-experts-aoe-a-router-free-paradigm-for-efficient-and-adaptive-mixture-of-experts-models/?amp

A million experts? MixtureOfAMillionExperts