MixtureOfExperts
Context : Transformers, NeuralNetworks, LanguageModels
A mixture of experts replaces one large neural network / language model with a number of smaller ones, plus a cheaper switching circuit (a "router" or "gate") that selects which "expert" to use for each input. In fact, AFAICT, this is also a way of partitioning a large neural net and just "switching off" some of the weights and calculations when they aren't relevant, so each forward pass only pays for the expert(s) it actually uses.
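A minimal sketch of the idea in PyTorch, assuming top-1 "switch"-style routing between small feed-forward experts; all names and sizes here are illustrative, not taken from any particular paper:

```python
# Minimal mixture-of-experts layer sketch (top-1 routing).
# Names and sizes are illustrative assumptions, not a specific published design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, n_experts=4):
        super().__init__()
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])
        # The cheap "switching circuit": a single linear layer scoring the experts.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)   # (n_tokens, n_experts)
        best = scores.argmax(dim=-1)                 # pick one expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = best == i
            if mask.any():
                # Only the chosen expert's weights run for these tokens;
                # the other experts are effectively "switched off".
                out[mask] = expert(x[mask]) * scores[mask, i].unsqueeze(-1)
        return out

tokens = torch.randn(10, 64)
print(MoELayer()(tokens).shape)  # torch.Size([10, 64])
```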
A million experts? MixtureOfAMillionExperts