ByteLatentTransformer

ThoughtStorms Wiki

Transformer without tokenisation

Instead, the byte stream is split into "byte patches" (short sequences of bytes). Patch boundaries are dynamic: more unpredictable (higher-entropy) regions of bytes get smaller patches, so the model spends more compute on them.
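A minimal sketch of the entropy-based patching idea. This is an illustration only: it uses a toy unigram surprisal score where the real BLT uses a small autoregressive byte-level language model, and the threshold value here is arbitrary.

```python
import math
from collections import Counter

def byte_surprisals(data: bytes) -> list[float]:
    # Toy stand-in for BLT's small entropy model: score each byte by
    # how surprising it is under a simple unigram frequency model.
    counts = Counter(data)
    total = len(data)
    return [-math.log2(counts[b] / total) for b in data]

def entropy_patches(data: bytes, threshold: float = 4.0) -> list[bytes]:
    # Start a new patch wherever a byte's surprisal exceeds the threshold,
    # so unpredictable regions are cut into more, smaller patches.
    surps = byte_surprisals(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if surps[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

text = b"aaaaaaaaXaaaaaaaa"
print(entropy_patches(text))
# -> [b'aaaaaaaa', b'Xaaaaaaaa']  (a patch boundary opens at the surprising 'X')
```

Predictable runs collapse into long patches, while the rare byte forces a boundary; in BLT the patches (not individual bytes) are what the large latent transformer processes.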

https://youtu.be/HuEgzyNOg7Y?si=E0_h55uQOXYbA3Dj
