ByteLatentTransformer

ThoughtStorms Wiki

Transformer without tokenisation

Instead, the byte stream is split into "byte patches" (short sequences of bytes). Patch boundaries are dynamic: more unpredictable (higher-entropy) regions of bytes get smaller patches, so the model spends more compute on them.
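A minimal sketch of the entropy-based patching idea. This is an illustration only: it uses a toy unigram surprisal score where the real BLT uses a small autoregressive byte-level language model, and the threshold value here is arbitrary.

```python
import math
from collections import Counter

def byte_surprisals(data: bytes) -> list[float]:
    # Toy stand-in for BLT's small entropy model: score each byte by
    # how surprising it is under a simple unigram frequency model.
    counts = Counter(data)
    total = len(data)
    return [-math.log2(counts[b] / total) for b in data]

def entropy_patches(data: bytes, threshold: float = 4.0) -> list[bytes]:
    # Start a new patch wherever a byte's surprisal exceeds the threshold,
    # so unpredictable regions are cut into more, smaller patches.
    surps = byte_surprisals(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if surps[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

text = b"aaaaaaaaXaaaaaaaa"
print(entropy_patches(text))
# -> [b'aaaaaaaa', b'Xaaaaaaaa']  (a patch boundary opens at the surprising 'X')
```

Predictable runs collapse into long patches, while the rare byte forces a boundary; in BLT the patches (not individual bytes) are what the large latent transformer processes.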

https://youtu.be/HuEgzyNOg7Y?si=E0_h55uQOXYbA3Dj
