A custom BPE tokenizer with a 32k-token vocabulary, trained on the FineWeb-Edu dataset. It will be used for all models in the Chytrej2 series.
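
For readers unfamiliar with BPE, the following is a toy, character-level sketch of how merge rules are learned and applied. It is not the training pipeline used for this tokenizer, which is not documented here. The function names (`train_bpe`, `merge_pair`, `encode`) are hypothetical and chosen only for illustration. A real 32k-vocabulary tokenizer would typically keep merging until the vocabulary reaches the target size, usually with a dedicated library.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of strings (illustrative only)."""
    # Each whitespace-separated word starts as a tuple of single characters.
    words = Counter(tuple(word) for text in corpus for word in text.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # nothing left to merge
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the chosen pair merged into one symbol.
        merged = Counter()
        for symbols, freq in words.items():
            merged[merge_pair(symbols, best)] += freq
        words = merged
    return merges


def encode(word, merges):
    """Tokenize one word by applying the learned merges in training order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = train_bpe(["low low low lower lowest"], num_merges=2)
print(encode("lower", merges))  # ['low', 'e', 'r']
```

With only two merges the sketch learns the frequent unit `low` and leaves rarer suffixes as single characters. A larger merge budget on a large corpus yields progressively longer subword units in the same way.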
Built by PingVortex Labs.
Made by PingVortex.