SuperBPE
Other
Pricing
Free
Platforms
web
Description
Efficient superword tokenization for superior language model performance
Detailed Description
SuperBPE extends traditional subword BPE into a two-stage training process: it first learns subwords under the usual whitespace pretokenization, then lifts that restriction so merges can cross word boundaries and discover superword tokens (common multi-word expressions). The result is up to 33% fewer tokens for the same text, faster and more efficient inference, and stronger downstream performance. For example, an 8B-scale language model trained with SuperBPE achieves a +8.2% gain on MMLU and a +4.0% average improvement across 30 benchmarks while reducing inference compute by roughly 27%.
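The two-stage idea above can be sketched in a few lines. This is a minimal, illustrative BPE trainer (not SuperBPE's actual implementation): stage 1 refuses to merge any pair involving a space, mimicking whitespace pretokenization, while stage 2 drops that check so the most frequent pairs can span word boundaries and form superword tokens. The function name and the per-stage merge counts are hypothetical.

```python
from collections import Counter

def train_superbpe(text, stage1_merges, stage2_merges):
    """Illustrative two-stage BPE: stage 1 = whitespace-bounded subword
    merges; stage 2 = unrestricted merges that can form superwords."""
    # Treat the text as one character sequence; spaces are ordinary symbols.
    seq = list(text)
    merges = []

    def best_pair(allow_space):
        counts = Counter()
        for a, b in zip(seq, seq[1:]):
            # Stage 1: skip any pair that would merge across whitespace.
            if not allow_space and (' ' in a or ' ' in b):
                continue
            counts[(a, b)] += 1
        return max(counts, key=counts.get) if counts else None

    def apply(pair):
        nonlocal seq
        a, b = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(a + b)  # fuse the pair into one token
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out

    # Stage 1 (subwords), then stage 2 (superwords) on the same sequence.
    for allow_space, n in ((False, stage1_merges), (True, stage2_merges)):
        for _ in range(n):
            pair = best_pair(allow_space)
            if pair is None:
                break
            merges.append(pair)
            apply(pair)
    return merges, seq

merges, tokens = train_superbpe("the cat in the hat in the hat", 3, 3)
# After stage 2, some learned tokens span a space (e.g. a "the " superword),
# so the same text is covered by fewer tokens than subword-only BPE allows.
```

The token-count reduction claimed in the description comes exactly from such space-spanning tokens: once frequent multi-word strings collapse to single tokens, the sequence fed to the model is shorter.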
Tags
superbpe
tokenization
superword
efficiency
whitespace
language models
bpe