SuperBPE
Other
Pricing
Free
Platforms
web
Description
Efficient superword tokenization for superior language model performance
Detailed Description
SuperBPE extends traditional subword BPE into a two-stage training process: it first learns subwords under the usual whitespace pretokenization, then lifts that restriction so merges can cross word boundaries and discover superword tokens (common multi-word expressions). The result is up to 33% fewer tokens for the same text, faster and more efficient inference, and stronger downstream performance. For example, an 8B-scale language model trained with SuperBPE achieves a +8.2% gain on MMLU and a +4.0% average improvement across 30 benchmarks while reducing inference compute by roughly 27%.
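The two-stage idea above can be sketched in a few lines. This is a minimal, illustrative BPE trainer (not SuperBPE's actual implementation): stage 1 refuses to merge any pair involving a space, mimicking whitespace pretokenization, while stage 2 drops that check so the most frequent pairs can span word boundaries and form superword tokens. The function name and the per-stage merge counts are hypothetical.

```python
from collections import Counter

def train_superbpe(text, stage1_merges, stage2_merges):
    """Illustrative two-stage BPE: stage 1 = whitespace-bounded subword
    merges; stage 2 = unrestricted merges that can form superwords."""
    # Treat the text as one character sequence; spaces are ordinary symbols.
    seq = list(text)
    merges = []

    def best_pair(allow_space):
        counts = Counter()
        for a, b in zip(seq, seq[1:]):
            # Stage 1: skip any pair that would merge across whitespace.
            if not allow_space and (' ' in a or ' ' in b):
                continue
            counts[(a, b)] += 1
        return max(counts, key=counts.get) if counts else None

    def apply(pair):
        nonlocal seq
        a, b = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(a + b)  # fuse the pair into one token
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out

    # Stage 1 (subwords), then stage 2 (superwords) on the same sequence.
    for allow_space, n in ((False, stage1_merges), (True, stage2_merges)):
        for _ in range(n):
            pair = best_pair(allow_space)
            if pair is None:
                break
            merges.append(pair)
            apply(pair)
    return merges, seq

merges, tokens = train_superbpe("the cat in the hat in the hat", 3, 3)
# After stage 2, some learned tokens span a space (e.g. a "the " superword),
# so the same text is covered by fewer tokens than subword-only BPE allows.
```

The token-count reduction claimed in the description comes exactly from such space-spanning tokens: once frequent multi-word strings collapse to single tokens, the sequence fed to the model is shorter.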
Tags
superbpe
tokenization
superword
efficiency
whitespace
language models
bpe