Commit 4403183

johnhaofu and phantomfjh authored
Fix inaccurate description of RoPE paper findings (#2962)
Co-authored-by: phantomfjh <[email protected]>
1 parent 87b0efc commit 4403183

1 file changed: +1 -1 lines changed

designing-positional-encoding.md

Lines changed: 1 addition & 1 deletion
@@ -406,7 +406,7 @@ of values from our input vector. For \\(2D\\) data, we need to encode both horiz

 ## The future of positional encoding

-Is RoPE the final incarnation of positional encoding? This [recent paper](https://arxiv.org/pdf/2410.06205) from DeepMind deeply analyses RoPE and highlights some fundamental problems. TLDR: RoPE isn't a perfect solution, and the models mostly focus on the lower frequencies and the rotation for a certain percent of low frequencies improves performance on Gemma 2B!
+Is RoPE the final incarnation of positional encoding? This [recent paper](https://arxiv.org/pdf/2410.06205) from DeepMind deeply analyses RoPE and highlights some fundamental problems. TLDR: RoPE isn't a perfect solution, and the models mostly focus on the lower frequencies, but the paper shows that **removing** (not rotating) the lowest frequencies improves performance on Gemma 2B!

 I anticipate some future breakthroughs, perhaps taking inspiration from
 signal processing with ideas like wavelets or hierarchical implementations. As models
