Commit 01068abd authored by Patrick von Platen, committed by GitHub

add blog to docs (#10997)

parent cd56f3fe
@@ -41,6 +41,8 @@ propose novel applications to genomics data.*
Tips:
- For an in-depth explanation of how BigBird's attention works, see `this blog post
<https://huggingface.co/blog/big-bird>`__.
- BigBird comes with 2 implementations: **original_full** & **block_sparse**. For the sequence length < 1024, using
**original_full** is advised as there is no benefit in using **block_sparse** attention.
- The code currently uses window size of 3 blocks and 2 global blocks.
......
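
The tips above can be read as a simple selection rule. The sketch below is illustrative only: `choose_attention_type` and `attended_blocks` are hypothetical helpers, not part of the library, and the number of random blocks is an assumed parameter that the tips do not specify. The 1024 threshold, the window of 3 blocks, and the 2 global blocks are taken from the tips.

```python
# Hypothetical helpers that restate the tips above; not part of the library.

FULL_ATTENTION_MAX_LEN = 1024  # threshold quoted in the tips


def choose_attention_type(seq_len: int) -> str:
    """Return the attention implementation the tips suggest for seq_len."""
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    # Below the threshold, block-sparse attention brings no benefit.
    if seq_len < FULL_ATTENTION_MAX_LEN:
        return "original_full"
    return "block_sparse"


def attended_blocks(num_random_blocks: int,
                    window_blocks: int = 3,
                    global_blocks: int = 2) -> int:
    """Key blocks seen by one query block away from the sequence edges.

    Assumes the block-sparse pattern is window + global + random blocks;
    num_random_blocks is an assumption supplied by the caller.
    """
    return window_blocks + global_blocks + num_random_blocks


if __name__ == "__main__":
    for length in (512, 1024, 4096):
        print(length, choose_attention_type(length))
    print("blocks per query block:", attended_blocks(num_random_blocks=3))
```

The returned string is meant to be used as the value of the model configuration's `attention_type` setting; how it is wired into a particular model is left out here.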