Unverified commit 4cbc797b authored by Younes Belkada, committed by GitHub

Change `BloomConfig` docstring (#19336)



* Change `BloomConfig` docstring

- Slightly reword the `BloomConfig` docstring
- Use the correct default vocab size
- Use the correct defaults for `hidden_size` and `n_head`

* Update src/transformers/models/bloom/configuration_bloom.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/bloom/configuration_bloom.py
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* make style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
parent e794ca5b
src/transformers/models/bloom/configuration_bloom.py

@@ -53,14 +53,16 @@ class BloomConfig(PretrainedConfig):
     Args:
-        vocab_size (`int`, *optional*, defaults to 50257):
-            Vocabulary size of the Bloom model. Defines the number of different tokens that can be represented by the
-            `inputs_ids` passed when calling [`BloomModel`].
-        hidden_size (`int`, *optional*, defaults to 768):
+        vocab_size (`int`, *optional*, defaults to 250880):
+            Vocabulary size of the Bloom model. Defines the maximum number of different tokens that can be represented
+            by the `inputs_ids` passed when calling [`BloomModel`]. Check [this
+            discussion](https://huggingface.co/bigscience/bloom/discussions/120#633d28389addb8530b406c2a) on how the
+            `vocab_size` has been defined.
+        hidden_size (`int`, *optional*, defaults to 64):
             Dimensionality of the embeddings and hidden states.
-        n_layer (`int`, *optional*, defaults to 12):
+        n_layer (`int`, *optional*, defaults to 2):
             Number of hidden layers in the Transformer encoder.
-        n_head (`int`, *optional*, defaults to 12):
+        n_head (`int`, *optional*, defaults to 8):
             Number of attention heads for each attention layer in the Transformer encoder.
         layer_norm_epsilon (`float`, *optional*, defaults to 1e-5):
             The epsilon to use in the layer normalization layers.
...
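As a quick sanity check of the corrected values, the defaults can be inspected by instantiating the config with no arguments. The snippet below is a minimal sketch, assuming a `transformers` version whose `BloomConfig` ships the defaults shown in the diff above.

```python
# Minimal sketch: print the defaults that the corrected docstring now documents.
# Assumes a transformers version whose BloomConfig uses the defaults from the diff above.
from transformers import BloomConfig

config = BloomConfig()  # no arguments: every field keeps its default value

print(config.vocab_size)          # 250880
print(config.hidden_size)         # 64
print(config.n_layer)             # 2
print(config.n_head)              # 8
print(config.layer_norm_epsilon)  # 1e-05
```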