Commit 721ee783 authored by Klaus Hipp, committed by GitHub

[Docs] Fix spelling and grammar mistakes (#28825)

* Fix typos and grammar mistakes in docs and examples

* Fix typos in docstrings and comments

* Fix spelling of `tokenizer` in model tests

* Remove erroneous spaces in decorators

* Remove extra spaces in Markdown link texts
parent 2418c64a
@@ -59,7 +59,7 @@ class Pix2StructTextConfig(PretrainedConfig):
 relative_attention_max_distance (`int`, *optional*, defaults to 128):
 The maximum distance of the longer sequences for the bucket separation.
 dropout_rate (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 layer_norm_epsilon (`float`, *optional*, defaults to 1e-6):
 The epsilon used by the layer normalization layers.
 initializer_factor (`float`, *optional*, defaults to 1.0):
@@ -199,7 +199,7 @@ class Pix2StructVisionConfig(PretrainedConfig):
 layer_norm_eps (`float`, *optional*, defaults to 1e-06):
 The epsilon used by the layer normalization layers.
 dropout_rate (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_dropout (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 1e-10):
......
@@ -53,7 +53,7 @@ class QDQBertConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 512):
......
@@ -88,7 +88,7 @@ class Qwen2Tokenizer(PreTrainedTokenizer):
 """
 Construct a Qwen2 tokenizer. Based on byte-level Byte-Pair-Encoding.
-Same with GPT2Tokenzier, this tokenizer has been trained to treat spaces like parts of the tokens so a word will
+Same with GPT2Tokenizer, this tokenizer has been trained to treat spaces like parts of the tokens so a word will
 be encoded differently whether it is at the beginning of the sentence (without space) or not:
```python ```python
......
@@ -46,7 +46,7 @@ class Qwen2TokenizerFast(PreTrainedTokenizerFast):
 Construct a "fast" Qwen2 tokenizer (backed by HuggingFace's *tokenizers* library). Based on byte-level
 Byte-Pair-Encoding.
-Same with GPT2Tokenzier, this tokenizer has been trained to treat spaces like parts of the tokens so a word will
+Same with GPT2Tokenizer, this tokenizer has been trained to treat spaces like parts of the tokens so a word will
 be encoded differently whether it is at the beginning of the sentence (without space) or not:
```python ```python
......
@@ -82,7 +82,7 @@ class RealmConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 512):
......
@@ -62,7 +62,7 @@ class RemBertConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0):
 The dropout ratio for the attention probabilities.
 classifier_dropout_prob (`float`, *optional*, defaults to 0.1):
......
@@ -52,7 +52,7 @@ class RoCBertConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 512):
......
@@ -72,7 +72,7 @@ class RoFormerConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 1536):
......
@@ -223,7 +223,7 @@ class SeamlessM4TConfig(PretrainedConfig):
 variance_predictor_kernel_size (`int`, *optional*, defaults to 3):
 Kernel size of the duration predictor. Applies to the vocoder only.
 var_pred_dropout (`float`, *optional*, defaults to 0.5):
-The dropout probabilitiy of the duration predictor. Applies to the vocoder only.
+The dropout probability of the duration predictor. Applies to the vocoder only.
 vocoder_offset (`int`, *optional*, defaults to 4):
 Offset the unit token ids by this number to account for symbol tokens. Applies to the vocoder only.
......
@@ -183,7 +183,7 @@ class SeamlessM4Tv2Config(PretrainedConfig):
 t2u_variance_predictor_kernel_size (`int`, *optional*, defaults to 3):
 Kernel size of the convolutional layers of the text-to-unit's duration predictor.
 t2u_variance_pred_dropout (`float`, *optional*, defaults to 0.5):
-The dropout probabilitiy of the text-to-unit's duration predictor.
+The dropout probability of the text-to-unit's duration predictor.
 > Hifi-Gan Vocoder specific parameters
@@ -225,7 +225,7 @@ class SeamlessM4Tv2Config(PretrainedConfig):
 variance_predictor_kernel_size (`int`, *optional*, defaults to 3):
 Kernel size of the duration predictor. Applies to the vocoder only.
 var_pred_dropout (`float`, *optional*, defaults to 0.5):
-The dropout probabilitiy of the duration predictor. Applies to the vocoder only.
+The dropout probability of the duration predictor. Applies to the vocoder only.
 vocoder_offset (`int`, *optional*, defaults to 4):
 Offset the unit token ids by this number to account for symbol tokens. Applies to the vocoder only.
......
@@ -56,7 +56,7 @@ class SplinterConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 512):
......
@@ -57,7 +57,7 @@ class TimesformerConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 0.02):
......
@@ -64,7 +64,7 @@ class TvltConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 0.02):
......
@@ -68,7 +68,7 @@ class UniSpeechConfig(PretrainedConfig):
 feat_proj_dropout (`float`, *optional*, defaults to 0.0):
 The dropout probability for output of the feature encoder.
 feat_quantizer_dropout (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for the output of the feature encoder that's used by the quantizer.
+The dropout probability for the output of the feature encoder that's used by the quantizer.
 final_dropout (`float`, *optional*, defaults to 0.1):
 The dropout probability for the final projection layer of [`UniSpeechForCTC`].
 layerdrop (`float`, *optional*, defaults to 0.1):
......
@@ -69,7 +69,7 @@ class UniSpeechSatConfig(PretrainedConfig):
 feat_proj_dropout (`float`, *optional*, defaults to 0.0):
 The dropout probability for output of the feature encoder.
 feat_quantizer_dropout (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for the output of the feature encoder that's used by the quantizer.
+The dropout probability for the output of the feature encoder that's used by the quantizer.
 final_dropout (`float`, *optional*, defaults to 0.1):
 The dropout probability for the final projection layer of [`UniSpeechSatForCTC`].
 layerdrop (`float`, *optional*, defaults to 0.1):
......
@@ -58,7 +58,7 @@ class VideoMAEConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 0.02):
......
@@ -59,7 +59,7 @@ class ViltConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 0.02):
......
@@ -71,7 +71,7 @@ class VisualBertConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
 The dropout ratio for the attention probabilities.
 max_position_embeddings (`int`, *optional*, defaults to 512):
......
@@ -50,7 +50,7 @@ class ViTMAEConfig(PretrainedConfig):
 The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
 `"relu"`, `"selu"` and `"gelu_new"` are supported.
 hidden_dropout_prob (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
 attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
 The dropout ratio for the attention probabilities.
 initializer_range (`float`, *optional*, defaults to 0.02):
......
@@ -82,7 +82,7 @@ class Wav2Vec2Config(PretrainedConfig):
 The non-linear activation function (function or string) in the 1D convolutional layers of the feature
 extractor. If string, `"gelu"`, `"relu"`, `"selu"` and `"gelu_new"` are supported.
 feat_quantizer_dropout (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for quantized feature encoder states.
+The dropout probability for quantized feature encoder states.
 conv_dim (`Tuple[int]` or `List[int]`, *optional*, defaults to `(512, 512, 512, 512, 512, 512, 512)`):
 A tuple of integers defining the number of input and output channels of each 1D convolutional layer in the
 feature encoder. The length of *conv_dim* defines the number of 1D convolutional layers.
@@ -140,7 +140,7 @@ class Wav2Vec2Config(PretrainedConfig):
 contrastive_logits_temperature (`float`, *optional*, defaults to 0.1):
 The temperature *kappa* in the contrastive loss.
 feat_quantizer_dropout (`float`, *optional*, defaults to 0.0):
-The dropout probabilitiy for the output of the feature encoder that's used by the quantizer.
+The dropout probability for the output of the feature encoder that's used by the quantizer.
 num_negatives (`int`, *optional*, defaults to 100):
 Number of negative samples for the contrastive loss.
 codevector_dim (`int`, *optional*, defaults to 256):
......
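The hunks above repeat the same two corrections (`probabilitiy` to `probability`, `GPT2Tokenzier` to `GPT2Tokenizer`) across many docstrings. As an illustration only, a bulk fix of this kind could be sketched as below. The names `TYPO_FIXES` and `fix_typos` are hypothetical and are not part of this commit or of the library; the commit itself does not say how the edits were produced.

```python
import re
from typing import Tuple

# Misspelling -> correction pairs, taken from the hunks in this diff.
# (Hypothetical helper; not part of the commit.)
TYPO_FIXES = {
    "probabilitiy": "probability",
    "GPT2Tokenzier": "GPT2Tokenizer",
}

# Match any listed misspelling as a whole word only.
_TYPO_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(typo) for typo in TYPO_FIXES) + r")\b"
)


def fix_typos(text: str) -> Tuple[str, int]:
    """Return the corrected text and the number of replacements made."""
    return _TYPO_PATTERN.subn(lambda match: TYPO_FIXES[match.group(1)], text)


if __name__ == "__main__":
    line = "The dropout probabilitiy for all fully connected layers."
    fixed, count = fix_typos(line)
    print(count, fixed)
```

Matching on word boundaries keeps the replacement from touching identifiers that merely contain one of the misspellings, and the returned count makes it easy to report how many lines a file would change before writing anything back.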