Unverified Commit db9f1891 authored by Patrick von Platen, committed by GitHub

[ASR Pipeline] Correct init docs (#16833)

* correct

* up
parent 77de8d6c
@@ -70,38 +70,54 @@ class AutomaticSpeechRecognitionPipeline(ChunkPipeline):
     The input can be either a raw waveform or a audio file. In case of the audio file, ffmpeg should be installed for
     to support multiple audio formats
-    """
-
-    def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *args, **kwargs):
-        """
-        Arguments:
-            model ([`PreTrainedModel`] or [`TFPreTrainedModel`]):
-                The model that will be used by the pipeline to make predictions. This needs to be a model inheriting
-                from [`PreTrainedModel`] for PyTorch and [`TFPreTrainedModel`] for TensorFlow.
-            tokenizer ([`PreTrainedTokenizer`]):
-                The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from
-                [`PreTrainedTokenizer`].
-            feature_extractor ([`SequenceFeatureExtractor`]):
-                The feature extractor that will be used by the pipeline to encode waveform for the model.
-            chunk_length_s (`float`, *optional*, defaults to 0):
-                The input length for in each chunk. If `0` then chunking is disabled (default). Only available for CTC
-                models.
-            stride_length_s (`float`, *optional*, defaults to `chunk_length_s / 6`):
-                The length of stride on the left and right of each chunk. Used only with `chunk_length_s > 0`. This
-                enables the model to *see* more context and infer letters better than without this context but the
-                pipeline discards the stride bits at the end to make the final reconstitution as perfect as possible.
-            framework (`str`, *optional*):
-                The framework to use, either `"pt"` for PyTorch or `"tf"` for TensorFlow. The specified framework must
-                be installed.
-
-                If no framework is specified, will default to the one currently installed. If no framework is specified
-                and both frameworks are installed, will default to the framework of the `model`, or to PyTorch if no
-                model is provided.
-            device (`int`, *optional*, defaults to -1):
-                Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive will run the
-                model on the associated CUDA device id.
-        """
+
+    Arguments:
+        model ([`PreTrainedModel`] or [`TFPreTrainedModel`]):
+            The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from
+            [`PreTrainedModel`] for PyTorch and [`TFPreTrainedModel`] for TensorFlow.
+        tokenizer ([`PreTrainedTokenizer`]):
+            The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from
+            [`PreTrainedTokenizer`].
+        feature_extractor ([`SequenceFeatureExtractor`]):
+            The feature extractor that will be used by the pipeline to encode waveform for the model.
+        chunk_length_s (`float`, *optional*, defaults to 0):
+            The input length for in each chunk. If `chunk_length_s = 0` then chunking is disabled (default). Only
+            available for CTC models, e.g. [`Wav2Vec2ForCTC`].
+
+            <Tip>
+
+            For more information on how to effectively use `chunk_length_s`, please have a look at the [ASR chunking
+            blog post](https://huggingface.co/blog/asr-chunking).
+
+            </Tip>
+
+        stride_length_s (`float`, *optional*, defaults to `chunk_length_s / 6`):
+            The length of stride on the left and right of each chunk. Used only with `chunk_length_s > 0`. This enables
+            the model to *see* more context and infer letters better than without this context but the pipeline
+            discards the stride bits at the end to make the final reconstitution as perfect as possible.
+
+            <Tip>
+
+            For more information on how to effectively use `stride_length_s`, please have a look at the [ASR chunking
+            blog post](https://huggingface.co/blog/asr-chunking).
+
+            </Tip>
+
+        framework (`str`, *optional*):
+            The framework to use, either `"pt"` for PyTorch or `"tf"` for TensorFlow. The specified framework must be
+            installed. If no framework is specified, will default to the one currently installed. If no framework is
+            specified and both frameworks are installed, will default to the framework of the `model`, or to PyTorch if
+            no model is provided.
+        device (`int`, *optional*, defaults to -1):
+            Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive will run the model on
+            the associated CUDA device id.
+        decoder (`pyctcdecode.BeamSearchDecoderCTC`, *optional*):
+            [PyCTCDecode's
+            BeamSearchDecoderCTC](https://github.com/kensho-technologies/pyctcdecode/blob/2fd33dc37c4111417e08d89ccd23d28e9b308d19/pyctcdecode/decoder.py#L180)
+            can be passed for language model boosted decoding. See [`Wav2Vec2ProcessorWithLM`] for more information.
+    """
+
+    def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *args, **kwargs):
         super().__init__(*args, **kwargs)
         self.feature_extractor = feature_extractor
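As background for the `chunk_length_s` / `stride_length_s` arguments documented above, here is a minimal sketch of chunked inference along the lines of the linked ASR chunking blog post. The checkpoint name and audio path are illustrative placeholders, not part of this commit:

```python
from transformers import pipeline

# Build an ASR pipeline from a CTC checkpoint (placeholder model name);
# chunking is only available for CTC models such as Wav2Vec2ForCTC.
asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

# Split long audio into 10s chunks; stride_length_s=(4, 2) keeps 4s of
# overlapping context on the left and 2s on the right of each chunk, which
# the pipeline discards again when stitching the transcript back together.
# Passing a file path requires ffmpeg to be installed for decoding.
output = asr("long_recording.wav", chunk_length_s=10, stride_length_s=(4, 2))
print(output["text"])
```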
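And a sketch of the newly documented `decoder` argument, wiring a pyctcdecode beam-search decoder from [`Wav2Vec2ProcessorWithLM`] into the pipeline. The checkpoint is illustrative (one that ships an n-gram LM alongside the acoustic model), and `pyctcdecode` plus a KenLM backend are assumed to be installed:

```python
from transformers import (
    AutoModelForCTC,
    AutomaticSpeechRecognitionPipeline,
    Wav2Vec2ProcessorWithLM,
)

# Illustrative checkpoint bundling a pyctcdecode language model.
checkpoint = "patrickvonplaten/wav2vec2-base-100h-with-lm"

model = AutoModelForCTC.from_pretrained(checkpoint)
processor = Wav2Vec2ProcessorWithLM.from_pretrained(checkpoint)

# Passing processor.decoder switches the pipeline from plain greedy CTC
# decoding to language-model-boosted beam search.
asr = AutomaticSpeechRecognitionPipeline(
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    decoder=processor.decoder,
)
```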