@@ -12,6 +12,8 @@ specific language governing permissions and limitations under the License.
# Automatic speech recognition
[[open-in-colab]]
<Youtube id="TksaY_FDgnk"/>
Automatic speech recognition (ASR) converts a speech signal to text, mapping a sequence of audio inputs to text outputs. Virtual assistants like Siri and Alexa use ASR models to help users every day, and there are many other useful user-facing applications like live captioning and note-taking during meetings.
@@ -12,6 +12,8 @@ specific language governing permissions and limitations under the License.
# Audio classification
[[open-in-colab]]
<Youtube id="KWwzcmG98Ds"/>
Audio classification, just like text classification, assigns a class label to the input data. The only difference is that instead of text inputs, you have raw audio waveforms. Some practical applications of audio classification include identifying speaker intent, classifying the spoken language, and even identifying animal species by their sounds.
@@ -12,6 +12,8 @@ specific language governing permissions and limitations under the License.
# Multiple choice
[[open-in-colab]]
A multiple choice task is similar to question answering, except several candidate answers are provided along with a context and the model is trained to select the correct answer.
@@ -12,6 +12,8 @@ specific language governing permissions and limitations under the License.
# Summarization
[[open-in-colab]]
<Youtube id="yHnr5Dk2zCI"/>
Summarization creates a shorter version of a document or an article that captures all the important information. Along with translation, it is another example of a task that can be formulated as a sequence-to-sequence task. Summarization can be:
@@ -12,6 +12,8 @@ specific language governing permissions and limitations under the License.
# Translation
[[open-in-colab]]
<Youtube id="1JvfrvZgi6c"/>
Translation converts a sequence of text from one language to another. It is one of several tasks you can formulate as a sequence-to-sequence problem, a powerful framework for returning some output from an input, like translation or summarization. Translation systems are commonly used to translate text between languages, but they can also be used for speech, or for combinations in between like text-to-speech or speech-to-text.