Commit 72b712a1 authored by moto, committed by Facebook GitHub Bot

Move Streamer API out of prototype (#2378)

Summary:
This commit moves the Streaming API out of the prototype module.

* The related classes are renamed as follows:

  - `Streamer` -> `StreamReader`
  - `SourceStream` -> `StreamReaderSourceStream`
  - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
  - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
  - `OutputStream` -> `StreamReaderOutputStream`

This change is a preemptive measure for the possible future addition of a
`StreamWriter` API.
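
For illustration, a minimal sketch of the renamed API as it is now exposed from `torchaudio.io` (the media file name and chunk parameters are arbitrary placeholders, and a nightly build with FFmpeg support is assumed):

```python
# Minimal sketch of the renamed API; "example.mp4" is a placeholder, not a bundled asset.
from torchaudio.io import StreamReader

reader = StreamReader("example.mp4")
# Decode the default audio stream in 4096-frame chunks, resampled to 16 kHz on the fly.
reader.add_basic_audio_stream(frames_per_chunk=4096, sample_rate=16000)
for (chunk,) in reader.stream():
    pass  # each chunk is a Tensor of shape (frames_per_chunk, num_channels)
```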

* Replace BUILD_FFMPEG build arg with USE_FFMPEG

We are not building FFmpeg itself, so `USE_FFMPEG` is more appropriate.
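
For reference, a sketch of how the renamed flag is read at build time, mirroring the `_get_build` helper that appears in the setup-helper diff further below (the exact value parsing here is an assumption):

```python
# Sketch only: how a USE_FFMPEG environment variable can be consumed at build time.
# The helper name _get_build comes from the setup-helper diff in this commit;
# the accepted truthy values below are an assumption.
import os

def _get_build(var_name, default=False):
    val = os.environ.get(var_name)
    if val is None:
        return default
    return val.strip().lower() in ("1", "true", "on", "yes")

_USE_FFMPEG = _get_build("USE_FFMPEG", False)  # previously BUILD_FFMPEG
```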

 ---

After https://github.com/pytorch/audio/issues/2377

Remaining TODOs (to be addressed in separate PRs):
- [ ] Introduce an `is_ffmpeg_binding_available` function (a rough sketch follows this list).
- [ ] Refactor C++ code:
   - Rename `Streamer` to `StreamReader`.
   - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
   - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
   - Introduce `stream_reader` directory.
- [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)
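
As a rough illustration of the first TODO item, a hypothetical `is_ffmpeg_binding_available` could check only whether the `libtorchaudio_ffmpeg` extension loads; the sketch below is not part of this commit and its details are assumptions:

```python
# Hypothetical sketch only; not part of this commit.
def is_ffmpeg_binding_available():
    import torchaudio
    try:
        # The FFmpeg binding ships as the libtorchaudio_ffmpeg extension
        # (see torchaudio/io/__init__.py in this diff).
        torchaudio._extension._load_lib("libtorchaudio_ffmpeg")
        return True
    except OSError:
        return False
```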

Pull Request resolved: https://github.com/pytorch/audio/pull/2378

Reviewed By: carolineechen

Differential Revision: D36359299

Pulled By: mthrok

fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
parent 9499f642
......@@ -254,7 +254,7 @@ jobs:
export FFMPEG_ROOT=${PWD}/third_party/ffmpeg
packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -277,7 +277,7 @@ jobs:
export FFMPEG_ROOT=${PWD}/third_party/ffmpeg
packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: /opt/conda/conda-bld/linux-64
- persist_to_workspace:
......@@ -305,7 +305,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -331,7 +331,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: /Users/distiller/miniconda3/conda-bld/osx-64
- persist_to_workspace:
......@@ -358,7 +358,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
bash packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -391,7 +391,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
bash packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: C:/tools/miniconda3/conda-bld/win-64
- persist_to_workspace:
......@@ -621,7 +621,7 @@ jobs:
command: .circleci/unittest/linux/scripts/install.sh
environment:
BUILD_MAD: true
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/linux/scripts/run_test.sh
......@@ -654,7 +654,7 @@ jobs:
command: docker run -t --gpus all -e PYTHON_VERSION -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/setup_env.sh
- run:
name: Install torchaudio
command: docker run -t --gpus all -e UPLOAD_CHANNEL -e CONDA_CHANNEL_FLAGS -e BUILD_FFMPEG=1 -e BUILD_MAD=1 -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/install.sh
command: docker run -t --gpus all -e UPLOAD_CHANNEL -e CONDA_CHANNEL_FLAGS -e USE_FFMPEG=1 -e BUILD_MAD=1 -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/install.sh
- run:
name: Run tests
environment:
......@@ -681,7 +681,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/windows/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/windows/scripts/run_test.sh
......@@ -727,7 +727,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/windows/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/windows/scripts/run_test.sh
......@@ -766,7 +766,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/linux/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
BUILD_MAD: true
- run:
name: Run tests
......
......@@ -254,7 +254,7 @@ jobs:
export FFMPEG_ROOT=${PWD}/third_party/ffmpeg
packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -277,7 +277,7 @@ jobs:
export FFMPEG_ROOT=${PWD}/third_party/ffmpeg
packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: /opt/conda/conda-bld/linux-64
- persist_to_workspace:
......@@ -305,7 +305,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -331,7 +331,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: /Users/distiller/miniconda3/conda-bld/osx-64
- persist_to_workspace:
......@@ -358,7 +358,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
bash packaging/build_wheel.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: dist
- persist_to_workspace:
......@@ -391,7 +391,7 @@ jobs:
export FFMPEG_ROOT="${PWD}/third_party/ffmpeg"
bash packaging/build_conda.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- store_artifacts:
path: C:/tools/miniconda3/conda-bld/win-64
- persist_to_workspace:
......@@ -621,7 +621,7 @@ jobs:
command: .circleci/unittest/linux/scripts/install.sh
environment:
BUILD_MAD: true
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/linux/scripts/run_test.sh
......@@ -654,7 +654,7 @@ jobs:
command: docker run -t --gpus all -e PYTHON_VERSION -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/setup_env.sh
- run:
name: Install torchaudio
command: docker run -t --gpus all -e UPLOAD_CHANNEL -e CONDA_CHANNEL_FLAGS -e BUILD_FFMPEG=1 -e BUILD_MAD=1 -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/install.sh
command: docker run -t --gpus all -e UPLOAD_CHANNEL -e CONDA_CHANNEL_FLAGS -e USE_FFMPEG=1 -e BUILD_MAD=1 -v $PWD:$PWD -w $PWD "${image_name}" .circleci/unittest/linux/scripts/install.sh
- run:
name: Run tests
environment:
......@@ -681,7 +681,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/windows/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/windows/scripts/run_test.sh
......@@ -727,7 +727,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/windows/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
- run:
name: Run tests
command: .circleci/unittest/windows/scripts/run_test.sh
......@@ -766,7 +766,7 @@ jobs:
name: Install torchaudio
command: .circleci/unittest/linux/scripts/install.sh
environment:
BUILD_FFMPEG: true
USE_FFMPEG: true
BUILD_MAD: true
- run:
name: Run tests
......
......@@ -32,7 +32,7 @@ jobs:
python -m pip install --quiet pytest requests cmake ninja deep-phonemizer sentencepiece
python setup.py install
env:
BUILD_FFMPEG: true
USE_FFMPEG: true
- name: Run integration test
run: |
cd test && pytest integration_tests -v --use-tmp-hub-dir
......@@ -58,11 +58,11 @@ endif()
# Options
option(BUILD_SOX "Build libsox statically" ON)
option(BUILD_MAD "Enable libmad" OFF)
option(BUILD_FFMPEG "Enable ffmpeg-based features" OFF)
option(BUILD_KALDI "Build kaldi statically" ON)
option(BUILD_RNNT "Enable RNN transducer" ON)
option(BUILD_CTC_DECODER "Build Flashlight CTC decoder" ON)
option(BUILD_TORCHAUDIO_PYTHON_EXTENSION "Build Python extension" OFF)
option(USE_FFMPEG "Enable ffmpeg-based features" OFF)
option(USE_CUDA "Enable CUDA support" OFF)
option(USE_ROCM "Enable ROCM support" OFF)
option(USE_OPENMP "Enable OpenMP support" OFF)
......
......@@ -39,6 +39,7 @@ API References
:caption: API Reference
torchaudio
io
backend
functional
transforms
......@@ -58,7 +59,6 @@ Prototype API References
:caption: Prototype API Reference
prototype
prototype.io
prototype.ctc_decoder
prototype.models
prototype.pipelines
......
torchaudio.io
=============
.. currentmodule:: torchaudio.io
StreamReader
------------
.. autoclass:: StreamReader
:members:
StreamReaderSourceStream
------------------------
.. autoclass:: StreamReaderSourceStream
:members:
StreamReaderSourceAudioStream
-----------------------------
.. autoclass:: StreamReaderSourceAudioStream
:members:
StreamReaderSourceVideoStream
-----------------------------
.. autoclass:: StreamReaderSourceVideoStream
:members:
StreamReaderOutputStream
------------------------
.. autoclass:: StreamReaderOutputStream
:members:
torchaudio.prototype.io
=======================
.. currentmodule:: torchaudio.prototype.io
SourceStream
------------
.. autoclass:: SourceStream
:members:
SourceAudioStream
-----------------
.. autoclass:: SourceAudioStream
:members:
SourceVideoStream
-----------------
.. autoclass:: SourceVideoStream
:members:
OutputStream
------------
.. autoclass:: OutputStream
:members:
Streamer
--------
.. autoclass:: Streamer
:members:
......@@ -17,7 +17,6 @@ imported explicitly, e.g.
import torchaudio.prototype.ctc_decoder
.. toctree::
prototype.io
prototype.ctc_decoder
prototype.models
prototype.pipelines
......@@ -10,15 +10,17 @@ on laptop.
.. note::
This tutorial requires prototype Streaming API, ffmpeg>=4.1, and SentencePiece.
This tutorial requires Streaming API, FFmpeg libraries (>=4.1, <5),
and SentencePiece.
Prototype features are not part of binary releases, but available in
nightly build. Please refer to https://pytorch.org for installing
nightly build.
The Streaming API is available in nightly build.
Please refer to https://pytorch.org/get-started/locally
for instructions.
There are multiple ways to install FFmpeg libraries.
If you are using Anaconda Python distribution,
``conda install -c anaconda ffmpeg`` will install
the required libraries.
``conda install 'ffmpeg<5'`` will install
the required FFmpeg libraries.
You can install SentencePiece by running ``pip install sentencepiece``.
......@@ -49,7 +51,7 @@ on laptop.
#
# Firstly, we need to check the devices that Streaming API can access,
# and figure out the arguments (``src`` and ``format``) we need to pass
# to :py:func:`~torchaudio.prototype.io.Streamer` class.
# to :py:func:`~torchaudio.io.StreamReader` class.
#
# We use ``ffmpeg`` command for this. ``ffmpeg`` abstracts away the
# difference of underlying hardware implementations, but the expected
......@@ -76,7 +78,7 @@ on laptop.
#
# .. code::
#
# Streamer(
# StreamReader(
# src = ":1", # no video, audio from device 1, "MacBook Pro Microphone"
# format = "avfoundation",
# )
......@@ -100,7 +102,7 @@ on laptop.
#
# .. code::
#
# Streamer(
# StreamReader(
# src = "audio=@device_cm_{33D9A762-90C8-11D0-BD43-00A0C911CE86}\wave_{BF2B8AE1-10B8-4CA4-A0DC-D02E18A56177}",
# format = "dshow",
# )
......@@ -134,10 +136,10 @@ NUM_ITER = 100
def stream(q, format, src, segment_length, sample_rate):
from torchaudio.prototype.io import Streamer
from torchaudio.io import StreamReader
print("Building Streamer...")
streamer = Streamer(src, format=format)
print("Building StreamReader...")
streamer = StreamReader(src, format=format)
streamer.add_basic_audio_stream(frames_per_chunk=segment_length, sample_rate=sample_rate)
print(streamer.get_src_stream_info(0))
......@@ -170,7 +172,7 @@ def stream(q, format, src, segment_length, sample_rate):
#
# For the detail of ``timeout`` and ``backoff`` parameters, please refer
# to the documentation of
# :py:meth:`~torchaudio.prototype.io.Streamer.stream` method.
# :py:meth:`~torchaudio.io.StreamReader.stream` method.
#
# .. note::
#
......@@ -324,7 +326,7 @@ if __name__ == "__main__":
# Sample rate: 16000
# Main segment: 2560 frames (0.16 seconds)
# Right context: 640 frames (0.04 seconds)
# Building Streamer...
# Building StreamReader...
# SourceAudioStream(media_type='audio', codec='pcm_f32le', codec_long_name='PCM 32-bit floating point little-endian', format='flt', bit_rate=1536000, sample_rate=48000.0, num_channels=1)
# OutputStream(source_index=0, filter_description='aresample=16000,aformat=sample_fmts=fltp')
# Streaming...
......
......@@ -13,22 +13,17 @@ to perform online speech recognition.
#
# .. note::
#
# This tutorial requires torchaudio with prototype features,
# FFmpeg libraries (>=4.1), and SentencePiece.
# This tutorial requires Streaming API, FFmpeg libraries (>=4.1, <5),
# and SentencePiece.
#
# torchaudio prototype features are available on nightly builds.
# The Streaming API is available in nightly builds.
# Please refer to https://pytorch.org/get-started/locally/
# for instructions.
#
# The interfaces of prototype features are unstable and subject to
# change. Please refer to `the nightly build documentation
# <https://pytorch.org/audio/main/>`__ for the up-to-date
# API references.
#
# There are multiple ways to install FFmpeg libraries.
# If you are using Anaconda Python distribution,
# ``conda install -c anaconda ffmpeg`` will install
# the required libraries.
# ``conda install 'ffmpeg<5'`` will install
# the required FFmpeg libraries.
#
# You can install SentencePiece by running ``pip install sentencepiece``.
......@@ -54,7 +49,7 @@ import torch
import torchaudio
try:
from torchaudio.prototype.io import Streamer
from torchaudio.io import StreamReader
except ModuleNotFoundError:
try:
import google.colab
......@@ -123,7 +118,7 @@ print(f"Right context: {context_length} frames ({context_length / sample_rate} s
# 4. Configure the audio stream
# -----------------------------
#
# Next, we configure the input audio stream using :py:func:`~torchaudio.prototype.io.Streamer`.
# Next, we configure the input audio stream using :py:func:`~torchaudio.io.StreamReader`.
#
# For the detail of this API, please refer to the
# `Media Stream API tutorial <./streaming_api_tutorial.html>`__.
......@@ -139,7 +134,7 @@ print(f"Right context: {context_length} frames ({context_length / sample_rate} s
#
src = "https://download.pytorch.org/torchaudio/tutorial-assets/greatpiratestories_00_various.mp3"
streamer = Streamer(src)
streamer = StreamReader(src)
streamer.add_basic_audio_stream(frames_per_chunk=segment_length, sample_rate=bundle.sample_rate)
print(streamer.get_src_stream_info(0))
......
......@@ -12,21 +12,15 @@ libavfilter provides.
#
# .. note::
#
# This tutorial requires torchaudio with prototype features and
# FFmpeg libraries (>=4.1).
# This tutorial requires Streaming API and FFmpeg libraries (>=4.1, <5).
#
# The torchaudio prototype features are available on nightly builds.
# The Streaming API is available in nightly builds.
# Please refer to https://pytorch.org/get-started/locally/
# for instructions.
#
# The interfaces of prototype features are unstable and subject to
# change. Please refer to `the nightly build documentation
# <https://pytorch.org/audio/main/>`__ for the up-to-date
# API references.
#
# There are multiple ways to install FFmpeg libraries.
# If you are using Anaconda Python distribution,
# ``conda install -c anaconda ffmpeg`` will install
# ``conda install -c anaconda 'ffmpeg<5'`` will install
# the required libraries.
#
......@@ -72,7 +66,7 @@ import torch
import torchaudio
try:
from torchaudio.prototype.io import Streamer
from torchaudio.io import StreamReader
except ModuleNotFoundError:
try:
import google.colab
......@@ -120,29 +114,29 @@ VIDEO_URL = f"{base_url}/stream-api/NASAs_Most_Scientifically_Complex_Space_Obse
######################################################################
#
# To open a media file, you can simply pass the path of the file to
# the constructor of `Streamer`.
# the constructor of `StreamReader`.
#
# .. code::
#
# Streamer(src="audio.wav")
# StreamReader(src="audio.wav")
#
# Streamer(src="audio.mp3")
# StreamReader(src="audio.mp3")
#
# This works for image file, video file and video streams.
#
# .. code::
#
# # Still image
# Streamer(src="image.jpeg")
# StreamReader(src="image.jpeg")
#
# # Video file
# Streamer(src="video.mpeg")
# StreamReader(src="video.mpeg")
#
# # Video on remote server
# Streamer(src="https://example.com/video.mp4")
# StreamReader(src="https://example.com/video.mp4")
#
# # Playlist format
# Streamer(src="https://example.com/playlist.m3u")
# StreamReader(src="https://example.com/playlist.m3u")
#
# If attempting to load headerless raw data, you can use ``format`` and
# ``option`` to specify the format of the data.
......@@ -159,7 +153,7 @@ VIDEO_URL = f"{base_url}/stream-api/NASAs_Most_Scientifically_Complex_Space_Obse
#
# .. code::
#
# Streamer(src="raw.s2", format="s16le", option={"sample_rate": "16000"})
# StreamReader(src="raw.s2", format="s16le", option={"sample_rate": "16000"})
#
######################################################################
......@@ -170,30 +164,30 @@ VIDEO_URL = f"{base_url}/stream-api/NASAs_Most_Scientifically_Complex_Space_Obse
# the output streams.
#
# You can check the number of source streams with
# :py:attr:`~torchaudio.prototype.io.Streamer.num_src_streams`.
# :py:attr:`~torchaudio.io.StreamReader.num_src_streams`.
#
# .. note::
# The number of streams is NOT the number of channels.
# Each audio stream can contain an arbitrary number of channels.
#
# To check the metadata of source stream you can use
# :py:meth:`~torchaudio.prototype.io.Streamer.get_src_stream_info`
# :py:meth:`~torchaudio.io.StreamReader.get_src_stream_info`
# method and provide the index of the source stream.
#
# This method returns
# :py:class:`~torchaudio.prototype.io.SourceStream`. If a source
# :py:class:`~torchaudio.io.StreamReader.SourceStream`. If a source
# stream is audio type, then the return type is
# :py:class:`~torchaudio.prototype.io.SourceAudioStream`, which is
# :py:class:`~torchaudio.io.StreamReader.SourceAudioStream`, which is
# a subclass of `SourceStream`, with additional audio-specific attributes.
# Similarly, if a source stream is video type, then the return type is
# :py:class:`~torchaudio.prototype.io.SourceVideoStream`.
# :py:class:`~torchaudio.io.StreamReader.SourceVideoStream`.
######################################################################
# For regular audio formats and still image formats, such as `WAV`
# and `JPEG`, the number of source streams is 1.
#
streamer = Streamer(AUDIO_URL)
streamer = StreamReader(AUDIO_URL)
print("The number of source streams:", streamer.num_src_streams)
print(streamer.get_src_stream_info(0))
......@@ -203,7 +197,7 @@ print(streamer.get_src_stream_info(0))
#
src = "https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8"
streamer = Streamer(src)
streamer = StreamReader(src)
print("The number of source streams:", streamer.num_src_streams)
for i in range(streamer.num_src_streams):
print(streamer.get_src_stream_info(i))
......@@ -228,8 +222,8 @@ for i in range(streamer.num_src_streams):
# FFmpeg implements some heuristics to determine the default stream.
# The resulting stream index is exposed via
#
# :py:attr:`~torchaudio.prototype.io.Streamer.default_audio_stream` and
# :py:attr:`~torchaudio.prototype.io.Streamer.default_video_stream`.
# :py:attr:`~torchaudio.io.StreamReader.default_audio_stream` and
# :py:attr:`~torchaudio.io.StreamReader.default_video_stream`.
#
######################################################################
......@@ -238,8 +232,8 @@ for i in range(streamer.num_src_streams):
#
# Once you know which source stream you want to use, then you can
# configure output streams with
# :py:meth:`~torchaudio.prototype.io.Streamer.add_basic_audio_stream` and
# :py:meth:`~torchaudio.prototype.io.Streamer.add_basic_video_stream`.
# :py:meth:`~torchaudio.io.StreamReader.add_basic_audio_stream` and
# :py:meth:`~torchaudio.io.StreamReader.add_basic_video_stream`.
#
# These methods provide a simple way to change the basic property of
# media to match the application's requirements.
......@@ -253,15 +247,15 @@ for i in range(streamer.num_src_streams):
# For video, it will be
# `(frames_per_chunk, num_channels, height, width)`.
# - ``buffer_chunk_size``: The maximum number of chunks to be buffered internally.
# When the Streamer buffered this number of chunks and is asked to pull
# more frames, Streamer drops the old frames/chunks.
# When the StreamReader buffered this number of chunks and is asked to pull
# more frames, StreamReader drops the old frames/chunks.
# - ``stream_index``: The index of the source stream.
#
# For audio output stream, you can provide the following additional
# parameters to change the audio properties.
#
# - ``sample_rate``: When provided, Streamer resamples the audio on-the-fly.
# - ``dtype``: By default the Streamer returns tensor of `float32` dtype,
# - ``sample_rate``: When provided, StreamReader resamples the audio on-the-fly.
# - ``dtype``: By default the StreamReader returns tensor of `float32` dtype,
# with sample values ranging `[-1, 1]`. By providing ``dtype`` argument
# the resulting dtype and value range is changed.
#
......@@ -277,7 +271,7 @@ for i in range(streamer.num_src_streams):
#
# .. code::
#
# streamer = Streamer(...)
# streamer = StreamReader(...)
#
# # Stream audio from default audio source stream
# # 256 frames at a time, keeping the original sampling rate.
......@@ -324,9 +318,9 @@ for i in range(streamer.num_src_streams):
#
# You can check the resulting output streams in a similar manner as
# checking the source streams.
# :py:attr:`~torchaudio.prototype.io.Streamer.num_out_streams` reports
# :py:attr:`~torchaudio.io.StreamReader.num_out_streams` reports
# the number of configured output streams, and
# :py:meth:`~torchaudio.prototype.io.Streamer.get_out_stream_info`
# :py:meth:`~torchaudio.io.StreamReader.get_out_stream_info`
# fetches the information about the output streams.
#
# .. code::
......@@ -338,7 +332,7 @@ for i in range(streamer.num_src_streams):
######################################################################
#
# If you want to remove an output stream, you can do so with
# :py:meth:`~torchaudio.prototype.io.Streamer.remove_stream` method.
# :py:meth:`~torchaudio.io.StreamReader.remove_stream` method.
#
# .. code::
#
......@@ -355,16 +349,16 @@ for i in range(streamer.num_src_streams):
# audio / video data to client code.
#
# There are low-level methods that performs these operations.
# :py:meth:`~torchaudio.prototype.io.Streamer.is_buffer_ready`,
# :py:meth:`~torchaudio.prototype.io.Streamer.process_packet` and
# :py:meth:`~torchaudio.prototype.io.Streamer.pop_chunks`.
# :py:meth:`~torchaudio.io.StreamReader.is_buffer_ready`,
# :py:meth:`~torchaudio.io.StreamReader.process_packet` and
# :py:meth:`~torchaudio.io.StreamReader.pop_chunks`.
#
# In this tutorial, we will use the high-level API, iterator protocol.
# It is as simple as a ``for`` loop.
#
# .. code::
#
# streamer = Streamer(...)
# streamer = StreamReader(...)
# streamer.add_basic_audio_stream(...)
# streamer.add_basic_video_stream(...)
#
......@@ -404,7 +398,7 @@ for i in range(streamer.num_src_streams):
# Firstly, let's list the available streams and its properties.
#
streamer = Streamer(VIDEO_URL)
streamer = StreamReader(VIDEO_URL)
for i in range(streamer.num_src_streams):
print(streamer.get_src_stream_info(i))
......@@ -582,7 +576,7 @@ plt.show(block=False)
#
# .. code::
#
# >>> Streamer(
# >>> StreamReader(
# ... src="0:0", # The first 0 means `FaceTime HD Camera`, and
# ... # the second 0 indicates `MacBook Pro Microphone`.
# ... format="avfoundation",
......@@ -601,7 +595,7 @@ plt.show(block=False)
#
# .. code::
#
# >>> streamer = Streamer(
# >>> streamer = StreamReader(
# ... src="0:0",
# ... format="avfoundation",
# ... option={"framerate": "30", "pixel_format": "bgr0"},
......@@ -640,7 +634,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src="sine=sample_rate=8000:frequency=360", format="lavfi")
# StreamReader(src="sine=sample_rate=8000:frequency=360", format="lavfi")
#
# .. raw:: html
#
......@@ -661,7 +655,7 @@ plt.show(block=False)
# .. code::
#
# # 5 Hz binaural beats on a 360 Hz carrier
# Streamer(
# StreamReader(
# src=(
# 'aevalsrc='
# 'sample_rate=8000:'
......@@ -687,7 +681,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src="anoisesrc=color=pink:sample_rate=8000:amplitude=0.5", format="lavfi")
# StreamReader(src="anoisesrc=color=pink:sample_rate=8000:amplitude=0.5", format="lavfi")
#
# .. raw:: html
#
......@@ -711,7 +705,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src=f"cellauto", format="lavfi")
# StreamReader(src=f"cellauto", format="lavfi")
#
# .. raw:: html
#
......@@ -727,7 +721,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src=f"mandelbrot", format="lavfi")
# StreamReader(src=f"mandelbrot", format="lavfi")
#
# .. raw:: html
#
......@@ -743,7 +737,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src=f"mptestsrc", format="lavfi")
# StreamReader(src=f"mptestsrc", format="lavfi")
#
# .. raw:: html
#
......@@ -759,7 +753,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src=f"life", format="lavfi")
# StreamReader(src=f"life", format="lavfi")
#
# .. raw:: html
#
......@@ -775,7 +769,7 @@ plt.show(block=False)
#
# .. code::
#
# Streamer(src=f"sierpinski", format="lavfi")
# StreamReader(src=f"sierpinski", format="lavfi")
#
# .. raw:: html
#
......@@ -789,8 +783,8 @@ plt.show(block=False)
# ------------------------
#
# When defining an output stream, you can use
# :py:meth:`~torchaudio.prototype.io.Streamer.add_audio_stream` and
# :py:meth:`~torchaudio.prototype.io.Streamer.add_video_stream` methods.
# :py:meth:`~torchaudio.io.StreamReader.add_audio_stream` and
# :py:meth:`~torchaudio.io.StreamReader.add_video_stream` methods.
#
# These methods take ``filter_desc`` argument, which is a string
# formatted according to ffmpeg's
......@@ -852,7 +846,7 @@ descs = [
sample_rate = 8000
streamer = Streamer(AUDIO_URL)
streamer = StreamReader(AUDIO_URL)
for desc in descs:
streamer.add_audio_stream(
frames_per_chunk=40000,
......@@ -929,7 +923,7 @@ descs = [
######################################################################
#
streamer = Streamer(VIDEO_URL)
streamer = StreamReader(VIDEO_URL)
for desc in descs:
streamer.add_video_stream(
frames_per_chunk=30,
......
......@@ -46,12 +46,13 @@ build:
- BUILD_VERSION
- USE_CUDA
- TORCH_CUDA_ARCH_LIST
- BUILD_FFMPEG
- USE_FFMPEG
- FFMPEG_ROOT
test:
imports:
- torchaudio
- torchaudio.io
- torchaudio.datasets
- torchaudio.kaldi_io
- torchaudio.sox_effects
......
......@@ -117,7 +117,7 @@ def is_ffmpeg_available():
global _IS_FFMPEG_AVAILABLE
if _IS_FFMPEG_AVAILABLE is None:
try:
from torchaudio.prototype.io import Streamer # noqa: F401
from torchaudio.io import StreamReader # noqa: F401
_IS_FFMPEG_AVAILABLE = True
except Exception:
......
......@@ -14,11 +14,11 @@ from torchaudio_unittest.common_utils import (
)
if is_ffmpeg_available():
from torchaudio.prototype.io import (
Streamer,
SourceStream,
SourceVideoStream,
SourceAudioStream,
from torchaudio.io import (
StreamReader,
StreamReaderSourceStream,
StreamReaderSourceVideoStream,
StreamReaderSourceAudioStream,
)
......@@ -27,13 +27,13 @@ def get_video_asset(file="nasa_13013.mp4"):
@skipIfNoFFmpeg
class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
"""Test suite for interface behaviors around Streamer"""
class StreamReaderInterfaceTest(TempDirMixin, TorchaudioTestCase):
"""Test suite for interface behaviors around StreamReader"""
def test_streamer_invalid_input(self):
"""Streamer constructor does not segfault but raise an exception when the input is invalid"""
"""StreamReader constructor does not segfault but raise an exception when the input is invalid"""
with self.assertRaises(RuntimeError):
Streamer("foobar")
StreamReader("foobar")
@nested_params(
[
......@@ -46,20 +46,20 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
[{}, {"sample_rate": "16000"}],
)
def test_streamer_invalide_option(self, invalid_keys, options):
"""When invalid options are given, Streamer raises an exception with these keys"""
"""When invalid options are given, StreamReader raises an exception with these keys"""
options.update({k: k for k in invalid_keys})
src = get_video_asset()
with self.assertRaises(RuntimeError) as ctx:
Streamer(src, option=options)
StreamReader(src, option=options)
assert all(f'"{k}"' in str(ctx.exception) for k in invalid_keys)
def test_src_info(self):
"""`get_src_stream_info` properly fetches information"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
assert s.num_src_streams == 6
expected = [
SourceVideoStream(
StreamReaderSourceVideoStream(
media_type="video",
codec="h264",
codec_long_name="H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
......@@ -69,7 +69,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
height=180,
frame_rate=25.0,
),
SourceAudioStream(
StreamReaderSourceAudioStream(
media_type="audio",
codec="aac",
codec_long_name="AAC (Advanced Audio Coding)",
......@@ -78,14 +78,14 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
sample_rate=8000.0,
num_channels=2,
),
SourceStream(
StreamReaderSourceStream(
media_type="subtitle",
codec="mov_text",
codec_long_name="MOV text",
format=None,
bit_rate=None,
),
SourceVideoStream(
StreamReaderSourceVideoStream(
media_type="video",
codec="h264",
codec_long_name="H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
......@@ -95,7 +95,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
height=270,
frame_rate=29.97002997002997,
),
SourceAudioStream(
StreamReaderSourceAudioStream(
media_type="audio",
codec="aac",
codec_long_name="AAC (Advanced Audio Coding)",
......@@ -104,7 +104,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
sample_rate=16000.0,
num_channels=2,
),
SourceStream(
StreamReaderSourceStream(
media_type="subtitle",
codec="mov_text",
codec_long_name="MOV text",
......@@ -117,30 +117,30 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_src_info_invalid_index(self):
"""`get_src_stream_info` does not segfault but raise an exception when input is invalid"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
for i in [-1, 6, 7, 8]:
with self.assertRaises(IndexError):
s.get_src_stream_info(i)
def test_default_streams(self):
"""default stream is not None"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
assert s.default_audio_stream is not None
assert s.default_video_stream is not None
def test_default_audio_stream_none(self):
"""default audio stream is None for video without audio"""
s = Streamer(get_video_asset("nasa_13013_no_audio.mp4"))
s = StreamReader(get_video_asset("nasa_13013_no_audio.mp4"))
assert s.default_audio_stream is None
def test_default_video_stream_none(self):
"""default video stream is None for video with only audio"""
s = Streamer(get_video_asset("nasa_13013_no_video.mp4"))
s = StreamReader(get_video_asset("nasa_13013_no_video.mp4"))
assert s.default_video_stream is None
def test_num_out_stream(self):
"""num_out_streams gives the correct count of output streams"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
n, m = 6, 4
for i in range(n):
assert s.num_out_streams == i
......@@ -158,7 +158,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_basic_audio_stream(self):
"""`add_basic_audio_stream` constructs a correct filter."""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
s.add_basic_audio_stream(frames_per_chunk=-1, dtype=None)
s.add_basic_audio_stream(frames_per_chunk=-1, sample_rate=8000)
s.add_basic_audio_stream(frames_per_chunk=-1, dtype=torch.int16)
......@@ -177,7 +177,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_basic_video_stream(self):
"""`add_basic_video_stream` constructs a correct filter."""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
s.add_basic_video_stream(frames_per_chunk=-1, format=None)
s.add_basic_video_stream(frames_per_chunk=-1, width=3, height=5)
s.add_basic_video_stream(frames_per_chunk=-1, frame_rate=7)
......@@ -201,7 +201,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_remove_streams(self):
"""`remove_stream` removes the correct output stream"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
s.add_basic_audio_stream(frames_per_chunk=-1, sample_rate=24000)
s.add_basic_video_stream(frames_per_chunk=-1, width=16, height=16)
s.add_basic_audio_stream(frames_per_chunk=-1, sample_rate=8000)
......@@ -221,7 +221,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_remove_stream_invalid(self):
"""Attempt to remove invalid output streams raises IndexError"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
for i in range(-3, 3):
with self.assertRaises(IndexError):
s.remove_stream(i)
......@@ -235,7 +235,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_process_packet(self):
"""`process_packet` method returns 0 while there is a packet in source stream"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
# nasa_1013.mp3 contains 1023 packets.
for _ in range(1023):
code = s.process_packet()
......@@ -246,19 +246,19 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_pop_chunks_no_output_stream(self):
"""`pop_chunks` method returns empty list when there is no output stream"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
assert s.pop_chunks() == []
def test_pop_chunks_empty_buffer(self):
"""`pop_chunks` method returns None when a buffer is empty"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
s.add_basic_audio_stream(frames_per_chunk=-1)
s.add_basic_video_stream(frames_per_chunk=-1)
assert s.pop_chunks() == [None, None]
def test_pop_chunks_exhausted_stream(self):
"""`pop_chunks` method returns None when the source stream is exhausted"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
# video is 16.57 seconds.
# audio streams per 10 second chunk
# video streams per 20 second chunk
......@@ -284,14 +284,14 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_stream_empty(self):
"""`stream` fails when no output stream is configured"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
with self.assertRaises(RuntimeError):
next(s.stream())
def test_stream_smoke_test(self):
"""`stream` streams chunks fine"""
w, h = 256, 198
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
s.add_basic_audio_stream(frames_per_chunk=2000, sample_rate=8000)
s.add_basic_video_stream(frames_per_chunk=15, frame_rate=60, width=w, height=h)
for i, (achunk, vchunk) in enumerate(s.stream()):
......@@ -302,7 +302,7 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_seek(self):
"""Calling `seek` multiple times should not segfault"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
for i in range(10):
s.seek(i)
for _ in range(0):
......@@ -312,13 +312,13 @@ class StreamerInterfaceTest(TempDirMixin, TorchaudioTestCase):
def test_seek_negative(self):
"""Calling `seek` with negative value should raise an exception"""
s = Streamer(get_video_asset())
s = StreamReader(get_video_asset())
with self.assertRaises(ValueError):
s.seek(-1.0)
@skipIfNoFFmpeg
class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
class StreamReaderAudioTest(TempDirMixin, TorchaudioTestCase):
"""Test suite for audio streaming"""
def _get_reference_wav(self, sample_rate, channels_first=False, **kwargs):
......@@ -328,7 +328,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
return path, data
def _test_wav(self, path, original, dtype):
s = Streamer(path)
s = StreamReader(path)
s.add_basic_audio_stream(frames_per_chunk=-1, dtype=dtype)
s.process_all_packets()
(output,) = s.pop_chunks()
......@@ -357,7 +357,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
expected = torch.flip(original, dims=(0,))
s = Streamer(path)
s = StreamReader(path)
s.add_audio_stream(frames_per_chunk=-1, filter_desc="areverse")
s.process_all_packets()
(output,) = s.pop_chunks()
......@@ -372,7 +372,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
path, original = self._get_reference_wav(1, dtype=dtype, num_channels=num_channels, num_frames=30)
for t in range(10, 20):
expected = original[t:, :]
s = Streamer(path)
s = StreamReader(path)
s.add_audio_stream(frames_per_chunk=-1)
s.seek(float(t))
s.process_all_packets()
......@@ -383,7 +383,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
"""Calling `seek` after streaming is started should change the position properly"""
path, original = self._get_reference_wav(1, dtype="int16", num_channels=2, num_frames=30)
s = Streamer(path)
s = StreamReader(path)
s.add_audio_stream(frames_per_chunk=-1)
ts = list(range(20)) + list(range(20, 0, -1)) + list(range(20))
......@@ -409,7 +409,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
8000, dtype="int16", num_channels=num_channels, num_frames=num_frames, channels_first=False
)
s = Streamer(path)
s = StreamReader(path)
s.add_audio_stream(frames_per_chunk=frames_per_chunk, buffer_chunk_size=buffer_chunk_size)
i, outputs = 0, []
for (output,) in s.stream():
......@@ -422,7 +422,7 @@ class StreamerAudioTest(TempDirMixin, TorchaudioTestCase):
@skipIfNoFFmpeg
class StreamerImageTest(TempDirMixin, TorchaudioTestCase):
class StreamReaderImageTest(TempDirMixin, TorchaudioTestCase):
def _get_reference_png(self, width: int, height: int, grayscale: bool):
original = get_image(width, height, grayscale=grayscale)
path = self.get_temp_path("ref.png")
......@@ -430,7 +430,7 @@ class StreamerImageTest(TempDirMixin, TorchaudioTestCase):
return path, original
def _test_png(self, path, original, format=None):
s = Streamer(path)
s = StreamReader(path)
s.add_basic_video_stream(frames_per_chunk=-1, format=format)
s.process_all_packets()
(output,) = s.pop_chunks()
......@@ -456,7 +456,7 @@ class StreamerImageTest(TempDirMixin, TorchaudioTestCase):
path, original = self._get_reference_png(w, h, grayscale=False)
expected = torch.flip(original, dims=(index,))[None, ...]
s = Streamer(path)
s = StreamReader(path)
s.add_video_stream(frames_per_chunk=-1, filter_desc=filter_desc)
s.process_all_packets()
output = s.pop_chunks()[0]
......
......@@ -37,7 +37,7 @@ _BUILD_MAD = _get_build("BUILD_MAD", False)
_BUILD_KALDI = False if platform.system() == "Windows" else _get_build("BUILD_KALDI", True)
_BUILD_RNNT = _get_build("BUILD_RNNT", True)
_BUILD_CTC_DECODER = False if platform.system() == "Windows" else _get_build("BUILD_CTC_DECODER", True)
_BUILD_FFMPEG = _get_build("BUILD_FFMPEG", False)
_USE_FFMPEG = _get_build("USE_FFMPEG", False)
_USE_ROCM = _get_build("USE_ROCM", torch.cuda.is_available() and torch.version.hip is not None)
_USE_CUDA = _get_build("USE_CUDA", torch.cuda.is_available() and torch.version.hip is None)
_USE_OPENMP = _get_build("USE_OPENMP", True) and "ATen parallel backend: OpenMP" in torch.__config__.parallel_info()
......@@ -56,7 +56,7 @@ def get_ext_modules():
Extension(name="torchaudio._torchaudio_decoder", sources=[]),
]
)
if _BUILD_FFMPEG:
if _USE_FFMPEG:
modules.append(Extension(name="torchaudio.lib.libtorchaudio_ffmpeg", sources=[]))
return modules
......@@ -97,7 +97,6 @@ class CMakeBuild(build_ext):
f"-DPython_INCLUDE_DIR={distutils.sysconfig.get_python_inc()}",
f"-DBUILD_SOX:BOOL={'ON' if _BUILD_SOX else 'OFF'}",
f"-DBUILD_MAD:BOOL={'ON' if _BUILD_MAD else 'OFF'}",
f"-DBUILD_FFMPEG:BOOL={'ON' if _BUILD_FFMPEG else 'OFF'}",
f"-DBUILD_KALDI:BOOL={'ON' if _BUILD_KALDI else 'OFF'}",
f"-DBUILD_RNNT:BOOL={'ON' if _BUILD_RNNT else 'OFF'}",
f"-DBUILD_CTC_DECODER:BOOL={'ON' if _BUILD_CTC_DECODER else 'OFF'}",
......@@ -105,6 +104,7 @@ class CMakeBuild(build_ext):
f"-DUSE_ROCM:BOOL={'ON' if _USE_ROCM else 'OFF'}",
f"-DUSE_CUDA:BOOL={'ON' if _USE_CUDA else 'OFF'}",
f"-DUSE_OPENMP:BOOL={'ON' if _USE_OPENMP else 'OFF'}",
f"-DUSE_FFMPEG:BOOL={'ON' if _USE_FFMPEG else 'OFF'}",
]
build_args = ["--target", "install"]
# Pass CUDA architecture to cmake
......
from torchaudio import _extension # noqa: F401
from torchaudio import (
io,
compliance,
datasets,
functional,
......@@ -22,6 +23,7 @@ except ImportError:
pass
__all__ = [
"io",
"compliance",
"datasets",
"functional",
......
......@@ -170,7 +170,7 @@ endif()
################################################################################
# libtorchaudio_ffmpeg
################################################################################
if(BUILD_FFMPEG)
if(USE_FFMPEG)
set(
LIBTORCHAUDIO_FFMPEG_SOURCES
ffmpeg/prototype.cpp
......
_INITIALIZED = False
_LAZILY_IMPORTED = [
"StreamReader",
"StreamReaderSourceStream",
"StreamReaderSourceAudioStream",
"StreamReaderSourceVideoStream",
"StreamReaderOutputStream",
]
def _init_extension():
import torch
import torchaudio
try:
torchaudio._extension._load_lib("libtorchaudio_ffmpeg")
except OSError as err:
raise ImportError(
"Stream API requires FFmpeg libraries (libavformat and such). Please install FFmpeg 4."
) from err
try:
torch.ops.torchaudio.ffmpeg_init()
except RuntimeError as err:
raise RuntimeError(
"Stream API requires FFmpeg binding. Please set USE_FFMPEG=1 when building from source."
) from err
global _INITIALIZED
_INITIALIZED = True
def __getattr__(name: str):
if name in _LAZILY_IMPORTED:
if not _INITIALIZED:
_init_extension()
from . import _stream_reader
item = getattr(_stream_reader, name)
globals()[name] = item
return item
raise AttributeError(f"module {__name__} has no attribute {name}")
def __dir__():
return sorted(__all__ + _LAZILY_IMPORTED)
__all__ = []
......@@ -8,8 +8,8 @@ import torchaudio
@dataclass
class SourceStream:
"""SourceStream()
class StreamReaderSourceStream:
"""StreamReaderSourceStream()
The metadata of a source stream. This class is used when representing streams of
media type other than `audio` or `video`.
......@@ -58,8 +58,8 @@ class SourceStream:
@dataclass
class SourceAudioStream(SourceStream):
"""SourceAudioStream()
class StreamReaderSourceAudioStream(StreamReaderSourceStream):
"""StreamReaderSourceAudioStream()
The metadata of an audio source stream.
......@@ -75,8 +75,8 @@ class SourceAudioStream(SourceStream):
@dataclass
class SourceVideoStream(SourceStream):
"""SourceVideoStream()
class StreamReaderSourceVideoStream(StreamReaderSourceStream):
"""StreamReaderSourceVideoStream()
The metadata of a video source stream.
......@@ -114,7 +114,7 @@ def _parse_si(i):
codec_name = i[_CODEC]
codec_long_name = i[_CODEC_LONG]
if media_type == "audio":
return SourceAudioStream(
return StreamReaderSourceAudioStream(
media_type,
codec_name,
codec_long_name,
......@@ -124,7 +124,7 @@ def _parse_si(i):
i[_NUM_CHANNELS],
)
if media_type == "video":
return SourceVideoStream(
return StreamReaderSourceVideoStream(
media_type,
codec_name,
codec_long_name,
......@@ -134,14 +134,14 @@ def _parse_si(i):
i[_HEIGHT],
i[_FRAME_RATE],
)
return SourceStream(media_type, codec_name, codec_long_name, None, None)
return StreamReaderSourceStream(media_type, codec_name, codec_long_name, None, None)
@dataclass
class OutputStream:
class StreamReaderOutputStream:
"""OutputStream()
Output stream configured on :py:class:`Streamer`.
Output stream configured on :py:class:`StreamReader`.
"""
source_index: int
......@@ -151,10 +151,10 @@ class OutputStream:
def _parse_oi(i):
return OutputStream(i[0], i[1])
return StreamReaderOutputStream(i[0], i[1])
class Streamer:
class StreamReader:
"""Fetch and decode audio/video streams chunk by chunk.
For the detailed usage of this class, please refer to the tutorial.
......@@ -239,7 +239,7 @@ class Streamer:
"""
return self._i_video
def get_src_stream_info(self, i: int) -> torchaudio.prototype.io.SourceStream:
def get_src_stream_info(self, i: int) -> torchaudio.io.StreamReaderSourceStream:
"""Get the metadata of source stream
Args:
......@@ -249,7 +249,7 @@ class Streamer:
"""
return _parse_si(torch.ops.torchaudio.ffmpeg_streamer_get_src_stream_info(self._s, i))
def get_out_stream_info(self, i: int) -> torchaudio.prototype.io.OutputStream:
def get_out_stream_info(self, i: int) -> torchaudio.io.StreamReaderOutputStream:
"""Get the metadata of output stream
Args:
......@@ -278,7 +278,7 @@ class Streamer:
"""Add output audio stream
Args:
frames_per_chunk (int): Number of frames returned by Streamer as a chunk.
frames_per_chunk (int): Number of frames returned by StreamReader as a chunk.
If the source stream is exhausted before enough frames are buffered,
then the chunk is returned as-is.
......@@ -314,7 +314,7 @@ class Streamer:
"""Add output video stream
Args:
frames_per_chunk (int): Number of frames returned by Streamer as a chunk.
frames_per_chunk (int): Number of frames returned by StreamReader as a chunk.
If the source stream is exhausted before enough frames are buffered,
then the chunk is returned as-is.
......@@ -361,7 +361,7 @@ class Streamer:
"""Add output audio stream
Args:
frames_per_chunk (int): Number of frames returned by Streamer as a chunk.
frames_per_chunk (int): Number of frames returned by StreamReader as a chunk.
If the source stream is exhausted before enough frames are buffered,
then the chunk is returned as-is.
......@@ -408,7 +408,7 @@ class Streamer:
"""Add output video stream
Args:
frames_per_chunk (int): Number of frames returned by Streamer as a chunk.
frames_per_chunk (int): Number of frames returned by StreamReader as a chunk.
If the source stream is exhausted before enough frames are buffered,
then the chunk is returned as-is.
......@@ -446,7 +446,7 @@ class Streamer:
Example - HW decoding::
>>> # Decode video with NVDEC, create Tensor on CPU.
>>> streamer = Streamer(src="input.mp4")
>>> streamer = StreamReader(src="input.mp4")
>>> streamer.add_video_stream(10, decoder="h264_cuvid", hw_accel=None)
>>>
>>> chunk, = next(streamer.stream())
......@@ -454,7 +454,7 @@ class Streamer:
... cpu
>>> # Decode video with NVDEC, create Tensor directly on CUDA
>>> streamer = Streamer(src="input.mp4")
>>> streamer = StreamReader(src="input.mp4")
>>> streamer.add_video_stream(10, decoder="h264_cuvid", hw_accel="cuda:1")
>>>
>>> chunk, = next(streamer.stream())
......@@ -462,7 +462,7 @@ class Streamer:
... cuda:1
>>> # Decode and resize video with NVDEC, create Tensor directly on CUDA
>>> streamer = Streamer(src="input.mp4")
>>> streamer = StreamReader(src="input.mp4")
>>> streamer.add_video_stream(
>>> 10, decoder="h264_cuvid",
>>> decoder_options={"resize": "240x360"}, hw_accel="cuda:1")
......@@ -595,10 +595,10 @@ class Streamer:
Arguments:
timeout (float or None, optional): See
:py:func:`~Streamer.process_packet`. (Default: ``None``)
:py:func:`~StreamReader.process_packet`. (Default: ``None``)
backoff (float, optional): See
:py:func:`~Streamer.process_packet`. (Default: ``10.0``)
:py:func:`~StreamReader.process_packet`. (Default: ``10.0``)
Returns:
Iterator[Tuple[Optional[torch.Tensor], ...]]:
......
_INITIALIZED = False
_LAZILY_IMPORTED = [
"Streamer",
"SourceStream",
"SourceAudioStream",
"SourceVideoStream",
"OutputStream",
]
def _init_extension():
import torch
import torchaudio
try:
torchaudio._extension._load_lib("libtorchaudio_ffmpeg")
except OSError as err:
raise ImportError(
"Stream API requires FFmpeg libraries (libavformat and such). Please install FFmpeg 4."
) from err
try:
torch.ops.torchaudio.ffmpeg_init()
except RuntimeError as err:
raise RuntimeError(
"Stream API requires FFmpeg binding. Please set BUILD_FFMPEG=1 when building from source."
) from err
global _INITIALIZED
_INITIALIZED = True
def __getattr__(name: str):
if name in _LAZILY_IMPORTED:
if not _INITIALIZED:
_init_extension()
if name == "Streamer":
import warnings
from torchaudio.io import StreamReader
from . import streamer
warnings.warn(
f"{__name__}.{name} has been moved to torchaudio.io.StreamReader. Please use torchaudio.io.StreamReader",
DeprecationWarning,
)
item = getattr(streamer, name)
globals()[name] = item
return item
global Streamer
Streamer = StreamReader
return Streamer
raise AttributeError(f"module {__name__} has no attribute {name}")
def __dir__():
return sorted(__all__ + _LAZILY_IMPORTED)
__all__ = []
return ["Streamer"]