Support multiple FFmpeg versions (#3464)

Summary: This commit introduces support for multiple FFmpeg versions for OSS binary distributions.

Currently torchaudio only works with FFmpeg 4. This is inconvenient at every stage, from installation to runtime linking. This commit allows torchaudio to pick FFmpeg 4, 5 or 6 at runtime, instead of only looking for v4.

The way it works is that we compile the FFmpeg extension three times, each against a different FFmpeg version, and ship all three. At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension. The order of preference is 6, 5, then 4.

To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build. These binaries are LGPL-licensed and are downloaded from S3 at build time, rather than being rebuilt for every build.

Because relying on pre-built binaries as scaffolding limits the systems that can build torchaudio, this commit also introduces a single FFmpeg version support mode. Setting FFMPEG_ROOT during the build changes the way the binaries are built so that they support only one specific version of FFmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/3464
Differential Revision: D47300223
Pulled By: mthrok
fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
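
The runtime selection described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `avutil_filename`, `pick_ffmpeg_version` and the `can_load` callback are hypothetical names, and the callback stands in for the real check that tries to open libavutil. The version numbers and file name patterns are the ones that appear in the diff.

```python
from typing import Callable, Optional

# FFmpeg major version -> libavutil major version, as listed in this commit.
AVUTIL_MAJOR = {"6": 58, "5": 57, "4": 56}

# Per-platform file name patterns for libavutil, as listed in this commit.
LIBNAME_TEMPLATE = {
    "Linux": "libavutil.so.{ver}",
    "Darwin": "libavutil.{ver}.dylib",
    "Windows": "avutil-{ver}.dll",
}


def avutil_filename(ffmpeg_ver: str, system: str) -> str:
    """Return the libavutil file name to look for (hypothetical helper)."""
    return LIBNAME_TEMPLATE[system].format(ver=AVUTIL_MAJOR[ffmpeg_ver])


def pick_ffmpeg_version(
    can_load: Callable[[str], bool], system: str = "Linux"
) -> Optional[str]:
    """Return the first FFmpeg version, in the order 6, 5, 4, whose libavutil loads.

    `can_load` is an assumed stand-in for the real probe; it receives a
    library file name and reports whether that library could be opened.
    """
    for ffmpeg_ver in ("6", "5", "4"):
        if can_load(avutil_filename(ffmpeg_ver, system)):
            return ffmpeg_ver
    return None


if __name__ == "__main__":
    # Pretend only FFmpeg 5 is installed on a hypothetical Linux machine.
    installed = {"libavutil.so.57"}
    print(pick_ffmpeg_version(lambda name: name in installed))
```

Passing the probe in as a callback keeps the sketch independent of any real shared library, which makes the preference order easy to check in isolation.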

Commit 786066b4 authored Jul 11, 2023 by moto Committed by Facebook GitHub Bot Jul 11, 2023
20 changed files
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -171,7 +171,12 @@ if (BUILD_SOX)
  add_subdirectory(torchaudio/csrc/sox)
 endif()
 if (USE_FFMPEG)
-  add_subdirectory(third_party/ffmpeg)
+  if (DEFINED ENV{FFMPEG_ROOT})
+    add_subdirectory(third_party/ffmpeg/single)
+  else()
+    message(STATUS "Building FFmpeg integration with multi version support")
+    add_subdirectory(third_party/ffmpeg/multi)
+  endif()
  add_subdirectory(torchaudio/csrc/ffmpeg)
 endif()
 if (BUILD_CUDA_CTC_DECODER)

--- a/docs/source/build.jetson.rst
+++ b/docs/source/build.jetson.rst
@@ -137,7 +137,7 @@ Verify the installation by checking the version and CUDA device accessibility.

   git clone https://github.com/pytorch/audio
   cd audio
-   USE_CUDA=1 USE_FFMPEG=1 pip install -v -e . --no-use-pep517
+   USE_CUDA=1 pip install -v -e . --no-use-pep517

 4. Check the installation
 ~~~~~~~~~~~~~~~~~~~~~~~~~

--- a/docs/source/build.linux.rst
+++ b/docs/source/build.linux.rst
@@ -24,14 +24,7 @@ Here, we install nightly build.

   conda install cmake ninja pkg-config

-4. Install external dependencies
--------------------------------
-
-.. code-block::
-
-   conda install -c conda-forge ffmpeg
-
-5. Clone the torchaudio repository
+4. Clone the torchaudio repository
 ----------------------------------

 .. code-block::
@@ -39,15 +32,29 @@ Here, we install nightly build.
   git clone https://github.com/pytorch/audio
   cd audio

-6. Build
+5. Build
 --------

 .. code-block::

-   USE_FFMPEG=1 python setup.py develop
+   python setup.py develop

 .. note::
   Due to the complexity of build process, TorchAudio only supports in-place build.
   To use ``pip``, please use ``--no-use-pep517`` option.

-   ``USE_FFMPEG=1 pip install -v -e . --no-use-pep517``
+   ``pip install -v -e . --no-use-pep517``
+
+[Optional] Build TorchAudio with a custom built FFmpeg
+------------------------------------------------------
+
+By default, torchaudio tries to build the FFmpeg extension with support for multiple FFmpeg versions. This process uses pre-built FFmpeg libraries compiled for specific CPU architectures such as ``x86_64`` and ``aarch64`` (``arm64``).
+
+If your CPU is not one of those, then the build process can fail. To work around this, either disable the FFmpeg integration (by setting the environment variable ``USE_FFMPEG=0``) or switch to the single-version FFmpeg extension.
+
+To build the single-version FFmpeg extension, the user must provide the FFmpeg binaries and make them available in the build environment. To do so, install FFmpeg and set the ``FFMPEG_ROOT`` environment variable to the location of the FFmpeg installation.
+
+.. code-block::
+
+   conda install -c conda-forge ffmpeg
+   FFMPEG_ROOT=${CONDA_PREFIX} python setup.py develop
--- a/docs/source/build.windows.rst
+++ b/docs/source/build.windows.rst
@@ -118,10 +118,6 @@ When using conda, the directories are ``${CONDA_PREFIX}/bin``, ``${CONDA_PREFIX}

   conda install cmake ninja

-.. code-block::
-
-   conda install -c conda-forge ffmpeg
-
 6. Build TorchAudio
 -------------------

@@ -136,19 +132,33 @@ Now that we have everything ready, we can build TorchAudio.
 .. code-block::

   # In Command Prompt
-   set USE_FFMPEG=1
   python setup.py develop

 .. code-block::

   # In Bash
-   USE_FFMPEG=1 python setup.py develop
+   python setup.py develop

 .. note::
   Due to the complexity of build process, TorchAudio only supports in-place build.
   To use ``pip``, please use ``--no-use-pep517`` option.

-   ``USE_FFMPEG=1 pip install -v -e . --no-use-pep517``
+   ``pip install -v -e . --no-use-pep517``
+
+[Optional] Build TorchAudio with a custom FFmpeg
+------------------------------------------------
+
+By default, torchaudio tries to build the FFmpeg extension with support for multiple FFmpeg versions. This process uses pre-built FFmpeg libraries compiled for specific CPU architectures such as ``x86_64``.
+
+If your CPU is different, then the build process can fail. To work around this, either disable the FFmpeg integration (by setting the environment variable ``USE_FFMPEG=0``) or switch to the single-version FFmpeg extension.
+
+To build the single-version FFmpeg extension, the user must provide the FFmpeg binaries and make them available in the build environment. To do so, install FFmpeg and set the ``FFMPEG_ROOT`` environment variable to the location of the FFmpeg installation.
+
+.. code-block::
+
+   conda install -c conda-forge ffmpeg
+   FFMPEG_ROOT=${CONDA_PREFIX}/Library python setup.py develop
+
   
 [Optional] Building FFmpeg from source
 --------------------------------------
@@ -170,6 +180,10 @@ FFmpeg's official documentation touches this https://trac.ffmpeg.org/wiki/Compil

 Please follow the instruction at https://www.msys2.org/ to install MSYS2.

+.. note::
+
+   In CI environments, `Chocolatey <https://chocolatey.org/>`_ can often be used to install MSYS2.
+
 2. Launch MSYS2
 ~~~~~~~~~~~~~~~

@@ -229,22 +243,3 @@ If the build succeeds, ``ffmpeg.exe`` should be found in the same directory. Mak
 Check that the resulting FFmpeg binary is accessible from Conda env

 Now launch a new command prompt and enable the TorchAudio development environment. Make sure that you can run the ``ffmpeg.exe`` command generated in the previous step.
-
-6. Build TorchAudio with the custom FFmpeg
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-To use this FFmpeg libraries for building torchaudio, do the following;
-1. Uninstall ``ffmpeg`` package installed by conda. ``conda uninstall ffmpeg``.
-2. When building set ``FFMPEG_ROOT`` environment variable to the directory where the libraries like ``libavcodec`` are found.
-
-.. code-block::
-
-   # In Command Prompt
-   set USE_FFMPEG=1
-   set FFMPEG_ROOT=<FFMPEG_BUILD_DIR>
-   python setup.py clean develop
-
-.. code-block::
-
-   # In Bash
-   USE_FFMPEG=1 FFMPEG_ROOT=<FFMPEG_BUILD_DIR> python setup.py clean develop
--- a/docs/source/installation.rst
+++ b/docs/source/installation.rst
@@ -16,7 +16,7 @@ Please refer to https://pytorch.org/get-started/locally/ for the details.
   each of which requires a corresponding PyTorch distribution.

 .. note::
-   This software was compiled against an unmodified copy of FFmpeg (licensed under `the LGPLv2.1 <https://github.com/FFmpeg/FFmpeg/blob/a5d2008e2a2360d351798e9abe883d603e231442/COPYING.LGPLv2.1>`_), with the specific rpath removed so as to enable the use of system libraries. The LGPL source can be downloaded `here <https://github.com/FFmpeg/FFmpeg/releases/tag/n4.1.8>`_.
+   This software was compiled against unmodified copies of FFmpeg, with the specific rpath removed so as to enable the use of system libraries. The LGPL source can be downloaded from the following locations: `n4.1.8 <https://github.com/FFmpeg/FFmpeg/releases/tag/n4.1.8>`__ (`license <https://github.com/FFmpeg/FFmpeg/blob/n4.1.8/COPYING.LGPLv2.1>`__), `n5.0.3 <https://github.com/FFmpeg/FFmpeg/releases/tag/n5.0.3>`__ (`license <https://github.com/FFmpeg/FFmpeg/blob/n5.0.3/COPYING.LGPLv2.1>`__) and `n6.0 <https://github.com/FFmpeg/FFmpeg/releases/tag/n6.0>`__ (`license <https://github.com/FFmpeg/FFmpeg/blob/n6.0/COPYING.LGPLv2.1>`__).

 Dependencies
 ------------

--- a/test/torchaudio_unittest/common_utils/case_utils.py
+++ b/test/torchaudio_unittest/common_utils/case_utils.py
@@ -112,7 +112,7 @@ class TorchaudioTestCase(TestBaseMixin, PytorchTestCase):


 def is_ffmpeg_available():
-    return torchaudio._extension._FFMPEG_INITIALIZED
+    return torchaudio._extension._FFMPEG_EXT is not None


 _IS_CTC_DECODER_AVAILABLE = None

--- a/third_party/ffmpeg/CMakeLists.txt
+++ b/third_party/ffmpeg/CMakeLists.txt
@@ -2,7 +2,8 @@
 # This file defines the following FFmpeg libraries using pre-built binaries.

 add_library(ffmpeg4 INTERFACE)
-add_library(ffmpeg ALIAS ffmpeg4)
+add_library(ffmpeg5 INTERFACE)
+add_library(ffmpeg6 INTERFACE)

 ################################################################################

@@ -17,16 +18,41 @@ if (APPLE)
      URL ${base_url}/2023-07-06/macos_arm64/4.1.8.tar.gz
      URL_HASH SHA256=a44b8152b7f204ce5050fc7f6fd2bbbafe7ae4e45f03e135f3b45dd9a08f404e
      )
+    FetchContent_Declare(
+      f5
+      URL ${base_url}/2023-07-06/macos_arm64/5.0.3.tar.gz
+      URL_HASH SHA256=316fe8378afadcf63089acf3ad53a626fd3c26cc558b96ce1dc94d2a78f4deb4
+      )
+    FetchContent_Declare(
+      f6
+      URL ${base_url}/2023-07-06/macos_arm64/6.0.tar.gz
+      URL_HASH SHA256=5d1da9626f8cb817d6c558a2c61085a3d39a8d9f725a6f69f4658bea8efa9389
+      )
  elseif ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64")
    FetchContent_Declare(
      f4
      URL ${base_url}/2023-07-06/macos_x86_64/4.1.8.tar.gz
      URL_HASH SHA256=392d5af0b24535bfc69d6244e7595e5f07117b93d94505d0a4b34c82ae479f48
      )
+    FetchContent_Declare(
+      f5
+      URL ${base_url}/2023-07-06/macos_x86_64/5.0.3.tar.gz
+      URL_HASH SHA256=d0b49575d3b174cfcca53b3049641855e48028cf22dd32f3334bbec4ca94f43e
+      )
+    FetchContent_Declare(
+      f6
+      URL ${base_url}/2023-07-06/macos_x86_64/6.0.tar.gz
+      URL_HASH SHA256=eabc01eb7d9e714e484d5e1b27bf7d921e87c1f3c00334abd1729e158d6db862
+      )
  else ()
    message(
      FATAL_ERROR
-      "CPU architecture ${CMAKE_SYSTEM_PROCESSOR} is not currently supported. If you do not need FFmpeg integration, then setting USE_FFMPEG=0 will bypass the issue.")
+      "${CMAKE_SYSTEM_PROCESSOR} is not supported for FFmpeg multi-version integration. "
+      "If you have FFmpeg libraries installed in the system,"
+      " setting FFMPEG_ROOT environment variable to the root directory of FFmpeg installation"
+      " (the directory where `include` and `lib` sub directories with corresponding headers"
+      " and library files are present) will invoke the FFmpeg single-version integration. "
+      "If you do not need the FFmpeg integration, setting USE_FFMPEG=0 will bypass the issue.")
  endif()
 elseif (UNIX)
  if ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "aarch64")
@@ -35,17 +61,42 @@ elseif (UNIX)
      URL ${base_url}/2023-07-06/linux_aarch64/4.1.8.tar.gz
      URL_HASH SHA256=aae0b713040e30ceebe0d0bc82353d3d9054055c7af8a4f4abc1766015ab7681
      )
+    FetchContent_Declare(
+      f5
+      URL ${base_url}/2023-07-06/linux_aarch64/5.0.3.tar.gz
+      URL_HASH SHA256=65c663206982ee3f0ff88436d8869d191b46061e01e753518c77ecc13ea0236d
+      )
+    FetchContent_Declare(
+      f6
+      URL ${base_url}/2023-07-06/linux_aarch64/6.0.tar.gz
+      URL_HASH SHA256=ec762fd41ea7b8d9ad4f810f53fd78a565f2bc6f680afe56d555c80f3d35adef
+      )
  elseif ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64")
    FetchContent_Declare(
      f4
      URL ${base_url}/2023-07-06/linux_x86_64/4.1.8.tar.gz
      URL_HASH SHA256=52e53b8857739bdd54f9d8541e22569b57f6c6f16504ee83963c2ed3e7061a23
      )
+    FetchContent_Declare(
+      f5
+      URL ${base_url}/2023-07-06/linux_x86_64/5.0.3.tar.gz
+      URL_HASH SHA256=de3c75c99b9ce33de7efdc144566804ae5880457ce71e185b3f592dc452edce7
+      )
+    FetchContent_Declare(
+      f6
+      URL ${base_url}/2023-07-06/linux_x86_64/6.0.tar.gz
+      URL_HASH SHA256=04d3916404bab5efadd29f68361b7d13ea71e6242c6473edcb747a41a9fb97a6
+      )
  else ()
    # Possible case ppc64le (though it's not officially supported.)
    message(
      FATAL_ERROR
-      "CPU architecture ${CMAKE_SYSTEM_PROCESSOR} is not currently supported. If you do not need FFmpeg integration, then setting USE_FFMPEG=0 will bypass the issue.")
+      "${CMAKE_SYSTEM_PROCESSOR} is not supported for FFmpeg multi-version integration. "
+      "If you have FFmpeg libraries installed in the system,"
+      " setting FFMPEG_ROOT environment variable to the root directory of FFmpeg installation"
+      " (the directory where `include` and `lib` sub directories with corresponding headers"
+      " and library files are present) will invoke the FFmpeg single-version integration. "
+      "If you do not need the FFmpeg integration, setting USE_FFMPEG=0 will bypass the issue.")
  endif()
 elseif(MSVC)
  FetchContent_Declare(
@@ -53,10 +104,22 @@ elseif(MSVC)
    URL ${base_url}/2023-07-06/windows/4.1.8.tar.gz
    URL_HASH SHA256=c45cd36e0575490f97ace07365bb67c5e1cbe9f3e6a4272d035c19348df96790
    )
+  FetchContent_Declare(
+    f5
+    URL ${base_url}/2023-07-06/windows/5.0.3.tar.gz
+    URL_HASH SHA256=e2daa10799909e366cb1b4b91a217d35f6749290dcfeea40ecae3d5b05a46cb3
+    )
+  FetchContent_Declare(
+    f6
+    URL ${base_url}/2023-07-06/windows/6.0.tar.gz
+    URL_HASH SHA256=098347eca8cddb5aaa61e9ecc1a00548c645fc59b4f7346b3d91414aa00a9cf6
+    )
 endif()

-FetchContent_MakeAvailable(f4)
+FetchContent_MakeAvailable(f4 f5 f6)
 target_include_directories(ffmpeg4 INTERFACE ${f4_SOURCE_DIR}/include)
+target_include_directories(ffmpeg5 INTERFACE ${f5_SOURCE_DIR}/include)
+target_include_directories(ffmpeg6 INTERFACE ${f6_SOURCE_DIR}/include)

 if(APPLE)
  target_link_libraries(
@@ -68,6 +131,24 @@ if(APPLE)
    ${f4_SOURCE_DIR}/lib/libavdevice.58.dylib
    ${f4_SOURCE_DIR}/lib/libavfilter.7.dylib
    )
+  target_link_libraries(
+    ffmpeg5
+    INTERFACE
+    ${f5_SOURCE_DIR}/lib/libavutil.57.dylib
+    ${f5_SOURCE_DIR}/lib/libavcodec.59.dylib
+    ${f5_SOURCE_DIR}/lib/libavformat.59.dylib
+    ${f5_SOURCE_DIR}/lib/libavdevice.59.dylib
+    ${f5_SOURCE_DIR}/lib/libavfilter.8.dylib
+    )
+  target_link_libraries(
+    ffmpeg6
+    INTERFACE
+    ${f6_SOURCE_DIR}/lib/libavutil.58.dylib
+    ${f6_SOURCE_DIR}/lib/libavcodec.60.dylib
+    ${f6_SOURCE_DIR}/lib/libavformat.60.dylib
+    ${f6_SOURCE_DIR}/lib/libavdevice.60.dylib
+    ${f6_SOURCE_DIR}/lib/libavfilter.9.dylib
+    )
 elseif (UNIX)
  target_link_libraries(
    ffmpeg4
@@ -78,6 +159,24 @@ elseif (UNIX)
    ${f4_SOURCE_DIR}/lib/libavdevice.so.58
    ${f4_SOURCE_DIR}/lib/libavfilter.so.7
    )
+  target_link_libraries(
+    ffmpeg5
+    INTERFACE
+    ${f5_SOURCE_DIR}/lib/libavutil.so.57
+    ${f5_SOURCE_DIR}/lib/libavcodec.so.59
+    ${f5_SOURCE_DIR}/lib/libavformat.so.59
+    ${f5_SOURCE_DIR}/lib/libavdevice.so.59
+    ${f5_SOURCE_DIR}/lib/libavfilter.so.8
+    )
+  target_link_libraries(
+    ffmpeg6
+    INTERFACE
+    ${f6_SOURCE_DIR}/lib/libavutil.so.58
+    ${f6_SOURCE_DIR}/lib/libavcodec.so.60
+    ${f6_SOURCE_DIR}/lib/libavformat.so.60
+    ${f6_SOURCE_DIR}/lib/libavdevice.so.60
+    ${f6_SOURCE_DIR}/lib/libavfilter.so.9
+    )
 elseif(MSVC)
  target_link_libraries(
    ffmpeg4
@@ -88,4 +187,22 @@ elseif(MSVC)
    ${f4_SOURCE_DIR}/bin/avdevice.lib
    ${f4_SOURCE_DIR}/bin/avfilter.lib
    )
+  target_link_libraries(
+    ffmpeg5
+    INTERFACE
+    ${f5_SOURCE_DIR}/bin/avutil.lib
+    ${f5_SOURCE_DIR}/bin/avcodec.lib
+    ${f5_SOURCE_DIR}/bin/avformat.lib
+    ${f5_SOURCE_DIR}/bin/avdevice.lib
+    ${f5_SOURCE_DIR}/bin/avfilter.lib
+    )
+  target_link_libraries(
+    ffmpeg6
+    INTERFACE
+    ${f6_SOURCE_DIR}/bin/avutil.lib
+    ${f6_SOURCE_DIR}/bin/avcodec.lib
+    ${f6_SOURCE_DIR}/bin/avformat.lib
+    ${f6_SOURCE_DIR}/bin/avdevice.lib
+    ${f6_SOURCE_DIR}/bin/avfilter.lib
+    )
 endif()
--- a/third_party/ffmpeg/single/CMakeLists.txt
+++ b/third_party/ffmpeg/single/CMakeLists.txt
+# CMake file for searching existing FFmpeg installation and defining ffmpeg TARGET
+
+message(STATUS "Searching existing FFmpeg installation")
+message(STATUS FFMPEG_ROOT=$ENV{FFMPEG_ROOT})
+if (NOT DEFINED ENV{FFMPEG_ROOT})
+  message(FATAL_ERROR "Environment variable FFMPEG_ROOT is not set.")
+endif()
+
+set(_root $ENV{FFMPEG_ROOT})
+set(lib_dirs "${_root}/lib" "${_root}/bin")
+set(include_dir "${_root}/include")
+
+add_library(ffmpeg INTERFACE)
+target_include_directories(ffmpeg INTERFACE "${include_dir}")
+
+function (_find_ffmpeg_lib component)
+  find_path("${component}_header"
+    NAMES "lib${component}/${component}.h"
+    PATHS "${include_dir}"
+    DOC "The include directory for ${component}"
+    REQUIRED
+    NO_DEFAULT_PATH)
+  find_library("lib${component}"
+    NAMES "${component}"
+    PATHS ${lib_dirs}
+    DOC "${component} library"
+    REQUIRED
+    NO_DEFAULT_PATH)
+  message(STATUS "Found ${component}: ${lib${component}}")
+  target_link_libraries(
+    ffmpeg
+    INTERFACE
+    ${lib${component}})
+endfunction ()
+
+_find_ffmpeg_lib(avutil)
+_find_ffmpeg_lib(avcodec)
+_find_ffmpeg_lib(avformat)
+_find_ffmpeg_lib(avdevice)
+_find_ffmpeg_lib(avfilter)
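
Before pointing FFMPEG_ROOT at an installation, it can help to check that the directory has the layout this CMake file searches for. The following is a rough pre-flight sketch under stated assumptions: `missing_ffmpeg_headers` is a hypothetical helper, not part of torchaudio, and it only mirrors the header lookup (`include/lib<component>/<component>.h`). It does not check the library files, whose names vary by platform.

```python
import tempfile
from pathlib import Path
from typing import List

# The five components looked up by the single-version CMake file.
COMPONENTS = ("avutil", "avcodec", "avformat", "avdevice", "avfilter")


def missing_ffmpeg_headers(ffmpeg_root: str) -> List[str]:
    """Return the components whose main header is absent under <root>/include."""
    include_dir = Path(ffmpeg_root) / "include"
    return [
        component
        for component in COMPONENTS
        if not (include_dir / f"lib{component}" / f"{component}.h").is_file()
    ]


if __name__ == "__main__":
    # Build a throwaway directory that contains only the libavutil header.
    with tempfile.TemporaryDirectory() as root:
        header = Path(root) / "include" / "libavutil" / "avutil.h"
        header.parent.mkdir(parents=True)
        header.touch()
        print("Missing headers:", missing_ffmpeg_headers(root))
```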
--- a/tools/setup_helpers/extension.py
+++ b/tools/setup_helpers/extension.py
@@ -65,12 +65,25 @@ def get_ext_modules():
            ]
        )
    if _USE_FFMPEG:
+        if "FFMPEG_ROOT" in os.environ:
+            # single version ffmpeg mode
            modules.extend(
                [
                    Extension(name="torchaudio.lib.libtorchaudio_ffmpeg", sources=[]),
                    Extension(name="torchaudio.lib._torchaudio_ffmpeg", sources=[]),
                ]
            )
+        else:
+            modules.extend(
+                [
+                    Extension(name="torchaudio.lib.libtorchaudio_ffmpeg4", sources=[]),
+                    Extension(name="torchaudio.lib._torchaudio_ffmpeg4", sources=[]),
+                    Extension(name="torchaudio.lib.libtorchaudio_ffmpeg5", sources=[]),
+                    Extension(name="torchaudio.lib._torchaudio_ffmpeg5", sources=[]),
+                    Extension(name="torchaudio.lib.libtorchaudio_ffmpeg6", sources=[]),
+                    Extension(name="torchaudio.lib._torchaudio_ffmpeg6", sources=[]),
+                ]
+            )
    return modules
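
The branching above decides which extension modules get declared. As a compact restatement, and only as a sketch, the same naming rule can be written as a function over an environment mapping. `ffmpeg_extension_names` is a hypothetical name used for illustration; the module names themselves are taken from the diff.

```python
from typing import List, Mapping


def ffmpeg_extension_names(environ: Mapping[str, str]) -> List[str]:
    """List the FFmpeg extension module names for the given build environment.

    With FFMPEG_ROOT set, a single unsuffixed pair is built (single-version
    mode). Otherwise one pair is built per supported FFmpeg version.
    """
    suffixes = [""] if "FFMPEG_ROOT" in environ else ["4", "5", "6"]
    names = []
    for suffix in suffixes:
        names.append(f"torchaudio.lib.libtorchaudio_ffmpeg{suffix}")
        names.append(f"torchaudio.lib._torchaudio_ffmpeg{suffix}")
    return names


if __name__ == "__main__":
    # "/opt/ffmpeg" is a made-up path used only for the demonstration.
    print(ffmpeg_extension_names({"FFMPEG_ROOT": "/opt/ffmpeg"}))
    print(ffmpeg_extension_names({}))
```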



--- a/torchaudio/_backend/utils.py
+++ b/torchaudio/_backend/utils.py
@@ -6,10 +6,10 @@ from typing import BinaryIO, Dict, Optional, Tuple, Union

 import torch
 import torchaudio.backend.soundfile_backend as soundfile_backend
-from torchaudio._extension import _FFMPEG_INITIALIZED, _SOX_INITIALIZED
+from torchaudio._extension import _FFMPEG_EXT, _SOX_INITIALIZED
 from torchaudio.backend.common import AudioMetaData

-if _FFMPEG_INITIALIZED:
+if _FFMPEG_EXT is not None:
    from torchaudio.io._compat import info_audio, info_audio_fileobj, load_audio, load_audio_fileobj, save_audio


@@ -262,7 +262,7 @@ class SoundfileBackend(Backend):
 @lru_cache(None)
 def get_available_backends() -> Dict[str, Backend]:
    backend_specs = {}
-    if _FFMPEG_INITIALIZED:
+    if _FFMPEG_EXT is not None:
        backend_specs["ffmpeg"] = FFmpegBackend
    if _SOX_INITIALIZED:
        backend_specs["sox"] = SoXBackend

--- a/torchaudio/_extension/__init__.py
+++ b/torchaudio/_extension/__init__.py
@@ -20,7 +20,7 @@ __all__ = [
    "_IS_TORCHAUDIO_EXT_AVAILABLE",
    "_IS_RIR_AVAILABLE",
    "_SOX_INITIALIZED",
-    "_FFMPEG_INITIALIZED",
+    "_FFMPEG_EXT",
 ]


@@ -59,11 +59,10 @@ if is_module_available("torchaudio.lib._torchaudio_sox"):


 # Initialize FFmpeg-related features
-_FFMPEG_INITIALIZED = False
-if is_module_available("torchaudio.lib._torchaudio_ffmpeg"):
+_FFMPEG_EXT = None
+if _IS_TORCHAUDIO_EXT_AVAILABLE:
    try:
-        _init_ffmpeg()
-        _FFMPEG_INITIALIZED = True
+        _FFMPEG_EXT = _init_ffmpeg()
    except Exception:
        # The initialization of FFmpeg extension will fail if supported FFmpeg
        # libraries are not found in the system.
@@ -81,7 +80,7 @@ fail_if_no_sox = (
    )
 )

-fail_if_no_ffmpeg = no_op if _FFMPEG_INITIALIZED else _fail_since_no_ffmpeg
+fail_if_no_ffmpeg = _fail_since_no_ffmpeg if _FFMPEG_EXT is None else no_op

 fail_if_no_rir = (
    no_op

--- a/torchaudio/_extension/utils.py
+++ b/torchaudio/_extension/utils.py
@@ -6,15 +6,18 @@ Anything that depends on external state should happen in __init__.py
 """


+import importlib
+import logging
 import os
+import platform
+import warnings
 from functools import wraps
 from pathlib import Path

 import torch
-
 import torchaudio
-from torchaudio._internal.module_utils import is_module_available

+_LG = logging.getLogger(__name__)
 _LIB_DIR = Path(__file__).parent.parent / "lib"


@@ -75,22 +78,92 @@ def _init_sox():
    atexit.register(torch.ops.torchaudio.sox_effects_shutdown_sox_effects)


-def _init_ffmpeg():
-    if not is_module_available("torchaudio.lib._torchaudio_ffmpeg"):
-        raise RuntimeError(
-            "torchaudio is not compiled with FFmpeg integration. Please set USE_FFMPEG=1 when compiling torchaudio."
-        )
+def _try_access_avutil(ffmpeg_ver):
+    libname_template = {
+        "Linux": "libavutil.so.{ver}",
+        "Darwin": "libavutil.{ver}.dylib",
+        "Windows": "avutil-{ver}.dll",
+    }[platform.system()]
+    avutil_ver = {"6": 58, "5": 57, "4": 56}[ffmpeg_ver]
+    libavutil = libname_template.format(ver=avutil_ver)
+    torchaudio.lib._torchaudio.find_avutil(libavutil)
+
+
+def _find_versionsed_ffmpeg_extension(ffmpeg_ver: str):
+    _LG.debug("Attempting to load FFmpeg version %s.", ffmpeg_ver)
+
+    library = f"libtorchaudio_ffmpeg{ffmpeg_ver}"
+    extension = f"_torchaudio_ffmpeg{ffmpeg_ver}"
+
+    if not _get_lib_path(extension).exists():
+        raise RuntimeError(f"FFmpeg {ffmpeg_ver} extension is not available.")
+
+    if ffmpeg_ver:
+        # A simple check for FFmpeg availability.
+        # This is not strictly sufficient, as other libraries could be missing,
+        # but it is usually enough in practice.
+        #
+        # Note: this check is performed because it is unclear whether the
+        # following `_load_lib` (which calls `ctypes.CDLL` under the hood)
+        # could leak handles to the shared libraries of dependencies when it fails.
+        #
+        # i.e. if `ctypes.CDLL("foo")` fails because one of `foo`'s dependencies
+        # does not exist while `foo` and some other dependencies do, is it guaranteed
+        # that none of them are kept in memory after the failure?
+        _try_access_avutil(ffmpeg_ver)
+
+    _load_lib(library)
+
+    _LG.debug("Found FFmpeg version %s.", ffmpeg_ver)
+    return importlib.import_module(f"torchaudio.lib.{extension}")
+

+_FFMPEG_VERS = ["6", "5", "4", ""]
+
+
+def _find_ffmpeg_extension(ffmpeg_vers, show_error):
+    logger = _LG.error if show_error else _LG.debug
+    for ffmpeg_ver in ffmpeg_vers:
        try:
-        _load_lib("libtorchaudio_ffmpeg")
-    except OSError as err:
-        raise ImportError("FFmpeg libraries are not found. Please install FFmpeg.") from err
+            return _find_versionsed_ffmpeg_extension(ffmpeg_ver)
+        except Exception:
+            logger("Failed to load FFmpeg %s extension.", ffmpeg_ver, exc_info=True)
+            continue
+    raise ImportError(f"Failed to initialize FFmpeg extension. Tried versions: {ffmpeg_vers}")
+
+
+def _find_available_ffmpeg_ext():
+    ffmpeg_vers = ["6", "5", "4", ""]
+    return [v for v in ffmpeg_vers if _get_lib_path(f"_torchaudio_ffmpeg{v}").exists()]

-    import torchaudio.lib._torchaudio_ffmpeg  # noqa

-    torchaudio.lib._torchaudio_ffmpeg.init()
-    if torchaudio.lib._torchaudio_ffmpeg.get_log_level() > 8:
-        torchaudio.lib._torchaudio_ffmpeg.set_log_level(8)
+def _init_ffmpeg(show_error=False):
+    ffmpeg_vers = _find_available_ffmpeg_ext()
+    if not ffmpeg_vers:
+        raise RuntimeError(
+            # fmt: off
+            "TorchAudio is not built with FFmpeg integration. "
+            "Please build torchaudio with USE_FFMPEG=1."
+            # fmt: on
+        )
+
+    # User override
+    if ffmpeg_ver := os.environ.get("TORCHAUDIO_USE_FFMPEG_VERSION"):
+        if ffmpeg_vers == [""]:
+            warnings.warn("TorchAudio is built in single FFmpeg mode. TORCHAUDIO_USE_FFMPEG_VERSION is ignored.")
+        else:
+            if ffmpeg_ver not in ffmpeg_vers:
+                raise ValueError(
+                    f"The FFmpeg version {ffmpeg_ver} (read from TORCHAUDIO_USE_FFMPEG_VERSION) "
+                    f"is not available. Available versions are {[v for v in ffmpeg_vers if v]}"
+                )
+            ffmpeg_vers = [ffmpeg_ver]
+
+    ext = _find_ffmpeg_extension(ffmpeg_vers, show_error)
+    ext.init()
+    if ext.get_log_level() > 8:
+        ext.set_log_level(8)
+    return ext


 def _init_dll_path():
@@ -131,7 +204,7 @@ def _fail_since_no_ffmpeg(func):
            # Note:
            # We run _init_ffmpeg again just to show users the stacktrace.
            # _init_ffmpeg would not succeed here.
-            _init_ffmpeg()
+            _init_ffmpeg(show_error=True)
        except Exception as err:
            raise RuntimeError(
                f"{func.__name__} requires FFmpeg extension which is not available. "
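
The handling of the TORCHAUDIO_USE_FFMPEG_VERSION override in `_init_ffmpeg` can be isolated into a small pure function for illustration. This is a sketch of the same decision logic, not the shipped code: `resolve_candidates` is a hypothetical name, the override is passed in as an argument rather than read from the environment, and the messages are shortened.

```python
import warnings
from typing import List, Optional


def resolve_candidates(available: List[str], override: Optional[str]) -> List[str]:
    """Return the FFmpeg versions to try, honoring a user override if possible.

    `available` lists the versions for which an extension was built; the
    empty string denotes the single-version build.
    """
    if not override:
        # No override: try every built extension in its listed order.
        return available
    if available == [""]:
        # Single-version build: there is nothing to choose from.
        warnings.warn("Single FFmpeg mode: the version override is ignored.")
        return available
    if override not in available:
        choices = [v for v in available if v]
        raise ValueError(
            f"FFmpeg version {override} is not available. Available: {choices}"
        )
    return [override]


if __name__ == "__main__":
    print(resolve_candidates(["6", "5", "4"], None))
    print(resolve_candidates(["6", "5", "4"], "5"))
```

Restricting the candidate list to the single requested version means a failed load surfaces as an error instead of silently falling back to another version.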

--- a/torchaudio/backend/sox_io_backend.py
+++ b/torchaudio/backend/sox_io_backend.py
@@ -24,7 +24,7 @@ def _fail_load(
    raise RuntimeError("Failed to load audio from {}".format(filepath))


-if torchaudio._extension._FFMPEG_INITIALIZED:
+if torchaudio._extension._FFMPEG_EXT is not None:
    import torchaudio.io._compat as _compat

    _fallback_info = _compat.info_audio

--- a/torchaudio/csrc/ffmpeg/CMakeLists.txt
+++ b/torchaudio/csrc/ffmpeg/CMakeLists.txt
@@ -18,30 +18,77 @@ set(
  compat.cpp
  )

+set(
+  ext_sources
+  pybind/pybind.cpp
+  )
+
 if (USE_CUDA)
  set(
    additional_lib
    cuda_deps)
 endif()

-torchaudio_library(
+if (TARGET ffmpeg)
+  torchaudio_library(
    libtorchaudio_ffmpeg
    "${sources}"
    ""
    "torch;ffmpeg;${additional_lib}"
    ""
    )
-
-if (BUILD_TORCHAUDIO_PYTHON_EXTENSION)
-  set(
-    ext_sources
-    pybind/pybind.cpp
-    )
+  if (BUILD_TORCHAUDIO_PYTHON_EXTENSION)
    torchaudio_extension(
      _torchaudio_ffmpeg
      "${ext_sources}"
      ""
      "libtorchaudio_ffmpeg"
+      "TORCHAUDIO_FFMPEG_EXT_NAME=_torchaudio_ffmpeg"
+      )
+  endif()
+else()
+  torchaudio_library(
+    libtorchaudio_ffmpeg4
+    "${sources}"
+    ""
+    "torch;ffmpeg4;${additional_lib}"
+    ""
+    )
+  torchaudio_library(
+    libtorchaudio_ffmpeg5
+    "${sources}"
+    ""
+    "torch;ffmpeg5;${additional_lib}"
    ""
    )
-endif ()
+  torchaudio_library(
+    libtorchaudio_ffmpeg6
+    "${sources}"
+    ""
+    "torch;ffmpeg6;${additional_lib}"
+    ""
+    )
+  if (BUILD_TORCHAUDIO_PYTHON_EXTENSION)
+    torchaudio_extension(
+      _torchaudio_ffmpeg4
+      "${ext_sources}"
+      ""
+      "libtorchaudio_ffmpeg4"
+      "TORCHAUDIO_FFMPEG_EXT_NAME=_torchaudio_ffmpeg4"
+      )
+    torchaudio_extension(
+      _torchaudio_ffmpeg5
+      "${ext_sources}"
+      ""
+      "libtorchaudio_ffmpeg5"
+      "TORCHAUDIO_FFMPEG_EXT_NAME=_torchaudio_ffmpeg5"
+      )
+    torchaudio_extension(
+      _torchaudio_ffmpeg6
+      "${ext_sources}"
+      ""
+      "libtorchaudio_ffmpeg6"
+      "TORCHAUDIO_FFMPEG_EXT_NAME=_torchaudio_ffmpeg6"
+      )
+  endif ()
+endif()
--- a/torchaudio/csrc/ffmpeg/pybind/pybind.cpp
+++ b/torchaudio/csrc/ffmpeg/pybind/pybind.cpp
@@ -186,7 +186,11 @@ struct StreamWriterFileObj : private FileObj, public StreamWriterCustomIO {
            py::hasattr(fileobj, "seek") ? &seek_func : nullptr) {}
 };

-PYBIND11_MODULE(_torchaudio_ffmpeg, m) {
+#ifndef TORCHAUDIO_FFMPEG_EXT_NAME
+#error TORCHAUDIO_FFMPEG_EXT_NAME must be defined.
+#endif
+
+PYBIND11_MODULE(TORCHAUDIO_FFMPEG_EXT_NAME, m) {
  m.def("init", []() { avdevice_register_all(); });
  m.def("get_log_level", []() { return av_log_get_level(); });
  m.def("set_log_level", [](int level) { av_log_set_level(level); });

--- a/torchaudio/csrc/pybind/pybind.cpp
+++ b/torchaudio/csrc/pybind/pybind.cpp
@@ -8,6 +8,7 @@ PYBIND11_MODULE(_torchaudio, m) {
  m.def("is_rir_available", &is_rir_available, "");
  m.def("is_align_available", &is_align_available, "");
  m.def("cuda_version", &cuda_version, "");
+  m.def("find_avutil", &find_avutil, "");
 }

 } // namespace

--- a/torchaudio/csrc/utils.cpp
+++ b/torchaudio/csrc/utils.cpp
-#include <torch/script.h>
+#include <ATen/DynamicLibrary.h>
 #include <torchaudio/csrc/utils.h>

 #ifdef USE_CUDA
@@ -31,4 +31,10 @@ c10::optional<int64_t> cuda_version() {
 #endif
 }

+int find_avutil(const char* name) {
+  auto lib = at::DynamicLibrary{name};
+  auto avutil_version = (unsigned (*)())(lib.sym("avutil_version"));
+  return static_cast<int>(avutil_version() >> 16);
+}
+
 } // namespace torchaudio
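
The shift by 16 bits in `find_avutil` relies on how FFmpeg packs library versions: `AV_VERSION_INT(a, b, c)` stores the major, minor and micro numbers as `a << 16 | b << 8 | c`. A short Python sketch of the unpacking follows; `split_av_version` is a hypothetical helper written for illustration, and the sample value is made up.

```python
from typing import Tuple


def split_av_version(version: int) -> Tuple[int, int, int]:
    """Split a packed FFmpeg library version into (major, minor, micro)."""
    major = version >> 16
    minor = (version >> 8) & 0xFF
    micro = version & 0xFF
    return major, minor, micro


if __name__ == "__main__":
    # A made-up packed value corresponding to libavutil 58.2.100.
    packed = (58 << 16) | (2 << 8) | 100
    print(split_av_version(packed))
```

Only the major number matters for picking an extension, which is why the C++ function returns `avutil_version() >> 16` and discards the rest.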
--- a/torchaudio/csrc/utils.h
+++ b/torchaudio/csrc/utils.h
@@ -5,4 +5,5 @@ namespace torchaudio {
 bool is_rir_available();
 bool is_align_available();
 c10::optional<int64_t> cuda_version();
+int find_avutil(const char* name);
 } // namespace torchaudio
--- a/torchaudio/io/_compat.py
+++ b/torchaudio/io/_compat.py
@@ -7,6 +7,9 @@ import torchaudio
 from torchaudio.backend.common import AudioMetaData
 from torchaudio.io import StreamWriter

+if torchaudio._extension._FFMPEG_EXT is not None:
+    StreamReaderFileObj = torchaudio._extension._FFMPEG_EXT.StreamReaderFileObj
+

 # Note: need to comply TorchScript syntax -- need annotation and no f-string nor global
 def info_audio(
@@ -22,7 +25,7 @@ def info_audio_fileobj(
    format: Optional[str],
    buffer_size: int = 4096,
 ) -> AudioMetaData:
-    s = torchaudio.lib._torchaudio_ffmpeg.StreamReaderFileObj(src, format, None, buffer_size)
+    s = StreamReaderFileObj(src, format, None, buffer_size)
    i = s.find_best_audio_stream()
    sinfo = s.get_src_stream_info(i)
    if sinfo.num_frames == 0:
@@ -67,7 +70,7 @@ def _get_load_filter(


 def _load_audio_fileobj(
-    s: torchaudio.lib._torchaudio_ffmpeg.StreamReaderFileObj,
+    s: StreamReaderFileObj,
    filter: Optional[str] = None,
    channels_first: bool = True,
 ) -> torch.Tensor:
@@ -103,7 +106,7 @@ def load_audio_fileobj(
    buffer_size: int = 4096,
 ) -> Tuple[torch.Tensor, int]:
    demuxer = "ogg" if format == "vorbis" else format
-    s = torchaudio.lib._torchaudio_ffmpeg.StreamReaderFileObj(src, demuxer, None, buffer_size)
+    s = StreamReaderFileObj(src, demuxer, None, buffer_size)
    sample_rate = int(s.get_src_stream_info(s.find_best_audio_stream()).sample_rate)
    filter = _get_load_filter(frame_offset, num_frames, convert)
    waveform = _load_audio_fileobj(s, filter, channels_first)

--- a/torchaudio/io/_stream_reader.py
+++ b/torchaudio/io/_stream_reader.py
@@ -7,6 +7,11 @@ import torch
 import torchaudio
 from torch.utils._pytree import tree_map

+if torchaudio._extension._FFMPEG_EXT is not None:
+    _StreamReader = torchaudio._extension._FFMPEG_EXT.StreamReader
+    _StreamReaderFileObj = torchaudio._extension._FFMPEG_EXT.StreamReaderFileObj
+
+
 __all__ = [
    "StreamReader",
 ]
@@ -513,9 +518,9 @@ class StreamReader:
        buffer_size: int = 4096,
    ):
        if isinstance(src, str):
-            self._be = torchaudio.lib._torchaudio_ffmpeg.StreamReader(src, format, option)
+            self._be = _StreamReader(src, format, option)
        elif hasattr(src, "read"):
-            self._be = torchaudio.lib._torchaudio_ffmpeg.StreamReaderFileObj(src, format, option, buffer_size)
+            self._be = _StreamReaderFileObj(src, format, option, buffer_size)
        else:
            raise ValueError("`src` must be either a string or file-like object.")