Fix the input pixel format when using GPU video encoder (#3426)

Summary:
StreamWriter's encoding pipeline consists of the following steps:

1. Convert the tensor to an AVFrame.
2. Pass the AVFrame to AVFilter.
3. Pass the resulting AVFrame to AVCodecContext (encoder) and AVFormatContext (muxer).

When dealing with CUDA tensors, the AVFilter is a no-op, as we have not added support for CUDA-compatible filters.

When a CUDA frame is passed, the existing solution passes the software pixel format to AVFilter, which later issues a warning because the frame AVFilter actually sees is AV_PIX_FMT_CUDA. Since the filter itself is a no-op, encoding still functions as expected. This commit fixes the mismatch by passing AV_PIX_FMT_CUDA as the filter source format when the input is a CUDA tensor.

See https://github.com/pytorch/audio/issues/3317

Pull Request resolved: https://github.com/pytorch/audio/pull/3426

Differential Revision: D46562370

Pulled By: mthrok

fbshipit-source-id: ce0131f1e50bcc826ee036fc0f35db2a5162b660

Commit 30afaa9b authored Jun 09, 2023 by moto Committed by Facebook GitHub Bot Jun 09, 2023

Showing with 6 additions and 1 deletion

torchaudio/csrc/ffmpeg/stream_writer/encode_process.cpp +6 -1

--- a/torchaudio/csrc/ffmpeg/stream_writer/encode_process.cpp
+++ b/torchaudio/csrc/ffmpeg/stream_writer/encode_process.cpp
@@ -682,7 +682,12 @@ FilterGraph get_video_filter_graph(
  FilterGraph f;
  f.add_video_src(
-      src_fmt, av_inv_q(src_rate), src_rate, src_width, src_height, {1, 1});
+      is_cuda ? AV_PIX_FMT_CUDA : src_fmt,
+      av_inv_q(src_rate),
+      src_rate,
+      src_width,
+      src_height,
+      {1, 1});
  f.add_video_sink();
  f.add_process(desc);
  f.create_filter();
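
The selection made in the hunk above can be illustrated in isolation. The following is a minimal sketch, not the torchaudio implementation: `PixFmt`, `filter_src_format`, and `to_string` are hypothetical names introduced here so the example compiles without FFmpeg headers, and `PixFmt` only mirrors a few names from FFmpeg's `AVPixelFormat` enum.

```cpp
#include <string>

// Hypothetical stand-in for FFmpeg's AVPixelFormat so that this sketch
// builds without libavutil. Only the names mirror the real enum.
enum class PixFmt { RGB24, YUV420P, CUDA };

// Pick the pixel format to declare on the filter graph's video source.
// For CUDA input, the frame that reaches the filter is a hardware frame,
// so the source is declared as CUDA instead of the software format.
// For CPU input, the software format is passed through unchanged.
constexpr PixFmt filter_src_format(bool is_cuda, PixFmt src_fmt) {
  return is_cuda ? PixFmt::CUDA : src_fmt;
}

// Hypothetical helper that names a format, for logging or debugging.
inline std::string to_string(PixFmt fmt) {
  switch (fmt) {
    case PixFmt::RGB24:
      return "rgb24";
    case PixFmt::YUV420P:
      return "yuv420p";
    case PixFmt::CUDA:
      return "cuda";
  }
  return "unknown";
}
```

With this sketch, `filter_src_format(true, PixFmt::YUV420P)` yields `PixFmt::CUDA`, while `filter_src_format(false, PixFmt::YUV420P)` yields `PixFmt::YUV420P`, which matches the intent of the `is_cuda ? AV_PIX_FMT_CUDA : src_fmt` expression in the diff.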