When I stack two videos with vstack, the bottom video ends up out of sync with its audio.

My starting point: four separate RTP tracks captured from a two-person video chat:

 Actor1Video.webm,
 Actor1Audio.webm,
 Actor2Video.webm,
 Actor2Audio.webm

I use vstack to put Actor1 on top and Actor2 on the bottom:

ffmpeg -i Actor1Video.webm -i Actor2Video.webm -i Actor1Audio.webm -i Actor2Audio.webm  -filter_complex "[1][0]scale2ref=oh*mdar:ih[2nd][ref];[ref][2nd]vstack=inputs=2[v];[2:a][3:a]join=inputs=2:channel_layout=stereo:map=0.0-FL|1.0-FR[a]" -c:a libfdk_aac -map "[v]" -map "[a]"  -vsync 2 ActorsCombined.mp4

Here is the log:

ffmpeg version git-2021-02-08-89f78dd Copyright (c) 2000-2021 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-89f78dd_6 --enable-shared --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-libsnappy --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-demuxer=dash --disable-libjack --disable-indev=jack --enable-opencl --enable-videotoolbox --disable-htmlpages --enable-libfdk-aac --enable-nonfree
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.121.100 / 58.121.100
  libavformat    58. 67.100 / 58. 67.100
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.103.100 /  7.103.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Input #0, matroska,webm, from 'Actor1Video.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.41, start: 1611273978.135000, bitrate: N/A
  Stream #0:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 1280x720, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 447576:28:17.408999
Input #1, matroska,webm, from 'Actor2Video.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.45, start: 1611273978.257000, bitrate: N/A
  Stream #1:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 320x180, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 447576:28:17.453999
Input #2, matroska,webm, from 'Actor1Audio.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.49, start: 1611273978.112000, bitrate: N/A
  Stream #2:0: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 447576:28:17.492000
Input #3, matroska,webm, from 'Actor2Audio.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 447576:28:17.45, start: 1611273978.208000, bitrate: N/A
  Stream #3:0: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 447576:28:17.447999
File 'ActorsCombined.mp4' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 (vp8) -> scale2ref:ref
  Stream #1:0 (vp8) -> scale2ref:default
  Stream #2:0 (opus) -> join:input0
  Stream #3:0 (opus) -> join:input1
  vstack -> Stream #0:0 (libx264)
  join -> Stream #0:1 (libfdk_aac)
Press [q] to stop, [?] for help
[libx264 @ 0x7ff0c1831a00] using SAR=1/1
[libx264 @ 0x7ff0c1831a00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7ff0c1831a00] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 0x7ff0c1831a00] 264 - core 161 r3043 59c0609 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'ActorsCombined.mp4':
  Metadata:
    title           : FFmpeg
    encoder         : Lavf58.67.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 1280x1440 [SAR 1:1 DAR 8:9], q=2-31, 29.97 fps, 30k tbn (default)
    Metadata:
      encoder         : Lavc58.121.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, s16, 139 kb/s (default)
    Metadata:
      encoder         : Lavc58.121.100 libfdk_aac
frame=36626 fps= 15 q=-1.0 Lsize=  389420kB time=00:21:59.38 bitrate=2417.9kbits/s dup=0 drop=34791 speed=0.535x    
video:365641kB audio:22446kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.343645%
[libx264 @ 0x7ff0c1831a00] frame I:158   Avg QP:15.51  size:107833
[libx264 @ 0x7ff0c1831a00] frame P:9670  Avg QP:18.71  size: 25824
[libx264 @ 0x7ff0c1831a00] frame B:26798 Avg QP:24.90  size:  4018
[libx264 @ 0x7ff0c1831a00] consecutive B-frames:  0.6%  5.2%  0.6% 93.5%
[libx264 @ 0x7ff0c1831a00] mb I  I16..4: 13.2% 75.5% 11.3%
[libx264 @ 0x7ff0c1831a00] mb P  I16..4:  1.2%  3.6%  0.2%  P16..4: 43.1% 10.4%  5.9%  0.0%  0.0%    skip:35.6%
[libx264 @ 0x7ff0c1831a00] mb B  I16..4:  0.1%  0.1%  0.0%  B16..8: 28.3%  0.7%  0.1%  direct: 2.3%  skip:68.5%  L0:45.1% L1:53.6% BI: 1.3%
[libx264 @ 0x7ff0c1831a00] 8x8 transform intra:71.6% inter:85.4%
[libx264 @ 0x7ff0c1831a00] coded y,uvDC,uvAC intra: 50.4% 77.2% 47.8% inter: 6.9% 17.0% 3.8%
[libx264 @ 0x7ff0c1831a00] i16 v,h,dc,p: 37% 28% 14% 22%
[libx264 @ 0x7ff0c1831a00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 17% 25%  4%  6%  7%  5%  6%  5%
[libx264 @ 0x7ff0c1831a00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 35% 24% 16%  4%  6%  5%  4%  4%  2%
[libx264 @ 0x7ff0c1831a00] i8c dc,h,v,p: 60% 16% 17%  6%
[libx264 @ 0x7ff0c1831a00] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7ff0c1831a00] ref P L0: 63.1%  9.9% 20.5%  6.6%
[libx264 @ 0x7ff0c1831a00] ref B L0: 90.0%  8.9%  1.1%
[libx264 @ 0x7ff0c1831a00] ref B L1: 94.7%  5.3%
[libx264 @ 0x7ff0c1831a00] kb/s:2270.36

The resulting file starts out in sync, but after a few minutes the bottom video suddenly falls out of sync with its audio.
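
As a quick sanity check on the log above, the four inputs report slightly different `start` values. The sketch below is only an illustration: the values are copied by hand from the log, `start_offsets_ms` is a hypothetical helper name, and whether these offsets have anything to do with the later drift is an assumption that the log does not confirm.

```python
# Start times in seconds, copied by hand from the "start:" fields in the log.
starts = {
    "Actor1Video": 1611273978.135,
    "Actor2Video": 1611273978.257,
    "Actor1Audio": 1611273978.112,
    "Actor2Audio": 1611273978.208,
}


def start_offsets_ms(starts):
    """Return each track's start offset from the earliest track, in whole ms."""
    earliest = min(starts.values())
    return {name: round((t - earliest) * 1000) for name, t in starts.items()}


offsets = start_offsets_ms(starts)
for name, ms in offsets.items():
    print(f"{name}: +{ms} ms")
```

All four tracks start within a fraction of a second of each other, so a constant start offset alone would not obviously explain a sync loss that only appears after a few minutes.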

Strangely, if I merge each video with its own audio separately, without vstack, there is no sync problem:

ffmpeg -i Actor1Video.webm -i Actor1Audio.webm -vsync 2 Actor1.mp4 &&
ffmpeg -i Actor2Video.webm -i Actor2Audio.webm -vsync 2 Actor2.mp4

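When I do the above, both videos are perfectly in sync. However, if I then stack those two mp4 files, I get the same problem: the bottom video is out of sync.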

Any suggestions?


Update

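This question does not seem to be a duplicate of anything on this site (although, as @llogan pointed out, other users have had questions about WebRTC timestamps). Still, it seems unlikely that WebRTC recordings simply cannot be synchronized?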
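
One number from the log that may bear on the timestamp question: the final status line reports `frame=36626` at `time=00:21:59.38`, while the streams are listed as 29.97 fps. The sketch below only redoes that arithmetic with values copied from the log; `hms_to_seconds` is a hypothetical helper, and reading the result as a sign of gaps or a variable frame rate in the recorded video is my assumption, not something ffmpeg states.

```python
def hms_to_seconds(hms):
    """Convert an ffmpeg-style HH:MM:SS.ss string to seconds."""
    h, m, s = hms.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)


# Values copied by hand from the final status line and the stream info in the log.
frames_encoded = 36626
duration_s = hms_to_seconds("00:21:59.38")
nominal_fps = 29.97

average_fps = frames_encoded / duration_s
expected_frames = duration_s * nominal_fps

print(f"duration: {duration_s:.2f} s")
print(f"average output rate: {average_fps:.2f} fps (nominal {nominal_fps})")
print(f"frames at nominal rate: {expected_frames:.0f}, encoded: {frames_encoded}")
```

The average output rate comes out below the nominal 29.97 fps, so the video frames are apparently not evenly spaced over the whole recording.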
