Tuning Latency

Latency tuning trades off end to end delay against video bit rate, GOP composition, and related parameters. Interactive applications that require low latency can achieve it by setting the relevant parameters in the video pipeline. Note that, for a given visual quality, lowering the latency comes at the cost of an increased bit rate.

Decoder Latency

Decoding latency can be reduced by enabling the decoder's -low_latency option in FFmpeg (for example, -low_latency 1).

Scaler Latency

Scaling latency can be reduced by disabling the enable_pipeline option in FFmpeg. Note, however, that this degrades the scaler's performance.

Encoder Latency

The AMD AMA Video SDK encoder performs multi-objective optimization under constraints on bit rate, GOP topology, visual quality measures, and so on. As such, it can be tuned for a compromise between latency and quality, or specialized for ultra low latency.

Guidelines on Encoder Latency Configuration

Encoding latency can also be reduced by trading off compression rate or visual quality. The following table lists the encoder options that can be used to that effect.

Encode Options	Notes
Look Ahead Depth	For best visual quality, it is recommended to let the buffer depth be determined automatically. If the selected depth adds unacceptable delay, this option can be set explicitly. The supported range is 0 to 46 + the number of B frames.
Number of B frames	Every inserted B frame adds one frame period of delay.

See the Encoding Compatibility Matrix for allowable parameter combinations.
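The two table rows can be turned into quick arithmetic. The following is a minimal sketch, not SDK code: the function names are hypothetical, and it assumes that the supported range reads as 0 through 46 plus the number of B frames.

```python
def b_frame_delay_ms(num_b_frames: int, fps: float) -> float:
    """Delay added by B frames: one frame period per inserted B frame."""
    if num_b_frames < 0 or fps <= 0:
        raise ValueError("num_b_frames must be >= 0 and fps must be > 0")
    frame_period_ms = 1000.0 / fps
    return num_b_frames * frame_period_ms


def max_lookahead_depth(num_b_frames: int) -> int:
    """Upper bound of the look ahead range, read as 46 + number of B frames (assumption)."""
    return 46 + num_b_frames


def is_lookahead_depth_supported(depth: int, num_b_frames: int) -> bool:
    """True if an explicit look ahead depth falls inside the documented range."""
    return 0 <= depth <= max_lookahead_depth(num_b_frames)


if __name__ == "__main__":
    # Two B frames at 30 fps add two frame periods of delay.
    print(f"{b_frame_delay_ms(2, 30.0):.2f} ms")
    print(is_lookahead_depth_supported(48, 2))
```

A check like this can be run before launching a pipeline to see how much delay a given B frame count contributes at the target frame rate.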

Automatic Look Ahead Depth Calculation

The default look ahead buffer depths, optimized for visual quality (VQ), are:

800 ms for 8 bit

600 ms for 10 bit
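To get a feel for what these durations mean in frames, the sketch below converts them at a given frame rate. It assumes that a buffer duration maps to a frame count by multiplying by the frame rate and rounding up. The SDK's actual rounding and clamping are not described here, so treat the result as an estimate. The names are hypothetical.

```python
import math

# Default look ahead durations in milliseconds, keyed by bit depth (from the list above).
DEFAULT_LOOKAHEAD_MS = {8: 800, 10: 600}


def estimated_lookahead_frames(bit_depth: int, fps: float) -> int:
    """Estimate how many frames the default look ahead buffer spans.

    Assumption: frames = duration * frame rate, rounded up.
    """
    if fps <= 0:
        raise ValueError("fps must be > 0")
    duration_ms = DEFAULT_LOOKAHEAD_MS[bit_depth]
    return math.ceil(duration_ms * fps / 1000.0)


if __name__ == "__main__":
    for depth in (8, 10):
        print(depth, "bit at 30 fps:", estimated_lookahead_frames(depth, 30.0), "frames")
```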

Ultra Low Latency (ULL) Mode

Ultra Low Latency (ULL) encoding is enabled by setting the -lookahead_depth flag to 0.

Notes

In ULL encoding mode, frames are always processed in display order. As such, this mode is not compatible with B frames. Furthermore, only Constant Quantization Parameter (CQP) and Constant Bit Rate (CBR) options are allowed. See -control_rate.
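The constraints in this note can be checked before a job is launched. The following is a minimal sketch with hypothetical names; the strings "CQP" and "CBR" are illustrative labels only and are not claimed to be the literal values accepted by -control_rate.

```python
from typing import List

# Rate control modes allowed in ULL mode, per the note above (labels are illustrative).
ULL_RATE_CONTROL_MODES = {"CQP", "CBR"}


def ull_config_problems(lookahead_depth: int, num_b_frames: int, rate_control: str) -> List[str]:
    """Return a list of conflicts with the ULL constraints; empty if none apply."""
    problems: List[str] = []
    if lookahead_depth != 0:
        # ULL is only enabled when the look ahead depth is 0, so nothing to check.
        return problems
    if num_b_frames != 0:
        problems.append("ULL mode is not compatible with B frames")
    if rate_control.upper() not in ULL_RATE_CONTROL_MODES:
        problems.append("ULL mode allows only CQP or CBR rate control")
    return problems


if __name__ == "__main__":
    for problem in ull_config_problems(0, 2, "VBR"):
        print(problem)
```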

Latency Adjustment

The overall latency can be further tuned by adjusting the -bufsize parameter. This parameter allows tuning between strict and relaxed ULL modes. Both modes have the lowest achievable encoding latency in the AMD AMA Video SDK. Strict ULL restricts frame size variations, which gives it lower transmission latency than relaxed ULL at the expense of lower VQ. Relaxed ULL allows larger frame size variations, which gives it better VQ. Depending on network bandwidth, such variations may result in higher transmission latency.
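The link between frame size variation and transmission latency follows from simple arithmetic: a frame of a given size takes size divided by bandwidth to cross a link. The sketch below illustrates this with made-up numbers. It does not model how the encoder interprets -bufsize, and the function name and values are hypothetical.

```python
def transmission_ms(frame_bits: float, bandwidth_bps: float) -> float:
    """Time to send one frame over a link of the given bandwidth, in milliseconds."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be > 0")
    return 1000.0 * frame_bits / bandwidth_bps


if __name__ == "__main__":
    link_bps = 5_000_000          # illustrative link bandwidth
    typical_frame_bits = 100_000  # illustrative frame size
    large_frame_bits = 300_000    # a frame three times larger than typical

    # Tightly bounded frame sizes keep every frame near the typical time.
    print(f"typical frame: {transmission_ms(typical_frame_bits, link_bps):.1f} ms")
    # Larger size variations let an occasional frame take several times longer.
    print(f"large frame:   {transmission_ms(large_frame_bits, link_bps):.1f} ms")
```

The larger the permitted frame size swings relative to the available bandwidth, the longer the worst-case frame takes to arrive, which is the cost relaxed ULL pays for its better VQ.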

FFmpeg Latency Measurements

Latency measurements can be obtained by configuring Unified Logging. The following example demonstrates how to measure decoder, encoder, and end to end latencies:

export LOG_AMA_CONFIG="destination=file, location=log.txt, max_size=1000MB | log_level=WARN | debug_file_line=1 | debug_time_stamp=1 | debug_pid=1 | debug_thread=1 | perf_log=1"
export LOG_AMA_FILTER_PERF="*.*=INFO"
ffmpeg -y -nostdin -hwaccel ama -hwaccel_device /dev/ama_transcoder0 -re -c:v h264_ama -low_latency 1 -i h264_1080p30.mp4 -c:v av1_ama -lookahead_depth 0 -f null /dev/null

The log file tracks timing information for all components of the pipeline. To generate human readable output, use the parse_logs.py utility. This script generates output that looks like the following:

============================ INSTANCE INFO START ===============================

DEC ::
     0. h264_ama@0x564cbc07cc40
             DECODER { PerfBeg, PerfEnd }
             DECSDK::DWLReserveCmdBuf { PerfBeg, PerfEnd }
             DECSDK::rsv_osalsubmit { PerfBeg, PerfEnd }
             DECSDK::rsv_osalwait { PerfBeg, PerfEnd }
             DECSDK::DWLDMA_RC2EP { PerfBeg, PerfEnd }
             DECSDK::DWLEnableCmdBuf { PerfBeg, PerfEnd }
             ...
ENC_0.ENCODER latency = 33.431 ms (APPLICATION LEVEL)
      (PerfBeg@av1_ama@0x560d0c196540-ENCODER --> PerfEnd@av1_ama@0x560d0c196540-ENCODER)
ENC_0.ENCODER::PutFrame latency = 0.057 ms
      (PerfBeg@av1_ama@0x560d0c196540-ENCODER::PutFrame --> PerfEnd@av1_ama@0x560d0c196540-ENCODER::PutFrame)
ENC_0.Encoder::SDK latency = 4.442 ms
      (PerfBeg@av1_ama@0x560d0c196540-Encoder::SDK --> PerfEnd@av1_ama@0x560d0c196540-Encoder::SDK)
ENC_0.ENCODER::GetPkt latency = 0.009 ms
     (PerfBeg@av1_ama@0x560d0c196540-ENCODER::GetPkt --> PerfEnd@av1_ama@0x560d0c196540-ENCODER::GetPkt)
PostDecode_CH1 latency = 33.458 ms
     (PerfEnd@h264_ama@0x560d0c194980-DECODER --> PerfEnd@av1_ama@0x560d0c196540-ENCODER)

End2End_CH1 latency = 168.958 ms
     (PerfBeg@h264_ama@0x560d0c194980-DECODER --> PerfEnd@av1_ama@0x560d0c196540-ENCODER))

The above delineates component based and end to end timing information. In this example, the end to end delay is 168.958 ms and the encoder delay is 33.431 ms.
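When the report needs to be consumed by another tool, the "latency = ... ms" lines can be extracted with a small script. The following is a minimal sketch that assumes the report has been saved as text in the format shown above. The function names are hypothetical and are not part of parse_logs.py. It also derives the decoder span by subtracting PostDecode from End2End, which follows from the PerfBeg and PerfEnd markers printed under each figure, assuming both figures cover the same frames.

```python
import re
from typing import Dict

# Matches lines such as "End2End_CH1 latency = 168.958 ms".
LATENCY_RE = re.compile(r"^\s*(?P<name>\S+) latency = (?P<ms>\d+(?:\.\d+)?) ms")


def parse_latencies(report: str) -> Dict[str, float]:
    """Map each measurement name in the report to its latency in milliseconds."""
    latencies: Dict[str, float] = {}
    for line in report.splitlines():
        match = LATENCY_RE.match(line)
        if match:
            latencies[match.group("name")] = float(match.group("ms"))
    return latencies


def decoder_span_ms(latencies: Dict[str, float], channel: str = "CH1") -> float:
    """End2End minus PostDecode, i.e. decoder PerfBeg to decoder PerfEnd."""
    return latencies[f"End2End_{channel}"] - latencies[f"PostDecode_{channel}"]


if __name__ == "__main__":
    sample = """\
ENC_0.ENCODER latency = 33.431 ms (APPLICATION LEVEL)
PostDecode_CH1 latency = 33.458 ms
End2End_CH1 latency = 168.958 ms
"""
    parsed = parse_latencies(sample)
    for name, ms in parsed.items():
        print(f"{name}: {ms} ms")
    print(f"decoder span: {decoder_span_ms(parsed):.3f} ms")
```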