Attention

This version of the SDK has been superseded by the latest release of the SDK.

Tuning Video Quality

The quality of encoded video depends on various factors. It is primarily a function of the target bit rate and the type of video content. However, there are some encoder parameters which can be used to adjust the video quality. This document describes the major parameters impacting video quality, as they specifically pertain to AMA SDK compatible devices. These parameters and the underlying concepts are applicable whether using FFmpeg or the C example programs.

Currently, there are three significant parameters that affect the visual quality of the video:

  1. Metric, which is specified by -tune_metrics

  2. Latency, which is specified by -lookahead_depth

  3. Preset, which is specified by -preset

The sections below describe each of these parameters in detail.

Various examples illustrating the effect of these settings can be found here:

Metrics

This parameter defines how the observed video quality is scored, covering both subjective and objective measures. To control it, set -tune_metrics to the desired target value. Note that the encode engine applies optimizations specific to each mode. For example, setting -tune_metrics to 1 optimizes visual quality, but it may not necessarily provide the best PSNR, SSIM, or VMAF scores.

Latency

This parameter controls the end-to-end delay within a transcode pipeline. See Tuning Latency of Transcode Pipeline for more details. Allowing more delay in a pipeline typically results in better visual quality. For example, increasing the depth of the lookahead buffer results in a higher quality score for a given bit rate. This parameter is controlled by -lookahead_depth. See FFmpeg Video Quality for example usage. Note that the lookahead depth can be reduced to 15 frames with limited impact on video quality.

Preset

The speed preset determines how much time is spent encoding the incoming bit stream. That time is in turn determined by the number of encode optimization tools deployed during the encoding process. Specifically, for this release, motion vector search range, Rate-Distortion Optimized Quantization (RDOQ), and other such tools are used to enhance visual quality at the expense of processing time. For example, -preset slow provides about a 2% improvement in BD-rate for AVC encoding, while decreasing throughput by 20%. Note that the slow preset also increases the ULL mode latency by 25%. See FFmpeg quality analysis examples for example usage. The specific values used in these examples are the only ones recommended.
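To make the trade-off concrete, the percentages quoted above can be applied to a set of baseline numbers. The sketch below is only arithmetic: the function name and the baseline figures are hypothetical, and it assumes a 2% BD-rate improvement can be read as roughly 2% less bit rate at equal quality, which holds only on average and, per the text, only for AVC.

```python
def slow_preset_estimate(throughput_fps, ull_latency_ms, bitrate_kbps):
    """Apply the quoted -preset slow percentages to hypothetical baseline numbers."""
    return {
        # Throughput decreases by 20%.
        "throughput_fps": throughput_fps * 0.80,
        # ULL mode latency increases by 25%.
        "ull_latency_ms": ull_latency_ms * 1.25,
        # Assumption: ~2% BD-rate gain read as ~2% bit rate saved at equal quality.
        "bitrate_kbps_at_equal_quality": bitrate_kbps * 0.98,
    }


# Hypothetical baseline: 240 fps aggregate throughput, 40 ms ULL latency, 5000 kbps.
for name, value in slow_preset_estimate(240, 40, 5000).items():
    print(f"{name}: {value:.1f}")
```

Whether a 20% throughput loss is worth a gain of about 2% depends on whether the deployment is limited by channel density or by bandwidth.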

GOP Composition

By default, the GOP structure of all AMD AMA encoders is configured to ensure optimal video quality (VQ). This implies that GOP topologies may differ among encoders, and that a GOP itself may not have a homogeneous structure. For example, both the AVC and HEVC encoders use hierarchical B frames, but the AVC encoder additionally uses unidirectional B frames, which reference only prior anchor frames, to further enhance its VQ. See -no_bll for how to turn this option off.

ROI

AMA SDK allows for further enhancement of subjective VQ through ML-driven encoding. This is achieved by encoding regions that have a high probability of containing faces or text with lower QP values. Such improvements are most visible when encoding high-resolution video streams at low bit rates. As such, video conferencing applications, where faces do not occupy the majority of the frame area and are larger than 5x5 pixels, are most amenable to ROI-based encoding. The sensitivity of the ROI model, i.e., how its probability output maps to QP gain, can be tuned with the strength parameter. As with all subjective improvements, objective measures such as SSIM and VMAF may not reflect the gains. See Video Machine Learning and the examples therein for sample usage.

AV1 Type Selection

AMD AMA Video SDK offers two flavors of AV1 encoding: type 1 (the default) and type 2. For most use cases, it is recommended to stay with the default, as it provides both better visual quality and per-frame objective stats. However, because the two encoders use independent pipelines, adding type 2 AV1 encodes increases channel density when the primary concern is the number of encoded streams.

Dynamic Encoder Parameters

Dynamic parameters are parameters which can be changed during runtime. This is useful to optimize video quality and compression rate for different segments of the video. This capability is supported for FFmpeg and GStreamer.

The following encoder parameters can be dynamically modified:

  • Number of B frames (0 to 4)

  • Min/Max bitrate (0 to INT_MAX)

  • Bitrate (0 to INT_MAX) (Should be in Min-Max bit rates range)

  • Temporal AQ mode (0 to 1)

  • Temporal AQ gain (0 to 255)

  • Spatial AQ mode (0 to 1)

  • Spatial AQ gain (0 to 255)

  • Min/Max QP (See QP Ranges)

  • QP (See QP Ranges) (Should be in Min and Max QP range)

  • QP I frame offset (See QP Ranges) (Should be in Min and Max QP range)

  • QP B frame offset (See QP Ranges) (Should be in Min and Max QP range)

Note

AVC and HEVC QP ranges are 0-51

AV1 QP range is 0-255

When using FFmpeg or GStreamer, the encoder parameters which should be changed are specified in a configuration file as key-value pairs. This means that these key-value pairs must be known ahead of time.

Dynamic Parameters Considerations

  • Recommended settings for dynamic B frames are:

    • 0 for gaming clips with fast motion, camera pan/rotation scenes

    • 2 for static or slow moving scenes, talking heads, or video conferencing type of content

    • 1 for all medium motion and all other content

  • B frame changes do not happen at the exact frame number specified. Instead, the change takes effect one or two frames after the frame number specified in the config file.

  • The maximum value for the number of B frames is the value configured at initialization.

  • The configuration files for dynamic parameters must comply with the format specified in the FFmpeg section below. Ill-formed files may result in unexpected behavior.
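The B frame recommendations above can be sketched as a small lookup that also respects the cap set at initialization. The content labels ("fast", "static", "medium") and the function name are hypothetical, chosen only for illustration; classifying the content is left to the application.

```python
# Recommended dynamic B frame counts, keyed by hypothetical content labels.
RECOMMENDED_NUM_B = {
    "fast": 0,    # gaming clips with fast motion, camera pan/rotation scenes
    "static": 2,  # static or slow moving scenes, talking heads, video conferencing
    "medium": 1,  # medium motion
}


def recommended_num_b(content, init_num_b):
    """Return a NumB value for the content, never above the initialization value."""
    wanted = RECOMMENDED_NUM_B.get(content, 1)  # 1 for all other content
    return min(wanted, init_num_b)


print(recommended_num_b("static", init_num_b=1))
```

With an encoder initialized with one B frame, a request for two on static content is clamped to one, in line with the maximum noted above.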

FFmpeg

Encoder parameters which should be changed dynamically are specified as key-value pairs in a configuration file. This configuration file is provided to FFmpeg using the -dynamic_params_file encoder option.

The -dynamic_params_file option is specific to an encoded output. For use cases with multiple encoded outputs (such as ABR ladders), each output can have its own -dynamic_params_file option and associated configuration file.
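As an illustration of the per-output behavior, the sketch below assembles an FFmpeg argument list in which every encoded output carries its own -dynamic_params_file. It reuses only options that appear in the sample command in this document. The function name, file names, and bit rates are hypothetical, every output keeps the source resolution (a real ABR ladder would also scale), and the resulting command has not been validated on hardware.

```python
import shlex


def abr_command(outputs):
    """Build an ffmpeg argument list; outputs is a list of (bitrate, params_file, path)."""
    cmd = [
        "ffmpeg", "-y", "-hwaccel", "ama", "-re", "-f", "lavfi",
        "-i", "testsrc=duration=60:size=1920x1080:rate=60,format=yuv420p",
    ]
    for bitrate, params_file, path in outputs:
        # Output options apply to the next output file, so each output
        # gets its own encoder settings and its own configuration file.
        cmd += [
            "-vf", "hwupload", "-c:v", "hevc_ama", "-b:v", bitrate,
            "-dynamic_params_file", params_file,
            "-f", "rawvideo", path,
        ]
    return cmd


# Hypothetical outputs and configuration file names.
command = abr_command([
    ("5M", "./param_5m.txt", "out_5m.h265"),
    ("3M", "./param_3m.txt", "out_3m.h265"),
])
print(shlex.join(command))
```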

The configuration file should contain one line for each frame where one or more parameters are changed. Each line should start with the frame number followed by a list of key-value pairs for all the modified parameters:

<frameNumberN1>:<key1>=<value1>
<frameNumberN2>:<key2>=<value2>,<key3>=<value3>
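Because ill-formed files may result in unexpected behavior, it can be useful to check a file before handing it to the encoder. The following is a minimal sketch of a parser for the line format shown above. It is not part of the SDK: the function name is hypothetical, it assumes integer values, and it tolerates blank lines, which the encoder itself may not.

```python
def parse_dynamic_params(text):
    """Parse '<frame>:<key>=<value>,...' lines into a list of (frame, {key: value})."""
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped in this sketch
        frame_part, sep, pairs_part = line.partition(":")
        if not sep or not frame_part.isdigit():
            raise ValueError(f"line {line_no}: expected '<frameNumber>:<key>=<value>'")
        params = {}
        for pair in pairs_part.split(","):
            key, eq, value = pair.partition("=")
            if not eq or not key.strip():
                raise ValueError(f"line {line_no}: malformed pair {pair!r}")
            params[key.strip()] = int(value)  # raises ValueError on non-integers
        entries.append((int(frame_part), params))
    return entries


sample = "300:NumB=1\n2400:NumB=0,BRkbps=10000"
for frame, params in parse_dynamic_params(sample):
    print(frame, params)
```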

Below is a table listing the parameters which can be changed at runtime, the corresponding key and valid values.

Dynamic Parameter                   Key                        Valid Values
----------------------------------  -------------------------  -------------
Number of B frames                  NumB=<int>                 0 to 4
Min bit rate                        MinBRkbps=<int>            0 to INT_MAX
Max bit rate                        MaxBRkbps=<int>            0 to INT_MAX
Bitrate (in kilobits per second)    BRkbps=<int>               0 to INT_MAX
Temporal AQ mode                    tAQ=<int>                  0 to 1
Temporal AQ gain                    tAQGain=<int>              0 to 255
Spatial AQ mode                     sAQ=<int>                  0 to 1
Spatial AQ gain                     sAQGain=<int>              0 to 255
Min/Max QP                          MinQP=<int>,MaxQP=<int>    See QP Ranges
QP                                  QP=<int>                   See QP Ranges
QP I frame offset                   QPOffsetI=<int>            See QP Ranges
QP B frame offset                   QPOffsetB=<int>            See QP Ranges
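The valid values listed above lend themselves to a simple range check before a configuration file is used. The sketch below encodes those ranges together with the QP ranges from the note earlier in this document (0-51 for AVC and HEVC, 0-255 for AV1). The function names and the codec labels are hypothetical, INT_MAX is assumed to be the 32-bit signed maximum, and the QP offsets are checked against the QP range as the parameter list states.

```python
INT_MAX = 2**31 - 1  # assumption: 32-bit signed INT_MAX
QP_MAX = {"avc": 51, "hevc": 51, "av1": 255}  # from the QP ranges note


def allowed_ranges(codec):
    """Return {key: (low, high)} for the dynamic parameter keys."""
    qp = (0, QP_MAX[codec])
    rate = (0, INT_MAX)
    return {
        "NumB": (0, 4),
        "MinBRkbps": rate, "MaxBRkbps": rate, "BRkbps": rate,
        "tAQ": (0, 1), "tAQGain": (0, 255),
        "sAQ": (0, 1), "sAQGain": (0, 255),
        "MinQP": qp, "MaxQP": qp, "QP": qp,
        "QPOffsetI": qp, "QPOffsetB": qp,
    }


def check_params(params, codec="hevc"):
    """Return a list of problems found in one line's key-value pairs."""
    ranges = allowed_ranges(codec)
    problems = []
    for key, value in params.items():
        if key not in ranges:
            problems.append(f"unknown key {key}")
            continue
        low, high = ranges[key]
        if not low <= value <= high:
            problems.append(f"{key}={value} outside {low}..{high}")
    # Cross-check: when both bounds are given, MinQP must not exceed MaxQP.
    if params.get("MinQP", 0) > params.get("MaxQP", QP_MAX[codec]):
        problems.append("MinQP is greater than MaxQP")
    return problems


print(check_params({"NumB": 0, "BRkbps": 10000, "sAQ": 0}))
print(check_params({"QP": 60}, codec="hevc"))
```

An empty list means no problem was found. This check only covers the ranges documented here; it does not verify that BRkbps falls between the currently active minimum and maximum bit rates, since those may have been set on an earlier line or on the command line.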

Sample FFmpeg encode command:

ffmpeg -hwaccel ama -re -f lavfi -i testsrc=duration=60:size=1920x1080:rate=60,format=yuv420p -f rawvideo  -vf "hwupload" -c:v hevc_ama -b:v 5M  -dynamic_params_file ./param.txt -y -f rawvideo  output.h265

Sample configuration file for dynamic parameters:

300:NumB=1
600:BRkbps=6000
1200:sAQ=1,sAQGain=50
1800:tAQ=1,tAQGain=50
2400:NumB=0,BRkbps=10000,sAQ=0,sAQGain=50,tAQ=0
5000:MinQP=30,MaxQP=35
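Since the key-value pairs must be known ahead of time, a configuration file such as the one above can be generated from a schedule expressed in seconds. The sketch below is one way to do this. The function name is hypothetical, it assumes a constant frame rate and that frame numbers map directly to seconds times frames per second, and it sorts entries by frame for readability only, because this document does not state whether ascending order is required.

```python
def build_config(schedule, fps):
    """Render [(seconds, {key: value}), ...] as dynamic parameter file text."""
    lines = []
    for seconds, params in sorted(schedule, key=lambda item: item[0]):
        frame = round(seconds * fps)  # assumes constant frame rate
        pairs = ",".join(f"{key}={value}" for key, value in params.items())
        lines.append(f"{frame}:{pairs}")
    return "\n".join(lines) + "\n"


# Reproduces the first three lines of the sample file for 60 fps content.
schedule = [
    (5, {"NumB": 1}),
    (10, {"BRkbps": 6000}),
    (20, {"sAQ": 1, "sAQGain": 50}),
]
print(build_config(schedule, fps=60), end="")
```

Keep in mind that B frame changes take effect one or two frames after the requested frame, so the generated frame numbers are approximate for NumB.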

GStreamer

GStreamer uses the same configuration file format as FFmpeg:

gst-launch-1.0 fakesrc sizetype=fixed sizemax=4147200 num-buffers=4000 ! capsfilter caps='video/x-raw' ! rawvideoparse width=1920 height=1080 format=i420 framerate=60/1 ! ama_upload ! ama_h265enc encoding-params-file=./param1.txt ! fakesink