Attention
This version of the SDK has been superseded by the latest release of the SDK.
Tuning Video Quality
The quality of encoded video depends on various factors. It is primarily a function of the target bit rate and the type of video content. However, there are some encoder parameters which can be used to adjust the video quality. This document describes the major parameters impacting video quality, as they specifically pertain to AMA SDK compatible devices. These parameters and the underlying concepts are applicable whether using FFmpeg or the C example programs.
Currently, there are 3 significant parameters that affect the visual quality of the video:
Metric, which is specified by -tune_metrics
Latency, which is specified by -lookahead_depth
Preset, which is specified by -preset
The sections below describe each of these parameters in detail.
Various examples illustrating the effect of these settings can be found here:
Metrics
This parameter defines the means of scoring the observed video quality, for both subjective and objective measures. To control this parameter, set -tune_metrics to its desired target value. Note that the encode engine has specific optimizations per mode. For example, setting -tune_metrics to 1 optimizes for visual quality, but it may not necessarily provide the best PSNR, SSIM or VMAF scores.
Latency
This parameter controls the end-to-end delay within a transcode pipeline. See Tuning Latency of Transcode Pipeline for more details. Allowing more delay in a pipeline typically results in better visual quality. For example, increasing the depth of the lookahead buffer results in a higher quality score for a given bit rate. This parameter is controlled by -lookahead_depth. See FFmpeg Video Quality for example usage. Note that the lookahead depth can be reduced to 15 frames with limited impact on video quality.
Preset
The speed preset determines the amount of time spent encoding the incoming bit stream. This time, in turn, is determined by the number of encode optimization tools deployed during the encoding process. Specifically, for this release, motion vector search range, Rate-Distortion Optimized Quantization (RDOQ) and other such tools are used to enhance the visual quality at the expense of processing time. For example, -preset slow provides about a 2% improvement in terms of BD Rate for AVC encoding, while decreasing the throughput by 20%. Note that the slow preset also increases the ULL mode latency by 25%. See FFmpeg quality analysis examples for example usage. Note that the specific values used in these examples are the only ones recommended for usage.
Content Adaptive Bit Rate (CABR)
CABR, specified by -cabr, opportunistically lowers the bitrate for easy content without causing any noticeable visual quality degradation. Given that the threshold for noticeable visual degradation is around VMAF 90, CABR targets the VMAF-90 point. It operates by lowering the bitrate in sections of the content that exceed the VMAF-90 point. The vq_offset parameter allows for fine-tuning CABR. Specifically, a positive value decreases the encoded bit rate but also lowers the maximum VQ. Similarly, a negative value increases the encoded bit rate and raises the maximum VQ. The CABR modifier is applicable to all rate control, metrics, and preset modes.
GOP Composition
By default, the GOP structure of all AMD AMA encoders is configured to ensure optimal VQ. This implies that GOP topologies may differ among encoders, and that the GOP itself may not have a homogeneous structure. As an example, while both AVC and HEVC encoders use hierarchical B frames, unidirectional B frames that only reference prior anchor frames are also used to further enhance VQ. See -no_bll for how to turn this option off.
ROI
AMA SDK allows for further enhancement in subjective VQ by utilizing ML-driven encoding. This is achieved by encoding regions with a high probability of containing face and text classes with lower QP values. Such improvements are most visible when encoding high-resolution video streams at low bit rates. As such, video conferencing applications, where faces do not occupy the majority of the frame area and are larger than 5x5 pixels, are most amenable to ROI-based encoding. ROI model sensitivity, i.e., the mapping of its probability to QP gain, can be tuned with the strength parameter. Note that, as with all subjective improvements, the resulting objective measures, e.g., SSIM, VMAF, etc., may not reflect such improvements. See Video Machine Learning and the examples therein for sample usage.
AV1 Type Selection
AMD AMA Video SDK offers 2 flavors of AV1 encoding: type 1, which is the default, and type 2. For most use cases, it is recommended to stay with the default, as it not only provides better visual quality but also provides per-frame objective stats. However, because these 2 encoders employ 2 independent pipelines, adding type 2 AV1 encodes increases the channel density when the primary concern is the number of encoded streams.
Dynamic Encoder Parameters
Dynamic parameters are parameters which can be changed during runtime. This is useful to optimize video quality and compression rate for different segments of the video. This capability is supported for FFmpeg and GStreamer.
The following encoder parameters can be dynamically modified:
Number of B frames (0 to 4)
Min/Max bitrate (0 to INT_MAX)
Bitrate (0 to INT_MAX) (Should be in Min-Max bit rates range)
Temporal AQ mode (0 to 1)
Temporal AQ gain (0 to 255)
Spatial AQ mode (0 to 1)
Spatial AQ gain (0 to 255)
Min/Max QP (See QP Ranges)
QP (See QP Ranges) (Should be in Min and Max QP range)
QP I frame offset (See QP Ranges) (Should be in Min and Max QP range)
QP B frame offset (See QP Ranges) (Should be in Min and Max QP range)
Note
AVC and HEVC QP ranges are 0-51
AV1 QP range is 0-255
When using FFmpeg or GStreamer, the encoder parameters which should be changed are specified in a configuration file as key-value pairs. This means that these key-value pairs must be known ahead of time.
Dynamic Parameters Considerations
Recommended settings for dynamic B frames are:
0 for gaming clips with fast motion, camera pan/rotation scenes
2 for static or slow moving scenes, talking heads, or video conferencing type of content
1 for all medium motion and all other content
B frame changes do not happen at the exact frame number specified. Instead, the change takes effect one or two frames after the frame number specified in the configuration file.
The maximum value for the number of B frames is the value configured at initialization.
The configuration files for dynamic parameters must comply with the format specified in the FFmpeg section below. Malformed files may result in unexpected behavior.
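The B frame recommendations above can be turned into configuration file lines programmatically. The following is a minimal sketch and not part of the AMA SDK: the function name b_frame_schedule, the content labels fast, medium and slow, and the (start time, label) segment format are all hypothetical. It assumes the NumB key that appears in the sample configuration file in the FFmpeg section. The sketch converts segment start times to frame numbers, caps each value at the B frame count configured at initialization, and skips entries that would not change the current value.

```python
# Recommended B frame counts from the list above, keyed by hypothetical labels.
RECOMMENDED_NUM_B = {
    "fast": 0,    # gaming clips with fast motion, camera pan/rotation scenes
    "medium": 1,  # medium motion and all other content
    "slow": 2,    # static or slow moving scenes, talking heads, conferencing
}


def b_frame_schedule(segments, fps, initial_b_frames):
    """Build '<frame>:NumB=<n>' lines from (start_seconds, label) segments.

    initial_b_frames is the B frame count configured at initialization,
    which is also the maximum value allowed at runtime.
    """
    lines = []
    current = initial_b_frames  # the encoder starts with the initial value
    for start_seconds, label in sorted(segments):
        wanted = min(RECOMMENDED_NUM_B[label], initial_b_frames)
        if wanted == current:
            continue  # nothing to change at this segment boundary
        frame = round(start_seconds * fps)
        lines.append(f"{frame}:NumB={wanted}")
        current = wanted
    return lines


if __name__ == "__main__":
    demo = [(0, "slow"), (5, "fast"), (10, "medium"), (20, "slow")]
    print("\n".join(b_frame_schedule(demo, fps=60, initial_b_frames=2)))
```

Because B frame changes take effect one or two frames after the specified frame number, the computed frame numbers should be treated as approximate switch points.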
The following table shows supported combinations of rate control modes and dynamic parameters:
Parameter | CQP | CBR | VBR | CVBR | CRF | CABR |
---|---|---|---|---|---|---|
Temporal AQ Mode | Y | Y | Y | Y | Y | Y |
Temporal AQ Gain | Y | Y | Y | Y | Y | Y |
Spatial AQ Mode | Y | Y | Y | Y | Y | Y |
Spatial AQ Gain | Y | Y | Y | Y | Y | Y |
Bit Rate | N | Y | Y | Y | N | Y |
Min Bit Rate | N | N | Y | Y | N | N |
Max Bit Rate | Y | N | Y | Y | Y | N |
Number of B Frames | Y | Y | Y | Y | Y | Y |
Min QP | Y | Y | Y | Y | Y | Y |
Max QP | Y | Y | Y | Y | Y | Y |
QP | Y | N | N | N | Y | N |
QP I Offset | Y | Y | Y | Y | Y | Y |
QP B Offset | Y | Y | Y | Y | Y | Y |
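When dynamic parameter files are generated by a script, the table above can be encoded as a lookup so that unsupported combinations are caught before an encode starts. The following is a minimal sketch that only transcribes the table. The names MODES, SUPPORT, is_supported and unsupported are hypothetical and do not belong to the AMA SDK.

```python
# Rate control modes, in the column order of the table above.
MODES = ("CQP", "CBR", "VBR", "CVBR", "CRF", "CABR")

# One entry per table row; each string holds Y/N flags in MODES order.
_ROWS = {
    "Temporal AQ Mode": "YYYYYY",
    "Temporal AQ Gain": "YYYYYY",
    "Spatial AQ Mode": "YYYYYY",
    "Spatial AQ Gain": "YYYYYY",
    "Bit Rate": "NYYYNY",
    "Min Bit Rate": "NNYYNN",
    "Max Bit Rate": "YNYYYN",
    "Number of B Frames": "YYYYYY",
    "Min QP": "YYYYYY",
    "Max QP": "YYYYYY",
    "QP": "YNNNYN",
    "QP I Offset": "YYYYYY",
    "QP B Offset": "YYYYYY",
}

# SUPPORT[parameter][mode] is True where the table shows Y.
SUPPORT = {
    name: dict(zip(MODES, (flag == "Y" for flag in flags)))
    for name, flags in _ROWS.items()
}


def is_supported(parameter, mode):
    """Return True if the table marks `parameter` as changeable under `mode`."""
    try:
        return SUPPORT[parameter][mode]
    except KeyError as exc:
        raise ValueError(f"unknown parameter or mode: {exc}") from None


def unsupported(parameters, mode):
    """Return the subset of `parameters` that the table marks N for `mode`."""
    return [p for p in parameters if not is_supported(p, mode)]


if __name__ == "__main__":
    print(unsupported(["Bit Rate", "Min Bit Rate", "QP"], "CBR"))
```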
FFmpeg
Encoder parameters which should be changed dynamically are specified as key-value pairs in a configuration file. This configuration file is provided to FFmpeg using the -dynamic_params_file encoder option.
The -dynamic_params_file option is specific to an encoded output. For use cases with multiple encoded outputs (such as ABR ladders), each output can have its own -dynamic_params_file option and associated configuration file.
The configuration file should contain one line for each frame where one or more parameters are changed. Each line should start with the frame number followed by a list of key-value pairs for all the modified parameters:
<frameNumberN1>:<key1>=<value1>
<frameNumberN2>:<key2>=<value2>,<key3>=<value3>
Below is a table listing the parameters which can be changed at runtime, the corresponding key and valid values.
Dynamic Parameter | Key | Valid Values |
---|---|---|
Number of B frames | | 0 to 4 |
Min bit rate | | 0 to INT_MAX |
Max bit rate | | 0 to INT_MAX |
Bitrate (in bits per second) | | 0 to INT_MAX |
Temporal AQ mode | | 0 to 1 |
Temporal AQ gain | | 0 to 255 |
Spatial AQ mode | | 0 to 1 |
Spatial AQ gain | | 0 to 255 |
Min/Max QP | | See QP Ranges |
QP | | See QP Ranges |
QP I frame offset | | See QP Ranges |
QP B frame offset | | See QP Ranges |
Sample FFmpeg encode command:
ffmpeg -hwaccel ama -re -f lavfi -i testsrc=duration=60:size=1920x1080:rate=60,format=yuv420p -f rawvideo -vf "hwupload" -c:v hevc_ama -b:v 5M -dynamic_params_file ./param.txt -y -f rawvideo output.h265
Sample configuration file for dynamic parameters:
300:NumB=1
600:BRkbps=6000
1200:sAQ=1,sAQGain=50
1800:tAQ=1,tAQGain=50
2400:NumB=0,BRkbps=10000,sAQ=0,sAQGain=50,tAQ=0
5000:MinQP=30,MaxQP=35
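Since ill-formed files may result in unexpected behavior, it can help to check a configuration file before passing it to the encoder. The following is a minimal sketch of such a check and is not part of the AMA SDK. It recognizes only the keys that appear in the sample file above, takes the value ranges from the dynamic parameter list and the QP ranges note, and assumes that INT_MAX is the 32-bit signed maximum. The names parse_dynamic_params, allowed_ranges and SAMPLE are hypothetical. Values are only range-checked; no assumption is made about their units.

```python
INT_MAX = 2**31 - 1  # assumption: INT_MAX of a 32-bit signed integer

# QP ranges as stated in the note: AVC and HEVC 0-51, AV1 0-255.
QP_RANGES = {"avc": (0, 51), "hevc": (0, 51), "av1": (0, 255)}

SAMPLE = """\
300:NumB=1
600:BRkbps=6000
1200:sAQ=1,sAQGain=50
1800:tAQ=1,tAQGain=50
2400:NumB=0,BRkbps=10000,sAQ=0,sAQGain=50,tAQ=0
5000:MinQP=30,MaxQP=35
"""


def allowed_ranges(codec):
    """Inclusive (low, high) range for each key seen in the sample file."""
    qp = QP_RANGES[codec]
    return {
        "NumB": (0, 4),
        "BRkbps": (0, INT_MAX),
        "tAQ": (0, 1),
        "tAQGain": (0, 255),
        "sAQ": (0, 1),
        "sAQGain": (0, 255),
        "MinQP": qp,
        "MaxQP": qp,
    }


def parse_dynamic_params(text, codec="hevc"):
    """Parse '<frame>:<key>=<value>,...' lines into (frame, params) tuples.

    Raises ValueError on malformed lines, unknown keys or out-of-range values.
    """
    ranges = allowed_ranges(codec)
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # leniency of this sketch: blank lines are skipped
        frame_text, colon, rest = line.partition(":")
        try:
            frame = int(frame_text)
        except ValueError:
            frame = -1
        if not colon or frame < 0:
            raise ValueError(f"line {lineno}: expected '<frame>:<key>=<value>,...'")
        params = {}
        for pair in rest.split(","):
            key, equals, value_text = pair.partition("=")
            if not equals or key not in ranges:
                raise ValueError(f"line {lineno}: unknown or malformed pair {pair!r}")
            try:
                value = int(value_text)
            except ValueError:
                raise ValueError(f"line {lineno}: {key} needs an integer value") from None
            low, high = ranges[key]
            if not low <= value <= high:
                raise ValueError(f"line {lineno}: {key}={value} is outside {low}..{high}")
            params[key] = value
        if "MinQP" in params and "MaxQP" in params and params["MinQP"] > params["MaxQP"]:
            raise ValueError(f"line {lineno}: MinQP is greater than MaxQP")
        entries.append((frame, params))
    return entries


if __name__ == "__main__":
    for frame, params in parse_dynamic_params(SAMPLE):
        print(frame, params)
```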
GStreamer
GStreamer uses the same configuration file format as FFmpeg:
gst-launch-1.0 fakesrc sizetype=fixed sizemax=4147200 num-buffers=4000 ! capsfilter caps='video/x-raw' ! rawvideoparse width=1920 height=1080 format=i420 framerate=60/1 ! ama_upload ! ama_h265enc encoding-params-file=./param1.txt ! fakesink