AMD Advanced Media Acceleration (AMA) Troubleshooting

Overview

This section describes troubleshooting methods and workarounds for known issues, such as out-of-memory errors and low frame rates.

No Device Showing in lspci

If lspci -d 10ee: does not show any MA35D devices and dmesg gives no indication that such devices were detected, the cause may be one of the following:

  1. BIOS

  2. PCIe slot

Workaround:

Ensure that your BIOS is up to date and properly configured. See BIOS setting.

Switch to a known working PCIe slot, to isolate possible PCIe slot issues.

Single Device per Card

If the SDK has access to only a single device on a card, that is, if:

lspci -d 10ee:

returns half the expected number of devices, then bifurcation has not been enabled in the BIOS.

Workaround: Check your BIOS and enable 4x4 PCIe bifurcation on each slot that holds an MA35 card.
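The two checks above (no devices at all, or only half of them) can be combined into one small helper. The following is a minimal sketch, not an SDK utility: check_bifurcation is a hypothetical function name, and the figure of two devices per card is an assumption inferred from the statement that a non-bifurcated card exposes half the expected devices.

```shell
#!/bin/sh
# check_bifurcation CARDS
# Reads the output of `lspci -d 10ee:` on stdin and compares the number of
# listed devices with the number expected for CARDS installed cards.
# Assumption: each card exposes two PCIe devices once bifurcation is enabled.
check_bifurcation() {
    cards=$1
    expected=$((cards * 2))
    # Count non-empty lines; grep exits non-zero when the count is 0.
    found=$(grep -c . || true)
    if [ "$found" -eq 0 ]; then
        echo "no devices found: check BIOS settings and the PCIe slot"
    elif [ "$found" -lt "$expected" ]; then
        echo "found $found of $expected devices: check 4x4 PCIe bifurcation"
    else
        echo "found $found of $expected devices: ok"
    fi
}
```

For a host with one card, this could be run as: lspci -d 10ee: | check_bifurcation 1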

Memory Usage

Messages such as:

  • [ERROR] ... Linear buffer allocate fail

  • [ERROR] ... from element xxxxx: Internal data stream error.

  • [ERROR] ... Cannot create channel: DeviceAllocate failed

...

indicate memory pressure on the accelerator card.

Workaround: Unless the card is simply over-subscribed, these issues can be resolved by reducing the lookahead buffer size.

AV1 Slow Playback

An AV1 HLS stream may play back at slower than real-time speed.

Workaround: Explicitly set the playback frame rate when using ffplay, or use ffplay from FFmpeg 5.1.2 or 6.0.

AV1 MP4 Playback

AV1 muxed into an MP4 container may not play back properly.

Workaround: Play back the raw video using a recent version of ffplay (n5.1.2 or later) or a recent version of a media player such as VLC v3.0.17.4. Alternatively, mux the raw video into an IVF container.

30 FPS Density

Transcodes at 30 FPS and below cannot run at full density with the default lookahead depth.

Workaround: Decrease the lookahead depth or the density. The required decrease depends on the resolution and frame rate. See the Performance Tables section for details.

GStreamer Variable Frame Rate

Variable frame rate files are not supported by the GStreamer AMA plugins.

Workaround: Turn XRM off, and add a videorate filter followed by a caps filter before the encoder to set a fixed frame rate.
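As an illustration of the videorate and caps filter part of this workaround, the sketch below builds the pipeline fragment that would sit immediately before the encoder. The function name fixed_rate_segment is hypothetical, the 30 FPS default is only an example, and the rest of the pipeline (source, decoder, AMA encoder, sink) as well as the way XRM is turned off are deliberately left out because they depend on your setup.

```shell
#!/bin/sh
# fixed_rate_segment [FPS]
# Prints a gst-launch-1.0 pipeline fragment that forces a constant frame
# rate: the stock videorate element followed by a caps filter.
# Assumption: FPS is an integer; a default of 30 is used for illustration.
fixed_rate_segment() {
    fps=${1:-30}
    printf 'videorate ! video/x-raw,framerate=%s/1' "$fps"
}
```

The fragment is meant to be spliced into a full pipeline, for example: gst-launch-1.0 <source and decoder> ! $(fixed_rate_segment 30) ! <AMA encoder> ! <sink>, where the bracketed parts are placeholders.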

400 Mbps Max Bit Rate

Encoders do not support a target or max bitrate above 400 Mbps.

Workaround: Constrain the target and max bitrate values to 400 Mbps or below.
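A script that computes bitrates can enforce this limit before launching an encode. The helper below is a minimal sketch under two assumptions: clamp_bitrate is a hypothetical name, and the bitrate is handled in bits per second, with 400 Mbps taken as 400,000,000 bits per second.

```shell
#!/bin/sh
# Upper limit for target and max bitrate, in bits per second.
# Assumption: 1 Mbps = 1,000,000 bits per second.
MAX_BPS=400000000

# clamp_bitrate BPS
# Prints BPS unchanged when it is within the limit, otherwise the limit.
clamp_bitrate() {
    if [ "$1" -gt "$MAX_BPS" ]; then
        echo "$MAX_BPS"
    else
        echo "$1"
    fi
}
```

The result can then be passed to whichever bitrate option your encode command uses.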

Many Parallel Encodes

A Cannot create channel: DeviceAllocate failed error may be encountered when running a high number of encoding operations in parallel.

Workaround: Reduce the memory requirements by lowering the lookahead depth of the encoding operation. This can be controlled with the -lookahead_depth argument. Default values depend on the FPS and bit depth of the input source. The following are some starting points:

30fps: -lookahead_depth 26 (reduce by steps of 2)

60fps: -lookahead_depth 46 (reduce by steps of 4)
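The step-down procedure above can be scripted. The sketch below generates the candidate depths and tries them in order until a user-supplied command succeeds. Both function names are hypothetical, and stopping before the depth reaches 0 is an assumption, since the minimum supported depth is not stated here. The command passed in must accept the depth as its last argument and place it after -lookahead_depth itself.

```shell
#!/bin/sh
# lookahead_candidates START STEP
# Prints START, START-STEP, START-2*STEP, ... for as long as the value is
# greater than 0 (assumed lower bound).
lookahead_candidates() {
    cand=$1
    step=$2
    while [ "$cand" -gt 0 ]; do
        echo "$cand"
        cand=$((cand - step))
    done
}

# first_working_depth START STEP CMD [ARGS...]
# Runs CMD ARGS... DEPTH for each candidate depth and prints the first
# depth for which CMD exits successfully. Returns 1 if none succeeds.
first_working_depth() {
    start=$1
    step=$2
    shift 2
    for depth in $(lookahead_candidates "$start" "$step"); do
        if "$@" "$depth" >/dev/null 2>&1; then
            echo "$depth"
            return 0
        fi
    done
    return 1
}
```

For a 30fps source, a wrapper function such as encode_at_depth (hypothetical, wrapping your own encode command) would be used as: first_working_depth 26 2 encode_at_depth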

HDR & spatialAqGain

Encoding HDR content with a spatialAqGain of 1 or 2 is not supported.

Workaround: Using the default values of spatialAqGain is recommended for best video quality when possible. If it is desirable to reduce the spatialAqGain from the recommended range of 60 - 100, using a value of 3 or greater will avoid this issue.

Nonresponsive Devices

Codec services and utilities may become unavailable because the SDK driver has crashed. In that case, mamgmt reset cannot reset any of the available devices.

Workaround: Execute the following commands:

sudo rmmod ama_transcoder
sudo modprobe ama_transcoder

If removing and reloading the driver does not work, a system reboot is required.
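The reload sequence and its fallback can be wrapped in a single function. This is a minimal sketch: reload_ama_driver and the RUN variable are hypothetical names introduced here, with RUN defaulting to sudo and accepting echo for a dry run that only prints the commands.

```shell
#!/bin/sh
# reload_ama_driver
# Removes and reloads the ama_transcoder kernel module, and reports whether
# a reboot is needed. Set RUN=echo to print the commands without running
# them; by default the commands are run through sudo.
reload_ama_driver() {
    run=${RUN:-sudo}
    if $run rmmod ama_transcoder && $run modprobe ama_transcoder; then
        echo "driver reloaded"
    else
        echo "driver reload failed: reboot the system"
        return 1
    fi
}
```

A dry run can be checked first with: RUN=echo reload_ama_driver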

Codec Issues in VM

If codec operations hang in a VM and the output of sudo dmesg -w in the VM shows messages similar to the following:

...
[  323.436842] ama_transcoder0 0000:05:00.0 hdma: warning:hdma_link_rc2ep_xfer status is done, c:2,status=0x1,condition=0
[  323.440802] ama_transcoder0 0000:05:00.0 hdma: dir:rc2ep element_cnt=1 channel:2 link_table_pa:0x160000
[  323.443387] ama_transcoder0 0000:05:00.0 hdma: ctl:0x01 size:0x4 sh:0x2 sl:0x97e00000 dh:0x0 dl:0x20831000
[  323.445538] ama_transcoder0 0000:05:00.0 hdma: end ctrl:0x06 rsv:0x0 llp_h:0x0 llp_l:0x160000
[  323.447257] ama_transcoder0 0000:05:00.0 hdma: rc2ep PF c=2 0x500 = 0x1
...

while the host reports IO_PAGE_FAULT in its dmesg logs, then the VM is misconfigured.

Workaround: Ensure that the VM has been properly created as per Virtualization.

Cannot Assign a VF to a VM

If passing a VF to a VM fails, the cause may be that the VF did not get its own IOMMU group. To check this, run the following command:

for a in /sys/kernel/iommu_groups/*; do find $a -type l; done | sort --version-sort

and ensure that the VF is assigned to a unique group. As an example, the following output shows that the VF associated with device 0000:02:00.0, i.e., 0000:02:00.1, has its own unique group 34:

...
/sys/kernel/iommu_groups/34/devices/0000:02:00.1
...

Workaround: Ensure that ACS has been enabled in the BIOS.
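On hosts with many IOMMU groups, checking the listing by eye is error prone. The sketch below reads the output of the command shown above and reports how many devices share the group of a given PCI address; a result of 1 means the VF has a group of its own. The function name iommu_group_members is hypothetical, and the parsing assumes the /sys/kernel/iommu_groups/<group>/devices/<address> path layout seen in the example output.

```shell
#!/bin/sh
# iommu_group_members ADDRESS
# Reads lines of the form /sys/kernel/iommu_groups/<group>/devices/<address>
# on stdin and prints the number of devices in the group that contains
# ADDRESS. Exits with status 2 if ADDRESS does not appear in the input.
iommu_group_members() {
    awk -F/ -v bdf="$1" '
        # With "/" as separator, field 5 is the group number and the last
        # field is the PCI address.
        { group[$NF] = $5; count[$5]++ }
        END {
            if (!(bdf in group)) exit 2
            print count[group[bdf]]
        }'
}
```

Usage, for the VF from the example: for a in /sys/kernel/iommu_groups/*; do find "$a" -type l; done | iommu_group_members 0000:02:00.1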

Cannot Flash the Card

While attempting to update the firmware with satellite controller version 9.7.35, the following message may appear:

...
Device: 0000:e2:00.0
**** ERROR Programming BMC-MSP432.bin of type [SC] SC is accessing flash, pls try FW update after sometime.
...

Workaround: Wait for 10 minutes and try to flash again.

FFmpeg Unsupported Audio Format Conversion

FFmpeg may attempt to automatically convert an audio track to an invalid format, based on the specified container type, e.g., from FLAC to AAC 5.1, for MP4 container. This may leave the transcoded stream with an invalid audio track:

...
[aac @ 0x564a3c67f580] Unsupported channel layout "6 channels"
...

Workaround: Explicitly specify the audio encoder, e.g., -c:a copy or -c:a ac3.

LED Red Lights On

If the LED lights are red, this may indicate a hardware issue with the card or with the host chassis. Note that red LED lights are expected while the chassis is powered off.

Workaround: Switch to a known working PCIe slot to isolate possible PCIe slot issues. If the issue is not due to the chassis's PCIe slot, i.e., the card continues to exhibit the same symptoms in a known good slot, contact AMD support for further instructions.

Ubuntu 20.04 Installation

If you cannot install some of the SDK packages on Ubuntu 20.04 and see a message similar to the following:

Err:10 https://packages.xilinx.com/artifactory/debian-packages focal/main amd64 amd-ama-core amd64 1.2.0-2408071645
403  Forbidden [IP: xxx.xxx.xxx.xxx 443]

then this is due to a known bug in the apt version shipped with the Ubuntu 20.04 release.

Workaround: Update your apt version using ppa:gpxbv/apt-urlfix PPA, by following the instructions noted below:

  1. sudo apt install software-properties-common

  2. sudo add-apt-repository ppa:gpxbv/apt-urlfix

  3. sudo apt install apt apt-utils

  4. Proceed with the SDK installation

Fedora Kernel Upgrade

During SDK installation, Fedora may attempt to update to an unsupported kernel, which could lead to a failed or an unstable installation.

Workaround: Add exclude=kernel* to the [main] section of the /etc/dnf/dnf.conf file to prevent unwanted kernel upgrades. (On the dnf command line, the equivalent option is --exclude=kernel*.)
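The configuration change can be applied idempotently, so that repeated runs of a provisioning script do not add duplicate lines. The following is a sketch under stated assumptions: add_kernel_exclude is a hypothetical name, the option is written in configuration-file form (exclude=kernel*, without leading dashes), and the line is appended to the end of the file, which is only correct when [main] is the last or only section, as in a default dnf.conf.

```shell
#!/bin/sh
# add_kernel_exclude CONF_FILE
# Appends "exclude=kernel*" to CONF_FILE unless an exclude line that
# already covers kernel* is present.
# Assumption: [main] is the last (or only) section in CONF_FILE.
add_kernel_exclude() {
    conf=$1
    if grep -q '^exclude=.*kernel\*' "$conf" 2>/dev/null; then
        return 0  # already configured, nothing to do
    fi
    printf 'exclude=kernel*\n' >> "$conf"
}
```

Writing to /etc/dnf/dnf.conf requires root privileges. It may be safer to run the function against a copy of the file first and compare the result.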

Newly Added Card

If, after adding a new card to an existing chassis, dmesg shows messages similar to the following:

[ 1062.737171] ama_transcoderx 0000:XX:00.0 hwm: zsp boot mode: debug.
[ 1062.737179] ama_transcoderx 0000:XX:00.0 hwm: zsp boot mode abnormal [debug mode].

then the newly added card must be flashed, followed by a cold reboot.

Workaround: Use mamgmt to flash the card, and perform a cold reboot.
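To find out which cards are affected on a host with several cards, the dmesg output can be filtered for the abnormal boot mode message. The sketch below is illustrative only: cards_in_debug_boot is a hypothetical name, and the parsing assumes that the PCI address appears as a separate field in the domain:bus:device.function form shown in the messages above.

```shell
#!/bin/sh
# cards_in_debug_boot
# Reads dmesg output on stdin and prints the PCI address of every device
# that reported "zsp boot mode abnormal", one address per line, without
# duplicates.
cards_in_debug_boot() {
    grep 'zsp boot mode abnormal' |
        awk '{
            # Print each field that looks like a PCI address.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+\.[0-9]$/)
                    print $i
        }' |
        sort -u
}
```

Usage: sudo dmesg | cards_in_debug_boot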