AOCL-DLP API Documentation
Welcome to the AMD Optimizing CPU Libraries - Deep Learning Primitives (AOCL-DLP) API documentation.
This site provides the API reference only. For user guides, tutorials, installation, and examples, please visit the project Wiki.
API Reference
Complete API reference for AOCL-DLP (AMD Optimizing CPU Libraries - Deep Learning Primitives).
API Categories
Quick API Lookup
Core GEMM Operations
| Function Pattern | Description |
|---|---|
| | Float32 precision GEMM |
| | BFloat16 inputs, float32 output |
| | Unsigned/signed 8-bit quantized GEMM |
| | Signed 8-bit quantized GEMM |
| | IEEE float16 precision GEMM |
| | BFloat16 x int4 mixed precision |
| | BFloat16 x int8 mixed precision |
| | Float32 x int8 mixed precision |
| | Symmetric quantization GEMM |
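As a point of reference for what the variants above compute, here is a minimal plain-C sketch of a float32 GEMM and of an unsigned/signed 8-bit GEMM with int32 accumulation. The names `ref_gemm_f32` and `ref_gemm_u8s8s32` and their parameter lists are hypothetical and are not the AOCL-DLP API; the sketch only assumes row-major storage and makes no attempt at optimization.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference float32 GEMM: C = alpha * A * B + beta * C, all row-major.
 * A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc).
 * Hypothetical helper for illustration; not a library function. */
void ref_gemm_f32(size_t m, size_t n, size_t k, float alpha,
                  const float *a, size_t lda,
                  const float *b, size_t ldb,
                  float beta, float *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}

/* Reference quantized GEMM: uint8 A times int8 B, accumulated and
 * stored as int32. Hypothetical helper for illustration. */
void ref_gemm_u8s8s32(size_t m, size_t n, size_t k,
                      const uint8_t *a, size_t lda,
                      const int8_t *b, size_t ldb,
                      int32_t *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; ++p)
                acc += (int32_t)a[i * lda + p] * (int32_t)b[p * ldb + j];
            c[i * ldc + j] = acc;
        }
    }
}
```

A reference like this is mainly useful for checking the results of an optimized call on small inputs.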
Batch Operations
| Function Pattern | Description |
|---|---|
| | Batch processing for multiple matrices |
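Conceptually, a batch operation runs many independent GEMM problems from a single call. The sketch below illustrates that idea in plain C; the `demo_gemm_problem` struct and `demo_batch_gemm` function are hypothetical, and the actual batch interface may group and describe problems differently.

```c
#include <stddef.h>

/* One independent GEMM problem, C = A * B, row-major and tightly packed.
 * Hypothetical descriptor for illustration only. */
typedef struct {
    size_t m, n, k;
    const float *a; /* m x k */
    const float *b; /* k x n */
    float *c;       /* m x n */
} demo_gemm_problem;

/* Naive single-problem GEMM used by the batch loop below. */
static void demo_gemm_one(const demo_gemm_problem *p)
{
    for (size_t i = 0; i < p->m; ++i) {
        for (size_t j = 0; j < p->n; ++j) {
            float acc = 0.0f;
            for (size_t q = 0; q < p->k; ++q)
                acc += p->a[i * p->k + q] * p->b[q * p->n + j];
            p->c[i * p->n + j] = acc;
        }
    }
}

/* Runs every problem in the batch; each problem may have its own shape. */
void demo_batch_gemm(const demo_gemm_problem *problems, size_t batch_size)
{
    for (size_t i = 0; i < batch_size; ++i)
        demo_gemm_one(&problems[i]);
}
```

An optimized implementation can spread the problems across threads, which this sequential loop does not attempt.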
Post-Operations
| Type / Structure | Description |
|---|---|
| | Main metadata structure for configuring post-ops |
| | Post-op types: BIAS, ELTWISE, SCALE, MATRIX_ADD, MATRIX_MUL |
| | Activation functions: RELU, GELU, SWISH, TANH, SIGMOID, etc. |
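To make the idea of a post-op chain concrete, the following sketch applies an ordered list of operations (bias, ReLU, scale) to each element of a GEMM result. The enum, struct, and function names are hypothetical stand-ins and do not reflect the library's metadata structure; the per-column vectors are an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical post-op kinds for illustration; not the library's enum. */
typedef enum {
    DEMO_POST_BIAS,
    DEMO_POST_RELU,
    DEMO_POST_SCALE
} demo_post_op_kind;

typedef struct {
    demo_post_op_kind kind;
    const float *vec; /* length-n per-column vector for BIAS and SCALE; unused for RELU */
} demo_post_op;

/* Applies the ops, in order, to every element of the row-major m x n matrix C. */
void demo_apply_post_ops(float *c, size_t m, size_t n, size_t ldc,
                         const demo_post_op *ops, size_t num_ops)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float v = c[i * ldc + j];
            for (size_t t = 0; t < num_ops; ++t) {
                switch (ops[t].kind) {
                case DEMO_POST_BIAS:
                    v += ops[t].vec[j];
                    break;
                case DEMO_POST_SCALE:
                    v *= ops[t].vec[j];
                    break;
                case DEMO_POST_RELU:
                    v = (v > 0.0f) ? v : 0.0f;
                    break;
                }
            }
            c[i * ldc + j] = v;
        }
    }
}
```

The order of the chain matters: applying the bias before ReLU gives a different result than applying it after.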
Matrix Utilities
| Function Pattern | Description |
|---|---|
| | Get buffer size for matrix reordering |
| | Reorder matrix for optimal performance |
| | Convert reordered matrix back to normal format |
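The three utilities above form a size, reorder, unreorder workflow. The sketch below shows one way such a workflow can look, packing a row-major matrix into zero-padded column panels. The panel width `DEMO_NR`, the packed layout, and all function names are assumptions chosen for illustration; the library's internal reordered format is not documented here and is likely different.

```c
#include <stddef.h>

#define DEMO_NR 4 /* hypothetical panel width, in columns */

/* Bytes needed for the packed copy of a k x n float matrix.
 * The column count is rounded up to a whole number of panels. */
size_t demo_reorder_buf_size(size_t k, size_t n)
{
    size_t panels = (n + DEMO_NR - 1) / DEMO_NR;
    return panels * k * DEMO_NR * sizeof(float);
}

/* Packs row-major B (k x n, leading dimension ldb) panel by panel:
 * each panel stores k rows of DEMO_NR contiguous values, zero-padded
 * past the last real column. */
void demo_reorder(const float *b, size_t ldb, size_t k, size_t n, float *packed)
{
    size_t panels = (n + DEMO_NR - 1) / DEMO_NR;
    for (size_t jp = 0; jp < panels; ++jp) {
        for (size_t p = 0; p < k; ++p) {
            for (size_t jj = 0; jj < DEMO_NR; ++jj) {
                size_t j = jp * DEMO_NR + jj;
                packed[(jp * k + p) * DEMO_NR + jj] =
                    (j < n) ? b[p * ldb + j] : 0.0f;
            }
        }
    }
}

/* Inverse of demo_reorder: restores the original row-major layout. */
void demo_unreorder(const float *packed, size_t k, size_t n, float *b, size_t ldb)
{
    for (size_t p = 0; p < k; ++p) {
        for (size_t j = 0; j < n; ++j) {
            b[p * ldb + j] =
                packed[((j / DEMO_NR) * k + p) * DEMO_NR + (j % DEMO_NR)];
        }
    }
}
```

Reordering pays off when the same matrix (for example, a weight matrix) is reused across many GEMM calls, because the packing cost is paid once.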
Element-wise Operations
| Function Pattern | Description |
|---|---|
| | Apply element-wise operations to matrices |
Utility Functions
| Function Pattern | Description |
|---|---|
| | GELU activation functions |
| | Softmax functions |
Library Management
| Function | Description |
|---|---|
| | Configure thread count |
| | Configure parallelization strategy |
| | Query AOCL_DLP_ENABLE_INSTRUCTIONS environment setting |
| | Query library version (major, minor, patch) |
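Because AOCL_DLP_ENABLE_INSTRUCTIONS is an environment variable, an application can also inspect its raw value with the standard C `getenv` call. The sketch below does only that. The helper name `demo_read_enable_instructions` is hypothetical, and the accepted values and how the library interprets them are not covered here.

```c
#include <stdlib.h>
#include <string.h>

/* Copies the raw value of AOCL_DLP_ENABLE_INSTRUCTIONS into buf,
 * truncating if needed. Returns 1 if the variable is set, 0 otherwise
 * (buf is then an empty string). Hypothetical helper; it reads the
 * process environment directly and does not call into the library. */
int demo_read_enable_instructions(char *buf, size_t buf_len)
{
    if (buf_len == 0)
        return 0;
    const char *val = getenv("AOCL_DLP_ENABLE_INSTRUCTIONS");
    if (val == NULL) {
        buf[0] = '\0';
        return 0;
    }
    strncpy(buf, val, buf_len - 1);
    buf[buf_len - 1] = '\0'; /* strncpy does not terminate on truncation */
    return 1;
}
```

Logging this value at startup can help when diagnosing why a particular code path was or was not selected.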
API Selection Guide
Choose the Right GEMM Variant
By Precision Requirements:
High Precision: f32f32f32of32 for maximum accuracy
Balanced: bf16bf16f32of32 for good accuracy with reduced memory
Quantized: u8s8s32os32 or s8s8s32os8 for inference
By Performance Needs:
Single Operation: Standard GEMM functions
Multiple Operations: Batch GEMM functions
Repeated Operations: Use matrix reordering
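The memory/accuracy trade-off of the bfloat16 option can be seen from the format itself: bfloat16 keeps the sign and 8-bit exponent of a float32 but only the top 7 fraction bits, so each value takes 2 bytes instead of 4. The sketch below converts between the two using round-to-nearest-even; the type and function names are hypothetical and are not library types.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t demo_bf16; /* raw bfloat16 bits; hypothetical alias */

/* float32 -> bfloat16 with round-to-nearest-even on the dropped 16 bits. */
demo_bf16 demo_f32_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        /* NaN: set a fraction bit so truncation cannot turn it into infinity. */
        return (demo_bf16)((bits >> 16) | 0x0040u);
    }
    uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    return (demo_bf16)((bits + rounding) >> 16);
}

/* bfloat16 -> float32 is exact: the low 16 bits are filled with zeros. */
float demo_bf16_to_f32(demo_bf16 h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For example, 3.14159265f round-trips to 3.140625f, which shows the roughly two to three significant decimal digits that bfloat16 inputs retain, while accumulation in float32 limits further error growth.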
Data Type Naming Convention
Function names follow the pattern: [input_A][input_B][accumulation]o[output]
f32 = float32
bf16 = bfloat16
u8 = uint8
s8 = int8
s32 = int32
Example: bf16bf16f32of32 = bfloat16 inputs, float32 accumulation and output
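The naming pattern is regular enough to decode mechanically. The sketch below splits a type suffix such as bf16bf16f32of32 into its four parts, using only the five abbreviations listed above. The struct and function names are hypothetical helpers, not part of the library.

```c
#include <stddef.h>
#include <string.h>

/* Decoded parts of a [input_A][input_B][accumulation]o[output] suffix. */
typedef struct {
    const char *a;   /* input A type */
    const char *b;   /* input B type */
    const char *acc; /* accumulation type */
    const char *out; /* output type */
} demo_gemm_types;

/* Abbreviation -> full name, limited to the types listed in this section. */
static const char *const demo_tokens[][2] = {
    {"bf16", "bfloat16"},
    {"f32", "float32"},
    {"s32", "int32"},
    {"u8", "uint8"},
    {"s8", "int8"},
};

/* Matches one abbreviation at *cursor, advances past it, and returns
 * the full type name, or NULL if nothing matches. */
static const char *demo_match_token(const char **cursor)
{
    for (size_t i = 0; i < sizeof demo_tokens / sizeof demo_tokens[0]; ++i) {
        size_t len = strlen(demo_tokens[i][0]);
        if (strncmp(*cursor, demo_tokens[i][0], len) == 0) {
            *cursor += len;
            return demo_tokens[i][1];
        }
    }
    return NULL;
}

/* Returns 1 and fills *t if name matches the pattern exactly, else 0. */
int demo_parse_gemm_name(const char *name, demo_gemm_types *t)
{
    const char *p = name;
    if ((t->a = demo_match_token(&p)) == NULL) return 0;
    if ((t->b = demo_match_token(&p)) == NULL) return 0;
    if ((t->acc = demo_match_token(&p)) == NULL) return 0;
    if (*p != 'o') return 0;
    ++p;
    if ((t->out = demo_match_token(&p)) == NULL) return 0;
    return *p == '\0';
}
```

Parsing u8s8s32os32 this way yields uint8 and int8 inputs with int32 accumulation and int32 output.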
See Also
API Overview - API design principles and usage patterns
GEMM Operations - GEMM operations documentation
Post-Operations - Post-operations framework
Quick Start Guide - Get started in 5 minutes
Integration Guide - CMake integration, linking, and troubleshooting
Examples and Tutorials - Code examples and usage patterns