Data Types and Structures#

Data types, structures, and constants used throughout AOCL-DLP.

Overview#

AOCL-DLP defines various data types and structures to support different precision levels and operation configurations. Understanding these types is essential for effective use of the library.

Basic Data Types#

Dimension Types#

typedef int64_t md_t#

BFloat16 Support#

typedef int16_t bfloat16#

Post-Operations Structures#

struct dlp_metadata_t

Main metadata structure containing all post-operation configurations.

This structure serves as the main container for all post-operation metadata, defining the sequence, parameters, and group information for operations applied after GEMM. It supports multiple post-operations that can be chained together in a specific order.

Public Members

dlp_scale_t *scale

Scale post-operations (multiple allowed)

dlp_post_op_eltwise *eltwise

Element-wise post-operations (multiple allowed)

dlp_post_op_bias *bias

Bias addition post-operation

dlp_post_op_matrix_add *matrix_add

Matrix addition post-operation

dlp_post_op_matrix_mul *matrix_mul

Matrix multiplication post-operation

md_t seq_length

Number of operations in the sequence (e.g., 2)

DLP_POST_OP_TYPE *seq_vector

Sequence of post-operations to apply in order (e.g., seq_vector[0]=BIAS, seq_vector[1]=ELTWISE means bias followed by element-wise operation)

dlp_pre_op *pre_ops

Pre-operations to be applied before GEMM

dlp_group_post_op *post_op_grp

Grouped post-operations for different quantization groups

md_t num_eltwise

Number of element-wise operations to track

dlp_error_hndl_t error_hndl

Error handle for the routine, currently wrapped as part of the metadata.

Post-Operation Types#

enum DLP_POST_OP_TYPE

Enumeration of post-operation types that can be applied to GEMM results.

This enum defines the different types of operations that can be performed on the output matrix after GEMM computation.

Values:

enumerator ELTWISE

Element-wise operations (activations)

enumerator BIAS

Bias addition operation

enumerator SCALE

Scaling operation

enumerator MATRIX_ADD

Matrix addition operation

enumerator MATRIX_MUL

Matrix multiplication operation

enum DLP_ELT_ALGO_TYPE

Enumeration of element-wise algorithm types supported in post-operations.

This enum defines the activation functions and element-wise operations that can be applied as post-operations in GEMM computations.

Values:

enumerator RELU

Rectified Linear Unit activation: max(0, x)

enumerator PRELU

Parametric ReLU activation: max(alpha*x, x)

enumerator GELU_TANH

GELU activation using tanh approximation

enumerator GELU_ERF

GELU activation using error function

enumerator CLIP

Clipping operation: min(max(x, min_val), max_val)

enumerator SWISH

Swish activation: x * sigmoid(x)

enumerator TANH

Hyperbolic tangent activation

enumerator SIGMOID

Sigmoid activation: 1 / (1 + exp(-x))

Constants and Enumerations#

Memory Format Constants#

Memory Format Values#

The 'R'/'C' pair selects the storage order of a matrix; the 'N'/'R' pair selects the memory format, i.e. whether the matrix data is in its natural layout or has been reordered (packed) for the target architecture. The two uses of 'R' belong to different arguments and do not conflict.

Constant   Argument         Description
'R'        Storage order    Row-major (C-style) memory layout
'C'        Storage order    Column-major (Fortran-style) memory layout
'N'        Memory format    Normal (unpacked) matrix format
'R'        Memory format    Reordered (packed) matrix format

Transpose Options#

Transpose Values#

Constant   Description
'N'        No transpose
'T'        Transpose matrix

Data Type Naming Convention#

Function names in AOCL-DLP follow a systematic naming convention that encodes the data types:

Pattern: [input_A][input_B][accumulation]o[output]

Type Abbreviations#

Abbreviation   Data Type           Description
f32            float (32-bit)      IEEE 754 single precision
bf16           bfloat16 (16-bit)   Brain floating point format
u8             uint8_t (8-bit)     Unsigned 8-bit integer
s8             int8_t (8-bit)      Signed 8-bit integer
s4             int4 (4-bit)        Signed 4-bit integer (packed)
s32            int32_t (32-bit)    Signed 32-bit integer

Examples#

Function Name Examples#

Function Name                Meaning
aocl_gemm_f32f32f32of32      float32 × float32 → float32 (with float32 accumulation)
aocl_gemm_bf16bf16f32of32    bfloat16 × bfloat16 → float32 (with float32 accumulation)
aocl_gemm_u8s8s32os8         uint8 × int8 → int8 (with int32 accumulation)
aocl_gemm_bf16s4f32of32      bfloat16 × int4 → float32 (with float32 accumulation)

Memory Layout Considerations#

Row-Major vs Column-Major#

// Row-major (C-style): A[i][j] = A[i*lda + j]
float A_row_major[M][N];

// Column-major (Fortran-style): A[i][j] = A[j*lda + i]
float A_col_major[N][M];

Leading Dimensions#

The leading dimension must be at least as large as the corresponding matrix dimension:

// For row-major matrices
assert(lda >= k);  // For matrix A (m × k)
assert(ldb >= n);  // For matrix B (k × n)
assert(ldc >= n);  // For matrix C (m × n)

// For column-major matrices
assert(lda >= m);  // For matrix A (m × k)
assert(ldb >= k);  // For matrix B (k × n)
assert(ldc >= m);  // For matrix C (m × n)

BFloat16 Usage#

BFloat16 Format#

BFloat16 (Brain Floating Point) is a 16-bit floating point format:

  • 1 bit: Sign

  • 8 bits: Exponent (same as float32)

  • 7 bits: Mantissa (reduced from float32’s 23 bits)

Usage Example#

#include <stdint.h>
#include <string.h>
#include "aocl_dlp.h"

// Convert float32 to bfloat16 by taking the top 16 bits of the
// float32 bit pattern (a plain value cast would convert the value
// numerically and produce the wrong bits)
float f32_value = 3.14159f;
uint32_t f32_bits;
memcpy(&f32_bits, &f32_value, sizeof(f32_bits));
bfloat16 bf16_value = (bfloat16)(f32_bits >> 16);

// Use in GEMM operations
bfloat16 *a_bf16, *b_bf16;
float *c_f32;

aocl_gemm_bf16bf16f32of32(
    'R', 'N', 'N', m, n, k,
    1.0f, a_bf16, lda, 'N',
    b_bf16, ldb, 'N',
    0.0f, c_f32, ldc, NULL
);

Type Safety and Conversions#

Implicit Conversions#

AOCL-DLP handles some implicit type conversions:

  • Quantized to float: Automatic dequantization in mixed-precision operations

  • BFloat16 to float32: Automatic promotion for accumulation

  • Integer widening: Automatic promotion to prevent overflow

Explicit Conversions#

For explicit type conversions, use appropriate utility functions or element-wise operations:

// Convert float32 array to bfloat16
aocl_gemm_eltwise_ops_f32obf16(
    'R', 'N', 'N', m, n,
    f32_array, lda,
    bf16_array, ldb,
    NULL  // No additional operations
);

Best Practices#

  1. Choose appropriate precision: Balance accuracy and performance requirements

  2. Understand memory layouts: Use consistent layouts throughout your application

  3. Validate dimensions: Ensure leading dimensions are correctly set

  4. Handle type conversions: Be explicit about precision requirements

  5. Consider hardware support: Some types may have better hardware acceleration

See Also#

Full Header Reference#