Cross-Platform Accelerated Machine Learning

Speed up the machine learning process

Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training
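
These optimizations (graph-level fusions, constant folding, and so on) are applied by default and can be tuned per session. A minimal Python sketch, assuming a model file at a placeholder path:

import onnxruntime as ort

# Configure session-level graph optimizations before creating the session.
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

session = ort.InferenceSession("path/to/your/onnx/model", sess_options)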

Plug into your existing technology stack

Support for a variety of frameworks, operating systems and hardware platforms
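
For instance, a model trained in PyTorch can be exported to ONNX and served by ONNX Runtime. A minimal sketch, where the `model` and `example_input` objects stand in for your own trained model and a sample input:

import torch
import onnxruntime as ort

# Export the trained PyTorch model to the ONNX format.
torch.onnx.export(model, example_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Run the exported model with ONNX Runtime.
session = ort.InferenceSession("model.onnx")
outputs = session.run(None, {"input": example_input.numpy()})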

Build using proven technology

Used in Office 365, Visual Studio, and Bing, delivering more than a trillion inferences every day

Use ONNX Runtime with your favorite language

import onnxruntime as ort

# Load the model and create InferenceSession
model_path = "path/to/your/onnx/model"
session = ort.InferenceSession(model_path)

# Load and preprocess the input image to inputTensor
...

# Run inference
outputs = session.run(None, {"input": inputTensor})
print(outputs)
import ai.onnxruntime.*;
import java.util.Collections;

// Load the model and create InferenceSession
String modelPath = "path/to/your/onnx/model";
OrtEnvironment env = OrtEnvironment.getEnvironment();
OrtSession session = env.createSession(modelPath);

// Load and preprocess the input image to inputTensor
...

// Run inference
OrtSession.Result outputs = session.run(Collections.singletonMap("input", inputTensor));
System.out.println(((OnnxTensor) outputs.get(0)).getFloatBuffer().get(0));
import * as ort from "onnxruntime-web";

// Load the model and create InferenceSession
const modelPath = "path/to/your/onnx/model";
const session = await ort.InferenceSession.create(modelPath);

// Load and preprocess the input image to inputTensor
...

// Run inference
const outputs = await session.run({ input: inputTensor });
console.log(outputs);
#include "onnxruntime_cxx_api.h"

// Load the model and create InferenceSession
Ort::Env env;
std::string model_path = "path/to/your/onnx/model";
Ort::Session session(env, model_path.c_str(), Ort::SessionOptions{ nullptr });

// Load and preprocess the input image to 
// inputTensor, inputNames, and outputNames
...

// Run inference
std::vector<Ort::Value> outputTensors =
    session.Run(Ort::RunOptions{nullptr},
                inputNames.data(),
                &inputTensor,
                inputNames.size(),
                outputNames.data(),
                outputNames.size());

const float* outputDataPtr = outputTensors[0].GetTensorMutableData<float>();
std::cout << outputDataPtr[0] << std::endl;
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Load the model and create InferenceSession
string model_path = "path/to/your/onnx/model";
var session = new InferenceSession(model_path);

// Load and preprocess the input image to inputTensor
...

// Run inference
var inputs = new List<NamedOnnxValue> { NamedOnnxValue.CreateFromTensor("input", inputTensor) };
var outputs = session.Run(inputs).ToList();
Console.WriteLine(outputs[0].AsTensor<float>().GetValue(0));

Use ONNX Runtime with the platform of your choice

Select the configuration you want to use and run the corresponding installation script.
ONNX Runtime supports a variety of hardware and architectures to fit any need.

Platform

Windows
Linux
Mac
Android
iOS
Web Browser

API

Python
C++
C#
C
Java
JS
Obj-C
WinRT

Architecture

X64
X86
ARM64
ARM32
IBM Power

Hardware Acceleration

Default CPU
CoreML
CUDA
DirectML
MIGraphX
NNAPI
oneDNN
OpenVINO
ROCm
QNN
TensorRT
ACL (Preview)
ArmNN (Preview)
Azure (Preview)
CANN (Preview)
Rockchip NPU (Preview)
TVM (Preview)
Vitis AI (Preview)
XNNPACK (Preview)
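
Each hardware acceleration option above corresponds to an ONNX Runtime execution provider. As a minimal Python sketch (assuming a CUDA-enabled build), a provider is requested at session creation and ONNX Runtime falls back down the list when one is unavailable:

import onnxruntime as ort

# Ask for the CUDA execution provider first; the CPU provider is the
# fallback if this build or machine cannot use CUDA.
session = ort.InferenceSession(
    "path/to/your/onnx/model",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)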

Installation Instructions

Select a combination of the options above to see the corresponding installation instructions.
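
For example, the default Python-on-CPU combination installs with pip install onnxruntime, and the CUDA-enabled build is published as onnxruntime-gpu. A quick way to verify an installed build is to list the execution providers it can use:

import onnxruntime as ort

# Print the execution providers available in the installed build,
# e.g. ['CPUExecutionProvider'] for the default CPU package.
print(ort.get_available_providers())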

ONNX RUNTIME VIDEOS

ORGANIZATIONS & PRODUCTS USING ONNX RUNTIME