ailia SDK

ailia SDK is the ONNX inference API at the core of the ailia product family; the other ailia libraries build on top of it.

Getting Started

Choose your platform and follow three steps to run your first inference.

1. Install

Install the ailia Python package from PyPI. Python 3.6 or later is required.

pip3 install ailia

2. Run a Sample

The ailia-models repository ships more than 400 ready-to-run inference scripts. Clone it once, install the shared requirements, then cd into any model folder and run the script. Pass -v 0 for webcam input or -i image.png for a still image; with neither flag, the script uses the bundled demo input. Running python3 launcher.py at the repository root opens a GUI for browsing every model.

git clone https://github.com/ailia-ai/ailia-models.git
cd ailia-models
pip3 install -r requirements.txt
cd object_detection/yolox
python3 yolox.py -v 0

System Requirements

ailia SDK runs across desktop, mobile, and embedded platforms with a wide range of GPU back-ends.

Operating Systems

  • Windows 10 / 11
  • macOS 11 or later
  • Linux (Ubuntu 20.04+)
  • iOS 13+
  • Android 7+
  • Jetson, Raspberry Pi

Languages

  • Python 3.6 or later
  • C++17
  • C# / Unity 2021.3.10f1+
  • Kotlin / Java (JNI)
  • Dart / Flutter 3.19+

GPU Acceleration

  • NVIDIA CUDA
  • Apple Metal
  • Vulkan (cross-platform)
  • Intel MKL
  • NPU via NNAPI (Android)

Model Formats

  • ONNX (primary)
  • TFLite (via ailia TFLite Runtime)
  • GGUF (via ailia LLM)
  • Convert from PyTorch / TensorFlow / Keras

Use the API in Your Project

A minimal Python example of loading an ONNX model and running inference.

import ailia
import numpy as np

# stream=None lets ailia derive the graph from the .onnx file
net = ailia.Net(stream=None, weight="model.onnx")

input_tensor = np.zeros((1, 3, 224, 224), dtype=np.float32)
output = net.run(input_tensor)
print(output[0].shape)

API Reference by Platform

FAQ

Common questions from first-time ailia SDK users.

What is included in ailia SDK?

Bundled with ailia SDK: the ONNX inference API (ailia) plus high-level helpers for common tasks — Classification for image classification, Detector for object detection, PoseEstimation for skeletal pose, and ailia Audio for audio pre/post-processing.

Supplemental libraries (separate packages): ailia Tokenizer, ailia Speech, ailia Voice, and ailia Tracker ship as separate libraries on top of the SDK.

What is the difference between the evaluation license and a production license?

For Python, Unity, Flutter, and JNI, the evaluation license is downloaded automatically at runtime. For C++, the evaluation license is valid for one month. In both cases it is intended for development and trial use only.

For commercial deployment, redistribution, or longer-term use, request a production license. See the ailia license terms for details.

How do I switch between CPU and GPU?

Pass an env_id to ailia.Net(). List available environments (CPU, CUDA, Metal, Vulkan, MKL) with ailia.get_environment_list(), then select the one you want.

By default, ailia chooses the fastest available environment for your platform.

To use CUDA, install the CUDA Toolkit and cuDNN. See the CUDA Toolkit / cuDNN Installation Guide for details.

To use Vulkan, see the Vulkan Setup Guide.
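The selection step described above can be sketched as follows. This is a minimal illustration, not the SDK's own helper: choose_env_id and load_net are hypothetical names, and the sketch assumes that each entry returned by ailia.get_environment_list() exposes id and name attributes. The sample environment names are made up for the demonstration.

```python
from types import SimpleNamespace

try:
    import ailia  # optional: only load_net() below needs it
except ImportError:
    ailia = None


def choose_env_id(environments, keyword):
    """Return the id of the first environment whose name contains keyword.

    Returns None when nothing matches, so the caller can leave env_id
    unset and let ailia pick its default environment.
    Assumption: each environment object has .id and .name attributes.
    """
    for env in environments:
        if keyword.lower() in env.name.lower():
            return env.id
    return None


def load_net(weight_path, keyword):
    """Hypothetical wrapper: open a model on the first matching environment."""
    if ailia is None:
        raise RuntimeError("ailia is not installed")
    env_id = choose_env_id(ailia.get_environment_list(), keyword)
    if env_id is None:
        # No match: fall back to ailia's default environment choice.
        return ailia.Net(stream=None, weight=weight_path)
    return ailia.Net(stream=None, weight=weight_path, env_id=env_id)


# Stand-in data shaped like the assumed environment objects.
sample_envs = [
    SimpleNamespace(id=0, name="CPU"),
    SimpleNamespace(id=1, name="CUDA-FP32"),  # hypothetical name
]
print(choose_env_id(sample_envs, "cuda"))    # 1
print(choose_env_id(sample_envs, "vulkan"))  # None
```

Matching on a substring of the name keeps the caller independent of the numeric ids, which may differ from machine to machine.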

Where are model files stored after I download them?

Sample scripts in ailia-models download .onnx and .onnx.prototxt files into the model's own directory the first time you run them. Subsequent runs reuse the cached files.

For ailia Speech and ailia Voice, models are downloaded into ./models/ by default, configurable via initialize_model(model_path=...).
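The download-once, reuse-afterwards behaviour described above follows a common pattern that can be sketched with the standard library alone. This is an illustration of the idea, not the helper that ailia-models actually uses: ensure_file, fake_fetch, and the example URL are all hypothetical.

```python
import os
import tempfile
import urllib.request


def ensure_file(path, url, fetch=urllib.request.urlretrieve):
    """Download url to path unless path already exists.

    Returns True when a download happened and False when the
    previously downloaded file was reused.
    """
    if os.path.exists(path):
        return False
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    fetch(url, path)
    return True


def fake_fetch(url, path):
    """Offline stand-in for a real download, so the demo needs no network."""
    with open(path, "wb") as f:
        f.write(b"stub")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "model.onnx")
    url = "https://example.com/model.onnx"  # placeholder URL
    first = ensure_file(target, url, fetch=fake_fetch)
    second = ensure_file(target, url, fetch=fake_fetch)

print(first, second)  # True False
```

Because the check is only for file existence, deleting a model file is enough to force a fresh download on the next run under this sketch.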

Can I run ailia SDK offline?

Yes, after the first run. The evaluation license and any auto-downloaded model files require an internet connection on first use; once cached, subsequent inference works offline.

For C++, the license is fetched once via download_license.py from the binding repository.

How do I convert a PyTorch or TensorFlow model to ONNX for ailia?

Export your model to ONNX with the framework's standard tooling (torch.onnx.export, tf2onnx, etc.), then generate the matching .onnx.prototxt using the script bundled with ailia SDK.

The model conversion tutorial walks through the process.

Is a prototxt file required?

No. A prototxt is optional when loading an ONNX model directly (e.g. ailia.Net(stream=None, weight="model.onnx")). The ailia-models repo ships prototxt files alongside the ONNX files so that Netron can visualize the models quickly, but your own models work without one.

How do I handle multiple input / output tensors?

The Python API accepts a name-to-array dict as the argument to net.run(), so multi-input models work out of the box and multi-output results come back as a list.

The C API's ailiaPredict only supports one input and one output. For multi-IO models, write each input via ailiaSetInputBlobData, run inference with ailiaUpdate, then read each output via ailiaGetBlobData. Use ailiaFindBlobIndexByName to look up blob indices by name.
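For the Python side, a small wrapper can make the list of outputs easier to work with by pairing each one with a name. The sketch below assumes only what is stated above: net.run() takes a name-to-array dict and returns a list. run_named, StubNet, and the tensor names are hypothetical; StubNet stands in for ailia.Net so the example runs without a model file, and plain lists stand in for numpy arrays.

```python
def run_named(net, inputs, output_names):
    """Run a multi-input model and return its outputs keyed by name.

    inputs is a name-to-array dict; output_names gives, in order,
    the labels the caller wants for the returned list of outputs.
    """
    outputs = net.run(inputs)
    if len(outputs) != len(output_names):
        raise ValueError(
            f"expected {len(output_names)} outputs, got {len(outputs)}"
        )
    return dict(zip(output_names, outputs))


class StubNet:
    """Stand-in for ailia.Net with two inputs and two outputs."""

    def run(self, inputs):
        image, scale = inputs["image"], inputs["scale"]
        scaled = [value * scale[0] for value in image]
        return [scaled, [float(len(image))]]


result = run_named(
    StubNet(),
    {"image": [1.0, 2.0, 3.0], "scale": [2.0]},
    ["scaled", "count"],
)
print(result)  # {'scaled': [2.0, 4.0, 6.0], 'count': [3.0]}
```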

How are tensor data types handled?

Input and output tensor buffers are always passed as float (FP32), regardless of the underlying ONNX data type. Internally, ailia executes the model using whatever data type the ONNX model defines (FP16, INT8 quantization, and so on). You can query the actual data type of any blob with ailiaGetBlobDataType.

What do AILIAShape's x / y / z / w correspond to?

AILIAShape represents up to 4 dimensions with the fields x, y, z, and w, plus dim. x is the innermost (memory-contiguous) axis and w is the outermost. For a numpy-style (batch, channel, height, width) 4-D tensor, that maps to w = batch, z = channel, y = height, x = width.

The dim field tells you how many of those axes are valid:

  • dim = 0: scalar (ONNX rank-0 tensor)
  • dim = 1: x only
  • dim = 2: x and y
  • dim = 3: x / y / z
  • dim = 4: x / y / z / w

Tensors with rank ≥ 5 cannot be expressed in AILIAShape. Use the ND variants instead — ailiaSetInputShapeND / ailiaGetOutputShapeND (with ailiaGetInputDim / ailiaGetOutputDim) — which take a flat unsigned int* shape array.
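The axis mapping above can be checked with a short conversion routine. This is a plain-Python sketch, not part of the SDK: Shape is a stand-in for the C struct, to_ailia_shape is a hypothetical helper, and filling unused axes with 1 is an assumption made here for illustration.

```python
from collections import namedtuple

# Plain-Python stand-in for the C struct; not the SDK's own type.
Shape = namedtuple("Shape", ["x", "y", "z", "w", "dim"])


def to_ailia_shape(np_shape):
    """Convert a numpy-style shape tuple (outermost axis first) to x/y/z/w + dim."""
    rank = len(np_shape)
    if rank > 4:
        raise ValueError("rank >= 5 needs the ND shape functions")
    # numpy lists the innermost axis last, so reverse to put x first.
    # Assumption: axes beyond dim are filled with 1.
    axes = list(reversed(np_shape)) + [1] * (4 - rank)
    x, y, z, w = axes
    return Shape(x, y, z, w, rank)


print(to_ailia_shape((1, 3, 224, 224)))
# Shape(x=224, y=224, z=3, w=1, dim=4)
print(to_ailia_shape((10, 5)))
# Shape(x=5, y=10, z=1, w=1, dim=2)
```

Note that the order is reversed relative to numpy: the last numpy axis becomes x, so a (batch, channel, height, width) tensor ends up with batch in w.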

I'm hitting AILIA_STATUS_UNSETTLED_SHAPE

When the ONNX model was exported with dynamic shapes, the engine cannot resolve tensor shapes until you tell it the actual input dimensions. Running inference or calling ailiaGetOutputShape before ailiaSetInputShape (or ailiaSetInputShapeND for rank ≥ 5) returns AILIA_STATUS_UNSETTLED_SHAPE (-18).

Python: net.run() inspects the input array's shape and calls ailiaSetInputShape for you, so no extra step is needed.

C / C# (Unity) / Kotlin (JNI) / Dart (Flutter): set the input shape explicitly via the equivalent ailiaSetInputShape call before running inference.

How do I profile inference performance?

A built-in profiler reports per-layer timing.

Python: call net.set_profile_mode(True), run inference, and read the per-layer summary with net.get_summary().

C: enable profiling with ailiaSetProfileMode, run inference, then call ailiaGetSummaryLength to get the buffer size and ailiaSummary to fill it with the summary string.
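The Python sequence can be wrapped in a small helper that also records wall-clock time for the whole call. This is a sketch under stated assumptions: profile_once and StubNet are hypothetical, and the helper relies only on the set_profile_mode, run, and get_summary methods named above. StubNet stands in for ailia.Net so the example runs without a model file.

```python
import time


def profile_once(net, inputs):
    """Run one profiled inference and return (outputs, seconds, summary)."""
    net.set_profile_mode(True)  # enable profiling before running
    start = time.perf_counter()
    outputs = net.run(inputs)
    elapsed = time.perf_counter() - start
    return outputs, elapsed, net.get_summary()


class StubNet:
    """Stand-in for ailia.Net that only tracks the profiling switch."""

    def __init__(self):
        self.profiling = False

    def set_profile_mode(self, enabled):
        self.profiling = enabled

    def run(self, inputs):
        return [inputs]

    def get_summary(self):
        return "profiling on" if self.profiling else "profiling off"


outputs, elapsed, summary = profile_once(StubNet(), [0.0])
print(summary)  # profiling on
```

With a real network, the summary string would hold the per-layer timing report, while elapsed gives a single end-to-end figure to compare across environments.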

How do I reduce memory usage?

The default mode is speed-first and keeps every intermediate tensor in memory. Enabling intermediate reuse and constant reduction lowers peak memory.

Python: build a bit flag with ailia.get_memory_mode and pass it to net.set_memory_mode:

memory_mode = ailia.get_memory_mode(
    reduce_constant=True,
    ignore_input_with_initializer=True,
    reduce_interstage=False,
    reuse_interstage=True,
)
net.set_memory_mode(memory_mode)

C: the equivalent control is ailiaSetMemoryMode. OR together flags such as AILIA_MEMORY_REDUCE_CONSTANT and AILIA_MEMORY_REUSE_INTERSTAGE.

Where can I get help?

For bug reports and questions on the sample repositories, open an issue on the relevant GitHub repo. For SDK licensing and commercial inquiries, contact ailia Inc. directly.

Materials

Release History

Release commentary articles for each version on the ailia tech blog.

Related Articles

Installation guides and binding tutorials from the ailia tech blog.

  • About ailia SDK (tech.ailia.ai)
  • How to install ailia SDK (tech.ailia.ai)
  • Install via Unity Package Manager (tech.ailia.ai)
  • Install via Flutter pubspec (tech.ailia.ai)
  • Evaluation version installable via pip (tech.ailia.ai)