ailia TFLite Runtime

AI inference engine for embedded environments such as NonOS and RTOS, exposing a TFLite-compatible inference API.

Getting Started

Choose your platform and run your first TFLite inference.

1. Install

Install the ailia TFLite Python package from PyPI.

pip3 install ailia_tflite
2. Run a Sample

ailia-models-tflite bundles per-model inference scripts. Clone the repository once, install the shared requirements, then cd into any model folder and run its script: pass -v 0 for webcam input or -i image.png for a still image (both variants are shown after the commands below). Running python3 launcher.py at the repository root opens a GUI that browses every quantized TFLite model.

git clone https://github.com/ailia-ai/ailia-models-tflite.git
cd ailia-models-tflite
pip3 install -r requirements.txt
cd object_detection/yolox
python3 yolox.py -v 0
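
The same sample accepts a still image instead of the webcam, and the GUI launcher lives at the repository root; both commands below restate the options described in this section (image.png is a placeholder for your own file).

python3 yolox.py -i image.png
cd ../..
python3 launcher.py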

System Requirements

ailia TFLite Runtime is a TensorFlow Lite-compatible inference engine written in C99 for desktop, mobile, and embedded use.

Operating Systems

  • Windows 10 / 11
  • macOS 11 or later
  • Linux (Ubuntu 20.04+)
  • Android 7+ (NPU via NNAPI)
  • NonOS / RTOS (embedded)

Languages

  • Python 3.6 or later
  • C99 (embedded-friendly)
  • C# / Unity 2021.3.10f1+
  • Kotlin / Java (JNI)

Acceleration

  • Intel MKL (PC)
  • Android NNAPI / NPU
  • Optimized C99 fallback

Model Formats

  • TFLite (.tflite)
  • INT8 quantized models
  • Drop-in replacement for tflite_runtime

Use the API in Your Project

A minimal example of loading a TFLite model and running inference. The Python API mirrors tflite_runtime.interpreter, so existing TFLite code works with a single import change.

import ailia_tflite
import numpy as np

# Load the model and allocate its input/output tensors.
interpreter = ailia_tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# Query tensor metadata (shape, dtype, index).
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's input shape, then run inference.
input_data = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

# Read back the first output tensor.
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
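
Since the constructor and method names match, porting an existing script is typically a one-line change. A minimal sketch of the swap, assuming only the drop-in parity described above (the commented-out line is the standard upstream import):

# Before, with upstream TensorFlow Lite:
#   from tflite_runtime.interpreter import Interpreter
# After, with ailia TFLite Runtime:
from ailia_tflite import Interpreter

interpreter = Interpreter(model_path="model.tflite")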

API Reference by Platform

  • C99

FAQ

Common questions about ailia TFLite Runtime.

Is it really a drop-in replacement for TensorFlow Lite?

Yes. The Python ailia_tflite.Interpreter class mirrors tflite_runtime.interpreter.Interpreter: same constructor and the same allocate_tensors() / set_tensor() / invoke() / get_tensor() methods. Existing TFLite Python scripts typically need only an import change, as shown in the example above.

Where does ailia TFLite shine compared to upstream TFLite?

Two main areas: high-speed PC inference via Intel MKL, and embedded deployment on NonOS / RTOS thanks to a lightweight C99 implementation. On Android it can also drive the on-device NPU through NNAPI.

You can compare against upstream behavior by passing --tflite to the sample scripts in ailia-models-tflite, as in the example below.
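
For instance, with the yolox sample from Getting Started (yolox.py appears earlier on this page; --tflite is the comparison flag described above):

# Same sample, executed through the upstream TFLite interpreter for comparison
python3 yolox.py -v 0 --tflite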

Does it support quantized models?

Yes. INT8 quantized TFLite models are first-class: quantization is the recommended path for embedded targets and NPUs. The model zoo at ailia-models-tflite is built around quantized variants.
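
A minimal sketch of feeding an INT8 model, assuming the tflite_runtime-style quantization metadata (the "quantization" entry of get_input_details(), a (scale, zero_point) pair) and a hypothetical model_int8.tflite file:

import ailia_tflite
import numpy as np

interpreter = ailia_tflite.Interpreter(model_path="model_int8.tflite")  # hypothetical file name
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
scale, zero_point = inp["quantization"]  # assumed tflite_runtime-compatible metadata

# Quantize float input data into the model's integer input type.
float_input = np.zeros(inp["shape"], dtype=np.float32)
quantized = np.round(float_input / scale + zero_point).astype(inp["dtype"])

interpreter.set_tensor(inp["index"], quantized)
interpreter.invoke()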

Can I use it on microcontrollers?

The C99 core is designed for NonOS / RTOS and small-footprint deployments. Specific MCU support depends on available memory and toolchain; contact ailia for embedded port details.

How do I switch back-ends (CPU / NPU / MKL)?

Pass env_id (and optionally flags or num_threads) when constructing ailia_tflite.Interpreter, as sketched below. The default chooses the fastest available back-end for your platform.
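
A minimal sketch of the constructor call; env_id, flags, and num_threads come from this answer, but the concrete values below are assumptions, so check the API reference for the environment IDs available on your platform:

import ailia_tflite

interpreter = ailia_tflite.Interpreter(
    model_path="model.tflite",
    env_id=0,        # assumed: 0 selects the default (automatic) environment
    num_threads=4,   # optional CPU thread count
)
interpreter.allocate_tensors()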

How does licensing work?

An evaluation license is downloaded automatically at runtime and is suitable for development and trial. For commercial deployment, including embedded redistribution, request a production license. See the ailia license terms.
