AI inference engine for embedded environments such as NonOS and RTOS, with a TFLite-compatible inference API.
Choose your platform and run your first TFLite inference.
Install the ailia TFLite Python package from PyPI.
pip3 install ailia_tflite
Haven't installed Python or git yet? Start with Setting up your Python environment (Windows / Mac / Linux).
ailia-models-tflite bundles per-model inference scripts. Clone the repository once, install the shared requirements, then cd into any model folder and run its script: pass -v 0 for webcam input or -i image.png for a still image. Running python3 launcher.py at the repo root opens a GUI that browses every quantized TFLite model.
git clone https://github.com/ailia-ai/ailia-models-tflite.git
cd ailia-models-tflite
pip3 install -r requirements.txt
cd object_detection/yolox
python3 yolox.py -v 0
On Windows, use python instead of python3.
ailia TFLite Runtime is a TensorFlow Lite-compatible inference engine written in C99 for desktop, mobile, and embedded use.
Minimal examples for loading a TFLite model and running inference. The Python API mirrors tflite_runtime.interpreter, so existing TFLite code works with one import change.
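import ailia_tflite
import numpy as np
# Load the model and allocate its tensors before any I/O.
interpreter = ailia_tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
# Describe the model's input and output tensors (shape, index, ...).
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Feed a zero-filled placeholder input; float32 suits a float model.
input_data = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)
# Run inference and read back the first output tensor.
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)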
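The example above hard-codes float32 for the input, which would not match an INT8 model. As a hedged sketch, assuming each entry returned by get_input_details() carries a "dtype" key as it does in tflite_runtime (not confirmed here for ailia_tflite), a placeholder input can be built to match whatever the model declares. make_dummy_input is a hypothetical helper name, and the details dictionary below is a stand-in so the snippet runs without a model file.

```python
import numpy as np

def make_dummy_input(detail):
    """Return a zero-filled array matching one input-details entry.

    Assumes `detail` has a "shape" key and, optionally, a "dtype" key
    (the layout used by tflite_runtime; assumed for ailia_tflite).
    """
    shape = tuple(int(d) for d in detail["shape"])
    dtype = detail.get("dtype", np.float32)  # fall back to float32
    return np.zeros(shape, dtype=dtype)

# Stand-in for interpreter.get_input_details()[0]; values are illustrative.
fake_detail = {"index": 0, "shape": np.array([1, 224, 224, 3]), "dtype": np.int8}
dummy = make_dummy_input(fake_detail)
print(dummy.shape, dummy.dtype)
```

With a real interpreter, the same helper would be called as make_dummy_input(interpreter.get_input_details()[0]) before set_tensor().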
Common questions about ailia TFLite Runtime.
Yes. The Python ailia_tflite.Interpreter class mirrors tflite_runtime.interpreter.Interpreter — same constructor, same allocate_tensors() / set_tensor() / invoke() / get_tensor() methods. Existing TFLite Python scripts typically need only an import change.
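One way to make the import change reversible is to pick the runtime at import time. The sketch below is an illustration rather than a documented pattern: load_interpreter_class is a hypothetical helper that tries ailia_tflite first, falls back to tflite_runtime.interpreter, and returns None if neither package is installed.

```python
import importlib

def load_interpreter_class():
    """Return the first available Interpreter class, or None."""
    candidates = [
        ("ailia_tflite", "Interpreter"),
        ("tflite_runtime.interpreter", "Interpreter"),
    ]
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except (ImportError, OSError):
            # Package missing, or its native library failed to load.
            continue
        cls = getattr(module, attr, None)
        if cls is not None:
            return cls
    return None

Interpreter = load_interpreter_class()
if Interpreter is None:
    print("No TFLite-compatible runtime is installed")
else:
    print("Using", Interpreter.__module__)
```

The rest of a script can then call Interpreter(model_path="model.tflite") without caring which package supplied the class.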
Two main areas: high-speed PC inference via Intel MKL, and embedded deployment on NonOS / RTOS thanks to a lightweight C99 implementation. On Android it can also drive the on-device NPU through NNAPI.
You can compare against upstream behaviour by passing --tflite to the sample scripts in ailia-models-tflite.
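If you capture the output arrays from both runtimes yourself, a numerical check is straightforward. The sketch below is not what the --tflite flag does internally; compare_outputs is a hypothetical helper, the tolerance is an arbitrary starting point, and the two arrays are made-up stand-ins for real outputs.

```python
import numpy as np

def compare_outputs(reference, candidate, atol=1e-3):
    """Return (max_abs_diff, within_tolerance) for two output arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError("output shapes differ: %s vs %s"
                         % (reference.shape, candidate.shape))
    if reference.size == 0:
        return 0.0, True
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    return max_abs_diff, max_abs_diff <= atol

# Illustrative values only; replace with outputs from the two runtimes.
upstream_out = [0.10, 0.70, 0.20]
ailia_out = [0.1002, 0.6999, 0.2001]
diff, ok = compare_outputs(upstream_out, ailia_out)
print("max abs diff:", diff, "within tolerance:", ok)
```

For INT8 outputs, comparing the raw integer tensors with atol=1 (one quantization step) is a common choice, though the right threshold depends on the model.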
Yes. INT8 quantized TFLite models are first-class — quantization is the recommended path for embedded targets and NPUs. The model zoo at ailia-models-tflite is built around quantized variants.
Read the per-tensor quantization parameters via ailiaTFLiteGetTensorQuantizationScale and ailiaTFLiteGetTensorQuantizationZeroPoint.
Float → Int8: q = round(f / scale) + zero_point
Int8 → Float: f = (q − zero_point) × scale
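The two formulas above can be sketched in NumPy as follows. The function names are hypothetical, the clamp to the int8 range [-128, 127] is an addition not stated in the formulas, and np.round rounds exact halves to the nearest even value, which may differ from a given runtime's rounding mode.

```python
import numpy as np

def quantize(f, scale, zero_point):
    """Float -> int8: q = round(f / scale) + zero_point, clamped to int8."""
    q = np.round(np.asarray(f, dtype=np.float32) / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Int8 -> float: f = (q - zero_point) * scale."""
    return (np.asarray(q).astype(np.float32) - zero_point) * scale

# Illustrative parameters; real values come from the model's tensors.
scale, zero_point = 0.5, -3
values = [-1.0, 0.0, 0.74, 100.0]
q = quantize(values, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q)
print(restored)
```

The round trip is lossy: 0.74 comes back as 0.5 because the step size is one scale, and 100.0 saturates at the top of the int8 range.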
By default a TFLite tensor is quantized per-tensor, with a single scale and a single zero_point for the whole tensor. Weights (e.g. Conv kernels) can instead use per-channel scales (per-axis quantization): one scale for each slice along the quantized dimension, with the zero_point fixed at 0. Inspect the per-channel layout with ailiaTFLiteGetTensorQuantizationCount and ailiaTFLiteGetTensorQuantizationQuantizedDimension.
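Per-channel dequantization can be sketched as a broadcast over the quantized dimension. This is an illustration in NumPy, not a call into the ailia C API: dequantize_per_channel is a hypothetical helper that takes the scales and the quantized dimension as plain arguments, and the weights below are made up.

```python
import numpy as np

def dequantize_per_channel(q, scales, quantized_dimension):
    """Dequantize weights that have one scale per slice of one axis.

    The zero_point is taken to be 0, as for per-channel weights.
    """
    q = np.asarray(q)
    scales = np.asarray(scales, dtype=np.float32)
    if scales.ndim != 1 or scales.shape[0] != q.shape[quantized_dimension]:
        raise ValueError("need one scale per slice of the quantized axis")
    # Reshape the scales so they broadcast along the quantized axis only.
    shape = [1] * q.ndim
    shape[quantized_dimension] = -1
    return q.astype(np.float32) * scales.reshape(shape)

# Two output channels (axis 0), each with its own scale.
weights_q = np.array([[10, -20, 30],
                      [10, -20, 30]], dtype=np.int8)
weights_f = dequantize_per_channel(weights_q, [0.1, 0.5], quantized_dimension=0)
print(weights_f)
```

The same integer row maps to different float values in each channel, which is the point of keeping a separate scale per channel.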
For the full specification see the TensorFlow Lite 8-bit quantization spec.
The C99 core is designed for NonOS / RTOS and small footprint deployments. Specific MCU support depends on available memory and toolchain — contact ailia for embedded port details.
Pass env_id (and optional flags / num_threads) when constructing ailia_tflite.Interpreter. The default uses the CPU back-end with MKL; to run on the Android NPU you must specify AILIA_TFLITE_ENV_NNAPI (=1).
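A minimal sketch of assembling those constructor arguments is shown below. The keyword names env_id and num_threads and the value 1 for AILIA_TFLITE_ENV_NNAPI come from the text above; the exact constructor signature is otherwise an assumption, and build_interpreter_kwargs is a hypothetical helper. The constant is defined locally so the snippet runs without the package installed.

```python
AILIA_TFLITE_ENV_NNAPI = 1  # value quoted in the text above

def build_interpreter_kwargs(model_path, use_nnapi=False, num_threads=None):
    """Collect keyword arguments for ailia_tflite.Interpreter.

    Omitting env_id leaves the runtime on its default CPU back-end.
    """
    kwargs = {"model_path": model_path}
    if use_nnapi:
        kwargs["env_id"] = AILIA_TFLITE_ENV_NNAPI
    if num_threads is not None:
        kwargs["num_threads"] = int(num_threads)
    return kwargs

kwargs = build_interpreter_kwargs("model.tflite", use_nnapi=True, num_threads=2)
print(kwargs)
# With the package installed (assumed usage):
#   interpreter = ailia_tflite.Interpreter(**kwargs)
```

Keeping the arguments in a dictionary makes it easy to switch between the CPU default and NNAPI from a command-line flag or a config file.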
An evaluation license is downloaded automatically at runtime, suitable for development and trial. For commercial deployment — including embedded redistribution — request a production license. See the ailia license terms.
Model deep dives, release notes, and tutorials from the ailia tech blog.