Model Conversion Detailed Explanation

4.1 Basic Concepts of Model Conversion

What is Model Conversion

Definition of Model Conversion

Model conversion is the process of converting a trained deep learning model from one format to another. In RK3568 NPU development, it mainly means converting models from common deep learning frameworks (such as PyTorch, TensorFlow, or ONNX) into the RKNN format so they can run efficiently on the Rockchip NPU.

Necessity of Conversion

Original Model (PyTorch/TensorFlow/ONNX)
    ↓
Model Optimization (Graph Optimization, Operator Fusion)
    ↓
Quantization Processing (FP32 → INT8/INT16)
    ↓
Hardware Adaptation (NPU Instruction Set Mapping)
    ↓
RKNN Model (Runnable on RK3568 NPU)

Main Purposes of Conversion:

  • Hardware Adaptation: Adapt general models to specific NPU hardware.
  • Performance Optimization: Improve inference speed through graph optimization and operator fusion.
  • Memory Optimization: Reduce model size and runtime memory usage (see the size estimate below).
  • Quantization Acceleration: Quantize FP32 models to INT8 to improve inference speed.
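
As a back-of-envelope illustration of the memory and quantization points, the sketch below computes what INT8 quantization alone does to model size. The ~11.7M parameter count for ResNet18 is an assumption for illustration; analyze_model_complexity() in section 4.2 computes the real count from an ONNX file.

# Rough effect of INT8 quantization on model size
# (assumes ResNet18's roughly 11.7M parameters)
params = 11_700_000
fp32_mb = params * 4 / 1024 / 1024   # 4 bytes per FP32 weight
int8_mb = params * 1 / 1024 / 1024   # 1 byte per INT8 weight
print(f"FP32: {fp32_mb:.1f} MB  ->  INT8: {int8_mb:.1f} MB (about 4x smaller)")

Weight memory traffic shrinks in the same proportion, which is where much of the INT8 inference speedup on the NPU comes from.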

Supported Model Formats

Input Format Support

Framework        | Format                  | Supported Version | Remarks
ONNX             | .onnx                   | 1.6-1.12          | Recommended format, best compatibility
TensorFlow       | .pb                     | 1.x, 2.x          | Requires frozen graph
TensorFlow Lite  | .tflite                 | 2.x               | Lightweight model
Caffe            | .prototxt + .caffemodel | 1.0               | Classic framework
DarkNet          | .cfg + .weights         | -                 | YOLO series models

Recommended Conversion Path

PyTorch → ONNX → RKNN (Recommended)
TensorFlow → ONNX → RKNN (Recommended)
TensorFlow → TensorFlow Lite → RKNN
Caffe → RKNN (Direct Conversion)
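
The PyTorch → ONNX hop of the recommended path is a single torch.onnx.export call. A minimal sketch (the full download-and-export scripts appear in section 4.2):

import torch
import torchvision.models as models

# Minimal PyTorch -> ONNX export, the first hop of the recommended path
model = models.resnet18(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)  # NCHW input the model expects
torch.onnx.export(model, dummy, "resnet18.onnx",
                  opset_version=11,  # opset used throughout this chapter
                  input_names=['input'], output_names=['output'])

The resulting resnet18.onnx can then be fed to RKNN-Toolkit2 as shown in section 4.3.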

Quantization Technology in Detail

Quantization Type Comparison

Quantization Type | Precision | Speed  | Model Size | Applicable Scenarios
FP32              | Highest   | Slow   | Large      | Extremely high precision requirements
FP16              | High      | Medium | Medium     | Balance of precision and performance
INT8              | Medium    | Fast   | Small      | Most application scenarios
Mixed Precision   | High      | Fast   | Small      | Keep high precision for key layers

Quantization Strategy

# Symmetric Quantization vs Asymmetric Quantization
symmetric_quantization = {
    "range": "[-127, 127]",
    "zero_point": 0,
    "advantages": "Simple calculation, hardware friendly",
    "disadvantages": "May waste quantization range"
}

asymmetric_quantization = {
    "range": "[0, 255] or [-128, 127]",
    "zero_point": "Non-zero",
    "advantages": "Fully utilize quantization range",
    "disadvantages": "Slightly higher calculation complexity"
}
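
To make the two schemes concrete, the sketch below computes scale and zero-point for a small tensor under both conventions. These are the standard textbook formulas, not RKNN-specific code:

import numpy as np

def symmetric_params(x, n_bits=8):
    """Symmetric: zero_point fixed at 0, scale covers max(|x|)."""
    qmax = 2 ** (n_bits - 1) - 1  # 127 for INT8
    scale = np.max(np.abs(x)) / qmax
    return scale, 0

def asymmetric_params(x, n_bits=8):
    """Asymmetric: full [0, 255] range with a non-zero zero_point."""
    qmin, qmax = 0, 2 ** n_bits - 1  # [0, 255] for UINT8
    scale = (np.max(x) - np.min(x)) / (qmax - qmin)
    zero_point = int(round(qmin - np.min(x) / scale))
    return scale, zero_point

x = np.array([-0.2, 0.0, 0.5, 1.3], dtype=np.float32)
for name, fn in (("symmetric", symmetric_params), ("asymmetric", asymmetric_params)):
    scale, zp = fn(x)
    q = np.round(x / scale) + zp      # quantize
    x_hat = (q - zp) * scale          # dequantize
    print(f"{name}: scale={scale:.4f} zero_point={zp} max_err={np.max(np.abs(x - x_hat)):.4f}")

Note how the symmetric scheme spends half its range on negative values this tensor barely uses, which is exactly the "may waste quantization range" drawback listed above.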

Conversion Process Overview

Complete Conversion Workflow

graph TD
    A[Original Model] --> B[Model Verification]
    B --> C[Preprocessing Config]
    C --> D[Quantization Data Prep]
    D --> E[Model Conversion]
    E --> F[Precision Verification]
    F --> G[Performance Test]
    G --> H[Model Optimization]
    H --> I[Final Deployment]

Key Steps Explanation

  1. Model Verification: Ensure the original model can infer normally.
  2. Preprocessing Config: Set preprocessing parameters for input data.
  3. Quantization Data Prep: Prepare representative dataset for quantization calibration.
  4. Model Conversion: Execute the actual conversion process.
  5. Precision Verification: Compare precision differences before and after conversion.
  6. Performance Test: Test inference performance of the converted model.
  7. Model Optimization: Perform further optimization based on test results.

4.2 Prepare Model for Conversion

Get Pretrained Model

Get ONNX Model from Official Source

#!/usr/bin/env python3
# download_models.py

import torch
import torchvision.models as models
import os

def download_classification_models():
    """Download classification models"""
    models_info = {
        'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
        'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
        'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
        'efficientnet_b0': 'https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth'
    }
    
    os.makedirs('models/classification', exist_ok=True)
    
    for model_name, url in models_info.items():
        print(f"Downloading {model_name}...")
        
        # Load pretrained model
        if model_name == 'resnet18':
            model = models.resnet18(pretrained=True)
        elif model_name == 'resnet50':
            model = models.resnet50(pretrained=True)
        elif model_name == 'mobilenet_v2':
            model = models.mobilenet_v2(pretrained=True)
        elif model_name == 'efficientnet_b0':
            model = models.efficientnet_b0(pretrained=True)
        
        model.eval()
        
        # Export as ONNX
        dummy_input = torch.randn(1, 3, 224, 224)
        onnx_path = f'models/classification/{model_name}.onnx'
        
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            do_constant_folding=True,
            input_names=['input'],
            output_names=['output'],
            dynamic_axes={
                'input': {0: 'batch_size'},
                'output': {0: 'batch_size'}
            }
        )
        
        print(f"ONNX model saved to: {onnx_path}")

def download_yolo_models():
    """Download YOLO models via torch.hub (clones the ultralytics/yolov5 repo)"""
    
    os.makedirs('models/detection', exist_ok=True)
    
    # YOLOv5 models
    yolo_models = ['yolov5s', 'yolov5m', 'yolov5l']
    
    for model_name in yolo_models:
        print(f"Downloading {model_name}...")
        
        # Load model
        model = torch.hub.load('ultralytics/yolov5', model_name, pretrained=True)
        model.eval()
        
        # Export ONNX
        dummy_input = torch.randn(1, 3, 640, 640)
        onnx_path = f'models/detection/{model_name}.onnx'
        
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            do_constant_folding=True,
            input_names=['images'],
            output_names=['output'],
            dynamic_axes={
                'images': {0: 'batch_size'},
                'output': {0: 'batch_size'}
            }
        )
        
        print(f"ONNX model saved to: {onnx_path}")

if __name__ == "__main__":
    download_classification_models()
    download_yolo_models()

Get Model from Hugging Face

#!/usr/bin/env python3
# download_huggingface_models.py

from transformers import AutoModel, AutoTokenizer
import torch
import os

def download_transformer_models():
    """Download Transformer models"""
    models_info = {
        'bert-base-uncased': 'bert-base-uncased',
        'distilbert-base-uncased': 'distilbert-base-uncased',
        'roberta-base': 'roberta-base'
    }
    
    os.makedirs('models/nlp', exist_ok=True)
    
    for model_name, model_id in models_info.items():
        print(f"Downloading {model_name}...")
        
        # Download model and tokenizer
        model = AutoModel.from_pretrained(model_id)
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        
        # Save model
        model_dir = f'models/nlp/{model_name}'
        model.save_pretrained(model_dir)
        tokenizer.save_pretrained(model_dir)
        
        # Export ONNX (Example)
        model.eval()
        dummy_input = torch.randint(0, 1000, (1, 128))  # Sequence length 128
        
        onnx_path = f'{model_dir}/{model_name}.onnx'
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            input_names=['input_ids'],
            output_names=['last_hidden_state'],
            dynamic_axes={
                'input_ids': {0: 'batch_size', 1: 'sequence'},
                'last_hidden_state': {0: 'batch_size', 1: 'sequence'}
            }
        )
        
        print(f"Model saved to: {model_dir}")

if __name__ == "__main__":
    download_transformer_models()

Model Verification and Preprocessing

Model Integrity Check

#!/usr/bin/env python3
# model_validation.py

import onnx
import onnxruntime as ort
import numpy as np

def validate_onnx_model(model_path):
    """Validate ONNX model integrity"""
    try:
        # Load model
        model = onnx.load(model_path)
        
        # Check model
        onnx.checker.check_model(model)
        print(f"✓ Model {model_path} validation passed")
        
        # Print model info
        print(f"Model version: {model.ir_version}")
        print(f"Producer: {model.producer_name}")
        print(f"Opset version: {[opset.version for opset in model.opset_import]}")
        
        # Print input/output info
        print("\nInput Info:")
        for input_tensor in model.graph.input:
            print(f"  Name: {input_tensor.name}")
            print(f"  Shape: {[dim.dim_value for dim in input_tensor.type.tensor_type.shape.dim]}")
            print(f"  Type: {input_tensor.type.tensor_type.elem_type}")
        
        print("\nOutput Info:")
        for output_tensor in model.graph.output:
            print(f"  Name: {output_tensor.name}")
            print(f"  Shape: {[dim.dim_value for dim in output_tensor.type.tensor_type.shape.dim]}")
            print(f"  Type: {output_tensor.type.tensor_type.elem_type}")
        
        return True
        
    except Exception as e:
        print(f"✗ Model validation failed: {e}")
        return False

def test_onnx_inference(model_path, input_shape):
    """Test ONNX model inference"""
    try:
        # Create inference session
        session = ort.InferenceSession(model_path)
        
        # Get input/output names
        input_name = session.get_inputs()[0].name
        output_name = session.get_outputs()[0].name
        
        # Create random input
        dummy_input = np.random.randn(*input_shape).astype(np.float32)
        
        # Run inference
        result = session.run([output_name], {input_name: dummy_input})
        
        print(f"✓ Inference test successful")
        print(f"Input shape: {dummy_input.shape}")
        print(f"Output shape: {result[0].shape}")
        
        return True
        
    except Exception as e:
        print(f"✗ Inference test failed: {e}")
        return False

def analyze_model_complexity(model_path):
    """Analyze model complexity"""
    model = onnx.load(model_path)
    
    # Count node types
    node_types = {}
    for node in model.graph.node:
        op_type = node.op_type
        node_types[op_type] = node_types.get(op_type, 0) + 1
    
    print(f"\nModel Complexity Analysis:")
    print(f"Total nodes: {len(model.graph.node)}")
    print(f"Node type distribution:")
    for op_type, count in sorted(node_types.items()):
        print(f"  {op_type}: {count}")
    
    # Estimate parameters
    total_params = 0
    for initializer in model.graph.initializer:
        param_size = 1
        for dim in initializer.dims:
            param_size *= dim
        total_params += param_size
    
    print(f"Estimated parameters: {total_params:,}")
    print(f"Estimated model size: {total_params * 4 / 1024 / 1024:.2f} MB (FP32)")

if __name__ == "__main__":
    # Test example
    model_path = "models/classification/resnet18.onnx"
    
    if validate_onnx_model(model_path):
        test_onnx_inference(model_path, (1, 3, 224, 224))
        analyze_model_complexity(model_path)

Prepare Quantization Dataset

Create Quantization Calibration Dataset

#!/usr/bin/env python3
# prepare_calibration_dataset.py

import os
import cv2
import numpy as np
import random
from pathlib import Path

class CalibrationDataset:
    """Quantization Calibration Dataset"""
    
    def __init__(self, data_dir, input_size=(224, 224), num_samples=100):
        self.data_dir = Path(data_dir)
        self.input_size = input_size
        self.num_samples = num_samples
        self.image_paths = self._collect_images()
        
    def _collect_images(self):
        """Collect image paths"""
        extensions = ['.jpg', '.jpeg', '.png', '.bmp']
        image_paths = []
        
        for ext in extensions:
            image_paths.extend(self.data_dir.glob(f"**/*{ext}"))
            image_paths.extend(self.data_dir.glob(f"**/*{ext.upper()}"))
        
        # Random sampling
        if len(image_paths) > self.num_samples:
            image_paths = random.sample(image_paths, self.num_samples)
        
        print(f"Collected {len(image_paths)} calibration images")
        return image_paths
    
    def preprocess_image(self, image_path):
        """Image preprocessing"""
        # Read image
        image = cv2.imread(str(image_path))
        if image is None:
            return None
        
        # Convert color space
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        # Resize
        image = cv2.resize(image, self.input_size)
        
        # Normalize
        image = image.astype(np.float32) / 255.0
        
        # ImageNet standardization
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        image = (image - mean) / std
        
        # Convert to NCHW format
        image = np.transpose(image, (2, 0, 1))
        
        return image
    
    def generate_calibration_data(self, output_file):
        """Generate calibration data"""
        calibration_data = []
        
        print("Generating calibration data...")
        for i, image_path in enumerate(self.image_paths):
            processed_image = self.preprocess_image(image_path)
            if processed_image is not None:
                calibration_data.append(processed_image)
            
            if (i + 1) % 20 == 0:
                print(f"Processing progress: {i + 1}/{len(self.image_paths)}")
        
        # Convert to numpy array
        calibration_data = np.array(calibration_data)
        
        # Save
        np.save(output_file, calibration_data)
        print(f"Calibration data saved to: {output_file}")
        print(f"Data shape: {calibration_data.shape}")
        
        return calibration_data

def download_imagenet_samples():
    """Download ImageNet sample data"""
    import urllib.request
    
    # Sample image URLs (examples from the pytorch/hub repo)
    sample_urls = [
        "https://github.com/pytorch/hub/raw/master/images/dog.jpg",
        "https://github.com/pytorch/hub/raw/master/images/deeplab1.png",
        # Add more sample URLs
    ]
    
    os.makedirs("calibration_data/imagenet_samples", exist_ok=True)
    
    for i, url in enumerate(sample_urls):
        try:
            filename = f"calibration_data/imagenet_samples/sample_{i:03d}.jpg"
            urllib.request.urlretrieve(url, filename)
            print(f"Downloaded: {filename}")
        except Exception as e:
            print(f"Download failed {url}: {e}")

def create_synthetic_dataset(output_dir, num_samples=100, input_size=(224, 224)):
    """Create synthetic dataset (for testing)"""
    os.makedirs(output_dir, exist_ok=True)
    
    print(f"Creating synthetic dataset: {num_samples} images")
    
    for i in range(num_samples):
        # Generate random image
        image = np.random.randint(0, 256, (input_size[1], input_size[0], 3), dtype=np.uint8)
        
        # Add some structure
        cv2.rectangle(image, (50, 50), (150, 150), (255, 0, 0), -1)
        cv2.circle(image, (100, 100), 30, (0, 255, 0), -1)
        
        # Save image
        filename = f"{output_dir}/synthetic_{i:03d}.jpg"
        cv2.imwrite(filename, image)
    
    print(f"Synthetic dataset creation completed: {output_dir}")

if __name__ == "__main__":
    # Create calibration dataset
    
    # Method 1: Use existing image directory
    if os.path.exists("path/to/your/images"):
        dataset = CalibrationDataset("path/to/your/images")
        dataset.generate_calibration_data("calibration_data.npy")
    
    # Method 2: Download sample data
    download_imagenet_samples()
    
    # Method 3: Create synthetic dataset
    create_synthetic_dataset("calibration_data/synthetic", num_samples=50)
    
    # Use synthetic dataset
    dataset = CalibrationDataset("calibration_data/synthetic")
    dataset.generate_calibration_data("calibration_data_synthetic.npy")
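
Note that RKNN-Toolkit2's build() takes its dataset argument as the path of a text file listing calibration samples (image or .npy paths, one per line), not as an in-memory array. A minimal sketch bridging the image directories created above to that format (script and file names here are illustrative):

#!/usr/bin/env python3
# make_dataset_txt.py

from pathlib import Path

def write_dataset_txt(image_dir, output_file="dataset.txt", limit=100):
    """Write the dataset.txt list consumed by rknn.build(dataset=...)."""
    paths = sorted(Path(image_dir).glob("*.jpg"))[:limit]
    with open(output_file, "w") as f:
        for p in paths:
            f.write(f"{p.resolve()}\n")
    print(f"Wrote {len(paths)} entries to {output_file}")

if __name__ == "__main__":
    # e.g. the synthetic dataset generated above
    write_dataset_txt("calibration_data/synthetic")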

4.3 RKNN-Toolkit2 Conversion API Explanation

Core API Introduction

RKNN Class Basic Usage

#!/usr/bin/env python3
# rknn_api_tutorial.py

from rknn.api import RKNN
import numpy as np

class RKNNConverter:
    """RKNN Converter Wrapper Class"""
    
    def __init__(self, verbose=True):
        self.rknn = RKNN(verbose=verbose)
        self.model_loaded = False
        self.model_built = False
    
    def configure_model(self, target_platform='rk3568', **kwargs):
        """Configure model conversion parameters"""
        config_params = {
            'target_platform': target_platform,
            'quantized_dtype': kwargs.get('quantized_dtype', 'asymmetric_quantized-u8'),
            'optimization_level': kwargs.get('optimization_level', 3),
            'output_optimize': kwargs.get('output_optimize', 1),
            'compress_weight': kwargs.get('compress_weight', False),
            'single_core_mode': kwargs.get('single_core_mode', False),
            'model_pruning': kwargs.get('model_pruning', False)
        }
        
        print("Configuring conversion parameters:")
        for key, value in config_params.items():
            print(f"  {key}: {value}")
        
        ret = self.rknn.config(**config_params)
        if ret != 0:
            raise Exception("Model configuration failed!")
        
        return ret
    
    def load_model(self, model_path, model_type='onnx'):
        """Load model"""
        print(f"Loading {model_type.upper()} model: {model_path}")
        
        if model_type.lower() == 'onnx':
            ret = self.rknn.load_onnx(model=model_path)
        elif model_type.lower() == 'tensorflow':
            ret = self.rknn.load_tensorflow(
                tf_pb=model_path,
                inputs=['input'],
                outputs=['output'],
                input_size_list=[[1, 224, 224, 3]]
            )
        elif model_type.lower() == 'tflite':
            ret = self.rknn.load_tflite(model=model_path)
        elif model_type.lower() == 'caffe':
            ret = self.rknn.load_caffe(
                model=model_path + '.prototxt',
                blobs=model_path + '.caffemodel'
            )
        else:
            raise ValueError(f"Unsupported model type: {model_type}")
        
        if ret != 0:
            raise Exception(f"{model_type.upper()} model loading failed!")
        
        self.model_loaded = True
        print("Model loaded successfully!")
        return ret
    
    def build_model(self, do_quantization=True, dataset=None):
        """Build model"""
        if not self.model_loaded:
            raise Exception("Please load model first!")
        
        print("Starting model build...")
        
        build_params = {'do_quantization': do_quantization}
        
        if do_quantization and dataset is not None:
            print("Using custom dataset for quantization...")
            build_params['dataset'] = dataset  # path of a dataset.txt listing calibration samples
        
        ret = self.rknn.build(**build_params)
        if ret != 0:
            raise Exception("Model build failed!")
        
        self.model_built = True
        print("Model built successfully!")
        return ret
    
    def export_model(self, export_path):
        """Export RKNN model"""
        if not self.model_built:
            raise Exception("Please build model first!")
        
        print(f"Exporting model to: {export_path}")
        ret = self.rknn.export_rknn(export_path)
        if ret != 0:
            raise Exception("Model export failed!")
        
        print("Model export successfully!")
        return ret
    
    def init_runtime(self, target='rk3568'):
        """Initialize runtime"""
        print(f"Initializing runtime (target: {target})...")
        ret = self.rknn.init_runtime(target=target)
        if ret != 0:
            raise Exception("Runtime initialization failed!")
        
        print("Runtime initialized successfully!")
        return ret
    
    def inference(self, inputs):
        """Execute inference"""
        return self.rknn.inference(inputs=inputs)
    
    def release(self):
        """Release resources"""
        if self.rknn:
            self.rknn.release()
            print("Resource release completed")

# Usage example
def basic_conversion_example():
    """Basic conversion example"""
    converter = RKNNConverter(verbose=True)
    
    try:
        # 1. Configure parameters
        converter.configure_model(
            target_platform='rk3568',
            quantized_dtype='asymmetric_quantized-u8',
            optimization_level=3
        )
        
        # 2. Load model
        converter.load_model('models/classification/resnet18.onnx', 'onnx')
        
        # 3. Build model
        converter.build_model(do_quantization=True)
        
        # 4. Export model
        converter.export_model('resnet18_rk3568.rknn')
        
        # 5. Test inference (Optional)
        converter.init_runtime()
        dummy_input = np.random.randn(1, 3, 224, 224).astype(np.float32)
        outputs = converter.inference([dummy_input])
        print(f"Inference output shape: {outputs[0].shape}")
        
    finally:
        converter.release()

if __name__ == "__main__":
    basic_conversion_example()

Advanced Configuration Options

Quantization Configuration in Detail

#!/usr/bin/env python3
# advanced_quantization.py

from rknn.api import RKNN
import numpy as np

def configure_quantization_options():
    """Configure quantization options"""
    
    # Quantization data type options
    quantization_types = {
        'asymmetric_quantized-u8': {
            'description': 'Asymmetric 8-bit unsigned integer quantization',
            'range': '[0, 255]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': True
        },
        'asymmetric_quantized-i8': {
            'description': 'Asymmetric 8-bit signed integer quantization',
            'range': '[-128, 127]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': False
        },
        'symmetric_quantized-u8': {
            'description': 'Symmetric 8-bit unsigned integer quantization',
            'range': '[0, 255]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': False
        },
        'dynamic_fixed_point-i8': {
            'description': 'Dynamic fixed-point 8-bit quantization',
            'range': '[-128, 127]',
            'precision': 'High',
            'speed': 'Medium',
            'recommended': False
        },
        'dynamic_fixed_point-i16': {
            'description': 'Dynamic fixed-point 16-bit quantization',
            'range': '[-32768, 32767]',
            'precision': 'Very High',
            'speed': 'Slow',
            'recommended': False
        }
    }
    
    print("Supported quantization types:")
    for qtype, info in quantization_types.items():
        print(f"\n{qtype}:")
        for key, value in info.items():
            print(f"  {key}: {value}")
    
    return quantization_types

def advanced_quantization_config():
    """Advanced quantization configuration example"""
    rknn = RKNN(verbose=True)
    
    # Advanced configuration options
    advanced_config = {
        'target_platform': 'rk3568',
        'quantized_dtype': 'asymmetric_quantized-u8',
        'optimization_level': 3,  # 0-3, 3 is highest optimization level
        'output_optimize': 1,     # Output optimization
        'compress_weight': True,  # Weight compression
        'single_core_mode': False, # Single core mode
        'model_pruning': False,   # Model pruning
        'quantized_algorithm': 'normal',  # Quantization algorithm
        'quantized_method': 'channel',    # Quantization method
        'float_dtype': 'float16'  # Float data type
    }
    
    print("Advanced configuration parameters:")
    for key, value in advanced_config.items():
        print(f"  {key}: {value}")
    
    ret = rknn.config(**advanced_config)
    rknn.release()
    
    return ret

def mixed_precision_quantization():
    """Mixed precision quantization example"""
    rknn = RKNN(verbose=True)
    
    # Mixed precision configuration
    # Keep certain layers high precision, others use low precision
    mixed_precision_config = {
        'target_platform': 'rk3568',
        'quantized_dtype': 'asymmetric_quantized-u8',
        'optimization_level': 3,
        # Specify quantization type for specific layers
        'quantize_input_node': False,  # Input node not quantized
        'quantize_output_node': False, # Output node not quantized
    }
    
    ret = rknn.config(**mixed_precision_config)
    rknn.release()
    
    return ret

if __name__ == "__main__":
    configure_quantization_options()
    advanced_quantization_config()
    mixed_precision_quantization()

Optimization Levels in Detail

#!/usr/bin/env python3
# optimization_levels.py

import os
import time

import numpy as np

def explain_optimization_levels():
    """Explain optimization levels"""
    
    optimization_levels = {
        0: {
            'name': 'No Optimization',
            'description': 'Keep original model structure, no optimization',
            'speed': 'Slow',
            'accuracy': 'Highest',
            'model_size': 'Large',
            'use_case': 'Debugging and accuracy comparison'
        },
        1: {
            'name': 'Basic Optimization',
            'description': 'Basic graph optimization, such as constant folding',
            'speed': 'Medium',
            'accuracy': 'High',
            'model_size': 'Medium',
            'use_case': 'Balance accuracy and performance'
        },
        2: {
            'name': 'Standard Optimization',
            'description': 'Includes operator fusion and memory optimization',
            'speed': 'Fast',
            'accuracy': 'Medium',
            'model_size': 'Small',
            'use_case': 'Most application scenarios'
        },
        3: {
            'name': 'Aggressive Optimization',
            'description': 'Maximum optimization, may affect accuracy',
            'speed': 'Fastest',
            'accuracy': 'Medium to Low',
            'model_size': 'Smallest',
            'use_case': 'Scenarios requiring extreme performance'
        }
    }
    
    print("RKNN Optimization Levels Detailed:")
    for level, info in optimization_levels.items():
        print(f"\nLevel {level} - {info['name']}:")
        for key, value in info.items():
            if key != 'name':
                print(f"  {key}: {value}")
    
    return optimization_levels

def benchmark_optimization_levels(model_path):
    """Test effects of different optimization levels"""
    from rknn.api import RKNN
    
    results = {}
    
    for opt_level in range(4):
        print(f"\nTesting Optimization Level {opt_level}...")
        
        rknn = RKNN(verbose=False)
        
        try:
            # Configure
            rknn.config(
                target_platform='rk3568',
                quantized_dtype='asymmetric_quantized-u8',
                optimization_level=opt_level
            )
            
            # Load and build
            start_time = time.time()
            rknn.load_onnx(model=model_path)
            rknn.build(do_quantization=True)
            build_time = time.time() - start_time
            
            # Export
            output_path = f'model_opt_{opt_level}.rknn'
            rknn.export_rknn(output_path)
            
            # Get file size
            model_size = os.path.getsize(output_path) / 1024 / 1024  # MB
            
            # Test inference speed
            rknn.init_runtime()
            dummy_input = np.random.randn(1, 3, 224, 224).astype(np.float32)
            
            # Warm up
            for _ in range(10):
                rknn.inference(inputs=[dummy_input])
            
            # Test
            inference_times = []
            for _ in range(100):
                start_time = time.time()
                rknn.inference(inputs=[dummy_input])
                inference_times.append(time.time() - start_time)
            
            avg_inference_time = np.mean(inference_times)
            
            results[opt_level] = {
                'build_time': build_time,
                'model_size_mb': model_size,
                'avg_inference_time_ms': avg_inference_time * 1000,
                'fps': 1.0 / avg_inference_time
            }
            
            print(f"  Build time: {build_time:.2f}s")
            print(f"  Model size: {model_size:.2f}MB")
            print(f"  Inference time: {avg_inference_time*1000:.2f}ms")
            print(f"  FPS: {1.0/avg_inference_time:.2f}")
            
            # Clean up
            os.remove(output_path)
            
        except Exception as e:
            print(f"  Optimization Level {opt_level} test failed: {e}")
            results[opt_level] = None
        
        finally:
            rknn.release()
    
    return results

if __name__ == "__main__":
    explain_optimization_levels()
    
    # If model file exists, can test different optimization levels
    model_path = "models/classification/resnet18.onnx"
    if os.path.exists(model_path):
        results = benchmark_optimization_levels(model_path)
        print("\nOptimization Level Comparison Results:")
        for level, result in results.items():
            if result:
                print(f"Level {level}: {result}")

4.4 Conversion Practice: ONNX Model

ResNet Classification Model Conversion

Complete ResNet Conversion Workflow

#!/usr/bin/env python3
# resnet_conversion.py

import os
import numpy as np
from rknn.api import RKNN

class ResNetConverter:
    """ResNet Model Converter"""
    
    def __init__(self, model_path, output_path):
        self.model_path = model_path
        self.output_path = output_path
        self.rknn = RKNN(verbose=True)
        
    def prepare_calibration_data(self, num_samples=50, out_dir="calibration_npy"):
        """Prepare calibration samples plus the dataset.txt list that
        RKNN-Toolkit2's build() expects (one sample path per line)"""
        print("Preparing calibration data...")
        
        # Create synthetic data (should use real, representative data in practice)
        os.makedirs(out_dir, exist_ok=True)
        dataset_txt = os.path.join(out_dir, "dataset.txt")
        
        with open(dataset_txt, "w") as f:
            for i in range(num_samples):
                # Generate random image
                image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
                
                # Preprocess
                image = image.astype(np.float32) / 255.0
                
                # ImageNet standardization
                mean = np.array([0.485, 0.456, 0.406])
                std = np.array([0.229, 0.224, 0.225])
                image = (image - mean) / std
                
                # Convert to NCHW format and save as .npy
                image = np.transpose(image, (2, 0, 1)).astype(np.float32)
                npy_path = os.path.join(out_dir, f"calib_{i:03d}.npy")
                np.save(npy_path, image)
                f.write(npy_path + "\n")
        
        print(f"Calibration dataset list written to: {dataset_txt}")
        return dataset_txt
    
    def convert_model(self, use_custom_dataset=True):
        """Convert model"""
        try:
            # 1. Configure conversion parameters
            print("Configuring conversion parameters...")
            ret = self.rknn.config(
                target_platform='rk3568',
                quantized_dtype='asymmetric_quantized-u8',
                optimization_level=3,
                output_optimize=1,
                compress_weight=True
            )
            if ret != 0:
                raise Exception("Configuration failed!")
            
            # 2. Load ONNX model
            print(f"Loading ONNX model: {self.model_path}")
            ret = self.rknn.load_onnx(model=self.model_path)
            if ret != 0:
                raise Exception("Model loading failed!")
            
            # 3. Build model
            print("Building model...")
            if use_custom_dataset:
                # dataset is the path of a dataset.txt listing calibration samples
                dataset = self.prepare_calibration_data()
                ret = self.rknn.build(do_quantization=True, dataset=dataset)
            else:
                # Default behavior (no calibration set)
                ret = self.rknn.build(do_quantization=True)
            
            if ret != 0:
                raise Exception("Model build failed!")
            
            # 4. Export RKNN model
            print(f"Exporting model to: {self.output_path}")
            ret = self.rknn.export_rknn(self.output_path)
            if ret != 0:
                raise Exception("Model export failed!")
            
            print("Model conversion completed successfully!")
            
        except Exception as e:
            print(f"Conversion failed: {e}")
        finally:
            self.rknn.release()

if __name__ == "__main__":
    # Ensure directory exists
    os.makedirs("models", exist_ok=True)
    
    # Model path
    onnx_path = "models/classification/resnet18.onnx"
    rknn_path = "models/resnet18_rk3568.rknn"
    
    # Check if ONNX model exists
    if not os.path.exists(onnx_path):
        print(f"Model file not found: {onnx_path}")
        print("Please run download_models.py to download the model first")
    else:
        converter = ResNetConverter(onnx_path, rknn_path)
        converter.convert_model()
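
After exporting, it is worth closing the loop on step 5 of the workflow in section 4.1 (Precision Verification) by comparing the converted model with its ONNX source on identical input. A minimal sketch, assuming it is called in the same session as the build (as in basic_conversion_example from section 4.3) and that onnxruntime is installed; the 0.99 threshold is a common rule of thumb, not a hard requirement:

import numpy as np
import onnxruntime as ort

def compare_with_onnx(rknn, onnx_path, input_shape=(1, 3, 224, 224)):
    """Compare a built RKNN model (same session, before release) with its ONNX source."""
    dummy = np.random.randn(*input_shape).astype(np.float32)

    # Reference output from the original ONNX model
    sess = ort.InferenceSession(onnx_path)
    onnx_out = sess.run(None, {sess.get_inputs()[0].name: dummy})[0]

    # RKNN output (runs on the simulator when init_runtime() is given no target)
    rknn.init_runtime()
    rknn_out = rknn.inference(inputs=[dummy])[0]

    # Cosine similarity between flattened outputs
    a, b = onnx_out.flatten(), rknn_out.flatten()
    sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    print(f"Cosine similarity (ONNX vs RKNN): {sim:.4f}")  # > 0.99 usually means acceptable quantization loss
    return sim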