Model Conversion Detailed Explanation

4.1 Basic Concepts of Model Conversion

What is Model Conversion

Definition of Model Conversion

Model conversion is the process of converting a trained deep learning model from one format to another. In RK3568 NPU development, it mainly means converting models from common deep learning frameworks (such as PyTorch, TensorFlow, or ONNX) into the RKNN format so they can run efficiently on the Rockchip NPU.

Necessity of Conversion

Original Model (PyTorch/TensorFlow/ONNX)
    ↓
Model Optimization (Graph Optimization, Operator Fusion)
    ↓
Quantization Processing (FP32 → INT8/INT16)
    ↓
Hardware Adaptation (NPU Instruction Set Mapping)
    ↓
RKNN Model (Runnable on RK3568 NPU)

Main Purposes of Conversion:

  • Hardware Adaptation: Adapt general models to specific NPU hardware.
  • Performance Optimization: Improve inference speed through graph optimization and operator fusion.
  • Memory Optimization: Reduce model size and runtime memory usage (see the size estimate below).
  • Quantization Acceleration: Quantize FP32 models to INT8 to improve inference speed.
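
As a back-of-envelope illustration of the memory and quantization points, the sketch below computes what INT8 quantization alone does to model size. The ~11.7M parameter count for ResNet18 is an assumption for illustration; analyze_model_complexity() in section 4.2 computes the real count from an ONNX file.

# Rough effect of INT8 quantization on model size
# (assumes ResNet18's roughly 11.7M parameters)
params = 11_700_000
fp32_mb = params * 4 / 1024 / 1024   # 4 bytes per FP32 weight
int8_mb = params * 1 / 1024 / 1024   # 1 byte per INT8 weight
print(f"FP32: {fp32_mb:.1f} MB  ->  INT8: {int8_mb:.1f} MB (about 4x smaller)")

Weight memory traffic shrinks in the same proportion, which is where much of the INT8 inference speedup on the NPU comes from.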

Supported Model Formats

Input Format Support

Framework        | Format                  | Supported Version | Remarks
ONNX             | .onnx                   | 1.6-1.12          | Recommended format, best compatibility
TensorFlow       | .pb                     | 1.x, 2.x          | Requires frozen graph
TensorFlow Lite  | .tflite                 | 2.x               | Lightweight model
Caffe            | .prototxt + .caffemodel | 1.0               | Classic framework
DarkNet          | .cfg + .weights         | -                 | YOLO series models

Recommended Conversion Path

PyTorch → ONNX → RKNN (Recommended)
TensorFlow → ONNX → RKNN (Recommended)
TensorFlow → TensorFlow Lite → RKNN
Caffe → RKNN (Direct Conversion)
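
The PyTorch → ONNX hop of the recommended path is a single torch.onnx.export call. A minimal sketch (the full download-and-export scripts appear in section 4.2):

import torch
import torchvision.models as models

# Minimal PyTorch -> ONNX export, the first hop of the recommended path
model = models.resnet18(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)  # NCHW input the model expects
torch.onnx.export(model, dummy, "resnet18.onnx",
                  opset_version=11,  # opset used throughout this chapter
                  input_names=['input'], output_names=['output'])

The resulting resnet18.onnx can then be fed to RKNN-Toolkit2 as shown in section 4.3.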

Quantization Technology in Detail

Quantization Type Comparison

Quantization Type | Precision | Speed  | Model Size | Applicable Scenarios
FP32              | Highest   | Slow   | Large      | Extremely high precision requirements
FP16              | High      | Medium | Medium     | Balance of precision and performance
INT8              | Medium    | Fast   | Small      | Most application scenarios
Mixed Precision   | High      | Fast   | Small      | Keep high precision for key layers

Quantization Strategy

# Symmetric Quantization vs Asymmetric Quantization
symmetric_quantization = {
    "range": "[-127, 127]",
    "zero_point": 0,
    "advantages": "Simple calculation, hardware friendly",
    "disadvantages": "May waste quantization range"
}

asymmetric_quantization = {
    "range": "[0, 255] or [-128, 127]",
    "zero_point": "Non-zero",
    "advantages": "Fully utilize quantization range",
    "disadvantages": "Slightly higher calculation complexity"
}
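
To make the two schemes concrete, the sketch below computes scale and zero-point for a small tensor under both conventions. These are the standard textbook formulas, not RKNN-specific code:

import numpy as np

def symmetric_params(x, n_bits=8):
    """Symmetric: zero_point fixed at 0, scale covers max(|x|)."""
    qmax = 2 ** (n_bits - 1) - 1  # 127 for INT8
    scale = np.max(np.abs(x)) / qmax
    return scale, 0

def asymmetric_params(x, n_bits=8):
    """Asymmetric: full [0, 255] range with a non-zero zero_point."""
    qmin, qmax = 0, 2 ** n_bits - 1  # [0, 255] for UINT8
    scale = (np.max(x) - np.min(x)) / (qmax - qmin)
    zero_point = int(round(qmin - np.min(x) / scale))
    return scale, zero_point

x = np.array([-0.2, 0.0, 0.5, 1.3], dtype=np.float32)
for name, fn in (("symmetric", symmetric_params), ("asymmetric", asymmetric_params)):
    scale, zp = fn(x)
    q = np.round(x / scale) + zp      # quantize
    x_hat = (q - zp) * scale          # dequantize
    print(f"{name}: scale={scale:.4f} zero_point={zp} max_err={np.max(np.abs(x - x_hat)):.4f}")

Note how the symmetric scheme spends half its range on negative values this tensor barely uses, which is exactly the "may waste quantization range" drawback listed above.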

Conversion Process Overview

Complete Conversion Workflow

graph TD
    A[Original Model] --> B[Model Verification]
    B --> C[Preprocessing Config]
    C --> D[Quantization Data Prep]
    D --> E[Model Conversion]
    E --> F[Precision Verification]
    F --> G[Performance Test]
    G --> H[Model Optimization]
    H --> I[Final Deployment]

Key Steps Explanation

  1. Model Verification: Ensure the original model can infer normally.
  2. Preprocessing Config: Set preprocessing parameters for input data.
  3. Quantization Data Prep: Prepare representative dataset for quantization calibration.
  4. Model Conversion: Execute the actual conversion process.
  5. Precision Verification: Compare precision differences before and after conversion.
  6. Performance Test: Test inference performance of the converted model.
  7. Model Optimization: Perform further optimization based on test results.

4.2 Prepare Model for Conversion

Get Pretrained Model

Get ONNX Model from Official Source

#!/usr/bin/env python3
# download_models.py

import torch
import torchvision.models as models
import os

def download_classification_models():
    """Download classification models"""
    models_info = {
        'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
        'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
        'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
        'efficientnet_b0': 'https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth'
    }
    
    os.makedirs('models/classification', exist_ok=True)
    
    for model_name, url in models_info.items():
        print(f"Downloading {model_name}...")
        
        # Load pretrained model
        if model_name == 'resnet18':
            model = models.resnet18(pretrained=True)
        elif model_name == 'resnet50':
            model = models.resnet50(pretrained=True)
        elif model_name == 'mobilenet_v2':
            model = models.mobilenet_v2(pretrained=True)
        elif model_name == 'efficientnet_b0':
            model = models.efficientnet_b0(pretrained=True)
        
        model.eval()
        
        # Export as ONNX
        dummy_input = torch.randn(1, 3, 224, 224)
        onnx_path = f'models/classification/{model_name}.onnx'
        
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            do_constant_folding=True,
            input_names=['input'],
            output_names=['output'],
            dynamic_axes={
                'input': {0: 'batch_size'},
                'output': {0: 'batch_size'}
            }
        )
        
        print(f"ONNX model saved to: {onnx_path}")

def download_yolo_models():
    """Download YOLO models via torch.hub (clones the ultralytics/yolov5 repo)"""
    
    os.makedirs('models/detection', exist_ok=True)
    
    # YOLOv5 models
    yolo_models = ['yolov5s', 'yolov5m', 'yolov5l']
    
    for model_name in yolo_models:
        print(f"Downloading {model_name}...")
        
        # Load model
        model = torch.hub.load('ultralytics/yolov5', model_name, pretrained=True)
        model.eval()
        
        # Export ONNX
        dummy_input = torch.randn(1, 3, 640, 640)
        onnx_path = f'models/detection/{model_name}.onnx'
        
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            do_constant_folding=True,
            input_names=['images'],
            output_names=['output'],
            dynamic_axes={
                'images': {0: 'batch_size'},
                'output': {0: 'batch_size'}
            }
        )
        
        print(f"ONNX model saved to: {onnx_path}")

if __name__ == "__main__":
    download_classification_models()
    download_yolo_models()

Get Model from Hugging Face

#!/usr/bin/env python3
# download_huggingface_models.py

from transformers import AutoModel, AutoTokenizer
import torch
import os

def download_transformer_models():
    """Download Transformer models"""
    models_info = {
        'bert-base-uncased': 'bert-base-uncased',
        'distilbert-base-uncased': 'distilbert-base-uncased',
        'roberta-base': 'roberta-base'
    }
    
    os.makedirs('models/nlp', exist_ok=True)
    
    for model_name, model_id in models_info.items():
        print(f"Downloading {model_name}...")
        
        # Download model and tokenizer
        model = AutoModel.from_pretrained(model_id)
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        
        # Save model
        model_dir = f'models/nlp/{model_name}'
        model.save_pretrained(model_dir)
        tokenizer.save_pretrained(model_dir)
        
        # Export ONNX (Example)
        model.eval()
        dummy_input = torch.randint(0, 1000, (1, 128))  # Sequence length 128
        
        onnx_path = f'{model_dir}/{model_name}.onnx'
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
            export_params=True,
            opset_version=11,
            input_names=['input_ids'],
            output_names=['last_hidden_state'],
            dynamic_axes={
                'input_ids': {0: 'batch_size', 1: 'sequence'},
                'last_hidden_state': {0: 'batch_size', 1: 'sequence'}
            }
        )
        
        print(f"Model saved to: {model_dir}")

if __name__ == "__main__":
    download_transformer_models()

Model Verification and Preprocessing

Model Integrity Check

#!/usr/bin/env python3
# model_validation.py

import onnx
import onnxruntime as ort
import numpy as np

def validate_onnx_model(model_path):
    """Validate ONNX model integrity"""
    try:
        # Load model
        model = onnx.load(model_path)
        
        # Check model
        onnx.checker.check_model(model)
        print(f"✓ Model {model_path} validation passed")
        
        # Print model info
        print(f"Model version: {model.ir_version}")
        print(f"Producer: {model.producer_name}")
        print(f"Opset version: {[opset.version for opset in model.opset_import]}")
        
        # Print input/output info
        print("\nInput Info:")
        for input_tensor in model.graph.input:
            print(f"  Name: {input_tensor.name}")
            print(f"  Shape: {[dim.dim_value for dim in input_tensor.type.tensor_type.shape.dim]}")
            print(f"  Type: {input_tensor.type.tensor_type.elem_type}")
        
        print("\nOutput Info:")
        for output_tensor in model.graph.output:
            print(f"  Name: {output_tensor.name}")
            print(f"  Shape: {[dim.dim_value for dim in output_tensor.type.tensor_type.shape.dim]}")
            print(f"  Type: {output_tensor.type.tensor_type.elem_type}")
        
        return True
        
    except Exception as e:
        print(f"✗ Model validation failed: {e}")
        return False

def test_onnx_inference(model_path, input_shape):
    """Test ONNX model inference"""
    try:
        # Create inference session
        session = ort.InferenceSession(model_path)
        
        # Get input/output names
        input_name = session.get_inputs()[0].name
        output_name = session.get_outputs()[0].name
        
        # Create random input
        dummy_input = np.random.randn(*input_shape).astype(np.float32)
        
        # Run inference
        result = session.run([output_name], {input_name: dummy_input})
        
        print(f"✓ Inference test successful")
        print(f"Input shape: {dummy_input.shape}")
        print(f"Output shape: {result[0].shape}")
        
        return True
        
    except Exception as e:
        print(f"✗ Inference test failed: {e}")
        return False

def analyze_model_complexity(model_path):
    """Analyze model complexity"""
    model = onnx.load(model_path)
    
    # Count node types
    node_types = {}
    for node in model.graph.node:
        op_type = node.op_type
        node_types[op_type] = node_types.get(op_type, 0) + 1
    
    print(f"\nModel Complexity Analysis:")
    print(f"Total nodes: {len(model.graph.node)}")
    print(f"Node type distribution:")
    for op_type, count in sorted(node_types.items()):
        print(f"  {op_type}: {count}")
    
    # Estimate parameters
    total_params = 0
    for initializer in model.graph.initializer:
        param_size = 1
        for dim in initializer.dims:
            param_size *= dim
        total_params += param_size
    
    print(f"Estimated parameters: {total_params:,}")
    print(f"Estimated model size: {total_params * 4 / 1024 / 1024:.2f} MB (FP32)")

if __name__ == "__main__":
    # Test example
    model_path = "models/classification/resnet18.onnx"
    
    if validate_onnx_model(model_path):
        test_onnx_inference(model_path, (1, 3, 224, 224))
        analyze_model_complexity(model_path)

Prepare Quantization Dataset

Create Quantization Calibration Dataset

#!/usr/bin/env python3
# prepare_calibration_dataset.py

import os
import cv2
import numpy as np
import random
from pathlib import Path

class CalibrationDataset:
    """Quantization Calibration Dataset"""
    
    def __init__(self, data_dir, input_size=(224, 224), num_samples=100):
        self.data_dir = Path(data_dir)
        self.input_size = input_size
        self.num_samples = num_samples
        self.image_paths = self._collect_images()
        
    def _collect_images(self):
        """Collect image paths"""
        extensions = ['.jpg', '.jpeg', '.png', '.bmp']
        image_paths = []
        
        for ext in extensions:
            image_paths.extend(self.data_dir.glob(f"**/*{ext}"))
            image_paths.extend(self.data_dir.glob(f"**/*{ext.upper()}"))
        
        # Random sampling
        if len(image_paths) > self.num_samples:
            image_paths = random.sample(image_paths, self.num_samples)
        
        print(f"Collected {len(image_paths)} calibration images")
        return image_paths
    
    def preprocess_image(self, image_path):
        """Image preprocessing"""
        # Read image
        image = cv2.imread(str(image_path))
        if image is None:
            return None
        
        # Convert color space
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        # Resize
        image = cv2.resize(image, self.input_size)
        
        # Normalize
        image = image.astype(np.float32) / 255.0
        
        # ImageNet standardization
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        image = (image - mean) / std
        
        # Convert to NCHW format
        image = np.transpose(image, (2, 0, 1))
        
        return image
    
    def generate_calibration_data(self, output_file):
        """Generate calibration data"""
        calibration_data = []
        
        print("Generating calibration data...")
        for i, image_path in enumerate(self.image_paths):
            processed_image = self.preprocess_image(image_path)
            if processed_image is not None:
                calibration_data.append(processed_image)
            
            if (i + 1) % 20 == 0:
                print(f"Processing progress: {i + 1}/{len(self.image_paths)}")
        
        # Convert to numpy array
        calibration_data = np.array(calibration_data)
        
        # Save
        np.save(output_file, calibration_data)
        print(f"Calibration data saved to: {output_file}")
        print(f"Data shape: {calibration_data.shape}")
        
        return calibration_data

def download_imagenet_samples():
    """Download ImageNet sample data"""
    import urllib.request
    
    # Sample image URLs (examples from the pytorch/hub repo)
    sample_urls = [
        "https://github.com/pytorch/hub/raw/master/images/dog.jpg",
        "https://github.com/pytorch/hub/raw/master/images/deeplab1.png",
        # Add more sample URLs
    ]
    
    os.makedirs("calibration_data/imagenet_samples", exist_ok=True)
    
    for i, url in enumerate(sample_urls):
        try:
            filename = f"calibration_data/imagenet_samples/sample_{i:03d}.jpg"
            urllib.request.urlretrieve(url, filename)
            print(f"Downloaded: {filename}")
        except Exception as e:
            print(f"Download failed {url}: {e}")

def create_synthetic_dataset(output_dir, num_samples=100, input_size=(224, 224)):
    """Create synthetic dataset (for testing)"""
    os.makedirs(output_dir, exist_ok=True)
    
    print(f"Creating synthetic dataset: {num_samples} images")
    
    for i in range(num_samples):
        # Generate random image
        image = np.random.randint(0, 256, (input_size[1], input_size[0], 3), dtype=np.uint8)
        
        # Add some structure
        cv2.rectangle(image, (50, 50), (150, 150), (255, 0, 0), -1)
        cv2.circle(image, (100, 100), 30, (0, 255, 0), -1)
        
        # Save image
        filename = f"{output_dir}/synthetic_{i:03d}.jpg"
        cv2.imwrite(filename, image)
    
    print(f"Synthetic dataset creation completed: {output_dir}")

if __name__ == "__main__":
    # Create calibration dataset
    
    # Method 1: Use existing image directory
    if os.path.exists("path/to/your/images"):
        dataset = CalibrationDataset("path/to/your/images")
        dataset.generate_calibration_data("calibration_data.npy")
    
    # Method 2: Download sample data
    download_imagenet_samples()
    
    # Method 3: Create synthetic dataset
    create_synthetic_dataset("calibration_data/synthetic", num_samples=50)
    
    # Use synthetic dataset
    dataset = CalibrationDataset("calibration_data/synthetic")
    dataset.generate_calibration_data("calibration_data_synthetic.npy")
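
Note that RKNN-Toolkit2's build() takes its dataset argument as the path of a text file listing calibration samples (image or .npy paths, one per line), not as an in-memory array. A minimal sketch bridging the image directories created above to that format (script and file names here are illustrative):

#!/usr/bin/env python3
# make_dataset_txt.py

from pathlib import Path

def write_dataset_txt(image_dir, output_file="dataset.txt", limit=100):
    """Write the dataset.txt list consumed by rknn.build(dataset=...)."""
    paths = sorted(Path(image_dir).glob("*.jpg"))[:limit]
    with open(output_file, "w") as f:
        for p in paths:
            f.write(f"{p.resolve()}\n")
    print(f"Wrote {len(paths)} entries to {output_file}")

if __name__ == "__main__":
    # e.g. the synthetic dataset generated above
    write_dataset_txt("calibration_data/synthetic")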

4.3 RKNN-Toolkit2 Conversion API Explanation

Core API Introduction

RKNN Class Basic Usage

#!/usr/bin/env python3
# rknn_api_tutorial.py

from rknn.api import RKNN
import numpy as np

class RKNNConverter:
    """RKNN Converter Wrapper Class"""
    
    def __init__(self, verbose=True):
        self.rknn = RKNN(verbose=verbose)
        self.model_loaded = False
        self.model_built = False
    
    def configure_model(self, target_platform='rk3568', **kwargs):
        """Configure model conversion parameters"""
        config_params = {
            'target_platform': target_platform,
            'quantized_dtype': kwargs.get('quantized_dtype', 'asymmetric_quantized-u8'),
            'optimization_level': kwargs.get('optimization_level', 3),
            'output_optimize': kwargs.get('output_optimize', 1),
            'compress_weight': kwargs.get('compress_weight', False),
            'single_core_mode': kwargs.get('single_core_mode', False),
            'model_pruning': kwargs.get('model_pruning', False)
        }
        
        print("Configuring conversion parameters:")
        for key, value in config_params.items():
            print(f"  {key}: {value}")
        
        ret = self.rknn.config(**config_params)
        if ret != 0:
            raise Exception("Model configuration failed!")
        
        return ret
    
    def load_model(self, model_path, model_type='onnx'):
        """Load model"""
        print(f"Loading {model_type.upper()} model: {model_path}")
        
        if model_type.lower() == 'onnx':
            ret = self.rknn.load_onnx(model=model_path)
        elif model_type.lower() == 'tensorflow':
            ret = self.rknn.load_tensorflow(
                tf_pb=model_path,
                inputs=['input'],
                outputs=['output'],
                input_size_list=[[1, 224, 224, 3]]
            )
        elif model_type.lower() == 'tflite':
            ret = self.rknn.load_tflite(model=model_path)
        elif model_type.lower() == 'caffe':
            ret = self.rknn.load_caffe(
                model=model_path + '.prototxt',
                blobs=model_path + '.caffemodel'
            )
        else:
            raise ValueError(f"Unsupported model type: {model_type}")
        
        if ret != 0:
            raise Exception(f"{model_type.upper()} model loading failed!")
        
        self.model_loaded = True
        print("Model loaded successfully!")
        return ret
    
    def build_model(self, do_quantization=True, dataset=None):
        """Build model"""
        if not self.model_loaded:
            raise Exception("Please load model first!")
        
        print("Starting model build...")
        
        build_params = {'do_quantization': do_quantization}
        
        if do_quantization and dataset is not None:
            print("Using custom dataset for quantization...")
            build_params['dataset'] = dataset  # path of a dataset.txt listing calibration samples
        
        ret = self.rknn.build(**build_params)
        if ret != 0:
            raise Exception("Model build failed!")
        
        self.model_built = True
        print("Model built successfully!")
        return ret
    
    def export_model(self, export_path):
        """Export RKNN model"""
        if not self.model_built:
            raise Exception("Please build model first!")
        
        print(f"Exporting model to: {export_path}")
        ret = self.rknn.export_rknn(export_path)
        if ret != 0:
            raise Exception("Model export failed!")
        
        print("Model export successfully!")
        return ret
    
    def init_runtime(self, target='rk3568'):
        """Initialize runtime"""
        print(f"Initializing runtime (target: {target})...")
        ret = self.rknn.init_runtime(target=target)
        if ret != 0:
            raise Exception("Runtime initialization failed!")
        
        print("Runtime initialized successfully!")
        return ret
    
    def inference(self, inputs):
        """Execute inference"""
        return self.rknn.inference(inputs=inputs)
    
    def release(self):
        """Release resources"""
        if self.rknn:
            self.rknn.release()
            print("Resource release completed")

# Usage example
def basic_conversion_example():
    """Basic conversion example"""
    converter = RKNNConverter(verbose=True)
    
    try:
        # 1. Configure parameters
        converter.configure_model(
            target_platform='rk3568',
            quantized_dtype='asymmetric_quantized-u8',
            optimization_level=3
        )
        
        # 2. Load model
        converter.load_model('models/classification/resnet18.onnx', 'onnx')
        
        # 3. Build model
        converter.build_model(do_quantization=True)
        
        # 4. Export model
        converter.export_model('resnet18_rk3568.rknn')
        
        # 5. Test inference (Optional)
        converter.init_runtime()
        dummy_input = np.random.randn(1, 3, 224, 224).astype(np.float32)
        outputs = converter.inference([dummy_input])
        print(f"Inference output shape: {outputs[0].shape}")
        
    finally:
        converter.release()

if __name__ == "__main__":
    basic_conversion_example()

Advanced Configuration Options

Quantization Configuration in Detail

#!/usr/bin/env python3
# advanced_quantization.py

from rknn.api import RKNN
import numpy as np

def configure_quantization_options():
    """Configure quantization options"""
    
    # Quantization data type options
    quantization_types = {
        'asymmetric_quantized-u8': {
            'description': 'Asymmetric 8-bit unsigned integer quantization',
            'range': '[0, 255]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': True
        },
        'asymmetric_quantized-i8': {
            'description': 'Asymmetric 8-bit signed integer quantization',
            'range': '[-128, 127]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': False
        },
        'symmetric_quantized-u8': {
            'description': 'Symmetric 8-bit unsigned integer quantization',
            'range': '[0, 255]',
            'precision': 'Medium',
            'speed': 'Fast',
            'recommended': False
        },
        'dynamic_fixed_point-i8': {
            'description': 'Dynamic fixed-point 8-bit quantization',
            'range': '[-128, 127]',
            'precision': 'High',
            'speed': 'Medium',
            'recommended': False
        },
        'dynamic_fixed_point-i16': {
            'description': 'Dynamic fixed-point 16-bit quantization',
            'range': '[-32768, 32767]',
            'precision': 'Very High',
            'speed': 'Slow',
            'recommended': False
        }
    }
    
    print("Supported quantization types:")
    for qtype, info in quantization_types.items():
        print(f"\n{qtype}:")
        for key, value in info.items():
            print(f"  {key}: {value}")
    
    return quantization_types

def advanced_quantization_config():
    """Advanced quantization configuration example"""
    rknn = RKNN(verbose=True)
    
    # Advanced configuration options
    advanced_config = {
        'target_platform': 'rk3568',
        'quantized_dtype': 'asymmetric_quantized-u8',
        'optimization_level': 3,  # 0-3, 3 is highest optimization level
        'output_optimize': 1,     # Output optimization
        'compress_weight': True,  # Weight compression
        'single_core_mode': False, # Single core mode
        'model_pruning': False,   # Model pruning
        'quantized_algorithm': 'normal',  # Quantization algorithm
        'quantized_method': 'channel',    # Quantization method
        'float_dtype': 'float16'  # Float data type
    }
    
    print("Advanced configuration parameters:")
    for key, value in advanced_config.items():
        print(f"  {key}: {value}")
    
    ret = rknn.config(**advanced_config)
    rknn.release()
    
    return ret

def mixed_precision_quantization():
    """Mixed precision quantization example"""
    rknn = RKNN(verbose=True)
    
    # Mixed precision configuration
    # Keep certain layers high precision, others use low precision
    mixed_precision_config = {
        'target_platform': 'rk3568',
        'quantized_dtype': 'asymmetric_quantized-u8',
        'optimization_level': 3,
        # Specify quantization type for specific layers
        'quantize_input_node': False,  # Input node not quantized
        'quantize_output_node': False, # Output node not quantized
    }
    
    ret = rknn.config(**mixed_precision_config)
    rknn.release()
    
    return ret

if __name__ == "__main__":
    configure_quantization_options()
    advanced_quantization_config()
    mixed_precision_quantization()

Optimization Levels in Detail

#!/usr/bin/env python3
# optimization_levels.py

import os
import time

import numpy as np

def explain_optimization_levels():
    """Explain optimization levels"""
    
    optimization_levels = {
        0: {
            'name': 'No Optimization',
            'description': 'Keep original model structure, no optimization',
            'speed': 'Slow',
            'accuracy': 'Highest',
            'model_size': 'Large',
            'use_case': 'Debugging and accuracy comparison'
        },
        1: {
            'name': 'Basic Optimization',
            'description': 'Basic graph optimization, such as constant folding',
            'speed': 'Medium',
            'accuracy': 'High',
            'model_size': 'Medium',
            'use_case': 'Balance accuracy and performance'
        },
        2: {
            'name': 'Standard Optimization',
            'description': 'Includes operator fusion and memory optimization',
            'speed': 'Fast',
            'accuracy': 'Medium',
            'model_size': 'Small',
            'use_case': 'Most application scenarios'
        },
        3: {
            'name': 'Aggressive Optimization',
            'description': 'Maximum optimization, may affect accuracy',
            'speed': 'Fastest',
            'accuracy': 'Medium to Low',
            'model_size': 'Smallest',
            'use_case': 'Scenarios requiring extreme performance'
        }
    }
    
    print("RKNN Optimization Levels Detailed:")
    for level, info in optimization_levels.items():
        print(f"\nLevel {level} - {info['name']}:")
        for key, value in info.items():
            if key != 'name':
                print(f"  {key}: {value}")
    
    return optimization_levels

def benchmark_optimization_levels(model_path):
    """Test effects of different optimization levels"""
    from rknn.api import RKNN
    
    results = {}
    
    for opt_level in range(4):
        print(f"\nTesting Optimization Level {opt_level}...")
        
        rknn = RKNN(verbose=False)
        
        try:
            # Configure
            rknn.config(
                target_platform='rk3568',
                quantized_dtype='asymmetric_quantized-u8',
                optimization_level=opt_level
            )
            
            # Load and build
            start_time = time.time()
            rknn.load_onnx(model=model_path)
            rknn.build(do_quantization=True)
            build_time = time.time() - start_time
            
            # Export
            output_path = f'model_opt_{opt_level}.rknn'
            rknn.export_rknn(output_path)
            
            # Get file size
            model_size = os.path.getsize(output_path) / 1024 / 1024  # MB
            
            # Test inference speed
            rknn.init_runtime()
            dummy_input = np.random.randn(1, 3, 224, 224).astype(np.float32)
            
            # Warm up
            for _ in range(10):
                rknn.inference(inputs=[dummy_input])
            
            # Test
            inference_times = []
            for _ in range(100):
                start_time = time.time()
                rknn.inference(inputs=[dummy_input])
                inference_times.append(time.time() - start_time)
            
            avg_inference_time = np.mean(inference_times)
            
            results[opt_level] = {
                'build_time': build_time,
                'model_size_mb': model_size,
                'avg_inference_time_ms': avg_inference_time * 1000,
                'fps': 1.0 / avg_inference_time
            }
            
            print(f"  Build time: {build_time:.2f}s")
            print(f"  Model size: {model_size:.2f}MB")
            print(f"  Inference time: {avg_inference_time*1000:.2f}ms")
            print(f"  FPS: {1.0/avg_inference_time:.2f}")
            
            # Clean up
            os.remove(output_path)
            
        except Exception as e:
            print(f"  Optimization Level {opt_level} test failed: {e}")
            results[opt_level] = None
        
        finally:
            rknn.release()
    
    return results

if __name__ == "__main__":
    explain_optimization_levels()
    
    # If model file exists, can test different optimization levels
    model_path = "models/classification/resnet18.onnx"
    if os.path.exists(model_path):
        results = benchmark_optimization_levels(model_path)
        print("\nOptimization Level Comparison Results:")
        for level, result in results.items():
            if result:
                print(f"Level {level}: {result}")

4.4 Conversion Practice: ONNX Model

ResNet Classification Model Conversion

Complete ResNet Conversion Workflow

#!/usr/bin/env python3
# resnet_conversion.py

import os
import numpy as np
from rknn.api import RKNN

class ResNetConverter:
    """ResNet Model Converter"""
    
    def __init__(self, model_path, output_path):
        self.model_path = model_path
        self.output_path = output_path
        self.rknn = RKNN(verbose=True)
        
    def prepare_calibration_data(self, num_samples=50, out_dir="calibration_npy"):
        """Prepare calibration samples plus the dataset.txt list that
        RKNN-Toolkit2's build() expects (one sample path per line)"""
        print("Preparing calibration data...")
        
        # Create synthetic data (should use real, representative data in practice)
        os.makedirs(out_dir, exist_ok=True)
        dataset_txt = os.path.join(out_dir, "dataset.txt")
        
        with open(dataset_txt, "w") as f:
            for i in range(num_samples):
                # Generate random image
                image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
                
                # Preprocess
                image = image.astype(np.float32) / 255.0
                
                # ImageNet standardization
                mean = np.array([0.485, 0.456, 0.406])
                std = np.array([0.229, 0.224, 0.225])
                image = (image - mean) / std
                
                # Convert to NCHW format and save as .npy
                image = np.transpose(image, (2, 0, 1)).astype(np.float32)
                npy_path = os.path.join(out_dir, f"calib_{i:03d}.npy")
                np.save(npy_path, image)
                f.write(npy_path + "\n")
        
        print(f"Calibration dataset list written to: {dataset_txt}")
        return dataset_txt
    
    def convert_model(self, use_custom_dataset=True):
        """Convert model"""
        try:
            # 1. Configure conversion parameters
            print("Configuring conversion parameters...")
            ret = self.rknn.config(
                target_platform='rk3568',
                quantized_dtype='asymmetric_quantized-u8',
                optimization_level=3,
                output_optimize=1,
                compress_weight=True
            )
            if ret != 0:
                raise Exception("Configuration failed!")
            
            # 2. Load ONNX model
            print(f"Loading ONNX model: {self.model_path}")
            ret = self.rknn.load_onnx(model=self.model_path)
            if ret != 0:
                raise Exception("Model loading failed!")
            
            # 3. Build model
            print("Building model...")
            if use_custom_dataset:
                # dataset is the path of a dataset.txt listing calibration samples
                dataset = self.prepare_calibration_data()
                ret = self.rknn.build(do_quantization=True, dataset=dataset)
            else:
                # Default behavior (no calibration set)
                ret = self.rknn.build(do_quantization=True)
            
            if ret != 0:
                raise Exception("Model build failed!")
            
            # 4. Export RKNN model
            print(f"Exporting model to: {self.output_path}")
            ret = self.rknn.export_rknn(self.output_path)
            if ret != 0:
                raise Exception("Model export failed!")
            
            print("Model conversion completed successfully!")
            
        except Exception as e:
            print(f"Conversion failed: {e}")
        finally:
            self.rknn.release()

if __name__ == "__main__":
    # Ensure directory exists
    os.makedirs("models", exist_ok=True)
    
    # Model path
    onnx_path = "models/classification/resnet18.onnx"
    rknn_path = "models/resnet18_rk3568.rknn"
    
    # Check if ONNX model exists
    if not os.path.exists(onnx_path):
        print(f"Model file not found: {onnx_path}")
        print("Please run download_models.py to download the model first")
    else:
        converter = ResNetConverter(onnx_path, rknn_path)
        converter.convert_model()
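
After exporting, it is worth closing the loop on step 5 of the workflow in section 4.1 (Precision Verification) by comparing the converted model with its ONNX source on identical input. A minimal sketch, assuming it is called in the same session as the build (as in basic_conversion_example from section 4.3) and that onnxruntime is installed; the 0.99 threshold is a common rule of thumb, not a hard requirement:

import numpy as np
import onnxruntime as ort

def compare_with_onnx(rknn, onnx_path, input_shape=(1, 3, 224, 224)):
    """Compare a built RKNN model (same session, before release) with its ONNX source."""
    dummy = np.random.randn(*input_shape).astype(np.float32)

    # Reference output from the original ONNX model
    sess = ort.InferenceSession(onnx_path)
    onnx_out = sess.run(None, {sess.get_inputs()[0].name: dummy})[0]

    # RKNN output (runs on the simulator when init_runtime() is given no target)
    rknn.init_runtime()
    rknn_out = rknn.inference(inputs=[dummy])[0]

    # Cosine similarity between flattened outputs
    a, b = onnx_out.flatten(), rknn_out.flatten()
    sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    print(f"Cosine similarity (ONNX vs RKNN): {sim:.4f}")  # > 0.99 usually means acceptable quantization loss
    return sim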