1. SOPHON-DEMO Introduction
SOPHON-DEMO is developed based on the SOPHONSDK interfaces and provides a series of porting examples for mainstream algorithms, including model compilation and quantization based on TPU-NNTC and TPU-MLIR, inference engine porting based on BMRuntime, and pre-processing and post-processing porting based on BMCV/OpenCV.
SOPHONSDK is a deep learning SDK customized by Sophgo Technologies based on its self-developed deep learning processors. It covers capabilities such as model optimization and efficient runtime support required for neural network inference stages, providing an easy-to-use and efficient full-stack solution for deep learning application development and deployment. Currently compatible with BM1684/BM1684X/BM1688 (CV186X). Below are some related term explanations:
| Term | Description |
|---|---|
| BM1688/CV186AH | Sophgo's fifth-generation tensor processors for the deep learning field |
| BM1684X | Sophgo's fourth-generation tensor processor for the deep learning field |
| BM1684 | Sophgo's third-generation tensor processor for the deep learning field |
| Intelligent Vision Deep Learning Processor | Neural network computing unit in BM1688/CV186AH, BM1684/BM1684X |
| VPU | Encoding and decoding unit in BM1688/CV186AH, BM1684/BM1684X |
| VPP | Graphics operation acceleration unit in BM1684/BM1684X |
| VPSS | Video processing subsystem in BM1688/CV186AH, including graphics operation acceleration unit and decoding unit, also called VPP |
| JPU | Image JPEG encoding and decoding unit in BM1688/CV186AH, BM1684/BM1684X |
| SOPHONSDK | Sophgo's original deep learning development toolkit based on BM1688/CV186AH, BM1684/BM1684X |
| PCIe Mode | A working form of BM1688/CV186AH, BM1684/BM1684X, used as an acceleration device |
| SoC Mode | A working form of BM1688/CV186AH, BM1684/BM1684X, running independently as a host; customer algorithms can run directly on it |
| arm_pcie Mode | A working form of BM1684/BM1684X in which the board with BM1684/BM1684X serves as a PCIe slave device inserted into an ARM processor server; customer algorithms run on the ARM host |
| BMCompiler | Deep neural network optimization compiler for intelligent vision deep learning processors, can convert various deep neural networks from deep learning frameworks into instruction streams that run on processors |
| BMRuntime | Intelligent vision deep learning processor inference interface library |
| BMCV | Graphics operation hardware acceleration interface library |
| BMLib | An underlying software library encapsulated on top of the kernel driver, providing device management, memory management, data transfer, API dispatch, A53 enabling, and power control |
| mlir | Intermediate model format generated by TPU-MLIR, used for migration or quantization of models |
| BModel | Deep neural network model file format for intelligent vision deep learning processors, containing target network weights, instruction streams, etc. |
| BMLang | Advanced programming model for intelligent vision deep learning processors, users don't need to understand underlying hardware information during development |
| TPUKernel | Development library based on atomic operations of intelligent vision deep learning processors (a set of interfaces encapsulated according to BM1688/CV186AH, BM1684/BM1684X instruction sets). |
| SAIL | SOPHON Inference library supporting Python/C++ interfaces, further encapsulation of BMCV, sophon-media, BMLib, BMRuntime, etc. |
| TPU-MLIR | Intelligent vision deep learning processor compiler project, can convert pre-trained neural networks from different frameworks into bmodels that can run efficiently on Sophgo's intelligent vision deep learning processors |
1.1 BModel
BModel: A deep neural network model file format for Sophgo's intelligent vision deep learning processors, containing target network weights, instruction streams, etc.
Stage: Supports combining models with different batch sizes of the same network into one BModel; different batch size inputs of the same network correspond to different stages, and BMRuntime will automatically select the model of the corresponding stage based on the input shape size during inference. Also supports combining different networks into one BModel, obtaining different networks through network names.
Dynamic and Static Compilation: Models can be compiled dynamically or statically, selected through parameters during model conversion. A dynamically compiled BModel supports, at runtime, any input shape smaller than the shape set during compilation; a statically compiled BModel only supports the shape set during compilation.
Remarks
Prefer statically compiled models: dynamically compiled models require the ARM9 microcontroller inside BM168X to generate instruction streams for the intelligent vision deep learning processor in real time, based on the actual input shapes. Dynamically compiled models therefore execute less efficiently than statically compiled ones. When possible, give priority to statically compiled models, or to statically compiled models supporting multiple input shapes.
1.2 bm_image
BMCV: BMCV provides a set of machine vision libraries optimized for SOPHON Deep learning processors. By utilizing the processor's Tensor Computing Processor and VPP modules, it can complete operations such as color space conversion, scale transformation, affine transformation, projection transformation, linear transformation, drawing boxes, JPEG encoding/decoding, BASE64 encoding/decoding, NMS, sorting, and feature matching.
bm_image: BMCV APIs are all centered around bm_image, where one bm_image object corresponds to one image. Users construct bm_image objects through bm_image_create, use them with the various BMCV functions, and must call bm_image_destroy to release them after use.
BMImage: In the SAIL library, bm_image is encapsulated as BMImage. For related information, refer to the SOPHON-SAIL User Manual.
The following are the bm_image struct and related data format definitions:

```c
typedef enum bm_image_format_ext_ {
    FORMAT_YUV420P,
    FORMAT_YUV422P,
    FORMAT_YUV444P,
    FORMAT_NV12,
    FORMAT_NV21,
    FORMAT_NV16,
    FORMAT_NV61,
    FORMAT_RGB_PLANAR,
    FORMAT_BGR_PLANAR,
    FORMAT_RGB_PACKED,
    FORMAT_BGR_PACKED,
    FORMAT_RGBP_SEPARATE,
    FORMAT_BGRP_SEPARATE,
    FORMAT_GRAY,
    FORMAT_COMPRESSED
} bm_image_format_ext;

typedef enum bm_image_data_format_ext_ {
    DATA_TYPE_EXT_FLOAT32,
    DATA_TYPE_EXT_1N_BYTE,
    DATA_TYPE_EXT_4N_BYTE,
    DATA_TYPE_EXT_1N_BYTE_SIGNED,
    DATA_TYPE_EXT_4N_BYTE_SIGNED,
} bm_image_data_format_ext;

// bm_image struct definition is as follows
struct bm_image {
    int width;
    int height;
    bm_image_format_ext image_format;
    bm_image_data_format_ext data_type;
    bm_image_private* image_private;
};
```

2. Directory Structure and Description
The examples provided by SOPHON-DEMO are divided into three modules, from easy to difficult: tutorial, sample, and application.
Warning
- The tutorial module contains examples of basic interface usage;
- The sample module contains serial examples of some classic algorithms on SOPHONSDK;
- The application module contains some typical applications for typical scenarios.
| Module | Link |
|---|---|
| tutorial | LINK1 |
| sample | LINK2 |
| application | LINK3 |
3. Version Description
| Version | Description |
|---|---|
| 0.2.1 | Improved and fixed documentation and code issues, supplemented CV186X support for some examples, YOLOv5 adapted to SG2042, added GroundingDINO and Qwen1_5 to sample module, StableDiffusionV1_5 added support for multiple resolutions, Qwen, Llama2, ChatGLM3 added web and multi-session modes, added blend and stitch examples to tutorial module |
| 0.2.0 | Improved and fixed documentation and code issues, added application and tutorial modules, added ChatGLM3 and Qwen examples, SAM added web ui, BERT, ByteTrack, C3D adapted to BM1688, original YOLOv8 renamed to YOLOv8_det and added cpp post-processing acceleration method, optimized auto_test for common examples, updated TPU-MLIR installation method to pip |
| 0.1.10 | Fixed documentation and code issues, added ppYoloe, YOLOv8_seg, StableDiffusionV1.5, SAM examples, refactored yolact, CenterNet, YOLOX, YOLOv8 adapted to BM1688, YOLOv5, ResNet, PP-OCR, DeepSORT supplemented BM1688 performance data, WeNet provides C++ cross-compilation method |
| 0.1.9 | Fixed documentation and code issues, added segformer, YOLOv7, Llama2 examples, refactored YOLOv34, YOLOv5, ResNet, PP-OCR, DeepSORT, LPRNet, RetinaFace, YOLOv34, WeNet adapted to BM1688, OpenPose post-processing acceleration, chatglm2 added compilation method and int8/int4 quantization |
| 0.1.8 | Improved and fixed documentation and code issues, added BERT, ppYOLOv3, ChatGLM2, refactored YOLOX, PP-OCR added beam search, OpenPose added tpu-kernel post-processing acceleration, updated SFTP download method |
| 0.1.7 | Fixed documentation issues, some examples support BM1684 mlir, refactored PP-OCR, CenterNet examples, YOLOv5 added sail support |
| 0.1.6 | Fixed documentation issues, added ByteTrack, YOLOv5_opt, WeNet examples |
| 0.1.5 | Fixed documentation issues, added DeepSORT example, refactored ResNet, LPRNet examples |
| 0.1.4 | Fixed documentation issues, added C3D, YOLOv8 examples |
| 0.1.3 | Added OpenPose example, refactored YOLOv5 examples (including adapting arm PCIe, supporting TPU-MLIR compiled BM1684X models, using ffmpeg component to replace opencv decoding, etc.) |
| 0.1.2 | Fixed documentation issues, refactored SSD related examples, LPRNet/cpp/lprnet_bmcv uses ffmpeg component to replace opencv decoding |
| 0.1.1 | Fixed documentation issues, refactored LPRNet/cpp/lprnet_bmcv using BMNN related classes |
| 0.1.0 | Provided 10 examples including LPRNet, adapted to BM1684X (x86 PCIe, SoC), BM1684 (x86 PCIe, SoC) |
4. Environment Dependencies
SOPHON-DEMO mainly depends on TPU-MLIR, TPU-NNTC, LIBSOPHON, SOPHON-FFMPEG, SOPHON-OPENCV, SOPHON-SAIL, with the following version requirements:
| SOPHON-DEMO | TPU-MLIR | TPU-NNTC | LIBSOPHON | SOPHON-FFMPEG | SOPHON-OPENCV | SOPHON-SAIL | Release Date |
|---|---|---|---|---|---|---|---|
| 0.2.0 | >=1.6 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=23.10.01 |
| 0.1.10 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=23.07.01 |
| 0.1.9 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=23.07.01 |
| 0.1.8 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=23.07.01 |
| 0.1.7 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=23.07.01 |
| 0.1.6 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=23.05.01 |
| 0.1.5 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=23.03.01 |
| 0.1.4 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | >=22.12.01 |
| 0.1.3 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | - |
| 0.1.2 | Not supported | >=3.1.4 | >=0.4.3 | >=0.5.0 | >=0.5.0 | >=3.2.0 | - |
| 0.1.1 | Not supported | >=3.1.3 | >=0.4.2 | >=0.4.0 | >=0.4.0 | >=3.1.0 | - |
| 0.1.0 | Not supported | >=3.1.3 | >=0.3.0 | >=0.2.4 | >=0.2.4 | >=3.1.0 | - |
Warning
Different examples may have different version requirements. Refer to the example's README for specifics. Other third-party libraries may need to be installed.
BM1688/CV186X and BM1684X/BM1684 correspond to different SDKs, which have not yet been published on the official website. Please contact technical staff to obtain them.
5. Technical Information
Tips
Please obtain related documents, materials, and video tutorials through the Sophgo official website Technical Information.
6. Community
Tips
The Sophgo community encourages developers to communicate and learn together through the following channels:
Sophgo Community website: https://www.sophgo.com/
Sophgo Developer Forum: https://developer.sophgo.com/forum/index.html
