1. SOPHON-DEMO Introduction
SOPHON-DEMO is developed based on the SOPHONSDK interfaces and provides a series of porting examples for mainstream algorithms, including model compilation and quantization based on TPU-NNTC and TPU-MLIR, inference engine porting based on BMRuntime, and pre-processing and post-processing porting based on BMCV/OpenCV.
SOPHONSDK is a deep learning SDK customized by Sophgo Technologies based on its self-developed deep learning processors. It covers capabilities such as model optimization and efficient runtime support required for neural network inference stages, providing an easy-to-use and efficient full-stack solution for deep learning application development and deployment. Currently compatible with BM1684/BM1684X/BM1688 (CV186X). Below are some related term explanations:
| Term | Description |
|---|---|
| BM1688/CV186AH | Sophgo's fifth-generation tensor processors for the deep learning field |
| BM1684X | Sophgo's fourth-generation tensor processor for the deep learning field |
| BM1684 | Sophgo's third-generation tensor processor for the deep learning field |
| Intelligent Vision Deep Learning Processor | Neural network computing unit in BM1688/CV186AH, BM1684/BM1684X |
| VPU | Encoding and decoding unit in BM1688/CV186AH, BM1684/BM1684X |
| VPP | Graphics operation acceleration unit in BM1684/BM1684X |
| VPSS | Video processing subsystem in BM1688/CV186AH, including graphics operation acceleration unit and decoding unit, also called VPP |
| JPU | Image JPEG encoding and decoding unit in BM1688/CV186AH, BM1684/BM1684X |
| SOPHONSDK | Sophgo's original deep learning development toolkit based on BM1688/CV186AH, BM1684/BM1684X |
| PCIe Mode | A working form of BM1688/CV186AH, BM1684/BM1684X, used as an acceleration device |
| SoC Mode | A working form of BM1688/CV186AH, BM1684/BM1684X, running independently as a host; customer algorithms can run directly on it |
| arm_pcie Mode | A working form of BM1684/BM1684X in which the board with BM1684/BM1684X serves as a PCIe slave device inserted into an ARM processor server; customer algorithms run on the ARM host |
| BMCompiler | Deep neural network optimization compiler for intelligent vision deep learning processors, can convert various deep neural networks from deep learning frameworks into instruction streams that run on processors |
| BMRuntime | Intelligent vision deep learning processor inference interface library |
| BMCV | Graphics operation hardware acceleration interface library |
| BMLib | An underlying software library encapsulated on top of the kernel driver, providing device management, memory management, data transfer, API dispatch, A53 enabling, and power control |
| mlir | Intermediate model format generated by TPU-MLIR, used for migration or quantization of models |
| BModel | Deep neural network model file format for intelligent vision deep learning processors, containing target network weights, instruction streams, etc. |
| BMLang | Advanced programming model for intelligent vision deep learning processors, users don't need to understand underlying hardware information during development |
| TPUKernel | Development library based on atomic operations of intelligent vision deep learning processors (a set of interfaces encapsulated according to BM1688/CV186AH, BM1684/BM1684X instruction sets). |
| SAIL | SOPHON Inference library supporting Python/C++ interfaces, further encapsulation of BMCV, sophon-media, BMLib, BMRuntime, etc. |
| TPU-MLIR | Intelligent vision deep learning processor compiler project, can convert pre-trained neural networks from different frameworks into bmodels that can run efficiently on Sophgo's intelligent vision deep learning processors |
1.1 BModel
BModel: A deep neural network model file format for Sophgo's intelligent vision deep learning processors, containing target network weights, instruction streams, etc.
Stage: Supports combining models with different batch sizes of the same network into one BModel; different batch size inputs of the same network correspond to different stages, and BMRuntime will automatically select the model of the corresponding stage based on the input shape size during inference. Also supports combining different networks into one BModel, obtaining different networks through network names.
Dynamic and Static Compilation: Models can be compiled dynamically or statically, selected through parameters during model conversion. A dynamically compiled BModel supports, at runtime, any input shape smaller than the shape set during compilation; a statically compiled BModel only supports the shape set during compilation.
Remarks
Prefer statically compiled models: dynamically compiled models require the ARM9 microcontroller inside BM168X to generate instruction streams for the intelligent vision deep learning processor in real time, based on the actual input shapes. Dynamically compiled models therefore execute less efficiently than statically compiled ones. When possible, give priority to statically compiled models, or to statically compiled models supporting multiple input shapes.
1.2 bm_image
BMCV: BMCV provides a set of machine vision libraries optimized for SOPHON Deep learning processors. By utilizing the processor's Tensor Computing Processor and VPP modules, it can complete operations such as color space conversion, scale transformation, affine transformation, projection transformation, linear transformation, drawing boxes, JPEG encoding/decoding, BASE64 encoding/decoding, NMS, sorting, and feature matching.
bm_image: BMCV APIs are all centered around bm_image, where one bm_image object corresponds to one image. Users construct bm_image objects through bm_image_create, use them with the various BMCV functions, and must call bm_image_destroy to release them after use.
BMImage: In the SAIL library, bm_image is encapsulated as BMImage. For related information, refer to the SOPHON-SAIL User Manual.
The following are the bm_image struct and related data format definitions:

```c
typedef enum bm_image_format_ext_ {
    FORMAT_YUV420P,
    FORMAT_YUV422P,
    FORMAT_YUV444P,
    FORMAT_NV12,
    FORMAT_NV21,
    FORMAT_NV16,
    FORMAT_NV61,
    FORMAT_RGB_PLANAR,
    FORMAT_BGR_PLANAR,
    FORMAT_RGB_PACKED,
    FORMAT_BGR_PACKED,
    FORMAT_RGBP_SEPARATE,
    FORMAT_BGRP_SEPARATE,
    FORMAT_GRAY,
    FORMAT_COMPRESSED
} bm_image_format_ext;

typedef enum bm_image_data_format_ext_ {
    DATA_TYPE_EXT_FLOAT32,
    DATA_TYPE_EXT_1N_BYTE,
    DATA_TYPE_EXT_4N_BYTE,
    DATA_TYPE_EXT_1N_BYTE_SIGNED,
    DATA_TYPE_EXT_4N_BYTE_SIGNED,
} bm_image_data_format_ext;

// bm_image struct definition is as follows
struct bm_image {
    int width;
    int height;
    bm_image_format_ext image_format;
    bm_image_data_format_ext data_type;
    bm_image_private* image_private;
};
```

2. Directory Structure and Description
The examples provided by SOPHON-DEMO are divided into three modules, from easy to difficult: tutorial, sample, and application.
Warning
- The tutorial module contains examples of basic interface usage;
- The sample module contains serial examples of some classic algorithms on SOPHONSDK;
- The application module contains some typical applications for typical scenarios.
| Module | Link |
|---|---|
| tutorial | LINK1 |
| sample | LINK2 |
| application | LINK3 |
3. Version Description
| Version | Description |
|---|---|
| 0.2.1 | Improved and fixed documentation and code issues, supplemented CV186X support for some examples, YOLOv5 adapted to SG2042, added GroundingDINO and Qwen1_5 to sample module, StableDiffusionV1_5 added support for multiple resolutions, Qwen, Llama2, ChatGLM3 added web and multi-session modes, added blend and stitch examples to tutorial module |
| 0.2.0 | Improved and fixed documentation and code issues, added application and tutorial modules, added ChatGLM3 and Qwen examples, SAM added web ui, BERT, ByteTrack, C3D adapted to BM1688, original YOLOv8 renamed to YOLOv8_det and added cpp post-processing acceleration method, optimized auto_test for common examples, updated TPU-MLIR installation method to pip |
| 0.1.10 | Fixed documentation and code issues, added ppYoloe, YOLOv8_seg, StableDiffusionV1.5, SAM examples, refactored yolact, CenterNet, YOLOX, YOLOv8 adapted to BM1688, YOLOv5, ResNet, PP-OCR, DeepSORT supplemented BM1688 performance data, WeNet provides C++ cross-compilation method |
| 0.1.9 | Fixed documentation and code issues, added segformer, YOLOv7, Llama2 examples, refactored YOLOv34, YOLOv5, ResNet, PP-OCR, DeepSORT, LPRNet, RetinaFace, YOLOv34, WeNet adapted to BM1688, OpenPose post-processing acceleration, chatglm2 added compilation method and int8/int4 quantization |
| 0.1.8 | Improved and fixed documentation and code issues, added BERT, ppYOLOv3, ChatGLM2, refactored YOLOX, PP-OCR added beam search, OpenPose added tpu-kernel post-processing acceleration, updated SFTP download method |
| 0.1.7 | Fixed documentation issues, some examples support BM1684 mlir, refactored PP-OCR, CenterNet examples, YOLOv5 added sail support |
| 0.1.6 | Fixed documentation issues, added ByteTrack, YOLOv5_opt, WeNet examples |
| 0.1.5 | Fixed documentation issues, added DeepSORT example, refactored ResNet, LPRNet examples |
| 0.1.4 | Fixed documentation issues, added C3D, YOLOv8 examples |
| 0.1.3 | Added OpenPose example, refactored YOLOv5 examples (including adapting arm PCIe, supporting TPU-MLIR compiled BM1684X models, using ffmpeg component to replace opencv decoding, etc.) |
| 0.1.2 | Fixed documentation issues, refactored SSD related examples, LPRNet/cpp/lprnet_bmcv uses ffmpeg component to replace opencv decoding |
| 0.1.1 | Fixed documentation issues, refactored LPRNet/cpp/lprnet_bmcv using BMNN related classes |
| 0.1.0 | Provided 10 examples including LPRNet, adapted to BM1684X (x86 PCIe, SoC), BM1684 (x86 PCIe, SoC) |
4. Environment Dependencies
SOPHON-DEMO mainly depends on TPU-MLIR, TPU-NNTC, LIBSOPHON, SOPHON-FFMPEG, SOPHON-OPENCV, SOPHON-SAIL, with the following version requirements:
| SOPHON-DEMO | TPU-MLIR | TPU-NNTC | LIBSOPHON | SOPHON-FFMPEG | SOPHON-OPENCV | SOPHON-SAIL | Release Date |
|---|---|---|---|---|---|---|---|
| 0.2.0 | >=1.6 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=23.10.01 |
| 0.1.10 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=23.07.01 |
| 0.1.9 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=23.07.01 |
| 0.1.8 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=23.07.01 |
| 0.1.7 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=23.07.01 |
| 0.1.6 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=23.05.01 |
| 0.1.5 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=23.03.01 |
| 0.1.4 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | >=22.12.01 |
| 0.1.3 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | - |
| 0.1.2 | Not supported | >=3.1.4 | >=0.4.3 | >=0.5.0 | >=0.5.0 | >=3.2.0 | - |
| 0.1.1 | Not supported | >=3.1.3 | >=0.4.2 | >=0.4.0 | >=0.4.0 | >=3.1.0 | - |
| 0.1.0 | Not supported | >=3.1.3 | >=0.3.0 | >=0.2.4 | >=0.2.4 | >=3.1.0 | - |
Warning
Different examples may have different version requirements. Refer to the example's README for specifics. Other third-party libraries may need to be installed.
BM1688/CV186X and BM1684X/BM1684 correspond to different SDKs, which have not yet been published on the official website. Please contact technical staff to obtain them.
5. Technical Information
Tips
Please obtain related documents, materials, and video tutorials through the Sophgo official website Technical Information.
6. Community
Tips
The Sophgo community encourages developers to communicate and learn together through the following channels:
Sophgo Community website: https://www.sophgo.com/
Sophgo Developer Forum: https://developer.sophgo.com/forum/index.html
