Deploy MiniCPM-V-2_6

1. Introduction

MiniCPM-V-2_6 is a multimodal pre-trained model based on the MiniCPM architecture. It is designed for vision-language tasks and processes images and text efficiently. The model is built on SigLip-400M and Qwen2-7B and has 8B parameters in total.

1. Features

  • Supports Chinese.
  • Supports uploading one or more images.

2. Operation steps

1. Clone the project, install the environment, and download the model

1.1 Clone the LLM-TPU project

    git clone https://github.com/sophgo/LLM-TPU.git
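
Alternatively, download the project archive on another machine and transfer it to the /data directory on the board. For example, log in over SSH with MobaXterm, drag the archive in through its built-in SFTP panel, and then unzip it under /data. The commands in the following steps use /data/LLM-TPU-main, which is the directory name produced by unzipping the archive of the main branch. If you cloned with git instead, the directory is named LLM-TPU, so adjust the paths accordingly.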

1.2 Install the environment. Skip this step if it is already installed. On non-Ubuntu systems, use yum or another package manager as appropriate.

    sudo apt-get update
    sudo apt-get install pybind11-dev
    pip3 install sentencepiece transformers==4.40.0
    pip3 install gradio==3.39.0 mdtex2html==1.2.0 dfss
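
To confirm that the install step took effect, the pinned versions can be checked from Python. The following is a minimal sketch, not part of the LLM-TPU project: the function names `installed_version` and `check_packages` are hypothetical, and the table only mirrors the versions pinned in the commands above (sentencepiece and dfss are not pinned there, so only their presence is checked).

```python
from importlib import metadata

# Versions pinned by the pip3 commands above; None means "any version".
PINNED = {
    "transformers": "4.40.0",
    "gradio": "3.39.0",
    "mdtex2html": "1.2.0",
    "sentencepiece": None,
    "dfss": None,
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_packages(pinned=PINNED, lookup=installed_version):
    """Return {package: problem} for every missing or mismatched package."""
    problems = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found is None:
            problems[name] = "not installed"
        elif wanted is not None and found != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


# Print one line per problem; prints nothing when everything matches.
for package, problem in check_packages().items():
    print(f"{package}: {problem}")
```

An empty result means the environment matches the pins. The `lookup` parameter exists only so the check can be exercised without touching the real environment.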

1.3 Download the model

    ##First, enter the python_demo directory.
    cd /data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo/
    ##Then download the precompiled model directly.
    python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel
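
The model file name appears to encode the target chip, the quantization, the sequence length, and the image size. The sketch below splits the name into those fields so a script can check them before loading. The naming pattern is an assumption inferred from this one file name, not a documented convention, and `describe_bmodel` is a hypothetical helper.

```python
import re
from pathlib import Path

MODEL_NAME = "minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel"

# Assumed pattern: <model>_<chip>_<quant>_seq<length>_imsize<size>.bmodel
PATTERN = re.compile(
    r"(?P<model>[^_]+)_(?P<chip>[^_]+)_(?P<quant>[^_]+)"
    r"_seq(?P<seq_len>\d+)_imsize(?P<image_size>\d+)\.bmodel"
)


def describe_bmodel(path):
    """Split a bmodel file name into its assumed fields."""
    match = PATTERN.fullmatch(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected bmodel name: {path}")
    info = match.groupdict()
    info["seq_len"] = int(info["seq_len"])
    info["image_size"] = int(info["image_size"])
    return info


print(describe_bmodel(MODEL_NAME))
```

Under this reading, the image size in the file name is the value to compare against the image size the model was exported with. Checking `Path(MODEL_NAME).is_file()` in the python_demo directory is a simple way to confirm that the download completed.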

2. Python demo

2.1 Compile library files

    mkdir build && cd build            ##Create a compilation directory and enter it
    cmake ..                           ##Generate the Makefile using CMake
    make                               ##Compile the project
    cp *chat* ..                       ##Copy the compiled libraries to the running directory
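
After the copy step, it can help to verify that a compiled library actually landed in the running directory. This is a minimal sketch under one assumption: that the build produces a shared object whose name contains "chat" and ends in ".so". The helper names are hypothetical.

```python
import fnmatch
import os


def find_chat_libraries(file_names):
    """Return names that look like a compiled chat library (assumed: *chat*.so)."""
    return sorted(fnmatch.filter(file_names, "*chat*.so"))


def report(directory="."):
    """Describe whether a compiled chat library is present in a directory."""
    libraries = find_chat_libraries(os.listdir(directory))
    if libraries:
        return "found: " + ", ".join(libraries)
    return "no compiled chat library found; rerun the build and copy steps"


# Run this from the python_demo directory.
print(report())
```

The ".so" filter is there because source files in the same directory may also match "*chat*".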

2.2 Run the demo

    cd /data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo/    ##Enter the python_demo directory
    python3 pipeline.py --model_path minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel --processor_path ../support/processor_config/ --devid 0    ##Run the demo
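
When the demo is launched from a script, building the command as an argument list avoids shell quoting and comment pitfalls. The sketch below uses only the three flags shown in the command above; `build_demo_command` is a hypothetical helper, and the defaults simply repeat the values from this page.

```python
import shlex


def build_demo_command(
    model_path="minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel",
    processor_path="../support/processor_config/",
    devid=0,
):
    """Return the pipeline.py invocation as an argument list."""
    return [
        "python3", "pipeline.py",
        "--model_path", model_path,
        "--processor_path", processor_path,
        "--devid", str(devid),
    ]


# Show the equivalent shell command line.
print(shlex.join(build_demo_command()))
```

The resulting list can be passed to `subprocess.run(..., cwd=<python_demo directory>)`. Since pipeline.py reads the question and image path interactively, standard input should stay attached to the terminal.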

3. Example output

    linaro@bm1684:/data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo$ python3 pipeline.py --model_path minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel --processor_path ../support/processor_config/ --devid 0
    Load ../support/processor_config/ ...
    Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    Device [ 0 ] loading .....
    bmcpu init: skip cpu_user_defined
    open usercpu.so, init user_cpu_init
    Model[minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel] loading ....
    Done!

    =================================================================
    1. If you want to quit, please enter one of [q, quit, exit]
    2. To create a new chat session, please enter one of [clear, new]
    =================================================================

    Question: Please analyze it step by step in detail. Are there two dogs in this picture?

    Image Num: 1

    Image Path 0: cat.png
    Please note that currently, variable image sizes are not supported, so the images will be resized. The target size is the image size when exporting the ONNX model.
    Please note that if you use a different image size when running export_onnx.py, please modify the following line of code: single_imsize = (448, 448).

    Answer:
    There is one dog in this picture, not two. The fur color and features of this dog, such as the pointed ears, black muzzle, and red collar, can be identified. The intimate posture between the cat and the dog, along with their similar fur colors, may cause confusion. However, by carefully observing their features, such as the shape of the dog's muzzle and its ears, it is clear that there are differences between them. Therefore, there is one dog in this picture.
    FTL: 1.784 s
    TPS: 9.662 token/s
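
The two figures at the end of the output can be turned into a rough response-time estimate. The sketch below assumes that FTL is the latency of the first token in seconds and that TPS is the generation rate for the tokens after it; these readings are assumptions, not definitions taken from this page, and both helper names are hypothetical.

```python
import re


def parse_metrics(log_text):
    """Extract (FTL seconds, TPS token/s) from the demo output."""
    ftl = re.search(r"FTL:\s*([\d.]+)\s*s", log_text)
    tps = re.search(r"TPS:\s*([\d.]+)\s*token/s", log_text)
    if ftl is None or tps is None:
        raise ValueError("FTL/TPS lines not found in log")
    return float(ftl.group(1)), float(tps.group(1))


def estimate_seconds(ftl, tps, new_tokens):
    """Estimate total time: first token after FTL, the rest at TPS (assumed model)."""
    return ftl + max(new_tokens - 1, 0) / tps


ftl, tps = parse_metrics("FTL: 1.784 s\nTPS: 9.662 token/s")
print(f"100 tokens: about {estimate_seconds(ftl, tps, 100):.1f} s")
```

With the figures above, a 100-token answer would take roughly 12 seconds under this model.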

4. Author's Environment

    Package                   Version
    ------------------------- -----------
    aiofiles                  23.2.1
    aiohappyeyeballs          2.4.4
    aiohttp                   3.10.11
    aiosignal                 1.3.1
    altair                    5.4.1
    annotated-types           0.7.0
    anyio                     4.5.2
    async-timeout             5.0.1
    attrs                     25.1.0
    certifi                   2019.11.28
    chardet                   3.0.4
    click                     8.1.3
    contourpy                 1.1.1
    cycler                    0.12.1
    dbus-python               1.2.16
    dfss                      1.9.2
    distlib                   0.3.9
    distro                    1.9.0
    distro-info               0.23ubuntu1
    exceptiongroup            1.2.2
    fastapi                   0.115.8
    ffmpy                     0.5.0
    filelock                  3.16.1
    Flask                     2.2.2
    fonttools                 4.54.0
    frozenlist                1.5.0
    fsspec                    2025.2.0
    gradio                    3.39.0
    gradio_client             1.3.0
    h11                       0.14.0
    httpcore                  1.0.7
    httpx                     0.28.1
    huggingface-hub           0.29.1
    idna                      2.8
    importlib-metadata        6.0.0
    importlib_resources       6.4.5
    itsdangerous              2.1.2
    Jinja2                    3.1.2
    jiter                     0.9.0
    jsonschema                4.23.0
    jsonschema-specifications 2023.12.1
    kiwisolver                1.4.7
    latex2mathml              3.77.0
    linkify-it-py             2.0.3
    Markdown                  3.7
    markdown-it-py            2.2.0
    MarkupSafe                2.1.2
    matplotlib                3.7.5
    mdit-py-plugins           0.3.3
    mdtex2html                1.2.0
    mdurl                     0.1.2
    mpmath                    1.3.0
    multidict                 6.1.0
    narwhals                  1.28.0
    netifaces                 0.10.4
    networkx                  3.1
    numpy                     1.24.1
    openai                    1.74.0
    orjson                    3.10.15
    packaging                 24.1
    pandas                    2.0.3
    pillow                    10.4.0
    pip                       25.0.1
    pkgutil_resolve_name      1.3.10
    platformdirs              4.3.6
    propcache                 0.2.0
    psutil                    5.9.4
    pydantic                  2.10.6
    pydantic_core             2.27.2
    pydub                     0.25.1
    PyGObject                 3.36.0
    pymacaroons               0.13.0
    PyNaCl                    1.3.0
    pyparsing                 3.1.4
    pyserial                  3.5
    python-apt                2.0.0
    python-dateutil           2.9.0.post0
    python-multipart          0.0.20
    pytz                      2025.1
    PyYAML                    5.3.1
    referencing               0.35.1
    regex                     2024.11.6
    requests                  2.22.0
    requests-unixsocket       0.2.0
    rpds-py                   0.20.1
    safetensors               0.5.2
    semantic-version          2.10.0
    sentencepiece             0.2.0
    setuptools                45.2.0
    six                       1.14.0
    sniffio                   1.3.1
    sophon-arm                3.10.0
    sse-starlette             2.1.3
    ssh-import-id             5.10
    starlette                 0.44.0
    sympy                     1.13.3
    tokenizers                0.19.1
    torch                     2.4.1
    torchaudio                2.4.1
    torchvision               0.19.1
    tqdm                      4.67.1
    transformers              4.40.0
    typing_extensions         4.12.2
    tzdata                    2025.1
    ubuntu-advantage-tools    20.3
    uc-micro-py               1.0.3
    unattended-upgrades       0.1
    urllib3                   1.25.8
    uvicorn                   0.33.0
    virtualenv                20.30.0
    websockets                11.0.3
    Werkzeug                  2.2.2
    wheel                     0.34.2
    yarl                      1.15.2
    zipp                      3.11.0
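
To compare a board against the list above, the output of `pip3 list` can be parsed and diffed. This is a minimal sketch that assumes the two-column "Package Version" layout shown above; `parse_pip_list` and `diff_environments` are hypothetical helper names.

```python
def parse_pip_list(text):
    """Parse two-column `pip list` output into {lowercase name: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0].lower()] = parts[1]
    return versions


def diff_environments(reference, local):
    """Return {name: (reference version, local version or None)} for mismatches."""
    return {
        name: (wanted, local.get(name))
        for name, wanted in reference.items()
        if local.get(name) != wanted
    }


reference = parse_pip_list("Package Version\n------- -------\ntorch 2.4.1\ntransformers 4.40.0\n")
local = {"torch": "2.4.1", "transformers": "4.41.0"}
print(diff_environments(reference, local))
```

In practice, the reference text would be the full list above and the local text the board's own `pip3 list` output.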
Contributors: zwhuang