Deploying MiniCPM-V-2_6

I. Introduction

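MiniCPM-V-2_6 is a multimodal pretrained model based on the MiniCPM architecture. It is designed for vision and language tasks and processes images and text efficiently. The model is built on SigLip-400M and Qwen2-7B and has 8B parameters in total.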

1. Features

Supports Chinese.
Supports uploading a single image or multiple images.

II. Steps to Run

1. Clone the project, install the environment, and download the model

1.1 Clone the LLM-TPU project

    git clone https://github.com/sophgo/LLM-TPU.git
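    # Alternatively, download the project archive and upload it to /data on the board
    # (for example, log in over SSH with MobaXterm and drag it in through the built-in SFTP panel),
    # then extract it under /data/.
    # Note: later paths in this guide use /data/LLM-TPU-main; a git clone creates a directory
    # named LLM-TPU instead, so adjust the paths accordingly.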

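1.2 Install the environment (skip this step if it is already installed; on non-Ubuntu systems, use yum or another package manager as appropriate)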

    sudo apt-get update
    sudo apt-get install pybind11-dev
    pip3 install sentencepiece transformers==4.40.0
    pip3 install gradio==3.39.0 mdtex2html==1.2.0 dfss
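
After installing, it can help to confirm that the pinned versions are actually the ones Python sees. The following is a minimal sketch using only the standard library; the `REQUIRED` table and the `check_packages` helper are illustrative names, not part of LLM-TPU, and the pins are copied from the pip3 commands above.

```python
from importlib import metadata

# Versions pinned by the pip3 commands above; unpinned packages map to None.
REQUIRED = {
    "sentencepiece": None,
    "transformers": "4.40.0",
    "gradio": "3.39.0",
    "mdtex2html": "1.2.0",
    "dfss": None,
}


def check_packages(required, get_version=metadata.version):
    """Return {package: problem} for packages that are missing or mismatched."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if wanted is not None and found != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


# Print one line per problem; no output means every requirement is satisfied.
for name, problem in check_packages(REQUIRED).items():
    print(f"{name}: {problem}")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.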

1.3 Download the model

    # First change into the python_demo directory
    cd /data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo/
    # Then download the precompiled model directly
    python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel
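
Because the project directory may be named either LLM-TPU-main or LLM-TPU depending on how it was obtained, a quick check that the demo directory and the downloaded model are where the later steps expect them can save a failed run. This is a rough sketch under that assumption; `find_python_demo` and `model_is_present` are hypothetical helpers, not part of the project.

```python
from pathlib import Path

MODEL_NAME = "minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel"
# Assumed locations: the path used in this guide first, then the git clone name.
CANDIDATE_ROOTS = ("/data/LLM-TPU-main", "/data/LLM-TPU")


def find_python_demo(roots=CANDIDATE_ROOTS):
    """Return the first existing python_demo directory, or None."""
    for root in roots:
        demo = Path(root) / "models" / "MiniCPM-V-2_6" / "python_demo"
        if demo.is_dir():
            return demo
    return None


def model_is_present(demo_dir, name=MODEL_NAME):
    """True if the bmodel file exists in demo_dir and is not empty."""
    path = Path(demo_dir) / name
    return path.is_file() and path.stat().st_size > 0


demo_dir = find_python_demo()
if demo_dir is None:
    print("python_demo directory not found under", CANDIDATE_ROOTS)
elif not model_is_present(demo_dir):
    print("model file missing or empty in", demo_dir)
else:
    print("model found in", demo_dir)
```

A non-empty file is only a weak signal; an interrupted download can still leave a truncated model behind.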

2. Python demo

2.1 Build the library

    mkdir build && cd build                             # create the build directory and enter it
    cmake ..                                            # generate the Makefile with cmake
    make                                                # compile
    cp *chat* ..                                        # copy the built library into the run directory

2.2 Run the demo

    cd /data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo/    # enter the python_demo directory
    python3 pipeline.py --model_path minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel --processor_path ../support/processor_config/ --devid 0    # run the demo
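
When the demo is launched from another script, building the command as an argument list avoids shell quoting and comment pitfalls. The sketch below assembles the same invocation shown above; `build_demo_command` is a hypothetical helper, and it uses only the three flags that appear in this guide.

```python
import shlex


def build_demo_command(model_path,
                       processor_path="../support/processor_config/",
                       devid=0):
    """Return the pipeline.py invocation as an argv list."""
    return [
        "python3", "pipeline.py",
        "--model_path", str(model_path),
        "--processor_path", str(processor_path),
        "--devid", str(devid),
    ]


cmd = build_demo_command("minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

To actually start the demo, the list can be passed to `subprocess.run(cmd, cwd=..., check=True)` with `cwd` set to the python_demo directory, since the processor path is relative to it.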

III. Results

    linaro@bm1684:/data/LLM-TPU-main/models/MiniCPM-V-2_6/python_demo$ python3 pipeline.py --model_path minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel --processor_path ../support/processor_config/ --devid 0
    Load ../support/processor_config/ ...
    Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    Device [ 0 ] loading .....
    bmcpu init: skip cpu_user_defined
    open usercpu.so, init user_cpu_init
    Model[minicpmv26_bm1684x_int4_seq1024_imsize448.bmodel] loading ....
    Done!

    =================================================================
    1. If you want to quit, please enter one of [q, quit, exit]
    2. To create a new chat session, please enter one of [clear, new]
    =================================================================

    Question: 请逐步详细分析，这张图片里有两只狗，对吗？

    Image Num: 1

    Image Path 0: cat.png
    请注意，目前不支持图片size可变，因此图片会进行resize。目标size为export_onnx时的图片size
    请注意，如果你export_onnx.py时使用的是其他图片size，请修改下面这行代码: single_imsize = (448, 448)

    Answer:
    这张图片里有一只狗，不是两只。这只狗是一它的毛色和特征，比如尖耳朵、黑色口吻和红色项圈，可以被识别出来。猫和狗之间亲密的姿势，以及它们相似的毛色，可能会导致混淆，但仔细观察它们的特征，比如狗的口吻形状和耳朵，可以清楚地看出它们之间存在差异。因此，这张图片里有一只狗。
    FTL: 1.784 s
    TPS: 9.662 token/s
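
The two figures at the end of the output can be pulled out of a captured log for comparison across runs. The sketch below is illustrative only: it reads FTL as first-token latency and TPS as tokens per second, and the estimate assumes that TPS does not include the first token, which this guide does not state. Both function names are hypothetical.

```python
import re


def parse_metrics(log_text):
    """Extract the FTL (seconds) and TPS (tokens/second) values from demo output."""
    ftl = re.search(r"FTL:\s*([\d.]+)\s*s", log_text)
    tps = re.search(r"TPS:\s*([\d.]+)\s*token/s", log_text)
    if not (ftl and tps):
        raise ValueError("FTL/TPS lines not found")
    return float(ftl.group(1)), float(tps.group(1))


def estimate_reply_seconds(ftl, tps, n_tokens):
    """Rough estimate: wait for the first token, then the rest at TPS.

    Assumes TPS is measured over the tokens after the first one.
    """
    return ftl + max(n_tokens - 1, 0) / tps


sample = "FTL: 1.784 s\nTPS: 9.662 token/s\n"
ftl, tps = parse_metrics(sample)
print(f"FTL={ftl} s, TPS={tps} token/s")
```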

IV. Author's Environment

    Package                   Version
    ------------------------- -----------
    aiofiles                  23.2.1
    aiohappyeyeballs          2.4.4
    aiohttp                   3.10.11
    aiosignal                 1.3.1
    altair                    5.4.1
    annotated-types           0.7.0
    anyio                     4.5.2
    async-timeout             5.0.1
    attrs                     25.1.0
    certifi                   2019.11.28
    chardet                   3.0.4
    click                     8.1.3
    contourpy                 1.1.1
    cycler                    0.12.1
    dbus-python               1.2.16
    dfss                      1.9.2
    distlib                   0.3.9
    distro                    1.9.0
    distro-info               0.23ubuntu1
    exceptiongroup            1.2.2
    fastapi                   0.115.8
    ffmpy                     0.5.0
    filelock                  3.16.1
    Flask                     2.2.2
    fonttools                 4.54.0
    frozenlist                1.5.0
    fsspec                    2025.2.0
    gradio                    3.39.0
    gradio_client             1.3.0
    h11                       0.14.0
    httpcore                  1.0.7
    httpx                     0.28.1
    huggingface-hub           0.29.1
    idna                      2.8
    importlib-metadata        6.0.0
    importlib_resources       6.4.5
    itsdangerous              2.1.2
    Jinja2                    3.1.2
    jiter                     0.9.0
    jsonschema                4.23.0
    jsonschema-specifications 2023.12.1
    kiwisolver                1.4.7
    latex2mathml              3.77.0
    linkify-it-py             2.0.3
    Markdown                  3.7
    markdown-it-py            2.2.0
    MarkupSafe                2.1.2
    matplotlib                3.7.5
    mdit-py-plugins           0.3.3
    mdtex2html                1.2.0
    mdurl                     0.1.2
    mpmath                    1.3.0
    multidict                 6.1.0
    narwhals                  1.28.0
    netifaces                 0.10.4
    networkx                  3.1
    numpy                     1.24.1
    openai                    1.74.0
    orjson                    3.10.15
    packaging                 24.1
    pandas                    2.0.3
    pillow                    10.4.0
    pip                       25.0.1
    pkgutil_resolve_name      1.3.10
    platformdirs              4.3.6
    propcache                 0.2.0
    psutil                    5.9.4
    pydantic                  2.10.6
    pydantic_core             2.27.2
    pydub                     0.25.1
    PyGObject                 3.36.0
    pymacaroons               0.13.0
    PyNaCl                    1.3.0
    pyparsing                 3.1.4
    pyserial                  3.5
    python-apt                2.0.0
    python-dateutil           2.9.0.post0
    python-multipart          0.0.20
    pytz                      2025.1
    PyYAML                    5.3.1
    referencing               0.35.1
    regex                     2024.11.6
    requests                  2.22.0
    requests-unixsocket       0.2.0
    rpds-py                   0.20.1
    safetensors               0.5.2
    semantic-version          2.10.0
    sentencepiece             0.2.0
    setuptools                45.2.0
    six                       1.14.0
    sniffio                   1.3.1
    sophon-arm                3.10.0
    sse-starlette             2.1.3
    ssh-import-id             5.10
    starlette                 0.44.0
    sympy                     1.13.3
    tokenizers                0.19.1
    torch                     2.4.1
    torchaudio                2.4.1
    torchvision               0.19.1
    tqdm                      4.67.1
    transformers              4.40.0
    typing_extensions         4.12.2
    tzdata                    2025.1
    ubuntu-advantage-tools    20.3
    uc-micro-py               1.0.3
    unattended-upgrades       0.1
    urllib3                   1.25.8
    uvicorn                   0.33.0
    virtualenv                20.30.0
    websockets                11.0.3
    Werkzeug                  2.2.2
    wheel                     0.34.2
    yarl                      1.15.2
    zipp                      3.11.0