Deploy Llama3 Example

1. Compile the model

Follow stage 1 of LLM-TPU-main to compile and convert the model into a bmodel file in an x86 environment, then transfer the file to the board.

Alternatively, download the bmodel from the resource download section.

Also download the official TPU demo from Sophgo (Suanneng).

Warning

Transfer the files to the /data directory on the board. After logging in over SSH with MobaXterm, you can drag and drop the files using its built-in SFTP feature.
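Before moving on, it can help to confirm that both files actually arrived on the board. The following is a minimal sketch, not part of the official demo: `check_transferred` is a hypothetical helper name, and the file names are the ones used in the build steps of this page.

```shell
# Report which of the expected files are present in a directory.
# Usage: check_transferred DIR FILE...
# Returns 0 when every file exists, 1 when at least one is missing.
check_transferred() {
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "ok       $dir/$name"
        else
            echo "missing  $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

On the board, a call such as `check_transferred /data LLM-TPU-main.zip llama3-8b_int4_1dev_1024.bmodel` prints one line per file and returns a non-zero status if either file is absent.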


2. Compile executable files

Tips

Make sure the board has Internet access. All of the following steps are performed on the board.

  1. Install the dependencies first:
    sudo apt-get update                     ## Update the package index
    sudo apt-get install -y pybind11-dev    ## Install pybind11-dev
    pip3 install transformers               ## Install the Python transformers package (may take a long time on a slow network)
  2. Build in the directory where the demo and the bmodel were transferred:
    sudo -i                                             ## Switch to the root user
    cd /data                                            ## Enter the /data directory
    unzip LLM-TPU-main.zip                              ## Unzip the LLM-TPU-main.zip file
    mv llama3-8b_int4_1dev_1024.bmodel /data/LLM-TPU-main/models/Llama3/python_demo  ## Move the bmodel into the demo directory
    cd /data/LLM-TPU-main/models/Llama3/python_demo     ## Enter the Llama3 demo directory
    mkdir build && cd build                             ## Create a build directory and enter it
    cmake ..                                            ## Generate the Makefile with cmake
    make                                                ## Compile the project
    cp *chat* ..                                        ## Copy the compiled library to the demo directory
  3. Run the demo:
    cd /data/LLM-TPU-main/models/Llama3/python_demo     ## Enter the Llama3 demo directory
    python3 pipeline.py --model_path ./llama3-8b_int4_1dev_1024.bmodel --tokenizer_path ../token_config/ --devid 0 ## Run the demo
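If the run step fails with an import error, one thing to check is whether the build output was copied into the demo directory. The sketch below assumes that the build produces a shared library whose file name contains `chat` and ends in `.so`, which is what the `cp *chat* ..` step suggests. `find_chat_module` is a hypothetical helper name, not part of the demo.

```shell
# Print files in DIR whose name contains "chat" and ends in ".so".
# Usage: find_chat_module DIR
# Returns 0 when at least one match exists, 1 otherwise.
find_chat_module() {
    found=1
    for f in "$1"/*chat*.so; do
        # When nothing matches, the glob stays literal and this test fails.
        if [ -f "$f" ]; then
            echo "$f"
            found=0
        fi
    done
    return "$found"
}
```

For example, `find_chat_module /data/LLM-TPU-main/models/Llama3/python_demo || echo "no chat library found, repeat the build step"` reports whether the copy succeeded.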

Sample output:

    root@bm1684:/data/LLM-TPU-main/models/Llama3/python_demo# python3 pipeline.py --model_path ./llama3-8b_int4_1dev_1024.bmodel --tokenizer_path ../token_config/ --devid 0
    None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
    Load ../token_config/ ...
    Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    Device [ 0 ] loading ....
    [BMRT][bmcpu_setup:498] INFO:cpu_lib 'libcpuop.so' is loaded.
    [BMRT][bmcpu_setup:521] INFO:Not able to open libcustomcpuop.so
    bmcpu init: skip cpu_user_defined
    open usercpu.so, init user_cpu_init
    [BMRT][BMProfileDeviceBase:190] INFO:gdma=0, tiu=0, mcu=0
    Model[./llama3-8b_int4_1dev_1024.bmodel] loading ....
    [BMRT][load_bmodel:1939] INFO:Loading bmodel from [./llama3-8b_int4_1dev_1024.bmodel]. Thanks for your patience...
    [BMRT][load_bmodel:1704] INFO:Bmodel loaded, version 2.2+v1.8.beta.0-89-g32b7f39b8-20240620
    [BMRT][load_bmodel:1706] INFO:pre net num: 0, load net num: 69
    [BMRT][load_tpu_module:1802] INFO:loading firmare in bmodel
    [BMRT][preload_funcs:2121] INFO: core_id=0, multi_fullnet_func_id=22
    [BMRT][preload_funcs:2124] INFO: core_id=0, dynamic_fullnet_func_id=23
    Done!

    =================================================================
    1. If you want to quit, please enter one of [q, quit, exit]
    2. To create a new chat session, please enter one of [clear, new]
    =================================================================

    Question: hello

    Answer: Hello! How can I help you?
    FTL: 1.690 s
    TPS: 7.194 token/s

    Question: who are you?

    Answer: I am Llama3, an AI assistant developed by IntellectNexus. How can I assist you?
    FTL: 1.607 s
    TPS: 7.213 token/s

Contributors: zsl, zwhuang