
Multi-Modal Documentation

📚 Tutorial

  1. Human Preference Alignment Training Documentation

  2. LmDeploy Inference Acceleration

  3. vLLM Inference Acceleration

  4. MLLM Deployment Documentation

Multi-Modal Best Practice

A single round of dialogue can contain multiple images (or no images):

  1. Qwen-VL Best Practice, Qwen2-VL Best Practice

  2. Qwen-Audio Best Practice, Qwen2-Audio Best Practice

  3. Llava Best Practice, Llava Video Best Practice

  4. InternVL Series Best Practice

  5. MiniCPM-V Best Practice, MiniCPM-V-2.6 Best Practice

  6. Deepseek-VL Best Practice

  7. Internlm-Xcomposer2 Best Practice

  8. Phi3-Vision Best Practice, Phi3.5-Vision Best Practice

  9. mPLUG-Owl3 Best Practice

  10. GOT-OCR2 Best Practice

A single round of dialogue can only contain one image:

  1. Yi-VL Best Practice

  2. Florence Best Practice

The entire conversation revolves around one image:

  1. CogVLM Best Practice, CogVLM2 Best Practice, GLM4V Best Practice, CogVLM2-Video Best Practice
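
The three groupings above can be read as constraints on where images may appear in a conversation. The following is a minimal sketch of that reading, not part of swift: `ImageMode` and `conversation_fits` are hypothetical names introduced only for illustration, and treating "only one image" as "at most one image" is an assumption.

```python
from enum import Enum
from typing import List


class ImageMode(Enum):
    """Hypothetical labels for the three groupings listed above."""
    MULTI_PER_ROUND = "multi_per_round"            # any number of images per round
    ONE_PER_ROUND = "one_per_round"                # at most one image per round (assumption)
    ONE_PER_CONVERSATION = "one_per_conversation"  # one image shared by all rounds


def conversation_fits(mode: ImageMode, rounds: List[List[str]]) -> bool:
    """Check whether a conversation respects the image constraint of `mode`.

    rounds[i] holds the image paths attached to round i of the dialogue.
    """
    if mode is ImageMode.MULTI_PER_ROUND:
        # No restriction: zero, one, or many images in every round.
        return True
    if mode is ImageMode.ONE_PER_ROUND:
        # Each round may carry no more than a single image.
        return all(len(images) <= 1 for images in rounds)
    # ONE_PER_CONVERSATION: every round must refer to the same single image.
    distinct = {path for images in rounds for path in images}
    return len(distinct) <= 1


if __name__ == "__main__":
    rounds = [["a.png", "b.png"], []]
    print(conversation_fits(ImageMode.MULTI_PER_ROUND, rounds))
    print(conversation_fits(ImageMode.ONE_PER_ROUND, rounds))
```

A check of this kind could be run over a dataset before training to catch samples that a given model family cannot accept; the best-practice page for each model remains the authority on the actual data format.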


© Copyright 2022-2024, Alibaba ModelScope.
