
Multi-Modal Documentation

📚 Tutorial

Human Preference Alignment Training Documentation
LmDeploy Inference Acceleration and Deployment
vLLM Inference Acceleration
MLLM Deployment Documentation

Multi-Modal Best Practice

A single round of dialogue can contain multiple images (or no images):

Qwen-VL Best Practice, Qwen2-VL Best Practice
Qwen-Audio Best Practice, Qwen2-Audio Best Practice
Llava Best Practice, Llava Video Best Practice
InternVL Series Best Practice
MiniCPM-V Best Practice, MiniCPM-V-2.6 Best Practice
Deepseek-VL Best Practice
Internlm-Xcomposer2 Best Practice
Phi3-Vision Best Practice, Phi3.5-Vision Best Practice
mPLUG-Owl3 Best Practice
GOT-OCR2 Best Practice

A single round of dialogue can contain only one image:

Yi-VL Best Practice
Florence Best Practice

The entire conversation revolves around one image:

CogVLM Best Practice, CogVLM2 Best Practice, GLM4V Best Practice, CogVLM2-Video Best Practice
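
The three groupings above amount to different limits on how many images a conversation may carry. As a rough illustration only, the sketch below checks a list of per-round image counts against each grouping. `ImagePolicy` and `check_image_counts` are hypothetical names invented for this example and are not part of the swift API. The reading of "one image" as "at most one" is also an assumption; the individual best-practice documents are the authority on what each model accepts.

```python
from enum import Enum


class ImagePolicy(Enum):
    """Hypothetical labels for the three model groupings listed above."""
    MULTI_PER_ROUND = "multi_per_round"            # any number of images per round, including none
    ONE_PER_ROUND = "one_per_round"                # assumed: at most one image in each round
    ONE_PER_CONVERSATION = "one_per_conversation"  # assumed: at most one image in the whole conversation


def check_image_counts(images_per_round, policy):
    """Return True if the per-round image counts satisfy the given policy.

    images_per_round: list of non-negative ints, one entry per dialogue round.
    """
    if any(n < 0 for n in images_per_round):
        raise ValueError("image counts must be non-negative")
    if policy is ImagePolicy.MULTI_PER_ROUND:
        return True
    if policy is ImagePolicy.ONE_PER_ROUND:
        return all(n <= 1 for n in images_per_round)
    if policy is ImagePolicy.ONE_PER_CONVERSATION:
        return sum(images_per_round) <= 1
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    # A three-round dialogue with 2, 0, and 1 images in the respective rounds.
    rounds = [2, 0, 1]
    for policy in ImagePolicy:
        print(policy.value, check_image_counts(rounds, policy))
```

A check of this kind could run over a custom dataset before training to catch samples that a given model family cannot consume.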


© Copyright 2022-2024, Alibaba ModelScope.
