Difference between `<image>`, `<img>`, and `<im_start>`
#2
by mbrhd - opened
What exactly is the difference between these tokens: `<image>`, `<img>`, and `<im_start>`? According to the model card, they all seem to be related to the image. However, the code snippet uses `<image>` in the prompt, while training used the `<img>` token.
The differences are as follows:
- `<image>`: This is used as a placeholder in the prompt. In the code, it will eventually be replaced by a sequence that starts with `<img>`, followed by several `<IMG_CONTEXT>` tokens (which act as placeholders for the actual visual tokens produced by a Vision Transformer), and ends with `</img>`.
- `<img>` and `</img>`: These tokens mark the start and end of an image, respectively. They encapsulate the visual context (i.e., the `<IMG_CONTEXT>` tokens) that represents the processed image.
- `<|im_start|>`: This token is part of the ChatML template and is not directly related to image processing. It is used for formatting or structuring the dialogue rather than representing any image data.
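The placeholder expansion described above can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual code: the helper name `expand_image_placeholders` and its parameters `num_patches_list` and `num_image_token` are assumptions made for this example, and the tiny token counts are chosen only to keep the output readable.

```python
# Token strings as named in the answer above.
IMG_START_TOKEN = "<img>"
IMG_END_TOKEN = "</img>"
IMG_CONTEXT_TOKEN = "<IMG_CONTEXT>"


def expand_image_placeholders(prompt, num_patches_list, num_image_token):
    """Replace each `<image>` placeholder, in order, with an expanded token span.

    Hypothetical helper: `num_patches_list` holds the number of tiles per image,
    and `num_image_token` is the assumed number of context tokens per tile.
    """
    for num_patches in num_patches_list:
        image_tokens = (
            IMG_START_TOKEN
            + IMG_CONTEXT_TOKEN * (num_image_token * num_patches)
            + IMG_END_TOKEN
        )
        # Replace only the first remaining placeholder so images map in order.
        prompt = prompt.replace("<image>", image_tokens, 1)
    return prompt


# One image split into 2 tiles, with 3 context tokens per tile (toy numbers).
expanded = expand_image_placeholders(
    "<image>\nDescribe the image.", [2], num_image_token=3
)
print(expanded)
```

In a real pipeline, the embeddings at the `<IMG_CONTEXT>` positions would then be overwritten with the visual features from the vision encoder; that step is not shown here.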
In summary, <image> is a higher-level placeholder that gets expanded into a specific image token structure (<img>... </img> with visual tokens), while <|im_start|> is a formatting symbol for the chat interface unrelated to image tokens.
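To make the role of `<|im_start|>` concrete, here is a minimal sketch of the generic ChatML layout, in which every turn is wrapped as `<|im_start|>role ... <|im_end|>`. The function `to_chatml` is a hypothetical helper written for this illustration; it is not the model's own template code, and the real template may add a system message or other details.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {'role', 'content'} dicts in the generic ChatML layout.

    Hypothetical helper for illustration only.
    """
    parts = []
    for message in messages:
        # Each turn is delimited by <|im_start|> and <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chat = to_chatml([{"role": "user", "content": "<image>\nDescribe the image."}])
print(chat)
```

Note that the `<image>` placeholder sits inside the user turn as ordinary content: the ChatML markers delimit dialogue turns regardless of whether a turn contains an image.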
czczup changed discussion status to closed