Unverified Commit 1faeff85 authored by Younes Belkada, committed by GitHub

Fix Vip-llava docs (#28085)

* Update vipllava.md

* Update modeling_vipllava.py
parent ffa04def
@@ -37,13 +37,13 @@ Tips:
 - For better results, we recommend users to prompt the model with the correct prompt format:
 ```bash
-"USER: <image>\n<prompt>ASSISTANT:"
+A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: <image>\n<prompt>###Assistant:
 ```
 For multiple turns conversation:
 ```bash
-"USER: <image>\n<prompt1>ASSISTANT: <answer1>USER: <prompt2>ASSISTANT: <answer2>USER: <prompt3>ASSISTANT:"
+A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: <image>\n<prompt1>###Assistant: <answer1>###Human: <prompt2>###Assistant:
 ```
 The original code can be found [here](https://github.com/mu-cai/ViP-LLaVA).
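The prompt format introduced above can also be assembled programmatically; a minimal sketch (the `SYSTEM` constant and `build_vipllava_prompt` helper are illustrative names, not part of the transformers library):

```python
# Illustrative helper (not part of transformers): builds the ViP-LLaVA
# prompt format shown above for one or more conversation turns.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)

def build_vipllava_prompt(turns):
    """turns: list of (human_prompt, assistant_answer_or_None) pairs.

    The <image> token is attached to the first human turn only; pass None
    as the answer for the final, still-open assistant turn.
    """
    text = SYSTEM
    for i, (human, assistant) in enumerate(turns):
        image_token = "<image>\n" if i == 0 else ""
        text += f"###Human: {image_token}{human}###Assistant:"
        if assistant is not None:
            text += f" {assistant}"
    return text

# Single turn: reproduces the first format above.
print(build_vipllava_prompt([("<prompt>", None)]))
```

With two turns (the second answer left as `None`), the helper reproduces the multi-turn format shown above.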
@@ -367,23 +367,26 @@ class VipLlavaForConditionalGeneration(VipLlavaPreTrainedModel):
 Example:
 ```python
+>>> import torch
 >>> from PIL import Image
 >>> import requests
 >>> from transformers import AutoProcessor, VipLlavaForConditionalGeneration
->>> model = VipLlavaForConditionalGeneration.from_pretrained("llava-hf/vipllava-7b-hf")
->>> processor = AutoProcessor.from_pretrained("llava-hf/vipllava-7b-hf")
+>>> model = VipLlavaForConditionalGeneration.from_pretrained("llava-hf/vip-llava-7b-hf", device_map="auto", torch_dtype=torch.float16)
+>>> processor = AutoProcessor.from_pretrained("llava-hf/vip-llava-7b-hf")
->>> prompt = "USER: <image>\nCan you please describe this image?\nASSISTANT:"
+>>> prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: <image>\n{}###Assistant:"
+>>> question = "Can you please describe this image?"
+>>> prompt = prompt.format(question)
 >>> url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/compel-neg.png"
 >>> image = Image.open(requests.get(url, stream=True).raw)
->>> inputs = processor(text=text, images=image, return_tensors="pt")
+>>> inputs = processor(text=prompt, images=image, return_tensors="pt").to(0, torch.float16)
 >>> # Generate
 >>> generate_ids = model.generate(**inputs, max_new_tokens=20)
->>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
-"USER: <image> \nCan you please describe this image?\nASSISTANT: The image features a brown and white cat sitting on a green surface, with a red ball in its paw."
+>>> processor.decode(generate_ids[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
+The image features a brown and white cat sitting on a green surface, with a red ball in its
 ```"""
 output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
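The new decoding line in the example slices off the prompt tokens before decoding, so only the model's continuation is returned; a toy sketch of that slicing idea with made-up token ids:

```python
# Toy illustration (made-up ids): model.generate returns the prompt ids
# followed by the newly generated ids, so slicing by the prompt length
# keeps only the continuation -- mirroring
# generate_ids[0][len(inputs["input_ids"][0]):] in the example above.
prompt_ids = [101, 7592, 2003]              # stands in for inputs["input_ids"][0]
generated = prompt_ids + [2023, 2145, 102]  # stands in for generate_ids[0]

new_ids = generated[len(prompt_ids):]       # keep only the new tokens
print(new_ids)  # [2023, 2145, 102]
```

Decoding `new_ids` instead of `generated` is why the new example prints only the answer, without the long system prompt echoed back.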