Unverified Commit 33268d8c authored by echo840, committed by GitHub

Update README.md

parent 4747dc52
@@ -78,12 +78,6 @@ We also offer Monkey's model definition and training code, which you can explore
The json file used for Monkey training can be downloaded at [Link](https://drive.google.com/file/d/18z_uQTe8Jq61V5rgHtxOt85uKBodbvw1/view?usp=sharing).
**ATTENTION:** Specify the path to your training data, which should be a json file consisting of a list of conversations.
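For illustration only, here is a minimal sketch of what one record in that conversation list might look like. The field names (```id```, ```conversations```, ```from```, ```value```) and the ```<img>``` tag convention are assumptions in the Qwen-VL style; check the downloaded json for the authoritative schema.

```python
import json

# Illustrative sketch only: the exact schema is defined by the downloadable json above.
# The keys and the <img> tag convention below are assumptions (Qwen-VL style).
example = [
    {
        "id": "sample_0",
        "conversations": [
            {"from": "user", "value": "<img>./images/example.jpg</img> Describe this image."},
            {"from": "assistant", "value": "A dog is running on the grass."},
        ],
    }
]

with open("train_monkey_example.json", "w") as f:
    json.dump(example, f, ensure_ascii=False, indent=2)
```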
Inspired by Qwen-VL, we freeze the Large Language Model (LLM) and introduce LoRA into four linear layers, ```"c_attn", "attn.c_proj", "w1", "w2"```, for training. This makes it possible to train Monkey on 8 NVIDIA 3090 GPUs. The specific implementation is in ```modeling_qwen_nvdia3090.py```.
- Add LoRA: replace the contents of ```modeling_qwen.py``` with the contents of ```modeling_qwen_nvdia3090.py```.
- Freeze the LLM: in ```finetune_multitask.py```, freeze all modules except the LoRA and Resampler modules (a sketch of this step follows below).
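As a rough sketch of the freezing step (not the repository's actual code), the snippet below toggles ```requires_grad``` by parameter name. The ```lora``` and ```resampler``` name substrings are assumptions and should be matched against the parameter names actually used in ```modeling_qwen_nvdia3090.py``` and ```finetune_multitask.py```.

```python
def freeze_except_lora_and_resampler(model):
    """Illustrative sketch: keep only LoRA and Resampler parameters trainable.

    The substrings checked here ("lora", "resampler") are assumptions; match them
    against the parameter names used in the actual model implementation.
    """
    for name, param in model.named_parameters():
        lowered = name.lower()
        param.requires_grad = ("lora" in lowered) or ("resampler" in lowered)
```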
## Inference
Run the inference code for Monkey and Monkey-Chat:
......
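For orientation, a minimal loading sketch with Hugging Face ```transformers``` is shown below. The checkpoint ID ```echo840/Monkey``` and the ```<img>...</img> ... Answer:``` prompt format are assumptions borrowed from the Qwen-VL convention; defer to the repository's inference script for the exact usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint ID; substitute the path or Hub ID actually released for Monkey or Monkey-Chat.
checkpoint = "echo840/Monkey"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, device_map="cuda", trust_remote_code=True
).eval()

# Assumed Qwen-VL-style prompt: the image path is wrapped in <img> tags.
prompt = "<img>./images/example.jpg</img> Describe this image in detail. Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```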