
# Img2Vec
A rough implementation of generating image embeddings using the methodology introduced in LLaVA.
### Structure
We derive image embeddings by encoding images with a CLIP encoder and projecting the resulting features with LLaVA's pretrained projection weights.
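The pipeline can be sketched as follows. This is a minimal, shape-only sketch under assumptions: CLIP ViT-L/14-336 emits 1024-dim patch features, and the 13B LLaVA projector maps them to a 5120-dim hidden size (matching `--proj-dim 5120` below). The real script would load the projector weights from `pytorch_model-00003-of-00003.bin` instead of initializing a fresh linear layer:

```python
import torch

# Assumed dimensions: CLIP ViT-L/14-336 hidden size is 1024, and the image
# is split into (336/14)^2 = 576 patches; LLaVA-13B's LLM hidden size is 5120.
clip_dim, proj_dim, n_patches = 1024, 5120, 576

# Stand-in for LLaVA's pretrained projector (a linear layer in LLaVA 1.x).
# In practice its weights are loaded from the checkpoint, not random.
projector = torch.nn.Linear(clip_dim, proj_dim)

features = torch.randn(1, n_patches, clip_dim)  # placeholder CLIP encoder output
embeddings = projector(features)                # image tokens in the LLM's space
print(embeddings.shape)                         # torch.Size([1, 576, 5120])
```

The projected tensors are what the script saves to disk as the image vectors.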
### Prerequisites
1. Install the dependencies: `pip install -r requirements.txt`
2. Make sure you have downloaded `pytorch_model-00003-of-00003.bin`
### Usage
Replace **--image-dir** and **--llava-ckpt** with the path to your test image folder and the path to `pytorch_model-00003-of-00003.bin`:

```shell
python convert_images_to_vectors.py \
  --image-dir ./datasets/coco/val2017 \
  --output-dir imgVecs \
  --vision-model openai/clip-vit-large-patch14-336 \
  --proj-dim 5120 \
  --llava-ckpt ./datasets/pytorch_model-00003-of-00003.bin \
  --batch-size 64
```