Img2Vec
A rough implementation of image-embedding generation using the approach introduced in LLaVA.
Structure
We derive the image embeddings by encoding each image with a CLIP vision encoder and then mapping the resulting features through the pretrained LLaVA projection weights.
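The encode-then-project flow described above can be illustrated with a small, dependency-free sketch. It is not the code in this repository: the helper names `num_patch_tokens` and `linear_project` are hypothetical, the tensors are toy-sized stand-ins, and a single linear map is assumed for simplicity (the projector stored in a given LLaVA checkpoint may instead be a multi-layer MLP).

```python
# Illustrative sketch only: toy sizes, pure Python, no real model weights.
# Helper names are hypothetical and do not come from this repository.


def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of patch tokens a ViT produces for a square image."""
    per_side = image_size // patch_size
    return per_side * per_side


def linear_project(tokens, weight, bias):
    """Apply y = W x + b to every token.

    tokens: list of vectors, each of length d_in
    weight: d_out rows, each of length d_in
    bias:   vector of length d_out
    """
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]


# openai/clip-vit-large-patch14-336 takes 336x336 inputs split into 14x14 patches.
n_tokens = num_patch_tokens(336, 14)

# Toy stand-ins for encoder features: 2 tokens of width 3, projected to width 2.
# In the real setting the output width would be the value passed as --proj-dim.
toy_tokens = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
toy_weight = [[1.0, 1.0, 1.0], [0.5, 0.0, -1.0]]
toy_bias = [0.0, 1.0]
projected = linear_project(toy_tokens, toy_weight, toy_bias)

print(n_tokens)
print(projected)
```

In practice the same operation would be a batched matrix multiply on the GPU; the list-based version is only meant to make the shapes explicit.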
Prerequisites
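- Install the dependencies listed in requirements.txt (for example, with pip install -r requirements.txt)
- Make sure you have downloaded the LLaVA checkpoint shard pytorch_model-00003-of-00003.bin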
Usage
Replace the values of --image-dir and --llava-ckpt with the path to your test image folder and the path to pytorch_model-00003-of-00003.bin, respectively:
python convert_images_to_vectors.py --image-dir ./datasets/coco/val2017 --output-dir imgVecs --vision-model openai/clip-vit-large-patch14-336 --proj-dim 5120 --llava-ckpt ./datasets/pytorch_model-00003-of-00003.bin --batch-size 64
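For reference, the flags in the command above could be parsed along the following lines. This is a hedged sketch, assuming the script uses `argparse`; the function name `build_parser`, the help strings, and the choice to mark every flag as required are assumptions, not the actual contents of convert_images_to_vectors.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage command."""
    parser = argparse.ArgumentParser(
        description="Convert a folder of images to LLaVA-style embeddings"
    )
    parser.add_argument("--image-dir", required=True, help="folder of input images")
    parser.add_argument("--output-dir", required=True, help="where vectors are written")
    parser.add_argument("--vision-model", required=True, help="CLIP vision model name")
    parser.add_argument("--proj-dim", type=int, required=True, help="projection output width")
    parser.add_argument("--llava-ckpt", required=True, help="path to the LLaVA checkpoint shard")
    parser.add_argument("--batch-size", type=int, required=True, help="images per batch")
    return parser


# Parse the same arguments as the usage command (passed as a list, not read from sys.argv).
args = build_parser().parse_args(
    [
        "--image-dir", "./datasets/coco/val2017",
        "--output-dir", "imgVecs",
        "--vision-model", "openai/clip-vit-large-patch14-336",
        "--proj-dim", "5120",
        "--llava-ckpt", "./datasets/pytorch_model-00003-of-00003.bin",
        "--batch-size", "64",
    ]
)
print(args.image_dir, args.proj_dim, args.batch_size)
```

Note that `argparse` converts dashes in flag names to underscores, so `--image-dir` is read back as `args.image_dir`.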