# Img2Vec
A rough implementation for generating image embeddings using the methodology introduced in LLaVA.
### Structure
We derive image embeddings by encoding images with a CLIP vision encoder and mapping the resulting features through the pretrained LLaVA projection weights.
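
Below is a minimal sketch of this pipeline, not the repository's actual script. It assumes the CLIP ViT-L/14-336 vision tower, a 5120-dim projection (LLaVA-13B), and that the checkpoint shard stores a single linear projection under `model.mm_projector.weight` / `model.mm_projector.bias`; those key names and the single-linear layout are assumptions that may differ from your checkpoint.

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModel

VISION_MODEL = "openai/clip-vit-large-patch14-336"
vision_model = CLIPVisionModel.from_pretrained(VISION_MODEL).eval()
processor = CLIPImageProcessor.from_pretrained(VISION_MODEL)

# Load only the projection weights from the LLaVA checkpoint shard.
ckpt = torch.load("./datasets/pytorch_model-00003-of-00003.bin", map_location="cpu")
proj_w = ckpt["model.mm_projector.weight"]  # assumed key; shape (5120, 1024)
proj_b = ckpt["model.mm_projector.bias"]    # assumed key; shape (5120,)

image = Image.open("example.jpg").convert("RGB")
pixels = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Take penultimate-layer patch features and drop the CLS token
    # (the feature-selection convention used by LLaVA).
    hidden = vision_model(pixels, output_hidden_states=True).hidden_states[-2]
    feats = hidden[:, 1:]                                   # (1, 576, 1024)
    image_embeds = torch.nn.functional.linear(feats, proj_w, proj_b)  # (1, 576, 5120)
```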
### Prerequisites
1. Install the dependencies listed in `requirements.txt`
2. Make sure you have downloaded the LLaVA checkpoint shard `pytorch_model-00003-of-00003.bin` (see the sketch after this list for one way to fetch it)
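
One way to fetch the shard, shown as a hedged sketch using `huggingface_hub`; the repo id below is a placeholder, so substitute whichever LLaVA checkpoint repository you are actually using.

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id: replace with the LLaVA checkpoint repo you use.
path = hf_hub_download(
    repo_id="<llava-13b-checkpoint-repo>",
    filename="pytorch_model-00003-of-00003.bin",
    local_dir="./datasets",
)
print(path)  # e.g. ./datasets/pytorch_model-00003-of-00003.bin
```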
### Usage
Replace **--image-dir** and **--llava-ckpt** with the path to your test image folder and the path to `pytorch_model-00003-of-00003.bin`, respectively:
```bash
python convert_images_to_vectors.py \
  --image-dir ./datasets/coco/val2017 \
  --output-dir imgVecs \
  --vision-model openai/clip-vit-large-patch14-336 \
  --proj-dim 5120 \
  --llava-ckpt ./datasets/pytorch_model-00003-of-00003.bin \
  --batch-size 64
```