# Text2Img

A rough implementation of generating image embeddings through methodologies introduced in LLaVA.

### Structure

We derive the image embeddings by encoding each image with a CLIP encoder and mapping the resulting features through the pretrained LLaVA projection layer.

### Prerequisites

1. Install the packages in `requirements.txt`.
2. Make sure you have [llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5](https://huggingface.co/liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/tree/main) under your **models** folder.
3. For example image data, I used [2017 Val images 5K/1GB](http://images.cocodataset.org/zips/val2017.zip) and [2017 Train/Val annotations 241MB](http://images.cocodataset.org/annotations/annotations_trainval2017.zip).

### Usage

For `image_embedder.py`:

1. Embed a single image (print only): `python -m embed.image_embedder --image "C:\path\img.jpg" --no-save`
2. Embed a single image (save to file): `python -m embed.image_embedder --image "C:\path\to\image.jpg" --out "C:\project\embeddings\image_embeddings.pkl"`
3. Embed a folder of images: `python -m embed.image_embedder --folder "C:\path\to\images" --out "C:\project\embeddings\image_embeddings.pkl" --batch-size 32`

For `text_embedder.py`:

1. Embed a single article (print only): `python -m embed.text_embedder --text "This is my single-article input string."`
2. Embed a single article (save to file): `python -m embed.text_embedder --text "This is my single-article input string." --out "C:\project\embeddings\text_embeddings.pkl"`
3. Embed multiple articles from a file (one per line): `python -m embed.text_embedder --file "C:\path\to\articles.txt" --out "C:\project\embeddings\text_embeddings.pkl" --batch-size 8`
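
The mapping step above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the project's actual code: the `mlp2x` projector in the referenced checkpoint is a two-layer MLP with a GELU in between, and the dimensions below (1024 for CLIP ViT-L/336 patch features, 4096 for the Vicuna-7B hidden size, 576 patches for a 336px image) are assumptions based on that model family. In practice the weights would be loaded from the checkpoint's projector file rather than randomly initialized as here.

```python
import torch
import torch.nn as nn

CLIP_HIDDEN = 1024  # CLIP ViT-L/14-336 patch feature size (assumption)
LLM_HIDDEN = 4096   # Vicuna-7B hidden size (assumption)

# Two-layer MLP projector, matching the "mlp2x" naming in the checkpoint;
# real weights would come from the pretrained LLaVA projector, not init.
projector = nn.Sequential(
    nn.Linear(CLIP_HIDDEN, LLM_HIDDEN),
    nn.GELU(),
    nn.Linear(LLM_HIDDEN, LLM_HIDDEN),
)

# Stand-in for CLIP encoder output: 576 patch features (24x24 grid at 336px).
patch_features = torch.randn(1, 576, CLIP_HIDDEN)

with torch.no_grad():
    image_embeddings = projector(patch_features)

print(image_embeddings.shape)  # one embedding per patch, in LLM space
```

The projected tensor lives in the language model's embedding space, which is what lets the same embeddings be compared against text-side embeddings downstream.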