Currently, each image is sent in a batch together with all of the prompts, so every prompt is asked of every image.
This Space currently runs on the free CPU tier, so results may take a while to process. Duplicating the Space and running it on GPU hardware will be faster.
moondream2: a tiny vision language model.
Structured Dataframe
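
The "each prompt on each image" behavior and the structured output can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the Space's actual code: `answer_stub`, `ask_all`, and `pivot` are hypothetical names, and the stub stands in for the real model call, which is not shown here.

```python
from itertools import product


def answer_stub(image_name, prompt):
    """Hypothetical stand-in for the model call; returns a placeholder string."""
    return f"<answer to {prompt!r} for {image_name}>"


def ask_all(images, prompts, answer_fn=answer_stub):
    """Ask every prompt on every image: one row per (image, prompt) pair."""
    rows = []
    for image_name, prompt in product(images, prompts):
        rows.append(
            {
                "image": image_name,
                "prompt": prompt,
                "answer": answer_fn(image_name, prompt),
            }
        )
    return rows


def pivot(rows):
    """Reshape the rows into one record per image, with one column per prompt."""
    table = {}
    for row in rows:
        record = table.setdefault(row["image"], {"image": row["image"]})
        record[row["prompt"]] = row["answer"]
    return list(table.values())


if __name__ == "__main__":
    rows = ask_all(["cat.png", "dog.png"], ["What animal is this?", "What color is it?"])
    for record in pivot(rows):
        print(record)
```

With N images and M prompts this produces N × M answers, which is why processing time grows quickly on CPU. If pandas is available, the list returned by `ask_all` or `pivot` can be passed directly to `pandas.DataFrame(...)` to get a table like the one labeled above.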