Image to Text
TOX name base_img_to_txt
Summary
A TouchDesigner component that generates text from an image and a prompt using the Google Gemini API.
Controls
Refer to the Common Controls for a list of all available parameters.
| Parameter Name | Parameter | Type | Description |
|---|---|---|---|
| Input Resolution | Inputresolution | menu | Downscale options for the input image. Reducing the input resolution to a half or a quarter helps maintain high performance. |
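
To give a sense of what the Input Resolution options do to the image, here is a minimal sketch of the arithmetic. The function name `downscaled_size` and the `divisor` argument are hypothetical and are not part of the component; the actual menu entries and how the component resizes the image internally are not documented here.

```python
def downscaled_size(width, height, divisor):
    """Return the (width, height) after dividing each dimension by divisor.

    divisor is an illustrative stand-in for a menu choice:
    1 = full, 2 = half, 4 = quarter (assumed, not the real menu values).
    """
    if divisor < 1:
        raise ValueError("divisor must be 1 or greater")
    # Integer division, clamped so a dimension never drops to zero.
    return max(1, width // divisor), max(1, height // divisor)


half = downscaled_size(1920, 1080, 2)     # (960, 540)
quarter = downscaled_size(1920, 1080, 4)  # (480, 270)

# Halving both dimensions leaves a quarter of the original pixels.
pixel_ratio = (half[0] * half[1]) / (1920 * 1080)  # 0.25
```

Because the pixel count falls with the square of the divisor, the half setting already cuts the amount of image data to a quarter.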
Outputs
| Output Index | Name | Type | Description |
|---|---|---|---|
| 0 | out_response | DAT | Contains the response returned by the Google Gemini API |
| 1 | out_metadata | DAT | Contains the metadata returned by the Google Gemini API, including the total token count and the prompt token count |
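
The token counts in `out_metadata` can be read from a script. The sketch below assumes the DAT holds JSON text and that the counts use the key names `promptTokenCount` and `totalTokenCount`, optionally nested under `usageMetadata`; the layout of the DAT is not specified here, so check its contents before relying on these names. The helper `read_token_counts` and the operator path in the comment are hypothetical.

```python
import json


def read_token_counts(metadata_text):
    """Extract prompt and total token counts from metadata JSON text.

    Assumes the key names promptTokenCount and totalTokenCount,
    either at the top level or under usageMetadata.
    """
    data = json.loads(metadata_text)
    usage = data.get("usageMetadata", data)
    return {
        "prompt": usage.get("promptTokenCount"),
        "total": usage.get("totalTokenCount"),
    }


# Inside TouchDesigner the text might come from the DAT, for example:
#   metadata_text = op('base_img_to_txt/out_metadata').text  # hypothetical path
# A hard-coded sample is used here so the sketch runs on its own.
sample = '{"promptTokenCount": 265, "totalTokenCount": 301}'
counts = read_token_counts(sample)
print(counts)
```

Missing keys come back as `None` rather than raising an error, which keeps the script from failing if the metadata layout differs from the assumption.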