Image to Text

TOX name: base_img_to_txt

Summary

A TouchDesigner component that uses the Google Gemini API to generate text from an image and a prompt.

Controls

Refer to the Common Controls for a list of all available parameters.

| Parameter Name | Parameter | Type | Description |
| --- | --- | --- | --- |
| Input Resolution | Inputresolution | menu | Downscale options for the input image. Reducing the input resolution to a half or a quarter helps maintain high performance. |
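
To make the effect of the downscale options concrete, here is a minimal sketch in plain Python. It assumes that "half" and "quarter" scale each image dimension by 0.5 and 0.25; the component may define them differently. The option names, `SCALE_FACTORS`, and both helper functions are hypothetical and are not part of the component.

```python
# Hypothetical mapping from a menu option to a per-dimension scale factor.
# Assumption: "half" and "quarter" apply to width and height individually.
SCALE_FACTORS = {"full": 1.0, "half": 0.5, "quarter": 0.25}


def downscaled_size(width, height, option):
    """Return the (width, height) after applying the chosen downscale option."""
    factor = SCALE_FACTORS[option]
    # Clamp to at least one pixel so very small inputs stay valid.
    return max(1, int(width * factor)), max(1, int(height * factor))


def pixel_fraction(option):
    """Return the fraction of the original pixel count that remains."""
    return SCALE_FACTORS[option] ** 2
```

Under that assumption, a 1920x1080 input becomes 960x540 at "half" and 480x270 at "quarter", so the quarter setting leaves one sixteenth of the original pixels to process and upload.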

Outputs

| Output Index | Name | Type | Description |
| --- | --- | --- | --- |
| 0 | out_response | DAT | Contains the response returned by the Google Gemini API. |
| 1 | out_metadata | DAT | Contains the metadata returned by the Google Gemini API, including the total token count and the prompt token count. |
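
The token counts in the metadata output can be read from a script. The sketch below makes an assumption that this page does not confirm: that the metadata DAT holds JSON text using the Gemini REST field names `promptTokenCount` and `totalTokenCount`, possibly nested under `usageMetadata`. The actual layout of the DAT may differ, and `summarize_usage` is a hypothetical helper.

```python
import json


def summarize_usage(metadata_text):
    """Extract token counts from metadata text.

    Assumption: the text is JSON with Gemini REST-style field names.
    Missing fields are returned as None rather than guessed.
    """
    data = json.loads(metadata_text)
    # Accept the counts at the top level or nested under "usageMetadata".
    usage = data.get("usageMetadata", data)
    prompt = usage.get("promptTokenCount")
    total = usage.get("totalTokenCount")
    # Tokens not spent on the prompt, when both counts are available.
    non_prompt = total - prompt if prompt is not None and total is not None else None
    return {"prompt": prompt, "total": total, "non_prompt": non_prompt}
```

Inside TouchDesigner, the text could come from a DAT wired to the `out_metadata` output, for example `summarize_usage(op('null_metadata').text)`, where `null_metadata` is a hypothetical operator name.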