Text to Audio

TOX name: base_txt_to_audio

Summary

A TouchDesigner component for generating synthetic audio with the Google Gemini API.

Controls

Refer to the Common Controls for a list of all available parameters.

| Parameter Name | Parameter | Type | Description |
| --- | --- | --- | --- |
| Input Resolution | Inputresolution | menu | Downscale options for the input image. Reducing the input resolution to a half or a quarter helps maintain high performance. |
| Include Image | Includeimage | toggle | Specifies whether an input image is used when submitting the prompt to the Gemini API. |
| Export Audio File | Exportaudiofile | pulse | Exports audio from the component. Pulsing this parameter opens a dialog asking where to save the file. |
| File | File | file | (Read Only) Path of source. |
| Reload | Reloadpulse | pulse | Instantly reloads the file from disk. |
| Play | Play | toggle | Audio plays back when this is set to 1 and stops when set to 0. |
| Speed | Speed | float | A speed multiplier that only works when Play Mode is Sequential. A value of 1 is the default playback speed, 2 is double speed, 0.5 is half speed, and so on. This node cannot play audio backwards, so negative values will not work well. |
| Cue | Cue | toggle | Jumps to the Cue Point when set to 1. Only available when Play Mode is Sequential. |
| Pulse Cue | Cuepulse | pulse | Instantly jumps to the Cue Point. |
| Repeat | Repeat | menu | Repeats the audio stream when the end is reached. |
| Volume | Volume | float | Sets the level at which the file is read in. A setting of 1 is full signal, while 0 is muted. |
| Fade In/Out | Fade | toggle | about |
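
The playback parameters above can also be set from a script. The following is a minimal sketch, not the component's own code: `configure_playback` is a hypothetical helper, and it assumes the component exposes `Speed`, `Volume` and `Play` as custom parameters under the names listed in the table. Rejecting negative speeds and clamping volume to the 0 to 1 range are assumptions drawn from the descriptions above, not documented behaviour. A stand-in object is used so the sketch also runs outside TouchDesigner.

```python
from types import SimpleNamespace


def configure_playback(comp, speed=1.0, volume=1.0, play=True):
    """Set Speed, Volume and Play on a component (hypothetical helper).

    `comp` is any object with a `.par` attribute holding the parameters
    named in the Controls table.
    """
    # The node cannot play audio backwards, so refuse negative speeds.
    if speed < 0:
        raise ValueError("Speed must not be negative")
    comp.par.Speed = speed
    # Assumption: keep Volume between 0 (muted) and 1 (full signal).
    comp.par.Volume = min(max(volume, 0.0), 1.0)
    comp.par.Play = 1 if play else 0
    return comp


# Stand-in for the component so the helper can be exercised anywhere.
fake_comp = SimpleNamespace(par=SimpleNamespace(Speed=1.0, Volume=1.0, Play=0))
configure_playback(fake_comp, speed=0.5, volume=0.8)
print(fake_comp.par.Speed, fake_comp.par.Volume, fake_comp.par.Play)
```

Inside TouchDesigner the same helper could be called as `configure_playback(op('base_txt_to_audio'))`, assuming the component keeps its default name and sits in the same network as the calling script.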

Outputs

| Output Index | Name | Type | Description |
| --- | --- | --- | --- |
| 0 | out_response | TOP | The video output from the Google Gemini API |
| 1 | out_response_audio | CHOP | The video audio output from the Google Gemini API |
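
As a rough illustration of working with the audio output, the sketch below estimates how long the generated audio lasts and how the Speed parameter changes the playback time. The helper names are hypothetical. The code relies only on the `numSamples` and `rate` members that TouchDesigner CHOPs provide, and a stand-in object replaces the real CHOP so the example runs outside TouchDesigner.

```python
from types import SimpleNamespace


def audio_duration_seconds(chop):
    """Length of the audio in a CHOP, from its sample count and sample rate."""
    if chop.rate <= 0:
        raise ValueError("sample rate must be positive")
    return chop.numSamples / chop.rate


def playback_seconds(chop, speed=1.0):
    """Time needed to play the audio once at the given Speed multiplier."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return audio_duration_seconds(chop) / speed


# Stand-in for out_response_audio: 88200 samples at 44100 Hz.
fake_chop = SimpleNamespace(numSamples=88200, rate=44100)
print(audio_duration_seconds(fake_chop))       # 2.0
print(playback_seconds(fake_chop, speed=2.0))  # 1.0
```

In TouchDesigner, `fake_chop` would be replaced by a reference to the CHOP wired to output 1 of the component.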