pixeltable.functions.whisper

Pixeltable UDF that wraps the OpenAI Whisper library.

This UDF causes Pixeltable to invoke the model locally. To use it, you must first run pip install openai-whisper.

transcribe

transcribe(
    audio: Audio,
    *,
    model: String,
    temperature: Optional[Json] = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    compression_ratio_threshold: Optional[Float] = 2.4,
    logprob_threshold: Optional[Float] = -1.0,
    no_speech_threshold: Optional[Float] = 0.6,
    condition_on_previous_text: Bool = True,
    initial_prompt: Optional[String] = None,
    word_timestamps: Bool = False,
    prepend_punctuations: String = "\"'“¿([{-",
    append_punctuations: String = "\"'.。,,!!??::”)]}、",
    decode_options: Optional[Json] = None
) -> Json

Transcribe an audio file using Whisper.

This UDF runs a transcription model locally using the Whisper library. It is equivalent to calling the Whisper library's transcribe function, as described in the Whisper library documentation.
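
For reference, here is a minimal sketch of the equivalent direct call to the Whisper library, outside of Pixeltable (the audio path is a placeholder):

    import whisper

    # Load the model weights locally; they are downloaded on first use
    model = whisper.load_model("base.en")

    # Transcribe a local audio file; the result is a dict with
    # "text", "segments", and "language" keys
    result = model.transcribe("/path/to/audio.mp3")
    print(result["text"])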

Requirements:

  • pip install openai-whisper

Parameters:

  • audio (Audio) –

    The audio file to transcribe.

  • model (String) –

    The name of the model to use for transcription.
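
The remaining keyword parameters mirror the options of the Whisper library's transcribe function; see the Whisper documentation for their semantics.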

Returns:

  • Json

    A dictionary containing the transcription and various other metadata.
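
    The result follows the Whisper library's output format: it contains at least a text key with the full transcription, a segments list with per-segment details, and a language key. An abbreviated sketch (segment fields trimmed for brevity):

        {
            "text": " Hello, world.",
            "segments": [{"id": 0, "start": 0.0, "end": 2.0, "text": " Hello, world."}],
            "language": "en"
        }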

Examples:

Add a computed column that applies the model base.en to an existing Pixeltable column tbl.audio of the table tbl:

>>> tbl['result'] = transcribe(tbl.audio, model='base.en')
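
Because the result is stored as JSON, individual fields can be extracted into further computed columns. For instance, to pull out just the transcription text (the text key comes from the Whisper library's output format; the column name transcription is illustrative):

>>> tbl['transcription'] = tbl.result['text']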