tts

stable

Text-to-speech preparation utilities: build SSML markup for TTS engines, strip it back to plain text, split text into sentences, normalize abbreviations, count words, and estimate audio duration and phoneme counts.

use plugin tts::{ssml_wrap, ssml_emphasis, ssml_break, …}

16 functions Audio & Media

Functions (16)

ssml_wrap Wrap text in a `<speak>` root element
ssml_emphasis Wrap text in an `<emphasis>` tag
ssml_break Insert a timed `<break>` pause
ssml_prosody Wrap text with rate/pitch/volume controls
ssml_say_as Wrap text with an `interpret-as` hint
ssml_sub Substitute a spoken alias for text
ssml_voice Wrap text in a named `<voice>` tag
ssml_lang Wrap text in a language tag
ssml_phoneme Provide an explicit phoneme pronunciation
ssml_audio Insert an `<audio>` element with a URL
strip_ssml Strip all SSML tags from a string
split_sentences Split text into a list of sentences
estimate_duration_seconds Estimate spoken duration from word count
normalize_text Expand common abbreviations for TTS
phoneme_count_estimate Estimate phoneme count of text
word_count Count whitespace-separated words

ssml_wrap(text) → string

Wrap text in a `<speak>` root element

Wraps text inside <speak>...</speak>, which is the required root element for SSML documents submitted to TTS engines like Google Cloud TTS or Amazon Polly.

use plugin tts::{ssml_wrap, ssml_emphasis}

let inner = ssml_emphasis("Hello!", "strong")
let doc = ssml_wrap(inner)
print(doc)
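
For orientation, the nesting in this example can be sketched in Python. This illustrates the markup the descriptions imply and is not the plugin's implementation: the helper names mirror the plugin functions but are hypothetical stand-ins, and whether the plugin escapes XML special characters is not stated here.

```python
# Hypothetical Python stand-ins for the plugin functions (illustration only).

def ssml_emphasis(text: str, level: str = "moderate") -> str:
    """Wrap text in an <emphasis> element with the given level."""
    return f'<emphasis level="{level}">{text}</emphasis>'


def ssml_wrap(text: str) -> str:
    """Wrap text in the <speak> root element."""
    return f"<speak>{text}</speak>"


# Build the inner fragment first, then wrap it once at the top level.
doc = ssml_wrap(ssml_emphasis("Hello!", "strong"))
print(doc)
```

Under these assumptions, the fragment functions compose from the inside out, and ssml_wrap is applied once, last.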

ssml_emphasis(text, level) → string

Wrap text in an `<emphasis>` tag

Wraps text in <emphasis level="...">. Valid levels are "strong", "moderate" (default), and "reduced". Use it to stress particular words in synthesized speech.

use plugin tts::{ssml_emphasis}

let s = ssml_emphasis("important", "strong")
print(s)

ssml_break(time_ms) → string

Insert a timed `<break>` pause

Inserts a <break time="Nms"/> element representing a pause of time_ms milliseconds. Returns a self-closing tag with no surrounding text.

use plugin tts::{ssml_wrap, ssml_break}

let pause = ssml_break(500)
let doc = ssml_wrap("Hello.{pause} How are you?")
print(doc)
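
The string this example is expected to produce can be sketched in Python. The stand-in below is hypothetical and only follows the <break time="Nms"/> form described above; it is not the plugin's code.

```python
# Hypothetical stand-in for ssml_break (illustration only).

def ssml_break(time_ms: int) -> str:
    """Return a self-closing <break> element for a pause of time_ms milliseconds."""
    return f'<break time="{time_ms}ms"/>'


pause = ssml_break(500)
# The pause is a plain string, so it can be interpolated between sentences.
doc = f"<speak>Hello.{pause} How are you?</speak>"
print(doc)
```

Because the element is self-closing, it sits between pieces of text rather than wrapping them.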

ssml_prosody(text, rate, pitch, volume) → string

Wrap text with rate/pitch/volume controls

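Wraps text in a <prosody> element that controls speech rate, pitch, and volume. Each parameter accepts a keyword ("slow", "medium", or "fast" for rate; "low", "medium", or "high" for pitch and volume) or a percentage string such as "+20%". Note that the SSML specification names the volume keywords "soft" and "loud" rather than "low" and "high", so check which labels your target engine accepts.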

use plugin tts::{ssml_prosody, ssml_wrap}

let slow = ssml_prosody("Take your time.", "slow", "medium", "medium")
let doc = ssml_wrap(slow)
print(doc)

ssml_say_as(text, interpret_as) → string

Wrap text with an `interpret-as` hint

Wraps text in <say-as interpret-as="..."> to tell the TTS engine how to interpret the content. Common values: "cardinal", "ordinal", "characters", "spell-out", "date", "time", "telephone".

use plugin tts::{ssml_say_as, ssml_wrap}

let phone = ssml_say_as("555-1234", "telephone")
let doc = ssml_wrap("Call us at {phone}.")
print(doc)

ssml_sub(text, alias) → string

Substitute a spoken alias for text

Produces <sub alias="...">text</sub>, which instructs the TTS engine to speak alias in place of text, while text remains the written form in the markup. Useful for acronyms and abbreviations.

use plugin tts::{ssml_sub, ssml_wrap}

let abbr = ssml_sub("TTS", "text to speech")
let doc = ssml_wrap("Welcome to {abbr}.")
print(doc)

ssml_voice(text, name) → string

Wrap text in a named `<voice>` tag

Wraps text in <voice name="...">, switching to the named TTS voice for that span. Voice names are engine-specific (e.g. "en-US-Wavenet-A" for Google).

use plugin tts::{ssml_voice, ssml_wrap}

let narrated = ssml_voice("Once upon a time...", "en-US-Wavenet-D")
let doc = ssml_wrap(narrated)
print(doc)

ssml_lang(text, lang) → string

Wrap text in a language tag

Wraps text in <lang xml:lang="..."> to switch the language for that span. Use standard BCP 47 language tags such as "en-US", "fr-FR", or "de-DE".

use plugin tts::{ssml_lang, ssml_wrap}

let greeting = ssml_lang("Bonjour!", "fr-FR")
let doc = ssml_wrap("She said: {greeting}")
print(doc)

ssml_phoneme(text, alphabet, ph) → string

Provide an explicit phoneme pronunciation

Provides an explicit phoneme pronunciation for text using the given phonetic alphabet ("ipa" or "x-sampa") and phoneme string ph. Overrides the engine's default pronunciation.

use plugin tts::{ssml_phoneme, ssml_wrap}

let word = ssml_phoneme("tomato", "ipa", "təˈmeɪtoʊ")
let doc = ssml_wrap("I say {word}.")
print(doc)

ssml_audio(src, alt) → string

Insert an `<audio>` element with a URL

Inserts an <audio src="..."> element pointing to an audio file URL. If alt text is provided, it is placed inside the element as fallback content for engines that do not support audio elements or cannot play the file.

use plugin tts::{ssml_audio, ssml_wrap}

let sound = ssml_audio("https://example.com/chime.mp3", "chime")
let doc = ssml_wrap("{sound} Welcome back.")
print(doc)
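
The optional fallback can be sketched in Python as follows. This is a hypothetical stand-in, not the plugin's implementation; in particular, emitting a self-closing tag when alt is empty is an assumption, since the text above does not say what the plugin produces in that case.

```python
from typing import Optional

# Hypothetical stand-in for ssml_audio (illustration only).

def ssml_audio(src: str, alt: Optional[str] = None) -> str:
    """Return an <audio> element, with alt as fallback content when given."""
    if alt:
        # Fallback text goes between the opening and closing tags.
        return f'<audio src="{src}">{alt}</audio>'
    # Assumption: with no alt text, emit a self-closing element.
    return f'<audio src="{src}"/>'


print(ssml_audio("https://example.com/chime.mp3", "chime"))
print(ssml_audio("https://example.com/chime.mp3"))
```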

strip_ssml(text) → string

Strip all SSML tags from a string

Removes all SSML XML tags from a string, returning only the plain text content. Useful for extracting readable text from an SSML document for logging or display.

use plugin tts::{ssml_wrap, ssml_emphasis, strip_ssml}

let doc = ssml_wrap(ssml_emphasis("Hello world", "strong"))
let plain = strip_ssml(doc)
print(plain)
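
One simple way to get this behavior is a regular expression that deletes anything between angle brackets. The Python sketch below assumes that approach; the plugin may work differently, for example by parsing the XML, and this version does not decode entities such as &amp;.

```python
import re

# Matches any tag: an opening "<", then one or more non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_ssml(text: str) -> str:
    """Remove every tag and keep the text between them (hypothetical sketch)."""
    return TAG_PATTERN.sub("", text)


doc = '<speak><emphasis level="strong">Hello world</emphasis></speak>'
plain = strip_ssml(doc)
print(plain)  # Hello world
```

Self-closing elements such as <break/> carry no text, so in this sketch they vanish without leaving a gap.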

split_sentences(text) → table

Split text into a list of sentences

Splits plain text into a list of sentences, breaking at the terminators ".", "!", and "?". Ellipses are not treated as sentence boundaries. Returns an indexed list of sentence strings.

use plugin tts::{split_sentences}

let sentences = split_sentences("Hello world. How are you? Fine!")
print(sentences[1])
print(sentences[2])
print(sentences[3])
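
The splitting rule can be sketched in Python with a regular expression. This is an approximation under stated assumptions, not the plugin's algorithm: it keeps each terminator attached to its sentence, splits only where whitespace follows the terminator, and recognizes an ellipsis only as three consecutive periods.

```python
import re
from typing import List

# Split at whitespace that follows ".", "!", or "?",
# unless the three characters before the whitespace are "...".
BOUNDARY = re.compile(r"(?<=[.!?])(?<!\.\.\.)\s+")


def split_sentences(text: str) -> List[str]:
    """Return the sentences in text, terminators included (hypothetical sketch)."""
    return [part for part in BOUNDARY.split(text.strip()) if part]


for sentence in split_sentences("Hello world. How are you? Fine!"):
    print(sentence)

print(split_sentences("Wait... what? Yes."))
```

A rule this simple also splits after abbreviations such as "Dr.", so in this sketch abbreviation-heavy text would need to be expanded before splitting.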

estimate_duration_seconds(text, wpm) → float

Estimate spoken duration from word count

Estimates how many seconds a TTS engine would take to speak text, based on a words-per-minute rate. wpm defaults to 150 if omitted. Useful for timing audio segments before rendering.

use plugin tts::{estimate_duration_seconds}

let text = "Welcome to our application. This is a short introduction."
let secs = estimate_duration_seconds(text, 150.0)
print("Estimated duration: {secs} seconds")
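
The estimate described above amounts to dividing the word count by the speaking rate and converting minutes to seconds. The Python sketch below assumes words are counted by splitting on whitespace; the plugin's exact counting and rounding are not documented here.

```python
def estimate_duration_seconds(text: str, wpm: float = 150.0) -> float:
    """Estimate speaking time: words / wpm gives minutes, times 60 gives seconds."""
    words = len(text.split())  # assumption: whitespace-separated word count
    return words * 60.0 / wpm


text = "Welcome to our application. This is a short introduction."
secs = estimate_duration_seconds(text)
print(f"Estimated duration: {secs} seconds")
```

Under this formula, the 9-word sample at 150 words per minute comes to 9 × 60 / 150 = 3.6 seconds.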

normalize_text(text) → string

Expand common abbreviations for TTS

Expands common abbreviations (e.g. "Dr." → "Doctor", "etc." → "et cetera") and collapses multiple spaces. Improves pronunciation quality for TTS engines that handle abbreviations inconsistently.

use plugin tts::{normalize_text}

let raw = "Dr. Smith works at the Dept. of Health, etc."
let clean = normalize_text(raw)
print(clean)
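
A table-driven expansion is one plausible way to implement this, sketched below in Python. The table is hypothetical: only "Dr." and "etc." are confirmed by the description above, and the "Dept." entry is an assumption added to match the example input.

```python
import re

# Hypothetical abbreviation table (illustration only).
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "etc.": "et cetera",
    "Dept.": "Department",  # assumed entry, not confirmed by the docs
}


def normalize_text(text: str) -> str:
    """Expand known abbreviations, then collapse runs of spaces."""
    for abbr, expansion in ABBREVIATIONS.items():
        # \b keeps "Dr." from matching in the middle of a longer word.
        text = re.sub(r"\b" + re.escape(abbr), expansion, text)
    return re.sub(r" {2,}", " ", text)


raw = "Dr.  Smith works at the Dept. of Health, etc."
print(normalize_text(raw))
```

In this sketch the period belongs to the abbreviation, so a sentence that ends in "etc." loses its final period after expansion.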

phoneme_count_estimate(text) → int

Estimate phoneme count of text

Estimates the total number of phonemes in text using a simple vowel-group and consonant heuristic (3 phonemes per vowel group, 1 per consonant). Useful for rough timing estimates or batch processing.

use plugin tts::{phoneme_count_estimate}

let n = phoneme_count_estimate("Hello world")
print("Estimated phonemes: {n}")
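
The heuristic can be written out in Python as below. The weights (3 per vowel group, 1 per consonant) come from the description above; treating only a, e, i, o, u as vowels and counting "y" as a consonant are assumptions of this sketch, so the plugin's result may differ.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiou]+")         # a run of adjacent vowels counts once
CONSONANT = re.compile(r"[b-df-hj-np-tv-z]")  # every ASCII letter that is not a vowel


def phoneme_count_estimate(text: str) -> int:
    """Apply the 3-per-vowel-group, 1-per-consonant heuristic (hypothetical sketch)."""
    lowered = text.lower()
    vowel_groups = len(VOWEL_GROUP.findall(lowered))
    consonants = len(CONSONANT.findall(lowered))
    return 3 * vowel_groups + consonants


n = phoneme_count_estimate("Hello world")
print(f"Estimated phonemes: {n}")
```

For "Hello world" this sketch finds 3 vowel groups (e, o, o) and 7 consonants, giving 3 × 3 + 7 = 16.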

word_count(text) → int

Count whitespace-separated words

Counts whitespace-separated words in the text. A fast utility for estimating content length before passing text to a TTS engine.

use plugin tts::{word_count}

let n = word_count("The quick brown fox jumps over the lazy dog")
print("Words: {n}")
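
Counting whitespace-separated words has a few edge cases worth seeing. The Python sketch below assumes that any run of spaces, tabs, or newlines acts as a single separator, which is how Python's str.split behaves; the plugin's handling of these cases is not documented here.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens (hypothetical sketch)."""
    # split() with no arguments ignores leading, trailing, and repeated whitespace.
    return len(text.split())


print(word_count("The quick brown fox jumps over the lazy dog"))  # 9
print(word_count("  two\twords\n"))                               # 2
print(word_count(""))                                             # 0
```

Punctuation stays attached to its neighbor in this approach, so "dog." counts as one word.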
