tts

stable

Text-to-speech preparation utilities: build SSML markup for TTS engines, split text into sentences, normalize abbreviations, estimate audio duration, and count phonemes.

use plugin tts::{ssml_wrap, ssml_emphasis, ssml_break, …}
16 functions Audio & Media
Functions (16)
  1. ssml_wrap Wrap text in a `<speak>` root element
  2. ssml_emphasis Wrap text in an `<emphasis>` tag
  3. ssml_break Insert a timed `<break>` pause
  4. ssml_prosody Wrap text with rate/pitch/volume controls
  5. ssml_say_as Wrap text with an `interpret-as` hint
  6. ssml_sub Substitute a spoken alias for text
  7. ssml_voice Wrap text in a named `<voice>` tag
  8. ssml_lang Wrap text in a language tag
  9. ssml_phoneme Provide an explicit phoneme pronunciation
  10. ssml_audio Insert an `<audio>` element with a URL
  11. strip_ssml Strip all SSML tags from a string
  12. split_sentences Split text into a list of sentences
  13. estimate_duration_seconds Estimate spoken duration from word count
  14. normalize_text Expand common abbreviations for TTS
  15. phoneme_count_estimate Estimate phoneme count of text
  16. word_count Count whitespace-separated words

Wrap text in a `<speak>` root element

Wraps text inside <speak>...</speak>, which is the required root element for SSML documents submitted to TTS engines like Google Cloud TTS or Amazon Polly.

use plugin tts::{ssml_wrap, ssml_emphasis}

let inner = ssml_emphasis("Hello!", "strong")
let doc = ssml_wrap(inner)
print(doc)
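
The nesting in this example can be pictured as plain string building. The Python sketch below is not the plugin's implementation: the helper names `sketch_emphasis` and `sketch_wrap` are hypothetical, and the exact markup the plugin emits (whitespace, attribute quoting) is an assumption.

```python
def sketch_emphasis(text: str, level: str = "moderate") -> str:
    """Hypothetical stand-in for ssml_emphasis: wrap text in an <emphasis> tag."""
    return f'<emphasis level="{level}">{text}</emphasis>'


def sketch_wrap(text: str) -> str:
    """Hypothetical stand-in for ssml_wrap: add the <speak> root element."""
    return f"<speak>{text}</speak>"


# Inner markup is built first, then wrapped once in the root element.
doc = sketch_wrap(sketch_emphasis("Hello!", "strong"))
print(doc)
```

Because the wrapper receives markup that is already built, it cannot escape its argument without also mangling the nested tags. Any escaping of user-supplied text would have to happen before the first helper is called.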

Wrap text in an `<emphasis>` tag

Wraps text in <emphasis level="...">. Valid levels are "strong", "moderate" (default), and "reduced". Use to stress particular words in synthesized speech.

use plugin tts::{ssml_emphasis}

let s = ssml_emphasis("important", "strong")
print(s)

Insert a timed `<break>` pause

Inserts a <break time="Nms"/> element representing a pause of time_ms milliseconds. Returns a self-closing tag with no surrounding text.

use plugin tts::{ssml_wrap, ssml_break}

let pause = ssml_break(500)
let doc = ssml_wrap("Hello.{pause} How are you?")
print(doc)
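
As a rough illustration of the self-closing tag described above, here is a Python sketch. `sketch_break` is a hypothetical helper, not the plugin's code, and rejecting negative durations is an assumption rather than documented behavior.

```python
def sketch_break(time_ms: int) -> str:
    """Hypothetical stand-in for ssml_break: a self-closing pause element."""
    if time_ms < 0:
        # Assumption: a negative pause is treated as a caller error.
        raise ValueError("time_ms must be non-negative")
    return f'<break time="{time_ms}ms"/>'


pause = sketch_break(500)
# The tag carries no text of its own, so it is interpolated between sentences.
doc = f"<speak>Hello.{pause} How are you?</speak>"
print(doc)
```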

Wrap text with rate/pitch/volume controls

Wraps text in a <prosody> element controlling speech rate, pitch, and volume. Each parameter accepts keyword values ("slow", "medium", "fast" for rate; "low", "medium", "high" for pitch/volume) or percentage strings like "+20%".

use plugin tts::{ssml_prosody, ssml_wrap}

let slow = ssml_prosody("Take your time.", "slow", "medium", "medium")
let doc = ssml_wrap(slow)
print(doc)
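
The three controls map onto three attributes of a single element. The Python sketch below shows one way such a wrapper could be written. `sketch_prosody` is a hypothetical name, and the attribute order and the `"medium"` defaults are assumptions.

```python
from xml.sax.saxutils import quoteattr


def sketch_prosody(text: str, rate: str = "medium",
                   pitch: str = "medium", volume: str = "medium") -> str:
    """Hypothetical stand-in for ssml_prosody."""
    # quoteattr adds the surrounding quotes and escapes unsafe characters.
    return (f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)} "
            f"volume={quoteattr(volume)}>{text}</prosody>")


slow = sketch_prosody("Take your time.", "slow", "medium", "medium")
faster = sketch_prosody("Hurry up.", rate="+20%")
print(slow)
print(faster)
```

The W3C SSML specification names its volume keywords "soft", "medium", and "loud", so it may be worth checking which keyword set the target engine accepts.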

Wrap text with an `interpret-as` hint

Wraps text in <say-as interpret-as="..."> to tell the TTS engine how to interpret the content. Common values: "cardinal", "ordinal", "characters", "spell-out", "date", "time", "telephone".

use plugin tts::{ssml_say_as, ssml_wrap}

let phone = ssml_say_as("555-1234", "telephone")
let doc = ssml_wrap("Call us at {phone}.")
print(doc)

Substitute a spoken alias for text

Produces <sub alias="...">text</sub>, instructing the TTS engine to speak alias aloud while displaying text visually. Useful for acronyms and abbreviations.

use plugin tts::{ssml_sub, ssml_wrap}

let abbr = ssml_sub("TTS", "text to speech")
let doc = ssml_wrap("Welcome to {abbr}.")
print(doc)

Wrap text in a named `<voice>` tag

Wraps text in <voice name="...">, switching to the named TTS voice for that span. Voice names are engine-specific (e.g. "en-US-Wavenet-A" for Google).

use plugin tts::{ssml_voice, ssml_wrap}

let narrated = ssml_voice("Once upon a time...", "en-US-Wavenet-D")
let doc = ssml_wrap(narrated)
print(doc)

Wrap text in a language tag

Wraps text in <lang xml:lang="..."> to switch the language for that span. Use standard BCP-47 language codes such as "en-US", "fr-FR", or "de-DE".

use plugin tts::{ssml_lang, ssml_wrap}

let greeting = ssml_lang("Bonjour!", "fr-FR")
let doc = ssml_wrap("She said: {greeting}")
print(doc)

Provide an explicit phoneme pronunciation

Provides an explicit phoneme pronunciation for text using the given phonetic alphabet ("ipa" or "x-sampa") and phoneme string ph. Overrides the engine's default pronunciation.

use plugin tts::{ssml_phoneme, ssml_wrap}

let word = ssml_phoneme("tomato", "ipa", "təˈmeɪtoʊ")
let doc = ssml_wrap("I say {word}.")
print(doc)

Insert an `<audio>` element with a URL

Inserts an <audio src="..."> element pointing to an audio file URL. If alt text is provided it is placed inside the tag as a fallback for engines that do not support audio elements.

use plugin tts::{ssml_audio, ssml_wrap}

let sound = ssml_audio("https://example.com/chime.mp3", "chime")
let doc = ssml_wrap("{sound} Welcome back.")
print(doc)
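
The optional fallback text can be sketched as follows in Python. `sketch_audio` is a hypothetical helper, and emitting a self-closing tag when no alt text is given is an assumption about the output shape, not documented plugin behavior.

```python
from xml.sax.saxutils import quoteattr


def sketch_audio(url: str, alt: str = "") -> str:
    """Hypothetical stand-in for ssml_audio with optional fallback text."""
    if alt:
        # Fallback text sits inside the element.
        return f"<audio src={quoteattr(url)}>{alt}</audio>"
    # Assumption: with no fallback, the element is self-closing.
    return f"<audio src={quoteattr(url)}/>"


with_alt = sketch_audio("https://example.com/chime.mp3", "chime")
without_alt = sketch_audio("https://example.com/chime.mp3")
print(with_alt)
print(without_alt)
```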

Strip all SSML tags from a string

Removes all SSML XML tags from a string, returning only the plain text content. Useful for extracting readable text from an SSML document for logging or display.

use plugin tts::{ssml_wrap, ssml_emphasis, strip_ssml}

let doc = ssml_wrap(ssml_emphasis("Hello world", "strong"))
let plain = strip_ssml(doc)
print(plain)
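
A minimal Python sketch of tag stripping, assuming a simple regular-expression approach, is shown below. The plugin may work differently. `sketch_strip_ssml` is a hypothetical name, and this version does not decode XML entities or cope with a literal `>` inside an attribute value.

```python
import re

# Matches anything that looks like a tag: "<", then non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def sketch_strip_ssml(ssml: str) -> str:
    """Hypothetical stand-in for strip_ssml: drop tags, keep the text."""
    return TAG_PATTERN.sub("", ssml)


doc = '<speak><emphasis level="strong">Hello world</emphasis></speak>'
plain = sketch_strip_ssml(doc)
print(plain)
```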

Split text into a list of sentences

Splits plain text into a list of sentences by breaking on ., !, and ?. Ellipses are not treated as sentence boundaries. Returns an indexed list of sentence strings.

use plugin tts::{split_sentences}

let sentences = split_sentences("Hello world. How are you? Fine!")
print(sentences[1])
print(sentences[2])
print(sentences[3])
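
The ellipsis rule is the subtle part of the description above. One way to express it is a lookbehind that refuses to split after two consecutive dots, as in this Python sketch. `sketch_split_sentences` is a hypothetical helper, and keeping the terminal punctuation on each sentence is an assumption about the output. Python lists are 0-indexed, unlike the 1-based indices in the example above.

```python
import re

# Split on whitespace that follows ., ! or ?, but not when the two
# preceding characters are "..", which signals an ellipsis.
BOUNDARY = re.compile(r"(?<=[.!?])(?<!\.\.)\s+")


def sketch_split_sentences(text: str) -> list[str]:
    """Hypothetical stand-in for split_sentences."""
    parts = BOUNDARY.split(text)
    return [part.strip() for part in parts if part.strip()]


simple = sketch_split_sentences("Hello world. How are you? Fine!")
with_ellipsis = sketch_split_sentences("Wait... what? Fine!")
print(simple)
print(with_ellipsis)
```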

Estimate spoken duration from word count

Estimates how many seconds a TTS engine would take to speak text, based on a words-per-minute rate. wpm defaults to 150 if omitted. Useful for timing audio segments before rendering.

use plugin tts::{estimate_duration_seconds}

let text = "Welcome to our application. This is a short introduction."
let secs = estimate_duration_seconds(text, 150.0)
print("Estimated duration: {secs} seconds")
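
The estimate follows from the words-per-minute rate: duration in seconds is the word count divided by wpm, times 60. The Python sketch below works through that arithmetic. `sketch_duration_seconds` is a hypothetical name, and counting words by whitespace splitting is an assumption about how the plugin tokenizes.

```python
def sketch_duration_seconds(text: str, wpm: float = 150.0) -> float:
    """Hypothetical stand-in for estimate_duration_seconds."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    words = len(text.split())  # assumption: words are whitespace-separated
    return words / wpm * 60.0


text = "Welcome to our application. This is a short introduction."
secs = sketch_duration_seconds(text, 150.0)
# Nine words at 150 wpm: 9 / 150 * 60 = 3.6 seconds.
print(f"Estimated duration: {secs:.1f} seconds")
```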

Expand common abbreviations for TTS

Expands common abbreviations (e.g. "Dr." → "Doctor", "etc." → "et cetera") and collapses multiple spaces. Improves pronunciation quality for TTS engines that handle abbreviations inconsistently.

use plugin tts::{normalize_text}

let raw = "Dr. Smith works at the Dept. of Health, etc."
let clean = normalize_text(raw)
print(clean)
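
A table-driven expansion is one plausible way to get this behavior. The Python sketch below uses only the two mappings stated in this section, since the plugin's full abbreviation table is not documented here. That is why "Dept." is left untouched. `sketch_normalize` and `ABBREVIATIONS` are hypothetical names.

```python
import re

# Only the two expansions named in this section; the real table is unknown.
ABBREVIATIONS = {"Dr.": "Doctor", "etc.": "et cetera"}


def sketch_normalize(text: str) -> str:
    """Hypothetical stand-in for normalize_text."""
    for abbr, full in ABBREVIATIONS.items():
        # (?<!\w) keeps the match from starting in the middle of a word.
        text = re.sub(r"(?<!\w)" + re.escape(abbr), full, text)
    # Collapse runs of spaces left in the input.
    return re.sub(r" {2,}", " ", text).strip()


raw = "Dr. Smith works at the Dept. of Health, etc."
clean = sketch_normalize(raw)
print(clean)
```

In this sketch the abbreviation's period is consumed by the replacement, so a sentence that ends in "etc." loses its final full stop.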

Estimate phoneme count of text

Estimates the total number of phonemes in text using a simple vowel-group and consonant heuristic (3 phonemes per vowel group, 1 per consonant). Useful for rough timing estimates or batch processing.

use plugin tts::{phoneme_count_estimate}

let n = phoneme_count_estimate("Hello world")
print("Estimated phonemes: {n}")
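
The heuristic can be worked through by hand. The Python sketch below applies the stated weights (3 per vowel group, 1 per consonant) under two assumptions that this section does not confirm: vowels are a, e, i, o, u only, and "y" counts as a consonant. `sketch_phoneme_count` is a hypothetical name.

```python
import re


def sketch_phoneme_count(text: str, per_vowel_group: int = 3) -> int:
    """Hypothetical stand-in for phoneme_count_estimate."""
    lowered = text.lower()
    # A vowel group is a maximal run of adjacent vowels, e.g. "ea" in "read".
    vowel_groups = re.findall(r"[aeiou]+", lowered)
    # Every other ASCII letter is counted as one consonant.
    consonants = re.findall(r"[b-df-hj-np-tv-z]", lowered)
    return per_vowel_group * len(vowel_groups) + len(consonants)


n = sketch_phoneme_count("Hello world")
# 3 vowel groups (e, o, o) and 7 consonants: 3 * 3 + 7 = 16 in this sketch.
print(f"Estimated phonemes: {n}")
```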

Count whitespace-separated words

Counts whitespace-separated words in the text. A fast utility for estimating content length before passing text to a TTS engine.

use plugin tts::{word_count}

let n = word_count("The quick brown fox jumps over the lazy dog")
print("Words: {n}")
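
For comparison, whitespace-based counting in Python looks like the sketch below. `sketch_word_count` is a hypothetical name. Under this approach, punctuation attached to a word does not create an extra word, and runs of spaces are treated as one separator.

```python
def sketch_word_count(text: str) -> int:
    """Hypothetical stand-in for word_count."""
    # str.split() with no arguments splits on any run of whitespace.
    return len(text.split())


n = sketch_word_count("The quick brown fox jumps over the lazy dog")
print(f"Words: {n}")
```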