SSML - Speech Synthesis Markup Language



SSML, the Speech Synthesis Markup Language, is a W3C standard. You can read all about it here:
https://www.w3.org/TR/speech-synthesis/

Think of it as the HTML for speech. Just as you would put a <b> tag around a word to show it in bold letters when rendered in a web browser, SSML has an <emphasis> tag to emphasize a word when it gets synthesized.

Just like HTML, SSML has a variety of tags. The essential role of the markup language is to provide a standard way to control aspects of speech such as pronunciation, volume, pitch, and rate across different synthesis-capable platforms.
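
To make this a bit more concrete, here is a minimal sketch in Python that builds a small SSML document combining a <prosody> element (rate and volume) with an <emphasis> element. The function name build_ssml and its parameters are made up for this illustration and are not part of Mac2Speech; the element names, attribute names, and values such as "slow" and "loud" come from the SSML 1.0 specification.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(before, emphasized, after, rate="slow", volume="loud"):
    """Return an SSML string: one emphasized word inside a <prosody> element.

    Hypothetical helper for illustration only.
    """
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)

    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0"})
    # xml:lang lives in the reserved XML namespace.
    speak.set("{http://www.w3.org/XML/1998/namespace}lang", "en-US")

    # <prosody> controls rate, volume, and pitch for the text it wraps.
    prosody = ET.SubElement(
        speak, f"{{{SSML_NS}}}prosody", {"rate": rate, "volume": volume}
    )
    prosody.text = before

    # <emphasis> marks the word that should be stressed.
    emphasis = ET.SubElement(
        prosody, f"{{{SSML_NS}}}emphasis", {"level": "strong"}
    )
    emphasis.text = emphasized
    emphasis.tail = after

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("That is a ", "big", " car!"))
```

How a given rate or volume value actually sounds depends on the synthesizer and voice, so treat the values above as a starting point to experiment with.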

Now here is the problem: Apple never implemented SSML as a standard, neither in OS X nor in the newer macOS, and instead introduced its own proprietary “Embedded Speech Commands” to provide some control over how content is synthesized.


Using Mac2Speech to synthesize SSML-encoded content


Since Apple does not support SSML, you need to install new voices and synthesizers that support this W3C standard. We have identified Cepstral as a provider of such components.
Once you have identified the voice(s) you like and downloaded and installed them on your Mac, you will find a new icon in your Mac’s System Preferences. Clicking this icon brings up a dialog that should look something like this:

[Screenshot: the ‘Cepstral Voices’ dialog]

You need to acquire a license from Cepstral to use those voices. Once you have your license code, first select the voice and then click the ‘License Voice’ button at the bottom of the ‘Cepstral Voices’ dialog, pictured above.

Straightforward licensing only allows you to synthesize text into sound, without ever storing the content. However, this is not how Mac2Speech works. Remember that Mac2Speech not only needs to synthesize, but also to encode the synthesis as MP3 at the highest possible quality.
To allow Mac2Speech to use Cepstral’s voices to synthesize SSML content, voices need to be licensed with the “Save to File for Mac OS X” license.

Once you have your ‘Save to File’ license code, open Terminal and enter this command (replace the name, company, and license code with the values from the registration/activation information you received from Cepstral):

sudo /Applications/Mac2Speech.app/Contents/Java/swift --reg-filewrite --customer-name "YOUR NAME" --company-name "YOUR COMPANY" --license-key "YOUR SAVE TO FILE LICENSE CODE"

I understand that this is not an effortless process, and it will certainly require a lot of determination. However, the end result might be worth it. To give you just a taste, here is an example SSML document.

SSML


<?xml version="1.0"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US">
That is a <emphasis> big</emphasis> car!
This is going to make a <emphasis level="strong"> huge</emphasis> impression.
</speak>
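
Before handing a document like this to a synthesizer, it can help to confirm that it is at least well-formed XML and to see which words carry emphasis. The following is a minimal sketch in Python, not something Mac2Speech itself provides; the function name emphasized_words is hypothetical, and the sample is a shortened copy of the document above (the schema attributes are left out for brevity). Note that this checks well-formedness only, not validity against the SSML schema.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Shortened copy of the example document (schema attributes omitted).
SAMPLE = """<?xml version="1.0"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-US">
That is a <emphasis> big</emphasis> car!
This is going to make a <emphasis level="strong"> huge</emphasis> impression.
</speak>"""


def emphasized_words(ssml_text):
    """Return (word, level) pairs for every <emphasis> element.

    Hypothetical helper for illustration only. Raises ET.ParseError if
    the text is not well-formed XML, and ValueError if the root element
    is not an SSML <speak>.
    """
    root = ET.fromstring(ssml_text)
    if root.tag != f"{{{SSML_NS}}}speak":
        raise ValueError("root element is not an SSML <speak>")

    pairs = []
    for element in root.iter(f"{{{SSML_NS}}}emphasis"):
        word = (element.text or "").strip()
        # SSML 1.0 uses "moderate" when no level attribute is given.
        level = element.get("level", "moderate")
        pairs.append((word, level))
    return pairs


if __name__ == "__main__":
    for word, level in emphasized_words(SAMPLE):
        print(f"{word}: {level}")
```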

SSML - Synthesis


And here is what the synthesis sounds like:

[Audio: synthesized SSML sample]


Since version 3, Mac2Speech supports SSML. I think this is a huge deal, given that Apple stayed as far away from this standard as possible, and we truly have to be thankful to voice providers like Cepstral for making this possible.