Here is a PowerShell one-liner for Text-to-Speech (TTS) using Microsoft's desktop-oriented Speech API (SAPI):
(New-Object -ComObject Sapi.SpVoice).Speak("Hello There!")
It uses New-Object to create a Component Object Model (COM) instance of the SAPI SpVoice object.
Actually, if you plan to use speech in a script, it makes more sense to keep the object around for reuse:
$synth = New-Object -ComObject Sapi.SpVoice
$synth.Speak("Hello Again!")
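Keeping the object around also lets you adjust it between calls. The following is a minimal sketch, assuming the standard SpVoice automation members Rate, Volume, and GetVoices() are available on your machine; the values chosen are arbitrary.

```powershell
# Sketch: reuse one SpVoice COM object and tweak it between calls.
$synth = New-Object -ComObject Sapi.SpVoice

# List the descriptions of the voices SAPI knows about on this machine.
foreach ($voice in $synth.GetVoices()) {
    $voice.GetDescription()
}

$synth.Rate = 2       # speaking rate, from -10 (slowest) to 10 (fastest)
$synth.Volume = 80    # volume, from 0 to 100

# Speak returns a stream number; pipe it to Out-Null to keep the console quiet.
$synth.Speak("Hello Again!") | Out-Null
```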
In the Windows jungle there is no escape from King-COM!
Oh... wait... You can also access SAPI via .NET instead of using COM directly. You can make the SAPI System.Speech assembly accessible by using Add-Type.
Add-Type -AssemblyName System.Speech
$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer
$synth.Speak("Hello from dot net")
SAPI is fun to play with, but it comes with a limited set of voices and speech recognizers. If you want to experiment with other voices, you'll need to purchase them or switch speech systems. One option is the Microsoft Speech Platform, which supports several additional voices.
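To see which voices you actually have before shopping for more, you can ask the .NET synthesizer. This is a rough sketch using System.Speech; the variable names are only illustrative, and the list you get depends entirely on what is installed on your machine.

```powershell
Add-Type -AssemblyName System.Speech
$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer

# Show the name, culture, and gender of each installed voice.
$synth.GetInstalledVoices() |
    ForEach-Object { $_.VoiceInfo } |
    Select-Object Name, Culture, Gender

# Pick the first enabled voice by name and speak with it.
$first = $synth.GetInstalledVoices() |
    Where-Object { $_.Enabled } |
    Select-Object -First 1
if ($first) {
    $synth.SelectVoice($first.VoiceInfo.Name)
    $synth.Speak("This is " + $first.VoiceInfo.Name)
}

$synth.Dispose()
```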
Unfortunately, voices and speech recognizers are not compatible between the two Microsoft speech systems. They have slightly different designs reflecting their different use cases. SAPI is designed for desktop platforms and single users. The SAPI speech recognizers are tunable to a specific user, and they support recognition of arbitrary words with a dictation engine. A single running instance of the SAPI speech system can be shared among many applications (i.e., the SAPI provider runs out-of-process). The Speech Platform is server oriented. It is in-process (AKA InProc), so each process that requires speech capabilities will have its own instance of the Speech Platform speech system. You could run multiple speech-capable processes on a single server (e.g., concurrent voice recognition processes on several users' voice mailboxes).
I'm assuming that you have already downloaded and installed the Microsoft Speech Platform SDK, the runtime, and the language packs (speech recognizers and text-to-speech voices) you want to use. Once again, use Add-Type to add the Speech Platform assembly, then create Microsoft.Speech objects in your PowerShell environment. The Speech Platform requires you to set the audio output destination before you can hear what is said.
Add-Type -Path "C:\Program Files\Microsoft SDKs\Speech\v11.0\Assembly\Microsoft.Speech.dll"
$ms_speak = New-Object Microsoft.Speech.Synthesis.SpeechSynthesizer
$ms_speak.SetOutputToDefaultAudioDevice()
$ms_speak.Speak("Hello, again, and again!")
After creating the SpeechSynthesizer object, you can record the speech to a file with:
$ms_speak.SetOutputToWaveFile("hello.wav")
$ms_speak.Speak("Greetings")
$ms_speak.Dispose()
You must call Dispose() on the object to commit the speech audio data to the named file.
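Since the file is only complete once the object is disposed, it can be worth wrapping the recording in try/finally. Here is a sketch under a couple of assumptions: the SDK is installed at the same path used above, and the $wav variable is just a name I made up. It also builds an absolute path, because .NET resolves a relative file name against the process working directory, which may not match your current PowerShell location.

```powershell
Add-Type -Path "C:\Program Files\Microsoft SDKs\Speech\v11.0\Assembly\Microsoft.Speech.dll"
$ms_speak = New-Object Microsoft.Speech.Synthesis.SpeechSynthesizer

# Absolute path in the current PowerShell location (assumes a file system location).
$wav = Join-Path (Get-Location).Path "hello.wav"

try {
    $ms_speak.SetOutputToWaveFile($wav)
    $ms_speak.Speak("Greetings")
}
finally {
    # Runs even if Speak throws, so the wave file is always closed.
    $ms_speak.Dispose()
}
```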
I recommend reviewing the MSDN documentation for both speech systems. Also, check out the Out-Voice function and this blog post (both by Boe Prox), in which he describes how you can spelunk the two systems from PowerShell with Get-Member. Finally, Language Packs provide SAPI text-to-speech voices and speech recognizers for a few non-English languages.
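If you want to try the Get-Member spelunking yourself, something along these lines should get you started. It is only a sketch; the exact members listed will depend on the SAPI and .NET versions on your machine.

```powershell
Add-Type -AssemblyName System.Speech
$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer

# Methods and properties exposed by the .NET synthesizer.
$synth | Get-Member -MemberType Method, Property

# The same trick works on the COM object.
$com = New-Object -ComObject Sapi.SpVoice
$com | Get-Member

$synth.Dispose()
```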
--
P.S. Technically you can use SAPI InProc or shared (out-of-process).
P.P.S. There really is no getting away from COM. It's still one of the architectural pillars of Microsoft server and desktop products.