Siri Text to Speech with iPad Playground

Many mobile programmers are unaware that a rich speech synthesizer has been built into Apple's mobile environment ever since iOS 7.  iOS software development has come a long way.  We'll look at how easy it is to use the text-to-speech API available to iOS programmers on an iPad using the Swift programming language.


You can now prototype samples with the Swift Playgrounds App, which began shipping with iOS 10.  A Swift Playground is a touch-based interactive development environment for the iPad.  You can run sample code and see (and hear) the results.  The goal is to learn not just the Swift programming language but also iOS programming, including the libraries and frameworks which give it such power.

Speech Synthesis

Unlike the 'bad old days' of mobile development, programmers no longer have to rely on Objective-C.  Today Swift has become the clear winner for the future of iOS programming.  All the built-in frameworks (Apple's term for libraries) can be accessed not just from the older Objective-C but also from Swift.  Not only that, when you call them from Swift you get to take advantage of modern programming language features.

The AVSpeechSynthesizer class produces speech from a given utterance.  The AV stands for audio visual, the prefix shared by the classes in the AVFoundation framework.  You create an utterance using AVSpeechUtterance(string:), which merely takes the text you want spoken.  It really is that easy.  The code looks something like this:


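import PlaygroundSupport
import AVFoundation

// The text we want spoken
let text = "I'll be back"

// Keep the Playground running so the asynchronous speech has time to play
PlaygroundPage.current.needsIndefiniteExecution = true

let speechSynthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: text)

// Queue the utterance; speak() returns right away and playback follows
speechSynthesizer.speak(utterance)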


You'll see our initial text defined, which is the text we want spoken.  

The next line sets the needsIndefiniteExecution property for the Playground page.  It forces the sample code to continue executing rather than shutting down immediately.  We need this because text to speech happens asynchronously: our code calls the function and the function returns right away, even though the text has not yet had a chance to play back.  Without this statement, we would never hear anything, as the code would shut down too soon.

It really comes down to two lines of code.  We create an utterance using the text we want spoken, and then we call the speak() method on the instantiated AVSpeechSynthesizer object to play back our text.

You can try different phrases by changing the text definition.  
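
Beyond changing the phrase, an utterance can also be tuned before it is spoken.  The sketch below is an illustration rather than part of the original sample.  It assumes a British English voice is installed on the device (AVSpeechSynthesisVoice(language:) returns nil otherwise, and the default voice is used instead), and the rate and pitch values are arbitrary choices to experiment with.

```swift
import PlaygroundSupport
import AVFoundation

// Keep the Playground alive while the speech plays asynchronously
PlaygroundPage.current.needsIndefiniteExecution = true

let speechSynthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "I'll be back")

// Assumption: an en-GB voice is available; nil falls back to the default voice
utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")

// Illustrative values only: a little slower than default, a little higher in pitch
utterance.rate = AVSpeechUtteranceDefaultSpeechRate * 0.9
utterance.pitchMultiplier = 1.2   // allowed range is 0.5 to 2.0

speechSynthesizer.speak(utterance)
```

Try nudging the rate and pitch up or down and running the code again to hear the difference.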

Try It Out!

You can easily try this out on a Playground of your own.  

You have two options.  You can just type the example straight into the Playground App.  Or, if you happen to have a computer running macOS Sierra or later, you can copy the full text of the code to the clipboard on your Mac and then paste it on your iPad.  Through the magic of Universal Clipboard, you can actually copy and paste between devices.

Once you have the code, tap on the Run My Code button in the lower right hand corner.  There is nothing to see in this sample, but you should hear the text being spoken.  After you have finished listening to it, you can tap the Stop button to end execution.
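
If you would rather not tap Stop by hand, one possible approach is to let the synthesizer tell you when it is done.  The sketch below is an assumption about how you might arrange this, not part of the original sample: SpeechFinisher is a hypothetical helper class that adopts AVSpeechSynthesizerDelegate and ends the Playground run when the utterance finishes.

```swift
import PlaygroundSupport
import AVFoundation

// Hypothetical helper (not in the original sample): ends the Playground run
// once the synthesizer reports that the utterance has finished playing.
class SpeechFinisher: NSObject, AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        PlaygroundPage.current.finishExecution()
    }
}

// Still needed: speech plays asynchronously after speak() returns
PlaygroundPage.current.needsIndefiniteExecution = true

let speechSynthesizer = AVSpeechSynthesizer()
let finisher = SpeechFinisher()   // hold a strong reference; the delegate property is weak
speechSynthesizer.delegate = finisher

speechSynthesizer.speak(AVSpeechUtterance(string: "I'll be back"))
```

The same delegate protocol also reports when speech starts, pauses, or is cancelled, so it is a natural place to hook in if your App needs to react to playback.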

Updating Your App

If you have an App that could use voice output, you can see how easy this framework is to use.  This functionality is available not just on iOS but also on its older sibling, macOS.  Of course, not all Apps need voice output.  It all depends on your App.

We should also be careful: the text-to-speech capability we get here is only part of what voice assistants handle.  They also handle dictation and natural language understanding, which give the computer the ability to understand what the user said.  There are APIs for this as well on both iOS and macOS.

I hope you find this information helpful.  Let us know how you were able to use this in your own Apps.

