Last updated - July 8, 2021
Anyone dabbling in any type of text-based content creation knows how difficult it is to get in front of relevant eyeballs these days. The competition for both the time and attention of not only new visitors but also the existing ones is getting tougher by the day, which makes any kind of competitive edge (in a fairly loose sense of it) all the more important.
I’m here to tell you how a text-to-speech (TTS) plugin can be your advantage in getting your content seen by new audiences. The rapid advances in voice technology have gotten to a point where the synthesized voices are nearly indistinguishable from the real thing, it takes only a few minutes to add audio to your website, and the cost of entry is extremely affordable – not to mention non-existent for WordPress users, as I’m about to demonstrate.
Here’s what one such player looks like in a blog post:
But first, a little bit about the importance of website-based audio content today.
The power of a listening experience
There are plenty of reasons why digital audio is a preferred medium when it comes to content consumption. It’s where your audience is congregating online, and they are listening more and more. While I can’t tell you whether they listen more than they read (I’d love to have that bit of information), I can tell you for a fact there are situations when they are solely listening and, in a large number of cases, doing something else entirely at the same time.
It’s awfully convenient to have text read out to you, in no small part because it allows multitasking. Using a TTS plugin to attract and retain new visitors means keeping up with the times. With a few clicks, readers become listeners and you get a new, yet very familiar way to engage with visitors across different devices, in their own time and place: while they are out, in the office, driving or commuting, at home, and so on.
Another major benefit of audio is its adaptability to different formats and types of content. There’s something for everyone and every occasion: a quick news update, a long(er) story-driven report, an audio blog, and so on, in virtually every length and on every topic. Plus, listeners can consume audio the way they want: online, offline, in one sitting or several, and hands-free.
Now for the good stuff.
Adding a text to speech solution to your website
Quick disclaimer: because I have ‘behind the scenes’ access, I’ll be using the Trinity Audio WordPress plugin to show you the ropes and highlight some important data (I am big on the whole ‘being contextual’ thing). You can use any other solution out there, free or paid, as the principle is pretty much the same. And just like my company’s solution, do note these are suitable for other websites and content management systems – they just come in a slightly different form.
That’s pretty much the entire process behind the WordPress plugin, only significantly simplified thanks to the familiar environment of the CMS. There’s no rocket science behind it: besides the usual downloading and activating of the plugin, users can pop the hood of their admin interface and configure the settings to their liking. The plugin works as soon as you reload your website, and it automatically creates an audio version of your existing content.
Quality of voice output
At the center of it is Amazon Polly’s high-quality voice output with realistic voices available in dozens of languages. It’s arguably the best text-to-speech service on the market right now thanks to deep learning technologies that are able to synthesize speech that sounds like a real human voice. You can also make a case for Google’s Cloud Text-to-Speech that uses WaveNet’s speech synthesis and the company’s own neural networks to deliver high-fidelity audio – I guess it’s a matter of personal taste.
In any case, the quality of the audio is super important if you want users to accept the fact it’s synthetic. We use Polly’s Newscaster voice, which really does feel like a real person is narrating your content, complete with verbalized emotions. On top of that, you can choose between a male or a female voice, different languages, and even local accents (where applicable, mostly for English) to further refine the listening experience. You can hear how it sounds here – feel free to play around with different languages.
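If you ever call Polly directly rather than through a plugin, the newscaster style is requested through SSML. Here is a minimal sketch of what such a request could look like. The helper names `build_newscaster_ssml` and `polly_request` are made up for illustration, the default voice is an assumption, and this is not a description of how any particular plugin works internally:

```python
from xml.sax.saxutils import escape


def build_newscaster_ssml(text):
    """Wrap plain text in Polly's news-domain SSML tag (hypothetical helper)."""
    # Escape &, < and > so the post text cannot break the SSML markup.
    return f'<speak><amazon:domain name="news">{escape(text)}</amazon:domain></speak>'


def polly_request(text, voice_id="Joanna"):
    """Build keyword arguments for Polly's synthesize_speech call."""
    return {
        "Engine": "neural",    # the newscaster style needs a neural voice
        "OutputFormat": "mp3",
        "TextType": "ssml",
        "Text": build_newscaster_ssml(text),
        "VoiceId": voice_id,   # assumption: a voice that supports the news style
    }


if __name__ == "__main__":
    params = polly_request("Audio & text can live side by side.")
    print(params["Text"])
    # With boto3 installed and AWS credentials configured, the request
    # could then be sent like this (not executed here):
    #   import boto3
    #   response = boto3.client("polly").synthesize_speech(**params)
    #   audio_bytes = response["AudioStream"].read()
```

Building the parameters separately from the network call keeps the sketch runnable without an AWS account, and makes it easy to swap the voice or output format per post.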
Taking care of the finer points
When implementing any new technology, there are bound to be questions, even some emerging challenges. For instance, one of the questions we get asked most often is about the effect on the website’s loading time. Our player loads asynchronously, so it doesn’t put a strain on other processes or add latency to the page.
The WordPress plugin can read the post’s body, headline, and excerpt, all of which you can modify. A lot of people want to know how a TTS plugin understands which text is and isn’t relevant. I can’t reveal any business secrets, but let’s just say we put a lot of faith in artificial intelligence to sort it out, as I’m sure other solutions do too. In terms of placement, I always advise the top position, right below the title, because that’s where it’s most visible – you want people to know there’s an audio option.
One particularly great thing about most text-to-speech plugins is that playback continues in the background. In turn, this allows visitors to freely explore other content while they listen, increasing their time on the site. And because each post can have a different default voice gender and language, you can personalize it according to your audience segments.
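To make the "body, headline, and excerpt" idea concrete, here is a deliberately simple sketch of how the text handed to a TTS engine could be assembled from a post. It is only an illustration under my own assumptions: the function names are hypothetical, it uses plain HTML stripping rather than any AI-based relevance detection, and it does not reflect the actual logic of any plugin:

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script>/<style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def strip_html(html):
    """Return the visible text of an HTML fragment as one string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def text_to_read(title, excerpt, body_html, include_excerpt=True):
    """Join headline, optional excerpt, and body into the text to narrate."""
    sections = [title.strip()]
    if include_excerpt:
        sections.append(excerpt.strip())
    sections.append(strip_html(body_html))
    # Blank lines between sections give the narration a natural pause point.
    return "\n\n".join(s for s in sections if s)


if __name__ == "__main__":
    print(text_to_read("My Post", "Short summary.", "<p>Hello <b>world</b></p>"))
```

The `include_excerpt` flag mirrors the idea that each part of the post can be switched on or off before the audio is generated.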
Data supports audio’s big push
We get a lot of data about user consumption and preferences from the various publishers and content creators using our plugin. For instance, we are close to 4 million daily player loads, and the CTR is 3% and higher, depending on how long the player has been embedded. Generally speaking, the data shows that the longer a website uses any kind of TTS solution, the higher the engagement.
I like to say that we are in the middle of an audio revolution or renaissance, and I honestly believe audio content in some shape or form will become a necessity. Audio has made a comeback, in no small part thanks to text-to-speech technology that provides humanlike, relatable voices. The technology is developing at a breathtaking pace and is continuously being optimized, so I expect to see an increase in the usage of TTS software as the world catches on. It just makes sense.