Voice recognition software is getting better all the time. Despite its reputation for not quite working, modern tools can do a decent job of translating your voice into text for dictation and inputting commands. We took a look at some of the options to bring you this overview of the best speech-to-text software around. Our favorite is Dragon NaturallySpeaking, but there are plenty of free (or at least cheaper) options around, too.
If you want to dictate while doing other things, speech-to-text is perfect. You can write a speech for work while cooking, for example. Most people speak faster than they type, so it can make them more productive, provided that the software is accurate. For those with a physical impairment, it may be their only way to use a computer, making accuracy all the more important.
Mileage may vary depending on your language or accent. Those of us from more remote parts of the world may find our accents less likely to be recognized than others. Americans are better catered to than people from Scotland, for instance. Different languages present different challenges to computer interpretation systems. We will focus on English in our testing, but may throw in snippets of other languages to see what happens.
For our testing, the main thing we’ll be looking at is accuracy. We’ll read a fixed set of text to each tool to compare how it is handled. We will also look at command recognition where appropriate.
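For readers who want to put a number on accuracy in their own tests, one common measure is word error rate: the number of word substitutions, insertions and deletions needed to turn the tool's output into the original text, divided by the length of the original. The sketch below is an illustration only, not the scoring used in this roundup. The function name and the sample strings are assumptions made for the example, and it ignores punctuation handling entirely.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: the tool dropped a word
                curr[j - 1] + 1,     # insertion: the tool added a word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: what was read aloud versus what the tool typed.
spoken = "desperate with hunger and reckless with misery"
typed = "desperate with hunger and breakfast with misery"
print(f"Word error rate: {word_error_rate(spoken, typed):.1%}")
```

A lower score is better, and a perfect transcript scores zero. Because insertions count as errors, a tool that adds a lot of stray words can score above 100 percent.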
The big question we wanted to ask was whether voice recognition software has moved from being an occasionally useful novelty to something that offers a reasonable, or even superior, alternative to typing.
Our microphone is a standard headset, nothing fancy. We’ll be doing most of our testing on the same PC and will use an iPad and Android phone to look at offerings from Apple and Google. Our Mac mini failed to recognize the mic input, so it’s out by default.
Windows Speech Recognition
Windows Speech Recognition is built in to Windows. All you need to do to set it up is search for “speech recognition” in the Windows search box, then click through the installation wizard. You’ll need to repeat a couple of sentences aloud. Once you’re finished, it immediately offers to improve itself by flicking through your emails and documents. No thanks, fellas.
You might have concerns about enabling speech recognition on Windows given its record on privacy. If so, our article on Windows 10 privacy settings is worth checking out.
Moving on, Windows strongly recommends that we take a voice commands tutorial. Weirdly, though we are using Windows 10, we can only select Windows 7 or 8.1. The tutorial is a video that goes through the command list and ours clearly says Windows 10 at the top, so we skipped to giving it a try.
We started with a success and opened the search window as intended. Trying to add a new line to a document gave us a “what was that?” message, though. It was the same with the second attempt. The third try opened a new document and the fourth tried to select all the numbers in our document, so it wasn’t especially useful.
Saying “go to start of sentence” took us to the start of our typed sentence. “Go to end of sentence” worked, too. Both took longer than hitting the home or end keys, though.
Clippy Can Speak, But Can It Dance?
Our PC was struggling as much as we were at that point. Our i5-7600 test system experienced worrying performance dips when using speech recognition and we had trouble flipping between documents and browser tabs in a way that reminded us of the old Microsoft favorite, Clippy.
Windows Speech Recognition turns off quickly, instead of taking 45 seconds to load an animation of itself scuttling into the distance, so thanks for small mercies. It’s also for the best that Windows Speech Recognition doesn’t present us with a face to punch. Monitors aren’t as durable as they used to be.
Since it often did the wrong thing when misinterpreting our commands, rather than nothing, we count ourselves lucky that it didn’t do anything serious while we were using it. The potential for a work-related catastrophe is there, though. It continued to operate after we put our mic down, too.
It’s fair to say we had mixed success with Windows Speech Recognition. It is impressive when it works, but gets it wrong too often to use regularly. It could be useful for the physically impaired, but there are better options.
- Built in to Windows
- Easy to turn off
- Potential Microsoft privacy issues
- Accurate rendition of speech is the exception, not the rule
- Often does the wrong thing
Siri Speech To Text
Having been let down by Microsoft, we thought Apple wouldn’t disappoint us; its speech recognition is, after all, powered by Nuance, the company behind Dragon. It did disappoint, though, by refusing to recognize our mic input on the Mac, so instead of looking at Apple’s desktop speech to text, we decided to test Siri on the iPad.
Siri is the highest-profile service around, with its iPhone incantation popularizing the concept of speech to text and breaking records for the technology most shown off in bars.
Apple’s devices always look good and are geared toward user-friendliness. We were eager to see if that would translate to functional speech recognition software. Let’s see how Siri fared during our testing.
After launching the Notes app, we engaged in dictation, which involves input being sent to the cloud for Apple to process. If you are interested in the privacy aspect of this, read our article on the best cloud privacy laws.
Using the cloud enables a lot of computing horsepower to be thrown at interpreting what you say. You might think that approach would be slow, but it’s surprisingly quick. There is a noticeable delay, but it isn’t long and still works faster than typing.
Using iPad speech recognition is simple. You just tap the mic whenever the keyboard is visible, and it works the same way in most apps that use the keyboard.
What the Dickens?
Apple did a decent job, but it still had trouble with Dickens. Most words in our test piece were rendered accurately, but there was still the odd clanger, such as, “Oliver was reckless with Missouri.”
It fared better with simple phrases and most of what we said was heard correctly as long as we kept to a basic vocabulary. It dropped the occasional word, though.
For web searches, asking Siri is often quicker than typing, especially on smaller devices with fiddly keyboards.
We next tested it with a few foreign words and place names. It handled “konnichi wa,” but failed to recognize the names of K-pop band members. Still, it is to Apple’s credit that we felt confident enough to give it a shot.
Overall, Siri does well with simple phrases and it’s good enough to use when you want to search for something in a hurry. As its users will be aware, though, it makes plenty of mistakes and is quite limited. Still, it’s a good effort from Apple.
- Good accuracy rate
- Gets most words right
- Simple to use
- Still makes errors regularly
Google Docs Voice Typing
Google Docs Voice Typing is free and available wherever Chrome is. It doesn’t require setup and can be activated from the tools menu in any document.
Beginning with our Dickens test, we found “Oliver Twist” was sometimes “Oliver” and at other times “all over.” Many words were skipped and the results were full of errors. Google Docs Voice Typing turns itself off automatically and, at one point, stopped responding despite being on, so we needed to repeat a section.
After getting poor results in our dictation tests, we tried giving commands and fared better. We switched between italics and bold type, added punctuation and dictated words, all of which were recognized.
Still, Google Docs Voice Typing is simple to use, even if its accuracy leaves something to be desired. It does seem to do better if you speak loudly and clearly, though.
When things are kept slow and simple, it gets more right, but it’s not accurate enough to be much more than a gimmick. If you needed to dictate hands-free for a while, you could do so and correct the errors afterward, but there will be a lot of them.
Google’s Voice Recognition Works Better on Mobile than Desktop
Disappointed with its desktop performance, we decided to give Google another chance. This time, we used Gmail on Android and, surprisingly, fared much better. Accuracy was near 100 percent for dictation and text, but the Dickens tests saw it drop considerably. Overall, though, we found that the Android version worked much better than the desktop one.
Clearly, there is potential in Google’s technology. Android gave us better results and, if you’re willing to tolerate the many mistakes, can be a useful alternative to its keyboard.
- Sometimes recognizes simple sentences
- Decent performance on Android
- Only available in Chrome on desktop
Speechnotes
Speechnotes is a browser-based speech-to-text service that allows you to dictate into your browser. It doesn’t require setup beyond granting it permission to use your mic, so you can get straight to dictating.
It couldn’t be simpler to use. There’s a big area for typing text and a big microphone to click when you want to start and stop dictating.
For our first test, we tried hitting it with rap and it made as good a go of it as could be expected given the quality of our rhymes. It got sketchier when we tested punctuation. Full stops, commas and question marks worked most of the time, but colons became “codons” or “Kyle Long,” who we’ve never heard of.
The emoji commands brought smiles to our faces, as well as our screens, but dash and hyphen rendered as “dodge Hartford.”
Our Dickens test returned, “Oliver Twist was desperate with hunger and breakfast with misery,” which was, at least, in the spirit of the story. Mr. Bumble would be further enraged to find himself described as an “alpha mom,” though, especially while we had British English selected.
We tried setting it to U.S. English and speaking in our best American accent only to discover “mom” turned into “bomb.” It was hopeless. Fearing that it might be our diction, we turned to James Earl Jones. A recording of an iconic scene from a certain movie failed to register correctly. We tried shouting into our mic. That didn’t help either.
Keep It Simple
We did better when we used simple phrases. It did a decent job of getting things right, though there were still errors.
You could use Speechnotes to make a rough draft, provided things are kept simple and you speak slowly. There would be quite a few mistakes to correct, though, giving us the impression of a dishwasher that won’t work unless you wash the plates before putting them in.
Speechnotes works in any browser, as long as the browser is Chrome. You can export to .doc or .txt format or upload it to Google Drive.
- Good for simple sentences
- Makes a few errors
Transcribe
Transcribe’s focus is on file-based audio, so if you want to record a .mp3 and transcribe it later, it’s the tool for you. We’re not testing that, though. We’re just looking at its dictation capability.
It claims that its dictation feature allows you to work two to three times faster than typing. For that to be true, it needs to translate your speech to text accurately. As its own website points out, though, doing so with complete accuracy is still a pipe dream.
It gives you a week of free service, after which it charges $20 per year. That won’t break the bank and having an ongoing charge, rather than a hefty one-off fee, suggests the company is confident it will keep you as a customer. The subscription also means you can always take advantage of the latest version of its software.
As a paid service, the onus is on Transcribe to deliver. With its competitors mostly failing to provide stiff competition, though, the bar has not been set high. Let’s find out if Transcribe can clear it.
After signing up, we got a brief tour with a pop-up explaining a few tools and controls. We then headed for the dictate button, eager to see what Transcribe would make of our rambling.
We began with Oliver. As usual, we got about 50 percent accuracy, with the odd sentence being interpreted perfectly and others coming back to us as “advancing to the master, bison and Spoon in hand,” which broke the spell somewhat. Oliver was renamed “all over” at one point, too.
Transcribe’s performance improved with simple sentences. It started by getting eight consecutive sentences 100 percent correct. The first mistake came when we got overconfident and started belting out words at speed, but it got things right when we went back and repeated ourselves more slowly.
A Reasonable Job of Being Useful
Compared to Windows, Speechnotes and Google, Transcribe is way ahead and it edges past Siri in reliability. It still can’t manage “Oliver Twist,” but does a good job of rendering simple sentences. It didn’t understand our French, but can hardly be blamed for that, as few French people do, either.
If you can’t type or are so bad at it that you make a mistake or two every sentence, you might find Transcribe improves your productivity. It may also be useful for recording meetings or conversations in situations where you only need rough notes or are happy to go back and correct errors later.
Transcribe is browser-based, but dictation only works in Chrome. You can export to .doc, though, so you aren’t tied to the service.
- Works well
- Not free
Dragon NaturallySpeaking
We looked at Dragon NaturallySpeaking last. It is the most expensive tool on this list. We tested the cheapest version, Home, on our PC. It claims it “captures your thoughts as quickly as you can speak them.” After being disappointed by the other software, we hoped it did, but were skeptical.
Setup is an ordeal, with awkward download links and a serial number that needs to be entered in five different fields without allowing users to paste the whole thing in at once. Taking a look at the install options, we found several English modules available. You can pick from Australian, Canadian, U.K., U.S., Indian or Southeast Asian, which is impressive, but you might want to disable the ones you don’t want as they eat over 200MB of space each.
It got confusing when choosing our region and accent, though. If we selected the U.S. as our region we could choose from all available accents, but when we chose the U.K. we couldn’t select Spanish or Pakistani accents. With our region set to India, Australia or New Zealand we couldn’t choose our accent at all.
Travelers that set their region to their location without checking carefully might not realize they can tune Dragon to their accent, which seems like a blunder from a usability perspective.
Assuming that our U.K. accent was “standard,” we proceeded. There were advanced options to select our vocabulary type, but only large was available. You can choose the speech acoustic model, too, but it only offers a previous version of the default BestMatch V.
Enter the Dragon
On start-up, we were given the option of launching in trial mode, despite not finding a free trial link on the website, or activating the product, which we chose.
It asked us to read some text to confirm our microphone worked. Dragon was so confident, it cut us off halfway through, letting us move on to the tutorial. “Go through these progressive simulations and you’ll learn important skills efficiently!” it announced.
The tutorials looked clumsy, but were better when it came to content. Our first chance to test Dragon’s speech recognition came when it asked us to turn the microphone off with our voice. Doing so took two attempts. The first tutorial dictation test took two tries before hearing us, as well, but the issue vanished outside the tutorial, so it isn’t that serious.
From that point forward, it got everything right, including some complex punctuation and numeric input. Since we were only saying what it told us to, though, we reserved our judgment.
The tutorial gives you advice on how to speak when using the application, which is welcome and will help improve users’ chances of being understood. It also teaches you to use the “correct” menu when it makes mistakes.
At one point, a pop-up appeared to tell us what we said was not recognizable. We wondered if it was really our fault. Another pop-up offered to install a browser extension for us. Some may find these pop-ups helpful, others may consider them an irritant.
Using Dragon NaturallySpeaking
After jumping through all these hoops, Dragon is a breeze to use. Its menu bar sits at the top of the screen and has a big red microphone to click when you want to switch it on. Wisely, Dragon doesn’t let you start by saying “microphone on.” You have to click to begin.
The menu bar is well-designed and gives you access to many useful features. Dragon allows you to select user profiles, which is helpful if you have people with different accents using the same machine. It can analyze your vocabulary by looking at user-selected documents, meaning you can train it with data that reflects your personal use of language.
There are several audio calibration options and a feature that allows you to train specific phrases. You can also view a recognition history to see if there is anything Dragon gets wrong frequently.
It has a raft of help features, too. There is a performance assistant and several help and support options. The website includes a wealth of documentation, but it seems scattershot. The user guide link we saw didn’t cover much beyond installation. There are useful command guides for the Professional and Legal versions, but we couldn’t find one for Home.
There is also a knowledge base, so if you need support, there are a lot of options. When browsing through this we learned that only one user per machine is permitted, so taking advantage of the multiple profile feature is going to be costly.
The “correct” menu is useful and gives you a list of alternative interpretations for what you said. They are listed so you can pick them by number if you see the one you want.
Having been impressed by its features, but disappointed by the minor usability issues during setup, we began our “Oliver Twist” dictation test wondering whether Dragon would justify its price.
Five minutes later, we had our answer. Dragon is jaw-dropping when it comes to its core feature of recognizing what you say. Take a look at our dictation test results.
That’s 200 words of 19th century prose rendered with three mistakes. “Rebel” became “rabel,” “beadle” became “beetle” and it had no chance with “Mr. Limbkins.” The sketchy punctuation is down to us, and just what is a beadle anyway?
We know humans that aren’t as good at interpreting speech. It was so good that we had to resort to “Mary Poppins” to get an amusing mistake out of it, with “supercalifragilisticexpialidocious” becoming “super California listing expelling closures.”
Dragon includes a handy “learning center” that shows you commands relevant to whatever you’re doing. It is a nice way to learn about the software, especially when starting out. Basic dictation is simple and can be used without assistance, though.
The Best Speech-to-Text Software?
- Easy to use
- Many features
- Plenty of help options
- Awkward setup
We had fun testing these tools and exposing their limitations. There were many entertaining mistakes. Comedy writers with writer’s block could do worse than dictating to some of these applications and seeing what funny lines materialize.
Our initial impression when looking at the free options was that this technology is impressive when it works, but needs to become more reliable to realize its potential.
Looking at Dragon changed that. It is on a different level than the others for accuracy. The difference was night and day and we can see ourselves using Dragon in scenarios where none of the other tools would be viable.
The mobile options are worth using for search, as long as you’re prepared to head to the virtual keyboard on the many occasions that they fail to work.
Transcribe makes a decent attempt at accuracy, but isn’t good enough and, for professional use, we consider Dragon worth the money for the extra performance.
While we’ve had fun, this article has partly been an exercise to see why these services aren’t more widely used. The free options, though not without merit, leave much to be desired. Still, there’s no harm in trying them and, who knows, you may find they recognize everything you say.
The Best Voice Recognition Software
If you are willing to pay $150 for Dragon, things change completely. Hopefully, its technology will filter down to the free offerings. That could be a game changer and alter the way we interact with our devices forever.
The science fiction dream of our computers responding to our words may be closer than we think, though in most cases you will need a high tolerance for mistakes.
If you have any recommendations for other services, let us know. We’d be interested to hear how you did with them. It may be that people with different voices have different experiences, so shop around if you don’t like our recommendations. Thanks for reading.