Yet another article has been written about hacking voice authentication systems through impersonation, this time from PC Magazine. The tagline reads: “a Black Hat presentation demonstrated how to crack voice authentication cheaply and quickly.” Click here to view the complete article.
The article correctly states that recorded-playback attacks against voice biometric systems used to be time-consuming and difficult: a hacker had to obtain recordings of the exact words spoken in the voice biometric passphrase, stitch them together convincingly, and submit the result to the system. However, the article also notes that this laborious process is no longer needed. New speech synthesis software from companies like Lyrebird and Baidu can create almost any spoken content, in a believable manner, from only “seconds” of speech.
There is no doubt that speech synthesis technology and transfer learning are making many voice biometric companies nervous. The good news, however, is that these technologies are not yet commonplace or perfected, and companies such as VBG already have solutions to address widespread fraud by synthetic means, and are creating new counter-measures as well. This is one more round in the never-ending cat-and-mouse game between fraudsters and authentication technology providers.
So, what can be done to address these threats? The attack in the article targets static passphrases. The first countermeasure to consider is the use of random passphrases, more phonetically complex passphrases, or a combination of both. Simply put: make it harder for fraudsters to synthesize the required speech content. And if you can leverage conversational speech, better still. Use of live, interactive conversation for voice biometrics is where speech synthesis attacks break down most readily.
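To make the random-passphrase idea concrete, here is a minimal sketch of a one-time challenge generator. The word list, word count, and digit count are illustrative assumptions, not any vendor's actual implementation; the point is that a random, single-use phrase cannot be pre-synthesized, forcing the attacker to generate convincing speech in real time.

```python
import secrets

# Illustrative word pool; a real deployment would tune this for
# phonetic richness and speaker coverage.
WORDS = ["harbor", "velvet", "crimson", "lantern", "juniper",
         "quartz", "meadow", "falcon", "topaz", "willow"]

def make_challenge(num_words=3, num_digits=2):
    """Build a one-time spoken challenge the caller must repeat.

    Because the phrase is random and used only once, a fraudster
    cannot prepare a recording or a synthesized clip in advance.
    """
    words = [secrets.choice(WORDS) for _ in range(num_words)]
    digits = [str(secrets.randbelow(10)) for _ in range(num_digits)]
    return " ".join(words + digits)

print(make_challenge())
```

Using a cryptographic source (`secrets`) rather than `random` matters here: the challenge must be unpredictable to an attacker observing past sessions.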
We have also written multiple times about multi-factor authentication (MFA) on our website, in our how-to guides, and in past articles. The attack in the PC Magazine article defeats only a single factor, with plenty of time to experiment and retry, and it relies on an orchestrated effort by highly technically savvy individuals. However, the day will come when almost anyone can use speech synthesis. The need for MFA is therefore more critical than ever.
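The MFA point can be sketched in a few lines. The threshold and function names below are illustrative assumptions, not VBG's actual API; the sketch simply shows why pairing a voice score with a second factor blunts a synthesis attack: even a near-perfect synthetic voice fails without the one-time passcode.

```python
import hmac

VOICE_THRESHOLD = 0.85  # assumed acceptance threshold, for illustration

def authenticate(voice_score: float, otp_entered: str, otp_expected: str) -> bool:
    """Accept the caller only when BOTH factors pass."""
    voice_ok = voice_score >= VOICE_THRESHOLD
    # Constant-time comparison avoids leaking the passcode via timing.
    otp_ok = hmac.compare_digest(otp_entered, otp_expected)
    return voice_ok and otp_ok

# A convincing synthetic voice alone is not enough:
print(authenticate(0.99, "123456", "654321"))  # False
print(authenticate(0.99, "654321", "654321"))  # True
```

The design choice worth noting is the conjunction: each factor is a gate, so an attacker must defeat all of them within the same session, rather than any one of them at leisure.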
Other technologies are also increasingly being introduced that work in conjunction with voice biometrics, such as “phone printing”, sonar, and similar techniques that capture the more complete technical and behavioral elements of talking on a phone. These are things that transfer learning will not be able to model.
And, there are other techniques which VBG can employ, and new technologies we are developing, as speech synthesis counter-measures. For obvious reasons, we won’t provide details here. Sooner or later, fraudsters will figure them out … and we’ll of course be working on a new set of counter-measures by then.
There are two important takeaways to leave you with at this point. First, almost any single factor can be defeated when considered in a vacuum, with a technically talented team of researchers given multiple attempts to breach it. Those conditions rarely hold in the real world, and they are the whole reason MFA is almost universally advocated. And second, voice biometric companies are aware of the threat of speech synthesis and have good solutions available, with new ones coming. Stay tuned; we’ll be writing more on this topic in the coming months.
Peter is an avid reader, particularly of high-tech topics. These articles express his opinion only, but he hopes you enjoy them!