Protecting voice communications from fraud and deep fakes [Q&A]


The UK’s National Cyber Security Centre (NCSC) has recently issued new guidance on secure communications for voice and video calls and SMS, in order to help protect consumers from scams.

UK telecoms regulator Ofcom has also announced a crackdown on scam phone calls using fake numbers as their volume has soared during the pandemic.

We spoke to Dr Nikolay Gaubitch, director of research at fraud protection company Pindrop, to learn more about the recommendations, why they’re important, how voice fraud is evolving and how to guard against it.

BN: What’s your view of the new NCSC guidelines?

NG: I really like seeing these NCSC recommendations, for various reasons. The first part that I find striking is the notion that we cannot reliably establish the identity of the person on the other end of a call. The reason I find that exciting is that it was basically the foundation that started Pindrop 10 years ago.

There were some other interesting points too. The recommendation that having fewer communication channels can be beneficial, because they’re easier to protect, is one I agree with as a concept, though it rather contradicts the mainstream view of offering more channels.

The third thing is about SMS. SMS has been used and abused for a long time in this space, and the thinking on ensuring conformance there is much more advanced. Finally, one part that really resonates with me is the suggestion that organizations and companies should provide mechanisms for customers to call back. If you don’t trust or don’t feel safe speaking to someone who’s called you, hang up and call back. This is good advice, but then the organization needs to make sure that the number customers call back on is protected.

BN: The new guidance on voice and video seems to echo what the NCSC has previously said about email. Isn’t much of this just common sense?

NG: Once you see advice gathered in one place it just seems simple, but I think it’s good to have this reminder and to focus attention. I agree that if you look at many of the banks, for example, they embrace this approach already. We’ve received a lot of information over the last couple of years around how to be careful, but I don’t think there’s any harm in repeating things even if they seem to be common sense.

BN: It’s very easy for fraudsters to spoof phone numbers using VoIP, how can businesses guard against that?

NG: Part of the recommendation is for the organization to ensure that its number cannot be hijacked. Of course, spotting a spoofed number then becomes the customer’s responsibility. It’s fine to get customers to call back for confirmation, but now the business has to make sure it can properly confirm that it’s speaking to the customer calling back rather than a fraudster.

BN: At the moment these are still just guidelines. Do you think governments will start to go further and maybe legislate for stronger controls at some point?

NG: I think legislation is always a tricky balance to strike. This is version 1.0, as it says on the document, but I do believe that they have to go into more depth surrounding what can be done, what technologies can be used, etc. It’s also a relatively high-level set of recommendations, so getting into a bit more depth on these points and on what organizations actually can do will be a good next step. And, as I said, legislation can be a tricky balance to strike, but there’s no harm in providing more support for the issue.

It’s also important to make the public more alert to potential phone scams. I believe that initiatives like this, which speak about the technology that could be used to prevent fraud, can also make the public a little bit more comfortable dealing with call centres. It’s a bit of a two-sided thing.

BN: Last time we reported on this we spoke about the possibility of having some new technology to help verify calls. Are we any closer to that becoming reality?

NG: Yes, I think so, and when I was talking about SMS the recommendations actually already point to this. You want the companies that provide you SMS services to be part of established bodies that control the quality of messaging, and that would be a good step for every channel as well.

Losing the value of the phone number as a form of identity has been a real game changer. 10 to 15 years ago there was a much stricter link between the phone number and your identity. That’s partly because you had to prove your identity to buy a mobile, much like a landline, whereas now you can easily walk into a store and get a pay-as-you-go phone off the shelf with no questions asked.

BN: Some banks are now using customers’ voices as a form of identity, how secure is this?

NG: This is where we get into the area of ‘deep fakes’. You have two ways of modifying or generating voices. One is voice synthesis, which means that you create a complete version of somebody’s voice and create the content by typing in what you want it to say. The other form is voice conversion, where I actually speak and my voice is then modified to make it sound like somebody else.

These technologies have been around for decades, however, they weren’t very good until the recent rise of deep learning, which is itself a sub-branch of artificial intelligence. That was a real game changer which made them actually usable. We’ve had technology modifying voices in the music industry for ages: there’s automatic pitch correction, which people don’t even think about, but it makes singers artificially sound better than they really are.

There was a case reported by Forbes last year where some fraudsters made themselves sound like the CEO of a company and managed to authorize a $35 million bank transfer. This is something I think we’re going to see more and more in the coming years because criminals embrace technology that can help them increase their social engineering skills and this is a perfect example.

BN: How can you guard against deep fake voice fraud?

NG: This is the other side of the technology, which is being able to detect whether voices are real or synthetic. Even if a deep fake voice sounds genuine to the ears of human listeners, there are always artefacts in the background that are detectable by technology. Also, people talk in a particular way which is quite hard for a machine to replicate. You can change somebody’s voice but you cannot change the content, yet.
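The idea that machine-generated audio carries statistical fingerprints a detector can pick up, even when it sounds natural to a listener, can be illustrated with a toy spectral measure. The sketch below (hypothetical, not Pindrop’s method; the `spectral_flatness` helper and the tone/noise signals are illustrative assumptions) compares the spectral flatness of a pure machine-like tone against a noise-like signal, the kind of low-level statistic real anti-spoofing systems build far more sophisticated features on:

```python
import cmath
import math
import random

def power_spectrum(signal):
    """Naive O(n^2) DFT power spectrum; fine for a short toy signal."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))) ** 2
        for k in range(n // 2)
    ]

def spectral_flatness(signal, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.

    Close to 1.0 for noise-like spectra, close to 0.0 when energy is
    concentrated in a few bins (a machine-like pure tone).
    """
    p = power_spectrum(signal)
    geo = math.exp(sum(math.log(x + eps) for x in p) / len(p))
    arith = sum(p) / len(p) + eps
    return geo / arith

random.seed(0)
n = 256
# Machine-like signal: a single pure sine tone.
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
# Natural-texture stand-in: broadband noise.
noise = [random.uniform(-1.0, 1.0) for _ in range(n)]

print(spectral_flatness(tone) < spectral_flatness(noise))  # True
```

The point is not that flatness alone catches deep fakes, but that simple spectral statistics already separate obviously machine-like audio from natural-sounding audio; production detectors stack many such cues, learned rather than hand-picked.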

Image credit: Zapp2Photo / Shutterstock

Author: Martha Meyer