AI Vishing Attacks: Analysis of an Emerging Threat

Vishing (voice phishing) attacks are not a new phenomenon, but their frequency is increasing. Several factors contribute to this trend. Stronger email security measures and a higher level of user awareness make email phishing less rewarding, which pushes attackers toward the phone. At the same time, advances in voice AI technologies enable attackers to improve both the effectiveness and the scale of their operations.

This analysis focuses on the specific capabilities that AI-enabled vishing provides to attackers.

Key AI Components in Vishing Attacks

Text-to-Speech and Voice Synthesis Technologies

This area is not new. Voice-over systems and various voice synthesis technologies have existed for a long time. However, recent AI developments have significantly expanded the capabilities of text-to-speech engines.

These technologies are a core element of AI-driven vishing attacks: cloned voices, artificial accents, and simulated emotions can now be generated with ease.

Simulating Human Interaction

Another major development concerns the quality of interaction and conversation produced by large language models. These models are capable of conducting discussions that closely resemble human dialogue. This creates a new level of realism for conversational attacks and significantly enhances AI-based vishing scenarios.

Such capabilities can be implemented in several ways. Examples include augmented decision trees and fine-tuned models. In essence, generative AI is highly effective at imitating human communication.

A simple indication of this effect is the frequency with which people say “please” or “thank you” during interactions with ChatGPT.

Large-Scale Vishing

AI-driven vishing attacks can now be executed at scale.

This represents one of the most visible changes in the threat landscape.

In the past, large-scale attacks were mainly limited to asynchronous channels such as email, messaging, or malvertising.

Synchronous attacks typically required a human operator to communicate with each target individually. Interactions were conducted one-to-one.

With the development of AI technologies, large-scale vishing campaigns have become both realistic and easy to implement.

A single attacker can configure multiple automated agents or “bots.” These bots can call victims and interact with them simultaneously.
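The difference between one-to-one and simultaneous calling can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name, the call length, and the number of bots are assumptions chosen for the example, not measured figures.

```python
def calls_per_hour(concurrent_callers: int, minutes_per_call: float) -> float:
    """Calls completed per hour when each caller handles one call at a time."""
    if concurrent_callers < 0 or minutes_per_call <= 0:
        raise ValueError("expected non-negative callers and a positive call length")
    return concurrent_callers * 60.0 / minutes_per_call


# Illustrative assumptions only, not measured figures.
MINUTES_PER_CALL = 5.0  # assumed average length of one conversation
BOT_COUNT = 50          # assumed number of automated agents running in parallel

human_operator = calls_per_hour(1, MINUTES_PER_CALL)
bot_fleet = calls_per_hour(BOT_COUNT, MINUTES_PER_CALL)

print(f"One human operator: {human_operator:.0f} calls per hour")
print(f"{BOT_COUNT} bots in parallel: {bot_fleet:.0f} calls per hour")
```

Under these assumptions, throughput grows linearly with the number of bots, while a human operator stays capped at one conversation at a time.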

It is important to note that the same technology can also be used defensively. Organizations can conduct large-scale vishing simulations to train employees and improve awareness.

Deepfake and Voice Cloning Attacks

Scale is not the only concern. AI-driven vishing attacks are also becoming significantly more convincing.

The idea of voice cloning has existed for some time, at least in fiction. In the film Mission: Impossible, the character Ethan Hunt changed his voice using a piece of advanced equipment placed on his neck. This scene appeared more than a decade ago.

Today, only about fifteen seconds of a clean voice recording are sufficient to create a convincing voice clone. Because this technology is relatively new and people naturally associate voices with identities, it exploits an implicit form of authentication. Normally, hearing a familiar voice is enough to identify the speaker. Confidence in that identification is usually very high. However, that confidence can be misplaced.

Consider the potential impact on busy employees who are focused on daily tasks. A phone call arrives from someone who sounds like a direct manager and requests specific information or asks for a particular action.

In such situations, a secondary verification step is rarely considered. The voice sounds authentic. Human instinct signals that the caller is legitimate. As a result, there may be no perceived need to verify the request, return the call, or ask a question that only the real person could answer.
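One way to counter this instinct is to make verification depend on what is being requested rather than on how the caller sounds. The following is a minimal sketch of such a check, not a complete policy. The request categories, the class, and the function names are all hypothetical and would need to be adapted to each organization.

```python
from dataclasses import dataclass

# Hypothetical categories of sensitive requests; adapt to the organization.
SENSITIVE_ACTIONS = {"payment", "credentials", "software_install", "data_export"}


@dataclass(frozen=True)
class CallRequest:
    claimed_identity: str        # who the caller says they are
    action: str                  # what the caller asks for
    inbound: bool                # True if the caller initiated the call
    voice_sounds_familiar: bool  # recorded, but deliberately ignored below


def requires_callback(request: CallRequest) -> bool:
    """Return True when the request must be verified out of band.

    The voice is never treated as proof of identity: a familiar-sounding
    voice does not lower the bar for sensitive requests.
    """
    return request.inbound and request.action in SENSITIVE_ACTIONS


def next_step(request: CallRequest) -> str:
    """Translate the decision into an instruction for the employee."""
    if requires_callback(request):
        return (
            f"Hang up and call back {request.claimed_identity} "
            "on a number from the company directory before acting."
        )
    return "Proceed under normal procedures."


if __name__ == "__main__":
    call = CallRequest("direct manager", "payment", inbound=True,
                       voice_sounds_familiar=True)
    print(next_step(call))
```

The design choice here is that voice_sounds_familiar has no influence on the outcome. The decision rests only on the direction of the call and the sensitivity of the request, both of which an employee can assess without trusting the voice.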

Voice cloning is extremely powerful and has already been used in real incidents. In Italy, for example, scammers used voice cloning to impersonate the Italian Minister of Defence. More than one million dollars was stolen in the process.

Agentic AI Vishing Attacks: A Forecast

Up to this point, the discussion has focused on documented events and existing attack methods. To anticipate how this topic may evolve, it is useful to consider potential future developments.

First, the previously mentioned trends are not mutually exclusive.

Increasing the scale of attacks will not reduce their realism. The number of attacks is likely to grow, and their effectiveness will likely improve as well.

Second, these attacks are expected to follow a trajectory similar to that of phishing.

Early phishing campaigns relied on mass spam and broadly targeted messages. Later developments introduced spear-phishing, which involved personalized emails built using open-source intelligence (OSINT).

The OSINT phase can largely be automated. Data may be collected from contact-enrichment databases, publicly accessible or purchased data leaks, social media activity, and news monitoring. Such information can then be used to coordinate highly targeted attacks against the right person at the right time.

All of this can be performed at scale. Experimental prototypes already exist that automatically identify targets, build relevant scenarios, retrieve contact details, and initiate phone calls.

Agentic AI can orchestrate chains of automated processes. These systems can create a complete attack framework that covers everything from data collection to exploitation.

For example, automated agents could place a call and engage in conversation for approximately fifteen seconds. A voice sample could then be captured and used to clone the individual’s voice. Afterward, additional calls could be made to employees, instructing them to install remote-access software.

Such scenarios are no longer science fiction. Given the pace of technological progress, employee training should begin addressing these threats.

Conclusion

Current AI-driven vishing attacks illustrate both the existing threat and the direction in which it is evolving. Phone calls do not benefit from the same level of security protection as email, and these attacks are harder to defend against with the technical solutions currently available on the market.