Deepfake phishing - identify dangers and take countermeasures
Social engineering is as old as mankind and still works remarkably well. This type of attack relies on gaining the victim’s trust and getting them to do things they should not do, such as revealing passwords or other sensitive information. Attackers are increasingly turning to so-called deepfakes: manipulated videos, images or audio recordings in which artificial intelligence and machine learning are used to insert a person’s face or voice into another scene. The technology can produce convincing fakes that are almost indistinguishable from real content. Attackers are therefore no longer limited to manipulative emails. Using AI tools that are freely available on the internet, they can, for example, place a phone call in which the caller sounds like the CEO of the targeted company.
Enhanced deepfakes - increased risk of manipulation
The underlying technology, which is based on complex machine learning algorithms, has developed rapidly in recent years. As a result, more and more sophisticated deepfakes are appearing. Some are used for online fraud, others to spread fake news. Politicians in particular are among the victims, such as João Doria, the governor of the Brazilian state of São Paulo. In 2018, a video purporting to show him at a sex orgy was published; it later turned out to be a fake. Defamation is one danger, targeted disinformation is another. Fake news is already commonplace before elections, and fake news videos are appearing more and more frequently and spreading quickly online. It is already possible to put words into politicians’ mouths that they never said. In the past, manipulated clips were easy to spot on close inspection. Now, even experts have to look at least twice.
Attack rate on companies on the rise
Deepfake technology poses a number of IT risks for companies. In deepfake phishing, for example, cyber criminals use deepfake content to trick victims into making unauthorized payments or disclosing sensitive information. A recent case from Hong Kong shows how this works: according to police reports, a finance employee of a multinational company transferred around 25 million US dollars to fraudsters who successfully impersonated the company’s CFO in a video conference. “In the video conference with several people, it turned out that all the people present were fake,” said Chief Superintendent Baron Chan Shun-ching. The employee initially suspected that the invitation to the video conference was a phishing email. During the video call, however, he put aside his doubts because the other participants looked and sounded exactly like his colleagues. Believing that everyone else on the call was genuine, he agreed to transfer a total of 200 million Hong Kong dollars, the equivalent of around 25.6 million US dollars. This is just one of many cases in which fraudsters have used deepfake technology to manipulate publicly available video and other footage in order to defraud companies.
This example shows just how convincing deepfake videos have become: https://www.youtube.com/watch?v=WFc6t-c892A
Two types of deepfake phishing attacks
Deepfake phishing campaigns like this are becoming more effective and more frequent as AI technology advances. CISOs are well advised to prepare their employees to defend against such attacks. One way to do this is to explain to them what deepfake phishing is and how it works. Essentially, there are two types of deepfake phishing attacks:
- Real-time attacks: In a successful real-time attack, the spoofed audio or video is so sophisticated that the victim believes the person on the phone or in a video conference is who they say they are, such as a colleague or customer. In these interactions, the attackers often create a strong sense of urgency by inventing deadlines, penalties and other consequences of delay in order to put victims under pressure and push them into reacting rashly.
- Non-real-time attacks: In non-real-time attacks, cyber criminals impersonate another person through fake audio or video messages and distribute fake instructions in that person’s name via asynchronous communication channels such as chat, email, voicemail or social media. This type of communication relieves the criminals of the pressure to respond credibly in real time. It also allows them to perfect deepfake clips before distributing them. An attack that does not take place in real time can therefore be very sophisticated and arouse less suspicion among victims.
Compared to text-based phishing campaigns, deepfake video or audio clips sent by email also have a higher chance of passing through security filters. Attacks that do not take place in real time also allow attackers to increase their reach. For example, an attacker can pretend to be the CFO and send the same audio or video message to all employees in the finance department. This increases the likelihood that someone will fall for it and disclose confidential information. In both types of attack, social media activity usually gives attackers enough information to strike strategically when targets are most likely to be distracted or most receptive.
Identify deepfake phishing
The identification of deepfake phishing attacks is based on four pillars:
- Phishing in general: Every manager and every employee must internalize this principle: phishing works by pushing victims into rash decisions. A sense of urgency should therefore immediately raise an alarm in every interaction. For example, if a person, be it the CEO or an important customer, asks for an immediate bank transfer or product delivery, everyone should pause and check whether the request is legitimate.
- Deepfake characteristics in videos: The company’s security officers should raise employees’ awareness of known and new attack methods through continuous training. One advantage is that most people find deepfake phishing training particularly interesting, engaging and educational. After all, it is almost entertaining to watch deepfake videos and identify suspicious visual clues. Typical signs of a deepfake video include unrealistic blinking, inconsistent lighting and unnatural facial movements. Other indications of a fake are flickering around the edges of the manipulated face, hair, eyes and other facial features.
- Deepfake characteristics in audio files: Pronunciation errors often occur in text-to-speech (TTS) systems, especially when the spoken words are not in the language the system was trained on. Monotone speech output is caused by insufficient training data, and forgery methods currently still have difficulty imitating certain features, such as accents, correctly. Differing input data can lead to unnatural sounds, and the need to capture the semantic content before synthesis can greatly delay the generation of high-quality fakes. Tip: For training on how to identify manipulated audio data, the Fraunhofer AISEC website is a good place to start.
- Confirmation of identity: In the case of urgent requests, employees should politely point out that, because of the increased number of phishing attacks, the person must confirm their identity using two-factor authentication via a separate channel. Alternatively, in suspicious interactions by phone or email, the other person should be asked for information that only the two parties can know. A typical example would be a question about how long the person has been with the company. Close coworkers may even ask more personal questions, such as how many children the other person has or when they last ate together. This may be uncomfortable, but it is an effective and efficient way to expose fraudsters.
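The first and fourth pillars can be turned into a simple, repeatable routine. The following Python sketch is only an illustration and not a vetted security control: the cue list, the set of sensitive actions and all names (`Request`, `urgency_cues`, `needs_out_of_band_check`, `issue_challenge`, `confirm_identity`) are assumptions made for this example. It flags requests that are sensitive or that apply time pressure, and it models confirmation via a one-time code that is sent over a separate, already trusted channel and read back by the requester.

```python
from __future__ import annotations

import hmac
import re
import secrets
from dataclasses import dataclass

# Hypothetical urgency cues; a real list would be tuned to the organisation.
URGENCY_CUES = ("immediately", "urgent", "right now", "today",
                "deadline", "penalty", "asap")

# Hypothetical request types that always require a second channel.
SENSITIVE_ACTIONS = {"bank_transfer", "product_delivery", "credential_disclosure"}


@dataclass
class Request:
    claimed_sender: str  # who the caller or message claims to be
    action: str          # e.g. "bank_transfer"
    message: str         # transcript or text of the request


def urgency_cues(message: str) -> list[str]:
    """Return the urgency cues found in the message (case-insensitive, whole words)."""
    text = message.lower()
    return [cue for cue in URGENCY_CUES
            if re.search(rf"\b{re.escape(cue)}\b", text)]


def needs_out_of_band_check(request: Request) -> bool:
    """Pause and verify if the action is sensitive or the message applies pressure."""
    return request.action in SENSITIVE_ACTIONS or bool(urgency_cues(request.message))


def issue_challenge() -> str:
    """Create a six-digit one-time code to send over a separate, trusted channel."""
    return f"{secrets.randbelow(10**6):06d}"


def confirm_identity(expected_code: str, code_read_back: str) -> bool:
    """Compare the code the requester reads back with the one that was sent."""
    return hmac.compare_digest(expected_code.encode(),
                               code_read_back.strip().encode())


if __name__ == "__main__":
    request = Request("CFO", "bank_transfer",
                      "Transfer the funds immediately or we face a penalty.")
    print(urgency_cues(request.message))     # ['immediately', 'penalty']
    print(needs_out_of_band_check(request))  # True
```

A keyword check like this will miss carefully worded requests, which is why the sketch also treats every sensitive action as requiring confirmation regardless of tone. The code from `issue_challenge` only helps if it is delivered through a channel the attacker does not control, for example a phone number taken from the company directory rather than from the suspicious message.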