Based in Silicon Valley, California, Security Brief is a blog by Chris Louie. His posts discuss current information security affairs.

Next Level Phishing - Attacking Digital Assistants Amazon's Alexa and Google Home


It was only a matter of time before attackers and security researchers found a way to compromise voice assistant devices such as the Amazon Echo and the Google Home line of products. On the heels of news reports that a Portland, OR couple's Amazon Echo sent a recording of a private conversation to a random acquaintance, security researchers have released a paper describing a clever method of attacking these devices.

In the case of the Portland, OR couple, their conversation included a word that sounded like “Alexa”, the Amazon Echo’s default activation phrase. The Echo began recording, as it is supposed to once it hears the activation phrase. The conversation then contained a phrase that sounded like “send this conversation to Al”, and the Echo did what it believed it had been instructed to do.


This incident highlights an inherent flaw in voice-assisted devices: they necessarily have to use fuzzy matching in order to work properly. If I want my Echo to perform some action, it is almost impossible to make my volume, tone, inflection, and accent exactly the same every time. Further complicating things, my Echo devices have to work for anyone who lives in or visits my household, and in environments where there may be excessive background noise or overlapping conversations. To prevent false negatives, voice-activated devices have to allow a tolerance threshold, and they compromise on matching accuracy so that they work as intended. I have personally seen my Amazon Echo devices activate on words such as “Lexus” or “Alex”. This tolerance threshold was beautifully illustrated when South Park ran an episode where one of the characters added items to a shopping list and set an alarm, and viewers’ Echo devices within microphone range of the TV performed the same actions. A local news anchor said the words “Alexa, order me a dollhouse” on TV, and hundreds of viewers scrambled to their computers to cancel orders placed by the news anchor.
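To make the threshold trade-off concrete, here is a toy sketch. It compares spellings with Python's difflib rather than audio, so it is only an analogy: real wake-word detection runs acoustic models, and the threshold values 0.85 and 0.55 are numbers I picked for illustration, not anything Amazon uses.

```python
from difflib import SequenceMatcher

WAKE_WORD = "alexa"


def similarity(heard: str) -> float:
    """Spelling similarity in [0, 1] between the wake word and what was heard."""
    return SequenceMatcher(None, WAKE_WORD, heard.lower()).ratio()


def would_wake(heard: str, threshold: float) -> bool:
    """Activate when the similarity meets or exceeds the chosen threshold."""
    return similarity(heard) >= threshold


if __name__ == "__main__":
    # Illustrative thresholds only: a strict one and a lenient one.
    for word in ("Alexa", "Alex", "Lexus", "banana"):
        print(
            f"{word:8} similarity={similarity(word):.2f} "
            f"strict={would_wake(word, 0.85)} lenient={would_wake(word, 0.55)}"
        )
```

Lowering the threshold cuts down on missed activations, but in this toy model it also lets “Lexus” through, which is the same trade-off the device makers face with real audio.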

Chinese and US-based researchers published a paper describing how an attacker could register an Amazon Alexa skill whose name sounds like that of a legitimate skill, a technique called “voice squatting”. Alexa skills are similar to third-party applications that can be accessed from the digital assistant. For example, Capital One is a well-established bank and credit card issuer in the United States and offers an Alexa skill to check your bank balance or make money transfers. An attacker could register an Alexa skill called Capital Won, which is spelled differently from Capital One but is phonetically identical. This is the voice equivalent of domain squatting with a lookalike URL, such as one built with Punycode, in order to trick users into thinking they are at the website they intended to visit.
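A minimal sketch of why a spelling check misses this: the two names differ as strings but collapse to the same sequence of sounds. The pronunciation table below is hand-written by me in an ARPAbet-like notation purely for illustration; it is not taken from the paper or from any real lexicon, and a real check would need a full pronunciation dictionary.

```python
# Hypothetical, hand-written pronunciations (ARPAbet-style), illustration only.
PRONUNCIATIONS = {
    "capital": ("K", "AE", "P", "AH", "T", "AH", "L"),
    "one": ("W", "AH", "N"),
    "won": ("W", "AH", "N"),
    "bank": ("B", "AE", "NG", "K"),
}


def phoneme_key(name: str) -> tuple:
    """Concatenate the phonemes of every word in a skill name."""
    phonemes = []
    for word in name.lower().split():
        # Raises KeyError for words missing from the toy table above.
        phonemes.extend(PRONUNCIATIONS[word])
    return tuple(phonemes)


def sounds_identical(a: str, b: str) -> bool:
    """True when two names map to the same phoneme sequence."""
    return phoneme_key(a) == phoneme_key(b)


if __name__ == "__main__":
    legit, squatter = "Capital One", "Capital Won"
    print("same spelling:", legit.lower() == squatter.lower())
    print("same sound:   ", sounds_identical(legit, squatter))
```

A registry that only rejects duplicate spellings would accept “Capital Won”, while the phoneme comparison flags it.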


Similar tests were run against the Google Home digital assistant. Researchers were able to register applications with names that sound similar to legitimate applications and trick users into launching their application instead of the legitimate version.



Exploiting people’s propensity to be polite, researchers also registered applications with the same names as well-known applications, but with the word “please” added to the end. An unsuspecting user who says “Alexa, open Trivial Pursuit please” would get a compromised version of the Trivial Pursuit game instead of the legitimate version.
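One way a skill registry could guard against the “please” trick is to strip polite filler words before comparing names. This is a sketch under my own assumptions: the filler list and the function names are hypothetical, and nothing here reflects how Amazon or Google actually vet names.

```python
# Hypothetical list of polite filler words a registry might ignore.
FILLER_WORDS = {"please", "thanks"}


def normalize_invocation(name: str) -> str:
    """Lowercase the name and drop filler words before comparison."""
    words = [w for w in name.lower().split() if w not in FILLER_WORDS]
    return " ".join(words)


def conflicts(candidate: str, registered_names: list) -> bool:
    """True when the candidate collides with a registered name after normalization."""
    registered = {normalize_invocation(n) for n in registered_names}
    return normalize_invocation(candidate) in registered


if __name__ == "__main__":
    existing = ["Trivial Pursuit"]
    print(conflicts("Trivial Pursuit please", existing))  # collides once "please" is dropped
    print(conflicts("Word Quiz", existing))               # unrelated name, no collision
```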

While compromising someone’s Amazon Echo or Google Home device may have limited impact on an enterprise, it is never good practice to have compromised devices anywhere on the network. These malicious skills could be used to phish credentials (password reuse attack) or to social engineer users into visiting websites that could compromise their systems as the first stage of a multi-stage attack. Some Alexa skills instruct users to visit a website to log in and complete the setup. Compromising a user’s digital assistant could also establish a beachhead in the user’s network, so when the user connects their work computer to their home network, the compromised digital assistant could attack the corporate laptop.



A strong mitigation for IoT-based attacks is to just not use IoT devices (IDIoT: I don’t IoT). When this is not an option, network segmentation is important so that IoT devices do not have access to the rest of the network and are only allowed to reach the authorized sites they need to perform their function and nothing more. The use of a cloud firewall can simplify this process. Sending IoT traffic through a security service for inspection will also prevent devices from connecting to known or probable malicious websites and command and control servers. Lastly, it is up to the device manufacturers to add security measures to their devices and their developer ecosystems to prevent abuse.
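The segmentation rule described above boils down to a default-deny egress policy for the IoT segment. Here is a rough sketch of that decision logic. The subnet and the vendor domain are placeholders I made up, and in practice this would be enforced in a firewall or cloud security service rather than in a Python script.

```python
import ipaddress

# Hypothetical values: an IoT VLAN and the only domain the assistant needs.
IOT_SUBNET = ipaddress.ip_network("192.168.50.0/24")
ALLOWED_SUFFIXES = ("assistant-vendor.example",)


def host_allowed(hostname: str) -> bool:
    """True when the host is an allowed domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)


def permit(src_ip: str, dst_host: str) -> bool:
    """Default-deny for the IoT segment; other segments are out of scope here."""
    if ipaddress.ip_address(src_ip) not in IOT_SUBNET:
        return True  # this sketch only polices traffic from IoT devices
    return host_allowed(dst_host)


if __name__ == "__main__":
    print(permit("192.168.50.10", "api.assistant-vendor.example"))  # allowed vendor host
    print(permit("192.168.50.10", "evil.example"))                  # not on the allowlist
```

Matching on a leading dot plus the suffix keeps a lookalike such as assistant-vendor.example.evil.test from slipping through.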

At the time of publication, both Amazon and Google have acknowledged the attack and are working on mitigations to prevent abuse. One proposed method is to disallow applications whose names are a phoneme match for an existing application, not just a spelling match.

Backswap: the Next Generation of Banking Trojans


WannaCry's Long Tail: Setting the Stage for the Next Major Wormable Attack
