Snooping by Design
Virtual assistants such as Amazon Alexa are a hot technology trend, yet using them carries privacy risks. It is time for data protection to become a priority.
It's been a difficult summer for virtual assistants and the companies that develop them. At the beginning of August, it was revealed that there was more than just an AI listening to conversations between Amazon customers and the ever-popular Amazon Echo. According to press reports, Amazon sends a small number of Alexa recordings to human employees for verification and quality assurance. This was of particular interest to newspapers and privacy regulators because nowhere in Amazon's privacy policy was this practice actually disclosed.
The Options Option
To be fair to Amazon, they're not the only ones caught doing this. Google, Apple, Microsoft and Facebook have all been caught doing the same thing, with plenty of negative coverage to match. Alexa has received the biggest share of that coverage, though, because she is the most popular and most widely used virtual assistant. That's a little unfair on Amazon: they were simply following standard industry practice, and weren't doing anything more nefarious than their competitors.
Naturally, the tech giants have reacted to the negative coverage by reducing the number of virtual assistant recordings they send to human engineers for transcription, and by adding an opt-out to the affected apps and services. Only Microsoft have gone as far as defending the practice and updating their privacy policy to explicitly permit it. They are also the only ones who already offered an opt-out: both Cortana and Skype Translator have an option in their privacy settings to help improve voice recognition, and unselecting it prevents recordings from being shared with Microsoft engineers.
Clarity and Confusion
The problem with adding checkboxes to privacy policies is that people often don't understand the consequences of what they're signing up to. It's well known that vanishingly few people read the terms and conditions for anything they register for, primarily because those T&Cs are a long list of legal jargon they don't understand. However, even plain-English summaries only help so much. Consumers will read the summary, but it rarely explains what they're agreeing to in practice.
Marketing opt-ins are one of the few exceptions to this rule, primarily because people already associate marketing with spam. That is a fair assessment when you consider that recipients aren't interested in the legality of the unsolicited emails they receive, only in whether they agreed to receive them in the first place. Opt-in laws were introduced for precisely this reason.
Other privacy opt-ins, such as software improvement programs, don't have the same clarity of purpose, because consumers don't directly see the benefits of their consent. It is rarely clear what users are required to offer up in exchange for checking the relevant box, something Microsoft have got into trouble for in recent years. Nor is it clear how that data is being used, which is the source of the controversy on this occasion.
Dividing Lines
In general, consumers draw a line between usage data and actual data when deciding whether a particular company has gone too far in its data collection. Some people object to both, but many more will allow the former to be shared given sufficient justification. This isn't a blank cheque, though. Sending actual document content, or even document metadata, as part of product diagnostics and software improvement programs is a surefire shortcut to unwanted media and regulatory attention. That is what happened here, and it has also dogged Windows 10 since its release four years ago.
People are more accepting of feature usage stats and click-stream tracking being analysed for product improvement, but even then there is a widespread lack of understanding about what such data actually includes and how it is used. Transparency is key here. Microsoft would have avoided a lot of bad press over Windows 10 if they'd been open and upfront before release about what its product telemetry included and what data sharing was actually required to use the various cloud features embedded in the OS.
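To make the distinction between usage data and actual data concrete, consider the following sketch in Python. The names here (TelemetryCollector, record_feature_use and so on) are invented for illustration and don't correspond to any vendor's real API; the point is that the event structure has no field for document content or metadata, and nothing is recorded or uploaded without consent:

    import json
    import time
    from dataclasses import dataclass, field, asdict

    @dataclass
    class TelemetryEvent:
        """A usage-data event: which feature ran and whether it succeeded.
        Deliberately has no field for document content, file names, or metadata."""
        feature: str       # e.g. "voice_transcription", never the transcribed text
        succeeded: bool
        timestamp: float = field(default_factory=time.time)

    class TelemetryCollector:
        """Collects feature-usage events; records and uploads only with consent."""

        def __init__(self, opted_in: bool):
            self.opted_in = opted_in
            self._events: list[TelemetryEvent] = []

        def record_feature_use(self, feature: str, succeeded: bool) -> None:
            # Consent is checked at record time, so opting out mid-session
            # stops collection immediately.
            if self.opted_in:
                self._events.append(TelemetryEvent(feature, succeeded))

        def flush(self):
            """Serialise pending events for upload; returns None without consent."""
            if not self.opted_in or not self._events:
                return None
            payload = json.dumps([asdict(e) for e in self._events])
            self._events.clear()
            return payload

    # Only aggregate feature usage ever leaves the machine.
    collector = TelemetryCollector(opted_in=True)
    collector.record_feature_use("voice_transcription", succeeded=True)
    print(collector.flush())

A useful side effect of this design is that the payload can be shown to the user verbatim before upload, which is precisely the kind of transparency argued for above.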
Privacy by Design
Now the same issue has reared its head again. Developers and product engineers need to remember that the data flowing through their applications is not their own; it belongs to their customers. All too often that gets lost amid the desire to make a better product. Legal compliance and data protection are never going to be the top priorities for any development team, nor should they be. However, the principles of privacy by design are important ones to consider at every stage of the development cycle.
In the GDPR world, you're only allowed to use data for the express purpose it was collected for and nothing else. From next year, that principle will become part of Californian law through the CCPA. That requires a change of approach akin to OWASP and the secure software movement of the early 2000s. Not everyone has woken up to that fact yet, partly because data protection is often seen as someone else's problem. Ignore it, though, and it will come back to bite you in the form of media coverage and regulatory fines. The technology giants have just experienced this, and not for the first or last time either.
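As an illustration of what purpose limitation might look like inside an application, here is a minimal sketch (invented names throughout; one possible design, not a GDPR compliance recipe) in which every stored value is tagged with the single purpose the user consented to, and any attempt to read it for a different purpose fails loudly:

    from enum import Enum, auto

    class Purpose(Enum):
        VOICE_RECOGNITION_IMPROVEMENT = auto()
        ORDER_FULFILMENT = auto()
        MARKETING = auto()

    class PurposeViolation(Exception):
        """Raised when data is accessed for a purpose it wasn't collected for."""

    class ConsentedRecord:
        """Wraps a value with the single purpose the user consented to.
        Callers must declare their purpose to read it; anything else raises."""

        def __init__(self, value: str, purpose: Purpose):
            self._value = value
            self._purpose = purpose

        def read(self, purpose: Purpose) -> str:
            if purpose is not self._purpose:
                raise PurposeViolation(
                    f"collected for {self._purpose.name}, requested for {purpose.name}"
                )
            return self._value

    # A recording consented to for improving voice recognition...
    recording = ConsentedRecord("utterance.wav", Purpose.VOICE_RECOGNITION_IMPROVEMENT)
    print(recording.read(Purpose.VOICE_RECOGNITION_IMPROVEMENT))  # permitted

    # ...cannot be repurposed for marketing.
    try:
        recording.read(Purpose.MARKETING)
    except PurposeViolation as err:
        print(f"Blocked: {err}")

Making the violation an exception rather than a log entry fits the secure-software analogy: misuse of customer data becomes a bug that fails in testing, not a clause in a policy document that nobody reads.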