TL;DR: Using metadata, it is relatively easy to find out very intimate information about a person. It is therefore wise to assume that any data in a system is covered by the GDPR. (And if you want to see me go deeper into this, join the HoT69 conference on May 26th!)
The GDPR is the new data privacy law that takes effect on 25 May 2018. It’s a neat law that in the best of worlds will help ordinary users regain control over their data.
A central idea of the GDPR is that all Personally Identifiable Information (PII) about a person is owned by that person, and must be protected. But what is PII?
Last year I hosted a panel on GDPR, and at the beginning I asked all of the panelists whether data from an internet-connected power outlet should be regarded as PII. All but one said no. At the end, after 30 minutes of discussion, all agreed that yes, data about electricity consumption can definitely be regarded as Personally Identifiable Information.
Funnily enough, the term doesn’t even appear in the legislation. The GDPR speaks of “personal data” and describes identifiability like this:
To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. (GDPR Recital 26)
OK, quotes in legal language rarely help much. Let’s take an example.
What does my air conditioner know about me?
An indoor climate control system usually needs three sensors in order to function. It needs to know the temperature, the humidity and the CO2 level. Without the first two, the indoor climate may become too cold, too warm, too dry or too humid. The third, CO2, causes headaches and tiredness when the level gets too high.
A human emits heat, humidity and carbon dioxide in different amounts depending on what they are doing, so by measuring these three we can find out:
- How many people (and who) are in the house?
- When do they come home?
- In what room are they?
- When and how long do they shower?
- When, how long and how well do they sleep?
- What’s their level of activity?
Hang on, that last point is interesting. A body that is exercising shows a very characteristic peak in temperature (body heat), humidity (sweat) and CO2 (a waste product of the body burning energy). If you have ever been at a gym at the end of a boxing class you know what I mean.
That means that your air conditioner can reliably determine when you’re using your indoor treadmill.
Your air conditioner can also determine when you’re having sex, how many people are involved and in what room they are doing it.
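To make the idea of a characteristic peak concrete, here is a minimal sketch in Python. It assumes minute-by-minute readings from the three sensors and flags the minutes where temperature, humidity and CO2 are all well above their own average at the same time. The `Reading` type, the `find_activity_peaks` function and the threshold of one standard deviation are made up for illustration; this is not how any particular product works.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Reading:
    minute: int         # minutes since the start of the recording
    temperature: float  # degrees Celsius
    humidity: float     # relative humidity in percent
    co2: float          # parts per million


def z_scores(values):
    """Standardize a series so the three sensors become comparable."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # a flat series has no spread; avoid dividing by zero
    return [(v - mu) / sigma for v in values]


def find_activity_peaks(readings, threshold=1.0):
    """Return the minutes where all three signals are elevated at once.

    The threshold (in standard deviations) is an arbitrary, illustrative choice.
    """
    temp = z_scores([r.temperature for r in readings])
    hum = z_scores([r.humidity for r in readings])
    co2 = z_scores([r.co2 for r in readings])
    return [
        r.minute
        for r, t, h, c in zip(readings, temp, hum, co2)
        if t > threshold and h > threshold and c > threshold
    ]


# Eight quiet minutes followed by two minutes on the treadmill (invented numbers).
quiet = [Reading(m, 21.0, 40.0, 500.0) for m in range(8)]
workout = [Reading(8, 23.0, 55.0, 900.0), Reading(9, 23.0, 55.0, 900.0)]
print(find_activity_peaks(quiet + workout))  # prints [8, 9]
```

Requiring all three signals to rise together is what makes the pattern distinctive in this sketch: a warm oven alone raises the temperature, but not the humidity and CO2 along with it.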
For more information on this kind of research, see Johan Broddfelt’s excellent talk.
In many cases this information is collected and sent to a central server for further processing by the company. Sometimes this information is handed to another company.
Often the connection does not follow good security design principles and is thus trivial to hack. A quick search on Shodan, the search engine for the Internet of Things, reveals 301 internet-facing HVAC systems (Heating, Ventilation and Air Conditioning). For a malicious hacker, or a penetration tester with permission, these are desirable targets.
Collecting lots and lots of data raises a number of ethical concerns.
- A company will know very intimate details about your lifestyle and daily patterns.
- The company will be obliged to protect this data thoroughly.
- Information that is collected and stored can be subject to subpoenas from the state.
- The air conditioner will collect data on people who are under the age of 18.
- Presenting the data to the end user may enable distrust and domestic abuse.
As a product owner, it might be a good idea to weigh the risks and rewards of collecting and retaining this data.
Summing up, what can we deduce from this?
With just a few parameters at our disposal (address or billing data, CO2, temperature and humidity) it is possible to plot a person’s lifestyle. Processing of this data can be automated.
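As a rough illustration of how little code such automation takes, the sketch below turns a day of hourly CO2 readings into the periods when somebody was probably at home. The function name `occupied_intervals`, the sample values and the 600 ppm threshold are all assumptions chosen for the example; a real analysis would need calibration for the room and its ventilation.

```python
def occupied_intervals(samples, threshold_ppm=600.0):
    """Group consecutive readings at or above the threshold into intervals.

    samples: list of (hour, co2_ppm) tuples, sorted by hour.
    Returns a list of (first_hour, last_hour) tuples.
    The threshold is an illustrative guess, not a calibrated value.
    """
    intervals = []
    start = None
    last_hour = None
    for hour, co2 in samples:
        if co2 >= threshold_ppm:
            if start is None:
                start = hour  # CO2 just rose: someone probably arrived
            last_hour = hour
        elif start is not None:
            intervals.append((start, last_hour))  # CO2 dropped: the room emptied
            start = None
    if start is not None:
        intervals.append((start, last_hour))  # still occupied at the end of the data
    return intervals


# One invented day: asleep at night, out during office hours, home in the evening.
day = (
    [(h, 750.0) for h in range(0, 8)]
    + [(h, 450.0) for h in range(8, 17)]
    + [(h, 700.0) for h in range(17, 24)]
)
print(occupied_intervals(day))  # prints [(0, 7), (17, 23)]
```

Joined with the address or billing data, the start of the evening interval is already an answer to "when do they come home?" for a named household.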
The analysis is within reach of an interested hobbyist, a malicious hacker or an employed developer. Therefore, data from an air conditioner, as well as almost all other data trails we leave behind, should be regarded as Personally Identifiable Information.