Covert data-scraping on watch as EU DPA lays down “radical” GDPR red-line


An interesting decision came out of Poland’s data protection agency this week after the watchdog issued its first fine under Europe’s General Data Protection Regulation (GDPR).

On the surface the enforcement doesn’t look so remarkable: A ‘small’ ~€220K fine was handed to a Sweden-headquartered European digital marketing company, Bisnode, which has an office in Poland, after the national Personal Data Protection Office (UODO) decided the company had failed to comply with data subject rights obligations set out in Article 14 of the GDPR.

But the decision also requires the company to contact the close to six million people it has not yet reached, in order to fulfil its Article 14 information notification obligation, with the DPA giving it three months to comply.

Bisnode previously estimated it would cost around €8M (~$9M) in registered postal costs to send so many letters, never mind the burden of handling any related admin.

So, as ever, the strength of data protection enforcement under GDPR is a lot more than the deterrent of top-line fines. It’s accompanying orders that can really rearrange business practices.

Local press reports that Bisnode has said it will delete the sanctioned records, presumably rather than shell out to send millions of letters. It also intends to challenge the UODO’s decision, initially in Polish courts — relying on caveats contained in Article 14 which relate to how much effort a data controller has to expend to contact people to tell them it’s processing their data.

It’s reportedly willing to fight all the way up to Europe’s top court, if necessary. (We’ve reached out to Bisnode for confirmation of its next steps.)

Any legal challenge to the UODO’s enforcement decision could therefore end up clarifying (and/or setting) some harder limits around covert scraping of personal data, if it reaches the CJEU, potentially affecting operators in multiple industries and sectors such as business intelligence, advertising and even cyber threat intelligence. So privacy watchers have pricked up their ears.

“The decision is seen as radical, as it interprets Article 14 literally,” Dr Lukasz Olejnik, independent cybersecurity and privacy advisor, and research associate at the Center for Technology and Global Affairs at Oxford University, tells TechCrunch.

“UODO has taken a very principled position, arguing that the company business model is fully based on processing scraped data, and that the company has taken a decision willingly. UODO also argues that the company was aware of the obligation, as it did contact part of the people via email.”

While there are big and potentially costly implications for data-scrapers across various industries down the legal line, depending on how Bisnode’s appeal/s pan out, Olejnik adds a judicious caveat — noting that “each case might be different and have its specifics”.

There’s certainly no guarantee that the DPA’s decision will lead to a de facto ban on covert commercial data-scraping.

But there is fresh legal uncertainty for those quietly helping themselves to public databases of Europeans’ personal data. While repurposing such stuff for a commercial use may also be far more expensive than you think.

Right to be informed

Article 14 of the GDPR creates an obligation on data controllers to inform people whose personal data they intend to process when the information in question has not been directly obtained from them. So, for instance, when personal data has been scraped off the public Internet.

The relevant chunk of the regulation is pretty long, but key points include that the person whose data has been scraped must be informed who has their data (which includes anyone the data has been shared with, and any proposed international transfers); the types of data obtained; what is going to be done with it; and the legal basis for the processing.

Data subjects must also be informed of their right to complain so they can object if they don’t like what you want to do with their data.

The information obligation is also purpose specific; so if the data controller later wants to do something else with the scraped data there’s an obligation to send a new Article 14 notice.

Data subjects must be informed, at the latest, within a month of obtaining their information (as well as per intended purpose). While if the data is to be used for direct marketing the subject must be informed the first time they get sent a communication, if not sooner.

In the case of Bisnode it obtained a variety of personal data from public registers and other public databases pertaining to millions of entrepreneurs and business owners — including their names, national ID numbers and any legal events related to their business activity.

Registered addresses and/or company addresses appear to have been standard in the public data it scraped but other contact data was not, and Bisnode only obtained email addresses for a small sub-set of the individuals. It subsequently sent emails to those people — fulfilling its Article 14 information obligation in their case.

But at issue is that instead of sending text messages or snail mail notifications to all the other people whose email addresses it did not have (aka the vast majority; some 5.7M people), Bisnode made a conscious decision not to reach out to them directly. Instead it posted a notice on its website in the stated belief that this fulfilled its Article 14 obligations.

“We recognise the right for sole proprietors to be informed of the fact that their data is processed by us. In this case, Bisnode has complied to the General Data Protection Regulation Art. 14 by posting the information on our website,” it wrote in an initial statement following the UODO’s decision, also posted on its website.

“We question the DPA’s interpretation of what is considered a proportionate effort. In the instances we have had email addresses (679,000 addresses), there we have sent out Art. 14 information via email, but to demand in addition that 5.7 million records of sole proprietors and members of corporate bodies of companies et al, be informed via postal mail or telephone cannot be considered a proportionate effort,” it added.

“In our view, information via email, other digital channels or via advertisements in national daily newspapers is preferable for recipients as well as senders.”

The DPA drastically disagrees — hence the penalty and other enforcement action.

Explaining its decision the watchdog says Bisnode clearly knew about its obligations under Article 14 and thereby made a conscious decision not to directly inform the majority of people whose personal data it had obtained for business purposes on cost grounds alone — when it should rather have accounted for its legal obligations related to data acquisition as a core component of business costs.

“The President of UODO states that the mere inclusion of information required in art. 14 par. 1 and par. 2 of the Regulation 2016/679, on the Company’s website, in the situation where the Company has the address data (and sometimes also phone numbers) of natural persons running a sole proprietorship (currently or in the past), enabling traditional mailing of correspondence containing information required by this provision (or transferring them by telephone), cannot be considered as sufficient fulfilment by the Company of the obligation referred to in art. 14 par. 1-3 of Regulation 2016/679,” runs the relevant chunk of legalese in the UODO decision [translated from Polish via Google Translate].

“The Company, as a professional in this type of activity, should be required to shape the business side of its business, which would take into account all the costs necessary to ensure its compliance with legal provisions (in this case, the provisions on the protection of personal data),” it adds, going on to further press its view that Bisnode’s decision not to reach out to inform the vast majority of individuals because it decided it was too expensive is exactly the problem, especially as its core business relies on processing people’s data.

The DPA’s decision also notes that Bisnode decided against sending SMS messages to another sub-set of people whose telephone numbers it did hold — again claiming as an excuse “the high costs of such an action”.

On the €8M figure which the company estimated would be the cost of posting Article 14 notifications to the 5.7M, the watchdog says there was in fact no obligation to send registered letters specifically (which is how Bisnode seems to have arrived at that estimate); or indeed to use any specific communication medium.
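As a rough sanity check on that estimate, the per-letter cost implied by Bisnode’s own figures can be worked out in a few lines of Python. This is only a back-of-envelope sketch: the €8M and 5.7M inputs are the rounded numbers quoted above, and the variable names are ours rather than anything taken from the decision.

```python
# Rounded figures quoted in the article (approximations, not exact values).
estimated_postage_eur = 8_000_000  # Bisnode's estimate for registered post
people_to_notify = 5_700_000       # individuals it had no email address for

# Implied average cost of sending one registered letter.
cost_per_letter_eur = estimated_postage_eur / people_to_notify
print(f"Implied cost per letter: ~€{cost_per_letter_eur:.2f}")
```

On these rounded inputs the implied cost comes to roughly €1.40 per letter, so the total is highly sensitive to which class of post is assumed.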

So it could presumably have sent (cheaper) standard mail, or even used its own staff (or hired temps) to spend a couple of days manually posting notifications to the individuals concerned. (Sidenote: Maybe there’s a new type of data notification compliance-tech robot/drone delivery startup to be created here… Knock-knock! Article14 delivery bot at the door to read you your rights…)

The UODO points out that GDPR’s Article 14 provision does not specify any particular means of fulfilling the obligation to inform. It just requires the data controller actually reach out.

An active manner vs disproportionate effort

The “essence of fulfilling the obligation” is to act in “an active manner”, it writes — so that means providing information to a data subject without them having to participate in enabling their own notification.

So just posting a passive notification under a tab on a website, as Bisnode did, would seem to go against that essence, as it clearly requires the people whose data is involved to expend effort to find out.

And if they don’t even know their data was scraped in the first place how would they know where, or even whether, to go looking? It’s very unlikely they’d just stumble upon the notification by chance on Bisnode’s website and join the dots. Not without some kind of wider broadcast announcing its presence.

“The need for active notification is emphasized by the Article 29 Working Party, in the Transparency Guidelines under Regulation 2016/679 adopted on 29 November 2017 (most recently amended and adopted on 11 April 2018),” the UODO’s decision further notes, citing guidance from an influential pan-EU data protection oversight body that’s now known as the European Data Protection Board and responsible for helping ensure consistency of application of GDPR across the bloc.

In a press release accompanying its decision, the UODO also makes a point of specifying the number and proportion of people who objected to Bisnode using their data after it did contact them directly (i.e. by email) — writing: “Out of about 90,000 people who were informed about the processing by the company, more than 12,000 objected to the processing of their data.”

Which highlights the fact that informing people about commercial and marketing-related uses of their data can, and usually does, result in a bunch of them saying ‘no don’t do that’ — an outcome that’s not exactly aligned with the interests of a marketing company like Bisnode which obviously wants to maximize the reach of its database.

But a shrinking marketing database may well be the price of respecting people’s privacy rights and doing business legally in Europe. And Bisnode’s interpretation of what is and isn’t “proportionate”, vis-a-vis Article 14, does look self-servingly aligned with its own business interests rather than with the rights of EU citizens.

If the legal rights of EU people to know what’s being done with their personal data can just be sidestepped by a data controller holding only selective types of contact data (for instance), that risks putting a pretty big loophole in the data protection framework. (Although in a similar case from a few years ago the UODO reached a different decision regarding another company that did not have addresses at its disposal.)

There are some caveats included in Article 14 — allowing for a data controller to dispense with the requirement to inform data subjects if doing so “proves impossible or would involve a disproportionate effort” — but they are conspicuously linked in the text of GDPR to non-commercial examples: “[I]n particular for processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”.

Safe to say, a b2b marketing business doesn’t fit the bill there.

A further caveat — which removes the obligation to inform the data subject if it is “likely to render impossible or seriously impair the achievement of the objectives of that processing” — would also seem a tough one to argue for a marketing purpose such as Bisnode’s.

It’s true that, as the complaints following its emailed Article 14 notifications indicate, there will very likely be a proportion of objections from those informed about a marketing purpose for their data. But the complaint stats cited by the UODO reveal that only a minority (~13%) of those emailed actively objected to Bisnode’s use of their data — a figure that does not seem so catastrophically large as to “seriously impair” the company’s overall business objective.
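The ~13% figure follows directly from the two numbers in the UODO’s press release. A minimal sketch of the arithmetic, assuming the rounded counts quoted above (“about 90,000” informed, “more than 12,000” objecting) and using variable names of our own choosing:

```python
# Rounded counts from the UODO press release (approximate by its own wording).
people_informed = 90_000   # "about 90,000" people notified by the company
people_objecting = 12_000  # "more than 12,000" who objected

# Share of informed people who objected to the processing.
objection_rate = people_objecting / people_informed
print(f"Objection rate: {objection_rate:.1%}")
```

Because the press release gives “more than 12,000” out of “about 90,000”, the result is best read as an approximation of a little over 13%.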

Of course it will be for judges to decide on all these details. But the looming legal fight will be around what constitutes “proportionate effort”, and in which circumstances those Article 14 caveats are allowed to apply.

“The ‘disproportionate effort’ in Article 14(5) is the core issue,” agrees Olejnik. “While including information solely on a website might be sufficient in some cases, but it is not clear if this applies in this case in particular. It is rather clear that the majority of people affected have no idea that their data are processed.”

“What the courts decide is anyone’s guess. It will be a truly interesting case to observe,” he adds.

In terms of immediate practical implications flowing from the UODO’s decision Olejnik says those are also unclear for now — not least because of Bisnode’s plan to fight all the way up to the CJEU if it can. (Meaning its appeal process could take years.)

“The company is also saying in public that its different EU branches are following a similar practice, but did not draw the attention of DPA,” Olejnik continues, adding: “It is however clear that some form of information obligation needs to be made. I believe this is an interesting precedent.

“While it may be shocking to some, this is the GDPR enforcement in action. Prior to enforcement, many would doubt if some text of GDPR means what it means. Well, it appears that to DPAs, it might indeed mean what it mean, if you know what I mean.”

The growing cost and risk of personal data

There is arguably a rather similar story going on, in parallel, around ‘free and informed’ consent under GDPR in relation to online ad targeting — which has turned into a major legal battleground since the regulation came into force last year. Multiple complaints remain in play targeting various data-for-ads tech platforms, as well as attacking core adtech processes for using and sharing personal data without proper consent and/or adequately robust protection.

With the GDPR not yet a year old, major enforcements are still lacking. But there are signs regulators are preparing to draw equally firm lines in the sand on this front too.

Given all the effort going into obfuscating and/or trying to ‘compliance-wash’ how the adtech industry strip-mines personal data, those most systematic personal data-harvesters similarly appear to have calculated that the cost of fully informing individuals is simply too high.

Also because they surely stand to lose a big chunk of their marketing muscle if every user whose personal information is being exploited for ads was offered a genuine, fully informed and entirely free choice to say no way.

But that doesn’t mean they can just sidestep the requirement. Enforcement is coming for any lurking lack of compliance there too.

Zooming out, it’s not clear what proportion of personal data is scraped from the Internet vs being actively provided by the user (albeit, not necessarily freely and willingly provided — as is the nub of this GDPR ‘forced consent’ complaint, for instance).

“Obtaining such comparative data would [be] difficult at a scale,” admits Olejnik.

There’s no doubt plenty of nefarious actors engage in ‘fully unlicensed’ online data-scraping to run illegal spam campaigns or sell it to hackers planning phishing expeditions. And clearly no regulation under the sun will put a firm lid on that. Though increased legal risk may at least provide a disincentive to less hardened cyber criminals.

In the commercial sector, where regulation has a more powerful bite, the lines between scraping and ‘providing’ data are frequently self-servingly blurred by the entities involved, seeking to work around the law.

So, again, robust enforcement decisions that get upheld by jurisprudence are sorely needed to define and set down firm red-lines about how people’s data can be respectfully handled.

Let’s also not forget the scandalous acts of the now defunct political data company, Cambridge Analytica, which covertly scraped personal data off of Facebook’s platform to build psychographic profiles of American voters to try to influence domestic political outcomes. That is something which would certainly constitute a breach of Article 14, were such actions applied to EU peoples under the bloc’s current data protection regime.

An egregious example like Cambridge Analytica shows the clear logic of GDPR creating a framework for protecting people from non-disclosed use of their personal information — by offering a check against unwelcome misuse. As indeed does Facebook’s long history of abject failure to properly protect user data.

It’s not clear whether GDPR could have stopped a rogue actor like Cambridge Analytica. Though the heftier fines baked into the regime do mean data-scraping is no longer the ‘help yourself, free for all’ it apparently was back in 2014.

At the same time, multiple Facebook businesses remain under investigation in Europe: The Irish DPA has ten open investigations against multiple Facebook-owned platforms over questions of GDPR compliance. So watch that space. (And watch, too, Facebook announcing a sudden ‘pivot’ to ‘privacy’…)

Covertly harvesting personal data at scale now finally involves serious legal risk, at least in Europe.

And in light of the UODO’s strong stance on Article 14 there’s a little more reason for data scrapers to worry.

Full disclosure

One final note on UODO and Bisnode: In a slightly odd quirk, the watchdog decided not to publicly name the company — choosing to pseudonymize it by editing out certain details from the published decision text.

It’s not clear why the DPA did so. Nor was its attempt to hide the name effective. Olejnik says he was quickly able to reverse its pseudonymization. While Bisnode also subsequently chose to out itself by going public with its disagreement.

Other European DPAs do disclose the targets of their decisions as a general rule. So it’s definitely a leftfield choice by the Polish watchdog.

A spokesperson for the UODO told us it does not always avoid disclosing the name of entities subject to its decisions but in this case said its president took the view that “information about the administrative fine and its justification is sufficient” — adding that in its view the most important element is to inform the public about decisions issued and “their substance”, including providing details of the decisive arguments in its decision-making process.

But given the lack of a specific justification and especially the weakness of the pseudonymization Olejnik suggests not publicly naming Bisnode was a questionable decision.

“Based on the information from the decision it did not take me much time to ‘reverse’ the pseudonymization and reveal the company name. This puts the decision behind pseudonymization under question,” he suggests. “Though I believe the public has a right to expect transparency in the first place — the decision to pseudonymize was controversial in the first place. To say the least, it forbids users to learn about the case, the misuse, and potentially even learn if they may have been affected.”

There is perhaps no small irony in a privacy watchdog choosing to ineffectively withhold the name of a company that had failed to inform a large number of private individuals that it covertly held their data.
