Data in the rest of Government: Put data to good use?

{this is a background reference blog post, ahead of more on the Cabinet Office’s data copying consultation. The call to action will be in the next newsletter.}

“Let’s make data easy to put to good use” says the Cabinet Office. But good for whom? Good for the civil service? Good for each citizen? Who makes sure the balance is right? was claimed as a “good use” of data. The details showed it to be something radically different. The Cabinet Office consultation launched last week is about bureaucracy as usual. The mantra is reform, but the reform brings all the benefits to Government and leaves the downsides to citizens.

Digital transformation, this is not.

Every data project has to say how it’s “not like”. But many of those projects have learnt the Titanic lesson of not crashing into that particular iceberg. The iceberg was only the trigger of that incident; the next large-scale data debacle will be slightly different. Will Government work for citizens or will it be turned the other way round?

Bulk data sharing

There are some things that only Government can do in bulk – the proposal to take the privacy-protecting measures used to give fuel benefits to pensioners, and apply them to those on particular benefits, is welcome. That is something that only helps citizens; a clear statutory basis for that data flow is necessary, and the citizen’s first contact with it is receiving a lower fuel bill. It has (seemingly) been designed to be done safely.

The majority of the planned work is not like that.

The majority of the effort is for generic powers: sending bulk copies of data on everyone, from one place to another.  That sounds a lot like As with medical records, there are a wide range of good reasons for doing things, but that doesn’t mean that it should be done to everyone, in general. The claim is “good use”, but good for whom?

The mantra of the data copying maximalists is that “we get asked why we don’t share data already?”. However, that is only asked by people who are unhappy with the status quo; people who don’t want data on them copied in bulk, at the whim of the system, are entirely happy with it. That’s not to say that a citizen shouldn’t be able to have their data shared; it’s to say that copying data on everyone is not an improvement.

A missed opportunity to ask the citizen

For many purposes, the reason for greater data sharing is to reduce the burden on the citizen. But for the majority of the proposals (with the Home Office’s General Register Office) the goal is to reduce the burden of the citizen’s interaction with Government, when the citizen is already interacting with Government.

It is entirely feasible for the part of Government you interact with, to ask you whether you are content to get confirmed information from the other bits of Government that have it. If you say yes, the data on you alone gets shared (via APIs); if you say no, then you have to do more work to prove whatever Government tells you it needs. But the power about how your data gets used remains in your hands, as it is now.

The status quo gives you as a citizen great power to withhold information you feel is irrelevant; the generic data sharing power as proposed will share any data that any civil servant in Whitehall thinks might be useful to someone.

The consultation proposes taking that power out of the hands of each citizen, and putting it into one standard data sharing process for all. The Government’s term for “big databases” is “Bulk Personal Datasets”, and many civil servants want them copied into each little fiefdom – it makes their life easier, and what’s good for the civil service is apparently good for every citizen.

These proposals are designed to lead to more use of data by Government, not necessarily better uses of data for citizens. The process that led up to this consultation was about better uses; the consultation process now is about more uses. And hiding them all.

A commitment to perform only appropriate analyses in the right place, would require change from the civil service; giving more data to more people satisfies the existing bureaucracy while only having to claim modernisation. And not telling the citizen how their data is used means that no civil servant will ever be held accountable for their choices.

One of the GDS Design Principles is “Do the hard work to make it simple”; it seems that simpler Whitehall meetings are more important than citizen or democratic control of what happens to data on each citizen.

Bulk Personal Datasets

DWP pays for benefits, but your GP can sign you off sick (which eventually has a tiny impact on the benefit bill). As a result, the DWP demanded a copy of all the “FITnotes” that each GP had issued and, in a classic case of Whitehall weakness, had DH use its powers to get the data and just hand it straight over to DWP.

There are many reasons for DWP to want an analysis of that data; and there are many reasons to have that analysis be done; but who should do the analysis?

If there were a clearly articulated question, DH analysts, who understand the data in depth, could do the work, ensuring that there were no inappropriate uses of the data. However, DWP don’t trust DH (or, more likely, didn’t want to write down what the question should be, as then they couldn’t change it based on the data).

If Government departments don’t trust each other, why should citizens trust them?

It’s an uncomfortable question for the reshaped GDS to answer, and the data team has chosen to ignore it in this consultation.

“Context Collapse”

As found by the Wellcome Trust, “context collapse” for data occurs when a citizen (or patient) provides data in one context, only for it to be interpreted radically differently. is one example, and the attempt to “open” the National Pupil Database in 2012 was another.

It seemingly made sense for a civil servant in the context of Whitehall to attempt to release data on teachers and schools. Unfortunately, from the context of parents, children, and impartial observers, that was releasing individual level data on 20 million school children (everyone at school since the 90s). They still don’t routinely publish what happened to that data. is another high profile example; or the attempt by the PM’s office to get complete copies of every GP’s appointment book; or Jeremy Hunt’s big database of women’s genitals (designed by the Home Office); we could go on – there’s quite a long list.

When Whitehall departments are weak, and many of them are when political imperatives are at stake, the decision to copy data is not taken in the context of the citizen, it is taken in the context of Ministerial infighting and obscure agendas.

This consultation, and the thinking that created it, does nothing to improve on that.

What should happen?

1. Improving Individual Interactions


  • Ask citizens for permission for what you wish to do.
    • Some actions are “required” for the service.
    • Many can be by consent (debt consolidation can be an opt-in service, at least initially)
  • Tell citizens what you require of them:
    • Digital should be a way of meeting those requirements
    • Digital services should be the better choice, not mandated – “services so good people choose to use them”
    • Authorisation for tasks
      • this can be research
    • Does not require bulk data copies  – needs APIs between depts
      • Citizen gives consent, and Departments have a model by which their systems offer information to each other on the citizen’s behalf, when the citizen requests it.
    • Statutory overrides where appropriate
    • Citizens must be able to find out what information was shared, and why, to prevent abuse – this is the only safeguard that keeps the entire system honest.
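
As a minimal sketch of the model in the list above, and not anything the consultation or any department specifies: one fact about one citizen is released only with that citizen’s consent or under a named statutory purpose, and every release is logged so the citizen can see it afterwards. The names SharingGateway, Disclosure, request_fact and disclosures_for, and the "DeptA"/"DeptB" labels, are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Disclosure:
    """One entry in the log a citizen can inspect afterwards."""
    citizen_id: str
    source: str
    requester: str
    fact: str
    purpose: str
    basis: str  # "consent" or "statute"
    at: datetime


@dataclass
class SharingGateway:
    """Hypothetical per-citizen lookup between departments; no bulk copy is made."""
    records: dict  # {(source_department, citizen_id): {fact_name: value}}
    statutory_purposes: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def request_fact(self, citizen_id, source, requester, fact, purpose, consent=False):
        """Return one fact about one citizen, or None if there is no basis to share it."""
        if purpose in self.statutory_purposes:
            basis = "statute"  # statutory override, still logged
        elif consent:
            basis = "consent"
        else:
            return None  # the citizen proves the fact some other way
        value = self.records[(source, citizen_id)][fact]
        self.log.append(Disclosure(citizen_id, source, requester, fact, purpose,
                                   basis, datetime.now(timezone.utc)))
        return value

    def disclosures_for(self, citizen_id):
        """Everything shared about this citizen, by whom, and why."""
        return [d for d in self.log if d.citizen_id == citizen_id]


gateway = SharingGateway(
    records={("DeptA", "c1"): {"address": "1 Example Street"}},
    statutory_purposes={"audit"},
)
refused = gateway.request_fact("c1", "DeptA", "DeptB", "address", "renewal")
shared = gateway.request_fact("c1", "DeptA", "DeptB", "address", "renewal", consent=True)
```

The design choice that matters is that a refusal copies nothing and logs nothing, while a statutory override is logged exactly like a consented release, so the citizen’s view of the log is complete.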

2. Bulk Personal Datasets

  • There are some purposes where the citizen is not present at the beginning of the transaction, and so cannot consent. These should be statutory:
    • e.g. official statistics and other statutory analyses (e.g. Audit)
    • e.g. DECC cold weather payments (in the consultation)
  • Otherwise, access to bulk personal datasets is by consent for research and modelling:
    • including the design of publishable aggregated statistics
  • All projects reported
  • All outputs published
  • Copies minimised:
    • generally in a “safe setting” (e.g. NAO for Audit)
    • all queries must be capable of being entirely audited
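
One way to picture “copies minimised” and “all queries audited” together, as a rough sketch under assumptions of our own: the rows stay inside a single safe setting, analysts on approved projects get aggregate counts back rather than rows, and every attempt, allowed or not, is written to an audit log. SafeSetting, count_by and the field names are hypothetical, not part of any existing system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SafeSetting:
    """Hypothetical safe setting: rows are queried in place, never handed out."""
    rows: list  # list of dicts, one per person
    approved_projects: set
    audit_log: list = field(default_factory=list)

    def count_by(self, project, analyst, column):
        """Return aggregate counts for one column; log the attempt either way."""
        allowed = project in self.approved_projects
        self.audit_log.append({
            "project": project,
            "analyst": analyst,
            "query": f"count_by({column})",
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"project {project!r} is not approved")
        return dict(Counter(row[column] for row in self.rows))


setting = SafeSetting(
    rows=[{"region": "north"}, {"region": "north"}, {"region": "south"}],
    approved_projects={"fuel-poverty-stats"},
)
counts = setting.count_by("fuel-poverty-stats", "analyst-1", "region")
```

Because refused attempts are logged as well, the audit trail shows who asked for what, not only what was released.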

The above two proposals are not controversial. They are entirely doable if there is political will inside the civil service to address what is hard.

3. The dark corners of legacy processes

It is in the dark corners of legacy where the majority of Government data copying resides. It’s the thinking that lost the tax discs in 2008; it’s the legacy of the bulk data sharing powers that Cabinet Office tried to sneak through in 2009. Copy more data, and hope nothing goes wrong on your watch (and pray you never get promoted to a position where your predecessor did the same thing).

It was historically easier to do a bulk copy and forget about it. That is the way most of Whitehall works, and the way NHS England envisaged working in 2012, and it’s easier to ignore than to fix it – after all, it’s citizens who will pay the price, not the bureaucrat in charge of legislation (who wants a promotion).

The NHS started this process of transparency, but the NHS should not be the end of it. The long delayed Caldicott Review of Consent should address these issues, but NHS England and the Department of Health are busy watering it down so they don’t have any incentive to stand up to DWP or the Home Office the next time someone wants some data that is contrary to patients’ interests. Staff from the Cabinet Office are helping NHS England make sure that all data is copied as widely as possible, because if the NHS shows how to resolve the problems properly, Government will have no excuse not to.

Government says it wishes to have its data use be more like that of the mobile telephone companies. Your mobile provider may sell a large amount of data on you and other customers, but even they are utterly shocked at the idea of copies of individual data leaving their departmental control, purely for political purposes, as happens routinely in Government. The secret parts of Government have reasons to try to keep some things secret; but secrecy should not be a comfort blanket that lets the civil service hide in perpetuity from difficult changes to processes.

Hospital data being sold was a legacy dark corner, until it wasn’t. What’s the next Government equivalent?

Some legacy systems can only function if they make copies of data; that’s just how they work: they should be replaced. Over time, they will be replaced, hopefully before anything goes wrong. But the political agenda of data copying encourages the most leaky parts of the system to have the most data, ready to be exploited like TalkTalk. The DWP had over 15,000 copies of their 80m person database lying around their analysts’ computers. Are they sure they’ve not lost one? How can they possibly know? They didn’t keep track until someone went in to count (and the chances that such an audit is accurate are…?).

The consultation does nothing to improve data use; it promotes unrestrained distribution of a toxic asset: citizens’ data.