Data in the rest of Government: AI, and today’s laws for tomorrow’s benefits

AI has finally got Government to take data seriously.

Information is the blood of any bureaucracy – and copying is the circulatory system. “Digital” in its broadest form is just the latest wave of faster photocopiers – decisions keep getting made no matter how fast the machines work. Any good private secretary knows: if you control the paper flow, you steer the decisions.

Just as the Cabinet Office has “spend controls” for technology, there should be flow controls for data. Current data practice in Government is five different scandals away from adequacy. As with our work in the NHS, some of those will be public, some of those will be private – the scandal is optional, the improvements are inevitable.

Even where there is a fundamental disagreement about a policy in the non-secret parts of Government, there should be the ability to have a shared factual understanding of how data is used. But even in the “non-secret” parts of Government, there are legitimate reasons for some projects to have limited information disclosed (fraud detection being an obvious one where some information should be withheld, or generalised). The recent Data Sharing Code of Practice Consultation from the Cabinet Office seems to get that balance right for fraud data.

It would be helpful to have political leadership stand up and say (again) that “Citizens should know how data about them is used, in the same way taxpayers should know how taxpayers’ money is spent.” (quoting Matt Hancock MP – then Minister for the Cabinet Office). But that is only helpful, not necessary, and there are sub-political choices which deliver benefits for the civil service and Departmental priorities absent political leadership.

The Spring 2017 Conservative Manifesto gave a strong and clear vision of how Verify could be at the heart of a Government that was accountable to its citizens (page 3). The question is whether new guidance lets that be implemented, or stymied. The Article 29 Working Party has yet to issue full guidance on the transparency requirements of GDPR – but waiting to do the minimum is not in the spirit of the UK’s desire for leadership in AI, nor goals regarding data.

Government has a range of data sharing powers, and they should all be subject to transparency – otherwise the failings of one will infect public confidence in all.

Fortunately, the range of discussions currently ongoing gives the opportunity for the choices of the future to be better than the past, if that is the desire. The National Statistician’s Data Ethics Committee is a good start, addressing the highest profile and precedent setting issues across Government. However, as with other parts of the Digital Economy Act (Part 5), there should be a Data Review Board for all data sharing decisions that don’t reach NSDEC: it gives a process by which data sharing decisions can be reviewed.

However, if there is an informed citizenry, with citizens able to see and understand how their data has been used by government, the more complex questions of AI and algorithms become tractable. The status quo will not lead to a collapse in public services, and they will always be able to catch up; the question is only the nature of the political pain that Ministers will suffer because of their civil servants.

A number of Departments believe that “digital transformation” has either failed or is not for them, and they wish to go another way. But the target was always the outcome not the method, and the test is not the pathway, but delivery. How do Departments transform to reflect their current situation? Will they be accountable and to whom?

 

Bad ideas beyond the AI Review

The recent “AI Review” talks about how “Navigating a complex organisation like the NHS is an unfathomable task for small startups like Your.MD.” Your.MD is a company which hosts the data it collects in the US (i.e. subject to US law), outsources coding to eastern Europe (it’s cheaper), and generally cuts every corner that a startup cuts (the corners being things required to protect NHS patients). It should not be too much to ask that anyone wishing to use NHS patient data is capable of hiring someone who can use Google to find NHS data rules. Although, as that is a test that DeepMind catastrophically failed, maybe Monty Python was right to hope for intelligence somewhere out in space.

 

Loopholes (and the Data Protection Bill)

There are some areas where narrow special interests still see themselves as more important than the promises made to patients or citizens, and as more important than the principle of no surprises for patients. No bureaucracy can rid itself of the temptation to do what is in the interests of only the bureaucracy. However, it can decide to hold itself to a higher standard of transparency to the people it serves, and let them make the decisions.

With clause 15 it is Government’s demonstrable intent to carve holes into data protection law for its own purposes. To balance such attempts, across the many gateways the Bill makes possible, there must be transparency to a citizen of how their data is copied, even if it is entirely lawful. That allows the question of whether data is copied to be separated from the rules that cover data copying and access, and allows an informed democratic debate.

AI has finally got institutions to take data seriously. In doing so, it has created a clear distinction between those who understand data and those who do not (the transition from the latter to the former is incentivised as the latter are easier to replace with an AI). As yet, the AI companies don’t understand (or wish to understand) the institutions they want data from – which suggests those companies too are easily replaceable (paras 35-49). The AI review also suggests “data trusts” mirror other dodgy kinds and replace the existing principle of safe havens. While some of the large charities can look at that approach as insurance should public confidence in a particular disease registry collapse, and they are entirely wise to do so, a lawful disease registry should command public confidence.

The dash to big data and AI does not mean everything we have learnt about confidentiality, institutions, and public confidence should be thrown away to satisfy startups with less history than a Whitehall cat.

Any external body which seeks to prevent misuse of data will likely fail over time. It is easy for mediocre managers to believe the sales pitch to buy a big system that will “do everything” – to flood a data lake – while earnestly convincing others that this approach will solve whatever problem they think you have. Care.data was supported by many sectors long after the flaws were undeniable; it was only when the public became aware that their tune changed. How will the new bodies learn from that mistake? Do they even think they have to?

The actions of the Home Office have destroyed the integrity of Country of Birth / ethnicity data in the National Pupil Database. At no point was that a discussion – just a directive. It is impossible to expect even the most privacy-interested civil servant to defend such a line – even if they remained implacably opposed, their successor eventually would not. There are 3.5 years before the next census. If the first thing the nation’s children know about a census is that it deports their classmates, the fundamental basis for all statistics about the UK will be fatally undermined for a decade. This isn’t counting cranes, it’s extra resources for the areas that think they have high levels of immigration….

Bad ideas never die until they are replaced by better ideas. The misstep in the life sciences strategy illuminates the way that the future may go wrong – there needs to be a way to course correct over time. Just as every use of data in the NHS should be consensual, safe, and transparent; every use of data by Government can be fair, safe, and transparent. That includes uses by any group who cares to assist and be accountable to the individuals whose data they desire.

Is there an interest in a strategic, practical, and available solution? If not, then how many more data scandals will it take, and how high will the associated price be?

There is a better approach, using today’s laws for tomorrow’s benefits.