
What are the principles that should underlie a login infrastructure of a digital NHS?

DH / NHS Digital’s name for the work they are doing on patients identifying themselves digitally is the “citizen identity” programme – a name which demonstrates the fundamental misunderstanding of the problem that needs to be solved. They expect to launch in September (item A1, page 54).

Designed after the Home Office ID cards scheme was abolished, the Government’s generalised login solution is an implementation of the ID Assurance principles, usually called GOV.UK Verify. It would let a range of different “identity assurance” providers enable patients to log in to a wide range of different services, without creating an overarching “database” of anything. There are lots of constituent parts to Verify, all working together, underpinned by a set of principles that are accountable to an independent advisory group and Ministers. The principles are separate from the infrastructure design, which is in turn separate from the deployment for GOV.UK.

The NHS should agree to follow the Principles, and make its own deployment of the infrastructure, basing an ‘identity’ assertion on a pre-existing legitimate clinical relationship.

As a model, this would be closely aligned to the current NHS model for patients logging into services, known as Patient Online, for which GPs distribute logins – as they know who patients are, and can manage the exception handling (lost passwords, verification, edge cases, etc.).

Meanwhile, over in the database state corner, there are still projects looking to build a centralised login infrastructure for all digital health services, derived from a legal document or record – such as a passport, driving licence, or tax payments.

Identity Assurance Principles

The PCAG Identity Assurance Principles should apply to the NHS login infrastructure, and be overseen in a similar way. A patient has a choice over which GP they wish to use, which provides for the choice of identity provider. Due to the range of conditions handled by the NHS, it may be clinically necessary in practice to deny a patient choices they would otherwise have in principle – but only for clear clinical reasons.

It is initially convenient that the Principles, and the current mechanism for handing out usernames and passwords to patients across the NHS (i.e. GPs) align extremely well. There will need to be work on the infrastructure middleware layers of the system, but the Patient Online programme – giving details to already identified patients – has already begun, and begun at scale.

Whatever system is used must accommodate and enable patients who wish to keep some aspects of their treatment entirely disconnected from other aspects. Whether this is via one login for all NHS services, or for particular areas carved off, should be entirely under the patient’s control, and not be restricted by NHS technical decisions.

Technically, this is not difficult. The infrastructure has already been designed and built by GOV.UK, and that code can be reused. Whether NHS Digital reuses the Cabinet Office servers and operations team is primarily an operational question.

As a political and Governance framework, the principles may be hard – and digital identity governance doesn’t currently exist in the NHS – but it does exist in PCAG. PCAG should therefore be asked by DH to assess whatever the NHS implementation is, against the PCAG principles. This will require some complex conversations, and learning on all sides.

The standards and code are copied, the principles are accepted, the identity providers and service acceptance standards are NHS specific.

Absent leadership from DH, this could be almost impossible. It is absolutely vital that this delivers, and delivers fast, in order to realise significant savings in the NHS Budget. Those who control the budget are not necessarily the people who are capable of delivering quickly, nor are their interests necessarily served by a solution with strong governance.

The NHS expects to find £1bn a year in savings from reducing missed appointments via a better digital Choose and Book. The service already exists – there is simply no way to log into it easily. Let us say that again: £1 billion in short-term savings, simply from the NHS having a proper digital infrastructure.

Patient Online works for assigning usernames and passwords to patients, based on a clinical relationship, and Verify’s infrastructure has been shown to scale. Ad hoc identity approaches have been shown to fail.

Should passwords get stolen, the Patient Online system can include an additional factor: needing to know the GP for which a stolen username/password is valid. It is likely that username/GP lists are rarely stored (other than by the GPs themselves) which gives the NHS regulatory assistance unavailable elsewhere.
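As a rough illustration of that point, here is a minimal sketch of credentials that are only valid in combination with the practice that issued them. Everything in it is hypothetical: the `PracticeCredentialStore` class, the practice identifiers and the hashing parameters are assumptions made for the example, not a description of how Patient Online or any GP system is actually built.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2 (iteration count is illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class PracticeCredentialStore:
    """Hypothetical store in which a login is keyed by (practice, username).

    A username/password pair on its own is not enough: the caller must also
    name the practice that issued it.
    """

    def __init__(self) -> None:
        # (practice_id, username) -> (salt, password_hash)
        self._records: dict[tuple[str, str], tuple[bytes, bytes]] = {}

    def register(self, practice_id: str, username: str, password: str) -> None:
        salt = os.urandom(16)
        self._records[(practice_id, username)] = (salt, hash_password(password, salt))

    def verify(self, practice_id: str, username: str, password: str) -> bool:
        record = self._records.get((practice_id, username))
        if record is None:
            # Unknown practice/username combination: reject.
            return False
        salt, expected = record
        # Constant-time comparison of the stored and supplied hashes.
        return hmac.compare_digest(expected, hash_password(password, salt))
```

In this sketch a stolen username and password fail unless the attacker also supplies the right practice, which is the weak but useful extra factor described above.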

Here is our demonstrator: if you have a login for your GP, feel free to try the blue buttons

There is no reason for any part of the NHS to have a big list of all of the services a patient has used.

If the current world view persists, initiatives like the excellent SH24 projects, and a digital Dean Street Clinic, are going to remain services that cannot function at scale – because there will be no national infrastructure for them to reuse. A Verify-based governance model can do that, and they would also be able to issue their own usernames/passwords since they deal directly with patients, as GPs do.

Our demonstrator is purely a proof-of-concept. NHS England could have published in machine-readable form the login page for each GP, but for some reason didn’t see the need. #NHSAlpha, who could have got them to do it, instead wanted to own the database – so started work on the impersonation problem. Badly. These are both things that other parts of the NHS handle every day, and DH can only do worse at greater expense from afar. It is concerning that the NHS ‘technical silos’ have not recognised that this is a system which can be encouraged, and instead see it as a technical problem with a technical solution.

There must be better governance around logins and how digital health information accesses are run. The PCAG principles are the beginning of that discussion, not the end. GOV.UK Verify relies heavily on passport verification, and the issuance of passports relies heavily on NHS-derived data. It would be perverse to go round that entire loop in order to issue a GP login, when the GP is also someone relied on to prescribe mind and body altering substances. But along the corridors of DH and NHS England, there are a handful of people muttering “My Precioussss”, while trying to forge a database state for medical information – worse than that, none of the projects are actually talking to each other.

Sound familiar?

care.data lessons, unlearned

The HSCIC (the statutory body otherwise known as NHS Digital) has a form you can fill in to opt yourself out of the various HSCIC datasets; the form is 12 pages long. The equivalent form, ready to be handed to your GP, is one side of A4 and contains just 2 tick boxes – plus space for your name, address, etc.

The HSCIC 12-page form has those same tick boxes, but the other eleven and a half pages are all about verifying identity, so that a remote institution that very, very rarely deals directly with patients knows that the right person filled in the form.

A process, done at the wrong level, can generate that much extra paperwork.

 

Summary:

Extend Patient Online using GOV.UK Verify’s infrastructure design, augmented with the following features:

  • Only those organisations that have a clinician-patient relationship should be responsible for issuing identity credentials to individuals;
  • NHS Patient Identity should follow the PCAG principles.

The identity requirement for the NHS is not a citizen identity, but it is a patient identity – even if the patient is entirely healthy.

Here are some available next steps.

medConfidential Bulletin, 24th March 2017

It has been a while since we last sent a newsletter. Our apologies for that, but we have been kept busy!

We are entering a period where a lot of things are happening – and are likely to happen – in quick succession, so we wanted to provide a perspective and some context that we hope will help explain at least some of what is going on.

For patients whose practices use TPP SystmOne

You may have seen the note on our website last week about TPP SystmOne. TPP has now updated its system with the capacity to allow your GP to tell you how your GP-held data has been accessed. However, busy GPs won’t yet know how to turn that function on, as the documentation has not yet appeared (and we’ve not been told either).

If your practice uses TPP SystmOne, also branded SystmOnline, and you are able to log into your GP practice online (i.e. if you have a username/password for online access) then you may be able to see this option – to review the organisations which have accessed your GP data – right now. If not, check back in a week or two. It is coming.

This ability to see who has accessed your GP data matters, as the hard part of informed consent is actually being informed about how your medical records are used. As the NHS evolves over time, and while you have a range of consent choices, you need to have accurate information to be able to make those choices for yourself and your family; in your situation, according to your concerns.

Problems tend to arise when people other than those directly affected take decisions that do not – indeed, cannot – account for many millions of people’s individual circumstances.

Google Artificial Intelligence (AI) subsidiary DeepMind

When in a hole, it seems some AIs will keep digging.

medConfidential’s complaint against Google DeepMind’s use of 1.2 million patients’ hospital data continues to be investigated. The National Data Guardian appears to have come to a view some time ago – which suggests the question currently under consideration is how badly Google broke the rules.

A long analysis from the University of Cambridge was published last week, which goes through the entire sorry story in a great deal of detail.

We do not know when the Information Commissioner and National Data Guardian will publish their findings, but fully expect Google DeepMind to leak some parts of those findings to sycophantic outlets the day before…

We shall respond, as we always do.

What’s next?  An NHS reorganisation that really matters

Has your area announced the reorganisation of your NHS yet? For several big cities of the North, and some other parts of the country, the picture is getting clearer. The ‘STP shuffle’ will put your local council in partial control of where your medical records get copied – including whether they end up being dumped into a “data lake”.

In hidden meetings, proposals for a “national data lake” continue to be discussed. While NHS England denies it is their current plan, they continue to write regular drafts of an updated document, which they’re sharing with no-one beyond those people who thought a ‘National Data Lake’ was a good idea in the first place…

In our next Bulletin, we hope to have something for you to do to help your community, and may also give an update on the continuing failures around data at Public Health England.

As ever, we are grateful for your donations. Especially as, right now, we’re being legally threatened (we’re in ‘letters before action’ stage of an attempt to sue us for defamation) for expressing our concerns about a data breach reported as affecting 26 million patients – that’s a lot of new badges.

(We’re aware that, as badges, our button badges in two new designs are ridiculously overpriced. The price point is deliberately chosen so that a donation of £20 to us gets you one, automatically. Or set up a regular subscription for any amount – and we’ll post it to you.)

Thank you.

Phil Booth & Sam Smith
24th March 2017

 

medConfidential comment on Google DeepMind briefing on an academic paper

We read many academic papers about data projects. It is rare they result in anything at all, let alone anonymous briefings against academic inquiry.

We were therefore intrigued by two points in this Wired article, written with access to Google DeepMind executives:

  1. It reuses a quote from medConfidential that is 9 months old, as if nothing has changed in the last 9 months. If that were true, why did Wired write about it again?
  2. The quote from the Google DeepMind executive suggests that the academic paper to which the article refers has errors.

If, as DeepMind says, “It makes a series of significant factual and analytical errors”, we look forward to DeepMind publishing evidence of any errors as a scientifically rigorous organisation would, rather than hiding behind anonymous briefings from their press office and a hospital. Google claims “we’re completely at the mercy and direction” of the Royal Free, but from the last 2 paragraphs of the same article, that’s obviously not completely true…

medConfidential has confidence in the scientific inquiry process – and we are aware DeepMind does too, given their own authorship of academic articles about their work.

While it is highly unusual, it is not a factual or analytical error to write an academic paper that is readable by all.

We expect that DeepMind was aware of the substance of the paper prior to publication, and didn’t say anything about any of those problems then. This behaviour is entirely consistent with DeepMind’s duplicity regarding our timeline of public facts about their original deal – they claim errors in public, but will say nothing about them when asked.

Colleagues at the Wellcome Trust are right – mistakes were made.

This is how AI will go wrong; good people with good intentions making a mistake and being institutionally incapable of admitting that most human of characteristics, imperfection.

For patients whose doctors use TPP SystmOne

Update 19/3/17: We understand TPP is due to provide more information on their transparency process. We will update this notice when we have read what TPP provide.


There is a problem with the security of GP records held on TPP SystmOne, where your records are protected only by a Code of Conduct.

If you do not receive care from an organisation that uses TPP SystmOne, this issue does not affect you. You can check whether your GP practice does use TPP SystmOne by putting your postcode into this online form; select your GP practice from the list provided, and you should end up on a page which asks you for a username and/or password. If this page has anything other than a SystmOne/SystmOnline logo at the top of the page in big blue letters, then this issue doesn’t affect you. If you see a TPP SystmOne or SystmOnline logo, then you are affected.


Nothing below affects you, unless your doctor uses TPP SystmOne.

Due to a failure by TPP SystmOne, your record may be visible to authorised users in other parts of the NHS that also use TPP SystmOne, unless you (or your GP) have previously taken an active decision to prevent this.

You will know if you took this decision already, because such a decision will affect the care you can receive as it affects who can access your GP record, including for services such as out-of-hours care. While access should be able to be restricted by your GP practice to only those who provide them out-of-hours care, that restriction is not currently offered by TPP. Therefore any authorised user, at any organisation that uses TPP SystmOne, can potentially access at least some of your record.

If you have major urgent concerns about this, and if you only receive care from a single NHS organisation – e.g. your GP, or a single mental health organisation, or a single pharmacy, etc. – you can simply turn off what is called “sharing out” by that organisation using this form BUT please ensure you read the information on the form itself, and the next paragraph, before making that decision.

For many people, turning off “sharing out” is an option that may affect your care, even in the medium term, while TPP fixes the problem.

In the interim: if you are concerned, and turning the “sharing out” feature off would impact your care – which is likely for many people – you can write to your GP practice manager and ask them to (in TPP’s words) “use the Record Sharing node within the patient record to view which other organisations are sharing in the patient’s record and can therefore access the information you have shared out”. In other words, you can ask the practice manager to provide you with a copy of the full list of organisations and the dates on which each one accessed your details for the last 6 months – or around specific dates, if you have specific concerns.

If there are accesses from institutions you do not recognise, medConfidential will publish more information on this post in the next few days about what happens in those rare cases. In most cases, if the dates of access are around a day in which you were at a different NHS provider nearby, it is highly likely that information will have been shared between your care providers. (We will expand on this shortly.)
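For readers who receive such a list and want to work through it systematically, the following is a minimal sketch of the check described above: group accesses by organisation over roughly the last six months, then flag any organisation you do not recognise. The `AccessEvent` record and both function names are hypothetical, chosen for illustration; they do not reflect the format of TPP’s actual audit trail.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AccessEvent:
    """One hypothetical audit-trail entry: who accessed the record, and when."""
    organisation: str
    accessed_on: date


def accesses_by_organisation(events, today, window_days=183):
    """Group access dates by organisation for roughly the last six months."""
    cutoff = today - timedelta(days=window_days)
    summary = {}
    for event in events:
        # Keep only accesses inside the window ending today.
        if cutoff <= event.accessed_on <= today:
            summary.setdefault(event.organisation, []).append(event.accessed_on)
    for dates in summary.values():
        dates.sort()
    return summary


def unrecognised(summary, known_organisations):
    """List organisations in the summary that the patient does not recognise."""
    return sorted(org for org in summary if org not in known_organisations)
```

An organisation flagged this way is not necessarily a problem; as noted above, an access around a date when you were treated nearby is most likely legitimate sharing between your care providers.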

TPP are actively working to fix this issue, implementing a change that will let you use your login to your GP’s website so that, in future, you will be able to see the ‘audit trail’ of uses that your GP practice manager can see now. If you don’t already have a login to your GP website, it would probably be helpful to get one in advance – as it will have other features beneficial to you.

We understand that TPP are also taking a number of other steps we’ve not covered here.

Longer term outcomes

As medConfidential said when commenting on this issue, “Failures of this sort are exactly why patients must be able to see by which organisations their GP records have been accessed.” We have said this before, when organisations have similarly failed.

We strongly welcome that “TPP will be making amendments to the record audit within SystmOnline, this will show the patient every organisation that has accessed the information you[r doctor] record within their electronic record.” (See the bottom of page 2 of this TPP document.) This work will help reduce the harm of data breaches across the NHS, and not just for TPP.

Such failures have happened before, and will happen again, and again, until – as Dame Fiona Caldicott recommended last summer – Jeremy Hunt commits to ensuring that every patient in the NHS can see how their data has been used.

TPP has now committed to telling patients how their data is used… what about everyone else?

 


For NHS staff

For GP practices: Please read page 2 of this TPP document linked from your TPP noticeboard, and ask your GPSoC provider to accelerate their work on delivery of the audit trails to patients, and a resolution of the underlying problem. If you have queries, contact your local Caldicott Guardian.

For Caldicott Guardians: please see the (imminent) guidance from the Council of Caldicott Guardians and the National Data Guardian.

medConfidential response to “technology company DeepMind” Press Release

For immediate release – Tuesday 28 February 2017

One year after first telling the public that “technology company DeepMind” [1] was going to help the NHS, it is still unclear whether Google’s duplicitous offer still includes forcing the NHS to hand over the medical history of every patient who has visited the hospital. [2]

It is no surprise that digital tools help patients, but is Google still forcing the NHS to pay with its patients’ most private data?

As the NHS reorganises itself again with the Secret Transformation Plans, [3] NHS England plans a ‘National Data Lake’ for all patient data, [4] of which this is one. In defending giving data on all its patients to Google, Royal Free’s Chief Executive, David Sloman, said “it is quite normal to have data lying in storage”. [5]

Tomorrow the Government announces the UK’s new digital strategy, [6] including new money for the Artificial Intelligence in which DeepMind specialises. Is copying of data on a whim what the future holds?

Clause 31 of the Digital Economy Bill suggests precisely that [7] – data can be ‘shared’ (copied) to anyone associated with a public or NHS body [8] who can justify it as “quite normal to have data lying in storage”.

As Downing Street takes the Trump approach to health data, [9] does Google now say the ends justify the means?

Phil Booth, coordinator of medConfidential said:

“So toxic is the project, the latest press release doesn’t even use the word “Google”.

“It is good that 11 patients a day get faster care due to this tool; but Google will still not say why they wanted data on thousands of patients who visit the hospital daily.

“Until patients can see where their medical records have gone, companies will continue to predate upon the NHS to extract its most important resources.”

Notes to Editors

1) This is how Google’s wholly-owned subsidiary, DeepMind – based in the Google offices in London – was misleadingly described in this press release published by the Royal Free: https://www.royalfree.nhs.uk/news-media/news/new-app-helping-to-improve-patient-care/

2) ‘Google handed patients’ files without permission: Up to 1.6 million records – including names and medical history – passed on in NHS deal with web giant’, Daily Mail, 3/5/16: http://www.dailymail.co.uk/news/article-3571433/Google-s-artificial-intelligence-access-private-medical-records-1-6million-NHS-patients-five-years-agreed-data-sharing-deal.html

3) Hospital cuts planned in most of England: http://www.bbc.co.uk/news/health-39031546

4) medConfidential comments on NHS England’s National Data Lake: https://medconfidential.org/2017/fishing-in-the-national-data-lake/

5) The Government confirms that the bulk data copied by DeepMind, i.e. SUS, “are maintained for secondary uses” and not direct care: http://www.parliament.uk/business/publications/written-questions-answers-statements/written-question/Lords/2016-12-07/HL3943

6) Due to launch on Wednesday, being now pre-briefed by the Minister: https://twitter.com/MattHancockMP/status/835835027611127809

7) Clause 31 of the Digital Economy Bill as currently drafted would allow any provider of a service to a public body (such as Google to the NHS) to share data with (i.e. provide a copy to) any other provider.

8) While the Draft Regulations for Clause 31 state that Department of Health bodies are excluded from the Clause, medConfidential has received confirmation that such bodies will be included in the final regulations after Parliament has considered the Clause without health included.

9) The NHS is being forced to release the names and addresses of vulnerable patients to the Home Office: http://buzzfeed.com/jamesball/trumping-donald-trump

Questions that remain unanswered from May 2016 include:

  • What was the basis for Google to get 5 years of secondary uses data on every patient who visits the hospital? Google is getting thousands of people’s data per day, yet the hospital admits it is helping only a small fraction of them.
  • Why did the app not simply access the data it could clinically justify, when it needed to display it? That would have provided all the benefits of the app to patients and clinicians, and not given Google the medical records of patients which it had no justification for receiving. Did Google even talk to the hospital’s IT provider about access to only the data it needed before demanding all the data the hospital held?

medConfidential made a complaint to the ICO and National Data Guardian about the project in June 2016. Google and the Royal Free Hospital have yet to provide satisfactory answers and we understand the investigation remains ongoing.

-ends-

Public Health England

“I feel I guess betrayed that 19 months into my partner’s cancer battle we didn’t know about this. I think honesty is the best policy and have no problem with the info being recorded but we should have been told and that the details can be removed at the patients request as not to be made aware at some point seems deceitful”

 patient quoted on p37 of the Macmillan/CRUK report on consent

“Betrayed” and “deceitful” are not words cancer charities quote lightly, but they are right to use them.

medConfidential believes – as we do for all flows of health data – that the cancer registry should be consensual, safe and transparent. By contrast, the current data handling practices of Public Health England are coercive, dangerous and dishonest.

PHE’s National Cancer Registration and Analysis Service web page today says: (emphasis added)

“Patients can ask NCRAS to remove all of their details from the cancer registry at any time. Opting out of the cancer registry won’t affect the patient’s immediate treatment at their hospital or GP practice, but there may be occasions in the future when the data that is held by NCRAS can be used to assist in their care or that of a close relative.

“If patients opt-out of the cancer registry, it may not be possible to contact individuals identified as being at risk in future, such as when an increased risk of breast cancer is identified in women treated for Hodgkin’s disease using radiotherapy.”

NHS Digital is solving this problem through medical ethics and hard work; it seems PHE has taken a Board-level decision to ignore the problem and, in effect, blackmail patients instead.

There can be good reasons to override dissent – many of them related to public health. We have asked PHE for a list of the reasons it thinks it needs to routinely ignore the wishes of cancer patients. That list has never been provided, and PHE has published no detailed justification for its demand for data. Scrutiny of what the data is used for shows its existing arguments to be “thin” at best.

The NHS does direct care, Public Health England does not – and PHE is not set up to keep data for both direct care and secondary uses. As a result, to maintain its turf, it has resorted to threatening patients and their families with reduced treatment for cancer both online and in printed literature. We understand “the Director” has called people who opted out, in person, to “encourage” them to rescind their request.

While PHE admits that 150 people have opted out of the registry, it is unclear whether these patients took at face value PHE’s public statements about this not affecting their care, or whether they fully understood any contradictory statements made in private.

This is why direct care and secondary uses must be kept separate – there are sometimes good reasons to have additional copies of data. This is one of them.

The problems of PHE are, however, far wider than just those regarding the cancer registry. While the current review terms are correctly narrowly defined, the solutions may have more general applicability by NHS Digital.

Who is responsible for this mess?

While actual data release decisions remain unpublished, PHE assures us that there is a “reporting line”.

The data release process is apparently managed and supported by an Office of Data Release. Decisions are made by the Information Asset Owner and overseen by a Data Release Assurance Board, which does no assurance, is chaired by the Chief Knowledge Officer, and is supposedly “overseen” by PHE’s Board… via the Chief Knowledge Officer.

While this may – at a glance – seem roughly similar to an HSCIC process, let us add some names to these various posts. For the cancer registry, every single one of those roles is held by the same person: Professor John Newton.

It is clear that PHE has a serious data governance problem.

PHE remains in denial

PHE’s annual report claims (page 119) it has done a “Partridge Review”, as HSCIC did in 2014. However, the HSCIC process was a model of transparency: it was public, conducted by independent analysts overseen by an HSCIC non-executive board member (Sir Nick Partridge, hence the name), and its outputs were clear and contained both an acceptance of problems and suggested steps to remedy them. By contrast, PHE has chosen to keep its review secret.

It has chosen to hide the process of reform from the public, and chosen to refuse to acknowledge any form of critique. The review was conducted by an in-house consultant, and was delivered to the Information Asset Owner (Professor John Newton), not the Board (on which he sits). PHE has refused FOI requests for that review, and won’t talk publicly about even the topics of the 4 areas of “significant concern”.

This is not a process in which the public can have any confidence at all. Indeed, it gives every impression of a cover-up by those complicit in a culture of failed priorities. And, as such, through the considered decisions of PHE and its decision makers, the vitally important cancer registry (and other datasets) remain one small misstep from a collapse of public confidence.

Implementing dissent

While patients have the legal right to opt out of the cancer registry, as part of its move to NHS Digital, it should come under the broader Caldicott Consent Choice.

As there are direct care purposes for which the registry is used, a separate system for those purposes should be maintained by NHS Digital. As a result, where there is a clear and pressing need to use 100% of the cancer registry, rather than the 98% who have not dissented from processing, then approval can be sought from the Confidentiality Advisory Group at the HRA, using the powers CAG acquired under the Care Act 2014. That may simply be the validation of marginal outputs from the 98% dataset, and would be a very specific question (since it would only be confirmation of the output of a research process).

However, the Cancer Registry is currently releasing details of cancer patients to private contractors for purposes that NHS Digital would not have approved itself, or which would have had to have opt outs honoured. These requests are excluded from the PHE Data Release Register. The cancer registry is therefore a ‘back door’ leak of identifiable data about the patients and their cancers.

Given the role the new chair of DAAG played in creating the above cancer registry consent fiasco, continued lobbying to “use my data”, and his other responsibilities and funding, it would seem the current DAAG/IGARD chair is demonstrably unfit to override dissent for the cancer registry.

Demonstrate to patients who has used the data, and why, and what we learnt

The demand for “more data” is endless, and providing more data will not solve that problem – all we see is more demands for more data. Will doing the same thing over and over again generate a different result?

Showing patients what was said to them, and what happened next, will hopefully focus minds away from hyperbole, improve the quality of layperson explanations of projects, and show what works for better outcomes, and what does not.

The cancer registry is a vital resource, but it should be accountable to the very patients whose data is within it, ensuring that data is used properly, and not used wrongly. Currently, those who release the data are not accountable even inside PHE, and keep their decisions secret from the public.

The details matter

As with the failure to implement type-2 opt outs properly for hospital data, and with PHE’s actions at any step of this process, misleading the public has consequences for public trust and public confidence.

It is entirely possible to have a consensual, safe and transparent cancer registry, delivering benefits to patients who wish their data used legitimately. We must and will move away from a coercive, dangerous and dishonest model – the question is solely the manner, governance, and price of that move.

Digital Economy Bill: Part 5, Chapter 1, clause 30 and Part 5, Chapter 2 from a health data perspective

medConfidential asks Peers to:

  • Express support for Baroness Findlay’s amendment on Part 5 (NC213A-D)
  • Express support for either amendment to Part 5 Chapter 2 (Clause 39)
  • Oppose current Clause 30 of Part 5 in Committee and on Report

We attach a briefing, with a more detailed consideration of these points, but in summary:

In 2009, the then Government removed clause 30’s direct predecessor – clause 152 of the Coroners and Justice Bill – because the single safeguard offered then was ineffective. Bringing that back, this Government has not only excluded important aspects of Parliamentary scrutiny, it is trying to introduce “almost untrammeled powers” (para 21), that would “very significantly broaden the scope for the sharing of information” (para 4) without transparency, and with barely any accountability. The policy intent is clear:

“the data-related work will be part of wider reforms set out in the Digital Economy Bill. [GDS Director General Kevin] Cunnington said, as an example, that both DWP and the NHS have large databases of citizen records, and that ‘we really need to be able to match those’.” (interview)

While there is a broad prohibition on the use of data from health and social care for research further down on the face of this Bill, in Chapter 5, the approach taken in clause 30 is very different, and contains no such prohibition. Regulations (currently draft) published under clause 36 simply omit the Secretary of State for Health from the list of Ministers, thereby excluding NHS bodies but not copies of health data others require to be provided. This is another fatal flaw in clause 30.

medConfidential is deeply concerned that Chapter 2 of Part 5 contains no safeguards against bulk copying. We accept the case for a power to disclose civil registration information on an individual consented basis – a citizen should be able to request the registrar informs other bodies of the registration – but, just as clause 30 contains insufficient safeguards and is designed to enable bulk copying, so is Chapter 2. One of the amendments laid to Part 5 Chapter 2 should be accepted.

Governments have had since 2009 to solve the problems that clause 30 not only leaves unaddressed, but exacerbates. The Government should either heavily amend Clause 30 at Report stage, or ensure it is removed before Third Reading. This clause is a breeding ground for disaster and a further collapse in public trust, and it simply doesn’t have to happen.

While medConfidential is open to legislation that treats sensitive and confidential personal data in a consensual, safe and transparent manner, this legislation does not. Despite more than 2 years of conversations about accessing data through systems that respect citizens and departments (i.e. data subjects and data controllers) and the promises they make to each other, Cabinet Office instead took a clause from 2009 off the shelf, and has been actively misleading about the process.


Briefing for Committee stage

Your hospital data is still being sold – and here’s why it matters

Every flow of health data should be consensual, safe, and transparent. The Wellcome Trust found that up to 39% of people would have concerns about the use of their hospital data (page 92). Those concerns are well founded, and the safeguards currently insufficient.

NHS Digital says that the “pseudonymised Hospital Episode Statistics” of each man, woman and child in the country are not “personal confidential information” and so your opt outs don’t apply. But the Hospital Episode Statistics are not “statistics” in any normal sense. They are raw data; the medical history of every hospital patient in England, linked by an individual identifier (the pseudonym), over the last 28 years. This article is an explanation of what that means, and why it is important.

To understand the risk that NHS Digital’s decision puts you in, it is necessary to see how your medical records are collected, and what can be done with them when they have been collated.

A proper analogy is not to your credit card number, which can easily be fixed by your bank if compromised, but to the publication of your entire transaction history. Your entire medical history cannot be anonymised, is deeply private, and is identifiable.

 

How do your treatments get processed?

Each hospital event creates a record in a database. Some large treatments create a single record (e.g. hip replacement); some smaller routine events create multiple records (e.g. test results).

The individual event may be recorded using a code, but the description of what each code means is readable online. As Google DeepMind asserted, this data is sufficient to build a hospital records system (we argued that they shouldn’t have; we agreed it was possible).

As for how millions of those single events get put together, here’s a screenshot of the commercial product “HALO Patient Analyser”, sold by a company called OmegaSolver, which uses the linking identifier (the pseudonym) to do just that:

OmegaSolver HALO Patient Analyser screengrab

 

The identifier links your records, and that’s the problem.

While a stolen credit card number might sell online for $1, a stolen medical history goes for more like $100.

The loss of a medical record is very different to losing a credit card. If your credit card is stolen, your bank can make you financially whole again, and give you a new credit card. A month later, the implications are minimal, and your credit history is clear. But if someone gets hold of information about your medical history, that knowledge cannot be cancelled and replaced – you can’t change the dates of birth of your children, and denial of a medical event can have serious health implications.

The Department of Health is correct that the identifier used to link all of an individual patient’s data together – the pseudonym, which you could equate to a credit card number – is effectively “unbreakable”, in the sense that it won’t reveal the NHS number from which it is derived. No one credible has ever argued otherwise. You cannot readily identify someone from their credit card number.

But that misses the point that there are plenty of ways to identify an individual other than their NHS number. This is not a new point, but it has never been addressed by NHS Digital or the Department of Health. In fact, they repeatedly ignore it. It was medConfidential that redacted the dates from the graphic above, not the company who published it on their website.

Whenever we talk to NHS Digital or the Department of Health, they repeatedly argue their use of pseudonyms as linking identifiers keeps medical information safe because they hide one of the most obvious ways to identify someone, i.e. their NHS number. We don’t disagree, and we agree that making the pseudonym as unbreakable as possible is a good idea. But what this utterly fails to address is that it is the very use of linking identifiers that makes it possible to retrieve a person’s entire hospital history from a single event that can be identified.

Focussing narrowly on the risk that the linking identifier could be “cracked” to reveal someone’s NHS number misses the far more serious risk that if any one of the events using that pseudonym is identified, the pseudonym itself is the key to reading all the other events – precisely as it is designed to be. That multiple events are linked by the same pseudonym introduces the risk that someone could be identified by patterns of events as well as details of one single event.
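To make that concrete, here is a minimal sketch in Python. It assumes the pseudonym is a keyed hash of the NHS number; that is an illustrative assumption, not a description of how NHS Digital actually derives it. The key, the records, and the function names are all invented for the example. The pseudonym is never reversed, yet one publicly known event is enough to pull out every other event that shares it.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret held only by the data controller (assumption for this sketch).
SECRET_KEY = b"illustrative-secret-not-a-real-key"


def pseudonymise(nhs_number: str) -> str:
    """Derive a stable pseudonym from an NHS number using a keyed hash."""
    digest = hmac.new(SECRET_KEY, nhs_number.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


# Invented events: (NHS number, date, hospital, event description).
raw_events = [
    ("9990000001", "2014-03-02", "Hospital A", "maternity delivery"),
    ("9990000001", "2015-11-19", "Hospital A", "mental health admission"),
    ("9990000001", "2016-06-07", "Hospital B", "road traffic accident"),
    ("9990000002", "2016-06-07", "Hospital C", "hip replacement"),
]

# The "pseudonymised" release: the NHS number is replaced, everything else stays.
released = [(pseudonymise(n), d, h, e) for n, d, h, e in raw_events]


def history_from_one_known_event(records, date, hospital, event):
    """Match one publicly known event, then return every event linked to it."""
    by_pseudonym = defaultdict(list)
    for pseudonym, d, h, e in records:
        by_pseudonym[pseudonym].append((d, h, e))
    matches = {p for p, d, h, e in records if (d, h, e) == (date, hospital, event)}
    return {p: by_pseudonym[p] for p in matches}


# Someone who read about the accident in the local paper knows only this much.
found = history_from_one_known_event(
    released, "2016-06-07", "Hospital B", "road traffic accident"
)
for pseudonym, history in found.items():
    print(pseudonym, history)
```

The attacker in this sketch never learns the NHS number and never needs to. The linking identifier does exactly the job it was designed for.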

In the same way that you cannot guess someone’s identity from their phone number alone, you won’t be able to guess someone’s identity from their linking identifier. But just as in reading your partner’s phone bill, you could probably figure out who some of the numbers are from knowledge of the person, such as call patterns and timings. And once you’ve identified someone’s number, you can then look at other calls that were made…

Hospital Episode Statistics (HES) provides all that sort of information – and allows the same inferences – for the medical history of any patient who has been treated in an NHS hospital, about whom you know some information. Information that may be readily accessible online, from public records or things people broadcast themselves on social media.

In the event of an accident that leads to HES being ‘published’, this is what NHS Digital says “could happen” – allowing people who know, or decide to find out something about you, to identify your medical history. This is how, in the event that one thing goes wrong, the dominoes destroy your medical privacy and (not coincidentally) the medical privacy of those directly connected with you.

Returning to the example of the phone bill – from a call history, you could infer your partner is having an affair, without knowing any details beyond what’s itemised on the phone bill.

Linking identifiers are necessary to make medical information useful for all sorts of purposes but, for reasons that should now be obvious, they cannot be made safe. That is why safe settings and opt outs are vital to delivering usable data with public confidence.

 

With 1.5 billion events to search through, what does this mean in practice?

Health events, or accidents, can happen to anyone, and the risk of most people being individually targeted by someone unknown is generally low – a risk the majority may be prepared to take for the benefit of science, given safeguards. But while it may be fair to ask people to make this tradeoff, it is neither fair nor safe to require them to make it.

As an exercise, look in your local newspaper (or the news section of the website that used to be your local newspaper) and see what stories involve a person being treated in hospital. What details are given for them? Why were they there? Have you, or has anyone you know, been in a similar situation?

The annex to the Partridge Review gives one good example, but here are several others:

  • Every seven minutes, someone has a heart attack. This is 205 heart attacks per day, spread across 235 A&E departments. If you know the date of someone’s heart attack (not something normally kept secret), the hospital they went to, and maybe something else about them, using the Hospital Episode Statistics, their entire history would be identifiable just out of sheer averages.
  • If a woman has three children, that is 3 identifiable dates on which some medical events occurred (most likely) in a hospital. Running the numbers on births per day, 3 dates will give you a unique identifier for the person you know. Are your children’s birthdays secret?
  • If misfortune befalls someone and information ends up in the public domain due to an incident that affects their health (e.g. a serious traffic accident), or if a person in the public eye, or with a public profile, publicly thanks the NHS for recent care (Twitter), how many events of that kind happen to that kind of person per day? The numbers are low, and the risk is very high.
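The arithmetic behind the first two examples can be sketched in a few lines of Python. The heart attack figures come from the text above. The size of the date window and the number of mothers are assumptions chosen only for illustration, and the model treats birth dates as uniformly random, which ignores birth spacing and multiple births.

```python
import math

# Figures taken from the text above.
MINUTES_BETWEEN_HEART_ATTACKS = 7
AE_DEPARTMENTS = 235

heart_attacks_per_day = (24 * 60) // MINUTES_BETWEEN_HEART_ATTACKS
per_department_per_day = heart_attacks_per_day / AE_DEPARTMENTS

# Assumptions for illustration only (not figures from the text):
DAYS_COVERED = 28 * 365        # roughly the 28 years of records mentioned earlier
MOTHERS_OF_THREE = 3_000_000   # hypothetical number of women with three children

# Number of distinct sets of three dates within the window.
possible_date_triples = math.comb(DAYS_COVERED, 3)

# Expected number of women sharing one particular set of three dates,
# if dates were spread uniformly at random.
expected_coincidental_matches = MOTHERS_OF_THREE / possible_date_triples

print(f"Heart attacks per day: {heart_attacks_per_day}")
print(f"Per A&E department per day: {per_department_per_day:.2f}")
print(f"Possible sets of three dates: {possible_date_triples:,}")
print(f"Expected coincidental matches: {expected_coincidental_matches:.6f}")
```

Under these assumptions there is, on average, fewer than one heart attack per A&E department per day, and the chance that a second woman shares the same three delivery dates is negligible. Changing the assumed numbers by an order of magnitude in either direction does not alter that conclusion.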

More information simply makes things more certain, and you can exclude coincidences based on other information – heart attacks aren’t evenly distributed round the country, for example, and each event contains other information. Even if you don’t know which of several heart attack patients was the person you know, it’s likely that you have some other information about their person, their location, their medical history, or other events that can be used to exclude the false matches.

It only takes one incident to unlock your entire hospital history. All that protects those incidents is a contract with the recipients of HES to say they will not screw up, and the existence of that contract is accepted by the Information Commissioner’s Office as being compliant with its “anonymisation code of practice”, because the data is defined as being “anonymous in context”.

This may or may not be true, but relying on hundreds of companies never to screw up is unwise – we know they do.

All this goes to explain why the Secretary of State promised that those who are not reassured could opt out.

 

NHS England will go fishing in a “Data Lake”, but says “let them eat APIs” to doctors

An “Emerging Target Architecture” from NHS England aims to direct all NHS patient data into a new “national data lake” (page 14). This involves taking genomic, GP, and other health data for direct care, and then going fishing in that dataset for commissioning purposes, while keeping such actions secret from the patients whose data they access.

The inclusion of the data lake and claims to be ‘direct care’ show NHS England has no faith that the tools they propose to doctors will work. The fig-leaf of “localisation” is undermined by the “national” “data lake”, and it seems unlikely that DH and NHS England will cease meddling if a local area decides not to rifle through patient records.

NHS England’s approach does not fix any problems that exist: there is no analysis that should be done, that this model will allow, that cannot be done now if someone cared to do it. The approach does however do away with patient privacy, safeguards and oversight, and allow nefarious access that is currently prohibited. This model does nothing to solve the actual problem, which is the need for more analysis. There is already an excess of data that no one is looking at, this simply creates more data. And no matter how much data there is, “more data” will remain an analyst’s desire. Patients, and the clinicians who treat them, don’t have such luxuries.

Conflating direct care and secondary uses will cause pain throughout the NHS for as long as it persists; it is the legacy of the thinking behind care.data.

Direct care?

For direct care, the idea of patient-visible, audited, “near real time” access to records held elsewhere is neither novel nor necessarily problematic in principle (though the details often fall short).

The Lefroy Act from 2015 requires hospitals to use the NHS number to identify patients, which makes data easy to link. The use of near-real-time access where there is a clinical need is not necessarily a problem everywhere, but there are clearly some areas where very great care is needed, and the ‘Emerging Target Architecture’ document contains none at all.

There are benefits to using FHIR APIs (or equivalent) as the definition of a “paperless” NHS (currently conveniently undefined). But this “target architecture” is not about that, and notably doesn’t say that. The APIs proposed can help patients, but do not require new data pools; the “national data lake” assumes they do not, and is included to allow fishing expeditions by NHS England itself and its customers – an “NHS England datamart”.

NHS England’s desire for unlimited access to data for direct care is a means to get unlimited access for other purposes. The document claims that “privacy by design” is important, but doesn’t go beyond words and completely omits privacy from its worldview.

Where is the transparency?

Access to records to provide direct care is valid – but at the scale of the entire NHS, how will a patient know whether their records have been accessed by someone on the other side of the country? The system says nothing about transparency to patients.
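What patient-facing transparency could look like is not complicated. The following is a minimal sketch in Python, not a description of any existing NHS system: the `AccessEvent` and `AccessLog` names, their fields, and the sample entries are all hypothetical. Every access is recorded with who looked and why, and the patient can ask for the list.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    """One access to one patient's record (hypothetical structure)."""
    patient_id: str
    organisation: str
    purpose: str
    accessed_at: datetime


class AccessLog:
    """Append-only list of accesses, with a per-patient report."""

    def __init__(self) -> None:
        self._events: list[AccessEvent] = []

    def record(self, event: AccessEvent) -> None:
        self._events.append(event)

    def report_for(self, patient_id: str) -> list[AccessEvent]:
        """Everything a patient would be shown: who accessed, when, and why."""
        own = [e for e in self._events if e.patient_id == patient_id]
        return sorted(own, key=lambda e: e.accessed_at)


log = AccessLog()
log.record(AccessEvent("patient-1", "Hospital B A&E", "direct care", datetime(2017, 2, 3, 14, 5)))
log.record(AccessEvent("patient-2", "Local GP practice", "direct care", datetime(2017, 2, 1, 10, 0)))
log.record(AccessEvent("patient-1", "Local GP practice", "direct care", datetime(2017, 1, 20, 9, 30)))

report = log.report_for("patient-1")
for entry in report:
    print(entry.accessed_at.isoformat(), entry.organisation, entry.purpose)
```

A real system would need the log to be tamper-evident and the purposes to be checked rather than self-declared, but the point stands that showing a patient who accessed their record is a design choice, not a technical obstacle.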

While such an architecture can do good, it can also be abused, and the worldview of NHS England offers no potential for dissent.

Open Data and dashboards on current status are necessary for transparency in the NHS. However, paragraph 3.29.3 of ‘Emerging Target Architecture’ suggests that open data can be recombined into a patient record, which suggests something has gone very wrong in the “understanding” behind the document.

NHS England will go fishing in the genetic data lake

Because all patients’ records will be included in the data lake, NHS England will then be able to extract anything for which it can provide a plausible justification. But, as the care.data Expert Reference Group showed, anything can be justified if you use the right words and no one asks questions, e.g. “national data lake” and “privacy by design”.

The existence of a data lake means people will go fishing. You can put up “no fishing” signs, but we all know how that plays out with people who have good intentions, but priorities that undermine the larger goals.

The paper does not talk about genomic data, but Genomics England (GEL) is envisaged as an inflow. Was this a deliberate choice?

This free-for-all stands in comparison to the transparency of the current NHS Digital processes. We may fundamentally disagree with some of those decisions, but there is at least transparency on what decision was made and why.

“Datamart”

The idea of a “datamart” is the clear reappearance of the care.data principle of taking all the data from patients and clinicians, and selling it to anyone who might offer a few beans to get the detailed medical histories of patients.

The conflation of direct care and (dissentable) secondary uses now looks less accidental, and more like an end state goal – for which ignoring patient opt outs was a necessary means to an end.

There must continue to be rigorous and transparent processes for accessing patient level data – and that should include transparency to patients of which organisations have accessed their data. APIs may help care, but they also help those with other intentions.

This proposal also does nothing to reduce the administrative overhead of the NHS billing bureaucracy, nor does it reduce the requirement for identifiable information to be shown to accountants at multiple NHS bodies, simply because they don’t trust each other. A “national data bus” architecture could address that problem, but NHS England has chosen not to care about reducing the burden on others.

There should be no third party access protocols – statistics should be published, or data to solve a specific problem should be available to appropriate analysts within a safe setting, when their questions have received appropriate review, who have the data appropriate to answer them, and who publish their results.

Drug companies should be prevented from changing the questions they ask after they know what the results of their trials are. And CQC shouldn’t be allowed to pretend they never asked a question, purely because they don’t like the answer they got. Analysis of the data may lead to new questions; but it should never lead to the original question not being answered. And all questions asked of the data should be published.

The future of (Fax) Machines

There is still no clarity on what will replace the fax machine for one clinician sending information along a care pathway to a department in another organisation. The desire to abolish fax machines isn’t unwise, but they serve a clinical purpose that e-mail demonstrably doesn’t resolve.

Whither Summary Care Record?

The Summary Care Record could perform many of the direct care features, had NHS England not decided upon an “all or nothing” approach to having an SCR. Had the enhancements to Summary Care Records been done on an iterative and consented basis, it would have been simpler to widen SCR to the new areas proposed. But NHS England, with the bureaucratic arrogance and technical mediocrity that pervades this proposal, simply insisted on the same “all or nothing” approach to the enhanced SCR. This being the case, it insists on all patient data being included in a data lake, as the access to data of last resort for clinicians.

Some of the proposals in this document clearly have merit, but when claims are made for “privacy by design” alongside such a fundamentally misconceived and diametrically opposed notion as a “national data lake”, the vision articulated is shown to be incoherent at best.

Prioritising a data copying exercise over actual care repeats exactly the same errors in thinking that set care.data on its path to failure. And, published just weeks after it emerged that patients’ objections to their data being used for purposes beyond their care are being ignored, this looks even more like a deliberate attempt to ignore that there are – and always will be – valid objections.

Ignoring the past in this way puts at risk access to the data of those who would be happy for their medical records to be used, given sufficient safeguards and transparency. Unfortunately, a data lake can never meet those requirements.

The “Emerging Target Architecture” document is here, and NHS England is taking comments until the end of the week…