Author Archives: medcon

Welcoming NHS Improvement

The status quo of NHS data collection could be described as “Collect it all yourself; trust no-one else”. This is clearly unsustainable: care.data may have been the straw that almost broke the camel’s back; the Prime Minister’s Challenge Fund just tossed some steel girders on top. Poor camel.

With the merger of the NHS Trust Development Authority (TDA) and Monitor under a new name, “NHS Improvement”, there may be an opportunity to begin to address some serious data shortcomings – and some persistent category errors. Monitor was supposed to act as a Government “stick”; the TDA was supposed to be an NHS “carrot” – but, as with so many bureaucracies, the left hand seemed not to know what the right hand was doing, so the stick ruled and very little productive got done.

From documents medConfidential has seen, Monitor’s approach to data seems to have adopted care.data as a handbook, rather than recognising the scheme for the “fiasco” it has so clearly become. Problems that emerged with the “pioneer” in Southend could have been as much the result of flawed advice propagated by Monitor as of NHS England’s inadequate and inaccurate guidance.

We had expected the Government to have responded to its “Accredited Safe Havens” consultation from last summer by now. That it hasn’t speaks volumes. That some of the “pioneers” and “vanguards” reflect a backward-looking data worldview still prevalent in parts of DH gives cause for concern. It’s clearly not just care.data that’s infecting the thinking, and in real danger of further damaging patient – and professional – trust.

In the forthcoming consultation on the powers and remit of the National Data Guardian, we hope the Department gathers views on NDG having to be consulted on every use of NHS England’s and other statutory bodies’ powers to require data. While NHS Improvement should certainly not be given powers to require data (HSCIC doesn’t have such powers either), it could be a place where conversations can be had between the various stakeholders – care providers, commissioners and the Department of Health – about the statistics required to firstly measure, and then “improve” a particular area.

This should not be about measuring only what it is you want to manage, but about measuring the things that matter. Not least because, as has repeatedly been shown, simple measures can lead to detrimental care when ‘gamed’ by those in the system.

Learning the lessons of care.data – though some are still lagging behind – such datasets must always and exclusively be aggregated datasets; published statistics where not only the figures but the methodology are published for all to read. (Some datasets where the detail contains small numbers may need to remain unpublished, available only in a tightly-controlled safe setting.) The public must be able to see, and debate, the specification of any dataset that will be used for strategic decision making.
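As a rough illustration of what “aggregated, with small numbers held back” can mean in practice, here is a minimal sketch. The record layout, the function name and the threshold of 5 are our own assumptions for the example, not a published rule, and withholding small cells is only one part of a full disclosure control process.

```python
from collections import Counter

# Hypothetical threshold: cells with fewer events than this are withheld.
SMALL_NUMBER_THRESHOLD = 5


def aggregate_with_suppression(records, threshold=SMALL_NUMBER_THRESHOLD):
    """Count events per (provider, category) cell and suppress small cells.

    `records` is an iterable of (provider, category) pairs. Suppressed
    cells are returned as None rather than as a number.
    """
    counts = Counter(records)
    return {
        cell: (n if n >= threshold else None)
        for cell, n in counts.items()
    }


if __name__ == "__main__":
    # Invented data: seven events at one provider, two at another.
    sample = [("Provider A", "elbow injury")] * 7 + [("Provider B", "elbow injury")] * 2
    for cell, n in aggregate_with_suppression(sample).items():
        print(cell, "suppressed" if n is None else n)
```

The published table contains only counts; the individual-level rows that produced them never leave the safe setting.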

While the research process involved in the design and testing of these datasets may need access to consented individual-level data, such as should be possible with data in the new Secure Data Facility, the use of aggregated counts as the basis for decisions, rather than individual-level detail, would remove many of the problems NHS England still claims will befall GP practices where 12% or more of the patients have already opted out of its ill-conceived, zombie data grab.

NHS Improvement could be a good place for these conversations to take place, if it steps up several gears. NHS England could even have a seat at the table – so long as NHS Improvement convenes and manages the process of defining these new aggregated measurement datasets, of which, given the dearth of them, there will probably need to be a fair few.

The process could be designed to ensure that care providers can have measures they feel accurately reflect good care, NHS England gets the evidence base it needs to justify decisions, and HSCIC can focus on the vital implementation issues – such as feasibility, assurance and process.

Preventing a repeat of the Prime Minister’s Challenge Fund debacle would appear to require such a venue; NHS England has proved itself institutionally incapable of being a trusted broker, and HSCIC has other roles. A correctly constituted NHS Improvement, appropriately staffed and resourced, could provide a venue to help ensure the outcome: “High quality care for all, now and for future generations”.

It could also help with another problem

In much the same way as the DWP requires health assessments by its own staff, rather than trusting the assessments of NHS care providers, and the way HMRC trusts nothing it didn’t confirm itself, an underlying cause of many problems in the NHS is quite easy to define: NHS bodies simply don’t trust other NHS bodies.

This is why bean counters in a CCG want detailed medical records of all “high cost” patients. Or indeed believe, in spite of Caldicott2, they should have access to individual-level medical records.

Multiple interlocking but discrete datasets, properly designed and produced as above, can show up the various “tricks” that get used to move people out of one column into another – “massaging” the figures – a practice that certainly should be measured. And acted upon by someone independent.

If an NHS organisation believes statistics being provided are fraudulent, then that’s a question for NHS Protect, rather than CCGs thinking they can investigate themselves. Integrity on process can be provided by HSCIC working on collation and process (SUS and GPES already do this for hospitals and GPs).

For NHS Improvement, ask the patients?

Though it has positive potential, NHS Improvement also has the potential to become yet another arcane and somewhat obscure NHS body. Yet one of the groups who understand a great deal about what might provide disproportionate improvements within the NHS is that chronically underrepresented group who use it every day: patients.

While NHS England continues to have its own political priorities and funding considerations, when HSCIC is telling patients what did actually happen to their data, patients can (also) feed back to NHS Improvement what they believe should have happened – a genuine partnership in improvement.

Consensual, Safe and Transparent Sharing of Medical Records Along Care Pathways

medConfidential notes the various calls for medical records for patients’ direct care to flow with patients along care pathways as a priority, following consent for treatment – and the new (or pending) legal requirement that the NHS number be the mandatory identifier.

Both of these are generating some levels of patient concern. However, both can be implemented in a manner which enhances trust, rather than risking it further.

Reporting to HSCIC that a particular NHS number has entered an organisation for care, and whether this was via a ‘handover’ of electronic records or through some other means (e.g. non-electronic referral, for example from A&E – or if there was some form of electronic handover failure) would begin to assuage a range of concerns. HSCIC could also then publish aggregated statistics for each pair of providers, to show how the different types of record handoffs (successful, failed, or other-manual) had worked, with the aim of increasing successful handling of electronic records for direct care along a pathway.
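The per-pair statistics described above could be produced along the following lines. This is a sketch under our own assumptions: the event tuples, provider names and function name are hypothetical, and the three outcome labels are simply the categories named in the paragraph.

```python
from collections import defaultdict

# Outcome labels taken from the three categories named in the text.
OUTCOMES = ("successful", "failed", "other-manual")


def handoff_statistics(events):
    """Aggregate handoff events into counts per (sender, receiver) pair.

    `events` is an iterable of (sender, receiver, outcome) tuples. No NHS
    number appears in the output, only counts per pair of providers.
    """
    stats = defaultdict(lambda: dict.fromkeys(OUTCOMES, 0))
    for sender, receiver, outcome in events:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        stats[(sender, receiver)][outcome] += 1
    return dict(stats)


if __name__ == "__main__":
    # Invented events between a hypothetical GP practice and hospital.
    events = [
        ("GP practice 1", "Hospital 1", "successful"),
        ("GP practice 1", "Hospital 1", "successful"),
        ("GP practice 1", "Hospital 1", "failed"),
    ]
    for pair, counts in handoff_statistics(events).items():
        print(pair, counts)
```

A pair with an unusually high share of failed or manual handoffs is then visible as an outlier without anyone needing to look at an individual record.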

For providers receiving data on a care pathway, a figure could be provided of the number or percentage of patients who had refused consent for their medical records to be handed across electronically to/from that provider, but who consented to care. There will be a range of issues around this, e.g. Mental Health records being restricted – and where there are ‘outliers’ for a particular provider or flow (either due to technical issues, or because of consent choices) these will need to be addressed through a transparent process.

For patients, HSCIC should then be able to report to each person individually, via their Personalised Data Usage Report, everywhere their NHS number (and associated data) has been passed. As patients can learn exactly what does happen to their records, and why – and that it is the norm for this to happen without incident – this will contribute to a tendency towards increasing trust around the handling of records.

This process should be systematic, automatic, accurate and, over time, complete.

Additionally, as the expectation becomes that records do flow, patients will be able to see where this flow hasn’t happened (in addition to potentially experiencing the effects) and can raise questions – which is entirely appropriate if, as is asserted, sharing of medical records along a care pathway for direct care will improve outcomes. It is far more important to patient care and safety to know and correct flows for direct care where they aren’t happening as they should than it is to know the data and flows for secondary use.

We emphasise the distinction between direct care – in effect, data sharing with implied consent between medical professionals who interact with and provide treatment to a patient – and secondary uses, which cannot presume consent, and for which patients have a right to opt out.


To illustrate this with a recent example: there are very few reasons to dispute or object to medical records being used for direct (“integrated”) care in, say, a meeting held between and run by medical professionals with a duty of care for a particular patient with complex needs, to devise a specific care plan for that patient. But a secondary use of that same information would be a meeting run by an accountant looking for ways to manage the impact of a “high cost” individual.

It is entirely up to the system to transparently describe and discuss the difference, and it is the public knowledge that this will be examined which helps keep the system honest. And therefore trustworthy.

Health and Transport along Data’s Cockup Boulevard

One of the things about data releases is that there are cockups. Even if we accept your argument that you’d never screw it up, what about the people who follow you, and the people who follow them? Or your predecessor?

In medConfidential’s usual health arena, those cockups tend to be cognitively uncomfortable, or include difficult tradeoffs, as do many decisions to do with people’s health. However, down the road at the Department for Transport, they have examples that have similar potential effects, but that are easier to talk about at parties.

Everyone knows what a train is and, while trains do crash, we have some idea of just how rare that actually is, and get on them daily anyway. For that reason, the examples in this blog post will look at transport, rather than health.

Finding your way to cockup boulevard

Our friends at the UK Anonymisation Network recently published a presentation on the process of anonymisation – mostly looking at the process that organisations should go through. (While the presentation was published in the context of open data, the rules apply for any data.) Full details are in the presentation and its accompanying documents – for the purposes of this post, the description and process in Section 2 is pretty good, within some constraints:

  • Describe your data situation
  • Know your data
  • Understand the use case
  • Understand the legal issues
  • Understand the issue of consent and your ethical obligations
  • Identify the processes you will need to assess disclosure risk
  • Identify the disclosure control processes that are relevant to your situation
  • Identify who your stakeholders are and plan how you will communicate
  • Plan what happens next after you have shared or released data
  • Plan what you will do if things go wrong

The last point is the kicker; this is hard. What happens when you cock it up? Or, if not you, your successor’s successor, who has less of an understanding of what the words actually mean than you do?

The whole process relies on those following the process having an understanding of not only what they’re doing, but the wider data environment in which they are operating. For many organisations, there is a fundamental denial of anything that’s even just outside their narrow silo, let alone the wider “environment”, and that’s going to get messy.

It doesn’t matter how good your SDC process is if you don’t care about the world as it is, rather than just how it would be convenient for it to be. Data, once released, cannot be un-released. Future releases may be stopped (with resultant damage to confidence in the data environment), however, the existing releases will still have been released. Under an Open Data License – which is necessary for arbitrary reuse – it is particularly difficult to get them back.

Some of these will be pure accidents.

Take as an example Transport for London, who run the “Boris bike” hire scheme, and who publish details of cycle hires – from where to where, and when. Data that produces many of the pretty cycle hire maps you see.

The data published should be “a row identifier, the length of hire, the start time/date, a Bike ID, the Start Location, and the End Location”, thus:

Rental Id, Duration, Bike Id, End Date, EndStation Id, EndStation Name, Start Date, StartStation Id, StartStation Name
18884041,271,4313,02/01/2013 13:32,251,"Brushfield Street, Liverpool Street",02/01/2013 13:28,509,"Fore Street, Guildhall"

A significant amount of public benefit can come from such data being available; many different analyses have been done.

Sometimes the choice to release is deliberate. (The release of New York taxi trip data was a deliberate, if ill-considered, act.) But at some point last year, someone at Transport for London just made a mistake.

For a couple of months, TfL accidentally included the “hire key” ID, which is the identifier of the person who hired the bike. As such, it was possible to derive sensitive details using other data known about the various trips of individuals.
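To see why one extra column matters so much, consider this sketch. The rows are invented, the layout is simplified, and the “Hire Key” column name is our assumption about how such a field might appear; the point is only that any per-person key, whatever it looks like, lets trips be grouped into one person’s travel history.

```python
import csv
import io
from collections import defaultdict

# Invented rows in a simplified layout, plus a hypothetical "Hire Key" column.
SAMPLE = '''Rental Id,Duration,Bike Id,End Date,EndStation Name,Start Date,StartStation Name,Hire Key
1,271,4313,02/01/2013 13:32,Station B,02/01/2013 13:28,Station A,K1
2,600,1200,02/01/2013 18:10,Station A,02/01/2013 18:00,Station B,K1
3,300,4313,03/01/2013 09:05,Station C,03/01/2013 09:00,Station B,K2
'''


def trips_by_key(csv_text, key_column="Hire Key"):
    """Group trips by the per-person key, giving one travel history per hirer."""
    trips = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        trips[row[key_column]].append(
            (row["Start Date"], row["StartStation Name"], row["EndStation Name"])
        )
    return dict(trips)


if __name__ == "__main__":
    for key, history in trips_by_key(SAMPLE).items():
        print(key, history)
```

Without the key, each row is an isolated journey. With it, anyone who knows a single trip a person made (they were seen docking a bike, say) can read off the rest.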

Avoiding cockup boulevard altogether

Whether deliberate or accidental, such issues come from fundamental category errors. We see this a lot – such as people perceiving linked achievement data as a dataset about schools and teachers, without appreciating the crucial significance of it containing the life experiences of children. Some projects see doctors and nurses – people who, when they were aged about 13, decided to spend their life helping people – and consider that an exploitable resource for acquiring nice things.

It will become increasingly common to wrap such things in the banner of “data”, and claim the magic pixie dust will solve all. How likely is it that such category errors will be nowhere within your organisation, and never occur? Especially in a political bureaucracy where you have powerful individuals “masterminding” a programme without regard to the details?

It’s a good thing that the UKAN assessment process has cockup sections one and two.

What is Open Data?

Open data is data published for all to use, with no limit on purpose – which is why personal data cannot ever be open data, except for matters of public record (i.e. some legally-mandated details about people who have power or influence over others’ lives). When aggregated and properly treated, fully anonymised results about people – statistics – can and should be open data. However, any failure to follow a full and complete statistically valid process means you are actually publishing personal data.

In ethical practice, the only entity who can publish rich, detailed personal data on an individual is that individual themselves. It can only ever be something someone does themselves, and not something people do to them.

And broad, open-ended ‘consent’ just won’t cut it. Even if you get someone’s permission for a bunch of the good stuff you imagine doing with their data, it’ll be the bad stuff you haven’t thought of that someone else does that’ll screw you. And the people whose personal data you published. Depending on circumstances, this could be downright abusive or worse.

I may choose to post photos of my meals to instagram; someone I don’t know choosing to post all my meals to instagram is just creepy.

 

P.S. Good luck to Mike Bracken and Tom Steinberg in their future endeavours.

medConfidential update, 21 March 2015

This is just a brief update; we hope to have more substantive (good) news soon, but something else we think you should know about is happening and we wanted to give you the heads-up.

Urgent action – your health data and beyond

While the Government and NHS England still refuse to rule out the commercial re-use of your medical information, their commercial cronies have lobbied the Office for National Statistics to consult on commercial, speculative and secret access to the unprotected data that ONS holds.

This “microdata” is highly sensitive, much of it personal data – which is why the ONS has had to keep it so tightly under lock and key. This isn’t your medical record, but it’s everything else the Government has, including the census and Health Survey; it’s all but your name.

With a general election in the offing and the budget this week, no-one else seems to have noticed. But where does the bulk of the data that the budget depends on come from? That’s right, ONS – and confidential business data is included in these proposals too.

Please act now. With just one week to go before the consultation closes, you can:

  1. Sign the open letter opposing the proposals – it’ll just take a minute
  2. Tell your friends – more information at www.AllButNames.com
  3. Fill in a longer response via the ONS website

There may be just a few of them but, as statisticians can count, your voice really matters.

medConfidential’s attention was drawn to this issue by Methods Insight Analytics’ breach of conditions for using ONS linked data sold by HSCIC last summer. It appears some private companies would rather change fundamental ONS principles than their own business models.

Has nothing been learned from the care.data fiasco? Allowing commercial access to highly detailed, sensitive information for private profit undermines both trust and the public good. Selling access to ONS microdata may make peanuts for companies and their shareholders, compared to the very real damage to public confidence in our National Statistics that will come from these proposals.

 

What’s happening with care.data?

We’d love to be able to tell you what’s going on with the care.data pathfinders but, depending on who’s asked, they’re both going ahead and not before the election… and now NHS England won’t say either way.

It has been clear for some time that data extractions won’t take place “before the autumn”, but that’s not quite the point. The question is when patients will start being written to, what they’ll be told, and whether it’s actually true.

Though the headlines talk about a delay, when pressed, “Mr Kelsey told HSJ that while the extraction would not take place before the election, pathfinders would send out communications around the data extraction and linkage programme.”

As The Register reports, Tim Kelsey repeated this intention to Roger Godsiff MP, who was prompted to lay an Early Day Motion this Monday.

We sincerely hope that NHS England will do the right thing, and postpone sending anything out to patients in the pathfinders until after the election. Too many questions are still unanswered, and critical elements – such as the CAG regulations, new Directions and fixing the ‘Type 2’ opt-out error* – are still not in place.

Proceeding now, so close to the election, could be seen as an attempt by this Government to constrain the next. And, as Shadow Cabinet Office Minister Chi Onwurah has said: “I think if we have another care.data, then the public sector is not going to want to touch data, whether it is open or shared and that is a real danger.”

* We understand HSCIC is working on a solution to the issue they have taken responsibility for, that will honour your choices and not affect your direct care. We will let you know as soon as anything public is announced, but this is unlikely to be until after the election.

 

 

medConfidential response to NHS England response to Sky News NHS security story and research by the Oxford Internet Institute

NHS England is still trying to justify in 2015 what it tried to sneak through in 2013. Has it learnt nothing?

Disclosure: Sam Smith of medConfidential sits on the Privacy Advisory Group for the Office for National Statistics’ (census replacement) Beyond 2011 & Big Data programmes, of which the expert academic at the Oxford Internet Institute interviewed by Sky News is also a member.

 

Does the database exist?

NHS England: “firstly, there is no database of information for the care.data programme yet”
NHS England: “confirmed that pilot schemes are starting again”
NHS England: “To access the data collected as part of care.data, applicants will need to…”

NHS England itself acknowledges, on a page named “our plans”: “for example, the hospital episode statistics (HES) service has been collating administrative information since the 1980s about every hospital admission funded by the NHS.”

So there are existing databases which are vulnerable to these problems and a new database is being built, it’s just not been built yet. (The ‘new’ specification in 2015 appears to be the same care.data specification from 2013 – with various ‘mistakes’ covering HIV, HPV, and AIDS codes corrected.)

Aspects of the existing data services are as concerning, if not more so, than the care.data proposals.

 

A statement and briefing were provided to Sky by NHS England ahead of broadcast

On Thursday evening, NHS England contacted medConfidential, having seen our tweet, to say they had commented to Sky News. But, as of Monday, the Sky News piece still contained no attributed quote or statement from NHS England. It has a quote from the programme director at HSCIC, not NHS England.

We don’t know the ins and outs of exactly who said what to whom, and when, but, yet again, it seems that NHS England is hiding behind another government body – the Health and Social Care Information Centre – to provide justifications that do not speak to the full consequences of its own future proposals.

HSCIC is a “creature of statute”, a body which in law may only do things as Directed, including by NHS England. NHS England is the puppeteer cowering behind the curtain, insisting the puppet’s the one at fault.

 

“this would be a criminal offence”

While ‘hacking’ into a database of medical information would indeed be a criminal offence, it is rather beside the point. It’s the ‘Hollywood scenario’ of a remote attacker defeating NHS England’s defences with cunning from their back bedroom, or North Korean data terrorists launching an attack.

What is far more relevant is that copies of the data (HES, etc.) have been sold [1] to a whole range of organisations and companies, many of which continue to receive data. And there are no criminal sanctions for misuse of the data by the recipients or data breaches, which – despite previous denials [2] – we now know there have been [3].

NHS England is quite clear that confidential data is already being sent to places: “confidential data is always encrypted whilst in transmission and the secure networks used to transfer data are regularly tested and monitored for any vulnerabilities”. (Unless David Cameron succeeds in outlawing it, as he proposed last week.)

In the case of the Sky News piece, the researcher acted entirely ethically and correctly in using the information provided by the journalist – who had given full and informed consent, and was clearly aware of the risks. Those who would rather continue the status quo and placate, rather than inform, the public are less likely to explain all of the risks and mitigations to a journalist. And highly selective ‘explanations’ do not give the full picture.

Given the continuing distribution of 25 years of hospital records – over 1 billion dated events – this research identifies both the grave risk to the medical privacy of the country, and the continued wilful ignorance of NHS England.


1) On a “cost recovery” basis.
2) On BBC Radio 4’s Today programme, 4 February 2014, Tim Kelsey claimed “in 25 years there has never been a single episode in which the rules… have ever compromised a patient’s privacy.”
3) HSCIC’s FOI response on 7 April 2014 lists a data breach in every year from 2009 to 2012; HSCIC holds no records from before it was formed in 2005.

 

Where does the data go?

NHS England: “To access the data collected as part of care.data, applicants will need to go through an approvals process and then, during the pathfinder stage, can only see it in a secure data facility (SDF). During pathfinder stage, access applications will only be accepted from select organisations and there is a robust security procedure in place when the applicant visits the SDF.” [our emphasis]

The crucial point being, what about after the pathfinder stage? Where will applicants be able to “see” the data then?

Will NHS England revert to current practice, as for HES and other data, and permit copies of the data to be sent out? There’s little point constructing a “secure data facility” if it is not then used for all future access to the data.

If all NHS England will promise is to keep patients’ data in the SDF “during the pathfinder stage” then it is just a temporary safeguard, which can be removed for the full national roll-out.

So why won’t NHS England promise that patients’ data will always be kept in the secure data facility? It clearly wants to keep its options open – but if the intention is for data to be accessed in other ways in future, why aren’t patients and GPs being told? Given NHS England’s track record of miscommunication, trumpeting what actually amounts to a tightly time-limited conditional safeguard does very little to inspire confidence.

 

“NHS to carry on selling patient records to insurers” – Telegraph, 27 November 2014

NHS England: “credit rating agencies or health insurers would not be granted access to the NHS’ secure data facility where the information will be held.”

This may sound pretty definite, but can NHS England cite the precise part of legislation which provides the same level of certainty as that statement? We doubt it, because it has never previously been able to do so. NHS England argues the claim on the Telegraph front page was false, but has never provided any evidence to support its assertions. And we’ve asked, repeatedly.

In fact, the law remains mute on the types of companies that may have access to the data – it concentrates on uses – and the undefined phrase “for the promotion of health” leaves open loopholes for data access that even McDonald’s or Big Tobacco might use. (Regulations that might begin to address this, for the Care Act passed in May, are still unpublished.)

 

Misunderstanding the ‘birthday attack’

PharmaTimes: “NHS England said the suggestion by Sky is incorrect, saying the likelihood of being able to identify an individual ‘is negligible’”

NHS England is again misleading the public.

As an analogy, if you consider a classroom and pick two children at random it is highly unlikely – 1 in 133,225 (i.e. 365 x 365) – that they will both have a specific birthday. But if you walk into that same classroom of 23 children or more and ask “Do two of you share a birthday?” then the chances are better than 50-50 that the answer is yes.
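Both figures in the analogy can be checked with a few lines of arithmetic. This sketch assumes 365 equally likely birthdays and ignores leap years, as the analogy does; the function names are ours.

```python
from fractions import Fraction

DAYS = 365  # ignores leap years and assumes every birthday is equally likely


def both_have_specific_birthday():
    """Chance that two randomly chosen children both have one named birthday."""
    return Fraction(1, DAYS) ** 2


def any_shared_birthday(n):
    """Chance that at least two of n children share a birthday."""
    p_all_distinct = Fraction(1)
    for i in range(n):
        p_all_distinct *= Fraction(DAYS - i, DAYS)
    return 1 - p_all_distinct


if __name__ == "__main__":
    print(both_have_specific_birthday())   # 1/133225
    print(float(any_shared_birthday(22)))  # just under one half
    print(float(any_shared_birthday(23)))  # just over one half
```

The difference between the two questions is the whole point: the first asks about one specific match, the second about any match at all among every possible pair.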

Example 1: Know someone who had a heart attack?

Presume someone you know has had a heart attack.

NHS England has 181 A&E departments [4] handling England’s 386 heart attacks per day [5], so each A&E receives, on average, 2 heart attack victims per day. Which, even without any other information, gives a 50% probability of spontaneous identification of a victim whose hospital and date of event are known (neither should be sensitive on their own). As the OII research into the Sky News journalist argued, that is information that gets tweeted, as it is ‘not sensitive’.
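The arithmetic behind those figures is short enough to write out. The two inputs come from footnotes 4 and 5; treating heart attacks as evenly spread across departments and days is a simplifying assumption, and the function name is ours.

```python
HEART_ATTACKS_PER_YEAR = 141_000  # England, from footnote 5
AE_DEPARTMENTS = 181              # from footnote 4


def heart_attack_estimate():
    """Return (per day, per department per day, chance of a blind guess)."""
    per_day = HEART_ATTACKS_PER_YEAR / 365
    per_department = per_day / AE_DEPARTMENTS
    # With about two candidate records for a known hospital and date,
    # a blind guess picks the right one about half the time.
    chance_of_guess = 1 / round(per_department)
    return per_day, per_department, chance_of_guess


if __name__ == "__main__":
    per_day, per_department, chance = heart_attack_estimate()
    print(round(per_day), round(per_department, 1), chance)
```

Any further detail, such as age or sex, narrows the two candidates down further.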

Because the data is linked over time – ‘longitudinal’, to use the proper statistical term – discovery of a single medical event would mean you can use that pseudonym to link back to all of that person’s other medical events, because “the pseudonym is allocated to the record instead” (NHS England).

It doesn’t matter what the pseudonym is or what form it takes, what matters is that it links the records. The information associated with the date of the event is what gives you the link to a victim, not the NHS number or pseudonym.
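A minimal sketch makes the point concrete. The records below are invented, and the keyed hash is only a stand-in for whatever pseudonymisation is really used (an assumption on our part); notice that the matching function never needs the key or the original identifier.

```python
import hashlib
from collections import defaultdict


def pseudonymise(identifier, secret="hypothetical-secret"):
    """Stand-in pseudonym: a truncated keyed hash (an assumption, not the real scheme)."""
    return hashlib.sha256((secret + identifier).encode("utf-8")).hexdigest()[:12]


def histories_matching(events, hospital, date):
    """Return the full history of every pseudonym with an event at `hospital` on `date`."""
    by_pseudonym = defaultdict(list)
    for event in events:
        by_pseudonym[event["pseudonym"]].append(event)
    return {
        pseudonym: history
        for pseudonym, history in by_pseudonym.items()
        if any(e["hospital"] == hospital and e["date"] == date for e in history)
    }


# Invented records: two patients, identifiers replaced by pseudonyms.
EVENTS = [
    {"pseudonym": pseudonymise("patient-1"), "hospital": "H1", "date": "2014-02-03", "event": "heart attack"},
    {"pseudonym": pseudonymise("patient-1"), "hospital": "H2", "date": "2009-06-11", "event": "fracture"},
    {"pseudonym": pseudonymise("patient-2"), "hospital": "H1", "date": "2014-02-04", "event": "heart attack"},
]

if __name__ == "__main__":
    # Someone who knows only the hospital and the date of one event:
    for pseudonym, history in histories_matching(EVENTS, "H1", "2014-02-03").items():
        print(pseudonym, [e["event"] for e in history])
```

Swap the hash for any other consistent scheme and the result is identical, because the attack uses the pseudonym only as a join key.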

NHS England is therefore being disingenuous when it says “once a patient’s record has been matched, the information that could identify a patient is removed and the pseudonym is allocated to the record instead” and that pseudonyms can be converted back to the original identifier “only by using the specific encryption key that created the pseudonym” and this is “only ever disclosed in very exceptional circumstances”.

Of course NHS England does not disclose the original identifier (NHS number). The key point that the researcher made, and that NHS England missed or continues to wilfully ignore, is that this is completely irrelevant.

And it shows that NHS England has learnt nothing from the concerns of the last year.

In February 2014, David Davis MP argued that knowing the dates he had his nose broken (due to media attention) would mean his entire medical record could be identified. NHS England has never refuted this argument with substance.


4) DH count. See https://www.whatdotheyknow.com/request/131933/response/325271/attach/3/Annex%20A%20Final.pdf 
5) 141,000 per year in England: https://www.bhf.org.uk/publications/statistics/cardiovascular-disease-statistics-2014

Example 2: Women with children

NHS England seems to believe that your children’s birthdays are secret.

For example, by the HSCIC’s own rules, in HES the date and code for “Birth date – baby” is deemed identifiable, but the date and code for “maternity: where the baby was delivered” is not [6]. These are the same event, stored twice, but treated as if they are entirely different. Removing only one of them does not magically turn HES into non-personal data, and HES contains dozens – if not hundreds – of such fields.

Similarly, a family is identifiable by knowing the birthdays of the children. For a family of 2 children, there is a 90% likelihood that the birthdays of the two children are unique. For a family with 3 children, the children’s birth dates are almost certainly a unique identifier for that family in the country, tracked via the mother’s medical history.

On average, one set of twins is born in each maternity hospital in the UK per day. There are just 208 triplets born in the UK per year, i.e. fewer than one per day. If you know the birthdate of a triplet you could therefore read off the entire medical history of the mother via that single event.


6) For a single illustrative example, see HSCIC HES inpatient data dictionary, page 11, field: admimeth (and many, many others). This is only one method of delivery, others are equivalent.

Example 3: Who gets chemotherapy?

NHS England repeatedly argues that its care.data programme is necessary because “the NHS isn’t capable, currently, of telling you how many patients are undergoing chemotherapy, for example”.

In fact, the vast majority of chemotherapy is delivered in secondary, not primary care. Extracting data from GPs’ systems would provide no more information than is (or should already be) gathered from the actual providers. If you want to know who is receiving treatment, the most sensible choice is to go to the source of the treatment.

And to count the number of people, it is simply not necessary to know who they are – a count of unique identifiers is enough. NHS England is mandating the use of NHS numbers by care providers, and that mandate is in the process of being passed into law.

To count people, you need to know only that you’re counting non-duplicate entities. It does not matter whether you use names, physical people or their pseudonyms (e.g. telephone number, NHS number, or an arbitrary pseudonym).
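As a minimal sketch of that point, the count below is taken over keyed pseudonyms rather than over names. The record layout, the key and the NHS numbers are all invented for illustration, and HMAC-SHA256 is simply one reasonable way to derive a stable pseudonym, not a description of what any NHS body actually does.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable keyed pseudonym (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def count_patients(records, key: bytes) -> int:
    """Count distinct people by counting distinct pseudonyms."""
    return len({pseudonymise(r["nhs_number"], key) for r in records})


# Hypothetical key and records; the NHS numbers are deliberately fake.
KEY = b"illustrative-secret-held-by-the-provider"
records = [
    {"nhs_number": "0000000001", "treatment": "chemotherapy"},
    {"nhs_number": "0000000002", "treatment": "chemotherapy"},
    {"nhs_number": "0000000001", "treatment": "chemotherapy"},  # same patient, second cycle
    {"nhs_number": "0000000003", "treatment": "chemotherapy"},
]
print(count_patients(records, KEY))  # prints 3
```

The same person always maps to the same pseudonym, so repeat treatments are not double counted, and nobody doing the counting needs to see a name.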

Worked example 4: Don’t get into an accident

Relatively minor medical events involving people in the public eye are often reported. How many women of a particular age attended a particular hospital with an elbow injury on the day in 2010 that Nick Clegg’s wife broke her elbow, just before the general election? [7] Even the most private of individuals can find themselves in the newspaper due to an accident.

Standard journalistic practice means that accidents reported in the local press will include the date of the event, a person’s name and age, along with the area of town – in some cases even the road – where the victim lives. Such reports usually provide enough information for an informed guess at likely diagnoses, which can then be matched with a particular incident. (With regard to example 2, the same would be true of someone announcing the birth of their triplets on Twitter or Facebook.)

An experiment by Professor Latanya Sweeney of the Data Privacy Lab at Harvard starkly demonstrates the risks of matching within ‘de-identified’ data, i.e. data where some identifiers have been removed, rather than being replaced by pseudonyms.

Taking the US equivalent of HES – de-identified public hospital records for a state – and using articles in local news reports giving an indication of types of injury, her team was able to confirm that merely by being involved in an incident where you were taken to hospital, it was routinely possible to match to the victim’s entire hospital history, and discover details that even the patient had not told the hospital directly, but which had been discovered from their medical profile.

When contacted by the project, patients were horrified to find they could be identified and have their medical history exposed from the data made available.


7) https://www.google.com/search?q=nick+clegg+wife+election+elbow+broken

 

Pseudonyms

Identification isn’t just about finding someone’s name; it’s about linking an individual’s data records together so that you can learn things about them. If I know your home address, gender, date of birth, hair colour, eye colour, weight and telephone number, it doesn’t matter how many characters are in your database’s pseudonym – what matters is that you, and your data, can be (re)identified.
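A toy sketch of that kind of matching might look like this. The rows, field names and pseudonyms are entirely invented; the point is only that the filter never looks at the pseudonym column at all.

```python
def matching_rows(dataset, known):
    """Return the rows that agree with every attribute the searcher already knows."""
    return [
        row for row in dataset
        if all(row.get(field) == value for field, value in known.items())
    ]


# Hypothetical pseudonymised extract: no names, only a pseudonym per row.
dataset = [
    {"pseudonym": "P-7f3a", "birth_year": 1975, "area": "LS6", "event_date": "2010-04-18", "event": "elbow fracture"},
    {"pseudonym": "P-19c2", "birth_year": 1975, "area": "LS6", "event_date": "2010-04-18", "event": "asthma review"},
    {"pseudonym": "P-b804", "birth_year": 1982, "area": "LS6", "event_date": "2010-04-18", "event": "elbow fracture"},
    {"pseudonym": "P-e55d", "birth_year": 1975, "area": "BB1", "event_date": "2010-04-18", "event": "elbow fracture"},
]

# Details of the sort a local news report might give (assumed for illustration).
known = {"birth_year": 1975, "area": "LS6", "event_date": "2010-04-18", "event": "elbow fracture"}

matches = matching_rows(dataset, known)
print(len(matches), matches[0]["pseudonym"])
```

Once a single row matches, its pseudonym becomes the key to every other record stored under it, however long or well encrypted that pseudonym is.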

NHS England’s argument is bureaucratic obfuscation. It’s like saying that having a phone number doesn’t tell you who someone is and then blaming the patient for answering the phone with their name.

Or in another analogy, it’s the sort of approach that insists you have to know the name of the bug that bit you in order for it to matter. We don’t have many small poisonous bugs in England, but other places do. Small creatures have many names; they have their Latin classification, they have names in English, and in local areas they have names in local languages, etc. In short, they have many pseudonyms – but it’s all the same bug.

If you’re bitten by a poisonous bug, the sensible medical approach doesn’t care about its actual name but rather, by asking questions about its attributes – what colour was it? was it spotty or stripy? how many legs? any wings? – the care provider can work out the appropriate treatment. The name really doesn’t matter; what you care about is the antidote, a name you will care about far, far more! At best, whatever the bug is called may be a link between looking it up and how you cure the bite – but you really don’t need the name.

Attempting to make this all about pseudonyms seriously misses the point. The real problem is the linked individual-level data that the NHS has treated so egregiously badly in the past and which, on this argument, NHS England appears to want to go on treating in the same way.

In 1989 this was all new, and difficult. In 2015, there are no excuses.

 

In summary

NHS England’s scenario: “In the extremely unlikely event an individual was able to ‘hack’ the system, they would need the encryption key to convert back the coding” is a diversion.

The point is not that one can infer an individual’s identity from the linking pseudonym – taking the “100 character” pseudonym to “convert back the coding” – it’s that there is so much other data in the file that you don’t have to.

As detailed above, in the ‘Hollywood Scenario’ the chances of someone arbitrarily picking a row in a dataset and knowing who it is are slim. But, as PharmaTimes suggests, that’s the imaginary plotline for a movie, not real world protection of patients.

Can NHS England tell the difference? We suggest they listen to the experts who can.

NHS England has given no assurances regarding the dissemination of this rich, dated, linked data beyond the ‘pathfinder’ stage of care.data. As the researcher at OII and our by no means exhaustive examples show, there are many ways to identify people’s medical records in such individual-level data using other widely available information – regardless of whether it has been pseudonymised (or de-identified).

That NHS England continues to try to mislead the public on this fundamental point in 2015 suggests the “pause” it took to “listen and understand” public concerns throughout 2014 was not enough. Continuing to hold onto and propagate the fantasy that pseudonymisation makes the possibility of re-identification “negligible” is either naïve or incompetent.

We’re not quite sure which is worse.

Early January Update

IIGOP Annual Report

Following its care.data report at the end of last year, the 2014 Annual Report of Dame Fiona Caldicott’s Independent Information Governance Oversight Panel (IIGOP) was published in early January. Amongst other things, it says:

In summary, the goal should be a state of information governance in which the following proposition prevails: “Organisations have no hiding places, the public have no surprises.”

But with good progress having been made on just six of the year-long Caldicott2 Review’s 26 recommendations, the IIGOP is forced to conclude:

Unfortunately the cultural change that we called for [in 2013] in relation to information governance has only emerged in parts of the system.

The annual report goes into some detail on care.data in Chapter 3, noting:

The unintended consequence of care.data was a positive cycle of change, with greater public interest causing organisations to respond with greater transparency and stronger information governance.

But, worryingly, on consent across the health and care system:

IIGOP welcomes the Secretary of State’s enhancement of the “right to object” in the care.data programme, but calls for a more consistent approach. It is not reasonable to expect the public to understand objections and “opt outs” if there are different rules for different programmes. This remains unfinished business.

Over the next few weeks, we will see whether the Government and NHS England are moving towards that goal – or whether they’ve been hiding more surprises for the public later in the year.

Meanwhile, Healthwatch England “found disturbing evidence of the harm caused by failure to share information appropriately. The inquiry focused on the experiences of older people, people with mental health conditions and people who are homeless.”

The findings, summarised on pages 17 and 18 of the annual report, are especially horrifying due to the impacts on the direct care of patients – an opportunity cost of the care.data programme:

Public opinion research has shown that most patients want any healthcare professional who treats them to have secure electronic access to key data from their GP health record. Most were surprised that emergency care doctors do not have automatic access to records, and concerned that lack of access may lead to delays in treatment and fatal errors. The public’s main concerns about the use of information about them were suspicions around usage creep, lack of personal benefits and loss of data.

As medConfidential has always said, there need be no conflict between good ethics, good data handling and good medical care.

A Statutory Data Guardian?

We had hoped that, as the Secretary of State said would happen, the National Data Guardian – providing independent, overarching information oversight for the entire health and care system – would be put on a statutory footing “at the earliest opportunity”. That opportunity was last Friday, but the Secretary of State failed to meet his commitment.

As we now discover from the IIGOP’s Annual Report, this is just one example of what happened without a strong oversight body:

NHS England communicated the proposal in a leaflet that was supposed to be delivered to all homes across England in January 2014. A copy of the intended leaflet was sent to IIGOP shortly before the quarterly meeting of the panel on 9th December 2013. On the following day IIGOP advised NHS England that its leaflet was not fit for purpose, but was informed that it had already been sent to the printers and would not be recalled.

Last Friday, Jeremy Lefroy’s Private Members’ Bill reached its final stage in the House of Commons, and has now moved on to the Lords. When the NHS Number is used beyond the NHS, its wider use as a lifelong identifier for every person in the UK will also never be recalled. We wrote a briefing on this issue when it first raised its head.

 

Anniversary

2015 marks 10 years since the dodgy deal between the (then) NHS Information Centre and Dr Foster Ltd – a period during which, as we now know, less-than-optimal decisions were made.

One quote in the Public Accounts Committee’s report sounds entirely familiar from the care.data fiasco a decade on:

At the outset there was an urgency to complete the deal with Dr Foster Ltd, and in negotiating the joint venture the roles and responsibilities of the Department’s advisors were sometimes confused.

With echoes of the messy “IG Universe” picture that emerged last year, and with venture capitalists that now own bits of the private sector part of Dr Foster Ltd writing down their stake and seeking an exit, we see once again that – in the long term – routing round or failing to institute and apply proper Information Governance doesn’t help anyone.

Finally, as the 12 month mark approaches, we understand the Health Select Committee will continue its inquiry into care.data and the handling of NHS patients’ records shortly. Let’s hope that this time its members will be given full and frank evidence by all.

[PRESS RELEASE] 27 fundamental areas of concern: 52 unanswered questions for NHS England on their care.data scheme

For immediate release – Thursday 18th December

The Independent Information Governance Oversight Panel (IIGOP), chaired by Dame Fiona Caldicott, published its report [1] to the care.data Programme Board this afternoon.

Responding, NHS England has welcomed Dame Fiona’s “observations and the insight it offers”, and will “discuss the report further once we have had the opportunity to speak with our colleagues in the pathfinder areas”.

The report lists 27 areas of concern for the care.data Programme Board itself, containing some 52 unanswered questions, with 7 additional tests that pathfinder CCGs must meet.

The sheer number of unanswered questions indicates just how fundamentally misconceived care.data was from its inception, and at this stage – 10 months after the programme was stopped – suggests continued mishandling by those inside the care.data bunker at NHS England.

Questions raised in February remain unanswered at Christmas. No doubt someone at NHS England will find a lump of coal under the tree when they’re at their desk next week.

Phil Booth, coordinator of medConfidential, said:

“It’s up to NHS England whether care.data in 2015 will be handled as badly as in 2014. Discussing questions to which they should already have answers with people they’ve been discussing with for months risks repeating the same failures over again. This needs a second reset [2].

“It all boils down to what will patients be told? What will actually happen? And who will make sure that all of this is true? Quite clearly Dame Fiona, and the public at large, still don’t know.”

Notes for Editors:
1) The Independent Information Governance Oversight Panel’s report to the care.data Programme Board on the care.data Pathfinder stage: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/389219/IIGOP_care.data.pdf

2) “The re-constitution of the programme board follows recommendations from the Major Project Authority’s Project Validation Review”. Chair’s notes from care.data Advisory Board meeting on the 25th June: http://www.england.nhs.uk/wp-content/uploads/2014/07/ad-grp-notes-250614.pdf

medConfidential campaigns for confidentiality and consent in health and social care, seeking to ensure that every flow of data into, across and out of the NHS and care system is consensual, safe and transparent. Founded in January 2013, medConfidential is an independent, non-partisan organisation working with patients and medics, service users and care professionals.

For further information or for immediate or future interview, please contact Phil Booth, coordinator of medConfidential – phil@medconfidential.org

– ends –

Early December update

Ahead of Tuesday afternoon’s Commons Health Select Committee session with Jeremy Hunt, we’ve published a briefing with some current questions for the Secretary of State for Health. Hopefully the Committee will get the chance to ask one of them.

As a result of the care.data Advisory Group public meeting in Manchester and recent press coverage, we have also written directly to both the Secretary of State and NHS England Chief Executive, Simon Stevens, about matters of increasing concern in NHS England’s approach to care.data. We look forward to public statements on the substantive issues we have raised, certainly before any ‘pathfinder’ is to proceed.

Last Friday, 5 December, HSCIC held another event as part of their post-Partridge Review process. The Information Centre has made a number of positive changes since the Partridge Review, and we hope this approach continues into the future. Unfortunately, HSCIC is often hampered by the decisions – or lack thereof – of NHS England, which has clearly not gone through the same level of reflection and renewal on consent and data issues since the care.data debacle earlier this year.

It remains to be seen if the Department of Health itself wishes to be more like HSCIC than NHS England. With the Secretary of State’s commitment that the role of National Data Guardian will be put on a statutory footing “at the earliest opportunity” and an amendment to Jeremy Lefroy’s Private Members’ Bill to do just that, the opportunity is there (see our Bill Committee briefing). Given weasel words that have been used before, it is actions that are required from the Secretary of State to deliver on his promising words.

We have also proposed a second clarification amendment to Jeremy Lefroy’s Bill – for a transparent register of every body authorised to make use of the NHS number – which we hope to see adopted at Committee Stage in the Commons, followed by Third Reading and all of the stages in the Lords before the election. And we note that even a draft of the Regulations to define “the promotion of health”, sanctions for misuse and the rules and operation of the Confidentiality Advisory Group has yet to be published. There is a long way to go.

Speaking of a long way to go, we have still heard very little about the Department of Health’s proposed changes around “Accredited Safe Havens”. From what we do hear, we are increasingly concerned that they may allow data to be reused in “misguided, but well-meaning” ways, by entities that would cause significant concern were they to access data they might be a little too eager to get.

This week is the first Leadership Meeting of the Department of Health’s National Information Board (NIB) since the lay members were appointed. The event will be broadcast live on Tuesday morning. While usually paid to be one half of medConfidential, Sam Smith has been appointed by the Department of Health as a lay member – “like a non-executive director” – solely in a personal capacity, and sits on the Board on that basis.

 

It’s Christmas…

We deeply appreciate every donation you give us and especially the messages you include with them, whatever the amount… £5, £50 or more. We know each donation is an expression of individual support for what we are doing and the good wishes that come along with that.

medConfidential is a tiny organisation, punching well above its weight, but to keep going we have to find around £60k per year. If you are – or know – someone who could make a substantial contribution towards our operating costs, please get in touch – coordinator@medconfidential.org.

Season’s Greetings to all – there’ll be one more update before the end of the year.

medConfidential Bulletin, 7 November 2014

What just happened?

The MP for Stafford, Jeremy Lefroy, has introduced a Private Members’ Bill that would amongst other things mandate the use of NHS numbers as “consistent identifiers” across health and social care.

We have some concerns about potential unintended consequences of the proposed legislation but believe these can be addressed at the upcoming Committee stage, to which the Bill was sent this morning. We’ll be starting to engage with specific MPs on the Committee from next week.

What is in care.data?

As NHS England begins to ramp up again towards the ‘pathfinder’ stage (see our last newsletter) the new narrative seems to be that the data to be extracted from your GP record is only “codes”. Quite aside from the fact that each item will be associated with your NHS number, date of birth, full postcode, gender and ethnicity, these codes are not secret – they are published, and even used in adverts on the sides of trains.

To help you understand the breadth of the information to be extracted under the current version of care.data, we have put together an online tool to let you search and read the diagnoses, treatments and other ‘events’ described by the codes. All the events within the care.data GP dataset will have dates attached and be linked to every other medical diagnosis you have on the dataset, or that can be inferred from your prescriptions.

Click on the link below to search or browse the information that will be extracted from your GP record under care.data:

N.B. The page may initially take a minute or so to load as it contains a significant amount of information.

Where does your data go, and why?

You should know where your medical records have gone, and why (longer version).

Whether you have opted in or out of care.data, there are a whole host of other data flows that relate both to direct care and to all the other things that happen around the NHS. You may have a Summary Care Record (SCR), and your hospital (HES) records may – or may not – be sent to various places depending on your consent where it is applied, and irrespective of your consent where it isn’t.

If you don’t know where your data has gone, there’s no way to know whether your wishes are being respected. And when there is a problem, there’s no way to know whether you personally were affected. In September, we produced an example of such a personalised data usage report [PDF] that we believe should be available to every patient.

Without a full commitment to individuals knowing where their data goes – and this must be for everyone, not just those who don’t choose to opt out – there will continue to be mistakes caused by secrecy that would be catastrophic to public trust in the handling of NHS patients’ data.

More details on data usage reports.

What next?

Though the care.data ‘pathfinder’ areas have been announced – Leeds (3 CCGs: West / North / South and East), Blackburn with Darwen CCG, West Hampshire CCG and Somerset CCG – we still don’t know which practices will be participating, and are waiting to see exactly what patients and GPs will be told.

With new Regulations and Directions still to be published, including clarification on the definition of “promotion of health” and sanctions for misuse, and with issues such as commercial re-use and access to patient data after the pathfinder stage still to be resolved, a number of crucial concerns must be addressed before the scheme moves forward.

We shall, of course, keep you updated as more information becomes available.

Meanwhile, the next Open Meeting of the care.data Advisory Group, on which medConfidential sits, will be held in central Manchester on 26 November. This will be the third in a series of public events where patients have the chance to ask questions about care.data and hear directly from NHS England. For more details or to register to attend, please visit the Open Meeting webpage.

And finally

Thank you for all your support – to those who have been sending us tip-offs and researching particular issues, to everyone involved in organising meetings and events, and to the volunteers who are helping us handle parts of the enormous workload that comes from tackling care.data and related issues on multiple fronts.

Please do pass this newsletter on to your friends and family. They can receive future editions by joining our mailing list at http://medconfidential.org/contact/

Phil Booth and Sam Smith
Coordinators, medConfidential
7th November 2014

What is a data usage report?

In short, you should know where your medical records have gone, and why.

Whether you have opted in or out of care.data, there are a whole host of other data flows that relate both to direct care and to all the other things that happen around the NHS. You may have a Summary Care Record (SCR), and your hospital (HES) records may – or may not – be sent to various places depending on your consent where it is applied, and irrespective of your consent where it isn’t.

Some of these data flows are routine; for example, the NHS Business Services Authority sorts out paying for prescriptions, so it gets a copy of that data so it can do its statutory job. But if you’re treated in a hospital the various organisations, both private and public, who provide services to that hospital may also get a copy of (some of) your medical record for various reasons.

Why does this matter for you?

If you don’t know where your data has gone, there’s no way to know whether your wishes are being respected. And when there is a problem, there’s no way to know whether you personally were affected.

Most SCRs will not be accessed or viewed when they shouldn’t be, but without you knowing when your SCR was accessed and by which organisation, you have no way to know whether or not your confidential details have been protected. NHS bodies have that information, and can tell the Health and Social Care Information Centre.

Since the debacle in February, the HSCIC has undertaken a process of significant internal procedural change. In March 2014, it couldn’t say to whom it had sent data that month. By February 2015, it should be possible for HSCIC to tell each individual patient exactly where their medical record went, and why – both for their direct care and for the variety of other uses around the system.

There is, for example, a broad base of support for medical research. The UK wins more than its fair share of Nobel prizes and other measures of esteem, not to mention the development of new treatments to help all. As a patient, your medical records will have been used in a variety of these studies for decades, but until things began to change this summer there has been no way for you – as a patient who contributed – to receive the knowledge of the outcome of these research programmes, even though many years may have passed since your records were used.

HSCIC should remember, and can tell you. Academics and researchers are already required to tell their funders (and hence the public) of the outcomes of their research – in academic papers or other published outputs – so if they tell HSCIC, then HSCIC can tell you about the projects in which your data was involved, however small or large its contribution.

A data usage report (that covers all uses) means you won’t merely have to trust that your data was treated properly by the NHS. You can read your report, and know for yourself.

There are some parts of the health and care system that won’t and shouldn’t ask for NHS numbers, so these will not be included in the report – but if your NHS number is used, then it should be included.

If there are good reasons why something shouldn’t be included in the data usage report, then maybe the NHS number shouldn’t be used. If data can be linked then it likely will be linked at some point, and if this shouldn’t happen then there may be better measures that can be used to prevent linkage, such as not using the NHS number.

Why is a data usage report so important?

Data ‘wants’ to be copied. Without a full commitment to individuals knowing where their data goes – and this must be for everyone, not just those who don’t choose to opt out – there will continue to be mistakes caused by secrecy that are catastrophic to public trust in the handling of NHS patients’ data.

What might a data usage report look like?

In September, medConfidential produced an example of a personalised data usage report [278 kB PDF file] (edit – there’s a 2021 updated example now too). We understand that discussions have moved on and that some of the sections may be slightly different, but this is an active discussion we look forward to seeing happen.

Only with a data usage report, available to every patient, can care.data go forwards. With the emerging details of where patients’ data goes, and on what basis, this cannot be mishandled as so much of the care.data programme has been up to now.


This post was written in 2014 – there is an implementation update for 2015 and 2016, 2019, 2020, and 2021.