Commercial Re-use Licences for HES & disappearing webpages

It has become increasingly clear in recent weeks that patients have been kept in the dark about where their data has been going, in what form and what is being done with it.

Well, now, for the first time, we can show you a picture:

[Image: OmegaSolver HALO Patient Analyser screengrab]
For obvious reasons, we have redacted the day element of every date.

Please note that we are not suggesting the following is unlawful or that patient confidentiality has been deliberately breached.

The image above comes from a company called OmegaSolver Ltd, formed in March 2013, which sells a product called HALO Patient Analyser, which it describes thus:

Patient Treatment Analyser is a unique dataset solution provided to pharmaceutical companies and trusts, who want to analyse and understand the treatment given to patients suffering from a specific disease…

The patient Analyser provides a robust query engine where users can query based on a large number of fields such as –

First Diagnosis
Period (Years)
Hospital Visit Frequency
Hospital Stays
CCG / TRUST / Hospitals
Treatment Specialist
Planned / Unplanned Admissions and many more

The Patient summary report gives a summarised report of the queried data for new and overlapping patients, Gender split with age range distribution over a three year period. Patient Analyser is a one of its kind analytical tool with a simple yet understandable data visualization tool, which is currently a unique offering.

The image, a screenshot looking at five out of 163,316 patients’ data, shows detailed information about each individual patient including their medical diagnoses ordered by actual dates of each hospital visit, tracking episode to episode – the detailed state of each individual patient’s health as he or she passes through hospital care.

For example, patient OS060900 (the ‘pseudonym’) is aged 81-85 and had 5 conditions diagnosed in October 2010. She has visited hospital 257 times, mostly as an outpatient, but also spent 5 days in hospital, at which point 8 conditions were diagnosed. Six days later, the incidents scroll off the screen.

Patient OS084761, also 81-85 years old, was in hospital in April 2010 and he was still there with the same diagnoses 3 days later, though it looks like he left a day later with at least one additional diagnosis.

We are not certain that the codes in the screenshot are the same as the ones used by GPs, but if they are, then some of the events and/or diagnoses referenced in the screenshot would include:

  • Posterior fixation of rectum
  • Removal of left breast
  • Suberosis (cork-handlers’ lung)
  • Explosive personality disorder
  • Bilateral mastectomy or mammoplasty
  • Removal of left fallopian tube
  • Removal of left ovary

What this illustrates quite starkly about pseudonyms is just how irrelevant they are when there is so much other identifiable data in the rest of the row. ‘Pseudonymised’ data may obscure some of the most obvious pieces of identifying information, such as your NHS number, but it clearly doesn’t hide rich detail about a person’s life and health that could just as easily be used to identify them.

Given that companies are already combining health data with social media data, you can see the ever-growing risk of re-identification from simply having tweeted about having had an accident on a certain date, or having posted a Facebook update about a relative going into hospital.
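To make the point concrete, here is a minimal sketch of how such matching could work. Everything in it is an assumption made for illustration: the rows are entirely synthetic, the field names (age_band, sex, admitted, trust) and the pseudonyms are hypothetical, and nothing here is taken from HES or from any company's product. It simply filters 'pseudonymised' rows by a few facts an outsider might learn from a public post, and counts how many rows are unique on those fields alone.

```python
from collections import Counter

# Entirely synthetic rows for illustration. No real patient data;
# field names and pseudonyms are hypothetical.
RECORDS = [
    {"pseudonym": "X001", "age_band": "81-85", "sex": "F", "admitted": "2010-10", "trust": "Trust A"},
    {"pseudonym": "X002", "age_band": "81-85", "sex": "M", "admitted": "2010-04", "trust": "Trust A"},
    {"pseudonym": "X003", "age_band": "36-40", "sex": "F", "admitted": "2010-10", "trust": "Trust B"},
    {"pseudonym": "X004", "age_band": "36-40", "sex": "F", "admitted": "2010-10", "trust": "Trust B"},
    {"pseudonym": "X005", "age_band": "21-25", "sex": "M", "admitted": "2011-01", "trust": "Trust A"},
]

# Fields that are not direct identifiers but can narrow down a person.
QUASI_IDENTIFIERS = ("age_band", "sex", "admitted", "trust")


def matching_pseudonyms(records, known_facts):
    """Return the pseudonyms of rows consistent with every known fact."""
    return [
        r["pseudonym"]
        for r in records
        if all(r.get(field) == value for field, value in known_facts.items())
    ]


def unique_rows(records, fields=QUASI_IDENTIFIERS):
    """Return pseudonyms whose combination of quasi-identifiers occurs once."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return [r["pseudonym"] for r in records if counts[tuple(r[f] for f in fields)] == 1]


# Facts someone might glean from a public post about a relative.
public_post = {"age_band": "81-85", "sex": "M", "admitted": "2010-04"}

print(matching_pseudonyms(RECORDS, public_post))  # rows consistent with the post
print(unique_rows(RECORDS))                       # rows unique on quasi-identifiers
```

If the first list contains a single pseudonym, the pseudonym has done nothing to protect that person: every other detail in the row now belongs to a known individual. The more fields a row carries, the more rows end up in the second list.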

N.B. We sincerely hope this screenshot was taken from a set of mock data, not the actual HES data of 163,316 NHS patients. We look forward to clarification from OmegaSolver in due course.

We have noticed in recent days that some of the “information intermediaries” supplied with data by the Health and Social Care Information Centre under “commercial re-use licenses” are pulling web pages when contacted by the press about what they are doing.

Last Monday the Guardian, Wired and others reported on a company called Earthware with a ‘Hopsital [sic] Episodes Map’ on its website, which it described thus:

Healthcare companies and the NHS use Hospital Episode Statistics (HES) data to understand the flow of patients through the healthcare system. HES is a dataset containing details of all admissions, outpatient appointments and A&E attendances at NHS hospitals in England.

The map appeared to be making Hospital Episode Statistics (HES) data available for arbitrary queries on a public web page without any form of password protection. The company pulled the map, but later put out a statement saying:

[Image: Earthware statement, 3/3/14]

The third party, which cannot be named at this point, has since removed all the text from pages on its website that mentioned HES data.

Another information intermediary, which last week was happy to declare it held “over 900 million linked patient HES records” and “patient level linked HES data”, has updated its site and now claims to hold “over 1 billion linked patient HES records dating back 10 years” but adds the qualification, “this data is non identifiable and non sensitive”. The company’s website also clearly states, “HES data provided by the Health & Social Care Information Centre under Commercial Re-use licence 2013.”

We suspect that HSCIC’s and its information intermediaries’ definition of “non sensitive” may be somewhat different from that of the patients whose hospital details are being sold.

And in the light of the OmegaSolver image, the bald assertion that vast quantities of information-rich patient-level health data are completely “non identifiable” simply will not wash.