Data all the way down
“Turtles all the way down”
Are you a clinician about to see a patient, reviewing the patient record in order to prepare for the consultation?
Or are you a manager assessing the demands on your services, and running quality improvement projects in order to improve your outcomes?
Or perhaps you’re a researcher looking to extend knowledge and inform the best clinical practices of tomorrow?
Most of us working in health and care operate in all three of these domains, but historically, each is considered separate.
Patient records might still be on paper, or perhaps you use an electronic patient record system.
Your administrative data, including time from referral to different assessments and treatments, are held in a patient administrative system or, more likely, across multiple administrative systems, one or more for each organisation.
Your research might use a bespoke registry or research system for each of your research projects.
Software in health and care
Software is like magic, offering us the potential to automate and systematise safe and effective clinical processes and, most importantly, build learning systems that can continuously monitor and improve from where we are currently.
As a result, being digital and data-driven means our planning cycles should become shorter and shorter, because we’re using small, focused interventions and data to constantly evaluate our hypotheses and build a continuously learning healthcare system.
But we have a major problem:
We separate the things that should be the same, and we join the things that should be separate.
Here’s an example from the past. The Bristol Heart scandal occurred in Bristol, UK in the 1990s, when babies died at high rates after cardiac surgery. The report into the scandal included this in its conclusion:
The various clinical systems, many of them paper based, differed from one another and had no relationship with the administrative hospital-wide systems. The funding made available in the late 1980s and early 1990s for medical and later clinical audit helped to reinforce this separation by making available to groups of clinicians money for small local computer systems. The lack of any connection between these different systems, one administrative, the others clinical, for collecting data cannot be explained solely on the basis of some technical or technological reason. It was just as strongly a reflection of a mindset that clinical matters were the sole domain of clinicians and non-clinical matters, to do with the management of resources and with the movement of patients into and through the hospital, were the preserve of managers and administrators.
Bristol Heart Enquiry, 2001. (quoted in “Using data for patient safety”)
But why do we separate clinical from administrative systems, and both of those from research systems?
There are a variety of reasons, including understanding who has responsibility for procuring, funding and designing such systems as well as issues of information governance.
The false separation of the needs of these three - administration, clinical practice and research - is so embedded in our culture that we often even use different tools, languages and frameworks to build the software - you might run R and Python for your research, your analytics in an off-the-shelf package such as Qlik, and your electronic health record software in Java or C++.
But are they so different really?
It’s straightforward to think of examples in which it is helpful to combine data relating to what is traditionally considered to be administrative with clinical data. One good example is the flow of patients through a hospital - lengths of stay are best understood in the context of medical problems and co-morbidities. Another might be making sense of clinical or diagnostic information, such as how many patients are referred with problem ‘x’. It’s all data, and it makes little sense that we use widely different software for the different purposes to which we put our data.
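To make the length-of-stay example concrete, here is a minimal Python sketch that joins "administrative" admissions with "clinical" problem lists. All of the data and field names are made up for illustration; a real analysis would draw on coded problems and proper episode data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical "administrative" data: one row per admission.
admissions = [
    {"patient_id": 1, "length_of_stay_days": 2},
    {"patient_id": 2, "length_of_stay_days": 9},
    {"patient_id": 3, "length_of_stay_days": 3},
    {"patient_id": 4, "length_of_stay_days": 11},
]

# Hypothetical "clinical" data: recorded long-term problems per patient.
problems = {
    1: [],
    2: ["diabetes", "heart failure"],
    3: ["asthma"],
    4: ["diabetes", "chronic kidney disease", "heart failure"],
}


def mean_stay_by_comorbidity_count(admissions, problems):
    """Group lengths of stay by the number of recorded co-morbidities."""
    stays = defaultdict(list)
    for admission in admissions:
        count = len(problems.get(admission["patient_id"], []))
        stays[count].append(admission["length_of_stay_days"])
    return {count: mean(days) for count, days in sorted(stays.items())}


result = mean_stay_by_comorbidity_count(admissions, problems)
print(result)
```

Neither dataset answers the question on its own; the interesting result only appears once the two are treated as the same kind of thing.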
So what’s the solution?
A single system?
So why not have a single system? What a seductive yet naive idea!
It’s naive because of what we know about building effective, adaptable, flexible computer software.
“Flow is difficult to achieve when each team depends on a complicated web of interactions with many other teams. For a fast flow of change to software systems, we need to remove hand-offs and align most teams to the main streams of change within the organization. However, many organizations experience huge problems with the responsibility boundaries assigned to teams. Typically, little thought is given to the viability of the boundaries for teams, resulting in a lack of ownership, disengagement, and a glacially slow rate of delivery.” Skelton, Matthew. Team Topologies (Kindle Locations 2147-2151). IT Revolution Press. Kindle Edition.
We have to break up our complex health and care domain into smaller subunits. It’s just too complicated, and it’s safer and more effective to break up our problems into smaller, tractable ones.
Health professionals can be seduced by the idea of buying a ‘single system’ in order to solve their problems because, quite rightly, they’ve been scarred by their experiences having to use multiple ‘systems’ in order to get the information that they need - paper records, that system, another system, another login, a different login.
But, but, but!
But there is a paradox. We know that the worst ways to create a seamless “one system” approach are thinking we can build a single monolithic application or expect the procurement of a “system” to solve, for example, closer working and communication between health and social care.
The best way to slow down delivery is to have an approach to technical architecture and wider governance structures that slows down the software value chain. The key to software delivery is delivery, and yet some organisations treat software as if they were managing capital projects such as building a new road or a bridge. In most cases, the technology is the easy bit; it’s the implementation on the ground across multiple complex adaptive systems that is most difficult. But we also need a design that makes it easy and safe to change.
People who think building technology is difficult and complex tend to want to centralise and control its development, because that feels less risky, but that approach is wrong.
If we believe that software can make our work in health and care more effective, are we using the right methods in order to safely deliver that software, at pace?
With existing software systems, whether for direct patient care, for analytics / governance or for research, we seem to make the same mistakes again and again:
Too often, software architecture and system design occurs as a consequence of organisational or management structures, rather than stepping back and truly understanding the problem domain and how to carve it up into manageable chunks. You need to understand Conway’s Law, and the reverse Conway’s manoeuvre - and place the patient foremost in the design and architecture of your systems. The dependencies between architecture, governance and standards, June 2021
The conclusion must be, therefore, that we need to step back and re-appraise how we architect, fund and standardise digital health and care.
We can get a ‘single system’ by focusing on data, standards, architecture and modularity, and enable innovation and continuous improvement - and not unduly limit our ability to adapt and change to the ever changing requirements the complex adaptive domains of health and care throw at us!
Towards data, data-driven health and composable modularity in software infrastructure.
Firstly, we’ve got to recognise that fundamentally, it’s all just data.
We need to collect, make sense of and use data in order to do what we want to do, whether that’s running our health services, seeing an individual patient, improving our services or furthering our medical knowledge.
Secondly, we need structured meaningful data and that means data standards.
If we think you have had a myocardial infarction, then we need computer systems that can make the same sense of the data. We also need to think of our different sources of reference data as products, and use ‘product-thinking’ in how we publish, document and make those data available. The user experience of a data product needs to be considered just as carefully as that of a user-facing application.
Thirdly, we need to make using data, and using data standards as easy as possible.
No-one starts building software from nothing nowadays. Instead, we make use of a range of standard building blocks in order to achieve the functionality we want - whether it’s a cryptography library, or machine learning, or networking, these elements are commodities - usually open-source, widely available and widely tested.
So if you’re starting to build clinical applications now, where are the building blocks, the commodity libraries, frameworks and services with which you can innovate? Where is the platform on which you can build?
We need readily-available software modules that can make consuming and making sense of data as easy as possible. That means many software services that simply wrap reference data in order to make it usable in whatever context you need and that those services are composable with others much like an orchestra is made up of many different sections, instruments and musicians. And who can argue with Aristotle? “the whole is greater than the sum of its parts”.
What are the building blocks?
When I first built an electronic health record system, I needed the basics - understanding who the patient is, who the professional is, when someone was seen, where they were seen, and the characteristics that help define the cohort - such as problem, diagnosis, treatment whether by surgery or drug, and geographical indicators, or derived indicators such as socio-economic deprivation.
I’ve realised that building such functionality into a single system is wrong.
As such, I’ve developed replacements as individual software components. Each is usable in isolation but I can combine them, whether as libraries in a single application or as a suite of individually running microservices, in order to solve problems. Each is designed for use in operational administrative, clinical and research systems - so are usable for direct care, service management or in my analytics. Each is open source.
- hermes - a SNOMED CT terminology server and library.
- dmd - UK dictionary of medicines and devices server and library
- clods - UK organisational and geographical reference data, including a FHIR R4 server
- deprivare - UK deprivation indices made available in-process or via a microservice
- concierge - integration / interoperability with other NHS software services including patient identity / demographics / staff indexes / PAS / document repositories etc. Each underlying system can be viewed through the lens of an open standard - e.g. an HL7 FHIR view.
- trud - easy access to NHS Digital’s UK reference data services
- hades - a FHIR R4 SNOMED terminology server
- nhspd - UK postal code database with links to geographical, administrative and organisational units
Let’s look at three examples in more detail:
hermes makes SNOMED CT available to you easily, so you can embed as a library and run in-process or run as a terminology server. I use this to make it easy to use and make sense of SNOMED CT - both in clinical applications and in my analytics pipelines.
clods allows you to download and make sense of organisational reference data. I use that with the NHS postcode directory (made available via nhspd) to find out where a patient lives and make sense of those data - including linking to indices of deprivation. My software applications record the date, time and place of clinical encounters, and can leverage those codes - and so other systems not under my control can understand that a particular encounter occurred at, say, the University Hospital of Wales. Using these data is essential for interoperability.
dmd allows you to download and make sense of drug data in the UK. What kind of drugs is that patient given? How have the doses changed over time? How many milligrams of that ingredient is the patient taking? Is this patient on a type of immunosuppressant drug? This library and microservice takes the NHSBSA dm+d distribution and makes it usable in your applications - whether that’s for clinical care or for running your analytics.
For example, I switch on different functionality in my homegrown EPR based on the diagnoses of a specific patient - e.g. does this patient have a type of motor neurone disease - while doing the same in analytics - e.g. give me all of the patients who have received a type of botulinum toxin - while doing the same in clinical research - e.g. how do the outcomes of multiple sclerosis vary by socioeconomic deprivation?
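The common thread in those three questions is subsumption: asking whether a recorded concept is a "type of" some broader concept. Here is a minimal Python sketch of that idea using a toy hierarchy. The concept names stand in for real SNOMED CT identifiers, the relationships are simplified, and none of this is the actual hermes API; it only illustrates how one test can serve clinical, analytic and research uses.

```python
# Toy "is-a" hierarchy: each concept maps to its direct parents.
# Real terminologies use coded identifiers; these labels are placeholders.
IS_A = {
    "amyotrophic lateral sclerosis": {"motor neurone disease"},
    "progressive bulbar palsy": {"motor neurone disease"},
    "motor neurone disease": {"neurological disorder"},
    "multiple sclerosis": {"demyelinating disease"},
    "demyelinating disease": {"neurological disorder"},
}


def is_a(concept, ancestor):
    """True if concept is the same as, or a descendant of, ancestor."""
    to_visit = [concept]
    seen = set()
    while to_visit:
        current = to_visit.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        to_visit.extend(IS_A.get(current, ()))
    return False


def has_type_of(diagnoses, ancestor):
    """True if any recorded diagnosis is a type of the given concept."""
    return any(is_a(diagnosis, ancestor) for diagnosis in diagnoses)


# Hypothetical patient records keyed by a made-up patient identifier.
patients = {
    1: ["amyotrophic lateral sclerosis"],
    2: ["multiple sclerosis"],
    3: [],
}

# Direct care: switch on functionality for a single patient.
show_mnd_tools = has_type_of(patients[1], "motor neurone disease")

# Analytics or research: the same test defines a cohort.
mnd_cohort = [pid for pid, dx in patients.items()
              if has_type_of(dx, "motor neurone disease")]
print(show_mnd_tools, mnd_cohort)
```

The same function is called in both places, which is the point: the clinical feature flag and the cohort definition cannot drift apart.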
What’s needed next?
We need to think carefully about how we conceive and build the clinical applications of the future. They should be data-focused, data-driven and made up of a blend of open, modular services.
So what are the shared services that are needed for clinical, administrative and research applications?
There is usually a pattern: interesting but often parochial data sources. We need to identify those and make them more usable by treating them as first-class data products. That means, at a minimum, good documentation, persistent non-reused identifiers and simple ways of tracking publication, such as good metadata.
It’s too often the case that governmental data are published on a portal for download, but do not meet these minimum requirements, making it difficult to write software that can recognise publication of a new version and download that latest issue.
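With persistent identifiers and basic release metadata, recognising a new version becomes almost trivial. The following Python sketch assumes a publisher exposes a list of releases, each with an identifier, a publication date and a download URL. The `Release` shape, the identifiers and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Release:
    identifier: str   # persistent and never reused
    published: date   # publication date from the release metadata
    url: str          # where the distribution can be downloaded


def latest_release(releases: Iterable[Release]) -> Optional[Release]:
    """Return the most recently published release, or None if there are none."""
    return max(releases, key=lambda release: release.published, default=None)


def release_to_download(releases: Iterable[Release],
                        installed: Optional[str]) -> Optional[Release]:
    """Return the latest release if it differs from what is installed."""
    latest = latest_release(releases)
    if latest is None or latest.identifier == installed:
        return None
    return latest


# Made-up metadata, as a publisher's feed might describe it.
feed = [
    Release("refdata-2021-04", date(2021, 4, 1), "https://example.org/refdata/2021-04.zip"),
    Release("refdata-2021-07", date(2021, 7, 1), "https://example.org/refdata/2021-07.zip"),
]

update = release_to_download(feed, installed="refdata-2021-04")
print(update.identifier if update else "up to date")
```

Without a stable identifier and a publication date, the same check means scraping a portal page and guessing from file names.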
After the data, we need easily usable computing services that wrap that data product and make it accessible to applications, irrespective of context. That means building software that provides an API, whether in-process or as distributed microservices.
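One way to picture "in-process or as distributed microservices" is a single interface with two interchangeable implementations. This is a hedged Python sketch, not the API of any of the libraries listed above: the class names, the `/postcodes/` URL path, the record fields and the postcode are all assumptions made for illustration.

```python
import json
from typing import Optional, Protocol
from urllib.parse import quote
from urllib.request import urlopen


def normalise(postcode: str) -> str:
    """Strip spaces and upper-case, so 'zz99 1aa' and 'ZZ991AA' match."""
    return postcode.replace(" ", "").upper()


class PostcodeService(Protocol):
    """What client code depends on, wherever the data actually live."""
    def lookup(self, postcode: str) -> Optional[dict]: ...


class InProcessPostcodeService:
    """Reference data held in memory, used as a library."""
    def __init__(self, table: dict):
        self._table = {normalise(key): value for key, value in table.items()}

    def lookup(self, postcode: str) -> Optional[dict]:
        return self._table.get(normalise(postcode))


class RemotePostcodeService:
    """The same data behind a hypothetical HTTP microservice.
    Error handling (e.g. a 404 for an unknown postcode) is omitted."""
    def __init__(self, base_url: str):
        self._base_url = base_url.rstrip("/")

    def lookup(self, postcode: str) -> Optional[dict]:
        url = f"{self._base_url}/postcodes/{quote(normalise(postcode))}"
        with urlopen(url) as response:
            return json.load(response)


def deprivation_quintile(service: PostcodeService, postcode: str) -> Optional[int]:
    """Client code: unaware of whether the service is local or remote."""
    record = service.lookup(postcode)
    return None if record is None else record["deprivation_quintile"]


# Made-up reference data for a made-up postcode.
local = InProcessPostcodeService({"ZZ99 1AA": {"deprivation_quintile": 2}})
print(deprivation_quintile(local, "zz991aa"))
```

Swapping `local` for `RemotePostcodeService("https://example.org")` would leave `deprivation_quintile` untouched, so the deployment choice can be made late and changed cheaply.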
It is usually necessary to provide different abstractions across these services to make it easier for clients to consume health and care computing services. That usually means providing a ‘view’ of those data in an open standard.
For example, while clods adopts the DCB0090 standard for organisation data, that is a UK standard. Most applications will simply want to make use of organisational data at a simpler, higher level, so I make a FHIR R4 server available, with a few lines of code mapping DCB0090 into FHIR to make it simpler to consume.
So I don’t expect client software to understand the different categories of health and care organisation in DCB0090, but we can provide a ‘facade’ across those data to map into data and formats that clients can understand, such as X.500, FHIR or openEHR and provide those data on-demand. So, for example, I can view the NHS Wales’ staff directory in X.500 format (native) or as a facade using a W3C organization ontological view or as a HL7 FHIR ‘view’. It’s all data.
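A facade of this kind can be very small. The sketch below, in Python, maps a made-up "native" organisation record into a simplified FHIR-style Organization and into a plain-text summary. The native field names and the organisation itself are invented, the identifier system URI is my assumption about how organisation codes are usually namespaced, and the output is illustrative rather than a validated FHIR resource.

```python
# Hypothetical native record; field names and values are made up.
native = {
    "code": "ZZ999",
    "name": "Example General Hospital",
    "active": True,
    "postcode": "ZZ99 1AA",
    "roles": ["hospital site"],
}

# Assumed identifier namespace for organisation codes.
ODS_SYSTEM = "https://fhir.nhs.uk/Id/ods-organization-code"


def as_fhir_organization(org: dict) -> dict:
    """One 'view': a simplified FHIR R4-style Organization."""
    return {
        "resourceType": "Organization",
        "identifier": [{"system": ODS_SYSTEM, "value": org["code"]}],
        "active": org["active"],
        "name": org["name"],
        "address": [{"postalCode": org["postcode"]}],
    }


def as_summary(org: dict) -> str:
    """Another 'view' of the same data, for display."""
    return f'{org["name"]} ({org["code"]})'


fhir_view = as_fhir_organization(native)
print(fhir_view["resourceType"], as_summary(native))
```

The native record stays as the single source of truth, and each view is a pure function of it, so adding another standard later means adding another function rather than another copy of the data.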
Data are first-class, and we should treat data products as first-class as well, together with composable software tools that make it easy to use those data products in our applications, whether user-facing for direct care, analytics or research.
Mark
PS. The title of this post alludes to the phrase “Turtles all the way down”, an expression of the problem of infinite regress.