Healthcare Data: The Race for New Oil in Southeast Asia

Integra Partners
Jul 12, 2022

Data, The New Oil, and The New Oil Rigs

Health tech startups come in many forms: Electronic Health Record (EHR) platforms, at-home test kits, remote monitoring medical devices, and AI image analysis tools, to name a few. Spend enough time speaking with healthtech founders, though, and you will soon realize that no matter the sub-sector, most of them are playing towards the same endgame: accumulating enough data, of high enough quality, to be of interest to the major stakeholders of the healthcare ecosystem, namely insurers and pharmaceutical companies. Put another way, for many of these healthtech startups, their ostensible products (the kits, the software platforms, the devices) are just the digital world’s equivalent of an oil rig, drilling for data.

Data is the new oil, they say, and in the world of tech, it’s been drill, baby, drill for quite some time now, fueled by the twin forces of Venture Capital (VC) and the growing abundance of connected devices. But how similar are oil and data, really? And what can their similarities and differences teach us, especially in the emerging healthtech sector in Southeast Asia, where talent has been proliferating and valuations rising, but exits remain somewhat unproven?

Different Machines, Different Strategies, Different Data

As when drilling for oil, the equipment itself is of paramount importance. Different acquisition methods will predispose startups to prioritizing and accumulating certain types of data, depending on where and how they interact with patients during their healthcare journeys. Startups selling consumer-grade DNA tests, for example, might gather huge amounts of direct, first-party genetic data in a short period of time (especially if they’re backed by deep-pocketed VCs who are happy to fund high customer acquisition costs). But such data will likely be episodic (from a single point in time) rather than longitudinal (from the same patient over a period of time), and episodic data is far less appealing (and useful) to insurance and pharma companies.

Beyond the data from analyzing the test kits, medical history is usually collected as part of the process. However, that information is usually self-reported by consumers through online surveys, and is therefore patchy and less reliable. This is why, as these types of companies mature, they start to offer complementary services like genetic counseling, which enable them to build longer-term, repeated interactions with patients and acquire data from the same patient over time.

On the flip side, startups focusing on EHRs, especially in emerging markets, will likely struggle with their initial go-to-market. Driving EHR adoption isn’t as simple as convincing someone to take a saliva sample; it requires convincing entire clinics and/or hospitals to change the way they do things every day. However, the raw data you get access to will likely be longitudinal, since the same patients return over time, and acquired through clinical tests and examinations rather than primarily self-reported. Even so, not all EHR systems are created equal. Depending on whether it’s used in an oncology center or a GP clinic, the type of patient data collected will look very different, and be valued differently as well.

Let’s take a look at the different types of machines we might find in the health tech space, and the implications of their various data acquisition strategies. Of course, these are generalizations — there are many startups playing in each of the categories below who have found their own ways to defy the limitations of their initial data acquisition strategies.

Everyone has the same end goal of data aggregation, but there are different means of getting there, each with its own strengths and weaknesses. In the end, though, it all comes down to three attributes: breadth, depth, and exclusivity. That is, the breadth of the data set in terms of population size and demographic diversity; the depth of each patient profile and care journey; and exclusivity in terms of access to, and ownership of, more unique data. We’ll talk more in a little bit about how the three attributes determine the use case of the data, as well as the premium placed on the asset.

The Rig Operators and Rig Operability

The second consideration, of course, is the human element. Who operates the rig has a huge impact on whether the machine is used to its full potential. We think about usability in two ways.

First, a good user experience encourages usage among trained medical staff. In theory, workflow software and diagnostic support algorithms could make the lives of physicians much easier by saving them a lot of time through automation. In reality, however, the automation is not very useful if the number of conditions that the algorithm can recognize and diagnose is limited. For example, take an AI tool that helps diagnose lung cancer. Radiologists still have to spend the same amount of time examining each scan or X-ray to check for tuberculosis, pneumonia, or other possible conditions that the AI can’t identify. In the end, adoption of these diagnostic tools can be challenging if the new technology doesn’t add much to the existing workflow of the medical professionals it targets.

Second, technology enables us to tap into lower-skilled resources. Portable ECG devices, compared to their bulky hospital-bound ancestors, have made it much easier to get scans done, with easy-placement chest straps and even AI that can produce clear scans by the noisy roadside. AI guidance is especially helpful when it comes to ultrasound, where operator skill can impact results significantly (unlike MRIs or X-rays, ultrasounds are taken using a wand held by an operator, who decides the angle and depth from which the recording is taken). With AI that is able to tell you whether the device is placed correctly and guide you step by step, even untrained staff who are unfamiliar with taking echos can use the machine. These features are especially valuable in regions like Southeast Asia, where a significant number of people live in rural areas with limited access to specialized expertise and equipment.

The Data Refinery: From Raw to Useful

Data preparation is a key next step to ensure the final product can be useful to the acquirer. In this case, we’re talking about the big boys and girls of the healthcare ecosystem: the large medtechs, clinical research orgs, global pharmaceutical companies, and insurers. They don’t want, and can’t use, raw data. They want their data sets cleaned, curated, and structured — ready to answer the questions they want to ask of it.

How much are they willing to pay for that data, you might ask? That depends on whether it’s diesel… or jet fuel. Put another way, the potential use case for the data influences its price premium. Exits have been few and far between, but here are some examples we’ve found in the table below.

It’s not much at the moment, but the examples above are a good starting point for us to understand how and where premiums accrue across different types of data. At first glance, we can see that genomic data is a hotter commodity than EHR data, and that oncology-focused data sets tend to be more in demand than less curated general data. Some trends are starting to form.

When Data is Not Oil

Unlike crude, which gets processed and separated, data becomes more valuable when data sets are amalgamated and layered on top of each other. If anything, the table above shows us that exits in the space are data sets getting swallowed by larger data sets: IBM adding Truven Health to its ever-growing Watson database, Roche ingesting Flatiron and Foundation Medicine to complement its oncology therapeutics R&D, and so on. It’s the expansion of breadth and depth we mentioned before.

Another point we should make is around the reusability of data and how it affects price. Simply put, reusability is largely determined by ownership rights and exclusivity. Who gets to mine the data? Who gets access to the mined data? Although data wells are pretty much inexhaustible, having different rigs mine the same well over and over again (like different health techs partnering with the same disease registries or hospitals) commoditizes the extracted data and results in lower prices. At the other end of the spectrum, precision health companies that own and guard the gates to the genomic data they harvest enjoy a frothy price premium. Ultimately, it’s about controlling access to high-demand supply.

Putting It All Together

Now, back to the overarching question we brought up at the start — how does everything we’ve discussed translate to exits for health techs in Southeast Asia? By now we can agree that there’s no straightforward answer. However, we can start to piece together some rules of thumb on how we can think about it:

If the endgame is to accumulate enough data, of high enough quality, then what does it take to get the big fish to bite? We’ve established that health techs that accumulate data across the three buckets of breadth, depth, and exclusivity are heading in the right direction. Ultimately, however, we think that the key to health tech exits will come down to breadth, with depth and exclusivity as table stakes. Regional breadth is likely the most challenging of the three to achieve, and will therefore be the biggest differentiator among health techs, especially in Southeast Asia, where there’s great cultural, infrastructural, and political diversity. Whoever manages to build an oil rig that taps into the many wells across the region will stand a much better chance of getting the attention of these global healthcare giants.

Written and illustrated by Theodore Ng, Analyst at Integra Partners



Integra Partners

Integra Partners invests in early stage companies in the areas of financial services, insurance and healthcare. Visit us at