How big data is transforming medical information insights

By Daniel Ghinn

Next Tuesday I will be speaking at the DIA 8th Annual European Medical Information and Communications Conference in London, in the opening session of what is set to be a great agenda exploring some of the most important issues in pharma medical information and communications today. The session is being chaired by Isabelle Widmer, and I will be speaking alongside Luciana Fantini from Eli Lilly in Italy and Georgios Koumakis from Roche in Greece. I’m looking forward to reconnecting with old friends and making some new ones too. With some support from my colleagues Stefan Marcus and Paul Grant (who will be missing the event itself as he will be speaking at Health2.0 in Santa Clara, CA), we will also be displaying a huge poster with a special version of Creation Healthcare’s analysis of 100,000 healthcare professionals on Twitter worldwide.

If you’re not heading to the DIA conference, or just can’t wait, here’s a preview of what I will be presenting:

Big Data?

What exactly is ‘Big Data’, and why has it become such a ‘buzzword’? (The ‘word cloud’ on this slide is simply a visualisation of the DIA Conference agenda.)

Big data is not a new idea at all, especially in the pharmaceutical industry where we have been working with big datasets for decades. My own first experience of data in pharma was twenty years ago with a company called Medicare Audits, which was later purchased by research company IMS Health. With Medicare Audits I developed tools to analyse data from UK hospital pharmacies, extracting insights for pharmaceutical companies. Helping pharma to do better planning and develop strategies based on customer behaviors was an exciting introduction to the world of data in the healthcare industry.

Japan sign

What is really important, of course, is actionable insight, not simply data.

This photo shows a sign I saw while working in Tokyo, which reminds me of some of the kinds of data visualisations we sometimes see. What exactly does it mean? Stay on the red line? Walk on sticks? Watch out for a man wearing a hat? Turn left?

In Medical Information departments, we have access to a lot of data. The question is, do we and our colleagues know the power of all this data, or how this data can be better explored to help provide value and commercial business insights?

Dark data

Where is the ‘dark data’ in your organization? It’s probably big data too. Perhaps it’s in your medical contact center logs? Or your medical information website analytics?

Your HCP portal might be a source to explore. Or perhaps you have access to data from closed HCP networks?




When did you last see something like this? If you use Amazon, Netflix, Target or any number of other online or e-commerce based services, your past behaviors provide a lot of data about your possible future actions. You might also have ‘behavioral data’ about your pharma customers that would provide a further dimension to your big data analysis.

Apple & Google health 'big data'

Google and Apple, whose products and technologies are integrated into the lives of most people in the developed world in more ways than most people are aware, are investing heavily in health ‘big data’. Earlier this year, Google met with the FDA to talk about its smart contact lens that could measure and transmit data on glucose from tears; Apple’s newly launched Watch is also a health sensor.

There are endless possibilities for developers of apps to integrate with Google and Apple health technologies. Will pharma use them?



Let’s take ‘public social media’ as our dataset of today – let’s consider all conversations taking place in public social media among healthcare professionals, in all therapy areas all over the world.

Here’s a map showing the location of 100,000 public Twitter profiles of healthcare professionals, worldwide. It’s from a poster being presented here at the conference where you can go and see just where these healthcare professionals are, and explore the differences in online behaviour between nurses, surgeons, pharmacists and other kinds of roles.

HCP social media big data

The healthcare professionals in this poster have so far posted around 350 million Tweets, and keep Tweeting, at a rate of around 2 million Tweets every week!

With so much data in our hands, an important question is: how do we find the signal in the noise?

We could of course filter the data by healthcare professional role, location, or by searching the data for mentions of particular diseases or drugs.
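As a rough illustration of that kind of filtering, here is a minimal Python sketch. It is not the tooling my team uses: the record fields (`role`, `country`, `text`), the `filter_posts` helper and the sample posts are all invented for this example.

```python
def filter_posts(posts, role=None, country=None, keyword=None):
    """Return the posts that match every filter that was supplied."""
    matches = []
    for post in posts:
        if role is not None and post["role"] != role:
            continue
        if country is not None and post["country"] != country:
            continue
        # Case-insensitive substring match on the post text.
        if keyword is not None and keyword.lower() not in post["text"].lower():
            continue
        matches.append(post)
    return matches


# Invented sample data, for illustration only.
SAMPLE_POSTS = [
    {"role": "nurse", "country": "UK", "text": "New diabetes guidance published today"},
    {"role": "pharmacist", "country": "Italy", "text": "Insulin supply question from a patient"},
    {"role": "surgeon", "country": "UK", "text": "Great session at the conference"},
    {"role": "nurse", "country": "Greece", "text": "Diabetes clinic was busy this week"},
]

uk_posts = filter_posts(SAMPLE_POSTS, country="UK")
diabetes_nurses = filter_posts(SAMPLE_POSTS, role="nurse", keyword="diabetes")
```

Simple substring matching like this will miss misspellings, brand names and abbreviations, which is one reason filtering alone rarely finds the signal.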

Multi-dimensional data analysis

One approach that we take is to consider the data as multi-dimensional. For example, when we consider healthcare professionals in public social media, we can look at their location; their role; the topics they are interested in; and how these change over time.




HCP social media language processing

Let’s consider topics and content posted. We can use a number of language processing tools to learn about needs, interests and ideas. We can use this technique to discover new or emerging areas of concern in a particular disease area.




HCP social media geolocation

We could look at geolocation. Analysing the geography of social media behaviour might indicate particular trends by location.

Here, for example, we were able to discover that the topics of interest to healthcare professionals talking about diabetes varied considerably from one country to another.
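To make the idea concrete, here is a small Python sketch of counting topics by country. It is only an assumption about how such a comparison could be set up: the `TOPIC_KEYWORDS` mapping, the `topics_by_country` helper and the sample posts are hypothetical, and real analysis would use far richer language processing than keyword matching.

```python
from collections import Counter, defaultdict

# Hypothetical topic definitions: each topic is a list of trigger words.
TOPIC_KEYWORDS = {
    "diet": ["diet", "nutrition"],
    "insulin": ["insulin"],
    "devices": ["pump", "monitor"],
}


def topics_by_country(posts):
    """Count, per country, how many posts mention each topic."""
    counts = defaultdict(Counter)
    for post in posts:
        text = post["text"].lower()
        for topic, words in TOPIC_KEYWORDS.items():
            if any(word in text for word in words):
                counts[post["country"]][topic] += 1
    return counts


# Invented sample data, for illustration only.
SAMPLE_POSTS = [
    {"country": "UK", "text": "Diet advice for type 2 diabetes"},
    {"country": "UK", "text": "Nutrition matters in diabetes care"},
    {"country": "Italy", "text": "Insulin pump training for diabetes nurses"},
    {"country": "Italy", "text": "Insulin dosing questions"},
]

by_country = topics_by_country(SAMPLE_POSTS)
# The most frequently mentioned topic in each country.
top_topic = {
    country: counter.most_common(1)[0][0]
    for country, counter in by_country.items()
}
```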



HCP social media by roles

Or we can consider ‘Personally Identifiable Information’ from data published by healthcare professionals. For example, as in this case, we might analyse conversation by role to identify topics of particular interest to individual roles, such as nurses compared with bariatric surgeons.



HCP social media over time

We can also analyse data over time, to understand how any of the other dimensions is changing. In this example we analysed healthcare professionals talking about a particular pharmaceutical company’s products over time, and we see a couple of big spikes in volume. Then we looked at the role types who were most active during these spikes, which helped us understand the significance of the data during the events that caused the spikes.
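Here is one way the two steps (find the spikes, then see who was talking) could be sketched in Python. The rule used to flag a spike, a week with more than twice the median weekly volume, is an assumption chosen for simplicity rather than the method behind the slide, and the helper names and sample data are invented.

```python
from collections import Counter
from statistics import median


def find_spike_weeks(weekly_counts, factor=2.0):
    """Return the weeks whose volume exceeds factor x the median week."""
    baseline = median(weekly_counts.values())
    return sorted(
        week for week, count in weekly_counts.items() if count > factor * baseline
    )


def roles_in_weeks(posts, weeks):
    """Count posts per role, restricted to the given weeks."""
    wanted = set(weeks)
    return Counter(post["role"] for post in posts if post["week"] in wanted)


# Invented sample data: one record per post, tagged with week and role.
SAMPLE_POSTS = (
    [{"week": 1, "role": "nurse"}] * 4
    + [{"week": 2, "role": "pharmacist"}] * 5
    + [{"week": 3, "role": "oncologist"}] * 12
    + [{"week": 3, "role": "nurse"}] * 3
    + [{"week": 4, "role": "nurse"}] * 4
    + [{"week": 5, "role": "pharmacist"}] * 14
)

weekly_counts = Counter(post["week"] for post in SAMPLE_POSTS)
spike_weeks = find_spike_weeks(weekly_counts)
active_roles = roles_in_weeks(SAMPLE_POSTS, spike_weeks)
```

With this sample, weeks 3 and 5 stand out, and the role breakdown shows different groups driving each spike, which is the kind of clue that prompts a closer look at what happened that week.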

Spiegelhalter quote


Here’s a quote from David Spiegelhalter, Professor of the Public Understanding of Risk at Cambridge University. He said that just having a lot of data does not make your problems disappear; in fact it makes them worse. That might not sound very encouraging, but I like that it’s an antidote to the idea that big data is the simple answer to everything.


Intelligent tools

So today we have all sorts of technology tools that are getting better and better at analysing language in big data. And you’ve no doubt heard about the powerful ‘intelligent’ computing system IBM Watson, a form of machine learning.




Human analysts

But we can’t do this without real humans too! In my company we use a team of data experts and analysts, reviewing what the ‘computer’ has decided about our data. And, of course, the computer cannot tell you what your business question is. (Although it can give you clues that might inspire your questions.)



The ball fell... quote


Here’s an example of a sentence that would be difficult for a computer to understand with certainty. Can you see why?





Animal Crackers quote

Or how about this quote, from the 1930 Groucho Marx movie, Animal Crackers…





Physician tweets


And we see this in conversations among healthcare professionals in public social media, too. Here is part of a conversation between two physicians who, as it happens, are based thousands of miles from each other in different countries.



Get started with big data


So to get started, think about these areas:

Where will you find data you can analyse? It may be closer than you think.

And think about the four dimensions you could use to analyse the data – language; location; personally identifiable information; time.



You have the questions

Finally: For all the big data in the world, it’s up to YOU to ask the questions.







…and here’s my official and required DIA disclaimer:

The views and opinions expressed in these slides are those of the individual presenter and should not be attributed to Drug Information Association, Inc. (“DIA”), its directors, officers, employees, volunteers, members, chapters, councils, Special Interest Area Communities or affiliates, or any organisation with which the presenter is employed or affiliated. These PowerPoint slides are the intellectual property of the individual presenter and are protected under the copyright laws of the United States of America and other countries. Used by permission. All rights reserved. Drug Information Association, DIA and DIA logo are registered trademarks or trademarks of Drug Information Association Inc. All other trademarks are the property of their respective owners.

Meet the Author

Daniel Ghinn

Daniel has been at the helm throughout the company’s life since 1998. His rich expertise in working with pharmaceutical businesses has enabled CREATION to build business solutions that fit our clients’ needs.

Daniel is married to Jo, has three children, a cat, a dog, 28 fish, and 160,000 bees.