
Generative AI makes diagnosis easier in radiology

The latest breakthroughs in the use of artificial intelligence are ushering in radical changes not only in our everyday lives, but also in the world of medicine.

Doreen Pfeiffer
Published on November 20, 2023

Artificial intelligence is not only influencing how we communicate and interact with our environment, it is also transforming radiology departments. Learn more about so-called generative artificial intelligence and hear from Johannes Haubold, MD, senior physician for clinical AI integration at University Hospital Essen, Germany, on how this technology is already impacting clinical workflows.

The Greek philosopher Aristotle once said, “The whole is greater than the sum of its parts.” This is particularly evident in the brain, where the interplay between neurons produces capabilities far beyond those of any individual cell. The same principle applies in other areas, where the key is not simply to reproduce information but to creatively interconnect the available information into something entirely new. As humans, we excel at this: we can finish incomplete sentences, extend images with plausible new content, or compose pieces of music. New forms of artificial intelligence show that at least some of these capabilities can be imitated through data analysis and pattern recognition.

First Microsoft, and now Google – the tech giants are getting on board with a trend originally started by OpenAI. Chatbots such as ChatGPT or Bard are the most popular examples of a special category of artificial intelligence (AI) known as generative AI. Here, “generative” refers to the fact that artificial neural networks create new content based on existing information. This form of AI is rapidly gaining ground on human intelligence, producing images, code, video content, and text with astonishing precision.

Generative AI models cover a wide range of tasks, are continually refined, and, once established, can be adapted quickly. They are increasingly used in medicine and research, where they truly come into their own. For example, generative AI assists with drug development by predicting the molecular structures of candidate compounds and their chemical behavior. This allows neural networks to combine pharmaceutical ingredients in ways that improve their effectiveness. [1]

Not only in pharmacology, but also in radiology, generative AI is changing the way people work by helping with the detection and segmentation of radiological image data or by improving image quality. Although the technology is still in its infancy, initial prototypes show its huge potential to support the work of radiologists in the future, says Johannes Haubold, MD, senior physician for clinical AI integration at University Hospital Essen, Germany.

Haubold heads clinical AI integration at University Hospital Essen and leads a working group that seeks to build a bridge between artificial intelligence and clinical routine, with a view to integrating the latest developments into everyday practice at an early stage.

Dr. Johannes Haubold, Senior physician for clinical AI integration at University Hospital Essen, Germany

What applications does generative AI have, and what advantages does it offer in radiology? 

Generative AI, particularly in the form of large language models, can help us perform various tasks. For example, we’re working on creating an interface with a database so that we can interact and communicate with it. We can already search hundreds of databases containing information about patients or their diseases and obtain this information in a clearly presented form. For example, this makes it easy for us to seek out cases with similar clinical courses in order to learn from them and develop improved treatments. Some of these capabilities are still just a vision of the future, but large language models mean that scenarios such as these are now conceivable. 

In other words, it essentially works like ChatGPT – you ask the network: “Can you show me this patient’s medical history and their specific state of health?” 

It depends on the type of large language model. In principle, the models we develop at University Hospital Essen work like a chat system: you indicate what you’re searching for, and the algorithm provides you with the information. In addition, however, the algorithm also indicates which data was used to answer the question. This is particularly important when it comes to verifying the quality of the results. In any case, quality control is a vital issue when working with large language models – particularly in medicine. 
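The pattern Haubold describes – an answer delivered together with the data it was drawn from – resembles what is often called retrieval-augmented generation. The following toy sketch illustrates the idea; the document store, record contents, and keyword-overlap retrieval are all invented for this example and are far simpler than any real clinical system:

```python
# Toy retrieval step of a retrieval-augmented chat system: the answer is
# always returned together with the IDs of the source records it was
# built from, so the result can be audited.
# All data and names here are invented for illustration.

RECORDS = {
    "note-001": "Patient A: contrast-enhanced CT of the thorax, no focal lesion.",
    "note-002": "Patient B: MRI of the liver shows a 2 cm lesion in segment IV.",
    "note-003": "Patient B: follow-up MRI, lesion in segment IV unchanged.",
}

def retrieve(query: str, store: dict) -> list:
    """Return IDs of records sharing at least one keyword with the query."""
    terms = {t.lower().strip(".,?") for t in query.split()}
    hits = []
    for rid, text in store.items():
        words = {w.lower().strip(".,") for w in text.split()}
        if terms & words:
            hits.append(rid)
    return hits

def answer_with_sources(query: str, store: dict) -> dict:
    """Assemble an 'answer' plus the record IDs that support it."""
    sources = retrieve(query, store)
    answer = " ".join(store[rid] for rid in sources)
    return {"answer": answer, "sources": sources}

result = answer_with_sources("Which MRI findings mention a lesion?", RECORDS)
```

The essential design point is the `sources` field: because every answer carries the records it was derived from, a clinician can verify the output instead of trusting it blindly.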

It’s interesting that you mention quality control. In late September 2023, a headline in the USA that ran, “ChatGPT diagnoses 4-year-old’s chronic pain after 17 doctors fail to do so,” attracted considerable attention. How reliable is generative AI, and does it involve risks, particularly in medical applications? 

I think headlines like this appear in relation to every new technology as people probe the limits of its capabilities. Large language models like ChatGPT are a very interesting development that allows us to do things that were simply inconceivable before – but there are risks to using the technology, particularly when it comes to quality control. For example, if you ask ChatGPT to write an introduction for an article, you can also ask it to list its sources. If you then ask the program whether the references are real, it will confirm that they are and even tell you in which database to look for them. On closer inspection, however, you’ll find that the publications the algorithm refers to do not exist, no matter how hard you look. That is potentially dangerous in the world of medicine, which is why it’s so important to carry out a form of quality inspection in order to nip these “hallucinations” in the bud. 

Can you run us through the process of creating a new algorithm at your university hospital? 

At its heart, every development process should begin with a clinical need. In other words: what is the specific problem, or the gap that the algorithm is intended to fill? In the next step, we clarify the ethical regulations, because we need to access large volumes of data when creating the algorithm. Subsequently, we search for a data set and export it in anonymized form. Finding data of this kind involves a process of data integration, for which a large FHIR server is available in Essen. Here, all of the information comes together in one place, so it’s easy for us to access structured data. The next step is to consider which algorithm is best equipped to answer the question. We generally use freely accessible networks and adapt them to our needs. Often, we begin by training multiple algorithms simultaneously and then compare them with one another to identify the most efficient one. Once training is complete, the algorithm undergoes various testing phases, ideally in collaboration with other research institutions to validate its performance. Then comes a clinical evaluation: does it really help us in clinical routine, or are there obstacles? Finally, integration into the clinical workflows must be optimized and evaluated. 
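The “train several candidates, keep the best” step of the process above can be sketched in a few lines. The model names and validation scores below are invented purely for illustration; a real comparison would involve full training runs and clinically meaningful metrics:

```python
# Toy illustration of the model-selection step: several candidate
# algorithms are trained in parallel and the one with the best
# validation score is kept. Names and scores are invented.

def select_best(candidates: dict) -> str:
    """Return the name of the candidate with the highest validation score."""
    return max(candidates, key=candidates.get)

# Hypothetical validation Dice scores from three segmentation networks
validation_scores = {
    "unet-baseline": 0.87,
    "unet-attention": 0.91,
    "transformer-seg": 0.89,
}

best = select_best(validation_scores)  # → "unet-attention"
```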

In collaboration with Siemens Healthineers, a prototype is being developed to evaluate and further develop a software assistant for radiological diagnosis. Can you tell us a bit more about the project? 

At the moment, we’re jointly developing two different algorithms – both of which are large language models. In simple terms, one of the algorithms is able to answer clinical questions about a patient’s state of health. The second forms a sort of bridge between communication and the discovery of datasets, making it possible to create FHIR queries from natural-language questions. For example, you could instruct the system to look for all patients who received a specific drug in the last two years and subsequently experienced kidney damage. The large language model translates these questions, assigns the corresponding datasets, and thereby allows us to interrogate large datasets with specific research questions. 
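To give a sense of what such generated queries might look like, here is a minimal sketch based on the standard FHIR REST search syntax. The base URL is a placeholder, the RxNorm and ICD-10 codes are illustrative examples, and nothing here reflects the actual output format of the system being developed in Essen:

```python
# Sketch of the kind of FHIR search queries a language model might
# generate from the natural-language question "all patients who received
# a specific drug in the last two years and subsequently experienced
# kidney damage". The base URL and codes are placeholders.
from urllib.parse import urlencode

FHIR_BASE = "https://fhir.example-hospital.org/fhir"  # placeholder endpoint

def fhir_search_url(resource: str, params: dict) -> str:
    """Build a FHIR REST search URL for one resource type."""
    return f"{FHIR_BASE}/{resource}?{urlencode(params)}"

# Step 1: medication orders for the drug within the time window
med_url = fhir_search_url("MedicationRequest", {
    "code": "http://www.nlm.nih.gov/research/umls/rxnorm|197361",  # example code
    "authoredon": "ge2021-11-20",  # "greater or equal" date prefix in FHIR search
})

# Step 2: subsequent kidney-damage diagnoses
cond_url = fhir_search_url("Condition", {
    "code": "http://hl7.org/fhir/sid/icd-10|N17",  # example: acute kidney failure
})

# In practice, both result sets would be fetched and their patient
# references intersected; here we only show the generated queries.
```

The temporal condition (“subsequently”) is the hard part: it requires joining the two result sets per patient and comparing dates, which is exactly the kind of multi-step logic the language model has to orchestrate.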

In December 2019, you published an article in the journal Der Radiologe on the topic of “Artificial intelligence in radiology – what can be expected in the next few years?” In the article, you gave an overview of the developments you expect to see in the next 5 to 10 years. Have any of your predictions already come true, and what trends or innovations do you expect to see within the next five years? 

I spoke about various scenarios in the article – and we’ve certainly seen numerous developments since then. For example, I mentioned AI that uses image conversion to speed up MRI and sequence generation. Today, several algorithms have already been CE-marked. I had also hoped to be able to use chat systems and communication for clinical reports, which is something we’re working on right now. I’m optimistic that it can be achieved within the next five years.


By Doreen Pfeiffer
Doreen Pfeiffer is an editor at Siemens Healthineers.