This content is brought to you by the Norwegian Centre for E-health Research.

AI can understand your medical records: A new language model could revolutionise healthcare

Researchers believe that Norway has now made a significant step forward in the use of artificial intelligence in the healthcare sector.

Researchers at the Norwegian Centre for E-health Research have now succeeded in developing the first Norwegian clinical language model, called NorDeClin-BERT.

The model is based on natural language processing, which allows computers to understand and process human language.

The new model learns from so-called clinical texts. These include medical records and other written documents that doctors and healthcare personnel use in patient care.

The model opens up new possibilities in health research and patient care.

But first, researchers have to anonymise patient data before it can be used to develop the service.

The language model builds on NorBERT, a general-purpose Norwegian language model.

BERT is a system that can understand contexts in text. NorBERT is trained on Norwegian text to handle the Norwegian language.

The researchers have used data from the gastro-surgical department at the University Hospital of North Norway. This information has been pseudonymised, meaning that directly identifying details have been replaced so that it is difficult to link the text to an individual patient.
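
To give a rough idea of what pseudonymisation can look like in practice, here is a minimal Python sketch. It is not the method the researchers used, which the article does not describe. The function name `pseudonymise`, the placeholder format, and the sample note are all invented for illustration, and the sketch assumes that the names to mask are already known and that identity numbers appear as 11 consecutive digits.

```python
import re

def pseudonymise(text, known_names):
    """Replace known names and 11-digit identity numbers with placeholders.

    Hypothetical helper for illustration only. Real clinical
    de-identification has to handle far more than this.
    """
    mapping = {}
    for name in known_names:
        # Each distinct name gets one stable placeholder.
        mapping.setdefault(name, f"[PERSON_{len(mapping) + 1}]")
    # Replace longer names first so a full name is not split by a shorter one.
    for name in sorted(mapping, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", mapping[name], text)
    # Assumption: identity numbers are written as 11 consecutive digits.
    text = re.sub(r"\b\d{11}\b", "[ID_NUMBER]", text)
    return text, mapping

# Invented sample note, not real patient data.
note = ("Kari Nordmann (01017012345) was admitted with abdominal pain. "
        "Dr. Ola Hansen examined Kari Nordmann.")
masked, key = pseudonymise(note, ["Kari Nordmann", "Ola Hansen"])
print(masked)
```

The returned mapping is what separates pseudonymisation from full anonymisation: whoever holds that key could in principle restore the original names, so it has to be stored separately and protected.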

“The goal is to get the model approved soon so that more people can use it and thereby provide invaluable help in the healthcare service,” says Phuong Dinh Ngo, a researcher in the Department of Health Data and Analysis at the Norwegian Centre for E-health Research.

Language technology

The language model is based on technology originally developed by Google in 2018.

It is trained on Norwegian clinical texts. It can understand medical terms and the contexts in which they are used. This is crucial for the model to be adopted in the healthcare sector. Precise and accurate understanding of text can be vital.

“Artificial intelligence is already helping to solve some tasks in healthcare services. This project is a step forward for the use of AI in healthcare. I am focused on the safe use of AI, and here we have a language model trained on genuine Norwegian health data. That's good,” says State Secretary in the Ministry of Health and Care Services, Ellen Rønning-Arnesen (Labour Party).

She commends the development of an AI model that preserves culture and language in the Norwegian healthcare service.

NorDeClin-BERT

BERT is an abbreviation for Bidirectional Encoder Representations from Transformers and is the name of a family of language models developed by Google and launched in 2018.

There are already other BERT models that process the Norwegian language: NorBERT 1 to 3 from the University of Oslo and nb-BERT from the National Library of Norway. These models have a good general understanding of text but a weaker grasp of clinical text.

The project is scheduled to be completed in 2025, but the first model is expected to be available in the second half of 2024.

Challenges and solutions

One of the biggest challenges in development has been access to data.

Clinical texts contain sensitive personal information. Extensive approvals are required to use this data for research.

Researchers have been working for four to five years to gain access to the necessary data. They have also developed methods that safeguard privacy.

“It has been a long process to access clinical text and then remove sensitive information from it. Now, what's left is to get the necessary approvals in place to use the language model,” says researcher Miguel Angel Tejedor Hernandez.

Revolution for the healthcare sector?

Researchers believe the model has the potential to revolutionise how healthcare professionals handle clinical information.

The model can assist with automatic diagnosis coding, identify the names of medications in texts, and also anonymise text.

A faster and more accurate overview of patient information can improve patient safety. It can also streamline hospital administration.

“Clinical text is different from regular Norwegian text in that doctors and healthcare personnel may write it in different ways. They can use different names with different meanings. A model that is capable of decoding and understanding the language from healthcare professionals is therefore an important innovation to improve both patient treatment and efficiency in the healthcare sector,” says Phuong Dinh Ngo.

Competition and collaboration

The researchers have now developed the first clinical language model for gastrointestinal surgery. Other institutions are also working on similar projects.

“With AI, healthcare personnel will be able to use their time more effectively. It can contribute to more labour-saving processes,” says State Secretary Rønning-Arnesen.

This could involve more efficient and improved content in medical records. AI can support doctors by compiling information from blood tests, imaging studies, and medical records. The system can incorporate new research-based knowledge. It can also, for example, suggest possible diagnoses and assist the doctor in assessing treatment risks.

NorDeClin-BERT has benefited from collaboration with the Swedish research infrastructure Health Bank at Stockholm University. The researchers have also collaborated with the gastro-surgical department at the University Hospital of North Norway.

The road ahead

The researchers have applied for approval of the language model. The goal is to share it with other researchers and healthcare institutions. 

They also hope to implement the system in some hospitals to observe how it performs in a real clinical treatment process.

“The project is planned to be completed in 2025, but already in the second half of 2024, we hope and expect that the first version of NorDeClin-BERT will be available for use in the healthcare sector,” says Miguel Angel Tejedor Hernandez.

The goal is for the model to become a resource for the entire Norwegian healthcare system, with the potential for further development and adaptation to additional medical fields.

———

Read the Norwegian version of this article on forskning.no
