MammoClass Clinical Decision Support System

Publication type: Article Summary
Original title: Implementação de uma ferramenta de suporte à entrada de texto em Português, escrito e falado, para o sistema de apoio à decisão clínica MammoClass
Article publication date: October 2016
Source: Repositório Aberto da Universidade do Porto
Author: Ricardo Rocha
Supervisor: Inês Dutra

What is the goal, target audience, and areas of digital health it addresses?
     The aim of the study is to improve the MammoClass tool so that mammography findings dictated or written in Portuguese can be converted into standardised terms, facilitating clinical analysis and prediction of the malignancy risk of breast lesions. The target audience includes health professionals, particularly radiologists and primary care doctors, as well as health technology researchers and decision-makers involved in the management and digital transformation of clinical services. The study falls within digital health, with relevance to artificial intelligence applied to health, voice-based interfaces, interoperability of clinical information systems and medical decision support tools.

What is the context?
     Digital transformation has reconfigured healthcare systems by integrating digital technologies into clinical and administrative processes, driving improvements in service quality, patient safety and the organisational efficiency of institutions. Beyond the computerisation of clinical records, digitalisation has driven the adoption of advanced clinical decision support systems based on artificial intelligence, interoperability between information systems and the automated analysis of large volumes of data in real time.

     In breast imaging, these technologies are particularly important given the complexity of imaging data and the need for rapid and accurate decisions, especially in light of the high incidence of breast cancer. The disease is characterised by uncontrolled cell proliferation in breast tissue, with a high potential for local invasion and spread to other organs. Despite increased screening efforts, indicators such as mortality rates remain worrying, underscoring the need for digital solutions capable of promoting earlier detection, more accurate risk stratification and a reduction in the clinical and social impact of the disease.

What are the current approaches?
     Current approaches to supporting clinical decision-making in breast imaging rely largely on standardising the interpretation of exams and on computer tools that work with structured data. Mammography (X-ray of the breast) remains the main imaging test in screening programmes, playing a central role in the early detection of potentially malignant alterations. To standardise the description of results and facilitate communication between health professionals, the Breast Imaging Reporting and Data System (BI-RADS®) was developed. It establishes a standardised lexicon with 43 specific descriptors for characterising breast structures, such as the presence of dark margins or an irregular lesion shape. Based on the combination of these and other parameters, radiologists assign an assessment category from 0 to 6, where category 0 denotes an incomplete assessment and, from 1 to 6, higher values indicate a higher risk of malignancy.

     This structured data is essential for feeding clinical decision support systems such as MammoClass, a platform that predicts the likelihood that a mammographic alteration is benign or malignant from an analysis of BI-RADS® descriptors. In clinical practice, however, most reports are still written in free text, through dictation or handwriting, which makes efficient conversion to structured formats difficult and limits data interoperability.

     To mitigate this limitation, some institutions use voice recognition technologies that transcribe clinical dictation automatically. When applied to Portuguese, however, these tools have high error rates, especially in environments with background noise or when faced with regional variations in pronunciation. In addition, these solutions are generally designed for full speech transcription and are not optimised for the selective identification and extraction of relevant clinical terminology. As a result, the quality and reliability of the extracted data are compromised.

What does the innovation consist of? How is the impact of this study assessed?
     The innovation of this study is MammoClass V2, an expanded and updated version of the original system, which incorporates new functionalities designed to facilitate the entry and structuring of clinical data from mammograms. This version introduces an approach centred on the automatic extraction of BI-RADS® descriptors from clinical text written in Portuguese, so that this structured data can be fed directly into the clinical decision support system. Instead of transcribing the entire speech, the system focuses exclusively on identifying and extracting the relevant descriptors, ignoring the rest of the content.

     To this end, a multi-platform web interface was developed (accessible by computer, tablet and smartphone) that allows clinical text to be entered in Portuguese in three different ways: free writing, form filling and dictation supported by voice recognition technologies (Speech-to-Text). The terms are extracted by a specialised linguistic parser, designed to recognise clinical terminology even when it is expressed with grammatical variations or in a different word order, which makes textual interpretation more robust and flexible.
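
     The idea of extracting only the descriptors, regardless of inflection or word order, can be illustrated with a minimal Python sketch. It is not the parser built in the study: the four-entry lexicon, the stem patterns and the function names are hypothetical, chosen only to show how accent stripping and stem matching tolerate grammatical variation while ignoring the rest of the text.

```python
import re
import unicodedata

# Hypothetical mini-lexicon (an assumption, not the MammoClass lexicon):
# normalised Portuguese stem pattern -> standardised descriptor label.
LEXICON = {
    r"\bespiculad\w*": "margin: spiculated",
    r"\birregular\w*": "shape: irregular",
    r"\bcircunscrit\w*": "margin: circumscribed",
    r"\bmicrocalcificac\w*": "calcifications: microcalcifications",
}


def normalize(text: str) -> str:
    """Lowercase the text and strip diacritics (e.g. 'ções' becomes 'coes')."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def extract_descriptors(report: str) -> list[str]:
    """Return the descriptors found in the report, ordered by first appearance.

    Words that match no lexicon entry are simply ignored.
    """
    clean = normalize(report)
    hits = []
    for pattern, descriptor in LEXICON.items():
        match = re.search(pattern, clean)
        if match:
            hits.append((match.start(), descriptor))
    return [descriptor for _, descriptor in sorted(hits)]


if __name__ == "__main__":
    report = "Nódulo de forma irregular, com margens espiculadas e microcalcificações agrupadas."
    print(extract_descriptors(report))
```

     Because matching is done on stems of the normalised text, singular and plural forms or a different ordering of the same findings yield the same descriptors. A real parser would also need to handle negation and link each descriptor to the structure it describes, which this sketch does not attempt.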

     To evaluate the system, a methodological approach based on quantitative tests was designed. The tests covered two data input scenarios: (i) lists of predefined BI-RADS® descriptors, dictated individually, and (ii) complete clinical reports written in natural language from mammograms performed at Centro Hospitalar São João, in Porto. In parallel, two speech recognition technologies were compared: the Web Speech API, based on Google cloud services, and the Julius/Coruja system, an open-source tool that runs locally and therefore gives greater control over, and privacy of, the processed data.

What are the main results? What is the future of this approach?
     The results showed that the approach enabled the automatic extraction of BI-RADS® descriptors from clinical text in Portuguese with consistent performance. In quantitative tests, the system achieved an average accuracy of 81 percent and a sensitivity of over 85 percent, both on lists of individually dictated terms and on complete clinical reports written in natural language. These figures indicate that the tool can correctly identify relevant clinical terminology and reduce omissions. The linguistic parser proved effective at detecting descriptors expressed with grammatical variations, different verb tenses or different syntactic orders, demonstrating flexibility in natural language processing. This performance held for both written and voice-dictated texts, the forms most often found in clinical practice.
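
     For readers unfamiliar with these metrics, the Python sketch below shows one common way to compute accuracy and sensitivity for descriptor extraction. The summary does not state how the study defined them, so the definitions here are an assumption: each descriptor in the lexicon is treated as a yes/no decision per report, and the function name and example data are hypothetical.

```python
def descriptor_metrics(extracted: set[str], reference: set[str], lexicon: set[str]) -> dict[str, float]:
    """Compare extracted descriptors with a reference annotation for one report.

    Assumed definitions (not taken from the study):
      sensitivity = TP / (TP + FN)
      accuracy    = (TP + TN) / (TP + TN + FP + FN)
    """
    tp = len(extracted & reference)            # correctly extracted
    fn = len(reference - extracted)            # present in the report but missed
    fp = len(extracted - reference)            # extracted but not in the report
    tn = len(lexicon - extracted - reference)  # correctly left out
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical ten-descriptor lexicon and one annotated report.
    lexicon = {f"d{i}" for i in range(10)}
    reference = {"d0", "d1", "d2", "d3"}
    extracted = {"d0", "d1", "d2", "d4"}
    print(descriptor_metrics(extracted, reference, lexicon))
```

     Under these definitions, sensitivity reflects how few descriptors are omitted, while accuracy also penalises descriptors that were extracted but are absent from the report.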

     As far as voice recognition is concerned, the Web Speech API performed better in terms of transcription fidelity, especially in quiet environments and with speakers with neutral diction. The Julius/Coruja system, despite a slightly lower recognition rate, proved advantageous because it operates locally, offering greater privacy and independence from external connectivity, characteristics that are relevant in sensitive clinical contexts.

     Despite these advances, the solution presents opportunities for further development, particularly in terms of expanding the recognised clinical vocabulary, resilience to less controlled contexts and the integration of more advanced language models. In the short term, pilot studies are planned in healthcare institutions to validate the tool’s impact on clinical practice, especially in the production and standardisation of reports. In the medium term, it is anticipated that the approach will be extended to other areas of imaging and that language models based on machine learning will be incorporated, with a view to improving sensitivity without compromising clinical specificity.
