REVIEW ARTICLE
REVISTA DE LA FACULTAD DE MEDICINA HUMANA 2024 - Universidad Ricardo Palma
1 Department of Medicine, Fundación Universitaria San Martin, Sabaneta, Colombia.
2 Department of Medicine, Universidad de la Sabana, Chía, Colombia.
3 Department of Medicine, Fundación Universitaria San Martin, Cali, Colombia.
4 Department of Medicine, Universidad Simón Bolívar, Barranquilla, Colombia.
5 Department of Medicine, Unidad Central del Valle del Cauca, Tuluá, Colombia.
6 Department of Medicine, Universidad del Valle, Cali, Colombia.
7 Department of Medicine, Universidad del Magdalena, Santa Marta, Colombia.
8 Department of Medicine, Universidad Militar Nueva Granada, Bogotá, Colombia.
9 Fac Ciències Salut Blanquerna, University Ramon Llull, Barcelona, Spain.
a Physician.
b Master's in Epidemiology and Public Health.
c Doctoral Candidate in Health, Well-being, and Bioethics.
ABSTRACT
Introduction: Breast cancer remains one of the most prevalent cancers globally and is the most
common cancer in women. Artificial intelligence promises to contribute to early diagnosis through
imaging. The landscape and evolution of scientific production on this topic have not previously been
described.
Methods: Cross-sectional bibliometric study using Scopus as the data source. The bibliometrix
package in R was employed for calculating bibliometric indicators and visualizing the results.
Results: A total of 1292 documents published between 1989 and 2024 were selected. Of these, 75.3%
(n=973) were articles with primary data and 16.2% (n=209) were reviews. The international
collaboration rate was 26.5%, and annual production grew by 10.78%. The most commonly used keywords
were risk classification through screening, digital breast tomosynthesis, transfer learning,
segmentation, and feature selection. In the last five years, deep learning and mammography have been
the most popular topics. International collaboration has been led by the United States, China, and the
United Kingdom.
Conclusions: A notable growth in global research on the use of artificial intelligence in breast
cancer imaging for detection was identified, particularly since the 2010s, primarily through the
publication of articles with primary data. The relationship between artificial intelligence and imaging
for breast cancer diagnosis has focused on risk and prediction.
Keywords: Artificial Intelligence, Mammography, Mammary Ultrasonography, Breast Neoplasms,
Bibliometrics. (Source: MeSH).
Introduction
Breast cancer remains one of the most common cancers globally, particularly among women, with
high morbidity, mortality, healthcare costs, and impact on quality of life (1, 2). An estimated one-fourth of global cases, nearly 500,000, are
diagnosed in Latin America (2). While the prognosis of breast
cancer has improved significantly in high-income countries, it remains poor in low- and middle-income
countries because of substantial barriers to implementing early diagnosis and management strategies (3 - 6).
Screening, primarily through mammography, has proven useful for the early detection of this cancer
(7). However, its reproducibility and impact vary across settings
because of factors related to workforce training, infrastructure, and public policy. Tools such as
artificial intelligence have therefore been designed to complement the performance of this diagnostic
aid and to streamline patient flow (8). Studies using
algorithms built on imaging patterns and clinical characteristics have significantly improved
the diagnostic performance for breast cancer (9). Nonetheless, there are no
research groups, lines of research, or large cohorts that facilitate the pooling of population-based data
(10 - 12).
To understand the global research landscape on a tool that can modify early breast cancer detection and
be replicable in numerous scenarios, including low- and middle-income countries (13), this study aimed to analyze the global scientific production related
to the use of artificial intelligence in imaging for breast cancer detection.
Methodology
A cross-sectional bibliometric study was conducted using Scopus, the largest database of peer-reviewed
scientific literature, which has previously been used for this type of analysis (14, 15). Compared with other search engines, citation indices,
and databases such as PubMed or Web of Science, Scopus indexes a greater number of Latin American
biomedical journals, which facilitates the identification of evidence from this region.
A structured search was designed and executed to identify articles related to the use of artificial
intelligence in imaging for breast cancer detection. Author affiliation was taken from the metadata
and corroborated against the official full-text publication. The search strategy was built using
MeSH terms and synonyms in both English and Spanish. After a pilot test, the following search was
defined:
TITLE-ABS-KEY("Breast Carcinoma In Situ") OR TITLE-ABS-KEY("Breast Ductal Carcinoma") OR
TITLE-ABS-KEY("Lobular Carcinoma") OR TITLE-ABS-KEY("Triple Negative Breast Neoplasms") OR
TITLE-ABS-KEY("Unilateral Breast Neoplasms") OR TITLE-ABS-KEY("Inflammatory Breast Neoplasms") OR
TITLE-ABS-KEY("Breast Cancer") OR TITLE-ABS-KEY("Mammary Cancer") OR
TITLE-ABS-KEY("Malignant Neoplasm of Breast") OR TITLE-ABS-KEY("Breast Malignant Neoplasm") OR
TITLE-ABS-KEY("Breast Malignant Tumor") OR TITLE-ABS-KEY("Cancer of Breast") OR
TITLE-ABS-KEY("Cancer of the Breast") OR TITLE-ABS-KEY("Breast Carcinoma")
AND TITLE-ABS-KEY("Artificial Intelligence") OR TITLE-ABS-KEY("Computational Intelligence") OR
TITLE-ABS-KEY("Machine Intelligence") OR TITLE-ABS-KEY("Computer Reasoning") OR
TITLE-ABS-KEY("Computer Vision System") OR TITLE-ABS-KEY("Machine Learning") OR
TITLE-ABS-KEY("Deep Learning") OR TITLE-ABS-KEY("Sentiment Analysis") OR
TITLE-ABS-KEY("Neural Networks")
AND TITLE-ABS-KEY("Early Detection of Cancer") OR TITLE-ABS-KEY("Cancer Screening") OR
TITLE-ABS-KEY("Cancer Early Diagnosis") OR TITLE-ABS-KEY("Early Diagnosis").
The search covered records indexed up to February 10, 2024, and was filtered with the labels "Humans" and "Journals."
This excluded literature that does not follow the regular peer-review process for publication in scientific
journals, such as books, book series, abstracts, and conference proceedings. No date restriction was
applied to the inclusion of articles.
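Because the strategy above is written without parentheses, its interpretation depends on the operator precedence applied by the database. As an illustration only, the following minimal Python sketch assembles the three concept blocks with explicit grouping. The helper names (or_block, build_query) are hypothetical, and the term lists are an abbreviated subset of the published strategy.

```python
def or_block(terms):
    """Join phrases as TITLE-ABS-KEY("...") clauses combined with OR, in parentheses."""
    return "(" + " OR ".join(f'TITLE-ABS-KEY("{term}")' for term in terms) + ")"


def build_query(*blocks):
    """Combine the OR groups with AND so that every concept must be matched."""
    return " AND ".join(or_block(block) for block in blocks)


# Abbreviated term lists, for illustration only (not the full strategy).
disease_terms = ["Breast Cancer", "Breast Carcinoma"]
method_terms = ["Artificial Intelligence", "Deep Learning"]
purpose_terms = ["Cancer Screening", "Early Diagnosis"]

query = build_query(disease_terms, method_terms, purpose_terms)
print(query)
```

Writing the grouping explicitly makes the intended logic (disease AND method AND purpose) independent of how a given search engine orders AND and OR.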
Subsequently, a manual review was conducted to remove duplicates and articles unrelated to the topic of
interest based on the title, abstract, and keywords. This step was performed in Microsoft Office Excel 2016.
Next, the data for the variables of interest were standardized to reduce discrepancies in the way
metadata are originally recorded, and categories were regrouped accordingly. For example, in the case of article
typology, all original studies providing primary data, regardless of observational or experimental
design, were categorized as "Primary Data Articles"; similarly, all reviews, regardless of design
(whether narrative, systematic, or meta-analysis), were categorized as "Reviews." Editorials, letters to
the editor, comments, and similar items were categorized as "Correspondence."
For statistical analysis, network metrics were employed to visualize trends and characteristics and to
calculate scientific impact. The analysis used the bibliometrix package in R (version 4.3.1), which allows the
calculation of quantitative bibliometric indicators and the visualization of results
(16). Synonyms, errors, plurals, and variants were strictly regrouped to
homogenize the analysis. Keywords, authors, and institutions were standardized in this way.
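The regrouping of synonyms, plurals, and variants can be illustrated with a minimal Python sketch. This is not the procedure applied in the study, which was carried out manually and with bibliometrix; the synonym table and sample keywords below are hypothetical.

```python
import re
from collections import Counter

# Hypothetical synonym table: variant -> canonical keyword.
SYNONYMS = {
    "mammogram": "mammography",
    "mammograms": "mammography",
    "cnn": "convolutional neural network",
    "convolutional neural networks": "convolutional neural network",
}


def normalize(keyword):
    """Lowercase, collapse runs of spaces and hyphens, then apply the synonym table."""
    cleaned = re.sub(r"[\s\-]+", " ", keyword.strip().lower())
    return SYNONYMS.get(cleaned, cleaned)


# Invented raw keywords showing case, hyphenation, plural, and acronym variants.
raw_keywords = ["Mammography", "mammograms", "Deep-Learning", "deep  learning", "CNN"]
keyword_counts = Counter(normalize(k) for k in raw_keywords)
print(keyword_counts.most_common())
```

Without this step, variants of the same concept would be counted as separate keywords and would dilute frequency rankings and co-occurrence networks.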
Additionally, a descriptive analysis of the scientific production found was executed. The most prolific
authors and the distribution of publications were characterized using Lotka's Law. Collaboration
networks were constructed to determine the degree and strength of collaboration between countries.
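Lotka's law states that the number of authors with n publications is roughly proportional to 1/n^a, with a close to 2 in its classical form. The following minimal Python sketch compares an observed author productivity distribution with that expectation. The authorship list is invented, and fixing the exponent at 2 is an assumption; in practice the exponent is estimated from the data.

```python
from collections import Counter


def author_productivity(authorships):
    """Map number of documents -> share of authors with that many documents."""
    docs_per_author = Counter(authorships)
    authors_per_count = Counter(docs_per_author.values())
    total_authors = len(docs_per_author)
    return {n: authors_per_count[n] / total_authors for n in sorted(authors_per_count)}


def lotka_expected(max_n, exponent=2.0):
    """Theoretical shares proportional to 1 / n**exponent, normalized over 1..max_n."""
    weights = {n: 1 / n ** exponent for n in range(1, max_n + 1)}
    norm = sum(weights.values())
    return {n: w / norm for n, w in weights.items()}


# Invented authorship list: one entry per author per document.
authorships = ["A", "A", "A", "B", "B", "C", "D", "E", "F"]
observed = author_productivity(authorships)
expected = lotka_expected(max(observed))

for n in observed:
    print(n, round(observed[n], 3), round(expected[n], 3))
```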
To measure the impact of institutions and countries, the h-index and the absolute number of accumulated
citations were used. The definitions of these metrics and their use in bibliometric studies
have been previously described (17, 18). Frequencies and percentages
were calculated in Microsoft Office Excel 2016.
Ethical Aspects: This study did not require approval from an ethics committee because it did not
involve research on humans, biological models, or medical records.
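For reference, the h-index of a set of documents is the largest number h such that h documents have at least h citations each. A minimal Python sketch of that definition follows; the citation counts are invented, and the study itself obtained this metric through bibliometrix.

```python
def h_index(citations):
    """Largest h such that h documents have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Invented citation counts for the documents of one country or institution.
example_citations = [10, 8, 5, 4, 3]
print(h_index(example_citations))
```

In this example four documents have at least four citations each, but there are not five documents with at least five, so the function returns 4.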
Results
Initially, 1833 documents were identified, of which 540 were conference papers. After the inclusion
and exclusion criteria were applied, 1292 documents were selected. The analyzed evidence spanned
1989 to 2024 (35 years). Among the
selected documents, 75.3% (n=973) were primary data articles, followed by 16.2% (n=209) reviews. The
international collaboration rate was 26.5%, and annual production grew by 10.78%
(Table 1). Growth was slow until 2013, after which publication volume increased notably,
peaking in 2023 with more than 300 articles (Figure 1-A). In contrast, the
average number of citations per article fluctuated over time, peaking in 2019 (Figure 1-B). Applying Lotka's law,
84% of authors had published only one document and 9.8% had published two.
| Characteristic | n | % |
|---|---|---|
| Article type | | |
| Primary data articles | 973 | 75.3 |
| Reviews | 209 | 16.2 |
| Correspondences* | 110 | 8.5 |
| Authors | | |
| Authorships | 5517 | - |
| Authors of single-authored documents (N=5517) | 85 | 1.54 |
| Collaboration | | |
| Single-authored articles | 94 | - |
| Co-authorships per article (average) | 5.7 | - |
| International co-authorship (%) | 26.5 | - |
| Keywords | 2206 | - |
| Journals | 535 | - |
| Average article age (years) | 3.84 | - |
| Average citations per document | 23.9 | - |
| Annual growth | - | 10.78 |

*Includes letters to the editor, editorials, comments, etc.
Figure 1. Annual Scientific Growth of Global Research on the Use of Artificial Intelligence in Imaging for Breast Cancer Detection. A. Annual Publication Frequency. B. Average Citations Received per Article per Year.
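An annual growth figure of this kind is commonly computed as a compound annual growth rate. The minimal Python sketch below assumes that definition, which the text does not spell out; the function name and the input counts are hypothetical and are not the study's data.

```python
def annual_growth_rate(first_count, last_count, n_intervals):
    """Compound annual growth rate, in percent, over n_intervals yearly steps."""
    if first_count <= 0 or n_intervals <= 0:
        raise ValueError("first_count and n_intervals must be positive")
    return ((last_count / first_count) ** (1 / n_intervals) - 1) * 100


# Invented example: 100 documents in the first year, 121 two years later.
rate = annual_growth_rate(100, 121, 2)
print(round(rate, 2))
```

A compound rate summarizes the whole period with a single percentage, so it can hide the pattern seen in Figure 1-A, where growth was slow for many years and then accelerated.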
The United States was the most prolific country, with 311 documents, and also had the highest impact
(h-index of 52 and 11,757 citations). It was followed by China (213 documents; h-index of 33 and 4231 citations) and
India (186 documents; h-index of 30 and 2862 citations). Regarding
affiliations/institutions, Radboud University Medical Center (Netherlands) was the most prolific and
impactful, with 29 documents and an h-index of 19 (1425 citations), followed by Harvard Medical School
(h-index of 12 with 1814 citations), Karolinska Institutet (h-index of 14 with 1145 citations), and
Massachusetts General Hospital (h-index of 14 with 1858 citations), each with 24 documents.
In terms of journals, Radiology had the highest number of documents (n=44) (Figure 2-A). However, Nature
received the highest number of citations (2102 citations) (Figure 2-B). Still, Radiology had the highest
impact, measured by h-index and g-index (19 and 38, respectively) (Figure 2C-D), while Diagnostics had
the highest m-index (2.75) (Figure 2-E). Radiology and Cancers were the journals that grew most notably
in the last seven years (Figure 2-F).
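The g-index and m-index used above complement the h-index. The g-index is the largest g such that the g most cited documents together have at least g squared citations, and the m-index divides the h-index by the number of years since the first publication. The minimal Python sketch below uses invented inputs; counting the first year as year one in the m-index is an assumed convention.

```python
def g_index(citations):
    """Largest g such that the top g documents together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    cumulative, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        cumulative += cites
        if cumulative >= rank * rank:
            g = rank
    return g


def m_index(h, first_year, current_year):
    """h-index divided by years of activity (first year counted as year one; assumed)."""
    years_active = current_year - first_year + 1
    if years_active <= 0:
        raise ValueError("current_year must not precede first_year")
    return h / years_active


# Invented values for a single journal.
print(g_index([10, 8, 5, 4, 3]))
print(m_index(10, 2020, 2024))
```

Because the m-index adjusts for age, a recently launched journal can rank first on it while an older journal leads on the h-index and g-index.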
Figure 2. Impact and publication frequency in journals with the highest number of documents on the use of artificial intelligence in imaging for breast cancer detection. A. Frequency of Published Articles. B. Total Citations Received. C. h-Index Obtained. D. g-Index Obtained. E. m-Index Obtained. F. Cumulative Frequency Over Time of Articles in the Most Popular Journals.
Regarding research trends and patterns, a word cloud construction revealed that risk classification by
screening, digital breast tomosynthesis, transfer learning, segmentation, and feature selection were the
most commonly used keywords (Figure 3-A). In the last five years, deep learning and mammography have
been the most popular topics (Figure 3-B), while in the last 10 years, other topics such as machine
learning, neural networks, breast density, data mining, and risk stratification have also gained great
interest in this field (Figure 3C-D). Biomarkers linked to breast ultrasound and their diagnostic
potential emerged as niche themes, while digital breast tomosynthesis and liquid biopsy were emerging
topics (Figure 3-E). The multiple correspondence analysis showed notable associations among three
groups of topics: 1) radiomics, nuclear magnetic resonance, and risk stratification; 2) mammography, neural
networks, and image classification; and 3) breast density, tomosynthesis, and thermography (Figure 3-F).
Figure 3. Evolution and trends in global research on the use of artificial intelligence in imaging for breast cancer detection. A. Word Cloud of the Most Frequent Keywords. B. Evolution of the Most Frequent Topics Over Time. C. Topic Frequency Since 2010. D. Co-occurrence Network of Keywords. E. Thematic Map with Degree of Development and Relevance of Topics. F. Multiple Correspondence Analysis with Degree of Contribution of Each Topic.
In terms of collaboration networks, it was observed that Harvard Medical School, University of
Pennsylvania, Karolinska Institutet, and Radboud University Medical Center lead international
collaboration, with all institutions collaborating primarily with European and North American
institutions (Figure 4-A). Regarding countries, strong collaboration was identified between the United
States, China, and the United Kingdom. Specifically, China collaborates heavily with other Asian
countries, while the United States and the United Kingdom collaborate with European countries (Figure
4-B). Apart from Brazil, no other Latin American country stood out in international collaboration on the
topic of interest.
Figure 4. Institutional and country collaboration networks in global research on the use of artificial intelligence in imaging for breast cancer detection. A. Collaboration Between Affiliations. B. Collaboration Between Countries.
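A country collaboration network of this kind can be derived from the affiliation countries listed on each document. The minimal Python sketch below counts co-authored documents per country pair (the edge weights) and the share of documents involving more than one country. The input records are invented, and this is an illustration rather than the bibliometrix routine used in the study.

```python
from collections import Counter
from itertools import combinations


def collaboration_edges(documents):
    """Count, for each unordered country pair, the documents they co-authored."""
    edges = Counter()
    for countries in documents:
        for pair in combinations(sorted(set(countries)), 2):
            edges[pair] += 1
    return edges


def international_share(documents):
    """Fraction of documents whose authors span more than one country."""
    multi_country = sum(1 for countries in documents if len(set(countries)) > 1)
    return multi_country / len(documents)


# Invented input: one list of author affiliation countries per document.
documents = [
    ["United States", "United Kingdom"],
    ["China"],
    ["United States", "China", "United States"],
    ["United States", "United Kingdom"],
]
edges = collaboration_edges(documents)
share = international_share(documents)
print(edges.most_common(), share)
```

Repeated countries within a document are collapsed before pairing, so a paper with several authors from the same country adds at most one to each edge.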
Summarizing the articles with the highest impact to date, measured by the number of citations received,
the top three were: 1) International evaluation of an AI system for breast cancer screening (1282
citations; published in Nature in 2020; DOI: 10.1038/s41586-019-1799-6); 2) Artificial intelligence in
cancer imaging: Clinical challenges and applications (901 citations; published in CA: A Cancer Journal
for Clinicians in 2019; DOI: 10.3322/caac.21552); 3) Deep Learning to Improve Breast Cancer Detection on
Screening Mammography (541 citations; published in Scientific Reports in 2019; DOI:
10.1038/s41598-019-48995-4).
Discussion
This analysis reveals, for the first time, the evolution of global research patterns and trends in
the use of artificial intelligence in imaging for breast cancer diagnosis. Although the first
publications appeared in the late 1980s and early 1990s, it was only from the
2010s onwards that global scientific production on this topic showed a gradual yet notable
increase. This may be explained by the rapid
dissemination and advancement of omics tools linked to artificial intelligence
(19, 20). However, countries in Latin America
and Africa still show modest levels of research and international collaboration, despite being regions
with significant needs in breast cancer care and early detection (21).
Possibly, the absence of evidence and data on the current state of applied research in artificial
intelligence in imaging and breast cancer has hindered the construction of an evidence-based roadmap
that promotes research in this field.
It can be inferred that, owing to the large number of networks, consensus statements, and international
collaborations on breast cancer (22, 23), located primarily
in the United States and Europe, these regions have made significant progress in
applying new artificial intelligence techniques to the early detection, risk
stratification, and prediction of breast cancer. Even so, the international collaboration rate was
below 30%. Translational research that searches for biomarkers
through omics, supported by data mining and artificial intelligence analysis, allows
clusters to be built from shared clinical, imaging, and histopathological characteristics,
yielding results with acceptable performance that are applicable in clinical practice (24,
25).
Because this is an emerging research niche, a notable number of citations and accumulated impact can be
observed despite the few years of rapid growth in scientific production. Deep learning,
neural networks, machine learning, and transfer learning make it possible to train algorithms that
identify patterns suggestive of malignancy with a high degree of precision, facilitating accurate
detection of breast cancer (26). Considering that there are non-modifiable variables in the
pathophysiology and evolution of cancer (27), these types of studies need to be
reproduced rigorously to support the achievement of health goals.
Favorably, and as expected in a field where new knowledge is being built, the existing evidence is
predominantly based on primary data. However, data production remains very low or non-existent in many
parts of the world, and this gap in data origin could bias the predictive potential of
algorithms built on the clinical, social, or genetic characteristics of different populations. Nevertheless,
this does not detract from the significant advancement identified in the present analysis.
As a limitation, a single database and citation index, Scopus, was used, although it has been
reported as the database with the largest volume of indexed health sciences literature. Additionally,
errors in the recorded metadata are an inherent source of bias. To control for this,
the authors conducted a manual review and standardization process.
Conclusions
A notable growth in global research on the use of artificial intelligence in imaging for breast cancer
detection was identified, particularly since the 2010s, primarily through the publication of primary data
articles. Production has been led by the United States, China, and India, whereas international
collaboration networks are led by the United States, China, and the United Kingdom. Among the most
popular research niches and patterns are transfer learning, deep learning, neural networks, machine
learning, segmentation, and feature selection, linked to mammography and digital breast tomosynthesis
for risk stratification.
Authorship contributions:
The authors participated in the conceptualization, research, methodology, resources, and
drafting of the original manuscript.
Financing:
Self-funded.
Declaration of conflict of interest:
The authors declare no conflict of interest.
Received:
February 20, 2024
Approved:
June 16, 2024
Corresponding author:
Yelson Picón Jaimes
Address:
Fac Ciències Salut Blanquerna, University Ramon Llull, Barcelona, Spain.
Phone:
+34 645 68 54 60
E-mail:
colmedsurg.center@gmail.com
Article published by the Journal of the Faculty of Human Medicine of the Ricardo Palma University. It is an open access article, distributed under the terms of the Creative Commons license: Creative Commons Attribution 4.0 International, CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/), that allows non-commercial use, distribution and reproduction in any medium, provided that the original work is duly cited. For commercial use, please contact revista.medicina@urp.edu.pe.