The overall positive picture of AI painted in many presentations at the European Congress of Radiology in Vienna this summer and at other conferences is not necessarily mirrored outside the IT business and technologists’ world of medical imaging. Features are seen in a different light by radiologists in hospitals and private offices as well as independent expert AI scientists and major consulting companies.
Reservations are building up against naive and simplistic promises of what AI will be able to deliver. The often-described neural networks are clearly gross oversimplifications of the actual neurons of a human brain. The neurons of a well-trained radiologist work faster and more efficiently, although computer assistance can facilitate administrative, diagnostic, and research routines.
Jack Copeland, a cutting-edge AI researcher and leading professor in the field, wrote about artificial intelligence:
“Exaggerated claims of success, in professional journals as well as the popular press, have damaged its reputation. At the present time even an embodied system displaying the overall intelligence of a cockroach is proving elusive, let alone a system that can rival a human being. The difficulty of scaling up AI’s modest achievements cannot be overstated [1].”
Eric Daimler, the chief executive of Conexus AI in San Francisco, shares that opinion:
"The trendy foundational models of deep learning are not software composable. This is a limitation of the models and means that they will always have weaknesses that are more appropriate to jobs with low-consequence outcomes. Deploying this tech alone in life-critical environments in not currently solvable with just bigger models [2]."
AI research attempts to reach one of three goals:
(1) Strong AI, which aims to build machines that think;
(2) Cognitive Simulation — here computers are used to test theories about how the human mind works — for example, theories about how people recognize faces or recall memories; and
(3) Applied AI, also known as advanced information processing, which aims to produce commercially viable ‘smart’ systems, for example expert medical diagnosis systems such as supervised or unsupervised computer-assisted detection or diagnosis (CAD) [3] or machine (deep) learning (ML) [4].
Software designs offered for medical imaging are not genuine AI, but rather basic or sophisticated CAD or ML systems. Machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. In radiology, the aim of such systems is to automate more of the routine imaging work, including diagnosis and reporting. For this purpose, four prerequisites must be met:
data of sufficient quantity and quality,
a powerful algorithm,
a narrowly defined task area,
a concrete goal to be achieved.
Of the four prerequisites, sufficient amounts of data will be easily available; however, data quality is and will remain imprecise, inadequate, and often irreproducible, as described for instance by Lloret [5]:
“One of the problems comes from the variability of the data itself (e.g., contrast, resolution, signal-to-noise) which make the Deep Learning models suffer from a poor generalization when the training data come from different machines (different vendor, model, etc.) with different acquisition parametrization or any underlying component that can cause the data distribution to shift.”
Moreover, it is well known that scanner effects can subtly yet significantly affect machine learning [6]. This holds for both quantification and detection, the most common AI/ML applications that prospective vendors submit to the FDA for approval. We have discussed the pitfalls of such quantifications earlier [7].
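A toy illustration of such a scanner effect, not taken from the cited studies: the sketch below trains a simple intensity threshold on simulated data from one scanner and applies it unchanged to a second scanner with a different gain and offset. All names and numbers (simulate_scanner, the gain of 1.1, the offset of 15) are assumptions chosen for illustration only.

```python
import random
import statistics


def simulate_scanner(n, offset, gain, seed):
    """Return (intensity, label) pairs for a hypothetical scanner.

    Lesion tissue (label 1) is brighter than healthy tissue (label 0).
    offset and gain stand in for vendor-specific acquisition settings.
    """
    rng = random.Random(seed)
    samples = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are balanced
        true_signal = rng.gauss(100.0 + 20.0 * label, 5.0)
        samples.append((gain * true_signal + offset, label))
    return samples


def fit_threshold(samples):
    """'Train' by placing the cut-off midway between the two class means."""
    healthy = [x for x, y in samples if y == 0]
    lesion = [x for x, y in samples if y == 1]
    return (statistics.mean(healthy) + statistics.mean(lesion)) / 2.0


def accuracy(samples, threshold):
    """Fraction of samples where 'brighter than threshold' matches the label."""
    correct = sum((x > threshold) == bool(y) for x, y in samples)
    return correct / len(samples)


# Training and held-out data from scanner A, plus data from a shifted scanner B.
train_a = simulate_scanner(2000, offset=0.0, gain=1.0, seed=1)
test_a = simulate_scanner(2000, offset=0.0, gain=1.0, seed=3)
test_b = simulate_scanner(2000, offset=15.0, gain=1.1, seed=2)

threshold = fit_threshold(train_a)
acc_a = accuracy(test_a, threshold)
acc_b = accuracy(test_b, threshold)
print(f"threshold={threshold:.1f}  scanner A={acc_a:.2f}  scanner B={acc_b:.2f}")
```

With these made-up settings, the threshold learned on scanner A separates the two classes well on fresh scanner A data but labels nearly every healthy sample from scanner B as a lesion. Real deep learning models are far more complex, yet the underlying failure is the same kind of distribution shift.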
Suitable algorithms will be obtainable — yet each AI vendor is producing its individual AI variants. Some will be better than others and all will presumably deliver — slightly or distinctly — different results.
As far as the task areas and concrete goals are concerned, there will be dozens, perhaps hundreds, of different software packages for different organs or diagnostic questions. There won’t be one general algorithm based on training datasets for the whole human body, with all the variations from children to old people [8], that also covers sufficient geographic locations representing diverse cohorts [9,10].
In the end, the software should be able to draw inferences relevant to the solution of the particular task or situation. Often, validation of the CAD and ML systems is missing [11]. As one example among many, a group from the University of Cambridge scrutinized several thousand publications and concluded:
“Despite the huge efforts of researchers to develop machine learning models for COVID-19 diagnosis and prognosis, we found methodological flaws and many biases throughout the literature, leading to highly optimistic reported performance [12].”
AI in medicine, and particularly in medical imaging, has long slipped out of dependable scientists' control. Looking at the publications and talks at meetings, there are more unqualified than qualified contributions. The frenzied hype around functional magnetic resonance imaging (fMRI) led to some 40,000 fMRI papers of ‘questionable validity’ [13,14]; it is to be feared that the way applied AI is used in medical imaging carries an analogous risk.
A new approach to research is catching on: Fast Science. All and sundry presume to have expertise in anything, including AI, but most lack the competence to explain and judge. There is a quasi-religious belief in artificial intelligence with science fiction fantasies. Everybody wants to beat a possible financial or career competitor by a whisker. Often the arguments are not scientific but ad hominem:
“… beneficial AI applications run the risk of not being adopted because of a lack of proven health and economic benefits and may lead to potential health loss and unnecessary costs, which are likely to persist until AI, with its seemingly endless possibilities, is recognized as an intervention that can and should be properly assessed [15].”
In other words, if you don’t jump on the AI train immediately you are guilty: you hurt patients and waste their money. You are coerced into jumping on the train of ‘endless possibilities’. Crowd psychology teaches that there is a human desire to be a member of a group, thinking, behaving and deciding the same way without individual critical evaluation, to minimize conflict and not be excluded: don’t check whether AI works and has a proven positive impact, just be part of it.
Medicine is on its way back to arbitrary research without questioning its understanding of scientific groundwork. Thus, half-baked layman's wishes determine the direction — and in most cases IT specialists, health administrators, even natural scientists getting involved in medicine are these laymen. A wave of hocus-pocus and hocus-bogus is rolling.
However, reassessment in recent years has led to a certain pensiveness. It seems as if many of the promised benefits are missing. Will the results and the outcome be cheaper, faster, more reliable and better than the evaluation of medical images by a trained radiologist?
Surveys by radiological societies and consulting firms are sobering: artificial intelligence faces a slow acceptance and is achieving fairly limited success. One of the conclusions of an analysis by the news magazine The Economist together with the Swiss Pictet Banking Group reads:
“Findings suggest that AI investment is increasingly concentrated in a narrowing field of commercial applications, which may come at the expense of more exploratory and foundational research [16].”
The acceptance by radiologists is guarded and slack as both the European Society of Radiology (ESR) and the American College of Radiology (ACR) reveal:
ESR: “In the previous ESR survey conducted in 2018, 51% of respondents expected that the use of AI tools would lead to a reduced reporting workload. The actual contributions of AI to the workload of diagnostic radiologists were assessed in a recent analysis based on large number of published studies. It was concluded that although there was often added value to patient care, workload was decreased in only 4% but increased in 48% and remained unchanged in 46% institutions. In summary, this survey suggests that, compared with initial expectations, the use of AI-powered algorithms in practical clinical radiology today is limited, most importantly because the impact of these tools on the reduction of radiologists’ workload remains unproven [17].”
ACR: “Approximately 30% of radiologists [in the U.S.A.] are currently using AI as part of their practice. Large practices were more likely to use AI than smaller ones, and of those using AI in clinical practice, most were using AI to enhance interpretation, most commonly detection of intracranial hemorrhage, pulmonary emboli, and mammographic abnormalities. Of practices not currently using AI, 20% plan to purchase AI tools in the next 1 to 5 years. … Conclusion: … The survey results indicate a modest penetrance of AI in clinical practice [18].”
AI is mostly used and tried out in university and other teaching hospitals, to produce articles and talks that promote the careers of younger doctors. The increase in the number of examinations, particularly at high-throughput institutions, doesn’t necessarily go hand in glove with quality.
“Recently published medical imaging studies often add value to radiological patient care. However, they likely increase the overall workload of diagnostic radiologists, and this particularly applies to AI studies [19].”
There are numerous risks, ‘second-order’ effects, and unexpected, uncontrollable implications of employing AI/CAD/ML.
With only small amounts of training data, deep learning models can figure out demographic features such as age, sex, body-mass index, and race even from corrupted, cropped, and noisy anonymous chest x-rays and CT images with high discriminative performance, often when clinical experts are unable to pinpoint these features. This ability creates an enormous risk for all possible deployments in medical imaging because the AI software could run amok invisibly in the background. It is a bias that might lead to wrong diagnoses and therapy, as well as to discrimination against patients [20,21,22].
The results are artificial ‘gossip’ and ‘rumors’. You can’t trust this secondary outcome and it has nothing to do with the task the software has been asked to perform. Artificial intelligence of this kind is not intelligent enough to distinguish real facts from self-created fiction.
A report by the US-American consulting company McKinsey discusses other potential risks of AI in detail. It claims on the one hand that AI will improve our lives by "enhancing our healthcare experiences" — whatever that might mean — but also sees:
"There also are second-order effects, such as the atrophy of skills (for example, the diagnostic skills of medical professionals) as AI systems grow in importance [23]."
However, trained radiologists are essential. While the likes of the regulatory authorities may develop a series of test cases to compare products from different vendors, these cases will only reflect a limited range of pathologies; rarer pathologies may not be included. The end user has no concept of how good or bad the particular algorithm is at making a correct diagnosis in a particular case, and the system is likely to provide a black-or-white response.
Can the software say ‘I don't know, I am not sure — we need an expert opinion in this area’? The user of the software will be unaware that there may be a degree of uncertainty or bias.
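In principle, such an ‘I don't know’ answer can be built in as a reject option. The following is a minimal sketch under strong assumptions: the function name triage, the finding labels, and the confidence threshold of 0.9 are hypothetical, and the scores are assumed to be well-calibrated probabilities, which cannot be taken for granted in practice. Whether any commercial product behaves this way is a separate question.

```python
def triage(probabilities, min_confidence=0.9):
    """Return the most likely finding, or defer to a human expert.

    probabilities: dict mapping a finding name to a model score in [0, 1],
    assumed to be calibrated and to sum to 1 (a strong assumption).
    min_confidence: hypothetical cut-off below which the case is referred.
    """
    finding, score = max(probabilities.items(), key=lambda item: item[1])
    if score < min_confidence:
        return "refer to radiologist"
    return finding


# Two made-up cases: one clear-cut, one where the scores are close together.
confident_case = {"normal": 0.97, "hemorrhage": 0.02, "mass": 0.01}
uncertain_case = {"normal": 0.48, "hemorrhage": 0.41, "mass": 0.11}

print(triage(confident_case))  # prints "normal"
print(triage(uncertain_case))  # prints "refer to radiologist"
```

The sketch only helps if the scores mean what they claim to mean. A system that is confidently wrong on a rare pathology it never saw during training would pass this check without complaint.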
Radiologists tend to know colleagues who have particular expertise in certain fields. They can refer difficult cases to them for a second opinion. Does this referral request have any equivalent place in AI? The pundits will say that AI will improve as the training data increases. But what happens when radiologists providing difficult or rare diagnostic solutions no longer exist because AI has superseded them?
What happens if a health system fails (in this case the British NHS) and there are no radiologists available? Then it is: any port in a storm. For most European countries it was an unthinkable development although even on the continent this was brought up earlier [24]. Now we read:
“Radiographer reporting is accepted practice in the UK. With a national shortage of radiographers and radiologists, artificial intelligence (AI) support in reporting may help minimise the backlog of unreported images [25].”
The authors explain that a minimum of 50% of plain x-ray images should be reported by a radiographer with the help of computed diagnosis. They admit that the complexity of these systems means that the processes are not transparent, sometimes even to the developers.
Meanwhile, of all institutions, the European Union has woken up and wants ‘a risk-based approach’ to AI [26].
Preconceived ideas deceive the senses. Who is credible and trustworthy in AI development? A study of industry ties reveals:
“We found that the prevalence of financial ties to industry … was high. For nearly 30% of comments, we were unable to determine whether or not there was a financial tie, and disclosure of ties was non-existent. The proportion of academic submitters was relatively low, and the use of scientific evidence to support comments was sparse. We recommend that the FDA requires disclosure of potential conflict-of-interest, and encourages greater academic participation and use of scientific evidence in public comments [27].”
There are several hundred vendors of “AI” for medical imaging applications, among them a large number of start-ups and spin-offs. There is no place in the market for more than 90% of them. They will not survive and will disappear because there will be no return on investment, be it government or European Union money, venture capital or other sources. Some have already merged with competitors because they cannot survive as standalone companies. What will happen to their employees, their founders, the venture capital invested, the state grants given?
One of the taboo topics in AI sales in medicine is the question of accountability: Who is liable if a computer's decision causes damage? Is it the manufacturer or the user? If a company tries to sell you an AI program, you have to insist that in the sales contract the company underwrites its use and takes full responsibility for possible performance failures.
A long time ago I wrote a column ‘How to purchase an MR machine • In ten easy lessons.’ It began with this sentence:
“Murphy’s Law is the most reliable guideline when buying an MR machine: anything that can go wrong usually does [28].”
The column could easily be adapted to AI/CAD/ML. Thus, I was not puzzled when a ‘saleswoman scientist’ I knew well confessed to me: “I know that our software doesn’t work, but we sell it anyway.”
What was taken for granted yesterday will change today. The high-technology wonderland needs permanent change to earn money.
My advice to department heads: Train your people even better than today, and wait and see until the method is established and proven, or not. Don’t waste time and money. And never forget: Neither radiologists nor AI is infallible.
1. Copeland BJ. Artificial intelligence. Encyclopædia Britannica; (retrieved July 2022).
2. Daimler E. Lower expectations for AI. The Economist. 2 July 2022. 14.
3. Alaux A, Rinck PA. Multispectral analysis of magnetic resonance images: a comparison between supervised and unsupervised classification techniques. In Proc. Int. Symp. on Tissue Characterisation in Magnetic Resonance Imaging, European Soc. for Mag. Res. in Med. & Biol, ISBN 354051532. 19th–21st April 1989, Wiesbaden, Germany, 165–169;
— Rinck PA. Chapter Fifteen: Image Processing and Visualization. In: Rinck PA. Magnetic Resonance in Medicine. The Basic Textbook of the European Magnetic Resonance Forum. 12th edition; 2018|2020. Free offprint.
4. Montagnon M, Cerny M, Cadrin-Chênevert A, et al. Deep learning workflow in radiology: a primer. Insights Imaging. 2020; 11:22. doi.org/10.1186/s13244-019-0832-5
5. Lloret Iglesias L, Sanz Bellón P, Pérez del Barrio A, Menéndez Fernández-Miranda P, Rodríguez González D, Vega JA, González Mandly AA, Parra Blanco JA. A primer on deep learning and convolutional neural networks for clinicians. Insights Imaging. 2021; 12: 117.
6. Ferrari E, Bosco P, Calderoni S, Oliva P, Palumbo L, Spera G, Fantacci ME, Retico A. Dealing with confounders and outliers in classification medical studies: The Autism Spectrum Disorders case study. Artif Intell Med. 2020; 108: 101926. doi: 10.1016/j.artmed.2020.101926.
7. Rinck PA. All is not what it seems in the messy world of research. Don’t play it again, Sam. Rinckside 2021; 32,6: 17-18;
— Rinck PA. Mapping the biological world. Rinckside 2017; 28,7: 13-15.
8. Gauriau R, Bizzo BC, Kitamura FC, et al. A Deep Learning-based model for detecting abnormalities on brain MR images for triaging: preliminary results from a multisite experience. Radiol Artif Intell. 21 April 2021; 3(4): e200184. doi: 10.1148/ryai.2021200184.
9. Kaushal A, Altman R, Langlotz C. Geographic distribution of US cohorts used to train deep learning algorithms. JAMA. 2020; 324 (12): 1212–1213. doi:10.1001/jama.2020.12067
10. Wu E, Wu K, Daneshjou R, Ouyang D, Ho DE, Zou J. How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals. Nat Med 2021; 27: 582–584. doi.org/10.1038/s41591-021-01312-x
11. Rinck PA. Some reflections on artificial intelligence in medicine. Rinckside 2018; 29,5: 11-13;
— Rinck PA. Artificial intelligence meets validity. Rinckside 2019; 30,5: 13-15.
12. Roberts M, Driggs D, Thorpe M, et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat Mach Intell. 2021; 3: 199–217. doi.org/10.1038/s42256-021-00307-0
13. Eklund A, Nichols TE, Knutsson H. Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. PNAS 2016; 113, 28: 7900-7905.
14. Rinck PA. Debacles mar “Big Science” and fMRI research. Rinckside 2016; 27,7: 17-18.
15. Voets MM, Veltman J, Slump CH, Siesling S, Koffijberg H. Systematic review of health economic evaluations focused on artificial intelligence in healthcare: the tortoise and the cheetah. Value in Health. 2022; 25,3: 340-349.
16. The Economist and The Pictet Group. AI is currently enjoying a heyday, but is innovation slowing? 2022; (retrieved July 2022).
17. European Society of Radiology (ESR). Current practical experience with artificial intelligence in clinical radiology: a survey of the European Society of Radiology. Insights Imaging. 2022; 13: 107. doi.org/10.1186/s13244-022-01247-y
18. Allen B, Agarwal S, Coombs L, Wald C, Dreyer K. 2020 ACR data science institute artificial intelligence survey. J Am Coll Radiol 2021; 18: 1153–1159.
19. Kwee TC, Kwee RM. Workload of diagnostic radiologists in the foreseeable future based on recent scientific advances: growth expectations and role of artificial intelligence. Insights Imaging. 29 June 2021; 12(1): 88. doi: 10.1186/s13244-021-01031-4.
20. Jabbour S, Fouhey D, Kazerooni E, Sjoding MW, Wiens J. Deep learning applied to chest x-rays: exploiting and preventing shortcuts. PMLR 2020; 126: 750–782.
21. Seyyed-Kalantari L, Zhang H, McDermott MBA, Chen IY, Ghassemi M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat Med. 2021;27: 2176-2182. doi: 10.1038/s41591-021-01595-0.
22. Gichoya JW, Banerjee I, Bhimireddy AR, et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit Health. 2022; 4(6):e406-e414. doi: 10.1016/S2589-7500(22)00063-2.
23. Cheatham B, Javanmardian K, Samandari H. Confronting the risks of artificial intelligence. McKinsey Quarterly. 26 April 2019.
24. Rinck PA. Rude awakening: Will radiographers eventually take over? Rinckside 2011; 22,4: 7-8.
25. Rainey C, O'Regan T, Matthew J, et al. UK reporting radiographers’ perceptions of AI in radiographic image interpretation – current perspectives and future developments. Radiography, 2022; 28: 881-888.
26. European Commission. Regulatory framework proposal on artificial intelligence. 2022 (retrieved July 2022).
27. Smith JA, Abhari RE, Hussain Z, Heneghan C, Collins GS, Carr AJ. Industry ties and evidence in public comments on the FDA framework for modifications to artificial intelligence/machine learning-based medical devices: a cross sectional study. BMJ Open. 14 October 2020. 10(10): e039969. doi: 10.1136/bmjopen-2020-039969.
28. Rinck PA. How to purchase an MR machine • In ten easy lessons. Rinckside 1992; 3,4: 9-10.
Citation: Rinck PA. The state of Artificial Intelligence in medical imaging • Part II
Are radiologists’ neurons faster and cheaper?
Rinckside 2022; 33,3: 5-9.
Rinckside • ISSN 2364-3889
is published both in an electronic and in a printed version. It is listed by the German National Library.
Rinck is my last name, and a rink is an area of combat or contest.
Rinkside means by the rink. In a double meaning “Rinckside” means the page by Rinck. Sometimes I could also imagine “Rincksighs”, “Rincksights” or “Rincksites” …