


Artificial intelligence meets validity

Rinckside 2019; 30,5: 13-15.


Radiology has gone through turbulent years, even decades. It has changed permanently, from analog to digital, and from x-rays and radioisotope methods via ultrasound to magnetic resonance. The latest hyper-hype in radiology is artificial intelligence [1].

Today’s healthcare system is no longer run by doctors, but by administrators, hospital managers, lawyers, and business economists applying corporate strategies. Many of them do not, or do not want to, understand the caring or human aspects of medicine but base their decisions merely on numbers and collected data. Most of these people seem to hazily regard AI as something anthropomorphic, but computers are and will remain non-human entities.


"Great potential and promising results"



Internet opinion leaders, "experts" and "influencers" like to talk about artificial intelligence in general and in radiology in particular: "Great potential and promising results". If you check, many of them have financial interests in companies dealing with AI, even if they don’t admit it publicly.

However, if you ask them about concrete applications, you get vague answers, such as that the introduction of AI has the potential to greatly enhance every component of the imaging value chain. It’s all "beta" and “will be”; you have to try it (you pay first, of course) to find out and incorporate it with your own results – which then will become the property of the distributing company. It’s a commercial dream deal – little work, a lot of public relations and marketing, and a steady income.

Yet, in many cases the data quality of the input is poor and the necessary trustworthy infrastructure does not exist or requires a much greater technical effort than expected. In many instances the complexity of the problem to be solved is taken into account neither by the promoters of the application nor by its users, because they don’t understand the first thing about it.

Often the complex software and hardware used are impossible to link – it’s not only one program but many components that have to connect and grasp the incoming data to process them in the expected way. In newspeak, this is politely called “lack of maturity”.

Validation is a neglected or simply ignored factor. How difficult it is to implement contouring tools, for instance, and to apply them in AI studies is demonstrated in a recent paper by Zheng Chang:

“Before the AI contouring tool is fully adopted into clinical use as a part of standard practice, it needs validation in more independent multicenter studies with larger patient cohorts,” he wrote. “Although the AI contouring tool shows promising results for NPC primary tumor delineation in this study, section-by-section verification of tumor contour by radiation oncologists should never be omitted [2].”

In the 1980s and 1990s I led an image processing group in the department I headed; during this time a number of important innovations in image processing, image visualization, data collection, and early applications of very specific AI were developed and became basic and expert knowledge, including the knowledge of pitfalls and setbacks [3].

Validation is among these pitfalls; it seems nearly impossible, because the parameters of most digital radiological examinations are not exactly reproducible [1]. However, extremely thorough validation must take place before AI algorithms can be applied clinically.

A Korean paper on AI highlighted that in the first half of 2018 “of 516 eligible published studies, only 6% (31 studies) performed external validation. None of the 31 studies adopted all three design features: diagnostic cohort design, the inclusion of multiple institutions, and prospective data collection for external validation … Nearly all of the studies published in the study period that evaluated the performance of AI algorithms for diagnostic analysis of medical images were designed as proof-of-concept technical feasibility studies and did not have the design features that are recommended for robust validation of the real-world clinical performance of AI algorithms [4].”

In June this year, Ranga Yogeshwar, a physicist from Luxembourg turned German science journalist and television presenter, was interviewed for the newsletter of the Deutsche Röntgengesellschaft. His comments were knowledgeable, competent, critical, and to the point. He also spoke out:

“The question [is] how we validate data, also in science. Where do training data come from? Are they certified? How certain can we be that training data may not contain an a priori error? I still miss a differentiated discussion. It is sometimes frightening on what basis data are collected and trained, even in powerful AI systems.

“For me, it is a question of reflective progress, in which data are validated on the one hand and data flows and access rights are clearly regulated on the other [5].”

An intriguing contribution to one of the discussions at the recent conference "Standing at the Crossroads: 40 Years of MR Contrast Agents" [6] was the opinion of researchers in the exact sciences that radiologists will only be involved in patient studies with common techniques and contrast agents, but not in dedicated MR studies using novel diagnostic, therapeutic, and theragnostic compounds. The reason given is the radiologists’ lack of background in dedicated MR techniques and in the biochemical interactions of targeted compounds and tracers. Such examinations or interventions would become the domain of other disciplines, e.g., oncologists and neuroscientists, perhaps also specialists in nuclear medicine.

The scientists’ comments were supported by a cardiologist who stressed that, in his view, radiology is experiencing a major change, most likely a decline. He described how he and his colleagues have undergone training and are now completely independent of any radiology input. He believed that there is a similar trend in neurology and orthopedics.

This complicates validation of possible AI data collection even more.

Another issue is the question of whether radiologists really understand what is happening with and in their equipment and in the optimization of examinations. Basic T2- or T2*-weighted sequences, for instance, have been superseded by all sorts of concealed manipulations to improve speed – tricks that are hidden and of which users are largely unaware. Similarly with CT: the impact of energy, contrast agent volume, and timing means that most radiologists are completely dependent on built-in protocols. Thus, average radiologists or other imaging professionals are excluded from making any changes to patient studies.

It is interesting to see how radiology is viewed by some of the scientists developing its tools. This opinion will not find many friends in the radiological community, but the involvement of non-radiologists has been noticeable for some time. What is more, simple applications of artificial intelligence will shift routine image assessment away from radiologists.

Human laziness will rely on AI, and there is more to this laziness than commonly thought. The human brain delegates tasks it doesn’t consider relevant to the background. Relying on the “responsible” performance of hardware and software allows vigilance to fade easily.

A good example is the shift of our cognitive systems away from the task of supervising a fully autonomous device, e.g. away from monitoring the performance of a radiological AI system, toward less relevant tasks. It was shown in a group of drivers that cars with manual transmission were associated with better attention and fewer failures than cars with automatic transmission [7]. To my knowledge, nobody has yet looked into whether similar issues arise with AI, but it is important to be aware of them [8].



References

1. Rinck PA. Some reflections on artificial intelligence in medicine. Rinckside 2018; 29,5: 11-13. and: Rinck PA. Will artificial intelligence increase costs of medical imaging? Rinckside 2018; 29,6: 15-16.
2. Chang Z. Will AI improve tumor delineation accuracy for radiation therapy? Radiology 2019. Published online 26 March 2019. doi.org/10.1148/radiol.2019190385
3. Rinck PA. Chapter 15 – Image processing and visualization / Chapter 16 – Dynamic imaging. in: Rinck PA. Magnetic Resonance in Medicine. A Critical Introduction. 12th ed. BoD, Norderstedt, Germany. 2018. ISBN 978-3-7460-9518-9.
4. Kim DW, Jang HY, Kim KW, Shin Y, Park SH. Design characteristics of studies reporting the performance of artificial intelligence algorithms for diagnostic analysis of medical images: results from recently published papers. Korean J Radiol 2019; 20: 405-410.
5. Yogeshwar R. Die Daten sind das Programm. Press release of Deutsche Röntgengesellschaft. June 2019.
6. Rinck PA. At the crossroads: MR contrast agents. Rinckside 2019; 30,4: 9-11.
7. Cox DJ, Punja M, Powers K, Merkel RL, Burket R, Moore M, Thorndike F, Kovatchev B. Manual transmission enhances attention and driving performance of ADHD adolescent males: pilot study. J Atten Disord 2006; 10: 212-216.
8. Rinck PA. Total reliance on autopilot is a risk to life. Rinckside 2012; 23,8: 17-18.



Citation: Rinck PA. Artificial intelligence meets validity. Rinckside 2019; 30,5: 13-15.

A digest version of this column was published as:
Data validation promises to make or break AI.
Aunt Minnie Europe. Maverinck. 24 July 2019.

