What Machine Learning Can and Cannot Do

Google has launched a new multimodal medical AI that could revolutionize medical imaging.

The tool uses an approach called ELIXR, short for Embeddings for Language/Image-aligned X-Rays.

It is lightweight and multimodal, meaning it can process both images and text. This makes it well-suited for tasks such as disease classification, semantic search, and radiology report verification.
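To make the disease-classification use case concrete, here is a minimal sketch assuming a CLIP-style shared embedding space for images and label prompts. The function, the label list, and the 512-dimensional embeddings are hypothetical stand-ins, not ELIXR's published API:

```python
import torch
import torch.nn.functional as F

def zero_shot_classify(image_emb: torch.Tensor, label_embs: torch.Tensor,
                       labels: list[str]) -> str:
    """Pick the label whose text embedding is closest to the image embedding."""
    sims = F.normalize(label_embs, dim=-1) @ F.normalize(image_emb, dim=-1)
    return labels[int(sims.argmax())]

labels = ["no finding", "cardiomegaly", "pleural effusion"]
# Random tensors stand in for encoder outputs: one image, one embedding per label.
prediction = zero_shot_classify(torch.randn(512), torch.randn(3, 512), labels)
```

Because the labels are just text prompts, new classes can be added without retraining the image encoder.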

Notably, ELIXR is trained on a large dataset of medical images paired with the corresponding free-text radiology reports. This allows the models to learn subtle nuances of medical images that would be difficult to capture using conventional binary labels.
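One common way to learn such an alignment from paired images and reports is a symmetric contrastive (InfoNCE) objective, sketched below. This is an illustrative recipe, not necessarily ELIXR's exact training procedure:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of (image, report) pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))            # matched pairs on the diagonal
    loss_i2t = F.cross_entropy(logits, targets)       # image -> report direction
    loss_t2i = F.cross_entropy(logits.t(), targets)   # report -> image direction
    return (loss_i2t + loss_t2i) / 2

# Toy usage: random tensors stand in for encoder outputs.
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```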

In addition to standard disease classification, ELIXR can also perform a variety of other tasks. For example, it can be used to search for specific features within a chest X-ray (CXR) image, respond to natural language queries, and even verify the accuracy of radiology reports.
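Semantic search falls out of the same shared embedding space: embed the natural language query, then rank stored image embeddings by cosine similarity. A minimal sketch, with random tensors as placeholders for real encoder outputs:

```python
import torch
import torch.nn.functional as F

def rank_images(query_emb: torch.Tensor, image_embs: torch.Tensor) -> torch.Tensor:
    """Return indices of stored images, most similar to the query first."""
    sims = F.normalize(image_embs, dim=-1) @ F.normalize(query_emb, dim=-1)
    return sims.argsort(descending=True)

# Toy usage: 100 stored CXR embeddings, one query such as "pleural effusion".
top5 = rank_images(torch.randn(512), torch.randn(100, 512))[:5]
```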

The modular design of ELIXR makes it adaptable for a variety of applications. Different vision encoders and base language models can be swapped out as needed, allowing the models to be fine-tuned for specific tasks.
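That modularity can be pictured as a thin glue layer behind small interfaces, so either component can be swapped independently. The Protocol classes below are illustrative, not ELIXR's actual components:

```python
from typing import Protocol
import torch

class VisionEncoder(Protocol):
    def encode(self, image: torch.Tensor) -> torch.Tensor: ...

class LanguageModel(Protocol):
    def generate(self, image_tokens: torch.Tensor, prompt: str) -> str: ...

class VisionLanguagePipeline:
    """Glue layer: any encoder/LLM pair satisfying the protocols plugs in."""
    def __init__(self, encoder: VisionEncoder, llm: LanguageModel):
        self.encoder = encoder
        self.llm = llm

    def answer(self, image: torch.Tensor, question: str) -> str:
        tokens = self.encoder.encode(image)         # image -> embedding tokens
        return self.llm.generate(tokens, question)  # condition the LLM on them
```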

However, making the most of AI in medicine requires combining the strengths of expert systems built on predictive AI with the flexibility that generative AI makes possible.

How best to combine them is not yet clear, and answering that question will require ongoing research and collaboration among healthcare providers, medical institutions, and government entities.
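Purely as an illustration of one candidate pattern (not a recommendation, and not from the paper): a predictive classifier could emit structured findings that a generative model turns into a draft report for clinician review. Both callables here are hypothetical placeholders:

```python
from typing import Callable

def draft_report(image_path: str,
                 classify: Callable[[str], dict[str, float]],
                 generate: Callable[[str], str]) -> str:
    """Route a predictive model's structured findings into a generative draft."""
    findings = classify(image_path)  # hypothetical, e.g. {"effusion": 0.91}
    kept = {label: p for label, p in findings.items() if p >= 0.5}
    prompt = (
        "Draft a chest X-ray report given these classifier findings "
        f"(label: probability): {kept}. Mark findings below 0.7 as uncertain."
    )
    return generate(prompt)  # output is a draft; a clinician must review it
```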

Is this a sign of a new breakthrough in AI on the path toward AGI (Artificial General Intelligence)?
For more information, see the paper: ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders (arXiv:2308.01317).