New York University School of Medicine researchers described in Nature Medicine a machine learning-based program that can distinguish lung cancer subtypes and at least six driving mutations using only histological slide images, in less time than a pathologist or standard profiling techniques would need. The code, which is available online and which the authors think can be applied across cancers, could help clinicians diagnose cancer and make treatment decisions faster.

Using open-source code from Google, the deep-learning program was trained to distinguish between lung adenocarcinoma, lung squamous cell carcinoma and non-cancerous lung tissue by analyzing hundreds of pixels within images of patient samples from The Cancer Genome Atlas (TCGA).

On 340 additional patient images, the program then correctly distinguished between the two tumor types, with area under the curve (AUC) for true positives vs. false positives ranging from 0.861 to 0.977.
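For readers unfamiliar with the metric, AUC can be read as the probability that a classifier scores a randomly chosen positive case above a randomly chosen negative one, so 0.5 is chance and 1.0 is perfect separation. The Python sketch below illustrates that definition only; the function name `roc_auc` and the labels and scores are hypothetical and are not taken from the study or its published code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = one tumor type, 0 = the other.
# The scores are made-up classifier outputs, not data from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy case, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of examples; it is written for clarity rather than speed.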

The researchers’ machine learning technique also correctly classified more images in the TCGA data set than pathologists did. The program took an average of 20 seconds to classify a slide; a pathologist took at least one minute per slide.

The program was also trained to identify common, clinically relevant lung cancer mutations from the slide images in about 20 seconds, a process that otherwise requires immuno-staining or tumor sequencing that can take a week or more.

Six frequently mutated genes including serine/threonine kinase 11 (STK11; LKB1), K-Ras (KRAS), p53 and multiple mutations of EGFR could be reliably identified by the program, suggesting it could help identify mutations that guide treatment decisions. The paper’s authors suggested the algorithm could detect morphologic changes in tissue caused by certain mutations.

Study author Aristotelis Tsirigos told BioCentury the group plans to improve the algorithm’s accuracy by training it on more images, including images with more detailed annotations from pathologists. Tsirigos is an assistant professor of pathology and director of the applied bioinformatics laboratory at NYU.