Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images
Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images-2019
Abstract Background: The diagnosis of most cancers is made by a board-certified pathologist based on a tissue biopsy under the microscope. Recent research reveals a high discordance between individual pathologists. For melanoma, the literature reports 25-26% discordance for classifying a benign nevus versus malignant melanoma. A recent study indicated the potential of deep learning to lower these discordances. However, the performance of deep learning in classifying histopathologic melanoma images was never compared directly to human experts. The aim of this study is to perform such a first direct comparison. Methods: A total of 695 lesions were classified by an expert histopathologist in accordance with current guidelines (350 nevi/345 melanoma). Only the haematoxylin & eosin (H&E) slides of these lesions were digitalised via a slide scanner and then randomly cropped. A total of 595 of the resulting images were used to train a convolutional neural network (CNN). The additional 100 H&E image sections were used to test the results of the CNN in comparison to 11 histopathologists. Three combined McNemar tests comparing the results of the CNN's test runs in terms of sensitivity, specificity and accuracy were predefined to test for significance (p < 0.05). Findings: The CNN achieved a mean sensitivity/specificity/accuracy of 76%/60%/68% over 11 test runs. In comparison, the 11 pathologists achieved a mean sensitivity/specificity/accuracy of 51.8%/66.5%/59.2%. Thus, the CNN was significantly (p = 0.016) superior in classifying the cropped images. Interpretation: With limited image information available, a CNN was able to outperform 11 histopathologists in the classification of histopathological melanoma images and thus shows promise to assist human melanoma diagnoses.
KEYWORDS : Melanoma | Pathology | Histopathology | Deep learning | Artificial intelligence
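The sensitivity, specificity and accuracy figures reported above can be related to one another with a short calculation. The sketch below is illustrative only: it assumes the 100 test images were split evenly between melanoma and nevi, which the abstract does not state, and the function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by sensitivity and specificity for given class counts."""
    true_pos = sensitivity * n_pos   # correctly classified melanomas
    true_neg = specificity * n_neg   # correctly classified nevi
    return (true_pos + true_neg) / (n_pos + n_neg)


# Assumption (not stated in the abstract): 50 melanoma and 50 nevus test images.
cnn_acc = accuracy_from_rates(0.76, 0.60, 50, 50)
path_acc = accuracy_from_rates(0.518, 0.665, 50, 50)
print(f"CNN accuracy: {cnn_acc:.4f}, pathologist accuracy: {path_acc:.4f}")
```

Under that balanced-set assumption, the implied accuracies agree with the reported 68% and 59.2% to within rounding of the published rates.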
Machine learning to predict occult nodal metastasis in early oral squamous cell carcinoma
Machine learning to predict occult nodal metastasis in early oral squamous cell carcinoma-2019
Objectives: To develop and validate an algorithm to predict occult nodal metastasis in clinically node negative oral cavity squamous cell carcinoma (OCSCC) using machine learning. To compare algorithm performance to a model based on tumor depth of invasion (DOI). Materials and methods: Patients who underwent primary tumor extirpation and elective neck dissection from 2007 to 2013 for clinical T1-2N0 OCSCC were identified from the National Cancer Database (NCDB). Multiple machine learning algorithms were developed to predict pathologic nodal metastasis using clinicopathologic data from 782 patients. The algorithm was internally validated using test data from 654 patients in NCDB and was then externally validated using data from 71 patients treated at a single academic institution. Performance was measured using area under the receiver operating characteristic (ROC) curve (AUC). Machine learning and DOI model performance were compared using DeLong’s test for two correlated ROC curves. Results: The best classification performance was achieved with a decision forest algorithm (AUC=0.840). When applied to the single-institution data, the predictive performance of machine learning exceeded that of the DOI model (AUC=0.657, p=0.007). Compared to the DOI model, machine learning reduced the number of neck dissections recommended while simultaneously improving sensitivity and specificity. Conclusion: Machine learning improves prediction of pathologic nodal metastasis in patients with clinical T1-2N0 OCSCC compared to methods based on DOI. Improved predictive algorithms are needed to ensure that patients with occult nodal disease are adequately treated while avoiding the cost and morbidity of neck dissection in patients without pathologic nodal disease.
Keywords: Oral cancer | Squamous cell carcinoma | Machine learning | Artificial intelligence
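The AUC used as the performance measure above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, not the authors' implementation; the scores and labels are hypothetical toy values, not study data.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of nodal metastasis and true labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(scores, labels)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; rank-based formulations give the same value more efficiently on large datasets.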
Development of machine learning algorithms for prediction of mortality in spinal epidural abscess
Development of machine learning algorithms for prediction of mortality in spinal epidural abscess-2019
BACKGROUND CONTEXT: In-hospital and short-term mortality in patients with spinal epidural abscess (SEA) remains unacceptably high despite diagnostic and therapeutic advancements. Forecasting this potentially avoidable consequence at the time of admission could improve patient management and counseling. Few studies exist to meet this need, and none have explored methodologies such as machine learning. PURPOSE: The purpose of this study was to develop machine learning algorithms for prediction of in-hospital and 90-day postdischarge mortality in SEA. STUDY DESIGN/SETTING: Retrospective, case-control study at two academic medical centers and three community hospitals from 1993 to 2016. PATIENT SAMPLE: Adult patients with an inpatient admission for radiologically confirmed diagnosis of SEA. OUTCOME MEASURES: In-hospital and 90-day postdischarge mortality. METHODS: Five machine learning algorithms (elastic-net penalized logistic regression, random forest, stochastic gradient boosting, neural network, and support vector machine) were developed and assessed by discrimination, calibration, overall performance, and decision curve analysis. RESULTS: Overall, 1,053 SEA patients were identified in the study, with 134 (12.7%) experiencing in-hospital or 90-day postdischarge mortality. The stochastic gradient boosting model achieved the best performance across discrimination (c-statistic=0.89), calibration, and decision curve analysis. The variables used for prediction of 90-day mortality, ranked by importance, were age, albumin, platelet count, neutrophil to lymphocyte ratio, hemodialysis, active malignancy, and diabetes. The final algorithm was incorporated into a web application available here: https://sorg-apps.shinyapps.io/seamortality/. CONCLUSIONS: Machine learning algorithms show promise on internal validation for prediction of 90-day mortality in SEA. Future studies are needed to externally validate these algorithms in independent populations.
Keywords: Artificial intelligence | Healthcare | Machine learning | Mortality | Spinal epidural abscess | Spine surgery
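Decision curve analysis, one of the assessment criteria named above, is usually summarised by the net benefit of acting on a model at a chosen risk threshold. The abstract does not give a formula; the sketch below uses the standard definition (true positives minus false positives weighted by the threshold odds, per patient), and the probabilities and outcomes are hypothetical toy values.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    # False positives are discounted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical predicted mortality risks and observed outcomes (1 = died).
probs = [0.9, 0.7, 0.4, 0.2, 0.1]
outcomes = [1, 0, 1, 0, 0]
nb_low = net_benefit(probs, outcomes, 0.2)
nb_mid = net_benefit(probs, outcomes, 0.5)
print(f"net benefit at 0.2: {nb_low:.3f}, at 0.5: {nb_mid:.3f}")
```

A decision curve plots this quantity across a range of thresholds and compares it with the treat-all and treat-none strategies.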
The Application of Machine Learning to Quality Improvement Through the Lens of the Radiology Value Network
The Application of Machine Learning to Quality Improvement Through the Lens of the Radiology Value Network-2019
Recent advances in machine learning and artificial intelligence offer promising applications to radiology quality improvement initiatives as they relate to the radiology value network. Coordination within the interlocking web of systems, events, and stakeholders in the radiology value network may be mitigated through standardization, automation, and a focus on workflow efficiency. In this article the authors present applications of these various strategies via use cases for quality improvement projects at different points in the radiology value network. In addition, the authors discuss opportunities for machine-learning applications in data aggregation as opposed to traditional applications in data extraction.
Key Words: Machine learning | artificial intelligence | radiology quality improvement | radiology value network | data aggregation
Strengths, Weaknesses, Opportunities, and Threats Analysis of Artificial Intelligence and Machine Learning Applications in Radiology
Strengths, Weaknesses, Opportunities, and Threats Analysis of Artificial Intelligence and Machine Learning Applications in Radiology-2019
Currently, the use of artificial intelligence (AI) in radiology, particularly machine learning (ML), has become a reality in clinical practice. Since the end of the last century, several ML algorithms have been introduced for a wide range of common imaging tasks, not only for diagnostic purposes but also for image acquisition and postprocessing. AI is now recognized to be a driving initiative in every aspect of radiology. There is growing evidence of the advantages of AI in radiology creating seamless imaging workflows for radiologists or even replacing radiologists. Most of the current AI methods have some internal and external disadvantages that are impeding their ultimate implementation in the clinical arena. As such, AI can be considered a portion of a business trying to be introduced in the health care market. For this reason, this review analyzes the current status of AI, and specifically ML, applied to radiology from the scope of strengths, weaknesses, opportunities, and threats (SWOT) analysis.
Key Words: Artificial intelligence | deep learning | machine learning | opportunity | radiomics | strength | threat | weakness
Deep learning facilitates the diagnosis of adult asthma
Deep learning facilitates the diagnosis of adult asthma-2019
Background: We explored whether the use of deep learning to model combinations of symptom-physical signs and objective tests, such as lung function tests and the bronchial challenge test, would improve model performance in predicting the initial diagnosis of adult asthma when compared to the conventional machine learning diagnostic method. Methods: The data were obtained from the clinical records of a prospective study of 566 adult outpatients who visited Kindai University Hospital for the first time with complaints of non-specific respiratory symptoms. Asthma was comprehensively diagnosed by specialists based on symptom-physical signs and objective tests. Model performance metrics were compared to logistic analysis, support vector machine (SVM) learning, and the deep neural network (DNN) model. Results: For the diagnosis of adult asthma based on symptom-physical signs alone, the accuracy of the DNN model was 0.68, whereas that for the SVM was 0.60 and for the logistic analysis was 0.65. When adult asthma was diagnosed based on symptom-physical signs, biochemical findings, lung function tests, and the bronchial challenge test, the accuracy of the DNN model increased to 0.98 and was significantly higher than the 0.82 accuracy of the SVM and the 0.94 accuracy of the logistic analysis. Conclusions: DNN is able to better facilitate diagnosing adult asthma, compared with classical machine learning methods, such as logistic analysis and SVM. The deep learning models based on symptom-physical signs and objective tests appear to improve the performance for diagnosing adult asthma.
Keywords: Artificial intelligence | Asthma | Deep learning | Diagnosis | Support vector machine
Machine Learning Models can Detect Aneurysm Rupture and Identify Clinical Features Associated with Rupture
Machine Learning Models can Detect Aneurysm Rupture and Identify Clinical Features Associated with Rupture-2019
BACKGROUND: Machine learning (ML) has been increasingly used in medicine and neurosurgery. We sought to determine whether ML models can distinguish ruptured from unruptured aneurysms and identify features associated with rupture. METHODS: We performed a retrospective review of patients with intracranial aneurysms detected on vascular imaging at our institution between 2002 and 2018. The dataset was used to train 3 ML models (random forest, linear support vector machine [SVM], and radial basis function kernel SVM). Relative contributions of individual predictors were derived from the linear SVM model. RESULTS: Complete data were available for 845 aneurysms in 615 patients. Ruptured aneurysms (n = 309, 37%) were larger (mean 6.51 mm vs. 5.73 mm; P = 0.02) and more likely to be in the posterior circulation (20% vs. 11%; P < 0.001) than unruptured aneurysms. Area under the receiver operating curve was 0.77 for the linear SVM, 0.78 for the radial basis function kernel SVM models, and 0.81 for the random forest model. Aneurysm location and size were the 2 features that contributed most significantly to the model. Posterior communicating artery, anterior communicating artery, and posterior inferior cerebellar artery locations were most highly associated with rupture, whereas paraclinoid and middle cerebral artery locations had the strongest association with unruptured status. CONCLUSIONS: ML models are capable of accurately distinguishing ruptured from unruptured aneurysms and identifying features associated with rupture. Consistent with prior studies, location and size show the strongest association with aneurysm rupture.
Key words : Aneurysm | Aneurysm rupture | Artificial intelligence | Machine learning | Subarachnoid hemorrhage
Artificial Intelligence in Medical Education: Best Practices Using Machine Learning to Assess Surgical Expertise in Virtual Reality Simulation
Artificial Intelligence in Medical Education: Best Practices Using Machine Learning to Assess Surgical Expertise in Virtual Reality Simulation-2019
OBJECTIVE: Virtual reality simulators track all movements and forces of simulated instruments, generating enormous datasets which can be further analyzed with machine learning algorithms. These advancements may increase the understanding, assessment and training of psychomotor performance. Consequently, the application of machine learning techniques to evaluate performance on virtual reality simulators has led to an increase in the volume and complexity of publications which bridge the fields of computer science, medicine, and education. Although all disciplines stand to gain from research in this field, important differences in reporting exist, limiting interdisciplinary communication and knowledge transfer. Thus, our objective was to develop a checklist to provide a general framework when reporting or analyzing studies involving virtual reality surgical simulation and machine learning algorithms. By including a total score as well as clear subsections of the checklist, authors and reviewers can both easily assess the overall quality and specific deficiencies of a manuscript. DESIGN: The Machine Learning to Assess Surgical Expertise (MLASE) checklist was developed to help computer science, medicine, and education researchers ensure quality when producing and reviewing virtual reality manuscripts involving machine learning to assess surgical expertise. SETTING: This study was carried out at the McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre. PARTICIPANTS: The authors applied the checklist to 12 articles using machine learning to assess surgical expertise in virtual reality simulation, obtained through a systematic literature review. RESULTS: Important differences in reporting were found between medical and computer science journals. The medical journals proved stronger in discussion quality and weaker in areas related to study design. The opposite trends were observed in computer science journals. 
CONCLUSIONS: This checklist will aid in narrowing the knowledge divide between computer science, medicine, and education, helping facilitate the burgeoning field of machine learning assisted surgical education. (J Surg Ed 000:110. © 2019 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.)
KEY WORDS: simulation | surgery | education | artificial intelligence | assessment | machine learning
Deep learning for waveform identification of resting needle electromyography signals
Deep learning for waveform identification of resting needle electromyography signals-2019
Objective: Given the recent advances in machine learning and artificial intelligence for medical data analysis, we hypothesized that a deep learning algorithm can classify resting needle electromyography (n-EMG) discharges. Methods: Six clinically observed resting n-EMG signals were used as a dataset. The data were converted to Mel-spectrograms. Data augmentation was then applied to the training data. Deep learning algorithms were applied to assess the accuracies of correct classification, with or without the use of pre-trained weights for deep-learning networks. Results: While the original data yielded an accuracy of up to 0.86 on the test dataset, data augmentation up to 200,000 training images showed a significant increase in the accuracy to 1.0. The use of pre-trained weights (fine tuning) showed greater accuracy than "training from scratch". Conclusions: Resting n-EMG signals were successfully classified by a deep-learning algorithm, especially with the use of data augmentation and transfer learning techniques. Significance: Computer-aided signal identification of clinical n-EMG testing might be possible with deep-learning algorithms.
Keywords: Needle electromyography | Deep learning | Artificial neural network | Data augmentation | Resting discharge
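The Mel-spectrogram conversion mentioned above rests on mapping frequency in hertz onto the perceptually motivated mel scale. The abstract does not say which mel formula or frequency range was used; the sketch below assumes one common convention (2595 log10(1 + f/700)) and a hypothetical 0 to 8000 Hz range purely for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (one common convention; an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 frequencies (Hz) equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Hypothetical range and band count; not taken from the study.
edges = mel_band_edges(0.0, 8000.0, 4)
print([round(e, 1) for e in edges])
```

The band edges grow further apart at higher frequencies, which is why a Mel-spectrogram resolves low-frequency detail more finely than a linear spectrogram with the same number of bins.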
Item response theory in AI: Analysing machine learning classifiers at the instance level
Item response theory in AI: Analysing machine learning classifiers at the instance level-2019
AI systems are usually evaluated on a range of problem instances and compared to other AI systems that use different strategies. These instances are rarely independent. Machine learning, and supervised learning in particular, is a very good example of this. Given a machine learning model, its behaviour for a single instance cannot be understood in isolation but rather in relation to the rest of the data distribution or dataset. In a dual way, the results of one machine learning model for an instance can be analysed in comparison to other models. While this analysis is relative to a population or distribution of models, it can give much more insight than an isolated analysis. Item response theory (IRT) combines this duality between items and respondents to extract latent variables of the items (such as discrimination or difficulty) and the respondents (such as ability). IRT can be adapted to the analysis of machine learning experiments (and by extension to any other artificial intelligence experiments). In this paper, we see that IRT suits classification tasks perfectly, where instances correspond to items and classifiers correspond to respondents. We perform a series of experiments with a range of datasets and classification methods to fully understand what the IRT parameters such as discrimination, difficulty and guessing mean for classification instances (and their relation to instance hardness measures) and how the estimated classifier ability can be used to compare classifier performance in a different way through classifier characteristic curves.
Keywords: Artificial intelligence evaluation | Item response theory | Machine learning | Instance hardness | Classifier metrics
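The three item parameters named above (discrimination, difficulty and guessing) match those of the standard three-parameter logistic (3PL) IRT model. The abstract does not state the formula, so the sketch below assumes the usual 3PL form, with an instance playing the role of the item and a classifier that of the respondent; the parameter values are hypothetical.

```python
import math


def p_correct(ability, discrimination, difficulty, guessing):
    """3PL probability that a respondent of given ability answers the item correctly."""
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
    # The guessing parameter sets a floor on the success probability.
    return guessing + (1.0 - guessing) * logistic


# Hypothetical instance: moderately discriminating, average difficulty, 20% guessing floor.
for ability in (-2.0, 0.0, 2.0):
    prob = p_correct(ability, discrimination=1.5, difficulty=0.0, guessing=0.2)
    print(f"classifier ability {ability:+.1f}: P(correct) = {prob:.3f}")
```

When ability equals difficulty, the probability sits halfway between the guessing floor and 1, and a larger discrimination makes the curve steeper around that point.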