Deep learning
Deep Learning-Based Super-Resolution Reconstruction on Undersampled Brain Diffusion-Weighted MRI for Infarction Stroke: A Comparison to Conventional Iterative Reconstruction
AJNR Am J Neuroradiol. 2025 Jan 8;46(1):41-48. doi: 10.3174/ajnr.A8482.
ABSTRACT
BACKGROUND AND PURPOSE: DWI is crucial for detecting infarction stroke. However, its spatial resolution is often limited, hindering accurate lesion visualization. Our aim was to evaluate the image quality and diagnostic confidence of deep learning (DL)-based super-resolution reconstruction for brain DWI of infarction stroke.
MATERIALS AND METHODS: This retrospective study enrolled 114 consecutive participants who underwent brain DWI. The DWI images were reconstructed with 2 schemes: 1) DL-based super-resolution reconstruction (DWIDL); and 2) conventional compressed sensing reconstruction (DWICS). Qualitative image analysis included overall image quality, lesion conspicuity, and diagnostic confidence in infarction stroke of different lesion sizes. Quantitative image quality assessments were performed by measurements of SNR, contrast-to-noise ratio (CNR), ADC, and edge rise distance. Group comparisons were conducted by using a paired t test for normally distributed data and the Wilcoxon test for non-normally distributed data. The overall agreement between readers for qualitative ratings was assessed by using the Cohen κ coefficient. A P value less than .05 was considered statistically significant.
RESULTS: A total of 114 DWI examinations constituted the study cohort. In the qualitative assessment, overall image quality, lesion conspicuity, and diagnostic confidence in infarction stroke lesions (lesion size <1.5 cm) were improved with DWIDL compared with DWICS (all P < .001). In the quantitative analysis, the edge rise distance of DWIDL was reduced compared with that of DWICS (P < .001), with no significant differences in SNR, CNR, or ADC values (all P > .05).
CONCLUSIONS: Compared with the conventional compressed sensing reconstruction, the DL-based super-resolution reconstruction demonstrated superior image quality and was feasible for achieving higher diagnostic confidence in infarction stroke.
PMID:39779291 | DOI:10.3174/ajnr.A8482
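The statistical workflow described above maps onto standard SciPy/scikit-learn calls; the sketch below uses placeholder data (not the study's measurements) to show the normality-based choice between the paired t-test and the Wilcoxon signed-rank test, plus Cohen's κ for reader agreement:

```python
import numpy as np
from scipy import stats
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
snr_dl = rng.normal(24.0, 3.0, size=114)           # placeholder SNR values, DWI-DL
snr_cs = snr_dl - rng.normal(0.1, 1.5, size=114)   # placeholder SNR values, DWI-CS

def paired_comparison(a, b, alpha=0.05):
    """Pick paired t-test vs. Wilcoxon based on normality of the differences."""
    diff = a - b
    _, p_norm = stats.shapiro(diff)
    if p_norm > alpha:                    # differences look normal
        stat, p = stats.ttest_rel(a, b)
        return "paired t-test", stat, p
    stat, p = stats.wilcoxon(a, b)        # non-normal: signed-rank test
    return "Wilcoxon", stat, p

test, stat, p = paired_comparison(snr_dl, snr_cs)
print(f"{test}: statistic={stat:.3f}, P={p:.4f}")

reader1 = rng.integers(3, 6, size=114)   # placeholder 5-point quality ratings
reader2 = np.clip(reader1 + rng.integers(-1, 2, size=114), 1, 5)
print("Cohen kappa:", cohen_kappa_score(reader1, reader2))
```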
Predicting Parkinson's Disease Using a Deep-Learning Algorithm to Analyze Prodromal Medical and Prescription Data
J Clin Neurol. 2025 Jan;21(1):21-30. doi: 10.3988/jcn.2024.0175.
ABSTRACT
BACKGROUND AND PURPOSE: Parkinson's disease (PD) is characterized by various prodromal symptoms, and these symptoms are mostly investigated retrospectively. While some symptoms such as rapid eye movement sleep behavior disorder are highly specific, others are common. This makes it challenging to predict those at risk of PD based solely on less-specific prodromal symptoms. The prediction accuracy when using only less-specific symptoms can be improved by analyzing the vast amount of information available using sophisticated deep-learning techniques. This study aimed to improve the performance of deep-learning-based screening in detecting prodromal PD using medical-claims data, including prescription information.
METHODS: We sampled 820 PD patients and 8,200 age- and sex-matched non-PD controls from Korean National Health Insurance cohort data. A deep-learning algorithm was developed using various combinations of diagnostic codes, medication codes, and prodromal periods.
RESULTS: During the prodromal period from year -3 to year 0, predicting PD using only diagnostic codes yielded a high accuracy of 0.937. Adding medication codes for the same period did not increase the accuracy (0.931-0.935). For the earlier prodromal period (year -6 to year -3), the accuracy of PD prediction decreased to 0.890 when using only diagnostic codes. The inclusion of all medication-code data markedly increased that accuracy to 0.922.
CONCLUSIONS: A deep-learning algorithm using both prodromal diagnostic and medication codes was effective in screening PD. Developing a surveillance system with automatically collected medical-claims data for those at risk of developing PD could be cost-effective. This approach could streamline the process of developing disease-modifying drugs by focusing on the most-appropriate candidates for inclusion in accurate diagnostic tests.
PMID:39778564 | DOI:10.3988/jcn.2024.0175
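The abstract does not describe the network itself, so the following is only a minimal sketch of one common way to model claims sequences: embed diagnostic/medication codes, pool them with a recurrent encoder, and emit a PD-vs-control logit. The vocabulary size, dimensions, and padding index are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ClaimsCodeClassifier(nn.Module):
    def __init__(self, n_codes=5000, emb_dim=64, hidden=128, pad_idx=0):
        super().__init__()
        self.emb = nn.Embedding(n_codes, emb_dim, padding_idx=pad_idx)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, codes):            # codes: (batch, seq_len) int64
        x = self.emb(codes)              # embed each claims code
        _, h = self.gru(x)               # h: (1, batch, hidden) summary state
        return self.head(h.squeeze(0))   # PD-vs-control logits: (batch, 1)

model = ClaimsCodeClassifier()
batch = torch.randint(1, 5000, (8, 120))   # 8 patients, 120 codes each (toy data)
labels = torch.rand(8, 1).round()          # placeholder PD / non-PD labels
loss = nn.BCEWithLogitsLoss()(model(batch), labels)
loss.backward()
```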
Identity Model Transformation for boosting performance and efficiency in object detection network
Neural Netw. 2024 Dec 31;184:107098. doi: 10.1016/j.neunet.2024.107098. Online ahead of print.
ABSTRACT
Modifying the structure of an existing network is a common way to further improve its performance. However, modifying layers in a network often causes a mismatch with the pre-trained weights, and the subsequent fine-tuning is time-consuming and resource-inefficient. To address this issue, we propose a novel technique called Identity Model Transformation (IMT), which keeps the outputs before and after the transformation equal through rigorous algebraic transformations. This approach ensures that the original model's performance is preserved when layers are modified. Additionally, IMT significantly reduces the total training time required to achieve optimal results while further enhancing network performance. IMT builds a bridge for rapid transformation between model architectures, enabling a model to quickly perform analytic continuation and derive a family of tree-like models with better performance. This model family possesses greater potential for optimization improvements than a single model. Extensive experiments across various object detection tasks validated the effectiveness and efficiency of the proposed IMT solution, which saved 94.76% of the time spent fine-tuning the base YOLOv4-Rot model on the DOTA-1.5 dataset; using IMT, we observed stable performance improvements of 9.89%, 6.94%, 2.36%, and 4.86% on the AI-TOD, DOTA-1.5, COCO 2017, and MRSAText datasets, respectively.
PMID:39778291 | DOI:10.1016/j.neunet.2024.107098
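The abstract does not spell out IMT's algebraic transformations; as an illustration of the general idea of a function-preserving structural edit (in the spirit of Net2Net-style identity initialization, not IMT itself), the sketch below inserts a new layer initialized to the identity, so the outputs are unchanged while trainable capacity is added:

```python
import torch
import torch.nn as nn

def identity_conv(channels):
    """A 1x1 conv initialized so that each channel maps exactly to itself."""
    conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
    with torch.no_grad():
        conv.weight.zero_()
        for c in range(channels):
            conv.weight[c, c, 0, 0] = 1.0
    return conv

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
extended = nn.Sequential(backbone, identity_conv(16))  # deeper, same function

x = torch.randn(2, 3, 32, 32)
# Outputs match exactly, so no performance is lost at the moment of the edit.
assert torch.allclose(backbone(x), extended(x), atol=1e-6)
```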
Skin image analysis for detection and quantitative assessment of dermatitis, vitiligo and alopecia areata lesions: a systematic literature review
BMC Med Inform Decis Mak. 2025 Jan 8;25(1):10. doi: 10.1186/s12911-024-02843-2.
ABSTRACT
Vitiligo, alopecia areata, atopic dermatitis, and stasis dermatitis are common skin conditions that pose diagnostic and assessment challenges. Skin image analysis is a promising noninvasive approach for objective, automated detection and quantitative assessment of skin diseases. This review presents a systematic literature search of computer vision techniques applied to these benign skin conditions, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. It examines the deep learning architectures and image processing algorithms employed for segmentation, feature extraction, and classification in disease detection. It also focuses on practical applications, emphasizing quantitative disease assessment and the performance of various computer vision approaches for each condition, while highlighting their strengths and limitations. Finally, the review notes the need for disease-specific datasets with curated annotations and suggests future directions toward unsupervised or self-supervised approaches. The findings underscore the importance of developing accurate, automated tools for disease severity score calculation to improve ML-based monitoring and diagnosis in dermatology. TRIAL REGISTRATION: Not applicable.
PMID:39780145 | DOI:10.1186/s12911-024-02843-2
Feasibility of occlusal plane in predicting the changes in anteroposterior mandibular position: a comprehensive analysis using deep learning-based three-dimensional models
BMC Oral Health. 2025 Jan 8;25(1):42. doi: 10.1186/s12903-024-05345-9.
ABSTRACT
BACKGROUND: A comprehensive analysis of the occlusal plane (OP) inclination in predicting anteroposterior mandibular position (APMP) changes is still lacking. This study aimed to analyse the relationships between inclinations of different OPs and APMP metrics and explore the feasibility of OP inclination in predicting changes in APMP.
METHODS: Overall, 115 three-dimensional (3D) models were reconstructed using deep learning-based cone-beam computed tomography (CBCT) segmentation, and their accuracy in supporting cusps was compared with that of intraoral scanning models. The anatomical landmarks of seven OPs and three APMP metrics were identified, and their values were measured on the sagittal reference plane. The receiver operating characteristic curves of inclinations of seven OPs in distinguishing different anteroposterior skeletal patterns and correlations between inclinations of these OPs and APMP metrics were calculated and compared. For the OP inclination with the highest area under the curve (AUC) values and correlation coefficients, the regression models between this OP inclination and APMP metrics were further calculated.
RESULTS: The deviations in supporting cusps between the deep learning-based and intraoral scanning models were < 0.300 mm. The improved functional OP (IFOP) inclination could distinguish the anteroposterior skeletal classifications (AUC: Class I vs Class II = 0.693, Class I vs Class III = 0.763, Class II vs Class III = 0.899; all P < .01), and its AUC value for distinguishing skeletal Classes II and III was statistically higher than that of the other OP inclinations (all P < .01). Moreover, the IFOP inclination was statistically correlated with the APMP metrics (r = -0.557 for APDI, 0.543 for ANB, and 0.731 for AF-BF; all P < .001) and had the highest correlation coefficients among all OP inclinations (all P < .05). The regression models between IFOP inclination (x) and the APMP metrics were APDI = -0.917x + 91.144, ANB = 0.395x + 0.292, and AF-BF = 0.738x - 2.331.
CONCLUSIONS: Constructing the OP using deep learning-based 3D models from CBCT data is feasible. IFOP inclination could be used in predicting the APMP changes. A steeper IFOP inclination corresponded to a more retrognathic mandibular posture.
PMID:39780117 | DOI:10.1186/s12903-024-05345-9
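Since the abstract reports explicit linear models, a tiny worked example aids interpretation; the inclination value below is purely illustrative, and degrees are assumed as the unit:

```python
# Applying the reported regression models for an illustrative IFOP
# inclination of x = 10 (degrees assumed):
x = 10.0
apdi = -0.917 * x + 91.144    # predicted APDI  -> 81.97
anb = 0.395 * x + 0.292       # predicted ANB   -> 4.24
af_bf = 0.738 * x - 2.331     # predicted AF-BF -> 5.05
print(apdi, anb, af_bf)
```

Consistent with the conclusion, a steeper inclination (larger x) lowers the predicted APDI and raises ANB and AF-BF, i.e., a more retrognathic mandibular posture.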
Hybrid natural language processing tool for semantic annotation of medical texts in Spanish
BMC Bioinformatics. 2025 Jan 8;26(1):7. doi: 10.1186/s12859-024-05949-6.
ABSTRACT
BACKGROUND: Natural language processing (NLP) enables the extraction of information embedded within unstructured texts, such as clinical case reports and trial eligibility criteria. By identifying relevant medical concepts, NLP facilitates the generation of structured and actionable data, supporting complex tasks like cohort identification and the analysis of clinical records. To accomplish those tasks, we introduce a deep learning-based and lexicon-based named entity recognition (NER) tool for texts in Spanish. It performs medical NER and normalization, medication information extraction and detection of temporal entities, negation and speculation, and temporality or experiencer attributes (Age, Contraindicated, Negated, Speculated, Hypothetical, Future, Family_member, Patient and Other). We built the tool with a dedicated lexicon and rules adapted from NegEx and HeidelTime. Using these resources, we annotated a corpus of 1200 texts, with high inter-annotator agreement (average F1 = 0.841 ± 0.045 for entities, and average F1 = 0.881 ± 0.032 for attributes). We used this corpus to train Transformer-based models (RoBERTa-based models, mBERT and mDeBERTa). We integrated them with the dictionary-based system in a hybrid tool and distributed the models via the Hugging Face hub. For internal validation, we used a held-out test set and conducted an error analysis. For external validation, eight medical professionals evaluated the system by revising the annotations of 200 new texts not used in development.
RESULTS: In the internal validation, the models yielded F1 values up to 0.915. In the external validation with 100 clinical trials, the tool achieved an average F1 score of 0.858 (± 0.032); and in 100 anonymized clinical cases, it achieved an average F1 score of 0.910 (± 0.019).
CONCLUSIONS: The tool is available at https://claramed.csic.es/medspaner. We also release the code (https://github.com/lcampillos/medspaner) and the annotated corpus used to train the models.
PMID:39780059 | DOI:10.1186/s12859-024-05949-6
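Since the models are distributed via the Hugging Face hub, applying one of them would presumably follow the standard token-classification pipeline; the model identifier below is a placeholder, as the abstract does not name the repositories:

```python
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="lcampillos/EXAMPLE-medspaner-model",  # placeholder id, not verified
    aggregation_strategy="simple",               # merge word pieces into spans
)

text = "Paciente de 62 años con diabetes tipo 2, en tratamiento con metformina."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```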
Effective BCDNet-based breast cancer classification model using hybrid deep learning with VGG16-based optimal feature extraction
BMC Med Imaging. 2025 Jan 8;25(1):12. doi: 10.1186/s12880-024-01538-4.
ABSTRACT
PROBLEM: Breast cancer is a leading cause of death among women, and early detection is crucial for improving survival rates. Manual breast cancer diagnosis is time-consuming and subjective. Moreover, previous CAD models have mostly depended on handcrafted visual features that are difficult to generalize across ultrasound images acquired with different techniques. Earlier works have used other imaging tools, such as mammography and MRI; however, these are costlier and less portable than ultrasound imaging, which is a noninvasive method commonly used for breast cancer screening. Hence, this paper presents a novel deep learning model, BCDNet, for classifying breast tumors as benign or malignant using ultrasound images.
AIM: The primary aim of the study is to design an effective breast cancer diagnosis model that can accurately classify tumors in their early stages, thus reducing mortality rates. The model aims to optimize the weight and parameters using the RPAOSM-ESO algorithm to enhance accuracy and minimize false negative rates.
METHODS: The BCDNet model utilizes transfer learning from a pre-trained VGG16 network for feature extraction and employs an AHDNAM classification approach, which includes ASPP, DTCN, 1DCNN, and an attention mechanism. The RPAOSM-ESO algorithm is used to fine-tune the weights and parameters.
RESULTS: The RPAOSM-ESO-BCDNet-based breast cancer diagnosis model achieved an accuracy of 94.5%. This value is higher than those of previous models such as DTCN (88.2%), 1DCNN (89.6%), MobileNet (91.3%), and ASPP-DTC-1DCNN-AM (93.8%). These results indicate that the designed RPAOSM-ESO-BCDNet produces more accurate classifications than the previous models.
CONCLUSION: The BCDNet model, with its sophisticated feature extraction and classification techniques optimized by the RPAOSM-ESO algorithm, shows promise in accurately classifying breast tumors using ultrasound images. The study suggests that the model could be a valuable tool in the early detection of breast cancer, potentially saving lives and reducing the burden on healthcare systems.
PMID:39780045 | DOI:10.1186/s12880-024-01538-4
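The VGG16 transfer-learning step is standard; here is a minimal torchvision sketch in which the pretrained convolutional base is frozen and a plain linear head stands in for the paper's AHDNAM classifier (the ASPP, DTCN, 1DCNN, and attention components are not reproduced):

```python
import torch
import torch.nn as nn
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
features = vgg.features.eval()
for p in features.parameters():
    p.requires_grad = False                 # freeze the pretrained base

# Stand-in classifier head: benign vs. malignant (not the paper's AHDNAM).
head = nn.Sequential(nn.Flatten(), nn.Linear(512 * 7 * 7, 2))

x = torch.randn(4, 3, 224, 224)             # batch of ultrasound images (toy)
with torch.no_grad():
    feats = features(x)                     # (4, 512, 7, 7) VGG16 features
logits = head(feats)                        # (4, 2) class logits
```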
A practical approach to the spatial-domain calculation of nonprewhitening model observers in computed tomography
Med Phys. 2025 Jan 8. doi: 10.1002/mp.17599. Online ahead of print.
ABSTRACT
BACKGROUND: Modern reconstruction algorithms for computed tomography (CT) can exhibit nonlinear properties, including non-stationarity of noise and contrast dependence of both noise and spatial resolution. Model observers have been recommended as a tool for the task-based assessment of image quality (Samei E et al., Med Phys. 2019; 46(11): e735-e756), but the common Fourier domain approach to their calculation assumes quasi-stationarity.
PURPOSE: A practical spatial-domain approach is proposed for the calculation of the nonprewhitening (NPW) family of model observers in CT, avoiding the disadvantages of the Fourier domain. The methodology avoids explicit estimation of a noise covariance matrix. A formula is also provided for the uncertainty on estimates of the detectability index, for a given number of slices and repeat scans. The purpose of this work is to demonstrate the method and provide comparisons to the conventional Fourier approach for both iterative reconstruction (IR) and a deep learning-based reconstruction (DLR) algorithm.
MATERIALS AND METHODS: Acquisitions were made on a Revolution CT scanner (GE Healthcare, Waukesha, Wisconsin, USA) and reconstructed using the vendor's IR and DLR algorithms (ASiR-V and TrueFidelity). Several reconstruction kernels were investigated (Standard, Lung, and Bone for IR and Standard for DLR). An in-house developed phantom with two flat contrast levels (2 and 8 mgI/mL) and varying feature size (1-10 mm diameter) was used. Two single-energy protocols (80 and 120 kV) were investigated at two dose levels (CTDIvol = 5 and 13 mGy). The spatial-domain calculations relied on repeated scanning, region-of-interest placement, and simple operations with image matrices. No more repeat scans were utilized than required for the Fourier domain estimations. Fourier domain calculations were made using techniques described in a previous publication (Thor D et al., Med Phys. 2023;50(5):2775-2786). Differences between the calculations in the two domains were assessed using the normalized root-mean-square discrepancy (NRMSD).
RESULTS: Fourier domain calculations agreed closely with those in the spatial domain for all zero-strength IR reconstructions, which most closely resemble traditional filtered backprojection. The Fourier-based calculations, however, displayed higher detectability compared to those in the spatial domain for IR with strong iterative strength and for the DLR algorithm. The NRMSD remained within 10% for the NPW model observer without eye filter, but reached larger values when an eye filter was included. The formula for the uncertainty on the detectability index was validated by bootstrap estimates.
CONCLUSION: A practical methodology was demonstrated for calculating NPW observers in the spatial domain. In addition to being a valuable tool for verifying the applicability of typical Fourier-based methodologies, it lends itself to routine calculations for features embedded in a phantom. Higher estimates of detectability were observed when adopting the Fourier domain methodology for IR and for a DLR algorithm, demonstrating that the Fourier domain can suggest a greater benefit from noise suppression than spatial-domain calculations do. This is consistent with the findings of previous authors for the Fourier domain, who compared against human and other model observers, but not, as in this study, against the NPW model observer calculated in the spatial domain.
PMID:39780034 | DOI:10.1002/mp.17599
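The spatial-domain recipe lends itself to a compact implementation. The sketch below uses random arrays in place of the repeated-scan ROIs and follows the textbook NPW definition (no eye filter), not the paper's exact code: the template is the mean difference between signal-present and signal-absent ROIs, the test statistic is each ROI's inner product with that template, and d' comes from the two score distributions.

```python
import numpy as np

rng = np.random.default_rng(1)
signal_rois = rng.normal(1.0, 1.0, size=(50, 32, 32))  # repeated scans, signal
noise_rois = rng.normal(0.0, 1.0, size=(50, 32, 32))   # repeated scans, no signal

# NPW template: mean signal-present image minus mean signal-absent image.
template = signal_rois.mean(axis=0) - noise_rois.mean(axis=0)

# Test statistic: inner product of each ROI with the template.
lam_s = np.tensordot(signal_rois, template, axes=([1, 2], [0, 1]))
lam_n = np.tensordot(noise_rois, template, axes=([1, 2], [0, 1]))

# Detectability index from the separation of the two score distributions.
d_prime = (lam_s.mean() - lam_n.mean()) / np.sqrt(0.5 * (lam_s.var() + lam_n.var()))
print(f"NPW detectability index d' = {d_prime:.2f}")
```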
Accurate predictions on small data with a tabular foundation model
Nature. 2025 Jan;637(8045):319-326. doi: 10.1038/s41586-024-08328-6. Epub 2025 Jan 8.
ABSTRACT
Tabular data, spreadsheets organized in rows and columns, are ubiquitous across scientific fields, from biomedicine to particle physics to economics and climate science [1,2]. The fundamental prediction task of filling in missing values of a label column based on the rest of the columns is essential for various applications as diverse as biomedical risk models, drug discovery and materials science. Although deep learning has revolutionized learning from raw data and led to numerous high-profile success stories [3-5], gradient-boosted decision trees [6-9] have dominated tabular data for the past 20 years. Here we present the Tabular Prior-data Fitted Network (TabPFN), a tabular foundation model that outperforms all previous methods on datasets with up to 10,000 samples by a wide margin, using substantially less training time. In 2.8 s, TabPFN outperforms an ensemble of the strongest baselines tuned for 4 h in a classification setting. As a generative transformer-based foundation model, this model also allows fine-tuning, data generation, density estimation and learning reusable embeddings. TabPFN is a learning algorithm that is itself learned across millions of synthetic datasets, demonstrating the power of this approach for algorithm development. By improving modelling abilities across diverse fields, TabPFN has the potential to accelerate scientific discovery and enhance important decision-making in various domains.
PMID:39780007 | DOI:10.1038/s41586-024-08328-6
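TabPFN ships as a Python package with a scikit-learn-style interface; the sketch below follows that call pattern on a small public dataset, though argument names can differ across package versions, so treat it as illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()        # a single pretrained transformer, no tuning
clf.fit(X_tr, y_tr)             # "training" is conditioning on the dataset
print("test accuracy:", clf.score(X_te, y_te))
```

The design choice the abstract highlights is that the heavy learning happened once, over millions of synthetic datasets; at use time the model simply conditions on the new table, which is why fitting takes seconds rather than hours of tuning.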
Computational microscopy with coherent diffractive imaging and ptychography
Nature. 2025 Jan;637(8045):281-295. doi: 10.1038/s41586-024-08278-z. Epub 2025 Jan 8.
ABSTRACT
Microscopy and crystallography are two essential experimental methodologies for advancing modern science. They complement one another, with microscopy typically relying on lenses to image the local structures of samples, and crystallography using diffraction to determine the global atomic structure of crystals. Over the past two decades, computational microscopy, encompassing coherent diffractive imaging (CDI) and ptychography, has advanced rapidly, unifying microscopy and crystallography to overcome their limitations. Here, I review the innovative developments in CDI and ptychography, which achieve exceptional imaging capabilities across nine orders of magnitude in length scales, from resolving atomic structures in materials at sub-ångstrom resolution to quantitative phase imaging of centimetre-sized tissues, using the same principle and similar computational algorithms. These methods have been applied to determine the 3D atomic structures of crystal defects and amorphous materials, visualize oxygen vacancies in high-temperature superconductors and capture ultrafast dynamics. They have also been used for nanoscale imaging of magnetic, quantum and energy materials, nanomaterials, integrated circuits and biological specimens. By harnessing fourth-generation synchrotron radiation, X-ray-free electron lasers, high-harmonic generation, electron microscopes, optical microscopes, cutting-edge detectors and deep learning, CDI and ptychography are poised to make even greater contributions to multidisciplinary sciences in the years to come.
PMID:39780004 | DOI:10.1038/s41586-024-08278-z
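To make the "computational" part concrete, here is a toy error-reduction phase-retrieval loop, the classic ancestor of modern CDI algorithms; it is a didactic sketch under simplified assumptions, not any specific method reviewed in the article:

```python
import numpy as np

rng = np.random.default_rng(0)
obj = np.zeros((64, 64))
obj[24:40, 24:40] = rng.random((16, 16))       # toy object inside a support
support = obj > 0
measured = np.abs(np.fft.fft2(obj))            # diffraction magnitudes (phase lost)

est = rng.random(obj.shape) * support          # random start inside the support
for _ in range(500):
    F = np.fft.fft2(est)
    F = measured * np.exp(1j * np.angle(F))    # keep phase, impose measured modulus
    est = np.real(np.fft.ifft2(F))
    est = np.where(support & (est > 0), est, 0.0)  # support + positivity constraint

print("relative error:", np.linalg.norm(est - obj) / np.linalg.norm(obj))
```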
Attention-based deep learning for accurate cell image analysis
Sci Rep. 2025 Jan 8;15(1):1265. doi: 10.1038/s41598-025-85608-9.
ABSTRACT
High-content analysis (HCA) holds enormous potential for drug discovery and research, but widely used methods can be cumbersome and yield inaccurate results. Noisy and redundant signals in cell images impede accurate deep learning-based image analysis. To address these issues, we introduce X-Profiler, a novel HCA method that combines cellular experiments, image processing, and deep learning modeling. X-Profiler couples a convolutional neural network with a Transformer to encode high-content images, effectively filtering out noisy signals and precisely characterizing cell phenotypes. In comparative tests on drug-induced cardiotoxicity, mitochondrial toxicity classification, and compound classification, X-Profiler outperformed both DeepProfiler and CellProfiler, two highly recognized and representative methods in this field. Our results demonstrate the utility and versatility of X-Profiler, and we anticipate its wide application in HCA for advancing drug development and disease research.
PMID:39779905 | DOI:10.1038/s41598-025-85608-9
A hybrid machine learning approach for the personalized prognostication of aggressive skin cancers
NPJ Digit Med. 2025 Jan 8;8(1):15. doi: 10.1038/s41746-024-01329-9.
ABSTRACT
Accurate prognostication guides optimal clinical management in skin cancer. Merkel cell carcinoma (MCC) is the most aggressive form of skin cancer, often presenting at an advanced stage and associated with poor survival rates. No personalized prognostic tools are in use for MCC. We employed explainability analysis to reveal new insights into mortality risk factors for this highly aggressive cancer. We then combined deep learning feature selection with a modified XGBoost framework to develop a web-based prognostic tool for MCC termed 'DeepMerkel'. DeepMerkel can make accurate, personalized, time-dependent survival predictions for MCC from readily available clinical information. It demonstrated generalizability through high predictive performance in an international clinical cohort, outperforming current population-based prognostic staging systems. MCC and DeepMerkel provide an exemplar model of personalized machine learning prognostic tools in aggressive skin cancers.
PMID:39779875 | DOI:10.1038/s41746-024-01329-9
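Neither the deep-learning feature selector nor the XGBoost modifications are specified in the abstract, so the sketch below only illustrates the final stage under stated assumptions: a stock XGBoost classifier trained on a hypothetical pre-selected feature subset, with synthetic labels standing in for mortality outcomes.

```python
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

# Hypothetical clinical features assumed to survive DL-based selection.
selected = ["age", "tumor_size_mm", "immunosuppressed", "stage"]
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((500, len(selected))), columns=selected)
y = rng.integers(0, 2, size=500)          # placeholder mortality labels

model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.05)
model.fit(X, y)
print(dict(zip(selected, model.feature_importances_)))  # per-feature importance
```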
A hybrid CNN model for classification of motor tasks obtained from hybrid BCI system
Sci Rep. 2025 Jan 8;15(1):1360. doi: 10.1038/s41598-024-84883-2.
ABSTRACT
The hybrid brain-computer interface (BCI) has shown improved performance, especially in classifying multi-class data. Two non-invasive BCI modalities, electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS), are combined to achieve improved classification. Classifying contralateral and ipsilateral motor movements is challenging among other mental-activity signals. The current work focuses on the performance of deep learning methods, namely convolutional neural networks (CNN) and bidirectional long short-term memory (Bi-LSTM), in classifying four-class motor executions of the right hand, left hand, right arm, and left arm taken from the CORE dataset. Model performance was evaluated using metrics such as accuracy, F1-score, precision, recall, AUC, and the ROC curve. The CNN and hybrid CNN models achieved 98.3% and 99% accuracy, respectively.
PMID:39779796 | DOI:10.1038/s41598-024-84883-2
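The abstract does not detail the hybrid network, so the following is only a minimal sketch of one common fusion pattern: separate 1-D convolutional branches encode the EEG and fNIRS time series, and the concatenated features feed a four-class head. Channel counts and sequence lengths are illustrative assumptions.

```python
import torch
import torch.nn as nn

def branch(in_ch):
    """One modality-specific 1-D conv encoder producing a 32-dim summary."""
    return nn.Sequential(
        nn.Conv1d(in_ch, 32, kernel_size=7, padding=3), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    )

class HybridCNN(nn.Module):
    def __init__(self, eeg_ch=30, fnirs_ch=36, n_classes=4):
        super().__init__()
        self.eeg, self.fnirs = branch(eeg_ch), branch(fnirs_ch)
        self.head = nn.Linear(64, n_classes)   # 32 + 32 fused features

    def forward(self, eeg, fnirs):
        return self.head(torch.cat([self.eeg(eeg), self.fnirs(fnirs)], dim=1))

model = HybridCNN()
logits = model(torch.randn(8, 30, 512), torch.randn(8, 36, 128))  # (8, 4)
```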
Nf-Root: A Best-Practice Pipeline for Deep-Learning-Based Analysis of Apoplastic pH in Microscopy Images of Developmental Zones in Plant Root Tissue
Quant Plant Biol. 2024 Dec 23;5:e12. doi: 10.1017/qpb.2024.11. eCollection 2024.
ABSTRACT
Hormonal mechanisms associated with cell elongation play a vital role in the development and growth of plants. Here, we report Nextflow-root (nf-root), a novel best-practice pipeline for deep-learning-based analysis of fluorescence microscopy images of plant root tissue from A. thaliana. This bioinformatics pipeline performs automatic identification of developmental zones in root tissue images and also includes apoplastic pH measurements, which are useful for modeling hormone signaling and cell-physiological responses. We show that this nf-core standard-based pipeline successfully automates tissue zone segmentation and is both high-throughput and highly reproducible. In short, a deep-learning module deploys deterministically trained convolutional neural network models and augments the segmentation predictions with measures of prediction uncertainty and model interpretability, aiming to facilitate result interpretation and verification by experienced plant biologists. We observed a high statistical similarity between the manually generated results and the output of nf-root.
PMID:39777028 | PMC:PMC11706687 | DOI:10.1017/qpb.2024.11
Assessment of human emotional reactions to visual stimuli "deep-dreamed" by artificial neural networks
Front Psychol. 2024 Dec 24;15:1509392. doi: 10.3389/fpsyg.2024.1509392. eCollection 2024.
ABSTRACT
INTRODUCTION: While the fact that visual stimuli synthesized by Artificial Neural Networks (ANN) may evoke emotional reactions is documented, the precise mechanisms that connect the strength and type of such reactions with the ways of how ANNs are used to synthesize visual stimuli are yet to be discovered. Understanding these mechanisms allows for designing methods that synthesize images attenuating or enhancing selected emotional states, which may provide unobtrusive and widely-applicable treatment of mental dysfunctions and disorders.
METHODS: A Convolutional Neural Network (CNN), a type of ANN used in computer vision tasks that models the way humans solve visual tasks, was applied to synthesize ("dream" or "hallucinate") images with no semantic content by maximizing the activations of neurons in precisely selected layers of the CNN. The evoked emotions of 150 human subjects observing these images were self-reported on a two-dimensional scale (arousal and valence) utilizing self-assessment manikin (SAM) figures. Correlations were calculated between arousal and valence values and image visual properties (e.g., color, brightness, clutter feature congestion, and clutter sub-band entropy), as well as the position of the CNN layers stimulated to obtain a given image.
RESULTS: Synthesized images that maximized activations of some of the CNN layers led to significantly higher or lower arousal and valence levels compared to average subject's reactions. Multiple linear regression analysis found that a small set of selected image global visual features (hue, feature congestion, and sub-band entropy) are significant predictors of the measured arousal, however no statistically significant dependencies were found between image global visual features and the measured valence.
CONCLUSION: This study demonstrates that synthesizing images by maximizing the activations of small, precisely selected parts of the CNN used in this work can produce visual stimuli that enhance or attenuate emotional reactions. The method paves the way for tools that provide non-invasive stimulation to support wellbeing (managing stress, enhancing mood) and to assist patients with certain mental conditions by complementing traditional therapeutic interventions.
PMID:39776961 | PMC:PMC11703666 | DOI:10.3389/fpsyg.2024.1509392
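The synthesis procedure, maximizing activations of neurons in selected CNN layers, corresponds to the classic activation-maximization ("deep dream") recipe. Here is a minimal sketch with a stock VGG16 and an arbitrary example layer standing in for the study's CNN and layer choices:

```python
import torch
from torchvision import models

cnn = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
layer_idx = 17                      # illustrative choice of layer to stimulate

img = torch.rand(1, 3, 224, 224, requires_grad=True)  # start from noise
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    x = img
    for i, layer in enumerate(cnn): # forward only up to the chosen layer
        x = layer(x)
        if i == layer_idx:
            break
    loss = -x.mean()                # gradient ascent on mean activation
    loss.backward()
    opt.step()
    img.data.clamp_(0, 1)           # keep pixels in a displayable range
```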
Decorrelative network architecture for robust electrocardiogram classification
Patterns (N Y). 2024 Dec 9;5(12):101116. doi: 10.1016/j.patter.2024.101116. eCollection 2024 Dec 13.
ABSTRACT
To achieve adequate trust in patient-critical medical tasks, artificial intelligence must be able to recognize instances where it cannot operate confidently. Ensemble methods are deployed to estimate uncertainty, but models in an ensemble often share the same vulnerabilities to adversarial attacks. We propose an ensemble approach based on feature decorrelation and Fourier partitioning that teaches networks diverse features, reducing the chance of perturbation-based fooling. We test our approach against white-box attacks in single- and multi-channel electrocardiogram classification and adapt adversarial training and DVERGE into an ensemble framework for comparison. Our results indicate that the combination of decorrelation and Fourier partitioning maintains performance on unperturbed data while demonstrating superior uncertainty estimation on projected gradient descent and smooth adversarial attacks of various magnitudes. Furthermore, our approach does not require expensive optimization with adversarial samples during training. These methods can be applied to other tasks for more robust models.
PMID:39776851 | PMC:PMC11701855 | DOI:10.1016/j.patter.2024.101116
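The paper's exact decorrelation loss is not given in the abstract; the sketch below shows one plausible form, penalizing the squared Pearson correlation between the feature dimensions two ensemble members extract from the same batch, to be added to the members' task losses:

```python
import torch

def decorrelation_penalty(feat_a, feat_b, eps=1e-8):
    """Mean squared Pearson correlation between two members' features."""
    a = (feat_a - feat_a.mean(0)) / (feat_a.std(0) + eps)   # (batch, dim)
    b = (feat_b - feat_b.mean(0)) / (feat_b.std(0) + eps)
    corr = (a * b).mean(0)                                  # per-dimension r
    return (corr ** 2).mean()

fa, fb = torch.randn(32, 128), torch.randn(32, 128)         # toy feature batches
penalty = decorrelation_penalty(fa, fb)
# total = task_loss_a + task_loss_b + 0.1 * penalty  (combine with task losses)
print(penalty)
```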
Deep Learning for Discrimination of Early Spinal Tuberculosis from Acute Osteoporotic Vertebral Fracture on CT
Infect Drug Resist. 2025 Jan 3;18:31-42. doi: 10.2147/IDR.S482584. eCollection 2025.
ABSTRACT
BACKGROUND: Early differentiation between spinal tuberculosis (STB) and acute osteoporotic vertebral compression fracture (OVCF) is crucial for determining the appropriate clinical management and treatment pathway, thereby significantly impacting patient outcomes.
OBJECTIVE: To evaluate the efficacy of deep learning (DL) models using reconstructed sagittal CT images in the differentiation of early STB from acute OVCF, with the aim of enhancing diagnostic precision, reducing reliance on MRI and biopsies, and minimizing the risks of misdiagnosis.
METHODS: Data were collected from 373 patients, with 302 patients recruited from a university-affiliated hospital serving as the training and internal validation sets, and an additional 71 patients from another university-affiliated hospital serving as the external validation set. MVITV2, Efficient-Net-B5, ResNet101, and ResNet50 were used as the backbone networks for DL model development, training, and validation. Model evaluation was based on accuracy, precision, sensitivity, F1 score, and area under the curve (AUC). The performance of the DL models was compared with the diagnostic accuracy of two spine surgeons who performed a blinded review.
RESULTS: The MVITV2 model outperformed other architectures in the internal validation set, achieving accuracy of 98.98%, precision of 100%, sensitivity of 97.97%, F1 score of 98.98%, and AUC of 0.997. The performance of the DL models notably exceeded that of the spine surgeons, who achieved accuracy rates of 77.38% and 93.56%. The external validation confirmed the models' robustness and generalizability.
CONCLUSION: The DL models significantly improved the differentiation between STB and OVCF, surpassing experienced spine surgeons in diagnostic accuracy. These models offer a promising alternative to traditional imaging and invasive procedures, potentially promoting early and accurate diagnosis, reducing healthcare costs, and improving patient outcomes. The findings underscore the potential of artificial intelligence for revolutionizing spinal disease diagnostics, and have substantial clinical implications.
PMID:39776757 | PMC:PMC11706012 | DOI:10.2147/IDR.S482584
Adaptive Treatment of Metastatic Prostate Cancer Using Generative Artificial Intelligence
Clin Med Insights Oncol. 2025 Jan 6;19:11795549241311408. doi: 10.1177/11795549241311408. eCollection 2025.
ABSTRACT
Despite the expanding therapeutic options available to cancer patients, therapeutic resistance, disease recurrence, and metastasis persist as hallmark challenges in the treatment of cancer. The rise to prominence of generative artificial intelligence (GenAI) in many realms of human activity compels consideration of its capabilities as a potential lever to advance the development of effective cancer treatments. This article presents a hypothetical case study on the application of generative pre-trained transformers (GPTs) to the treatment of metastatic prostate cancer (mPC). The case explores the design of GPT-supported adaptive intermittent therapy for mPC. Testosterone and prostate-specific antigen (PSA) are assumed to be repeatedly monitored, while treatment may involve a combination of androgen deprivation therapy (ADT), androgen receptor-signalling inhibitors (ARSI), chemotherapy, and radiotherapy. The analysis covers various questions relevant to the configuration, training, and inferencing of GPTs for mPC treatment, with particular attention to risk mitigation regarding the hallucination problem and its implications for the clinical integration of GenAI technologies. The case study provides elements of an actionable pathway to the realization of GenAI-assisted adaptive treatment of metastatic prostate cancer. As such, the study is expected to help facilitate the design of clinical trials of GenAI-supported cancer treatments.
PMID:39776668 | PMC:PMC11701910 | DOI:10.1177/11795549241311408
Predicting the risk of type 2 diabetes mellitus (T2DM) emergence in 5 years using mammography images: a comparison study between radiomics and deep learning algorithm
J Med Imaging (Bellingham). 2025 Jan;12(1):014501. doi: 10.1117/1.JMI.12.1.014501. Epub 2025 Jan 6.
ABSTRACT
PURPOSE: The prevalence of type 2 diabetes mellitus (T2DM) has been steadily increasing over the years. We aim to predict the occurrence of T2DM within 5 years from mammography images using two different methods and compare their performance.
APPROACH: We examined 312 samples, including 110 positive cases (developed T2DM after 5 years) and 202 negative cases (did not develop T2DM), using two different methods. In the first, radiomics-based approach, we utilized radiomics features and machine learning (ML) algorithms. The entire breast region was chosen as the region of interest for extracting radiomics features. A binary breast image was then created, from which we extracted 668 features that were analyzed using various ML algorithms. In the second method, a convolutional neural network (CNN) with a modified ResNet architecture and various kernel sizes was applied to raw mammography images for the prediction task. A nested, stratified five-fold cross-validation was performed for both approaches to compute accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC). Hyperparameter tuning was also done to enhance model performance and reliability.
RESULTS: The radiomics approach's light gradient boosting model gave 68.9% accuracy, 30.7% sensitivity, 89.5% specificity, and 0.63 AUROC. The CNN method achieved an AUROC of 0.58 over 20 epochs.
CONCLUSION: Radiomics outperformed CNN by 0.05 in terms of AUROC. This may be due to the more straightforward interpretability and clinical relevance of predefined radiomics features compared with the complex, abstract features learned by CNNs.
PMID:39776665 | PMC:PMC11702674 | DOI:10.1117/1.JMI.12.1.014501
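The nested, stratified five-fold protocol maps directly onto scikit-learn. In this sketch, a GradientBoostingClassifier stands in for the paper's light gradient boosting model, and the feature matrix is random placeholder data with the study's dimensions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((312, 668))                 # 668 radiomics features per sample
y = np.r_[np.ones(110), np.zeros(202)]     # 110 positive, 202 negative cases

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tuning folds
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation folds

search = GridSearchCV(
    GradientBoostingClassifier(),
    param_grid={"n_estimators": [100, 300], "learning_rate": [0.05, 0.1]},
    cv=inner, scoring="roc_auc",
)
scores = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
print("nested-CV AUROC: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```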
Deep-blur: Blind identification and deblurring with convolutional neural networks
Biol Imaging. 2024 Nov 15;4:e13. doi: 10.1017/S2633903X24000096. eCollection 2024.
ABSTRACT
We propose a neural network architecture and a training procedure to estimate blurring operators and deblur images from a single degraded image. Our key assumption is that the forward operators can be parameterized by a low-dimensional vector. The models we consider include a description of the point spread function with Zernike polynomials in the pupil plane or product-convolution expansions, which incorporate space-varying operators. Numerical experiments show that the proposed method can accurately and robustly recover the blur parameters even for large noise levels. For a convolution model, the average signal-to-noise ratio of the recovered point spread function ranges from 13 dB in the noiseless regime to 8 dB in the high-noise regime; in comparison, the tested alternatives yield negative values. This operator estimate can then be used as an input to an unrolled neural network to deblur the image. Quantitative experiments on synthetic data demonstrate that this method outperforms other commonly used methods both perceptually and in terms of SSIM. The algorithm can process a 512 × 512 image in under a second on a consumer graphics card and does not require any human interaction once the operator parameterization has been set up.
PMID:39776610 | PMC:PMC11704139 | DOI:10.1017/S2633903X24000096
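The key assumption, a blur operator parameterized by a low-dimensional vector, can be illustrated with a toy one-parameter example: recovering the width of a Gaussian PSF by gradient descent. Unlike the paper's blind setting, this toy assumes the sharp image is known, and it uses direct optimization instead of a trained network:

```python
import torch

def gaussian_psf(sigma, size=15):
    """Normalized 2-D Gaussian kernel, differentiable w.r.t. sigma."""
    ax = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size)

img = torch.rand(1, 1, 64, 64)                       # known sharp image (toy)
blurred = torch.nn.functional.conv2d(img, gaussian_psf(torch.tensor(2.0)), padding=7)

sigma = torch.tensor(1.0, requires_grad=True)        # unknown blur parameter
opt = torch.optim.Adam([sigma], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    pred = torch.nn.functional.conv2d(img, gaussian_psf(sigma), padding=7)
    loss = ((pred - blurred) ** 2).mean()            # match the observed blur
    loss.backward()
    opt.step()
print("recovered sigma:", sigma.item())              # converges toward 2.0
```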