Deep learning
Evaluation of an enhanced ResNet-18 classification model for rapid on-site diagnosis in respiratory cytology
BMC Cancer. 2025 Jan 3;25(1):10. doi: 10.1186/s12885-024-13402-3.
ABSTRACT
OBJECTIVE: Rapid on-site evaluation (ROSE) of respiratory cytology specimens is a critical technique for accurate and timely diagnosis of lung cancer. However, in China, limited familiarity with the Diff-Quik staining method and a shortage of trained cytopathologists hamper utilization of ROSE. Therefore, developing an improved deep learning model to assist clinicians in promptly and accurately evaluating Diff-Quik stained cytology samples during ROSE has important clinical value.
METHODS: Retrospectively, 116 digital images of Diff-Quik stained cytology samples were obtained from whole slide scans. These included 6 diagnostic categories - carcinoid, normal cells, adenocarcinoma, squamous cell carcinoma, non-small cell carcinoma, and small cell carcinoma. All malignant diagnoses were confirmed by histopathology and immunohistochemistry. The test image set was presented to 3 cytopathologists from different hospitals with varying levels of experience, as well as an artificial intelligence system, as single-choice questions.
RESULTS: The diagnostic accuracy of the cytopathologists correlated with their years of practice and hospital setting. The AI model demonstrated proficiency comparable to that of the human readers. Importantly, every combination of AI assistance with a human cytopathologist increased diagnostic efficiency to a varying degree.
CONCLUSIONS: This deep learning model shows promising capability as an aid for on-site diagnosis of respiratory cytology samples. However, human expertise remains essential to the diagnostic process.
PMID:39754166 | DOI:10.1186/s12885-024-13402-3
Novel transfer learning based bone fracture detection using radiographic images
BMC Med Imaging. 2025 Jan 3;25(1):5. doi: 10.1186/s12880-024-01546-4.
ABSTRACT
A bone fracture is a medical condition characterized by a partial or complete break in the continuity of the bone. Fractures are primarily caused by injuries and accidents, affecting millions of people worldwide. The healing process for a fracture can take anywhere from one month to one year, leading to significant economic and psychological challenges for patients. The detection of bone fractures is crucial, and radiographic images are often relied on for accurate assessment. An efficient neural network method is essential for the early detection and timely treatment of fractures. In this study, we propose a novel transfer learning-based approach called MobLG-Net for feature engineering purposes. Initially, the spatial features are extracted from bone X-ray images using a transfer model, MobileNet, and then input into a tree-based light gradient boosting machine (LGBM) model for the generation of class probability features. Several machine learning (ML) techniques are applied to the subsets of newly generated transfer features to compare the results. K-nearest neighbor (KNN), LGBM, logistic regression (LR), and random forest (RF) are implemented using the novel features with optimized hyperparameters. The LGBM and LR models trained on proposed MobLG-Net (MobileNet-LGBM) based features outperformed others, achieving an accuracy of 99% in predicting bone fractures. A cross-validation mechanism is used to evaluate the performance of each model. The proposed study can improve the detection of bone fractures using X-ray images.
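As a rough illustration of the MobLG-Net idea (MobileNet spatial features feeding an LGBM whose class probabilities become compact "transfer" features for simple classifiers), here is a minimal sketch; the image size, hyperparameters, and use of in-sample probabilities are simplifying assumptions, and in practice out-of-fold probabilities would be used to avoid leakage.

```python
# Hedged sketch of a MobileNet -> LGBM probability-feature pipeline in the
# spirit of MobLG-Net; shapes and hyperparameters are illustrative assumptions.
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input
from lightgbm import LGBMClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def extract_spatial_features(images):
    """images: (N, 224, 224, 3) X-ray crops -> (N, 1024) pooled MobileNet features."""
    backbone = MobileNet(weights="imagenet", include_top=False, pooling="avg")
    return backbone.predict(preprocess_input(images.astype("float32")))

def probability_features(X_deep, y):
    """LGBM class probabilities as new features; in practice these should be
    produced out-of-fold to avoid leaking labels into downstream training."""
    lgbm = LGBMClassifier(n_estimators=200)
    lgbm.fit(X_deep, y)
    return lgbm.predict_proba(X_deep)

# X: stacked radiograph crops, y: fracture labels
# X_prob = probability_features(extract_spatial_features(X), y)
# scores = cross_val_score(LogisticRegression(max_iter=1000), X_prob, y, cv=10)
# print(scores.mean())
```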
PMID:39754038 | DOI:10.1186/s12880-024-01546-4
Deep learning in 3D cardiac reconstruction: a systematic review of methodologies and dataset
Med Biol Eng Comput. 2025 Jan 4. doi: 10.1007/s11517-024-03273-y. Online ahead of print.
ABSTRACT
This study presents an advanced methodology for 3D heart reconstruction using a combination of deep learning models and computational techniques, addressing critical challenges in cardiac modeling and segmentation. A multi-dataset approach was employed, including data from the UK Biobank, MICCAI Multi-Modality Whole Heart Segmentation (MM-WHS) challenge, and clinical datasets of congenital heart disease. Preprocessing steps involved segmentation, intensity normalization, and mesh generation, while the reconstruction was performed using a blend of statistical shape modeling (SSM), graph convolutional networks (GCNs), and progressive GANs. The statistical shape models were utilized to capture anatomical variations through principal component analysis (PCA), while GCNs refined the meshes derived from segmented slices. Synthetic data generated by progressive GANs enabled augmentation, particularly useful for congenital heart conditions. Evaluation of the reconstruction accuracy was performed using metrics such as Dice similarity coefficient (DSC), Chamfer distance, and Hausdorff distance, with the proposed framework demonstrating superior anatomical precision and functional relevance compared to traditional methods. This approach highlights the potential for automated, high-resolution 3D heart reconstruction applicable in both clinical and research settings. The results emphasize the critical role of deep learning in enhancing anatomical accuracy, particularly for rare and complex cardiac conditions. This paper is particularly important for researchers wanting to utilize deep learning in cardiac imaging and 3D heart reconstruction, bringing insights into the integration of modern computational methods.
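As a small illustration of one ingredient the review highlights, the sketch below fits a PCA-based statistical shape model over aligned cardiac surface meshes and synthesizes new shapes from its leading modes; the mesh layout and mode count are assumptions, and the GCN and GAN stages are omitted.

```python
# Minimal statistical shape model (SSM) sketch: PCA over aligned cardiac
# meshes, as the abstract describes; vertex layout is an assumption.
import numpy as np

def fit_ssm(meshes):
    """meshes: (N, V, 3) aligned vertex coordinates -> mean shape, modes, variances."""
    X = meshes.reshape(len(meshes), -1)           # flatten each mesh to (3V,)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = S**2 / (len(meshes) - 1)          # variance captured by each mode
    return mean, Vt, variances

def synthesize(mean, modes, variances, coeffs):
    """Generate a new heart shape from the first k modes (e.g. for augmentation)."""
    k = len(coeffs)
    shape = mean + (coeffs * np.sqrt(variances[:k])) @ modes[:k]
    return shape.reshape(-1, 3)
```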
PMID:39753994 | DOI:10.1007/s11517-024-03273-y
Assessment of choroidal vessels in healthy eyes using 3-dimensional vascular maps and a semi-automated deep learning approach
Sci Rep. 2025 Jan 3;15(1):714. doi: 10.1038/s41598-025-85189-7.
ABSTRACT
To assess the choroidal vessels in healthy eyes using a novel three-dimensional (3D) deep learning approach. In this cross-sectional retrospective study, swept-source OCT 6 × 6 mm scans were obtained on a Plex Elite 9000 device. Automated segmentation of the choroidal layer was achieved using a deep-learning ResUNet model along with a volumetric smoothing approach. Phansalkar thresholding was employed to binarize the choroidal vasculature. The choroidal vessels were visualized in 3D maps and divided into five sectors: nasal, temporal, superior, inferior, and central. Choroidal thickness (CT) and choroidal vascularity index (CVI) of the whole volumes were calculated using the automated software. Three vessels in each sector were measured to obtain the mean choroidal vessel diameter (MChVD). The inter-vessel distance (IVD) was defined as the distance between a vessel and the nearest non-collateral vessel. The choroidal biomarkers obtained were compared between age groups (18 to 34 years old, 35 to 59 years old, and ≥ 60 years old) and between sexes. Linear mixed models and univariate analysis were used for statistical analysis. A total of 80 eyes of 53 patients were included in the analysis. The mean age of the patients was 44.7 ± 18.5 years, and 54.7% were female. Overall, 44 eyes of 29 females and 36 eyes of 24 males were included in the study. We observed that 33% of the eyes presented at least one choroidal vessel larger than 200 μm crossing the central 3000 μm of the macula. We also observed a significant decrease in mean CVI with advancing age (p < 0.05), whereas mean MChVD and IVD showed no significant changes (p > 0.05). Furthermore, CVI was higher in females than in males in every sector, with a significant difference in the temporal sector (p < 0.05). In summary, MChVD and IVD did not change with increasing age, whereas CVI decreased with age and was higher in healthy females than in males. The 3D assessment of choroidal vessels using a deep-learning approach represents an innovative, non-invasive technique for investigating choroidal vasculature, with potential applications in research and clinical practice.
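For readers unfamiliar with the binarization step, the sketch below implements Phansalkar local thresholding and a choroidal vascularity index on a normalized B-scan; the window size and the widely used parameter defaults (p = 2, q = 10, k = 0.25, R = 0.5) are assumptions rather than this study's exact settings.

```python
# Hedged sketch of Phansalkar local thresholding and CVI computation
# (CVI = luminal pixels / total choroid pixels).
import numpy as np
from scipy.ndimage import uniform_filter

def phansalkar_threshold(img, window=15, k=0.25, R=0.5, p=2.0, q=10.0):
    """img: 2-D OCT B-scan scaled to [0, 1]; returns a binary vessel (dark) mask."""
    m = uniform_filter(img, window)                                   # local mean
    s = np.sqrt(np.maximum(uniform_filter(img**2, window) - m**2, 0)) # local std
    T = m * (1 + p * np.exp(-q * m) + k * (s / R - 1))
    return img < T                            # luminal (vessel) areas are hyporeflective

def choroidal_vascularity_index(img, choroid_mask):
    luminal = phansalkar_threshold(img) & choroid_mask
    return luminal.sum() / choroid_mask.sum()
```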
PMID:39753934 | DOI:10.1038/s41598-025-85189-7
Deep learning-based pelvimetry in pelvic MRI volumes for pre-operative difficulty assessment of total mesorectal excision
Surg Endosc. 2025 Jan 3. doi: 10.1007/s00464-024-11485-4. Online ahead of print.
ABSTRACT
BACKGROUND: Specific pelvic bone dimensions have been identified as predictors of total mesorectal excision (TME) difficulty and outcomes. However, manual measurement of these dimensions (pelvimetry) is labor intensive and thus, anatomic criteria are not included in the pre-operative difficulty assessment. In this work, we propose an automated workflow for pelvimetry based on pre-operative magnetic resonance imaging (MRI) volumes.
METHODS: We implement a deep learning-based framework to measure the predictive pelvic dimensions automatically. A 3D U-Net takes a sagittal T2-weighted MRI volume as input and localizes five anatomic landmarks: the promontorium, the S3 vertebra, the coccyx, and the dorsal and cranial parts of the os pubis. The landmarks are used to quantify the lengths of the pelvic inlet, outlet, and depth, and the angle of the sacrum. For the development of the network, we used MRI volumes from 1707 patients acquired in eight TME centers. The automated landmark localization and pelvic dimension measurements were assessed by comparison with manual annotation.
RESULTS: A center-stratified fivefold cross-validation showed a mean landmark localization error of 5.6 mm. The inter-observer variation for manual annotation was 3.7 ± 8.4 mm. The automated dimension measurements had a Spearman correlation coefficient ranging between 0.7 and 0.87.
CONCLUSION: To our knowledge, this is the first study to automate pelvimetry in MRI volumes using deep learning. Our framework can measure the pelvic dimensions with high accuracy, enabling the extraction of metrics that facilitate a pre-operative difficulty assessment of the TME.
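A minimal sketch of how the five detected landmarks might be turned into pelvic dimensions is shown below; the landmark pairings, dictionary keys, and angle definition are illustrative assumptions, since the abstract does not spell out the exact clinical formulas.

```python
# Hedged sketch: landmark coordinates -> pelvic dimensions; pairings assumed.
import numpy as np

def dist(a, b):
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def pelvic_dimensions(lm):
    """lm: dict of 3-D landmark coordinates (mm) from the sagittal MRI volume."""
    inlet = dist(lm["promontorium"], lm["os_pubis_cranial"])   # pelvic inlet length
    outlet = dist(lm["coccyx"], lm["os_pubis_dorsal"])         # pelvic outlet length
    depth = dist(lm["promontorium"], lm["coccyx"])             # pelvic depth
    # sacral angle between the inlet line and the promontorium-S3 segment
    u = np.asarray(lm["os_pubis_cranial"]) - np.asarray(lm["promontorium"])
    v = np.asarray(lm["S3"]) - np.asarray(lm["promontorium"])
    angle = np.degrees(np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v))))
    return {"inlet_mm": inlet, "outlet_mm": outlet,
            "depth_mm": depth, "sacral_angle_deg": angle}
```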
PMID:39753930 | DOI:10.1007/s00464-024-11485-4
Local corner smoothing based on deep learning for CNC machine tools
Sci Rep. 2025 Jan 2;15(1):404. doi: 10.1038/s41598-024-84577-9.
ABSTRACT
Most toolpaths for machining are composed of series of short linear segments (G01 commands), which limits the feedrate and machining quality. To generate a smooth machining path, a new optimization strategy is proposed to optimize the toolpath at the curvature level. First, the three essential components of the optimization are introduced, and local corner smoothing is formulated as an optimization problem. The optimization problem is then solved by an intelligent optimization algorithm. Considering the influence of population size and computational resources on intelligent optimization algorithms, a deep learning algorithm, the Double-ResNet Local Smoothing (DRLS) algorithm, is proposed to further improve optimization efficiency. The First-Double-Local Smoothing (FDLS) algorithm is used to optimize the positions of NURBS (Non-Uniform Rational B-Spline) control points, and the Second-Double-Local Smoothing (SDLS) algorithm is employed to optimize the NURBS weights to generate a smoother toolpath, allowing the cutting tool to pass through each local corner at a higher feedrate during machining. To ensure machining quality, geometric constraints, drive condition constraints, and contour error constraints are taken into account during feedrate planning. Finally, three simulations are presented to verify the effectiveness of the proposed method.
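To make the corner-smoothing idea concrete, the sketch below rounds a sharp G01 corner with a quadratic Bezier blend; the paper instead optimizes NURBS control points and weights with the DRLS algorithm, so this fixed-geometry version only illustrates the underlying principle.

```python
# Illustrative corner rounding: replace the sharp vertex of a G01 corner with
# a quadratic Bezier blend so the tool need not stop at the corner.
import numpy as np

def smooth_corner(p0, p1, p2, d=0.5, n=20):
    """p0-p1-p2: consecutive G01 waypoints; d: blend distance along each segment."""
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    a = p1 + d * (p0 - p1) / np.linalg.norm(p0 - p1)   # blend entry point
    b = p1 + d * (p2 - p1) / np.linalg.norm(p2 - p1)   # blend exit point
    t = np.linspace(0, 1, n)[:, None]
    # quadratic Bezier with the corner itself as the middle control point
    return (1 - t) ** 2 * a + 2 * (1 - t) * t * p1 + t**2 * b

corner_arc = smooth_corner([0, 0], [10, 0], [10, 10], d=1.0)
```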
PMID:39753859 | DOI:10.1038/s41598-024-84577-9
Enhancing Radiographic Diagnosis: CycleGAN-Based Methods for Reducing Cast Shadow Artifacts in Wrist Radiographs
J Imaging Inform Med. 2025 Jan 3. doi: 10.1007/s10278-024-01385-3. Online ahead of print.
ABSTRACT
We extend existing techniques by using generative adversarial network (GAN) models to reduce the appearance of cast shadows in radiographs across various age groups. We retrospectively collected 11,500 adult and paediatric wrist radiographs, evenly divided between those with and without casts. The test subset consisted of 750 radiographs with a cast and 750 without. We extended the results of a previous study that employed CycleGAN by enhancing the model with a perceptual loss function and a self-attention layer. The CycleGAN model incorporating the self-attention layer and perceptual loss function delivered quantitative performance similar to that of the original model. This model was applied to images from 20 cases where the original reports recommended CT scanning or repeat radiographs without the cast, which were then evaluated by radiologists for qualitative assessment. The results demonstrated that the generated images could improve radiologists' diagnostic confidence, in some cases leading to more decisive reports. Where available, the reports from follow-up imaging were compared with those produced by radiologists reading AI-generated images. All but two reports provided diagnoses identical to those associated with the follow-up imaging. The ability of radiologists to perform robust reporting with downsampled AI-enhanced images is clinically meaningful and warrants further investigation. Additionally, radiologists were unable to distinguish AI-enhanced from unenhanced images. These findings suggest the cast suppression technique could be integrated as a tool to augment clinical workflows, with the potential benefits of reducing patient doses, improving operational efficiency, reducing delays in diagnosis, and reducing the number of patient visits.
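A minimal sketch of the perceptual-loss ingredient is shown below; the choice of VGG16 features up to relu3_3 is a common convention rather than the authors' documented configuration.

```python
# Hedged sketch of a perceptual loss of the kind added to a CycleGAN objective.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
        for p in vgg.parameters():
            p.requires_grad = False            # frozen feature extractor
        self.vgg = vgg
        self.l1 = nn.L1Loss()

    def forward(self, generated, target):
        # radiographs are single-channel; repeat to 3 channels for VGG
        g = generated.repeat(1, 3, 1, 1) if generated.shape[1] == 1 else generated
        t = target.repeat(1, 3, 1, 1) if target.shape[1] == 1 else target
        return self.l1(self.vgg(g), self.vgg(t))  # match mid-level feature maps
```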
PMID:39753829 | DOI:10.1007/s10278-024-01385-3
Combining the Variational and Deep Learning Techniques for Classification of Video Capsule Endoscopic Images
J Imaging Inform Med. 2025 Jan 3. doi: 10.1007/s10278-024-01352-y. Online ahead of print.
ABSTRACT
Gastrointestinal tract-related cancers pose a significant health burden, with high mortality rates. To detect anomalies of the gastrointestinal tract that may progress to cancer, a video capsule endoscopy procedure is employed. The number of video capsule endoscopic (VCE) images produced per examination is enormous, which necessitates hours of analysis by clinicians. Therefore, there is a pressing need for automated computer-aided lesion classification techniques. Computer-aided systems utilize deep learning (DL) techniques, as they can potentially enhance anomaly detection rates. However, most DL techniques available in the literature classify static frames, using only the spatial information of the image, and they perform only binary classification. Thus, the presented work proposes a framework to perform multi-class classification of VCE images by using the dynamic information of the images. The proposed algorithm is a combination of a fractional-order variational model and a DL model. The fractional-order variational model captures the dynamic information of VCE images by estimating optical flow color maps. The optical flow color maps are fed to the DL model for training. The DL model performs the multi-class classification task and localizes the region of interest with the maximum class score. The DL model is inspired by the Faster R-CNN approach, and its backbone architecture is EfficientNet B0. The proposed framework achieves an average AUC of 0.98, an mAP of 0.93, and a balanced accuracy of 0.878. Hence, the proposed model is efficient for VCE image classification and detection of regions of interest.
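The sketch below illustrates the optical-flow color-map encoding, using OpenCV's Farneback estimator as a stand-in; the paper's fractional-order variational model would replace this flow estimator, so the function is an assumption-laden substitute showing only the encoding step.

```python
# Hedged sketch: frame-to-frame motion -> optical-flow color map for the detector.
import cv2
import numpy as np

def flow_color_map(prev_gray, curr_gray):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((*prev_gray.shape, 3), dtype=np.uint8)
    hsv[..., 0] = ang * 180 / np.pi / 2                   # hue encodes flow direction
    hsv[..., 1] = 255
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)  # value = magnitude
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)           # RGB-like input for the CNN
```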
PMID:39753827 | DOI:10.1007/s10278-024-01352-y
Artificial Intelligence and Cancer Health Equity: Bridging the Divide or Widening the Gap
Curr Oncol Rep. 2025 Jan 3. doi: 10.1007/s11912-024-01627-1. Online ahead of print.
ABSTRACT
PURPOSE OF REVIEW: This review aims to evaluate the impact of artificial intelligence (AI) on cancer health equity, specifically investigating whether AI is addressing or widening disparities in cancer outcomes.
RECENT FINDINGS: Recent studies demonstrate significant advancements in AI, such as deep learning for cancer diagnosis and predictive analytics for personalized treatment, showing potential for improved precision in care. However, concerns persist about the performance of AI tools across diverse populations due to biased training data. Access to AI technologies also remains limited, particularly in low-income and rural settings. AI holds promise for advancing cancer care, but its current application risks exacerbating existing health disparities. To ensure AI benefits all populations, future research must prioritize inclusive datasets, integrate social determinants of health, and develop ethical frameworks. Addressing these challenges is crucial for AI to contribute positively to cancer health equity and guide future research and policy development.
PMID:39753817 | DOI:10.1007/s11912-024-01627-1
Explainable artificial intelligence with UNet based segmentation and Bayesian machine learning for classification of brain tumors using MRI images
Sci Rep. 2025 Jan 3;15(1):690. doi: 10.1038/s41598-024-84692-7.
ABSTRACT
Detecting brain tumours (BT) early improves treatment possibilities and increases patient survival rates. Magnetic resonance imaging (MRI) scanning offers more comprehensive information, such as better contrast and clarity, than any alternative scanning process. Manually separating BTs from the many MRI images gathered in medical practice for cancer analysis is challenging and time-consuming, and MRI images can sometimes appear normal even when a patient has a tumour or malignancy. Machine learning technologies can expose tumours in brain MRI scans, simplifying the process for doctors. Deep learning approaches have recently relied on deep convolutional neural networks to analyze medical images with promising outcomes, helping to save lives faster and to rectify some medical errors. With this motivation, this article presents a new explainable artificial intelligence technique with semantic segmentation and Bayesian machine learning for brain tumors (XAISS-BMLBT). The presented XAISS-BMLBT technique mainly concentrates on the semantic segmentation and classification of BT in MRI images. The approach initially involves bilateral filtering-based image pre-processing to eliminate noise. Next, the XAISS-BMLBT technique performs MEDU-Net+ segmentation to delineate the affected brain regions. For feature extraction, the ResNet50 model is utilized. Furthermore, the Bayesian regularized artificial neural network (BRANN) model is used to identify the presence of BTs. Finally, an improved radial movement optimization model is employed for hyperparameter tuning of the BRANN technique. To highlight the improved performance of the XAISS-BMLBT technique, a series of simulations were accomplished using a benchmark database. The experimental validation of the XAISS-BMLBT technique showed a superior accuracy value of 97.75% over existing models.
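As a small concrete example of the pre-processing step the abstract names, the snippet below applies edge-preserving bilateral filtering to an MRI slice; the parameter values are illustrative assumptions.

```python
# Hedged sketch of bilateral-filtering denoising before segmentation.
import cv2

def denoise_slice(mri_slice_uint8):
    # d: neighborhood diameter; sigmaColor/sigmaSpace trade smoothing vs. edge retention
    return cv2.bilateralFilter(mri_slice_uint8, d=9, sigmaColor=75, sigmaSpace=75)
```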
PMID:39753735 | DOI:10.1038/s41598-024-84692-7
Object detection in motion management scenarios based on deep learning
PLoS One. 2025 Jan 3;20(1):e0315130. doi: 10.1371/journal.pone.0315130. eCollection 2025.
ABSTRACT
In athletes' competitions and daily training, in order to further improve athletes' performance, it is usually necessary to analyze an athlete's actions at a specific moment, for which it is especially important to quickly and accurately identify the categories and positions of athletes, sports equipment, field boundaries, and other targets in the sports scene. However, existing detection methods fail to achieve good detection results in this setting; our analysis found that the reasons mainly lie in the loss of temporal information, multiple targets, target overlap, and the coupling of the regression and classification tasks, all of which make it difficult for these network models to adapt to the detection task in this scenario. Based on this, we propose, for the first time, a supervised object detection method for scenarios in the field of motion management. The main contributions of this method include: designing a TSM module that combines a temporal offset operation with a spatial convolution operation to enhance the network's ability to capture temporal information in the motion scene; designing a deformable attention mechanism that enhances the feature extraction capability for individual target actions; designing a decoupling structure that separates the regression task from the classification task; and applying the above approach to object detection in motion management scenarios, which greatly improves detection accuracy. To evaluate the effectiveness of the designed network and the proposed methodology, we conducted experiments on open-source datasets. The final comparison experiment shows that our proposed method outperforms seven other common object detection networks on the same dataset, with an mAP@0.5 of 92.298%. In the ablation experiments, removing any single module reduced detection accuracy. The two types of experiments prove that the proposed method is effective and achieves better results when applied to motion management detection scenarios.
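For readers unfamiliar with temporal shift modules, the sketch below shows the core TSM operation (after Lin et al.'s TSM): a fraction of channels is shifted one frame forward or backward so a 2-D backbone can mix temporal information at essentially zero extra cost. The shift fraction and tensor layout are assumptions, and the paper's full module also includes the spatial convolution.

```python
# Hedged sketch of the temporal shift operation at the heart of a TSM module.
import torch

def temporal_shift(x, shift_div=8):
    """x: (batch, time, channels, H, W) feature maps."""
    b, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                   # shift one slice forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]   # shift another slice backward
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels untouched
    return out
```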
PMID:39752546 | DOI:10.1371/journal.pone.0315130
Image segmentation with traveling waves in an exactly solvable recurrent neural network
Proc Natl Acad Sci U S A. 2025 Jan 7;122(1):e2321319121. doi: 10.1073/pnas.2321319121. Epub 2025 Jan 3.
ABSTRACT
We study image segmentation using spatiotemporal dynamics in a recurrent neural network where the state of each unit is given by a complex number. We show that this network generates sophisticated spatiotemporal dynamics that can effectively divide an image into groups according to a scene's structural characteristics. We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images. Using an exact solution of the recurrent network's dynamics, we present a precise description of the mechanism underlying object segmentation in the network dynamics, providing a clear mathematical interpretation of how the algorithm performs this task. Object segmentation across all images is accomplished with one recurrent neural network that has a single, fixed set of weights. This demonstrates the expressive potential of recurrent neural networks when constructed using a mathematical approach that brings together their structure, dynamics, and computation.
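As a toy illustration of the phase-grouping idea (not the paper's exactly solvable network), the sketch below runs Kuramoto-style complex-valued dynamics on a pixel lattice, where intensity-similar neighbors pull each other's phases together and the final phases cluster into segments; the coupling form and parameters are assumptions.

```python
# Toy sketch: complex-valued recurrent dynamics for segmentation by phase.
import numpy as np

def segment_by_phase(image, steps=100, coupling=0.5, sigma=0.1):
    """image: 2-D array in [0, 1]; returns a per-pixel phase to cluster into groups."""
    h, w = image.shape
    z = np.exp(2j * np.pi * np.random.rand(h, w))        # random unit-magnitude states
    for _ in range(steps):
        pull = np.zeros_like(z)
        for axis, shift in ((0, 1), (0, -1), (1, 1), (1, -1)):
            nb_z = np.roll(z, shift, axis)
            nb_i = np.roll(image, shift, axis)
            affinity = np.exp(-(image - nb_i) ** 2 / (2 * sigma**2))  # intensity similarity
            pull += affinity * nb_z
        z = z + coupling * pull
        z = z / (np.abs(z) + 1e-12)                      # renormalize to the unit circle
    return np.angle(z)                                   # similar phases = same segment
```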
PMID:39752524 | DOI:10.1073/pnas.2321319121
Automated CAD system for early detection and classification of pancreatic cancer using deep learning model
PLoS One. 2025 Jan 3;20(1):e0307900. doi: 10.1371/journal.pone.0307900. eCollection 2025.
ABSTRACT
Accurate diagnosis of pancreatic cancer using CT scan images is critical for early detection and treatment, potentially saving numerous lives globally. Manual identification of pancreatic tumors by radiologists is challenging and time-consuming due to the complex nature of CT scan images, and variations in tumor shape, size, and location also make it difficult to detect and classify different types of tumors. To address this challenge, we propose a four-stage computer-aided diagnosis framework. In the preprocessing stage, the input image is resized to 227 × 227, converted from RGB to grayscale, and enhanced by anisotropic diffusion filtering, which removes noise without blurring edges. In the segmentation stage, a binary image is created from the preprocessed grayscale image based on a threshold, edges are highlighted by Sobel filtering, and watershed segmentation delineates the tumor region; a U-Net method is also implemented for segmentation. The geometric structure of the image is then refined using morphological operations, and texture features are extracted using a gray-level co-occurrence matrix, computed by analyzing the spatial relationships of pixel intensities in the refined image and counting the occurrences of pixel pairs with specific intensity values and spatial relationships. The detection stage analyzes the extracted features of the tumor region by labeling the connected components and selecting the region with the highest density to locate the tumor area, achieving an accuracy of 99.64%. In the classification stage, the system first classifies the detected region as normal or pancreatic tumor, and then classifies tumors as benign, pre-malignant, or malignant using a proposed reduced 11-layer AlexNet model. The classification stage attained an accuracy of 98.72%, an AUC of 0.9979, and an overall average processing time of 1.51 seconds, demonstrating the system's capability to effectively and efficiently identify and classify pancreatic cancers.
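A condensed sketch of the classical stages (thresholding, Sobel edges, watershed, GLCM texture features) is given below; the filter settings are assumptions, and the U-Net and AlexNet stages are omitted.

```python
# Hedged sketch of the classical segmentation + texture-feature stages.
import numpy as np
from scipy import ndimage as ndi
from skimage.color import rgb2gray
from skimage.transform import resize
from skimage.filters import threshold_otsu, sobel
from skimage.segmentation import watershed
from skimage.feature import graycomatrix, graycoprops

def segment_and_describe(rgb_ct_slice):
    gray = rgb2gray(resize(rgb_ct_slice, (227, 227)))
    binary = gray > threshold_otsu(gray)                 # threshold-based binary image
    edges = sobel(gray)                                  # highlight region boundaries
    markers, _ = ndi.label(ndi.binary_erosion(binary, iterations=3))
    labels = watershed(edges, markers, mask=binary)      # split touching regions
    # GLCM texture features over the refined grayscale image
    levels = (gray * 255).astype(np.uint8)
    glcm = graycomatrix(levels, distances=[1], angles=[0], levels=256, symmetric=True)
    props = {p: graycoprops(glcm, p)[0, 0] for p in ("contrast", "homogeneity", "energy")}
    return labels, props
```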
PMID:39752442 | DOI:10.1371/journal.pone.0307900
A weak edge estimation based multi-task neural network for OCT segmentation
PLoS One. 2025 Jan 3;20(1):e0316089. doi: 10.1371/journal.pone.0316089. eCollection 2025.
ABSTRACT
Optical Coherence Tomography (OCT) offers high-resolution images of the eye's fundus. This enables thorough analysis of retinal health by doctors, providing a solid basis for diagnosis and treatment. With the development of deep learning, deep learning-based methods are becoming more popular for fundus OCT image segmentation. Yet, these methods still encounter two primary challenges. Firstly, deep learning methods are sensitive to weak edges. Secondly, the high cost of annotating medical image data results in a lack of labeled data, leading to overfitting during model training. To tackle these challenges, we introduce the Multi-Task Attention Mechanism Network with Pruning (MTAMNP), consisting of a segmentation branch and a boundary regression branch. The boundary regression branch utilizes an adaptive weighted loss function derived from the Truncated Signed Distance Function (TSDF), improving the model's capacity to preserve weak edge details. The Spatial Attention Based Dual-Branch Information Fusion Block links these branches, enabling mutual benefit. Furthermore, we present a structured pruning method grounded in channel attention to decrease parameter count, mitigate overfitting, and uphold segmentation accuracy. Our method surpasses other cutting-edge segmentation networks on two widely accessible datasets, achieving Dice scores of 84.09% and 93.84% on the HCMS and Duke datasets.
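As an illustration of the TSDF-based weighting idea, the sketch below derives a truncated signed distance map from a binary layer mask and converts it into per-pixel loss weights that peak at the boundary; the truncation value and weighting form are assumptions, not the paper's exact adaptive loss.

```python
# Hedged sketch: truncated signed distance map -> boundary-emphasizing weights.
import numpy as np
from scipy.ndimage import distance_transform_edt

def tsdf(mask, trunc=10.0):
    """mask: boolean layer mask; positive distances inside, negative outside."""
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(~mask)
    return np.clip(inside - outside, -trunc, trunc)

def boundary_weights(mask, trunc=10.0):
    d = np.abs(tsdf(mask, trunc))
    return 1.0 + (trunc - d) / trunc   # weight peaks at the boundary, decays with distance
```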
PMID:39752440 | DOI:10.1371/journal.pone.0316089
An end-to-end implicit neural representation architecture for medical volume data
PLoS One. 2025 Jan 3;20(1):e0314944. doi: 10.1371/journal.pone.0314944. eCollection 2025.
ABSTRACT
Medical volume data are rapidly increasing, growing from gigabytes to petabytes, which presents significant challenges in organisation, storage, transmission, manipulation, and rendering. To address the challenges, we propose an end-to-end architecture for data compression, leveraging advanced deep learning technologies. This architecture consists of three key modules: downsampling, implicit neural representation (INR), and super-resolution (SR). We employ a trade-off point method to optimise each module's performance and achieve the best balance between high compression rates and reconstruction quality. Experimental results on multi-parametric MRI data demonstrate that our method achieves a high compression rate of up to 97.5% while maintaining superior reconstruction accuracy, with a Peak Signal-to-Noise Ratio (PSNR) of 40.05 dB and Structural Similarity Index (SSIM) of 0.96. This approach significantly reduces GPU memory requirements and processing time, making it a practical solution for handling large medical datasets.
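A minimal sketch of the INR module is shown below: an MLP maps normalized voxel coordinates to intensities, so the network weights themselves act as the compressed volume. The width, depth, and SIREN-style sine activation are assumptions, and the downsampling and super-resolution modules are omitted.

```python
# Hedged sketch of an implicit neural representation for a medical volume.
import torch
import torch.nn as nn

class INR(nn.Module):
    def __init__(self, hidden=256, layers=4):
        super().__init__()
        dims = [3] + [hidden] * layers + [1]
        self.net = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims[:-1], dims[1:]))

    def forward(self, xyz):                  # xyz: (N, 3) coordinates in [-1, 1]
        h = xyz
        for layer in self.net[:-1]:
            h = torch.sin(layer(h))          # SIREN-style periodic activation
        return self.net[-1](h)               # (N, 1) predicted intensities

# Training: sample (coordinate, intensity) pairs from the (downsampled) volume
# and regress with MSE; decoding = evaluating the MLP on a dense coordinate grid.
```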
PMID:39752347 | DOI:10.1371/journal.pone.0314944
Quantitative analysis of the dexamethasone side effect on human-derived young and aged skeletal muscle by myotube and nuclei segmentation using deep learning
Bioinformatics. 2025 Jan 3:btae658. doi: 10.1093/bioinformatics/btae658. Online ahead of print.
ABSTRACT
MOTIVATION: Skeletal muscle cells (skMCs) fuse to create long, multi-nucleated structures called myotubes. By studying the size, length, and number of nuclei in these myotubes, we can gain a deeper understanding of skeletal muscle development. However, human experimenters may derive unreliable results owing to the unusual shape of the myotube, which causes significant measurement variability.
RESULTS: We propose a new method for quantitative analysis of the dexamethasone side effect on human-derived young and aged skeletal muscle by simultaneous myotube and nuclei segmentation using deep learning combined with post-processing techniques. The deep learning model outputs myotube semantic segmentation, nuclei semantic segmentation, and nuclei center, and post-processing applies a watershed algorithm to accurately distinguish overlapped nuclei and identify myotube branches through skeletonization. To evaluate the performance of the model, the myotube diameter and the number of nuclei were calculated from the generated segmented images and compared with the results calculated by human experimenters. In particular, the proposed model produced outstanding outcomes when comparing human-derived primary young and aged skMCs treated with dexamethasone. The proposed standardized and consistent automated image segmentation system for myotubes is expected to help streamline the drug-development process for skeletal muscle diseases.
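A small sketch of the described post-processing is given below: marker-controlled watershed splits overlapped nuclei using the predicted centers, and skeletonization exposes myotube branches. The input names are assumptions; the authors' full pipeline is in the repository linked below.

```python
# Hedged sketch of the watershed + skeletonization post-processing steps.
from scipy import ndimage as ndi
from skimage.segmentation import watershed
from skimage.morphology import skeletonize

def split_nuclei(nuclei_mask, center_mask):
    markers, _ = ndi.label(center_mask)                     # one marker per predicted center
    distance = ndi.distance_transform_edt(nuclei_mask)
    return watershed(-distance, markers, mask=nuclei_mask)  # labeled individual nuclei

def myotube_branches(myotube_mask):
    return skeletonize(myotube_mask)                        # 1-pixel-wide branch skeleton
```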
AVAILABILITY AND IMPLEMENTATION: The code and the data are available at https://github.com/tdn02007/QA-skMCs-Seg.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:39752317 | DOI:10.1093/bioinformatics/btae658
D3-ImgNet: A Framework for Molecular Properties Prediction Based on Data-Driven Electron Density Images
J Phys Chem A. 2025 Jan 3. doi: 10.1021/acs.jpca.4c05519. Online ahead of print.
ABSTRACT
Artificial intelligence technology has introduced a new research paradigm into the fields of quantum chemistry and materials science, leading to numerous studies that utilize machine learning methods to predict molecular properties. We contend that an exemplary deep learning model should not only achieve high-precision predictions of molecular properties but also incorporate guidance from physical mechanisms. Here, we propose a framework for predicting molecular properties based on data-driven electron density images, referred to as D3-ImgNet. This framework integrates group theory, density functional theory-related mechanisms, deep learning techniques, and multiobjective optimization mechanisms, embodying a methodological fusion of data analytics and system optimization. Initially, we focus on atomization energies as the primary target of our study, using the QM9 data set to demonstrate the framework's ability to predict molecular atomization energies with high accuracy and excellent exploration performance. We then further evaluate its predictive capabilities for dipole moments and forces with the QM9X data set, achieving satisfactory results. Additionally, we tested the D3-ImgNet framework on the SN2 reaction data set to demonstrate its ability to precisely predict the minimum energy paths of SN2 chemical reactions, showcasing its portability and adaptability in chemical reaction modeling. Finally, visualizations of the electronic density generated by the framework faithfully replicate the physical phenomenon of electron density transfer. We believe that this framework has the potential to accelerate property predictions and high-throughput screening of functional materials.
PMID:39752232 | DOI:10.1021/acs.jpca.4c05519
Multi-institutional development and testing of attention-enhanced deep learning segmentation of thyroid nodules on ultrasound
Int J Comput Assist Radiol Surg. 2025 Jan 3. doi: 10.1007/s11548-024-03294-w. Online ahead of print.
ABSTRACT
PURPOSE: Thyroid nodules are common, and ultrasound-based risk stratification using ACR's TIRADS classification is a key step in predicting nodule pathology. Determining thyroid nodule contours is necessary for the calculation of TIRADS scores and can also be used in the development of machine learning nodule diagnosis systems. This paper presents the development, validation, and multi-institutional independent testing of a machine learning system for the automatic segmentation of thyroid nodules on ultrasound.
METHODS: The datasets, containing a total of 1595 thyroid ultrasound images from 520 patients with thyroid nodules, were retrospectively collected under IRB approval from University of Chicago Medicine (UCM) and Weill Cornell Medical Center (WCMC). Nodules were manually contoured by a team of UCM and WCMC physicians for ground truth. An AttU-Net, a U-Net architecture with additional attention weighting functions, was trained for the segmentations. The algorithm was validated through fivefold cross-validation by nodule and was tested on two independent test sets: one from UCM and one from WCMC. Dice similarity coefficient (DSC) and percent Hausdorff distance (%HD), Hausdorff distance reported as a percent of the nodule's effective diameter, served as the performance metrics.
RESULTS: On multi-institutional independent testing, the AttU-Net yielded average DSCs (std. deviation) of 0.915 (0.04) and 0.922 (0.03) and %HDs (std. deviation) of 12.9% (4.6) and 13.4% (6.3) on the UCM and WCMC test sets, respectively. Similarity testing showed the algorithm's performance on the two institutional test sets was equivalent up to margins of Δ DSC ≤ 0.013 and Δ %HD ≤ 1.73%.
CONCLUSIONS: This work presents a robust automatic thyroid nodule segmentation algorithm that could be implemented for risk stratification systems. Future work is merited to incorporate this segmentation method within an automatic thyroid classification system.
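For context, an additive attention gate of the kind AttU-Net inserts on U-Net skip connections (after Oktay et al.) can be sketched as below; the channel sizes are illustrative, and the gating tensor is assumed to have been upsampled to the skip connection's spatial size beforehand.

```python
# Hedged sketch of an additive attention gate for U-Net skip connections.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, gate_ch, skip_ch, inter_ch):
        super().__init__()
        self.wg = nn.Conv2d(gate_ch, inter_ch, 1)    # project decoder (gating) features
        self.wx = nn.Conv2d(skip_ch, inter_ch, 1)    # project encoder (skip) features
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, gate, skip):
        # gate assumed already resized to skip's spatial dimensions
        a = self.psi(torch.relu(self.wg(gate) + self.wx(skip)))  # attention coefficients
        return skip * a                               # re-weight the skip connection
```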
PMID:39751996 | DOI:10.1007/s11548-024-03294-w
Basic Science and Pathogenesis
Alzheimers Dement. 2024 Dec;20 Suppl 1:e085828. doi: 10.1002/alz.085828.
ABSTRACT
BACKGROUND: Amyloid-β accumulation is a pivotal factor in Alzheimer's disease (AD) progression. As treatment for AD has not been successful yet, the most effective approach lies in early diagnosis and the subsequent delay of disease progression. Hence, this study introduces a deep learning model to predict amyloid-β accumulation in the brain.
METHOD: We mathematically modeled the diffusion of amyloid-β based on its biological traits, encompassing generation, clearance, and diffusion. We converted the model into a deep learning framework with multi-layer perceptron (MLP) and graph convolutional neural network (GCN) (Kipf et al., 2016) to forecast the accumulation of the protein. We extracted the necessary information from various neuroimage data, including T1 structural magnetic resonance (MR) images, 18F-Florbetapir positron emission tomography (PET) scans, and diffusion weighted MR images (DWI), to simulate the diffusion of the protein (Figure 1). We used longitudinal data of 146 subjects, incorporating 436 data points.
RESULT: The proposed model accurately predicted amyloid-β after 2 years (Figure 2), showing a high correlation in the test dataset (median = 0.8273, IQR = [0.7708, 0.8692]) and outperforming a previous model (average 0.58) (Kim et al., 2019). We examined the generation and clearance terms, mapping the top 30% of ROIs onto the brain by averaging each term across subjects (Figure 3). The regions with early AD amyloid-β accumulation are believed to be related to the default mode network and the prefrontal network (Palmqvist et al., 2017), as supported by Figure 3a. The effectiveness of amyloid-β clearance may be influenced by brain activity (Mergenthaler et al., 2013; Ullah et al., 2023). Earlier studies reported diminished metabolism in specific regions during early AD (Chételat et al., 2020; Kantarci et al., 2021). The proposed model identified high-clearance regions (Figure 3b), aligning with regions showing normal metabolism.
CONCLUSION: We introduced a deep learning model that simulates the diffusion of amyloid-β with strong predictive performance and interpretation. While parameters were optimized for the entire group, accuracy varied for some subjects. Also, further investigation is needed to interpret each term comprehensively. Despite the need for individual optimization and additional interpretative analysis, the model may contribute to the diagnosis of AD.
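A toy version of the generation-clearance-diffusion dynamics the abstract describes can be written directly on a connectome graph, as below; the scalar coefficients are simplifying assumptions, whereas the study learns region-wise terms with its MLP/GCN components.

```python
# Hedged sketch: amyloid-beta evolving on a connectome via the graph Laplacian.
import numpy as np

def simulate_amyloid(a0, W, generation, clearance, dt=0.1, steps=20, beta=1.0):
    """a0: (R,) baseline regional amyloid; W: (R, R) DWI connectivity matrix."""
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian of the connectome
    a = a0.copy()
    for _ in range(steps):
        # diffusion along tracts + regional generation - activity-dependent clearance
        a = a + dt * (-beta * L @ a + generation - clearance * a)
    return a
```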
PMID:39751753 | DOI:10.1002/alz.085828
Towards simplified graph neural networks for identifying cancer driver genes in heterophilic networks
Brief Bioinform. 2024 Nov 22;26(1):bbae691. doi: 10.1093/bib/bbae691.
ABSTRACT
The identification of cancer driver genes is crucial for understanding the complex processes involved in cancer development, progression, and therapeutic strategies. Multi-omics data and biological networks provided by numerous databases enable the application of graph deep learning techniques that incorporate network structures into the deep learning framework. However, most existing methods do not account for heterophily in biological networks, which hinders improvements in model performance; meanwhile, feature confusion often arises in graph-neural-network-based models on such graphs. To address this, we propose a Simplified Graph neural network for identifying Cancer Driver genes in heterophilic networks (SGCD), which comprises two primary components: a graph convolutional neural network with representation separation and a bimodal feature extractor. The results demonstrate that SGCD not only performs exceptionally well but also exhibits robust discriminative capability compared to state-of-the-art methods across all benchmark datasets. Moreover, subsequent interpretability experiments on both the model and the biology provide compelling evidence supporting the reliability of SGCD. Additionally, the model can dissect gene modules, revealing clearer connections between driver genes in cancers. We are confident that SGCD holds potential in the field of precision oncology and may be applied to prognosticate biomarkers for a wide range of complex diseases.
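As a hedged sketch of the representation-separation idea for heterophilic graphs, the layer below keeps a node's own embedding and its aggregated neighborhood in separate channels rather than mixing them, so dissimilar neighbors cannot wash out driver-gene signals; this follows the general heterophily-aware GNN recipe rather than SGCD's exact layer.

```python
# Hedged sketch of a graph conv layer with ego/neighborhood representation separation.
import torch
import torch.nn as nn

class SeparatedGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.self_lin = nn.Linear(in_dim, out_dim)   # ego (node's own) representation
        self.neigh_lin = nn.Linear(in_dim, out_dim)  # aggregated neighborhood representation

    def forward(self, x, adj_norm):
        # adj_norm: row-normalized adjacency WITHOUT self-loops, so the two
        # channels stay separate instead of being averaged together
        return torch.cat([self.self_lin(x), self.neigh_lin(adj_norm @ x)], dim=-1)
```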
PMID:39751645 | DOI:10.1093/bib/bbae691