Deep learning
Prediction of ECG signals from ballistocardiography using deep learning for the unconstrained measurement of heartbeat intervals
Sci Rep. 2025 Jan 6;15(1):999. doi: 10.1038/s41598-024-84049-0.
ABSTRACT
We developed a deep learning-based extraction of electrocardiographic (ECG) waves from ballistocardiographic (BCG) signals and explored their use in R-R interval (RRI) estimation. Preprocessed BCG and reference ECG signals were fed into a bidirectional long short-term memory network to train the model to minimize the loss function of the mean squared error between the predicted ECG (pECG) and genuine ECG signals. Using a dataset acquired with polyvinylidene fluoride and ECG sensors in different recumbent positions from 18 participants, we generated pECG signals from preprocessed BCG signals using the learned model and evaluated the RRI estimation performance by comparing the predicted RRI with the reference RRI obtained from the ECG signal using a leave-one-subject-out cross-validation scheme. A mean absolute error (MAE) of 0.034 s was achieved for the beat-to-beat interval accuracy. To further test the generalization ability of the learned model trained with a short-term-recorded dataset, we collected long-term overnight recordings of BCG signals from 12 different participants and performed validation. The beat-to-beat interval correlation between BCG and ECG signals was 0.82 ± 0.06 with an average MAE of 0.046 s, showing practical performance for long-term measurement of RRIs. These results suggest that the proposed approach can be used for continuous heart rate monitoring in a home environment.
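The beat-to-beat MAE reported above reduces to a simple computation once R-peak times are available. A minimal sketch (not from the paper; the function names and the assumption that the two peak lists are already beat-aligned are ours):

```python
def rr_intervals(peak_times):
    """Beat-to-beat R-R intervals (s) from a sorted list of R-peak times (s)."""
    return [t2 - t1 for t1, t2 in zip(peak_times, peak_times[1:])]

def rri_mae(pred_peaks, ref_peaks):
    """Mean absolute error between predicted and reference RRIs.

    Assumes the two peak lists are beat-aligned (equal length).
    """
    pred = rr_intervals(pred_peaks)
    ref = rr_intervals(ref_peaks)
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)
```

In practice the R peaks of the pECG and reference ECG would first be matched beat-to-beat before the intervals are compared.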
PMID:39762351 | DOI:10.1038/s41598-024-84049-0
Attention activation network for bearing fault diagnosis under various noise environments
Sci Rep. 2025 Jan 6;15(1):977. doi: 10.1038/s41598-025-85275-w.
ABSTRACT
Bearings are critical in mechanical systems, as their health impacts system reliability. Proactive monitoring and diagnosing of bearing faults can prevent significant safety issues. Among various diagnostic methods that analyze bearing vibration signals, deep learning is notably effective. However, bearings often operate in noisy environments, especially during failures, which poses a challenge to most current deep learning methods that assume noise-free data. Therefore, this paper designs a Multi-Location Multi-Scale Multi-Level Information Attention Activation Network (MLSCA-CW) with excellent performance in different kinds of strong noise environments by combining soft threshold, self-activation, and self-attention mechanisms. The model has enhanced filtering performance and multi-location information fusion ability. Our comparative and ablation experiments demonstrate that the model's components, including the multi-location and multi-scale vibration extraction module, soft threshold noise filtering module, multi-scale self-activation mechanism, and layer attention mechanism, are highly effective in filtering noise from various locations and extracting multi-dimensional features. The MLSCA-CW model achieves 92.02% accuracy against various strong noise disturbances and outperforms SOTA methods under challenging working conditions on the CWRU dataset.
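The soft-threshold mechanism combined here with self-activation and self-attention is, at its core, the standard shrinkage operator. A minimal scalar sketch (illustrative only; in the paper the threshold is learned inside the network rather than fixed):

```python
def soft_threshold(x, tau):
    """Soft-threshold (shrinkage) operator: values within [-tau, tau] are
    zeroed as noise; larger magnitudes are shrunk toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0
```

Applied element-wise to feature maps, this suppresses small (presumably noisy) activations while preserving the large ones that carry fault signatures.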
PMID:39762349 | DOI:10.1038/s41598-025-85275-w
A Collaborative and Scalable Geospatial Data Set for Arctic Retrogressive Thaw Slumps with Data Standards
Sci Data. 2025 Jan 6;12(1):18. doi: 10.1038/s41597-025-04372-7.
ABSTRACT
Arctic permafrost is undergoing rapid changes due to climate warming in high latitudes. Retrogressive thaw slumps (RTS) are one of the most abrupt and impactful thermal-denudation events that change Arctic landscapes and accelerate carbon feedbacks. Their spatial distribution remains poorly characterised due to time-intensive conventional mapping methods. While numerous RTS studies have published standalone digitisation datasets, the lack of a centralised, unified database has limited their utilisation, affecting the scale of RTS studies and the generalisation ability of deep learning models. To address this, we established the Arctic Retrogressive Thaw Slumps (ARTS) dataset containing 23,529 RTS-present and 20,434 RTS-absent digitisations from 20 standalone datasets. We also proposed a Data Curation Framework as a working standard for RTS digitisations. This dataset is designed to be comprehensive, accessible, contributable, and adaptable for various RTS-related studies. This dataset and its accompanying curation framework establish a foundation for enhanced collaboration in RTS research, facilitating standardised data sharing and comprehensive analyses across the Arctic permafrost research community.
PMID:39762331 | DOI:10.1038/s41597-025-04372-7
PPI-CoAttNet: A Web Server for Protein-Protein Interaction Tasks Using a Coattention Model
J Chem Inf Model. 2025 Jan 6. doi: 10.1021/acs.jcim.4c01365. Online ahead of print.
ABSTRACT
Predicting protein-protein interactions (PPIs) is crucial for advancing drug discovery. Despite the proposal of numerous advanced computational methods, these approaches often suffer from poor usability for biologists and lack generalization. In this study, we designed a deep learning model based on a coattention mechanism that was capable of both PPI and site prediction and used this model as the foundation for PPI-CoAttNet, a user-friendly, multifunctional web server for PPI prediction. This platform provides comprehensive services for online PPI model training, PPI and site prediction, and prediction of interactions with proteins associated with highly prevalent cancers. In our Homo sapiens test set for PPI prediction, PPI-CoAttNet achieved an AUC of 0.9841 and an F1 score of 0.9440, outperforming most state-of-the-art models. Additionally, these results are generated in real time, delivering outcomes within minutes. We also evaluated PPI-CoAttNet for downstream tasks, including novel E3 ligase scoring, demonstrating outstanding accuracy. We believe that this tool will empower researchers, especially those without computational expertise, to leverage AI for accelerating drug development.
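The AUC of 0.9841 reported for the Homo sapiens test set is the usual ranking statistic. As an illustration of how such a value is computed (a naive O(n·m) sketch, not the authors' evaluation code):

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive scores above
    a randomly chosen negative (ties count half): the normalised
    Mann-Whitney U statistic."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

Production code would use a sort-based O((n+m) log(n+m)) implementation, but the quantity is the same.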
PMID:39761551 | DOI:10.1021/acs.jcim.4c01365
Assessing the Severity of Connective Tissue-Related Interstitial Lung Disease Using Computed Tomography Quantitative Analysis Parameters
J Comput Assist Tomogr. 2024 Nov 13. doi: 10.1097/RCT.0000000000001693. Online ahead of print.
ABSTRACT
OBJECTIVES: The aims of the study are to predict lung function impairment in patients with connective tissue disease (CTD)-associated interstitial lung disease (ILD) using computed tomography (CT) quantitative analysis parameters derived from a CT deep learning model and a density-threshold method, and to assess the severity of disease in patients with CTD-ILD.
METHODS: We retrospectively collected chest high-resolution CT images and pulmonary function test results from 105 patients with CTD-ILD between January 2021 and December 2023 (patients staged according to the gender-age-physiology [GAP] system), including 46 males and 59 females, with a median age of 64 years. Additionally, we selected 80 healthy controls (HCs) with matched sex and age, who showed no abnormalities in their chest high-resolution CT. Based on our previously developed RDNet analysis model, the proportions of the lung occupied by reticulation, honeycombing, and total interstitial abnormalities in CTD-ILD patients (ILD% = total interstitial abnormal volume/total lung volume) were calculated. Using the Pulmo-3D software with a threshold segmentation method of -260 to -600 HU, the overall interstitial abnormal proportion (AA%) and mean lung density were obtained. The correlations between CT quantitative analysis parameters and pulmonary function indices were evaluated using Spearman or Pearson correlation coefficients. Stepwise multiple linear regression analysis was used to identify the best CT quantitative predictors for different pulmonary function parameters. Independent risk factors for GAP staging were determined using multivariable logistic regression. The area under the ROC curve (AUC) was used to differentiate between the CTD-ILD groups and HCs, as well as among GAP stages. The Kruskal-Wallis test was used to compare the differences in pulmonary function indices and CT quantitative analysis parameters among CTD-ILD groups.
RESULTS: Among 105 CTD-ILD patients (58 in GAP I, 36 in GAP II, and 11 in GAP III), results indicated that AA% distinguished between CTD-ILD patients and HCs with the highest AUC value of 0.974 (95% confidence interval: 0.955-0.993). With a threshold set at 9.7%, a sensitivity of 98.7% and a specificity of 89.5% were observed. Both honeycombing and ILD% showed statistically significant correlations with pulmonary function parameters, with honeycombing displaying the highest correlation coefficient with Composite Physiologic Index (CPI, r = 0.612). Multiple linear regression results indicated honeycombing was the best predictor for both the Dlco% and the CPI. Furthermore, multivariable logistic regression analysis identified honeycombing as an independent risk factor for GAP staging. Honeycombing differentiated between GAP I and GAP II + III with the highest AUC value of 0.729 (95% confidence interval: 0.634-0.811). With a threshold set at 8.0%, a sensitivity of 79.3% and a specificity of 57.4% were observed. Significant differences in honeycombing and ILD% were also noted among the disease groups (P < 0.05).
CONCLUSIONS: An AA% of 9.7% was the optimal threshold for differentiating CTD-ILD patients from HCs. Honeycombing can preliminarily predict lung function impairment and was an independent risk factor for GAP staging, offering significant clinical guidance for assessing the severity of the patient's disease.
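Sensitivity and specificity at a fixed cut-off, as quoted for the 9.7% AA% threshold, follow directly from confusion-matrix counts. A minimal sketch (hypothetical data and function name, not the study's analysis code):

```python
def sens_spec(values, labels, threshold):
    """Sensitivity and specificity when values >= threshold are called
    positive (labels: 1 = diseased, 0 = healthy)."""
    tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= threshold)
    fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < threshold)
    tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < threshold)
    fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

Sweeping the threshold over all observed values and maximising sensitivity + specificity - 1 (the Youden index) is a common way to pick such an "optimal" cut-off.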
PMID:39761506 | DOI:10.1097/RCT.0000000000001693
Deep Learning Reconstruction for Enhanced Resolution and Image Quality in Breath-Hold MRCP: A Preliminary Study
J Comput Assist Tomogr. 2024 Nov 13. doi: 10.1097/RCT.0000000000001680. Online ahead of print.
ABSTRACT
OBJECTIVE: This preliminary study aims to assess the image quality of enhanced-resolution deep learning reconstruction (ER-DLR) in magnetic resonance cholangiopancreatography (MRCP) and compare it with non-ER-DLR MRCP images.
METHODS: Our retrospective study incorporated 34 patients diagnosed with biliary and pancreatic disorders. We obtained MRCP images using a single breath-hold MRCP on a 3T MRI system. We reconstructed MRCP images with ER-DLR (matrix = 768 × 960) and without ER-DLR (matrix = 256 × 320). Quantitative evaluation involved measuring the signal-to-noise ratio (SNR), contrast, contrast-to-noise ratio (CNR) between the common bile duct and periductal tissues, and slope. Two radiologists independently scored image noise, contrast, artifacts, sharpness, and overall image quality for the 2 image types using a 4-point scale. Results are expressed as median and interquartile range (IQR), and we compared quantitative and qualitative scores employing the Wilcoxon test.
RESULTS: In quantitative analyses, ER-DLR significantly improved SNR (21.08 [IQR: 14.85, 31.5] vs 15.07 [IQR: 9.57, 25.23], P < 0.001), CNR (19.29 [IQR: 13.87, 24.98] vs 11.23 [IQR: 8.98, 15.74], P < 0.001), contrast (0.96 [IQR: 0.94, 0.97] vs 0.9 [IQR: 0.87, 0.92], P < 0.001), and slope of MRCP (0.62 [IQR: 0.56, 0.66] vs 0.49 [IQR: 0.45, 0.53], P < 0.001). The qualitative evaluation demonstrated significant improvements in the perceived noise (P < 0.001), contrast (P = 0.013), sharpness (P < 0.001), and overall image quality (P < 0.001).
CONCLUSIONS: ER-DLR markedly increased the resolution, SNR, and CNR of breath-hold MRCP compared to cases without ER-DLR.
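The SNR and CNR figures compared above are conventional region-of-interest statistics. A sketch of the standard definitions (the exact ROI placement and noise estimate used in the study may differ):

```python
def snr(signal_mean, noise_sd):
    """Signal-to-noise ratio of an ROI: mean signal over noise standard
    deviation (e.g. from a background or subtraction ROI)."""
    return signal_mean / noise_sd

def cnr(mean_a, mean_b, noise_sd):
    """Contrast-to-noise ratio between two ROIs (e.g. common bile duct
    vs periductal tissue) sharing one noise estimate."""
    return abs(mean_a - mean_b) / noise_sd
```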
PMID:39761494 | DOI:10.1097/RCT.0000000000001680
Ensemble learning-based predictor for driver synonymous mutation with sequence representation
PLoS Comput Biol. 2025 Jan 6;21(1):e1012744. doi: 10.1371/journal.pcbi.1012744. Online ahead of print.
ABSTRACT
Synonymous mutations, once considered neutral, are now understood to have significant implications for a variety of diseases, particularly cancer. It is indispensable to identify these driver synonymous mutations in human cancers, yet current methods are constrained by data limitations. In this study, we initially investigate the impact of sequence-based features, including DNA shape, physicochemical properties and one-hot encoding of nucleotides, and deep learning-derived features from pre-trained chemical molecule language models based on BERT. Subsequently, we propose EPEL, an effect predictor for synonymous mutations employing ensemble learning. EPEL combines five tree-based models and optimizes feature selection to enhance predictive accuracy. Notably, the incorporation of DNA shape features and deep learning-derived features from chemical molecule language models represents a pioneering effort in assessing the impact of synonymous mutations in cancer. Compared to existing state-of-the-art methods, EPEL demonstrates superior performance on independent test datasets. Furthermore, our analysis reveals a significant correlation between effect scores and patient outcomes across various cancer types. Interestingly, while deep learning methods have shown promise in other fields, their DNA sequence representations do not significantly enhance the identification of driver synonymous mutations in this study. Overall, we anticipate that EPEL will help researchers more precisely target driver synonymous mutations. EPEL is designed with flexibility, allowing users to retrain the prediction model and generate effect scores for synonymous mutations in human cancers. A user-friendly web server for EPEL is available at http://ahmu.EPEL.bio/.
PMID:39761306 | DOI:10.1371/journal.pcbi.1012744
Energy consumption forecasting for oil and coal in China based on hybrid deep learning
PLoS One. 2025 Jan 6;20(1):e0313856. doi: 10.1371/journal.pone.0313856. eCollection 2025.
ABSTRACT
The consumption forecasting of oil and coal can help governments optimize and adjust energy strategies to ensure energy security in China. However, such forecasting is extremely challenging because it is influenced by many complex and uncertain factors. To address this challenge, we propose a hybrid deep learning approach for consumption forecasting of oil and coal in China. It consists of three parts, i.e., feature engineering, model building, and model integration. First, feature engineering is to distinguish the different correlations between targeted indicators and various features. Second, model building is to build five typical deep learning models with different characteristics to forecast targeted indicators. Third, model integration is to ensemble the five built models with a tailored, self-adaptive weighting strategy. As such, our approach enjoys all the merits of the five deep learning models (they have different learning structures and temporal constraints to diversify them for ensembling), making it able to comprehensively capture all the characteristics of different indicators to achieve accurate forecasting. To evaluate the proposed approach, we collected 880 real data records with 39 factors on China's energy consumption, spanning 1999 to 2021. By conducting extensive experiments on the collected datasets, we have identified the optimal features for four targeted indicators (i.e., import of oil, production of oil, import of coal, and production of coal), respectively. Besides, we have demonstrated that our approach is significantly more accurate than the state-of-the-art forecasting competitors.
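The abstract does not specify the self-adaptive weighting strategy; one common realisation weights each model inversely to its recent validation error, so better models dominate the ensemble. A sketch under that assumption (not the authors' exact scheme):

```python
def ensemble_forecast(preds, val_errors):
    """Combine model forecasts with weights inversely proportional to each
    model's validation error (smaller error -> larger weight)."""
    inv = [1.0 / e for e in val_errors]
    total = sum(inv)
    weights = [w / total for w in inv]
    return sum(w * p for w, p in zip(weights, preds))
```

Recomputing `val_errors` on a rolling window makes the weights "self-adaptive" as new observations arrive.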
PMID:39761291 | DOI:10.1371/journal.pone.0313856
Using deep learning to shorten the acquisition time of brain MRI in acute ischemic stroke: Synthetic T2W images generated from b0 images
PLoS One. 2025 Jan 6;20(1):e0316642. doi: 10.1371/journal.pone.0316642. eCollection 2025.
ABSTRACT
OBJECTIVE: This study aimed to assess the feasibility of the deep learning in generating T2 weighted (T2W) images from diffusion-weighted imaging b0 images.
MATERIALS AND METHODS: This retrospective study included 53 patients who underwent head magnetic resonance imaging between September 1 and September 4, 2023. Each b0 image was matched with a corresponding T2-weighted image. A total of 954 pairs of images were divided into a training set with 763 pairs and a test set with 191 pairs. The Hybrid-Fusion Network (Hi-Net) and pix2pix algorithms were employed to synthesize T2W (sT2W) images from b0 images. The quality of the sT2W images was evaluated using three quantitative indicators: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Normalized Mean Squared Error (NMSE). Subsequently, two radiologists independently judged the authenticity of the (s)T2W images and scored the visual quality of sT2W images in the test set using a five-point Likert scale. The overall quality score, anatomical sharpness, tissue contrast, and homogeneity were used to assess image quality at both the whole-image and regional levels.
RESULTS: The indicators of the pix2pix algorithm in the test set were as follows: PSNR, 20.549±1.916; SSIM, 0.702±0.0864; NMSE, 0.239±0.150. The indicators of the Hi-Net algorithm were as follows: PSNR, 20.646 ± 2.194; SSIM, 0.722 ± 0.0955; NMSE, 0.469 ± 0.124. Hi-Net performed better than pix2pix on PSNR and SSIM, so the sT2W images obtained by Hi-Net were used for radiologist assessment. The two readers accurately identified the nature of the images at rates of 69.90% and 71.20%, respectively. The synthetic images were falsely identified as real at rates of 57.6% and 57.1%, respectively. The overall quality score, sharpness, tissue contrast, and image homogeneity of the sT2W images ranged between 1.63 ± 0.79 and 4.45 ± 0.88. Specifically, the quality of the brain parenchyma, skull and scalp, and middle ear region was superior, while the quality of the orbit and paranasal sinus region was not good enough.
CONCLUSION: Hi-Net is able to generate sT2W images from low-resolution b0 images, with better performance than pix2pix. It can therefore help identify incidental lesions by providing additional information, and demonstrates the potential to shorten the acquisition time of brain MRI during acute ischemic stroke imaging.
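The PSNR and NMSE figures in the results are standard image-similarity metrics. A sketch of their textbook definitions over flattened pixel lists (the peak value and the normalisation convention are assumptions; implementations vary):

```python
import math

def nmse(pred, ref):
    """Normalised mean squared error: sum of squared errors over the
    squared energy of the reference image."""
    num = sum((p - r) ** 2 for p, r in zip(pred, ref))
    den = sum(r ** 2 for r in ref)
    return num / den

def psnr(pred, ref, max_val=255.0):
    """Peak signal-to-noise ratio in dB for a given peak pixel value."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    return 10.0 * math.log10(max_val ** 2 / mse)
```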
PMID:39761257 | DOI:10.1371/journal.pone.0316642
Breast cancer classification based on breast tissue structures using the Jigsaw puzzle task in self-supervised learning
Radiol Phys Technol. 2025 Jan 6. doi: 10.1007/s12194-024-00874-y. Online ahead of print.
ABSTRACT
Self-supervised learning (SSL) has gained attention in the medical field as a deep learning approach utilizing unlabeled data. The Jigsaw puzzle task in SSL enables models to learn both features of images and the positional relationships within images. In breast cancer diagnosis, radiologists evaluate not only lesion-specific features but also the surrounding breast structures. However, deep learning models that adopt a diagnostic approach similar to human radiologists are still limited. This study aims to evaluate the effectiveness of the Jigsaw puzzle task in characterizing breast tissue structures for breast cancer classification on mammographic images. Using the Chinese Mammography Database (CMMD), we compared four pre-training pipelines: (1) IN-Jig, pre-trained with both the ImageNet classification task and the Jigsaw puzzle task, (2) Scratch-Jig, pre-trained only with the Jigsaw puzzle task, (3) IN, pre-trained only with the ImageNet classification task, and (4) Scratch, trained from random initialization without any pre-training tasks. All pipelines were fine-tuned using binary classification to distinguish between the presence or absence of breast cancer. Performance was evaluated based on the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Additionally, detailed analyses of performance were conducted across different radiological findings and breast densities, and regions of interest were visualized using gradient-weighted class activation mapping (Grad-CAM). The AUCs for the four models were 0.925, 0.921, 0.918, and 0.909, respectively. Our results suggest the Jigsaw puzzle task is an effective pre-training method for breast cancer classification, with the potential to enhance diagnostic accuracy with limited data.
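The Jigsaw puzzle pretext task amounts to tiling an image, shuffling the tiles by a known permutation, and training the network to recover the permutation index. A toy sketch of the data side (the grid size and nested-list representation are illustrative; the study's pipeline is not described at this level of detail):

```python
import random

def make_jigsaw(image, grid=3, perm=None):
    """Split a square image (list of pixel rows) into grid x grid tiles and
    reorder them by a permutation; the permutation is the SSL label."""
    n = len(image)
    t = n // grid  # tile side length
    tiles = []
    for r in range(grid):
        for c in range(grid):
            tiles.append([row[c * t:(c + 1) * t]
                          for row in image[r * t:(r + 1) * t]])
    if perm is None:
        perm = list(range(grid * grid))
        random.shuffle(perm)
    return [tiles[i] for i in perm], perm
```

In practice a fixed dictionary of maximally distinct permutations is used, so the model solves a classification problem over permutation indices.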
PMID:39760975 | DOI:10.1007/s12194-024-00874-y
An accelerated deep learning model can accurately identify clinically important humeral and scapular landmarks on plain radiographs obtained before and after anatomic arthroplasty
Int Orthop. 2025 Jan 6. doi: 10.1007/s00264-024-06401-3. Online ahead of print.
ABSTRACT
PURPOSE: Accurate identification of radiographic landmarks is fundamental to characterizing glenohumeral relationships before and sequentially after shoulder arthroplasty, but manual annotation of these radiographs is laborious. We report on the use of artificial intelligence, specifically computer vision and deep learning models (DLMs), in determining the accuracy of DLM-identified and surgeon-identified (SI) landmarks before and after anatomic shoulder arthroplasty.
MATERIALS & METHODS: 240 true anteroposterior radiographs were annotated using 11 standard osseous landmarks to train a deep learning model. Radiographs were modified to allow for a training model consisting of 2,260 images. The accuracy of DLM landmarks was compared to manually annotated radiographs using 60 radiographs not used in the training model. In addition, we also performed 14 different measurements of component positioning and compared these to measurements made based on DLM landmarks.
RESULTS: The mean deviation between DLM vs. SI cortical landmarks was 1.9 ± 1.9 mm. Scapular landmarks had slightly lower deviations compared to humeral landmarks (1.5 ± 1.8 mm vs. 2.1 ± 2.0 mm, p < 0.001). The DLM was also found to be accurate with respect to 14 measures of scapular, humeral, and glenohumeral measurements with a mean deviation of 2.9 ± 2.7 mm.
CONCLUSIONS: An accelerated deep learning model using a base of only 240 annotated images was able to achieve low levels of deviation in identifying common humeral and scapular landmarks on preoperative and postoperative radiographs. The reliability and efficiency of this deep learning model represents a powerful tool to analyze preoperative and postoperative radiographs while avoiding human observer bias.
LEVEL OF EVIDENCE: IV.
PMID:39760903 | DOI:10.1007/s00264-024-06401-3
SchizoLMNet: a modified lightweight MobileNetV2 architecture for automated schizophrenia detection using EEG-derived spectrograms
Phys Eng Sci Med. 2025 Jan 6. doi: 10.1007/s13246-024-01512-y. Online ahead of print.
ABSTRACT
Schizophrenia (SZ) is a chronic neuropsychiatric disorder characterized by disturbances in cognitive, perceptual, social, emotional, and behavioral functions. The conventional SZ diagnosis relies on subjective assessments of individuals by psychiatrists, which can result in bias, prolonged procedures, and potentially false diagnoses. This emphasizes the crucial need for early detection and treatment of SZ to provide timely support and minimize long-term impacts. Utilizing the ability of electroencephalogram (EEG) signals to capture brain activity dynamics, this article introduces a novel lightweight modified MobileNetV2 architecture (SchizoLMNet) for efficiently diagnosing SZ using spectrogram images derived from selected EEG channel data. The proposed methodology involves preprocessing of raw EEG data of 81 subjects collected from the Kaggle data repository. Short-time Fourier transform (STFT) is applied to transform pre-processed EEG signals into spectrogram images, followed by data augmentation. Further, the generated images are subjected to deep learning (DL) models to perform the binary classification task. Utilizing the proposed model, it achieved accuracies of 98.17%, 97.03%, and 95.55% for SZ versus healthy classification in hold-out, subject-independent testing, and subject-dependent testing, respectively. The SchizoLMNet model demonstrates superior performance compared to various pretrained DL models and state-of-the-art techniques. The proposed framework will be further translated into real-time clinical settings through a mobile edge computing device. This innovative approach will serve as a bridge between medical staff and patients, facilitating intelligent communication and assisting in effective SZ management.
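The STFT step converts each EEG channel into a spectrogram by taking DFT magnitudes of successive windowed frames. A naive, unoptimised sketch of that transform (no Hann window or FFT, unlike a practical pipeline; frame and hop sizes are illustrative):

```python
import cmath

def stft_magnitudes(signal, frame_len, hop):
    """Magnitude spectrogram: slide a window over the signal and take the
    DFT magnitude of each frame, keeping non-negative frequencies."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            mags.append(abs(coeff))
        frames.append(mags)
    return frames
```

The resulting time-frequency arrays are rendered as images and fed to the CNN.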
PMID:39760847 | DOI:10.1007/s13246-024-01512-y
Enhancing percutaneous coronary intervention using TriVOCTNet: a multi-task deep learning model for comprehensive intravascular optical coherence tomography analysis
Phys Eng Sci Med. 2025 Jan 6. doi: 10.1007/s13246-024-01509-7. Online ahead of print.
ABSTRACT
Neointimal coverage and stent apposition, as assessed from intravascular optical coherence tomography (IVOCT) images, are crucial for optimizing percutaneous coronary intervention (PCI). Existing state-of-the-art computer algorithms designed to automate this analysis often treat lumen and stent segmentations as separate target entities, are applicable only to a single stent type, and overlook automation of preselecting which pullback segments need segmentation, thus limiting their practicality. This study aimed for an algorithm capable of intelligently handling the entire IVOCT pullback across different phases of PCI and clinical scenarios, including the presence and coexistence of metal and bioresorbable vascular scaffold (BVS) stent types. We propose a multi-task deep learning model, named TriVOCTNet, that automates image classification/selection, lumen segmentation and stent struts segmentation within a single network by integrating classification, regression and pixel-level segmentation models. This approach allowed a single-network, single-pass implementation with all tasks parallelized for speed and convenience. A joint loss function was specifically designed to optimize each task in situations where each task may or may not be present. Evaluation on 4,746 images achieved classification accuracies of 0.999, 0.997, and 0.998 for lumen, BVS, and metal stent features, respectively. The lumen segmentation performance showed a Euclidean distance error of 21.72 μm and Dice's coefficient of 0.985. For BVS struts segmentation, the Dice's coefficient was 0.896, and for metal stent struts segmentation, the precision was 0.895 and sensitivity was 0.868. TriVOCTNet highlights its clinical potential due to its fast and accurate results, and simplicity in handling all tasks and scenarios through a single system.
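Dice's coefficient, used above for the lumen and strut segmentation scores, is a standard overlap measure between a predicted and a reference mask. A sketch over flat binary masks (the convention of returning 1.0 for two empty masks is an assumption; libraries differ):

```python
def dice(mask_a, mask_b):
    """Dice's coefficient between two binary masks (flat 0/1 lists):
    2|A ∩ B| / (|A| + |B|)."""
    inter = sum(a * b for a, b in zip(mask_a, mask_b))
    size = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / size if size else 1.0
```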
PMID:39760844 | DOI:10.1007/s13246-024-01509-7
Artificial intelligence and stroke imaging
Curr Opin Neurol. 2025 Feb 1;38(1):40-46. doi: 10.1097/WCO.0000000000001333. Epub 2024 Nov 14.
ABSTRACT
PURPOSE OF REVIEW: Though simple in its fundamental mechanism - a critical disruption of local blood supply - stroke is complicated by the intricate nature of the neural substrate, the neurovascular architecture, and their complex interactions in generating its clinical manifestations. This complexity is adequately described by high-resolution imaging with sensitivity not only to parenchymal macrostructure but also microstructure and functional tissue properties, in conjunction with detailed characterization of vascular topology and dynamics. Such descriptive richness mandates models of commensurate complexity only artificial intelligence could plausibly deliver, if we are to achieve the goal of individually precise, personalized care.
RECENT FINDINGS: Advances in machine vision technology, especially deep learning, are delivering higher fidelity predictive, descriptive, and inferential tools, incorporating increasingly rich imaging information within ever more flexible models. Impact at the clinical front line remains modest, however, owing to the challenges of delivering models robust to the noisy, incomplete, biased, and comparatively small-scale data characteristic of real-world practice.
SUMMARY: The potential benefit of introducing AI to stroke, in imaging and elsewhere, is now unquestionable, but the optimal approach - and the path to real-world application - remain unsettled. Deep generative models offer a compelling solution to current obstacles and are predicted powerfully to catalyse innovation in the field.
PMID:39760722 | DOI:10.1097/WCO.0000000000001333
Diagnostic Performance of Deep Learning Applications in Hepatocellular Carcinoma Detection Using Computed Tomography Imaging
Turk J Gastroenterol. 2024 Dec 30. doi: 10.5152/tjg.2024.24538. Online ahead of print.
ABSTRACT
Hepatocellular carcinoma (HCC) is a prevalent cancer that significantly contributes to mortality globally, primarily due to its late diagnosis. Early detection is crucial yet challenging. This study leverages the potential of deep learning (DL) technologies, employing the You Only Look Once (YOLO) architecture, to enhance the detection of HCC in computed tomography (CT) images, aiming to improve early diagnosis and thereby patient outcomes. We used a dataset of 1290 CT images from 122 patients, segmented according to a standard 70:20:10 split for training, validation, and testing phases. The YOLO-based DL model was trained on these images, with subsequent phases for validation and testing to assess the model's diagnostic capabilities comprehensively. The model exhibited exceptional diagnostic accuracy, with a precision of 0.97216, recall of 0.919, and an overall accuracy of 95.35%, significantly surpassing traditional diagnostic approaches. It achieved a specificity of 95.83% and a sensitivity of 94.74%, evidencing its effectiveness in clinical settings and its potential to reduce the rate of missed diagnoses and unnecessary interventions. The implementation of the YOLO architecture for detecting HCC in CT scans has shown substantial promise, indicating that DL models could soon become a standard tool in oncological diagnostics. As artificial intelligence technology continues to evolve, its integration into healthcare systems is expected to advance the accuracy and efficiency of diagnostics in oncology, enhancing early detection and treatment strategies and potentially improving patient survival rates.
PMID:39760649 | DOI:10.5152/tjg.2024.24538
Highly-Efficient Differentiation of Reactive Lymphocytes in Peripheral Blood Using Multi-Object Detection Network With Large Kernels
Microsc Res Tech. 2025 Jan 6. doi: 10.1002/jemt.24775. Online ahead of print.
ABSTRACT
Reactive lymphocytes are an important type of leukocytes, which are morphologically transformed from lymphocytes. The increase in these cells is usually a sign of certain virus infections, so their detection plays an important role in the fight against diseases. Manual detection of reactive lymphocytes is undoubtedly time-consuming and labor-intensive, requiring a high level of professional knowledge. Therefore, it is highly necessary to conduct research into computer-assisted diagnosis. With the development of deep learning technology in the field of computer vision, more and more models are being applied in the field of medical imaging. We aim to propose an advanced multi-object detection network and apply it to practical medical scenarios of reactive lymphocyte detection and other leukocyte detection. First, we introduce a space-to-depth convolution (SPD-Conv), which enhances the model's ability to detect dense small objects. Next, we design a dynamic large kernel attention (DLKA) mechanism, enabling the model to better model the context of various cells in clinical scenarios. Lastly, we introduce a brand-new feature fusion network, the asymptotic feature pyramid network (AFPN), which strengthens the model's ability to fuse multi-scale features. Our model ultimately achieves mAP50 of 0.918 for reactive lymphocyte detection and 0.907 for all leukocytes, while also demonstrating good interpretability. In addition, we propose a new peripheral blood cell dataset, providing data support for subsequent related work. In summary, our work takes a significant step forward in the detection of reactive lymphocytes.
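The mAP50 figures reported here rest on intersection-over-union: a detection counts as correct when its box overlaps a ground-truth box with IoU ≥ 0.5. A sketch of box IoU (corner-format boxes assumed; not the authors' code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2) corner coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0
```

Averaging precision over recall levels at this 0.5 threshold, per class, gives mAP50.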
PMID:39760201 | DOI:10.1002/jemt.24775
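The SPD-Conv mentioned above builds on a space-to-depth rearrangement: spatial blocks are moved into the channel dimension, so resolution is reduced without discarding pixels the way a strided convolution does. A minimal NumPy sketch of that rearrangement (the convolution that follows it is omitted):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange spatial blocks into channels:
    (C, H, W) -> (C * block^2, H / block, W / block), losslessly."""
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(c, h // block, block, w // block, block)
    x = x.transpose(0, 2, 4, 1, 3)  # move block offsets next to channels
    return x.reshape(c * block * block, h // block, w // block)

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
y = space_to_depth(x)
print(y.shape)  # (4, 2, 2)
```

Each output channel holds one sub-grid of the input (e.g. channel 0 is `x[:, ::2, ::2]`), which is why no information is lost before downsampling.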
Comprehensive VR dataset for machine learning: Head- and eye-centred video and positional data
Data Brief. 2024 Nov 29;57:111187. doi: 10.1016/j.dib.2024.111187. eCollection 2024 Dec.
ABSTRACT
We present a comprehensive dataset comprising head- and eye-centred video recordings from human participants performing a search task in a variety of Virtual Reality (VR) environments. Using a VR motion platform, participants navigated these environments freely while their eye movements and positional data were captured and stored in CSV format. The dataset spans six distinct environments, including one specifically for calibrating the motion platform, and provides a cumulative playtime of over 10 h for both head- and eye-centred perspectives. The data collection was conducted in naturalistic VR settings, where participants collected virtual coins scattered across diverse landscapes such as grassy fields, dense forests, and an abandoned urban area, each characterized by unique ecological features. This structured and detailed dataset offers substantial reuse potential, particularly for machine learning applications. Its richness makes it an ideal resource for training models on various tasks, including the prediction and analysis of visual search behaviour, eye movements, and navigation strategies within VR environments. Researchers can leverage this extensive dataset to develop and refine algorithms requiring comprehensive and annotated video and positional data. By providing a well-organized and detailed dataset, it serves as an invaluable resource for advancing machine learning research in VR and fostering the development of innovative VR technologies.
PMID:39760008 | PMC:PMC11699299 | DOI:10.1016/j.dib.2024.111187
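Positional and gaze data in CSV form, as described above, can be parsed with the standard library alone. A minimal sketch; the column names here are hypothetical, since the abstract does not specify the dataset's actual schema:

```python
import csv
import io

# Hypothetical CSV layout -- the real columns in the dataset may differ.
sample = """timestamp,head_x,head_y,head_z,gaze_x,gaze_y
0.00,1.2,1.6,0.3,0.51,0.48
0.02,1.3,1.6,0.3,0.52,0.47
"""

def load_positions(fh):
    """Parse one positional CSV into a list of dicts with float fields."""
    return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(fh)]

rows = load_positions(io.StringIO(sample))
print(len(rows), rows[0]["head_x"])
```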
A dataset of deep learning performance from cross-base data encoding on MNIST and MNIST-C
Data Brief. 2024 Dec 3;57:111194. doi: 10.1016/j.dib.2024.111194. eCollection 2024 Dec.
ABSTRACT
Effective data representation in machine learning and deep learning is paramount. For an algorithm or neural network to capture patterns in data and make reliable predictions, the data must appropriately describe the problem domain. Although there is extensive literature on data preprocessing for machine learning and data science applications, novel data representation methods for enhancing model performance remain largely absent from it. This dataset is a compilation of convolutional neural network performance results, trained and tested on a wide range of numerical base representations of the MNIST and MNIST-C datasets. These performance data can be further analysed by the research community to uncover trends in model performance as a function of the numerical base of the data. The dataset can also seed further research of the same nature, testing cross-base data encoding on machine learning training and testing data for a wide range of real-world applications.
PMID:39760007 | PMC:PMC11697575 | DOI:10.1016/j.dib.2024.111194
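Cross-base encoding as described above amounts to rewriting each numeric value in a different radix. A minimal sketch of one plausible such encoding (the paper's exact scheme is not given in the abstract):

```python
def to_base_digits(value, base, width):
    """Represent a non-negative integer as fixed-width digits in the
    given base, most significant digit first."""
    digits = []
    for _ in range(width):
        value, d = divmod(value, base)
        digits.append(d)
    return digits[::-1]

# An 8-bit MNIST pixel (0-255) re-encoded in base 3 needs 6 digits (3^6 = 729 >= 256).
print(to_base_digits(255, 3, 6))  # [1, 0, 0, 1, 1, 0]
```

A whole image would then be flattened into one digit vector per pixel, giving the network a base-dependent input representation.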
Application of a Novel Multimodal-Based Deep Learning Model for the Prediction of Papillary Thyroid Carcinoma Recurrence
Int J Gen Med. 2024 Dec 31;17:6585-6594. doi: 10.2147/IJGM.S486189. eCollection 2024.
ABSTRACT
PURPOSE: Papillary thyroid carcinoma (PTC) is the most common thyroid malignancy. Although its mortality rate is low, some patients experience cancer recurrence during follow-up. In this study, we investigated the accuracy of a novel multimodal model by simultaneously analyzing numeric and time-series data to predict recurrence in patients with PTC after thyroidectomy.
PATIENTS AND METHODS: We analyzed patients with thyroid carcinoma who underwent thyroidectomy at the Chungbuk National University Hospital between January 2006 and December 2021. The proposed model used numerical data, including clinical information at the time of surgery, and time-series data, including postoperative thyroid function test results. To train the model on unbalanced data, we employed weighted binary cross-entropy with weights of 0.8 for the positive (recurrence) group and 0.2 for the negative (nonrecurrence) group. We performed four-fold cross-validation of the dataset to evaluate the model performance.
RESULTS: Our dataset comprised 1613 patients who underwent thyroidectomy, including 1550 and 63 patients with nonrecurrent and recurrent PTC, respectively. Patients with recurrence had a larger tumor size, greater tumor multiplicity, and a higher male-to-female ratio than those without recurrence. The proposed model achieved an average area under the curve of 0.9622, F1-score of 0.4603, sensitivity of 0.9042, and specificity of 0.9077.
CONCLUSION: Experimental results showed that the proposed model could predict recurrence at least 1 year before it occurred. The multimodal model for predicting PTC recurrence after thyroidectomy showed good performance. In clinical practice, it may help with the early detection of recurrence during the follow-up of patients with PTC after thyroidectomy.
PMID:39759893 | PMC:PMC11699832 | DOI:10.2147/IJGM.S486189
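The weighted binary cross-entropy with weights 0.8/0.2 described above penalizes errors on the rare recurrence class more heavily than on the majority class. A minimal sketch of that loss (the authors' actual implementation is not shown in the abstract):

```python
import math

def weighted_bce(y_true, p_pred, w_pos=0.8, w_neg=0.2, eps=1e-7):
    """Weighted binary cross-entropy: positive (recurrence) errors weighted
    by w_pos, negative (nonrecurrence) errors by w_neg."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp for numerical stability
        total += -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1 - p))
    return total / len(y_true)

print(weighted_bce([1, 0, 1], [0.9, 0.2, 0.7]))
```

With these weights, a missed recurrence at p = 0.5 costs four times as much as a false alarm at p = 0.5, which is the point of the weighting on a 63-vs-1550 class imbalance.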
Simple quantitation and spatial characterization of label free cellular images
Heliyon. 2024 Nov 23;10(23):e40684. doi: 10.1016/j.heliyon.2024.e40684. eCollection 2024 Dec 15.
ABSTRACT
Label-free imaging is routinely used during cell culture because of its minimal interference with intracellular biology and its capability for observing cells over time. However, label-free image analysis is challenging due to the low contrast between foreground signals and background. So far, various deep learning tools have been developed for label-free image analysis, and their performance depends on the quality of the training data. In this study, we developed a simple computational pipeline that requires no training data and is suited to images generated with high-content microscopy equipment. By combining classical image processing functions, Voronoi segmentation, Gaussian mixture modeling, and automatic parameter optimization, our pipeline can quantify cell numbers and characterize spatial distributions from a single label-free image. We demonstrated its applicability in four morphologically distinct cell types at various cell densities. The pipeline is implemented in R and does not require excessive computational power, providing novel opportunities for automated label-free image analysis in large-scale or repeated cell culture experiments.
PMID:39759864 | PMC:PMC11700677 | DOI:10.1016/j.heliyon.2024.e40684
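Gaussian mixture modeling, one step of the pipeline above, can separate dim background from bright foreground pixels without any training data. A minimal two-component EM sketch on synthetic intensities (the pipeline itself is in R; this is an illustrative Python analogue, not the authors' code):

```python
import numpy as np

def fit_gmm_1d(x, iters=50):
    """Fit a two-component 1-D Gaussian mixture to intensities with EM.
    Means are initialized at the extremes so the components separate."""
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.full(2, x.var() + 1e-6)
    pi = np.full(2, 0.5)
    for _ in range(iters):
        # E-step: responsibility of each component for each pixel
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        n = resp.sum(axis=0)
        pi = n / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / n
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-6
    return pi, mu, var

# Synthetic intensities: 90% dim background, 10% bright foreground
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.2, 0.05, 900), rng.normal(0.7, 0.05, 100)])
pi, mu, var = fit_gmm_1d(x)
print(mu)
```

The recovered means land near the two intensity modes and the mixture weights near the 90/10 split, after which pixels can be assigned to foreground or background by their responsibilities.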