Deep learning
PLMC: Language Model of Protein Sequences Enhances Protein Crystallization Prediction
Interdiscip Sci. 2024 Aug 19. doi: 10.1007/s12539-024-00639-6. Online ahead of print.
ABSTRACT
X-ray diffraction crystallography is the most widely used technique for protein three-dimensional (3D) structure determination, and whether a protein is crystallizable is a central prerequisite. Protein crystallization proceeds through several stages, including protein material production, purification, and crystal production, each of which affects the final outcome. Because this multi-stage process is expensive and laborious, various computational tools have been developed to predict protein crystallization propensity and thereby guide experimental determination. In this study, we present a novel deep learning framework, PLMC, that improves multi-stage protein crystallization propensity prediction by leveraging a pre-trained protein language model. To train PLMC effectively, two groups of features for each protein were integrated into a more comprehensive representation: protein language embeddings from a model pre-trained on a large-scale protein sequence database, and a handcrafted feature set comprising physicochemical, sequence-based, and disorder-related information. These features were separately embedded for refinement and then concatenated for the final prediction. Notably, our extensive benchmarking tests demonstrate that PLMC greatly outperforms other state-of-the-art methods, achieving AUC scores of 0.773, 0.893, and 0.913 at the three individual stages, respectively, and 0.982 at the final crystallization stage. Furthermore, PLMC is superior for predicting the crystallization of both globular and membrane proteins, as demonstrated by an AUC score of 0.991 for the latter. These results suggest the significant potential of PLMC in assisting researchers with the experimental design of crystallizable protein variants.
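For illustration, here is a minimal PyTorch sketch of the two-branch fusion idea described above (separate refinement of language-model embeddings and handcrafted features, followed by concatenation). All dimensions, layer sizes, and the stage count are assumptions, not the authors' exact PLMC architecture.

```python
import torch
import torch.nn as nn

class TwoBranchFusionNet(nn.Module):
    """Sketch: refine two feature groups separately, then concatenate."""
    def __init__(self, plm_dim=1280, handcrafted_dim=64, hidden=256, n_stages=4):
        super().__init__()
        # Branch 1: embeddings from a pre-trained protein language model.
        self.plm_branch = nn.Sequential(nn.Linear(plm_dim, hidden), nn.ReLU())
        # Branch 2: physicochemical / sequence-based / disorder-related features.
        self.hc_branch = nn.Sequential(nn.Linear(handcrafted_dim, hidden), nn.ReLU())
        # The concatenated representation feeds the per-stage prediction head.
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_stages),  # one logit per crystallization stage
        )

    def forward(self, plm_emb, handcrafted):
        fused = torch.cat([self.plm_branch(plm_emb),
                           self.hc_branch(handcrafted)], dim=-1)
        return self.classifier(fused)
```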
PMID:39155325 | DOI:10.1007/s12539-024-00639-6
Application of Photoplethysmography Combined with Deep Learning in Postoperative Monitoring of Flaps
Zhongguo Yi Liao Qi Xie Za Zhi. 2024 Jul 30;48(4):419-425. doi: 10.12455/j.issn.1671-7104.230624.
ABSTRACT
OBJECTIVE: Photoplethysmography (PPG) exhibits high sensitivity and specificity in flap monitoring, and deep learning (DL) can automatically and robustly extract features from raw data. In this study, we propose combining PPG with one-dimensional convolutional neural networks (1D-CNN) to preliminarily explore the method's ability to distinguish the degree of embolism and to localize the embolic site in skin flap arteries.
METHODS: Data were collected under normal conditions and various embolic scenarios by creating vascular emboli in a skin flap artery model and a rabbit skin flap model. A 1D-CNN was then trained, validated, and tested on these datasets.
RESULTS: As the degree of arterial embolization increased, the PPG amplitude upstream of the embolization site progressively increased, while the downstream amplitude progressively decreased, and the gap between the upstream and downstream amplitudes at the embolization site progressively widened. 1D-CNN was evaluated in the skin flap arterial model and rabbit skin flap model, achieving average accuracies of 98.36% and 95.90%, respectively.
CONCLUSION: The combined monitoring approach of DL and PPG can effectively identify the degree of embolism and locate the embolic site within the skin flap artery.
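As a rough illustration of the signal-classification setup (not the authors' exact network), a small 1D-CNN for fixed-length PPG segments might look like the following in PyTorch; channel counts, kernel sizes, and the class count are assumptions.

```python
import torch.nn as nn

class PPG1DCNN(nn.Module):
    """Sketch: classify PPG segments by embolism degree/site category."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # length-independent global pooling
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):  # x: (batch, 1, time)
        return self.classifier(self.features(x).squeeze(-1))
```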
PMID:39155256 | DOI:10.12455/j.issn.1671-7104.230624
Development of an Intelligent Multi-Parameter Sleep Diagnosis and Analysis System
Zhongguo Yi Liao Qi Xie Za Zhi. 2024 Jul 30;48(4):373-379. doi: 10.12455/j.issn.1671-7104.240036.
ABSTRACT
Sleep-disordered breathing (SDB) is a common sleep disorder with an increasing prevalence. The current gold standard for diagnosing SDB is polysomnography (PSG), but existing PSG techniques have limitations, such as long manual interpretation times, a lack of data quality control, and insufficient monitoring of gas metabolism and hemodynamics. Therefore, there is an urgent need in China's sleep clinics for a new intelligent PSG system with data quality control, gas metabolism assessment, and hemodynamic monitoring capabilities. In hardware, the new system detects traditional parameters such as nasal airflow, blood oxygen levels, electrocardiography (ECG), electroencephalography (EEG), electromyography (EMG), and electrooculography (EOG), and includes additional modules for gas metabolism assessment via end-tidal CO2 and O2 concentrations and for hemodynamic function assessment via impedance cardiography. In software, deep learning methods are being employed to develop intelligent data quality control and diagnostic techniques. The goal is to provide detailed sleep quality assessments that effectively assist doctors in evaluating the sleep quality of SDB patients.
PMID:39155248 | DOI:10.12455/j.issn.1671-7104.240036
Deep Learning-Based Artificial Intelligence Model for Automatic Carotid Plaque Identification
Zhongguo Yi Liao Qi Xie Za Zhi. 2024 Jul 30;48(4):361-366. doi: 10.12455/j.issn.1671-7104.240009.
ABSTRACT
This study developed a dataset for determining the presence of carotid artery plaques in ultrasound images, comprising 1761 ultrasound images from 1165 participants. A deep learning architecture combining bilinear convolutional neural networks with residual neural networks, the single-input BCNN-ResNet model, was used to help clinicians diagnose plaques on carotid ultrasound images. After training, internal validation, and external validation, the model yielded an ROC AUC of 0.99 (95% confidence interval: 0.84 to 0.91) in internal validation and 0.95 (95% confidence interval: 0.94 to 0.96) in external validation, surpassing the ResNet-34 model, which achieved an AUC of 0.98 (95% confidence interval: 0.95 to 0.99) in internal validation and 0.94 (95% confidence interval: 0.92 to 0.95) in external validation. The single-input BCNN-ResNet model thus shows remarkable diagnostic capability and offers an innovative solution for the automatic detection of carotid artery plaques.
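For readers unfamiliar with bilinear CNNs, the sketch below shows the standard bilinear pooling operation (outer-product pooling with signed square-root and L2 normalisation) that such models build on; the paper's single-input BCNN-ResNet details may differ.

```python
import torch

def bilinear_pool(feat_a, feat_b):
    """Standard BCNN pooling of two feature maps (B, C, H, W) -> (B, Ca*Cb)."""
    b, ca, h, w = feat_a.shape
    cb = feat_b.shape[1]
    a = feat_a.reshape(b, ca, h * w)
    bm = feat_b.reshape(b, cb, h * w)
    pooled = torch.bmm(a, bm.transpose(1, 2)) / (h * w)  # outer-product pooling
    pooled = pooled.reshape(b, ca * cb)
    pooled = torch.sign(pooled) * torch.sqrt(pooled.abs() + 1e-10)  # signed sqrt
    return torch.nn.functional.normalize(pooled, dim=1)             # L2 normalise
```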
PMID:39155246 | DOI:10.12455/j.issn.1671-7104.240009
Use of explainable AI on slit-lamp images of anterior surface of eyes to diagnose allergic conjunctival diseases
Allergol Int. 2024 Aug 17:S1323-8930(24)00077-7. doi: 10.1016/j.alit.2024.07.004. Online ahead of print.
ABSTRACT
BACKGROUND: Artificial intelligence (AI) is a promising new technology with the potential to diagnose allergic conjunctival diseases (ACDs). However, its development has been slowed by the absence of a tailored image database and of explainable AI models. The purpose of this study was therefore to develop an explainable AI model that can not only diagnose ACDs but also present the basis for its diagnoses.
METHODS: A dataset of 4942 slit-lamp images from 10 ophthalmological institutions across Japan was used as the image database. A sequential segmentation AI pipeline was constructed to identify 12 clinical findings in 1038 images of seasonal and perennial allergic conjunctivitis (AC), atopic keratoconjunctivitis (AKC), vernal keratoconjunctivitis (VKC), giant papillary conjunctivitis (GPC), and normal subjects. The performance of the pipeline was evaluated by its ability to produce explainable results through the extraction of these findings. Its diagnostic accuracy was determined for a four-class, severity-based diagnostic classification: AC, AKC/VKC, GPC, and normal.
RESULTS: The segmentation AI pipeline efficiently extracted crucial ACD indicators, including conjunctival hyperemia, giant papillae, and shield ulcer, and offered interpretable insights. The AI pipeline achieved a high diagnostic accuracy of 86.2%, compared with 60.0% for board-certified ophthalmologists. Classification performance was high, with areas under the curve (AUC) of 0.959 for AC, 0.905 for normal subjects, 0.847 for GPC, 0.829 for VKC, and 0.790 for AKC.
CONCLUSIONS: An explainable AI model built on a comprehensive image database can diagnose ACDs with a high degree of accuracy.
PMID:39155213 | DOI:10.1016/j.alit.2024.07.004
COVID-19 IgG Antibodies Detection Based on CNN-BiLSTM Algorithm Combined with Fiber-Optic Dataset
J Virol Methods. 2024 Aug 16:115011. doi: 10.1016/j.jviromet.2024.115011. Online ahead of print.
ABSTRACT
The urgent need for efficient and accurate automated screening tools for COVID-19 detection has led to research exploring various approaches. In this study, we present pioneering research on COVID-19 detection using a hybrid model that combines convolutional neural networks (CNN) with a bidirectional long short-term memory (Bi-LSTM) network, in conjunction with fiber-optic data for SARS-CoV-2 immunoglobulin G (IgG) antibodies. We introduce a comprehensive data preprocessing pipeline and evaluate the performance of four deep learning (DL) algorithms (CNN, CNN-RNN, BiLSTM, and CNN-BiLSTM) in classifying samples as positive or negative for SARS-CoV-2. Among these, the CNN-BiLSTM classifier performed best on the training datasets, achieving an accuracy of 89%, a recall of 88%, a precision of 90%, an F1-score of 89%, a specificity of 90%, a geometric mean (G-mean) of 89%, and an area under the receiver operating characteristic curve (ROC AUC) of 96%. The classification results were also compared with those reported in the literature. The findings indicate that the proposed model holds promise for classifying COVID-19 and could serve as a valuable tool for healthcare professionals. The use of IgG antibodies to detect the virus enhances the specificity and accuracy of the diagnostic tool.
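A minimal PyTorch sketch of the CNN-BiLSTM hybrid idea (convolutional feature extraction followed by a bidirectional LSTM and a binary output) is shown below; layer sizes are assumptions, not the study's configuration.

```python
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """Sketch: 1D-CNN front end, BiLSTM temporal model, binary logit."""
    def __init__(self, in_channels=1, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # positive/negative logit

    def forward(self, x):                     # x: (batch, channels, time)
        feats = self.cnn(x).transpose(1, 2)   # -> (batch, time', 64)
        out, _ = self.bilstm(feats)
        return self.head(out[:, -1])          # use the last time step
```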
PMID:39154936 | DOI:10.1016/j.jviromet.2024.115011
NeuroPred-ResSE: Predicting neuropeptides by integrating residual block and squeeze-excitation attention mechanism
Anal Biochem. 2024 Aug 16:115648. doi: 10.1016/j.ab.2024.115648. Online ahead of print.
ABSTRACT
Neuropeptides play crucial roles in regulating neurological function as signaling molecules, providing new opportunities for developing drugs to treat neurological diseases. It is therefore necessary to develop rapid and accurate prediction models for neuropeptides. Although a few prediction tools have been developed, there is room to improve prediction accuracy using deep learning approaches. In this paper, we establish the NeuroPred-ResSE model based on a residual block and a squeeze-excitation attention mechanism. First, we extract multiple features using one-hot coding based on the NT5CT5 sequence, dipeptide deviation from expected mean, and natural vectors. We then integrate a residual block and a squeeze-excitation attention mechanism, which capture and identify the most relevant attribute features. The accuracies on the training and test sets are 97.16% and 96.60% under 5-fold cross-validation and independent testing, respectively, and the other evaluation metrics also achieve satisfactory results. The experimental results show that NeuroPred-ResSE outperforms existing state-of-the-art models and is an effective, intelligent, and robust prediction tool. The datasets and source codes are available at https://github.com/yunyunliang88/NeuroPred-ResSE.
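For readers unfamiliar with the attention mechanism named here, a standard squeeze-and-excitation block for 1D feature maps is sketched below; this is the generic formulation, not necessarily the exact NeuroPred-ResSE implementation.

```python
import torch.nn as nn

class SqueezeExcitation1d(nn.Module):
    """Generic squeeze-and-excitation channel attention for (B, C, L) features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)  # squeeze: global context per channel
        self.fc = nn.Sequential(             # excitation: per-channel weights
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (batch, channels, length)
        w = self.fc(self.pool(x).squeeze(-1)).unsqueeze(-1)
        return x * w                         # reweight the channels
```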
PMID:39154878 | DOI:10.1016/j.ab.2024.115648
Editorial: Advanced deep learning approaches enable high-throughput biological and biomedicine data analysis
Methods. 2024 Aug 16:S1046-2023(24)00183-X. doi: 10.1016/j.ymeth.2024.08.002. Online ahead of print.
NO ABSTRACT
PMID:39154807 | DOI:10.1016/j.ymeth.2024.08.002
TS-AI: A deep learning pipeline for multimodal subject-specific parcellation with task contrasts synthesis
Med Image Anal. 2024 Aug 8;97:103297. doi: 10.1016/j.media.2024.103297. Online ahead of print.
ABSTRACT
Accurate mapping of brain functional subregions at an individual level is crucial. Task-based functional MRI (tfMRI) captures subject-specific activation patterns during various functions and behaviors, facilitating the individual localization of functionally distinct subregions. However, acquiring high-quality tfMRI is time-consuming and resource-intensive in both scientific and clinical settings. The present study proposes a two-stage network model, TS-AI, to individualize an atlas on cortical surfaces through the prediction of tfMRI data. TS-AI first synthesizes a battery of task contrast maps for each individual by leveraging tract-wise anatomical connectivity and resting-state networks. These synthesized maps, along with feature maps of tract-wise anatomical connectivity and resting-state networks, are then fed into an end-to-end deep neural network to individualize an atlas. TS-AI enables the synthesized task contrast maps to be used in individual parcellation without the acquisition of actual task fMRI scans. In addition, a novel feature consistency loss is designed to assign vertices with similar features to the same parcel, which increases individual specificity and mitigates overfitting risks caused by the absence of individual parcellation ground truth. The individualized parcellations were validated by assessing test-retest reliability, homogeneity, and cognitive behavior prediction using diverse reference atlases and datasets, demonstrating the superior performance and generalizability of TS-AI. Sensitivity analysis yielded insights into region-specific features influencing individual variation in functional regionalization. Additionally, TS-AI identified accelerated shrinkage in the medial temporal and cingulate parcels during the progression of Alzheimer's disease, suggesting its potential in clinical research and applications.
PMID:39154619 | DOI:10.1016/j.media.2024.103297
Multimodal representations of biomedical knowledge from limited training whole slide images and reports using deep learning
Med Image Anal. 2024 Aug 14;97:103303. doi: 10.1016/j.media.2024.103303. Online ahead of print.
ABSTRACT
The increasing availability of biomedical data creates valuable resources for developing new deep learning algorithms to support experts, especially in domains where collecting large volumes of annotated data is not trivial. Biomedical data include several modalities containing complementary information, such as medical images and reports: images are often large and encode low-level information, while reports include a summarized, high-level description of the findings, often concerning only a small part of the image. However, few methods effectively link the visual content of images with the textual content of reports, preventing medical specialists from fully benefitting from the recent opportunities offered by deep learning models. This paper introduces a multimodal architecture that creates a robust biomedical data representation by encoding fine-grained text representations within image embeddings. The architecture is designed to tackle data scarcity (combining supervised and self-supervised learning) and to create multimodal biomedical ontologies. It is trained on over 6,000 colon whole slide images (WSIs), paired with the corresponding reports, collected from two digital pathology workflows. The evaluation involves three tasks: WSI classification (on data from the pathology workflows and from public repositories), multimodal data retrieval, and linking between textual and visual concepts. Notably, the latter two tasks are available by architectural design without further training, showing that the multimodal architecture can be adopted as a backbone for specific tasks. The multimodal data representation outperforms the unimodal one on the classification of colon WSIs and halves the data needed to reach accurate performance, reducing the computational power required and thus the carbon footprint. Combining images and reports with self-supervised algorithms allows databases to be mined for new information without new expert annotations. In particular, the multimodal visual ontology, linking semantic concepts to images, may pave the way for advances in medicine and biomedical analysis, not limited to histopathology.
PMID:39154617 | DOI:10.1016/j.media.2024.103303
Metadata-enhanced contrastive learning from retinal optical coherence tomography images
Med Image Anal. 2024 Aug 10;97:103296. doi: 10.1016/j.media.2024.103296. Online ahead of print.
ABSTRACT
Deep learning has the potential to automate screening, monitoring, and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. First, several image transformations that have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Second, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal datasets that repeatedly image the same patient cohort to monitor disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships, using records of patient identity, eye position (i.e., left or right), and time-series information. In experiments on two large longitudinal datasets containing 170,427 retinal optical coherence tomography (OCT) images of 7912 patients with age-related macular degeneration (AMD), we evaluate the utility of metadata for incorporating the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five of six image-level downstream tasks related to AMD. We find benefits in both low-data and high-data regimes across tasks ranging from AMD stage and type classification to prediction of visual acuity. Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefit of including available metadata in contrastive pretraining.
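The core idea can be sketched as a supervised-contrastive-style loss in which metadata defines the positives; the simplification below uses patient identity only, whereas the paper also exploits eye position and time-series information.

```python
import torch
import torch.nn.functional as F

def metadata_contrastive_loss(embeddings, patient_ids, temperature=0.1):
    """Sketch: images sharing a patient ID (metadata) are treated as positives."""
    z = F.normalize(embeddings, dim=1)                    # (N, D) unit vectors
    sim = z @ z.T / temperature
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = (patient_ids.unsqueeze(0) == patient_ids.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))       # drop self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over each anchor's metadata-defined positives.
    per_anchor = log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -per_anchor[pos_mask.any(1)].mean()            # anchors with >= 1 positive
```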
PMID:39154616 | DOI:10.1016/j.media.2024.103296
Exploring the trade-off between generalist and specialized models: a center-based comparative analysis for glioblastoma segmentation
Int J Med Inform. 2024 Aug 15;191:105604. doi: 10.1016/j.ijmedinf.2024.105604. Online ahead of print.
ABSTRACT
INTRODUCTION: Inherent variations between inter-center data can undermine the robustness of segmentation models when applied at a specific center (dataset shift). We investigated whether specialized center-specific models are more effective than generalist models trained on multi-center data, and how center-specific data can enhance the performance of generalist models at a particular center through a fine-tuning transfer learning approach. For this purpose, we studied dataset shift at the center level and conducted a comparative analysis to assess the impact of data source on glioblastoma segmentation models.
METHODS & MATERIALS: The three key components of dataset shift were studied: prior probability shift-variations in tumor size or tissue distribution among centers; covariate shift-inter-center MRI alterations; and concept shift-different criteria for tumor segmentation. BraTS 2021 dataset was used, which includes 1251 cases from 23 centers. Thereafter, 155 deep-learning models were developed and compared, including 1) generalist models trained with multi-center data, 2) specialized models using only center-specific data, and 3) fine-tuned generalist models using center-specific data.
RESULTS: The three key components of dataset shift were characterized. The amount of covariate shift was substantial, indicating large variations in MR imaging between centers. Glioblastoma segmentation models tend to perform best when using data from the application center. Generalist models, trained with over 700 samples, achieved a median Dice score of 88.98%. Specialized models surpassed this with only 200 center-specific cases, and fine-tuned models did so with only 50.
CONCLUSIONS: The influence of dataset shift on model performance is evident. Fine-tuned and specialized models, utilizing data from the evaluated center, outperform generalist models, which rely on data from other centers. These approaches could encourage medical centers to develop customized models for their local use, enhancing the accuracy and reliability of glioblastoma segmentation in a context where dataset shift is inevitable.
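One common recipe for the fine-tuning approach compared here is to load the generalist network and adapt only part of it on the handful of local cases; the file name and the encoder/decoder parameter prefixes below are hypothetical.

```python
import torch

# Assumption: "generalist_unet.pt" stores a pre-trained nn.Module whose
# parameter names carry "encoder." / "decoder." prefixes (both hypothetical).
model = torch.load("generalist_unet.pt")

for name, param in model.named_parameters():
    param.requires_grad = name.startswith("decoder.")  # freeze the encoder

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
# ...then continue training on the ~50 center-specific cases as usual.
```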
PMID:39154600 | DOI:10.1016/j.ijmedinf.2024.105604
Virtual multiplexed immunofluorescence staining from non-antibody-stained fluorescence imaging for gastric cancer prognosis
EBioMedicine. 2024 Aug 17;107:105287. doi: 10.1016/j.ebiom.2024.105287. Online ahead of print.
ABSTRACT
BACKGROUND: Multiplexed immunofluorescence (mIF) staining, such as CODEX and MIBI, holds significant clinical value for various fields, such as disease diagnosis, biological research, and drug development. However, these techniques are often hindered by high time and cost requirements.
METHODS: Here we present a Multimodal-Attention-based virtual mIF Staining (MAS) system that utilises a deep learning model to extract potential antibody-related features from dual-modal non-antibody-stained fluorescence imaging, specifically autofluorescence (AF) and DAPI imaging. The MAS system simultaneously generates predictions of mIF with multiple survival-associated biomarkers in gastric cancer using self- and multi-attention learning mechanisms.
FINDINGS: Experimental results with 180 pathological slides from 94 patients with gastric cancer demonstrate the efficiency and consistent performance of the MAS system in both cancerous and noncancerous gastric tissues. Furthermore, we showcase the prognostic accuracy of the virtual mIF images of seven gastric cancer-related biomarkers (CD3, CD20, FOXP3, PD1, CD8, CD163, and PD-L1), which is comparable to that obtained from standard mIF staining.
INTERPRETATION: The MAS system rapidly generates reliable multiplexed staining, greatly reducing the cost of mIF and improving clinical workflow.
FUNDING: Stanford 2022 HAI Seed Grant; National Institutes of Health 1R01CA256890.
PMID:39154539 | DOI:10.1016/j.ebiom.2024.105287
Using VIS-NIR hyperspectral imaging and deep learning for non-destructive high-throughput quantification and visualization of nutrients in wheat grains
Food Chem. 2024 Jul 29;461:140651. doi: 10.1016/j.foodchem.2024.140651. Online ahead of print.
ABSTRACT
High-throughput, low-cost quantification of the nutrient content of crop grains is crucial for food processing and nutritional research, but traditional methods are time-consuming and destructive. This study proposes a high-throughput, low-cost method for quantifying wheat grain nutrients with VIS-NIR (400-1700 nm) hyperspectral imaging. Stepwise linear regression (SLR) accurately predicted hundreds of nutrients (R2 > 0.6), and results improved when the hyperspectral data were preprocessed with the first derivative. Knockout materials were also used to verify the method's practical value. The characteristic wavelengths of the various nutrients were mainly concentrated in the 400-500 nm and 900-1000 nm regions. Finally, we propose an improved pix2pix conditional generative network model to visualize the nutrient distribution, with better results than the original model. This research highlights the potential of hyperspectral technology combined with deep learning for high-throughput, non-destructive determination and visualization of grain nutrients.
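In scikit-learn terms, the spectral pipeline could be approximated as below: a Savitzky-Golay first derivative followed by forward feature selection and linear regression. The window length, feature count, and use of SequentialFeatureSelector as a stand-in for stepwise regression are all assumptions.

```python
from scipy.signal import savgol_filter
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# X: (n_samples, n_wavelengths) reflectance spectra; y: nutrient content.
first_derivative = FunctionTransformer(
    lambda X: savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1))

model = make_pipeline(
    first_derivative,                                    # spectral preprocessing
    SequentialFeatureSelector(LinearRegression(),
                              n_features_to_select=20),  # wavelength selection
    LinearRegression(),                                  # SLR-style predictor
)
# model.fit(X_train, y_train); r2 = model.score(X_test, y_test)
```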
PMID:39154465 | DOI:10.1016/j.foodchem.2024.140651
MS<sup>2</sup>3D: A 3D object detection method using multi-scale semantic feature points to construct 3D feature layer
Neural Netw. 2024 Aug 10;179:106623. doi: 10.1016/j.neunet.2024.106623. Online ahead of print.
ABSTRACT
LiDAR point clouds can effectively depict the motion and posture of objects in three-dimensional space. Many studies accomplish 3D object detection by voxelizing point clouds. However, in autonomous driving scenarios, the sparsity and hollowness of point clouds create difficulties for voxel-based methods: sparsity makes it challenging to describe the geometric features of objects, and hollowness hinders the aggregation of 3D features. We propose a two-stage 3D object detection framework, called MS²3D. (1) We propose a method that uses voxel feature points from multiple branches to construct the 3D feature layer. Using voxel feature points from different branches, we construct a relatively compact 3D feature layer with rich semantic features. Additionally, we propose a distance-weighted sampling method that reduces the loss of foreground points caused by downsampling, allowing the 3D feature layer to retain more foreground points. (2) In response to the hollowness of point clouds, we predict the offsets between deep-level feature points and the object's centroid, moving them as close as possible to the centroid so that feature points with abundant semantic features can be aggregated. Feature points from shallow levels are retained on the object's surface to describe its geometric features. We validated the effectiveness of our approach on both the KITTI and ONCE datasets.
PMID:39154419 | DOI:10.1016/j.neunet.2024.106623
Speed and efficiency: evaluating pulmonary nodule detection with AI-enhanced 3D gradient echo imaging
Eur Radiol. 2024 Aug 18. doi: 10.1007/s00330-024-11027-5. Online ahead of print.
ABSTRACT
OBJECTIVES: To evaluate the diagnostic feasibility of accelerated pulmonary MR imaging for the detection and characterisation of pulmonary nodules using artificial intelligence-aided compressed sensing.
MATERIALS AND METHODS: In this prospective trial, patients with benign and malignant lung nodules admitted between December 2021 and December 2022 underwent chest CT and pulmonary MRI. Pulmonary MRI used a respiratory-gated 3D gradient echo sequence, accelerated with a combination of parallel imaging, compressed sensing, and deep learning image reconstruction with three different acceleration factors (CS-AI-7, CS-AI-10, and CS-AI-15). Two readers evaluated image quality (5-point Likert scale), nodule detection and characterisation (size and morphology) of all sequences compared to CT in a blinded setting. Reader agreement was determined using the intraclass correlation coefficient (ICC).
RESULTS: Thirty-seven patients with 64 pulmonary nodules (solid: n = 57 [3-107 mm]; part-solid: n = 6 [ground glass/solid 8 mm/4-28 mm/16 mm]; ground-glass: n = 1 [20 mm]) were analysed. Nominal scan times were 3:53 min for CS-AI-7, 2:34 min for CS-AI-10, and 1:50 min for CS-AI-15. CS-AI-7 showed the highest image quality, while quality remained diagnostic even for CS-AI-15. Detection rates of pulmonary nodules were 100%, 98.4%, and 96.8% for CS-AI factors 7, 10, and 15, respectively. Nodule morphology was best at the lowest acceleration and was inferior to CT in only 5% of cases, compared with 10% for CS-AI-10 and 23% for CS-AI-15. Nodule size was comparable for all sequences and deviated on average < 1 mm from the CT size.
CONCLUSION: The combination of compressed sensing and AI enables a substantial reduction in the scan time of lung MRI while maintaining a high detection rate of pulmonary nodules.
CLINICAL RELEVANCE STATEMENT: Incorporating compressed sensing and AI in pulmonary MRI achieves significant time savings without compromising nodule detection or characteristics. This advancement holds clinical promise, enhancing efficiency in lung cancer screening without sacrificing diagnostic quality.
KEY POINTS: Lung cancer screening by MRI may be possible but would benefit from scan time optimisation. Significant scan time reduction, high detection rates, and preserved nodule characteristics were achieved across different acceleration factors. Integrating compressed sensing and AI in pulmonary MRI offers efficient lung cancer screening without compromising diagnostic quality.
PMID:39154315 | DOI:10.1007/s00330-024-11027-5
A systematic evaluation of computational methods for cell segmentation
Brief Bioinform. 2024 Jul 25;25(5):bbae407. doi: 10.1093/bib/bbae407.
ABSTRACT
Cell segmentation is a fundamental task in analyzing biomedical images. Many computational methods have been developed for cell segmentation and instance segmentation, but their performance is not well understood across scenarios. We systematically evaluated 18 segmentation methods for cell nucleus and whole-cell segmentation using light microscopy and fluorescence staining images. We found that general-purpose methods incorporating the attention mechanism exhibit the best overall performance. We identified various factors influencing segmentation performance, including image channels, choice of training data, and cell morphology, and evaluated the generalizability of the methods across image modalities. We also provide guidelines for choosing the optimal segmentation method in various real application scenarios. Finally, we developed Seggal, an online resource providing segmentation models pre-trained on various tissue and cell types, substantially reducing the time and effort needed to train cell segmentation models.
PMID:39154193 | DOI:10.1093/bib/bbae407
A method for extracting buildings from remote sensing images based on 3DJA-UNet3+
Sci Rep. 2024 Aug 17;14(1):19067. doi: 10.1038/s41598-024-70019-z.
ABSTRACT
Building extraction aims to extract building pixels from remote sensing imagery, which plays a significant role in urban planning, dynamic urban monitoring, and many other applications. UNet3+ is widely applied to building extraction from remote sensing images. However, it still faces issues such as low segmentation accuracy, imprecise boundary delineation, and model complexity. Therefore, building on the UNet3+ model, this paper proposes a 3D Joint Attention (3DJA) module that effectively enhances the correlation between local and global features, obtains more accurate object semantic information, and strengthens feature representation. The 3DJA module models semantic interdependence in the vertical and horizontal dimensions to obtain spatial encoding information for the feature maps, and in the channel dimension to increase the correlation between dependent channel maps. In addition, a bottleneck module is constructed to reduce the number of network parameters and improve model training efficiency. Extensive experiments are conducted on the publicly accessible WHU, INRIA, and Massachusetts building datasets, and the benchmark models BOMSC-Net, CVNet, SCA-Net, SPCL-Net, ACMFNet, and MFCF-Net are selected for comparison with the proposed 3DJA-UNet3+ model. The experimental results show that 3DJA-UNet3+ achieves competitive results on three evaluation metrics: overall accuracy, mean intersection over union, and F1-score. The code will be available at https://github.com/EnjiLi/3DJA-UNet3Plus.
PMID:39154127 | DOI:10.1038/s41598-024-70019-z
Research on sports image classification method based on SE-RES-CNN model
Sci Rep. 2024 Aug 17;14(1):19087. doi: 10.1038/s41598-024-69965-5.
ABSTRACT
As computer image processing and digital technologies advance, an efficient method for classifying sports images is crucial for the rapid retrieval and management of large image datasets. Traditional manual methods for classifying sports images are impractical for large-scale data and often inaccurate when distinguishing similar images. This paper introduces an SE module that adaptively adjusts the weights of input feature-map channels, and a Res module that excels in deep feature extraction, prevents vanishing gradients, supports multi-scale processing, and enhances generalization in image recognition. After extensive experimentation with network structure adjustments, the resulting SE-RES-CNN neural network model is applied to sports image classification. The model is trained on a sports image classification dataset from Kaggle, alongside VGG-16 and ResNet50 models. Training results show that the proposed SE-RES-CNN model improves classification accuracy by approximately 5% compared with VGG-16 and ResNet50. Testing showed that the SE-RES-CNN model classifies 100 out of 500 sports images in 6 s, achieving an accuracy of up to 98% with a single-prediction time of 0.012 s. These results validate the model's accuracy and effectiveness, significantly enhancing the efficiency of sports image retrieval and classification.
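To make the SE-Res combination concrete, a generic residual block with squeeze-excitation recalibration is sketched below in PyTorch; channel counts and placement are assumptions rather than the paper's exact SE-RES-CNN design.

```python
import torch.nn as nn

class SEResBlock(nn.Module):
    """Generic residual block with SE recalibration on the residual branch."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.se = nn.Sequential(                 # channel attention weights
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv(x)
        w = self.se(out).unsqueeze(-1).unsqueeze(-1)
        return self.relu(x + out * w)            # skip connection aids gradient flow
```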
PMID:39154107 | DOI:10.1038/s41598-024-69965-5
Multimodal MRI-based deep-radiomics model predicts response in cervical cancer treated with neoadjuvant chemoradiotherapy
Sci Rep. 2024 Aug 17;14(1):19090. doi: 10.1038/s41598-024-70055-9.
ABSTRACT
Platinum-based neoadjuvant chemotherapy (NACT) followed by radical hysterectomy has been proposed as an alternative treatment for patients with stage Ib2-IIb cervical cancer (CC) who have a strong desire to be treated with surgery. Our study aims to develop a multimodal MRI-based model using radiomics and deep learning to predict the treatment response of CC patients treated with neoadjuvant chemoradiotherapy (NACRT). From August 2009 to June 2013, CC patients in stage Ib2-IIb (FIGO 2008) who received NACRT at Fujian Cancer Hospital were enrolled. Clinical information, contrast-enhanced T1-weighted imaging (CE-T1WI), and T2-weighted imaging (T2WI) data were collected. Radiomic features and deep abstract features were extracted from the images using radiomics and deep learning models, respectively. ElasticNet and SVM-RFE were then employed for feature selection to construct four single-sequence feature sets; early fusion yielded two multi-sequence feature sets and one hybrid feature set, which were fed to four machine learning classifiers. Model performance in predicting the response to NACRT was evaluated by separating patients into training and validation sets, and overall survival (OS) and disease-free survival (DFS) were assessed using Kaplan-Meier survival curves. Among the four machine learning models, SVM exhibited the best predictive performance (AUC = 0.86). Among the seven feature sets, the hybrid feature set achieved the highest AUC (0.86), accuracy (0.75), recall (0.75), precision (0.81), and F1-score (0.75) in the validation set, outperforming the other feature sets. Furthermore, the predicted outcomes were closely associated with patient OS and DFS (p = 0.0044; p = 0.0039). A model combining features from multiple MRI sequences and multiple extraction methods can precisely predict the response to NACRT in CC patients, assisting clinicians in devising personalized treatment plans and predicting patient survival outcomes.
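As a sketch of the selection-plus-classification stack in scikit-learn: chaining ElasticNet screening, SVM-RFE, and an SVM classifier into one pipeline is our simplification of the workflow described above, and all hyperparameters are assumptions.

```python
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import ElasticNet
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: (n_patients, n_features) radiomic + deep features; y: NACRT response labels.
pipeline = make_pipeline(
    StandardScaler(),
    SelectFromModel(ElasticNet(alpha=0.01)),             # ElasticNet screening
    RFE(SVC(kernel="linear"), n_features_to_select=20),  # SVM-RFE refinement
    SVC(kernel="rbf", probability=True),                 # final SVM classifier
)
# pipeline.fit(X_train, y_train)
# auc = roc_auc_score(y_val, pipeline.predict_proba(X_val)[:, 1])
```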
PMID:39154103 | DOI:10.1038/s41598-024-70055-9