Deep learning

An explainable deep learning model for diabetic foot ulcer classification using Swin transformer and efficient multi-scale attention-driven network

Mon, 2025-02-03 06:00

Sci Rep. 2025 Feb 3;15(1):4057. doi: 10.1038/s41598-025-87519-1.

ABSTRACT

Diabetic Foot Ulcer (DFU) is a severe complication of diabetes mellitus, resulting in significant health and socio-economic challenges for the diagnosed individual. Severe cases of DFU can lead to lower limb amputation in diabetic patients, and diagnosis remains a complex and costly process that poses challenges for medical professionals. Manual identification of DFUs is particularly difficult due to their diverse visual characteristics, leaving many cases undiagnosed. To address this challenge, Deep Learning (DL) methods offer an efficient and automated approach to facilitate timely treatment and improve patient outcomes. This research proposes a novel feature fusion-based model that incorporates two parallel tracks for efficient feature extraction. The first track utilizes the Swin transformer, which captures long-range dependencies by employing shifted windows and self-attention mechanisms. The second track involves the Efficient Multi-Scale Attention-Driven Network (EMADN), which leverages Light-weight Multi-scale Deformable Shuffle (LMDS) and Global Dilated Attention (GDA) blocks to extract local features efficiently. These blocks dynamically adjust kernel sizes and leverage attention modules, enabling effective feature extraction. To the best of our knowledge, this is the first work reporting a dual-track architecture for DFU classification that leverages the Swin transformer and EMADN networks. The feature maps obtained from both networks are concatenated and subjected to shuffle attention for feature refinement at a reduced computational cost. The proposed work also incorporates Grad-CAM-based Explainable Artificial Intelligence (XAI) to visualize and interpret the decision-making of the network. The proposed model demonstrated better performance on the DFUC-2021 dataset, surpassing existing works and pre-trained CNN architectures with an accuracy of 78.79% and a macro F1-score of 80%.
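For reference, the macro F1-score reported above is the unweighted mean of per-class F1 scores. A minimal sketch in Python (the class labels and predictions below are hypothetical, not the DFUC-2021 label set):

```python
def macro_f1(y_true, y_pred, classes):
    """Average the per-class F1 scores with equal class weight."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical 3-class toy example.
y_true = ["none", "infection", "ischaemia", "none", "infection"]
y_pred = ["none", "infection", "none", "none", "ischaemia"]
print(round(macro_f1(y_true, y_pred, ["none", "infection", "ischaemia"]), 3))
```

Unlike plain accuracy, this metric weights rare and common classes equally, which matters for imbalanced DFU categories.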

PMID:39900977 | DOI:10.1038/s41598-025-87519-1

Categories: Literature Watch

Enhancing depression recognition through a mixed expert model by integrating speaker-related and emotion-related features

Mon, 2025-02-03 06:00

Sci Rep. 2025 Feb 3;15(1):4064. doi: 10.1038/s41598-025-88313-9.

ABSTRACT

The World Health Organization predicts that by 2030, depression will be the most common mental disorder, significantly affecting individuals, families, and society. Speech, as a sensitive indicator, reveals noticeable acoustic changes linked to physiological and cognitive variations, making it a crucial behavioral marker for detecting depression. However, existing studies often overlook the separation of speaker-related and emotion-related features in speech when recognizing depression. To tackle this challenge, we propose a Mixture-of-Experts (MoE) method that integrates speaker-related and emotion-related features for depression recognition. Our approach begins with a Time Delay Neural Network to pre-train a speaker-related feature extractor using a large-scale speaker recognition dataset while simultaneously pre-training a speaker's emotion-related feature extractor with a speech emotion dataset. We then apply transfer learning to extract both features from a depression dataset, followed by fusion. A multi-domain adaptation algorithm trains the MoE model for depression recognition. Experimental results demonstrate that our method achieves 74.3% accuracy on a self-built Chinese localized depression dataset and an MAE of 6.32 on the AVEC2014 dataset. Thus, it outperforms state-of-the-art deep learning methods that use speech features. Additionally, our approach shows strong performance across Chinese and English speech datasets, highlighting its effectiveness in addressing cultural variations.
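The Mixture-of-Experts fusion described above can be illustrated schematically: a gate weights each expert's class probabilities, and the weighted average is the final prediction. The numbers below are toy values, not the paper's trained model:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_predict(gate_scores, expert_probs):
    """Gate-weighted average of each expert's class probabilities."""
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, expert_probs))
            for c in range(n_classes)]

# Two hypothetical experts: one fed speaker-related features, one fed
# emotion-related features; two classes (depressed / not depressed).
speaker_expert = [0.30, 0.70]
emotion_expert = [0.80, 0.20]
probs = moe_predict([0.2, 1.0], [speaker_expert, emotion_expert])
assert abs(sum(probs) - 1.0) < 1e-9
```

The gate here favors the second expert, so the fused prediction leans toward its output; in the actual model the gate scores would themselves be learned from the input.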

PMID:39900968 | DOI:10.1038/s41598-025-88313-9

Categories: Literature Watch

A mechanism-informed deep neural network enables prioritization of regulators that drive cell state transitions

Mon, 2025-02-03 06:00

Nat Commun. 2025 Feb 3;16(1):1284. doi: 10.1038/s41467-025-56475-9.

ABSTRACT

Cells are regulated at multiple levels, from regulations of individual genes to interactions across multiple genes. Some recent neural network models can connect molecular changes to cellular phenotypes, but their design lacks modeling of regulatory mechanisms, limiting the decoding of regulations behind key cellular events, such as cell state transitions. Here, we present regX, a deep neural network incorporating both gene-level regulation and gene-gene interaction mechanisms, which enables prioritizing potential driver regulators of cell state transitions and providing mechanistic interpretations. Applied to single-cell multi-omics data on type 2 diabetes and hair follicle development, regX reliably prioritizes key transcription factors and candidate cis-regulatory elements that drive cell state transitions. Some regulators reveal potential new therapeutic targets, drug repurposing possibilities, and putative causal single nucleotide polymorphisms. This method to analyze single-cell multi-omics data demonstrates how the interpretable design of neural networks can better decode biological systems.

PMID:39900922 | DOI:10.1038/s41467-025-56475-9

Categories: Literature Watch

Automated contouring for breast cancer radiotherapy in the isocentric lateral decubitus position: a neural network-based solution for enhanced precision and efficiency

Mon, 2025-02-03 06:00

Strahlenther Onkol. 2025 Feb 3. doi: 10.1007/s00066-024-02364-x. Online ahead of print.

ABSTRACT

BACKGROUND: Adjuvant radiotherapy is essential for reducing local recurrence and improving survival in breast cancer patients, but it carries a risk of ischemic cardiac toxicity, which increases with heart exposure. The isocentric lateral decubitus position, where the breast rests flat on a support, reduces heart exposure and leads to delivery of a more uniform dose. This position is particularly beneficial for patients with unique anatomies, such as those with pectus excavatum or larger breast sizes. While artificial intelligence (AI) algorithms for autocontouring have shown promise, they have not been tailored to this specific position. This study aimed to develop and evaluate a neural network-based autocontouring algorithm for patients treated in the isocentric lateral decubitus position.

MATERIALS AND METHODS: In this single-center study, 1189 breast cancer patients treated after breast-conserving surgery were included. Their simulation CT scans (1209 scans) were used to train and validate a neural network-based autocontouring algorithm (nnU-Net). Of these, 1087 scans were used for training, and 122 scans were reserved for validation. The algorithm's performance was assessed using the Dice similarity coefficient (DSC) to compare the automatically delineated volumes with manual contours. A clinical evaluation of the algorithm was performed on 30 additional patients, with contours rated by two expert radiation oncologists.

RESULTS: The neural network-based algorithm achieved a segmentation time of approximately 4 min, compared to 20 min for manual segmentation. The DSC values for the validation cohort were 0.88 for the treated breast, 0.90 for the heart, 0.98 for the right lung, and 0.97 for the left lung. In the clinical evaluation, 90% of the automatically contoured breast volumes were rated as acceptable without corrections, while the remaining 10% required minor adjustments. All lung contours were accepted without corrections, and heart contours were rated as acceptable in 93.3% of cases, with minor corrections needed in 6.6% of cases.
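For reference, the Dice similarity coefficient (DSC) reported above measures overlap between the automatic and manual masks, DSC = 2|A∩B| / (|A| + |B|). A minimal sketch on toy flattened binary masks:

```python
def dice(mask_a, mask_b):
    """DSC = 2*|A intersect B| / (|A| + |B|) for flat binary masks."""
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    size = sum(mask_a) + sum(mask_b)
    return 2 * inter / size if size else 1.0

# Toy 1-D masks standing in for flattened contour voxels.
manual = [1, 1, 1, 0, 0, 1]
auto   = [1, 1, 0, 0, 1, 1]
print(dice(manual, auto))
```

A DSC of 1.0 means identical contours; the 0.88-0.98 values in the validation cohort indicate close but not perfect agreement with the manual reference.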

CONCLUSION: This neural network-based autocontouring algorithm offers a practical, time-saving solution for breast cancer radiotherapy planning in the isocentric lateral decubitus position. Its strong geometric performance, clinical acceptability, and significant time efficiency make it a valuable tool for modern radiotherapy practices, particularly in high-volume centers.

PMID:39900818 | DOI:10.1007/s00066-024-02364-x

Categories: Literature Watch

Artificial intelligence in arthroplasty

Mon, 2025-02-03 06:00

Orthopadie (Heidelb). 2025 Feb 3. doi: 10.1007/s00132-025-04619-6. Online ahead of print.

ABSTRACT

BACKGROUND: Artificial intelligence is very likely to be a pioneering technology in arthroplasty, with a wide range of pre-, intra- and post-operative applications. The opportunities for patients, doctors and healthcare policy are considerable, especially in the context of optimized and individualized patient care.

DATA AVAILABILITY: Despite these diverse possibilities, there are currently only a few AI applications in routine clinical practice, mainly due to the limited availability of analyzable health data. AI systems are only as good as the data they are trained with. If the data is insufficient, incomplete or biased, the AI may draw false conclusions. The current results of such AI applications in arthroplasty must, therefore, be viewed critically, especially as previous databases were not designed a priori for AI applications.

PROSPECTS: The successful integration of AI, therefore, requires a targeted focus on the development of a specific data structure. In order to exploit the full potential of AI, comprehensive clinical data volumes are required, which can only be realized through a multicentric approach. In this context, ethical and data protection issues remain a further question, and not only in orthopaedics. Cooperative efforts at national and international levels are, therefore, essential in order to research and develop new AI applications.

PMID:39900780 | DOI:10.1007/s00132-025-04619-6

Categories: Literature Watch

De Novo Synthesis of Reticuline and Taxifolin Using Re-engineered Homologous Recombination in Yarrowia lipolytica

Mon, 2025-02-03 06:00

ACS Synth Biol. 2025 Feb 3. doi: 10.1021/acssynbio.4c00853. Online ahead of print.

ABSTRACT

Yarrowia lipolytica has been widely engineered as a eukaryotic cell factory to produce various important compounds. However, the difficulty of gene editing and the lack of efficient neutral sites make rewiring of Y. lipolytica metabolism challenging. Herein, a Cas9 system was established to redesign the Y. lipolytica homologous recombination system, which caused a more than 56-fold increase in the HR efficiency. The fusion expression of the hBrex27 sequence in the C-terminus of Cas9 recruited more Rad51 protein, and the engineered Cas9 decreased NHEJ, achieving 85% single-gene positive efficiency and 25% multigene editing efficiency. With this system, neutral sites on different chromosomes were characterized, and a deep learning model was developed for gRNA activity prediction, thus providing the corresponding integration efficiency and expression intensity. Subsequently, the tool and platform strains were validated by applying them for the de novo synthesis of (S)-reticuline and (2S)-taxifolin. The developed platform strains and tools helped transform Y. lipolytica into an easy-to-operate model cell factory, similar to Saccharomyces cerevisiae.

PMID:39899813 | DOI:10.1021/acssynbio.4c00853

Categories: Literature Watch

Machine Learning-Enabled Drug-Induced Toxicity Prediction

Mon, 2025-02-03 06:00

Adv Sci (Weinh). 2025 Feb 3:e2413405. doi: 10.1002/advs.202413405. Online ahead of print.

ABSTRACT

Unexpected toxicity has become a significant obstacle to drug candidate development, accounting for 30% of drug discovery failures. Traditional toxicity assessment through animal testing is costly and time-consuming. Big data and artificial intelligence (AI), especially machine learning (ML), are robustly contributing to innovation and progress in toxicology research. However, the optimal AI model for different types of toxicity usually varies, making it essential to conduct comparative analyses of AI methods across toxicity domains. The diverse data sources also pose challenges for researchers focusing on specific toxicity studies. In this review, 10 categories of drug-induced toxicity are examined, summarizing the characteristics and applicable ML models, including both predictive and interpretable algorithms, striking a balance between breadth and depth. Key databases and tools used in toxicity prediction are also highlighted, including toxicology, chemical, multi-omics, and benchmark databases, organized by their focus and function to clarify their roles in drug-induced toxicity prediction. Finally, strategies to turn challenges into opportunities are analyzed and discussed. This review may provide researchers with a valuable reference for understanding and utilizing the available resources to bridge prediction and mechanistic insights, and further advance the application of ML in drug-induced toxicity prediction.

PMID:39899688 | DOI:10.1002/advs.202413405

Categories: Literature Watch

Adaptive wavelet base selection for deep learning-based ECG diagnosis: A reinforcement learning approach

Mon, 2025-02-03 06:00

PLoS One. 2025 Feb 3;20(2):e0318070. doi: 10.1371/journal.pone.0318070. eCollection 2025.

ABSTRACT

Electrocardiogram (ECG) signals are crucial in diagnosing cardiovascular diseases (CVDs). While wavelet-based feature extraction has demonstrated effectiveness in deep learning (DL)-based ECG diagnosis, selecting the optimal wavelet base poses a significant challenge, as it directly influences feature quality and diagnostic accuracy. Traditional methods typically rely on fixed wavelet bases chosen heuristically or through trial-and-error, which can fail to cover the distinct characteristics of individual ECG signals, leading to suboptimal performance. To address this limitation, we propose a reinforcement learning-based wavelet base selection (RLWBS) framework that dynamically customizes the wavelet base for each ECG signal. In this framework, a reinforcement learning (RL) agent iteratively optimizes its wavelet base selection (WBS) strategy based on successive feedback of classification performance, aiming to achieve progressively optimized feature extraction. Experiments conducted on the clinically collected PTB-XL dataset for ECG abnormality classification show that the proposed RLWBS framework could obtain more detailed time-frequency representation of ECG signals, yielding enhanced diagnostic performance compared to traditional WBS approaches.
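The RL-driven selection loop can be caricatured as a bandit problem: each wavelet base is an action and downstream classification performance is the reward. The sketch below (wavelet names and reward values hypothetical) is far simpler than the proposed RLWBS framework, which conditions the choice on each individual ECG signal:

```python
import random

def select_wavelet(bases, reward_fn, episodes=200, eps=0.1, seed=0):
    """Toy epsilon-greedy bandit: pick the base whose (noisy) reward,
    standing in for downstream classification accuracy, is highest."""
    rng = random.Random(seed)
    counts = {b: 0 for b in bases}
    values = {b: 0.0 for b in bases}
    for _ in range(episodes):
        if rng.random() < eps:
            b = rng.choice(bases)          # explore
        else:
            b = max(bases, key=lambda x: values[x])  # exploit
        r = reward_fn(b, rng)
        counts[b] += 1
        values[b] += (r - values[b]) / counts[b]  # incremental mean
    return max(bases, key=lambda x: values[x])

# Hypothetical reward landscape: 'db4' yields the best noisy accuracy.
truth = {"haar": 0.70, "db4": 0.85, "sym5": 0.78}
best = select_wavelet(list(truth), lambda b, rng: truth[b] + rng.gauss(0, 0.05))
print(best)
```

The actual RLWBS agent adapts per signal rather than learning one global best base, but the feedback loop (choose base, observe classification reward, update policy) has this shape.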

PMID:39899639 | DOI:10.1371/journal.pone.0318070

Categories: Literature Watch

Capturing continuous, long timescale behavioral changes in Drosophila melanogaster postural data

Mon, 2025-02-03 06:00

PLoS Comput Biol. 2025 Feb 3;21(2):e1012753. doi: 10.1371/journal.pcbi.1012753. Online ahead of print.

ABSTRACT

Animal behavior spans many timescales, from short, seconds-scale actions to daily rhythms over many hours to life-long changes during aging. To access longer timescales of behavior, we continuously recorded individual Drosophila melanogaster at 100 frames per second for up to 7 days at a time in featureless arenas on sucrose-agarose media. We use the deep learning framework SLEAP to produce a full-body postural dataset for 47 individuals resulting in nearly 2 billion pose instances. We identify stereotyped behaviors such as grooming, proboscis extension, and locomotion and use the resulting ethograms to explore how the flies' behavior varies across time of day and days in the experiment. We find distinct daily patterns in all stereotyped behaviors, adding specific information about trends in different grooming modalities, proboscis extension duration, and locomotion speed to what is known about the D. melanogaster circadian cycle. Using our holistic measurements of behavior, we find that the hour after dawn is a unique time point in the flies' daily pattern of behavior, and that the behavioral composition of this hour tracks well with other indicators of health such as locomotion speed and the fraction of time spent moving vs. resting. The method, data, and analysis presented here give us a new and clearer picture of D. melanogaster behavior across timescales, revealing novel features that hint at unexplored underlying biological mechanisms.

PMID:39899595 | DOI:10.1371/journal.pcbi.1012753

Categories: Literature Watch

Unsupervised monocular depth estimation with omnidirectional camera for 3D reconstruction of grape berries in the wild

Mon, 2025-02-03 06:00

PLoS One. 2025 Feb 3;20(2):e0317359. doi: 10.1371/journal.pone.0317359. eCollection 2025.

ABSTRACT

Japanese table grapes are quite expensive because their production is highly labor-intensive. In particular, grape berry pruning is a labor-intensive task performed to produce grapes with desirable characteristics. Because it is considered difficult to master, it is desirable to assist new entrants by using information technology to show the recommended berries to cut. In this research, we aim to build a system that identifies which grape berries should be removed during the pruning process. To realize this, the 3D positions of individual grape berries need to be estimated. Our environmental restriction is that bunches hang from trellises at a height of about 1.6 meters in the grape orchards outside. It is hard to use depth sensors in such circumstances, and using an omnidirectional camera with a wide field of view is desired for the convenience of shooting videos. Obtaining 3D information of grape berries from videos is challenging because they have textureless surfaces, highly symmetric shapes, and crowded arrangements. For these reasons, it is hard to use conventional 3D reconstruction methods, which rely on matching local unique features. To satisfy the practical constraints of this task, we extend a deep learning-based unsupervised monocular depth estimation method to an omnidirectional camera and propose using it. Our experiments demonstrate the effectiveness of the proposed method for estimating the 3D positions of grape berries in the wild.

PMID:39899513 | DOI:10.1371/journal.pone.0317359

Categories: Literature Watch

AFMDD: Analyzing Functional Connectivity Feature of Major Depressive Disorder by Graph Neural Network-Based Model

Mon, 2025-02-03 06:00

J Comput Biol. 2025 Feb 3. doi: 10.1089/cmb.2024.0505. Online ahead of print.

ABSTRACT

The extraction of biomarkers from functional connectivity (FC) in the brain is of great significance for the diagnosis of mental disorders. In recent years, with the development of deep learning, several methods have been proposed to assist in the diagnosis of depression and promote its automatic identification. However, these methods still have some limitations. The current approaches overlook the importance of subgraphs in brain graphs, resulting in low accuracy. Using these methods with low accuracy for FC analysis may lead to unreliable results. To address these issues, we have designed a graph neural network-based model called AFMDD, specifically for analyzing FC features of depression and depression identification. Through experimental validation, our model has demonstrated excellent performance in depression diagnosis, achieving an accuracy of 73.15%, surpassing many state-of-the-art methods. In our study, we conducted visual analysis of nodes and edges in the FC networks of depression and identified several novel FC features. Those findings may provide valuable clues for the development of biomarkers for the clinical diagnosis of depression.

PMID:39899351 | DOI:10.1089/cmb.2024.0505

Categories: Literature Watch

Automated Patient-specific Quality Assurance for Automated Segmentation of Organs at Risk in Nasopharyngeal Carcinoma Radiotherapy

Mon, 2025-02-03 06:00

Cancer Control. 2025 Jan-Dec;32:10732748251318387. doi: 10.1177/10732748251318387.

ABSTRACT

INTRODUCTION: Precision radiotherapy relies on accurate segmentation of tumor targets and organs at risk (OARs). Clinicians manually review automatically delineated structures on a case-by-case basis, a time-consuming process dependent on reviewer experience and alertness. This study proposes a general process for automated threshold generation for structural evaluation indicators and patient-specific quality assurance (QA) for automated segmentation of nasopharyngeal carcinoma (NPC).

METHODS: The patient-specific QA process for automated segmentation involves determining the confidence limit and error structure highlight stage. Three expert physicians segmented 17 OARs using computed tomography images of NPC and compared them using the Dice similarity coefficient, the maximum Hausdorff distance, and the mean distance to agreement. For each OAR, the 95% confidence interval was calculated as the confidence limit for each indicator. If two or more evaluation indicators (N2) or one or more evaluation indicators (N1) exceeded the confidence limits, the structure segmentation result was considered abnormal. The quantitative performances of these two methods were compared with those obtained by artificially introducing small/medium and serious errors.
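The flagging rule described here can be sketched as follows. The mean ± 1.96·SD limits and the toy reference values are illustrative assumptions, not the study's actual thresholds:

```python
def confidence_limit(values, z=1.96):
    """Two-sided 95% limits (mean +/- z*sd) for one evaluation indicator."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = var ** 0.5
    return mean - z * sd, mean + z * sd

def flag_structure(indicator_values, limits, min_violations=2):
    """Flag a structure as abnormal when at least `min_violations`
    indicators exceed their limits (N2 rule; use 1 for the N1 rule)."""
    violations = sum(not (lo <= v <= hi)
                     for v, (lo, hi) in zip(indicator_values, limits))
    return violations >= min_violations

# Hypothetical reference cohort for the DSC of one OAR, plus assumed
# acceptance ranges for Hausdorff distance and mean distance (in mm).
dsc_ref = [0.90, 0.92, 0.88, 0.91, 0.89, 0.90, 0.93, 0.87]
limits = [confidence_limit(dsc_ref), (0.0, 6.0), (0.0, 2.0)]
print(flag_structure([0.70, 7.5, 1.0], limits))  # low DSC and high HD
```

A structure with a single out-of-range indicator would pass under N2 but be highlighted under N1, matching the sensitivity/specificity trade-off reported in the results.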

RESULTS: The sensitivity, specificity, balanced accuracy, and F-score values for N2 were 0.944 ± 0.052, 0.827 ± 0.149, 0.886 ± 0.076, and 0.936 ± 0.045, respectively, whereas those for N1 were 0.955 ± 0.045, 0.788 ± 0.189, 0.878 ± 0.096, and 0.948 ± 0.035, respectively. N2 and N1 had small/medium error detection rates of 97.67 ± 0.04% and 98.67 ± 0.04%, respectively, with a serious error detection rate of 100%.

CONCLUSION: The proposed automated patient-specific QA process effectively detected segmentation abnormalities, particularly serious errors. These are crucial for enhancing review efficiency and automated segmentation, and for improving physician confidence in automated segmentation.

PMID:39899269 | DOI:10.1177/10732748251318387

Categories: Literature Watch

A Multi-View Feature-Based Interpretable Deep Learning Framework for Drug-Drug Interaction Prediction

Mon, 2025-02-03 06:00

Interdiscip Sci. 2025 Feb 3. doi: 10.1007/s12539-025-00687-6. Online ahead of print.

ABSTRACT

Drug-drug interactions (DDIs) can result in deleterious consequences when patients take multiple medications simultaneously, emphasizing the critical need for accurate DDI prediction. Computational methods for DDI prediction have garnered recent attention. However, current approaches concentrate solely on single-view features, such as atomic-view or substructure-view features, limiting predictive capacity. Research on interpretability based on multi-view features, which is crucial for tracing interactions, also remains scarce. Addressing this gap, we present MI-DDI, a multi-view feature-based interpretable deep learning framework for DDI. To fully extract multi-view features, we employ a Message Passing Neural Network (MPNN) to learn atomic features from molecular graphs generated by RDKit, and transformer encoders are used to learn substructure-view embeddings from drug SMILES simultaneously. These atomic-view and substructure-view features are then amalgamated into a holistic drug embedding matrix. Subsequently, an intricately designed interaction module not only establishes a tractable path for understanding interactions but also directly informs the construction of weight matrices, enabling precise and interpretable interaction predictions. Validation on the BIOSNAP dataset and DrugBank dataset demonstrates MI-DDI's superiority. It surpasses the current benchmarks by a substantial average of 3% on BIOSNAP and 1% on DrugBank. Additional experiments underscore the significance of atomic-view information for DDI prediction and confirm that our interaction module indeed learns more effective information for DDI prediction. The source codes are available at https://github.com/ZihuiCheng/MI-DDI .

PMID:39899225 | DOI:10.1007/s12539-025-00687-6

Categories: Literature Watch

Multi-modal dataset creation for federated learning with DICOM-structured reports

Mon, 2025-02-03 06:00

Int J Comput Assist Radiol Surg. 2025 Feb 3. doi: 10.1007/s11548-025-03327-y. Online ahead of print.

ABSTRACT

PURPOSE: Federated training is often challenging on heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization, including a uniform data representation and filtering options, is of paramount importance.

METHODS: DICOM-structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration with interactive filtering capabilities, thereby simplifying the creation of patient cohorts over several sites with consistent multi-modal data.

RESULTS: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data include imaging and waveform data (i.e., computed tomography images, electrocardiography scans) as well as annotations (i.e., calcification segmentations and pointsets) and metadata (i.e., prostheses and pacemaker dependency).

CONCLUSION: Structured reports bridge the traditional gap between imaging systems and information systems. Utilizing the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for multi-centric data analysis. The graphical interface as well as example structured report templates are available at https://github.com/Cardio-AI/fl-multi-modal-dataset-creation .

PMID:39899185 | DOI:10.1007/s11548-025-03327-y

Categories: Literature Watch

Multi-scale dual attention embedded U-shaped network for accurate segmentation of coronary vessels in digital subtraction angiography

Mon, 2025-02-03 06:00

Med Phys. 2025 Feb 3. doi: 10.1002/mp.17618. Online ahead of print.

ABSTRACT

BACKGROUND: Most attention-based networks fall short in effectively integrating spatial and channel-wise information across different scales, which results in suboptimal performance for segmenting coronary vessels in x-ray digital subtraction angiography (DSA) images. This limitation becomes particularly evident when attempting to identify tiny sub-branches.

PURPOSE: To address this limitation, a multi-scale dual attention embedded network (named MDA-Net) is proposed to consolidate contextual spatial and channel information across contiguous levels and scales.

METHODS: MDA-Net employs five cascaded double-convolution blocks within its encoder to adeptly extract multi-scale features. It incorporates skip connections that facilitate the retention of low-level feature details throughout the decoding phase, thereby enhancing the reconstruction of detailed image information. Furthermore, MDA modules, which take in features from neighboring scales and hierarchical levels, are tasked with discerning subtle distinctions between foreground elements, such as coronary vessels of diverse morphologies and dimensions, and the complex background, which includes structures like catheters or other tissues with analogous intensities. To sharpen the segmentation accuracy, the network utilizes a composite loss function that integrates intersection over union (IoU) loss with binary cross-entropy loss, ensuring the precision of the segmentation outcomes and maintaining an equilibrium between positive and negative classifications.
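The composite loss described above can be sketched as a soft IoU term plus binary cross-entropy over pixel probabilities. Equal weighting of the two terms is an assumption here, since the abstract does not give the weights:

```python
import math

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def soft_iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU, treating probabilities as soft set memberships."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)

def composite_loss(pred, target, w_iou=1.0, w_bce=1.0):
    return w_iou * soft_iou_loss(pred, target) + w_bce * bce_loss(pred, target)

pred   = [0.9, 0.8, 0.2, 0.1]  # predicted vessel probabilities (toy)
target = [1.0, 1.0, 0.0, 0.0]  # ground-truth mask (toy)
loss = composite_loss(pred, target)
assert loss > 0.0
```

Pairing an overlap-based term with a pixel-wise term is a common way to balance region agreement against per-pixel calibration, which is the equilibrium between positive and negative classes the abstract refers to.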

RESULTS: Experimental results demonstrate that MDA-Net not only performs more robustly and effectively on DSA images under various image conditions, but also achieves significant advantages over state-of-the-art methods, attaining the best scores in terms of IoU, Dice, accuracy, and the 95% Hausdorff distance.

CONCLUSIONS: MDA-Net is highly robust for coronary vessel segmentation, providing an active strategy for early diagnosis of cardiovascular diseases. The code is publicly available at https://github.com/30410B/MDA-Net.git.

PMID:39899182 | DOI:10.1002/mp.17618

Categories: Literature Watch

Redefining healthcare - The transformative power of generative AI in modern medicine

Mon, 2025-02-03 06:00

Rev Esp Enferm Dig. 2025 Feb 3. doi: 10.17235/reed.2025.11081/2024. Online ahead of print.

ABSTRACT

Over the last decade, technological advances in deep learning (artificial neural networks, big data and computing power) have made it possible to build digital solutions that imitate human cognitive processes (language, vision, hearing, etc.) and are able to generate new content when prompted. This generative AI is going to disrupt healthcare. Healthcare professionals must get prepared, because there are ethical and legal challenges that must be identified and tackled.

PMID:39898717 | DOI:10.17235/reed.2025.11081/2024

Categories: Literature Watch

Accuracy of a Cascade Network for Semi-Supervised Maxillary Sinus Detection and Sinus Cyst Classification

Mon, 2025-02-03 06:00

Clin Implant Dent Relat Res. 2025 Feb;27(1):e13431. doi: 10.1111/cid.13431.

ABSTRACT

OBJECTIVE: Maxillary sinus mucosal cysts represent prevalent oral and maxillofacial diseases, and their precise diagnosis is essential for surgical planning in maxillary sinus floor elevation. This study aimed to develop a deep learning-based pipeline for the classification of maxillary sinus lesions in cone beam computed tomography (CBCT) images to provide auxiliary support for clinical diagnosis.

METHODS: This study utilized 45,136 maxillary sinus images from CBCT scans of 541 patients. A cascade network was designed, comprising a semi-supervised maxillary sinus area object detection module and a maxillary sinus lesions classification module. The object detection module employed a semi-supervised pseudo-labelling training strategy to expand the maxillary sinus annotation dataset. In the classification module, the performance of Convolutional Neural Network and Transformer architectures was compared for maxillary sinus mucosal lesion classification. The object detection and classification modules were evaluated using metrics including Accuracy, Precision, Recall, F1 score, and Average Precision, with the object detection module additionally assessed using Precision-Recall Curve.

RESULTS: The fully supervised pseudo-label generation model achieved an average accuracy of 0.9433, while the semi-supervised maxillary sinus detection model attained 0.9403. ResNet-50 outperformed in classification, with accuracies of 0.9836 (sagittal) and 0.9797 (coronal). Grad-CAM visualization confirmed accurate focus on clinically relevant lesion features.

CONCLUSION: The proposed pipeline achieves high-precision detection and classification of maxillary sinus mucosal lesions, reducing manual annotation while maintaining accuracy.

PMID:39898709 | DOI:10.1111/cid.13431

Categories: Literature Watch

Enhancing feature-aided data association tracking in passive sonar arrays: An advanced Siamese network approach

Mon, 2025-02-03 06:00

J Acoust Soc Am. 2025 Feb 1;157(2):681-698. doi: 10.1121/10.0035577.

ABSTRACT

Feature-aided tracking integrates supplementary features into traditional methods and improves the accuracy of data association methods that rely solely on kinematic measurements. However, previous applications of feature-aided data association methods in multi-target tracking of passive sonar arrays directly utilized raw features for likelihood calculations, causing performance degradation in complex marine scenarios with low signal-to-noise ratios and close-proximity trajectories. Inspired by the successful application of deep learning, this study proposes BiChannel-SiamDinoNet, an advanced network derived from the Siamese network and integrated into the joint probability data association framework to calculate feature measurement likelihoods. This method forms an embedding space based on the feature structure of acoustic targets, bringing similar targets closer together. This makes the system more robust to variations and capable of capturing complex relationships between measurements and targets while effectively discriminating discrepancies between them. Additionally, this study refines the network's feature extraction module to address the unique line spectra of underwater acoustic signals and implements a knowledge distillation training method to improve the network's capability to assess consistency between features through local representations. The performance of the proposed method is assessed through simulation analysis and marine experiments.
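The role of the learned embedding space can be sketched as follows: distances between embeddings are converted into a feature pseudo-likelihood for the association step. The exponential mapping and the toy vectors are illustrative assumptions, not the paper's formulation:

```python
import math

def embedding_distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def feature_likelihood(a, b, scale=1.0):
    """Map distance in the learned embedding space to a (0, 1]
    pseudo-likelihood for a JPDA-style association step."""
    return math.exp(-embedding_distance(a, b) / scale)

# Hypothetical 3-D embeddings: a track's reference embedding and two
# candidate measurements, one from the same target and one from another.
track     = [0.9, 0.1, 0.4]
same_tgt  = [0.8, 0.2, 0.5]
other_tgt = [0.1, 0.9, 0.8]
assert feature_likelihood(track, same_tgt) > feature_likelihood(track, other_tgt)
```

Because the Siamese training pulls same-target embeddings together, measurements from the same source receive higher feature likelihoods than kinematically similar but acoustically distinct ones.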

PMID:39898705 | DOI:10.1121/10.0035577

Categories: Literature Watch

Enhancing U-Net-based Pseudo-CT generation from MRI using CT-guided bone segmentation for radiation treatment planning in head & neck cancer patients

Mon, 2025-02-03 06:00

Phys Med Biol. 2025 Jan 31. doi: 10.1088/1361-6560/adb124. Online ahead of print.

ABSTRACT

OBJECTIVE: This study investigates the effects of various training protocols on enhancing the precision of MRI-only Pseudo-CT generation for radiation treatment planning and adaptation in head & neck cancer patients. It specifically tackles the challenge of differentiating bone from air, a limitation that frequently results in substantial deviations in the representation of bony structures on Pseudo-CT images.

APPROACH: The study included 25 patients, utilizing pre-treatment MRI-CT image pairs. Five cases were randomly selected for testing, with the remaining 20 used for model training and validation. A 3D U-Net deep learning model was employed, trained on patches of size 64³ with an overlap of 32³. MRI scans were acquired using the Dixon gradient echo (GRE) technique, and various contrasts were explored to improve Pseudo-CT accuracy, including in-phase, water-only, and combined water-only and fat-only images. Additionally, bone extraction from the fat-only image was integrated as an additional channel to better capture bone structures on Pseudo-CTs. The evaluation involved both image quality and dosimetric metrics.
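Patch-based training with 64³ patches and a 32-voxel stride (50% overlap) can be sketched in a few lines of numpy. This is a generic illustration of the sampling scheme described above, not the authors' pipeline; the volume size is an arbitrary example.

```python
import numpy as np

def extract_patches(volume, patch=64, stride=32):
    """Slide a cubic window over a 3D volume with 50% overlap
    (64^3 patches on a 32-voxel stride), as in patch-based U-Net training."""
    d, h, w = volume.shape
    patches = []
    for z in range(0, d - patch + 1, stride):
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                patches.append(volume[z:z+patch, y:y+patch, x:x+patch])
    return np.stack(patches)

# Example: a 128^3 volume yields 3 window positions per axis -> 27 patches.
vol = np.zeros((128, 128, 128), dtype=np.float32)
p = extract_patches(vol)
print(p.shape)  # → (27, 64, 64, 64)
```

Overlapping patches let the network see each voxel in several contexts, and at inference the overlapping predictions are typically averaged to suppress patch-boundary artifacts.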

MAIN RESULTS: The generated Pseudo-CTs were compared with their corresponding registered target CTs. The mean absolute error (MAE) and peak signal-to-noise ratio (PSNR) for the base model using combined water-only and fat-only images were 19.20 ± 5.30 HU and 57.24 ± 1.44 dB, respectively. Following the integration of an additional channel using CT-guided bone segmentation, the model's performance improved, achieving MAE and PSNR of 18.32 ± 5.51 HU and 57.82 ± 1.31 dB, respectively. The dosimetric assessment confirmed that radiation treatment planning on Pseudo-CT achieved accuracy comparable to conventional CT. The measured results are statistically significant, with a p-value < 0.05.

SIGNIFICANCE: This study demonstrates improved accuracy in bone representation on Pseudo-CTs achieved through a combination of water-only, fat-only and extracted bone images; thus, enhancing feasibility of MRI-based simulation for radiation treatment planning.

PMID:39898433 | DOI:10.1088/1361-6560/adb124

Categories: Literature Watch

Automated Detection and Severity Prediction of Wheat Rust Using Cost-Effective Xception Architecture

Mon, 2025-02-03 06:00

Plant Cell Environ. 2025 Feb 3. doi: 10.1111/pce.15413. Online ahead of print.

ABSTRACT

Wheat crop production is under constant threat from leaf and stripe rust, an airborne fungal disease caused by the pathogen Puccinia triticina. Early detection and efficient crop phenotyping are crucial for managing and controlling the spread of this disease in susceptible wheat varieties. Current detection methods are predominantly manual and labour-intensive, and traditional strategies such as cultivating resistant varieties, applying fungicides and practicing good agricultural techniques often fall short in effectively identifying and responding to wheat rust outbreaks. To address these challenges, we propose an innovative computer vision-based disease severity prediction pipeline. Our approach utilizes a deep learning-based classifier to differentiate between healthy and rust-infected wheat leaves. Upon identifying an infected leaf, we apply GrabCut-based segmentation to isolate the foreground mask. This mask is then processed in the CIELAB colour space to distinguish leaf rust stripes and spores. The disease severity ratio is calculated to measure the extent of infection on each test leaf. The method offers a low-cost, accessible and automated solution for wheat rust disease screening in field conditions using digital colour images, promising timely interventions and better control measures for wheat rust.
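The final step of the pipeline — the disease severity ratio — reduces to a ratio of mask areas once segmentation is done. The sketch below assumes the GrabCut foreground and CIELAB rust thresholding have already produced boolean masks; those upstream steps are replaced here by hand-built toy masks.

```python
import numpy as np

def severity_ratio(leaf_mask, rust_mask):
    """Fraction of leaf pixels flagged as rust. Both inputs are boolean
    masks of the same shape: leaf_mask from foreground segmentation,
    rust_mask from colour-space thresholding of rust stripes/spores."""
    leaf_px = leaf_mask.sum()
    if leaf_px == 0:
        return 0.0
    return float((rust_mask & leaf_mask).sum()) / float(leaf_px)

# Toy 10x10 "image": a leaf occupying the left half, with a 2x5 rust patch.
leaf = np.zeros((10, 10), dtype=bool)
leaf[:, :5] = True            # 50 leaf pixels
rust = np.zeros((10, 10), dtype=bool)
rust[4:6, :5] = True          # 10 rust pixels, all inside the leaf

print(severity_ratio(leaf, rust))  # → 0.2
```

Intersecting the rust mask with the leaf mask before counting keeps background pixels that happen to match rust colours from inflating the severity score.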

PMID:39898421 | DOI:10.1111/pce.15413

Categories: Literature Watch
