Deep learning

Combined deep learning and radiomics in pretreatment radiation esophagitis prediction for patients with esophageal cancer undergoing volumetric modulated arc therapy

Tue, 2024-07-16 06:00

Radiother Oncol. 2024 Jul 14:110438. doi: 10.1016/j.radonc.2024.110438. Online ahead of print.

ABSTRACT

PURPOSE: To develop a combined radiomics and deep learning (DL) model to predict radiation esophagitis (RE) of grade ≥ 2 in patients with esophageal cancer (EC) who underwent volumetric modulated arc therapy (VMAT), based on computed tomography (CT) and radiation dose (RD) distribution images.

MATERIALS AND METHODS: A total of 273 EC patients who underwent VMAT were retrospectively reviewed and enrolled from two centers, then divided into training (n = 152), internal validation (n = 66), and external validation (n = 55) cohorts. Radiomic and dosiomic features, along with DL features extracted by convolutional neural networks, were extracted and screened from CT and RD images to predict RE. Model performance was evaluated and compared using the area under the receiver operating characteristic (ROC) curve (AUC).

RESULTS: Five radiomic and ten dosiomic features were screened. XGBoost achieved best AUCs of 0.703 and 0.694 with radiomic features, and 0.801 and 0.729 with dosiomic features, in the internal and external validation cohorts, respectively. ResNet34 achieved best AUCs of 0.642 and 0.657 for the radiomics-based DL model (DLR), and 0.762 and 0.737 for the RD-based DL model (DLD), in the internal and external validation cohorts, respectively. The combined model of DLD + dosiomics + clinical factors achieved best AUCs of 0.913, 0.821, and 0.805 in the training, internal validation, and external validation cohorts, respectively.

CONCLUSION: Although dose alone was not responsible for the prediction accuracy, combining various feature extraction methods improved RE prediction accuracy. Combining DLD with dosiomic features is promising for the pretreatment prediction of RE in EC patients who underwent VMAT.
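
All models in this study are compared by ROC AUC. As a minimal, self-contained sketch of that metric (with made-up labels and scores, not the study's data), the AUC can be computed directly from the Mann-Whitney rank statistic:

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive outranks negative
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count half
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.8]
print(round(roc_auc(labels, scores), 3))  # → 0.778
```

Library implementations such as scikit-learn's `roc_auc_score` compute the same quantity.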

PMID:39013503 | DOI:10.1016/j.radonc.2024.110438

Categories: Literature Watch

PTFSpot: deep co-learning on transcription factors and their binding regions attains impeccable universality in plants

Tue, 2024-07-16 06:00

Brief Bioinform. 2024 May 23;25(4):bbae324. doi: 10.1093/bib/bbae324.

ABSTRACT

Unlike in animals, variability in transcription factors (TFs) and their binding regions (TFBRs) across plant species is a major problem that most existing TFBR-finding software fails to tackle, rendering it of little practical use. This limitation has resulted in the underdevelopment of plant regulatory research and the rampant use of Arabidopsis-like model species, generating misleading results. Here, we report a revolutionary transformer-based deep-learning approach, PTFSpot, which learns from the co-variability of TF structures and their binding regions to produce a universal TF-DNA interaction model, detecting TFBRs with complete freedom from the limitations of TF- and species-specific models. In a series of extensive benchmarking studies on multiple experimentally validated datasets, it not only outperformed existing software by a >30% margin but also consistently delivered >90% accuracy, even for species and TF families never encountered during model building. PTFSpot now makes it possible to accurately annotate TFBRs across any plant genome, even in the complete absence of TF information, free from the bottlenecks of species- and TF-specific models.

PMID:39013383 | DOI:10.1093/bib/bbae324

Categories: Literature Watch

Self-adaptive deep learning-based segmentation for universal and functional clinical and preclinical CT image analysis

Tue, 2024-07-16 06:00

Comput Biol Med. 2024 Jul 15;179:108853. doi: 10.1016/j.compbiomed.2024.108853. Online ahead of print.

ABSTRACT

BACKGROUND: Methods to monitor cardiac functioning non-invasively can accelerate preclinical and clinical research into novel treatment options for heart failure. However, manual image analysis of cardiac substructures is resource-intensive and error-prone. While automated methods exist for clinical CT images, translating these to preclinical μCT data is challenging. We employed deep learning to automate the extraction of quantitative data from both CT and μCT images.

METHODS: We collected a public dataset of cardiac CT images of human patients, as well as acquired μCT images of wild-type and accelerated aging mice. The left ventricle, myocardium, and right ventricle were manually segmented in the μCT training set. After template-based heart detection, two separate segmentation neural networks were trained using the nnU-Net framework.

RESULTS: The mean Dice score of the CT segmentation results (0.925 ± 0.019, n = 40) was superior to that achieved by state-of-the-art algorithms. Automated and manual segmentations of the μCT training set were nearly identical. The estimated median Dice score (0.940) on the test set was comparable to that of existing methods. The automated volume metrics were similar to manual expert observations. In aging mice, ejection fractions had significantly decreased and myocardial volume had increased by age 24 weeks.

CONCLUSIONS: With further optimization, automated data extraction expands the application of (μ)CT imaging, while reducing subjectivity and workload. The proposed method efficiently measures the left and right ventricular ejection fraction and myocardial mass. With uniform translation between image types, cardiac functioning in diastolic and systolic phases can be monitored in both animals and humans.
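
The Dice scores reported above quantify voxel overlap between automated and manual masks; a minimal sketch on toy binary masks (not the study's data):

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

auto_seg   = np.array([[0, 1, 1], [0, 1, 0]])
manual_seg = np.array([[0, 1, 0], [1, 1, 0]])
print(dice(auto_seg, manual_seg))  # 2*2 / (3+3) ≈ 0.667
```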

PMID:39013341 | DOI:10.1016/j.compbiomed.2024.108853

Categories: Literature Watch

Deep-learning denoising minimizes radiation exposure in neck CT beyond the limits of conventional reconstruction

Tue, 2024-07-16 06:00

Eur J Radiol. 2024 May 22;178:111523. doi: 10.1016/j.ejrad.2024.111523. Online ahead of print.

ABSTRACT

BACKGROUND: Neck computed tomography (NCT) is essential for diagnosing suspected neck tumors and abscesses, but radiation exposure can be an issue. With conventional reconstruction techniques, limiting the radiation dose comes at the cost of diminished diagnostic accuracy. Therefore, this study aimed to evaluate the effects of an AI-based denoising post-processing software solution in low-dose neck computed tomography.

MATERIALS AND METHODS: From 01 September 2023 to 01 December 2023, we retrospectively included patients with clinically suspected neck tumors from the same single-source scanner. The scans were reconstructed using Advanced Modeled Iterative Reconstruction (Original) at 100% and simulated 50% and 25% radiation doses. Each dataset was post-processed using a novel denoising software solution (Denoising). Three radiologists with varying experience levels subjectively rated image quality, diagnostic confidence, sharpness, and contrast for all pairwise combinations of radiation dose and reconstruction mode in a randomized, blinded forced-choice setup. Objective image quality was assessed using ROI measurements of mean CT numbers, noise, and a contrast-to-noise ratio (CNR). An adequately corrected mixed-effects analysis was used to compare objective and subjective image quality.
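
The CNR above is derived from ROI measurements; one common definition (assumed here, since the abstract does not spell out the formula) divides the difference in mean CT numbers by the background noise:

```python
import numpy as np

def cnr(signal_roi, background_roi):
    """Contrast-to-noise ratio: |mean(signal) - mean(background)| / SD(background)."""
    s = np.asarray(signal_roi, dtype=float)
    b = np.asarray(background_roi, dtype=float)
    return abs(s.mean() - b.mean()) / b.std(ddof=1)  # sample SD as the noise term

# hypothetical HU samples from two ROIs
signal_hu     = [60.0, 62.0, 58.0, 60.0]
background_hu = [40.0, 42.0, 38.0, 40.0]
print(round(cnr(signal_hu, background_hu), 2))  # → 12.25
```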

RESULTS: At each radiation dose level, pairwise comparisons showed significantly lower image noise and higher CNR for Denoising than for Original (p < 0.001). In subjective analysis, image quality, diagnostic confidence, sharpness, and contrast were significantly higher for Denoising than for Original at 100% and 50% (p < 0.001). However, there were no significant differences in the subjective ratings between Original 100% and Denoising 25% (p = 0.906).

CONCLUSIONS: The investigated denoising algorithm enables diagnostic-quality neck CT images with radiation doses reduced to 25% of conventional levels, significantly minimizing patient exposure.

PMID:39013270 | DOI:10.1016/j.ejrad.2024.111523

Categories: Literature Watch

Role of artificial intelligence, machine learning and deep learning models in corneal disorders - A narrative review

Tue, 2024-07-16 06:00

J Fr Ophtalmol. 2024 Jul 15;47(7):104242. doi: 10.1016/j.jfo.2024.104242. Online ahead of print.

ABSTRACT

In the last decade, artificial intelligence (AI) has significantly impacted ophthalmology, particularly in managing corneal diseases, a major reversible cause of blindness. This review explores AI's transformative role in the corneal subspecialty, which has adopted advanced technology for superior clinical judgment, early diagnosis, and personalized therapy. While AI's role in anterior segment diseases is less documented compared to glaucoma and retinal pathologies, this review highlights its integration into corneal diagnostics through imaging techniques like slit-lamp biomicroscopy, anterior segment optical coherence tomography (AS-OCT), and in vivo confocal biomicroscopy. AI has been pivotal in refining decision-making and prognosis for conditions such as keratoconus, infectious keratitis, and dystrophies. Multi-disease deep learning neural networks (MDDNs) have shown diagnostic ability in classifying corneal diseases using AS-OCT images, achieving notable metrics like an AUC of 0.910. AI's progress over two decades has significantly improved the accuracy of diagnosing conditions like keratoconus and microbial keratitis. For instance, AI has achieved a 90.7% accuracy rate in classifying bacterial and fungal keratitis. Convolutional neural networks (CNNs) have enhanced the analysis of color-coded corneal maps, yielding up to 99.3% diagnostic accuracy for keratoconus. Deep learning algorithms have also shown robust performance in detecting fungal hyphae on in vivo confocal microscopy, with precise quantification of hyphal density. AI models combining tomography scans and visual acuity have demonstrated up to 97% accuracy in keratoconus staging according to the Amsler-Krumeich classification. 
However, the review acknowledges the limitations of current AI models, including their reliance on binary classification, which may not capture the complexity of real-world clinical presentations with multiple coexisting disorders. Challenges also include dependency on data quality, diverse imaging protocols, and integrating multimodal images for a generalized AI diagnosis. The need for interpretability in AI models is emphasized to foster trust and applicability in clinical settings. Looking ahead, AI has the potential to unravel the intricate mechanisms behind corneal pathologies, reduce healthcare's carbon footprint, and revolutionize diagnostic and management paradigms. Ethical and regulatory considerations will accompany AI's clinical adoption, marking an era where AI not only assists but augments ophthalmic care.

PMID:39013268 | DOI:10.1016/j.jfo.2024.104242

Categories: Literature Watch

Identifying rice field weeds from unmanned aerial vehicle remote sensing imagery using deep learning

Tue, 2024-07-16 06:00

Plant Methods. 2024 Jul 16;20(1):105. doi: 10.1186/s13007-024-01232-0.

ABSTRACT

BACKGROUND: Rice field weed object detection can provide key information on weed species and locations for precise spraying, which is of great significance in actual agricultural production. However, facing the complex and changing real farm environments, traditional object detection methods still have difficulties in identifying small-sized, occluded and densely distributed weed instances. To address these problems, this paper proposes a multi-scale feature enhanced DETR network, named RMS-DETR. By adding multi-scale feature extraction branches on top of DETR, this model fully utilizes the information from different semantic feature layers to improve recognition capability for rice field weeds in real-world scenarios.

METHODS: Introducing multi-scale feature layers on the basis of the DETR model, we conduct a differentiated design for different semantic feature layers. The high-level semantic feature layer adopts Transformer structure to extract contextual information between barnyard grass and rice plants. The low-level semantic feature layer uses CNN structure to extract local detail features of barnyard grass. Introducing multi-scale feature layers inevitably leads to increased model computation, thus lowering model inference speed. Therefore, we employ a new type of Pconv (Partial convolution) to replace traditional standard convolutions in the model.
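
The Pconv replacement described above follows the partial-convolution idea: run a regular convolution over only the first few channels and pass the rest through unchanged, cutting computation roughly in proportion to the skipped channels. A NumPy sketch (stride 1, zero padding, 3 × 3 kernels; the paper's exact variant may differ):

```python
import numpy as np

def partial_conv(x, weight, n_conv):
    """x: (C, H, W) feature map; weight: (n_conv, n_conv, 3, 3).
    Convolve the first n_conv channels; copy the remaining C - n_conv."""
    c, h, w = x.shape
    out = x.copy()                                    # untouched channels pass through
    xp = np.pad(x[:n_conv], ((0, 0), (1, 1), (1, 1)))  # zero padding for same-size output
    for o in range(n_conv):
        acc = np.zeros((h, w))
        for i in range(n_conv):
            for dy in range(3):
                for dx in range(3):
                    acc += weight[o, i, dy, dx] * xp[i, dy:dy + h, dx:dx + w]
        out[o] = acc
    return out
```

With an identity kernel (a single centre tap per channel) the convolved channels reproduce the input, which makes the pass-through behaviour easy to verify.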

RESULTS: Compared to the original DETR model, our proposed RMS-DETR model achieved an average recognition accuracy improvement of 3.6% and 4.4% on our constructed rice field weeds dataset and the DOTA public dataset, respectively. The average recognition accuracies reached 0.792 and 0.851, respectively. The RMS-DETR model size is 40.8 M with inference time of 0.0081 s. Compared with three classical DETR models (Deformable DETR, Anchor DETR and DAB-DETR), the RMS-DETR model respectively improved average precision by 2.1%, 4.9% and 2.4%.

DISCUSSION: This model is capable of accurately identifying rice field weeds in complex real-world scenarios, thus providing key technical support for precision spraying and management of variable-rate spraying systems.

PMID:39014411 | DOI:10.1186/s13007-024-01232-0

Categories: Literature Watch

Fully automated segmentation and volumetric measurement of ocular adnexal lymphoma by deep learning-based self-configuring nnU-net on multi-sequence MRI: a multi-center study

Tue, 2024-07-16 06:00

Neuroradiology. 2024 Jul 17. doi: 10.1007/s00234-024-03429-5. Online ahead of print.

ABSTRACT

PURPOSE: To evaluate nnU-net's performance in automatically segmenting and volumetrically measuring ocular adnexal lymphoma (OAL) on multi-sequence MRI.

METHODS: We collected T1-weighted (T1), T2-weighted, and T1-weighted contrast-enhanced images with/without fat saturation (T2_FS/T2_nFS, T1c_FS/T1c_nFS) of OAL from four institutions. Two radiologists manually annotated lesions as the ground truth using ITK-SNAP. A deep learning framework, nnU-net, was developed and trained with two models: Model 1 was trained on T1, T2, and T1c, while Model 2 was trained exclusively on T1 and T2. Five-fold cross-validation was utilized during training. Segmentation performance was evaluated using the Dice similarity coefficient (DSC), sensitivity, and positive predictive value (PPV). Volumetric assessment was performed using Bland-Altman plots and Lin's concordance correlation coefficient (CCC).

RESULTS: A total of 147 patients from one center were selected as the training set, and 33 patients from three centers formed the test set. For both Models 1 and 2, nnU-net demonstrated outstanding segmentation performance on T2_FS, with DSCs of 0.80-0.82, PPVs of 84.5-86.1%, and sensitivities of 77.6-81.2%. Model 2 failed to detect 19 cases of T1c, whereas the DSC, PPV, and sensitivity for T1_nFS were 0.59, 91.2%, and 51.4%, respectively. Bland-Altman plots revealed minor tumor volume differences of 0.22-1.24 cm3 between the nnU-net prediction and the ground truth on T2_FS. The CCCs were 0.96 and 0.93 for Models 1 and 2 on T2_FS images, respectively.
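
Lin's CCC used for the volumetric comparison rewards both correlation and closeness to the identity line; a minimal sketch with hypothetical volumes (not the study's measurements):

```python
import numpy as np

def lins_ccc(x, y):
    """Lin's concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()       # population covariance
    return 2.0 * cov / (x.var() + y.var() + (mx - my) ** 2)

predicted_cm3 = [4.1, 7.9, 12.2, 15.8]   # hypothetical tumor volumes
manual_cm3    = [4.0, 8.0, 12.0, 16.0]
print(round(lins_ccc(predicted_cm3, manual_cm3), 3))  # → 0.999
```

Unlike plain Pearson correlation, CCC is penalized by any systematic offset or scale difference between the two measurements.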

CONCLUSION: The nnU-net offered excellent performance in automated segmentation and volumetric assessment in MRI of OAL, particularly on T2_FS images.

PMID:39014270 | DOI:10.1007/s00234-024-03429-5

Categories: Literature Watch

Identification and Design of Novel Potential Antimicrobial Peptides Targeting Mycobacterial Protein Kinase PknB

Tue, 2024-07-16 06:00

Protein J. 2024 Jul 16. doi: 10.1007/s10930-024-10218-9. Online ahead of print.

ABSTRACT

Antimicrobial peptides have gradually gained advantages over small-molecule inhibitors for their multifunctional effects, synthetic accessibility, and target specificity. The current study aims to identify an antimicrobial peptide that inhibits PknB, a serine/threonine protein kinase (STPK), by binding efficiently at the helically oriented hinge region. A library of 5626 antimicrobial peptides from publicly available repositories was prepared and categorised by length. Molecular docking using ADCP was performed to sample multiple conformations of each peptide; for each input peptide, the tool outputs 100 poses. To maintain efficient binding over a relatively long duration, only peptides observed to bind the receptor's active site consistently across all poses were chosen. Because each peptide had a different number of constituent amino acid residues, the peptides were classified by length into five groups; within each group, peptide length incremented up to four residues from the initial form. Five peptides with the highest binding affinity were selected for molecular dynamics simulation in Gromacs. Post-dynamics analysis and frame comparison showed that neither the shortest nor the longest, but an intermediate 15-mer peptide bound well to the receptor. Residue substitutions were performed on the selected peptides to enhance the targeted interaction. The new complexes were further analysed using the Elastic Network Model (ENM) for the intrinsic dynamic movement of the functional site, to estimate the new peptide's role. The study shows that, besides peptide length, the combination of constituent residues plays an equally pivotal role in peptide-based inhibitor generation. It also highlights the challenges of fine-tuned peptide recovery and the scope for Machine Learning (ML) and Deep Learning (DL) algorithm development. 
As the study is primarily aimed at generating therapeutics for tuberculosis (TB), the peptide proposed by this study demands meticulous in vitro analysis prior to clinical application.

PMID:39014259 | DOI:10.1007/s10930-024-10218-9

Categories: Literature Watch

Evaluating the quality of radiomics-based studies for endometrial cancer using RQS and METRICS tools

Tue, 2024-07-16 06:00

Eur Radiol. 2024 Jul 16. doi: 10.1007/s00330-024-10947-6. Online ahead of print.

ABSTRACT

OBJECTIVE: To assess the methodological quality of radiomics-based models in endometrial cancer using the radiomics quality score (RQS) and METhodological radiomICs score (METRICS).

METHODS: We systematically reviewed studies published by October 30th, 2023. Inclusion criteria were original radiomics studies on endometrial cancer using CT, MRI, PET, or ultrasound. Articles underwent a quality assessment by novice and expert radiologists using RQS and METRICS. The inter-rater reliability for RQS and METRICS among radiologists with varying expertise was determined. Subgroup analyses were performed to assess whether scores varied according to study topic, imaging technique, publication year, and journal quartile.

RESULTS: Sixty-eight studies were analysed, with a median RQS of 11 (IQR, 9-14) and METRICS score of 67.6% (IQR, 58.8-76.0); two different articles reached the maximum RQS of 19 and METRICS of 90.7%, respectively. Most studies utilised MRI (82.3%) and machine learning methods (88.2%). Characterisation and recurrence risk stratification were the most explored outcomes, featured in 35.3% and 19.1% of articles, respectively. High inter-rater reliability was observed for both RQS (ICC: 0.897; 95% CI: 0.821, 0.946) and METRICS (ICC: 0.959; 95% CI: 0.928, 0.979). Methodological limitations such as the lack of external validation suggest areas for improvement. In subgroup analyses, no statistically significant differences were noted.

CONCLUSIONS: While RQS rated the quality of endometrial cancer radiomics research as unsatisfactory, METRICS depicts good overall quality. Our study highlights the need for strict compliance with quality metrics. Adhering to these quality measures can increase the consistency of radiomics towards clinical application in the pre-operative management of endometrial cancer.

CLINICAL RELEVANCE STATEMENT: Both the RQS and METRICS can function as instrumental tools for identifying different methodological deficiencies in endometrial cancer radiomics research. However, METRICS also reflected a focus on the practical applicability and clarity of documentation.

KEY POINTS: The topic of radiomics currently lacks standardisation, limiting clinical implementation. METRICS scores were generally higher than the RQS, reflecting differences in the development process and methodological content. A positive trend in METRICS score may suggest growing attention to methodological aspects in radiomics research.

PMID:39014086 | DOI:10.1007/s00330-024-10947-6

Categories: Literature Watch

From vision to text: A comprehensive review of natural image captioning in medical diagnosis and radiology report generation

Tue, 2024-07-16 06:00

Med Image Anal. 2024 Jul 8;97:103264. doi: 10.1016/j.media.2024.103264. Online ahead of print.

ABSTRACT

Natural Image Captioning (NIC) is an interdisciplinary research area that lies at the intersection of Computer Vision (CV) and Natural Language Processing (NLP). Several works have been presented on the subject, ranging from the early template-based approaches to the more recent deep learning-based methods. This paper conducts a survey of NIC, especially focusing on its applications to Medical Image Captioning (MIC) and Diagnostic Captioning (DC) in the field of radiology. A review of the state of the art is conducted, summarizing key research works in NIC and DC to provide a wide overview of the subject. These works include existing NIC and MIC models, datasets, evaluation metrics, and previous reviews in the specialized literature. The reviewed works are thoroughly analyzed and discussed, highlighting the limitations of existing approaches and their potential implications in real clinical practice. Similarly, potential future research lines are outlined on the basis of the detected limitations.

PMID:39013207 | DOI:10.1016/j.media.2024.103264

Categories: Literature Watch

Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis

Tue, 2024-07-16 06:00

J Am Med Inform Assoc. 2024 Jul 16:ocae189. doi: 10.1093/jamia/ocae189. Online ahead of print.

ABSTRACT

OBJECTIVE: This study aims to conduct a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) using speech samples in depression.

MATERIALS AND METHODS: This review included studies reporting diagnostic results of DL algorithms in depression using speech data, published from inception to January 31, 2024, in the PubMed, Medline, Embase, PsycINFO, Scopus, IEEE, and Web of Science databases. Pooled accuracy, sensitivity, and specificity were obtained by random-effects models. The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) was used to assess the risk of bias.

RESULTS: A total of 25 studies met the inclusion criteria and 8 of them were used in the meta-analysis. The pooled estimates of accuracy, specificity, and sensitivity for depression detection models were 0.87 (95% CI, 0.81-0.93), 0.85 (95% CI, 0.78-0.91), and 0.82 (95% CI, 0.71-0.94), respectively. When stratified by model structure, the highest pooled diagnostic accuracy was 0.89 (95% CI, 0.81-0.97) in the handcrafted group.
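
The pooled estimates above come from random-effects models. A minimal DerSimonian-Laird sketch with synthetic study-level accuracies and variances (illustrative only, not the review's data):

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate (DerSimonian-Laird): estimate the
    between-study variance tau^2 from Cochran's Q, then pool with
    inverse-variance weights 1 / (v_i + tau^2)."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                # fixed-effect weights
    y_fe = (w * y).sum() / w.sum()             # fixed-effect mean
    q = (w * (y - y_fe) ** 2).sum()            # Cochran's Q
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - (len(y) - 1)) / c)    # between-study variance
    w_re = 1.0 / (v + tau2)
    return (w_re * y).sum() / w_re.sum()

effects   = [0.87, 0.90, 0.82]   # synthetic per-study accuracies
variances = [0.010, 0.015, 0.012]
print(round(dersimonian_laird(effects, variances), 3))  # → 0.861
```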

DISCUSSION: To our knowledge, our study is the first meta-analysis on the diagnostic performance of DL for depression detection from speech samples. All studies included in the meta-analysis used convolutional neural network (CNN) models, posing problems in deciphering the performance of other DL algorithms. The handcrafted model performed better than the end-to-end model in speech depression detection.

CONCLUSIONS: The application of DL in speech provided a useful tool for depression detection. CNN models with handcrafted acoustic features could help to improve the diagnostic performance.

PROTOCOL REGISTRATION: The study protocol was registered on PROSPERO (CRD42023423603).

PMID:39013193 | DOI:10.1093/jamia/ocae189

Categories: Literature Watch

QC-GN<sup>2</sup>oMS<sup>2</sup>: a Graph Neural Net for High Resolution Mass Spectra Prediction

Tue, 2024-07-16 06:00

J Chem Inf Model. 2024 Jul 16. doi: 10.1021/acs.jcim.4c00446. Online ahead of print.

ABSTRACT

Predicting the mass spectrum of a molecular ion is often accomplished via three generalized approaches: rules-based methods for bond breaking, deep learning, or quantum chemical (QC) modeling. Rules-based approaches are often limited by the conditions for different chemical subspaces and perform poorly under chemical regimes with few defined rules. QC modeling is theoretically robust but requires significant amounts of computational time to produce a spectrum for a given target. Among deep learning techniques, graph neural networks (GNNs) have performed better than previous work with fingerprint-based neural networks in mass spectra prediction. To explore this technique further, we investigate the effects of including quantum chemically derived information as edge features in the GNN to increase predictive accuracy. The models we investigated include categorical bond order, bond force constants derived from extended tight-binding (xTB) quantum chemistry, and acyclic bond dissociation energies. We evaluated these models against a control GNN with no edge features in the input graphs. Bond dissociation enthalpies yielded the best improvement, with a cosine similarity score of 0.462 relative to the baseline model (0.437). In this work we also apply dynamic graph attention, which improves performance on benchmark problems and supports the inclusion of edge features. Between implementations, we investigate the nature of the molecular embedding for spectra prediction and discuss the recognition of fragment topographies in distinct chemistries for further development in tandem mass spectrometry prediction.
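
The cosine similarity scores quoted above treat predicted and reference spectra as intensity vectors on a shared m/z binning; a minimal sketch:

```python
import numpy as np

def spectral_cosine(spec_a, spec_b):
    """Cosine similarity between two binned intensity vectors (1.0 = identical shape)."""
    a = np.asarray(spec_a, dtype=float)
    b = np.asarray(spec_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy spectra binned on the same m/z axis
predicted = [0.0, 10.0, 5.0, 0.0, 2.0]
reference = [0.0, 20.0, 10.0, 0.0, 4.0]  # same shape, double the intensity
print(round(spectral_cosine(predicted, reference), 3))  # → 1.0
```

Because only the direction of the intensity vector matters, a uniform scaling of all peak intensities leaves the score unchanged.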

PMID:39013165 | DOI:10.1021/acs.jcim.4c00446

Categories: Literature Watch

Stretchable Piezoresistive Pressure Sensor Array with Sophisticated Sensitivity, Strain-Insensitivity, and Reproducibility

Tue, 2024-07-16 06:00

Adv Sci (Weinh). 2024 Jul 16:e2405374. doi: 10.1002/advs.202405374. Online ahead of print.

ABSTRACT

This study presents a novel 10 by 10 sensor array featuring 100 pressure-sensor pixels, achieving remarkable sensitivity of up to 888.79 kPa-1 through an innovative sensor structure design. It addresses the critical challenge of strain sensitivity inherent in stretchable piezoresistive pressure sensors, a domain that has seen significant interest due to its potential for practical applications. The approach involves synthesizing and electrospinning polybutadiene-urethane (PBU), a reversibly cross-linking polymer, subsequently coated with MXene nanosheets to create a conductive fabric. This fabrication technique strategically enhances sensor sensitivity by minimizing initial current values and incorporates semi-cylindrical electrodes selectively coated with Ag nanowires (AgNWs) for optimal conductivity. A pre-strain method applied to electrode construction ensures strain immunity, preserving the sensor's electrical properties under expansion. The sensor array demonstrated remarkable sensitivity by consistently detecting even subtle airflow from an air gun in a wind-sensing test, while a novel deep learning methodology significantly enhanced the long-term sensing accuracy of polymer-based stretchable mechanical sensors, marking a major advancement in sensor technology. This research represents a significant step forward in enhancing the reliability and performance of stretchable piezoresistive pressure sensors, offering a comprehensive solution to their current limitations.
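
Sensitivity figures such as the 888.79 kPa-1 quoted above are conventionally the relative resistance (or current) change per unit pressure; the paper's exact definition is not given in this summary, so the convention below is an assumption:

```python
def pressure_sensitivity(r_baseline, r_pressed, delta_p_kpa):
    """Assumed convention: S = (|ΔR| / R0) / ΔP, reported in kPa^-1."""
    return abs(r_pressed - r_baseline) / r_baseline / delta_p_kpa

# hypothetical readings: 100 kΩ baseline dropping to 50 kΩ under 2 kPa
print(pressure_sensitivity(100.0, 50.0, 2.0))  # → 0.25
```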

PMID:39013112 | DOI:10.1002/advs.202405374

Categories: Literature Watch

Triple-0: Zero-shot denoising and dereverberation on an end-to-end frozen anechoic speech separation network

Tue, 2024-07-16 06:00

PLoS One. 2024 Jul 16;19(7):e0301692. doi: 10.1371/journal.pone.0301692. eCollection 2024.

ABSTRACT

Speech enhancement is crucial for both human and machine listening applications. Over the last decade, the use of deep learning for speech enhancement has resulted in tremendous improvement over classical signal processing and machine learning methods. However, training a deep neural network is not only time-consuming; it also requires extensive computational resources and a large training dataset. Transfer learning, i.e., using a pretrained network for a new task, comes to the rescue by reducing the required training time, computational resources, and dataset size, but the network still needs to be fine-tuned for the new task. This paper presents a novel method of speech denoising and dereverberation (SD&D) on an end-to-end frozen binaural anechoic speech separation network. The frozen network requires neither any architectural change nor any fine-tuning for the new task, as is usually required for transfer learning. The interaural cues of a source placed inside noisy and echoic surroundings are given as input to this pretrained network to extract the target speech from noise and reverberation. Although the pretrained model used in this paper has never seen noisy reverberant conditions during its training, it performs satisfactorily for zero-shot testing (ZST) under these conditions. This is because the pretrained model has been trained on the direct-path interaural cues of an active source and can therefore recognize them even in the presence of echoes and noise. ZST on the same dataset on which the pretrained network was trained (homo-corpus), for the unseen class of interference, showed considerable improvement over the weighted prediction error (WPE) algorithm in terms of four objective speech quality and intelligibility metrics. The proposed model also offers performance similar to that of a deep learning SD&D algorithm for this dataset under varying conditions of noise and reverberation. 
Similarly, ZST on a different dataset provided an improvement in intelligibility and almost equivalent quality to that provided by the WPE algorithm.

PMID:39012881 | DOI:10.1371/journal.pone.0301692

Categories: Literature Watch

Streak artefact removal in x-ray dark-field computed tomography using a convolutional neural network

Tue, 2024-07-16 06:00

Med Phys. 2024 Jul 16. doi: 10.1002/mp.17305. Online ahead of print.

ABSTRACT

BACKGROUND: Computed tomography (CT) relies on the attenuation of x-rays and is, hence, of limited use for weakly attenuating organs of the body, such as the lung. X-ray dark-field (DF) imaging is a recently developed technology that utilizes x-ray optical gratings to enable small-angle scattering as an alternative contrast mechanism. The DF signal provides structural information about the micromorphology of an object, complementary to the conventional attenuation signal. A first human-scale x-ray DF CT has been developed by our group. Despite specialized processing algorithms, reconstructed images remain affected by streak artifacts, which often hinder image interpretation. In recent years, convolutional neural networks (CNNs) have gained popularity in the field of CT reconstruction, among other things for streak artifact removal.

PURPOSE: Reducing streak artifacts is essential for optimizing image quality in DF CT, and artifact-free images are a prerequisite for potential future clinical application. The purpose of this paper is to demonstrate the feasibility of CNN post-processing for artifact reduction in x-ray DF CT and to show how multi-rotation scans can serve as a source of training data.

METHODS: We employed a supervised deep-learning approach using a three-dimensional dual-frame UNet in order to remove streak artifacts. Required training data were obtained from the experimental x-ray DF CT prototype at our institute. Two different operating modes were used to generate input and corresponding ground truth data sets. Clinically relevant scans at dose-compatible radiation levels were used as input data, and extended scans with substantially fewer artifacts were used as ground truth data. The latter is neither dose-, nor time-compatible and, therefore, unfeasible for clinical imaging of patients.

RESULTS: The trained CNN was able to greatly reduce streak artifacts in DF CT images. The network was also tested on images with entirely different, previously unseen characteristics. In all cases, CNN processing substantially increased the image quality, as confirmed quantitatively by improved image-quality metrics. Fine details are preserved during processing, although the output images appear smoother than the ground-truth images.
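The abstract does not name the image-quality metrics used; peak signal-to-noise ratio (PSNR) is one common choice for such reference-based comparisons, and a minimal sketch is:

```python
import math

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio between two equally sized images
    given as flat lists of floats; higher means closer to reference."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)
```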

CONCLUSIONS: Our results showcase the potential of a neural network to reduce streak artifacts in x-ray DF CT. Image quality is successfully enhanced in dose-compatible x-ray DF CT, which is essential for the adoption of x-ray DF CT into modern clinical radiology.

PMID:39012833 | DOI:10.1002/mp.17305

Categories: Literature Watch

Surface Reconstruction from Point Clouds: A Survey and a Benchmark

Tue, 2024-07-16 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Jul 16;PP. doi: 10.1109/TPAMI.2024.3429209. Online ahead of print.

ABSTRACT

Reconstruction of a continuous surface of a two-dimensional manifold from its raw, discrete point cloud observation is a long-standing problem in computer vision and graphics research. The problem is technically ill-posed and becomes more difficult considering that various sensing imperfections appear in point clouds obtained by practical depth scanning. In the literature, a rich set of methods has been proposed, and reviews of existing methods are also available. However, existing reviews lack thorough investigation on a common benchmark. The present paper aims to review and benchmark existing methods in the new era of deep learning surface reconstruction. To this end, we contribute a large-scale benchmarking dataset consisting of both synthetic and real-scanned data; the benchmark includes object- and scene-level surfaces and takes into account various sensing imperfections that are commonly encountered in practical depth scanning. We conduct thorough empirical studies by comparing existing methods on the constructed benchmark, pay special attention to the robustness of existing methods against various scanning imperfections, and study how different methods generalize in terms of reconstructing complex surface shapes. Our studies help identify the conditions under which different methods work best and suggest several empirical findings. For example, while deep learning methods are increasingly popular in the research community, our systematic studies suggest that, surprisingly, a few classical methods perform even better in terms of both robustness and generalization; our studies also suggest that the practical challenges of misaligned point sets from multi-view scanning, missing surface points, and point outliers remain unsolved by all existing surface reconstruction methods. We expect that the benchmark and our studies will be valuable both for practitioners and as guidance for new innovations in future research.
We make the benchmark publicly accessible at https://Gorilla-Lab-SCUT.github.io/SurfaceReconstructionBenchmark.
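A metric commonly used to score a reconstructed surface against a ground-truth point set is the symmetric Chamfer distance. The abstract does not name the benchmark's metrics, so the following is only an illustrative sketch of that standard quantity:

```python
def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two point sets given as
    lists of (x, y, z) tuples: mean squared nearest-neighbor distance
    from p to q plus the same from q to p. O(|p|*|q|) brute force."""
    def sq(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    d_pq = sum(min(sq(a, b) for b in q) for a in p) / len(p)
    d_qp = sum(min(sq(a, b) for a in p) for b in q) / len(q)
    return d_pq + d_qp
```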

PMID:39012756 | DOI:10.1109/TPAMI.2024.3429209

Categories: Literature Watch

Enhancing Generalizability in Biomedical Entity Recognition: Self-Attention PCA-CLS Model

Tue, 2024-07-16 06:00

IEEE/ACM Trans Comput Biol Bioinform. 2024 Jul 16;PP. doi: 10.1109/TCBB.2024.3429234. Online ahead of print.

ABSTRACT

One of the primary tasks in the early stages of biomedical data mining is the identification of entities in biomedical corpora. Traditional approaches that rely on hand-crafted feature engineering face challenges when learning from the available annotated and unannotated data, in contrast to data-driven models such as deep learning architectures. Despite leveraging large corpora and advanced deep learning models, domain generalization remains an issue. Attention mechanisms are effective at capturing long-range sentence dependencies and extracting semantic and syntactic information from limited annotated datasets. To address out-of-vocabulary challenges in biomedical text, the PCA-CLS (Position and Contextual Attention with CNN-LSTM-Softmax) model combines global self-attention with character-level convolutional neural network techniques. The model's performance is evaluated on eight distinct biomedical domain datasets encompassing entities such as genes, drugs, diseases, and species. PCA-CLS outperforms several state-of-the-art models, achieving notable F1-scores, including 88.19% on BC2GM, 85.44% on JNLPBA, 90.80% on BC5CDR-chemical, 87.07% on BC5CDR-disease, 89.18% on BC4CHEMD, 88.81% on NCBI, and 91.59% on the s800 dataset.
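The reported scores are entity-level F1. A minimal sketch of how micro-averaged F1 over predicted entity spans is typically computed is shown below; the (start, end, type) span representation is an assumption, not a detail from the paper.

```python
def f1_score(gold, pred):
    """Micro-averaged F1 over entity spans. gold and pred are sets of
    (start, end, type) tuples; a span counts as correct only on an
    exact boundary-and-type match."""
    if not gold or not pred:
        return 0.0
    tp = len(gold & pred)          # exact-match true positives
    precision = tp / len(pred)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```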

PMID:39012749 | DOI:10.1109/TCBB.2024.3429234

Categories: Literature Watch

fNIRS-Driven Depression Recognition Based on Cross-Modal Data Augmentation

Tue, 2024-07-16 06:00

IEEE Trans Neural Syst Rehabil Eng. 2024 Jul 16;PP. doi: 10.1109/TNSRE.2024.3429337. Online ahead of print.

ABSTRACT

Early diagnosis and intervention in depression promote complete recovery, yet traditional clinical assessment depends on diagnostic scales, the clinical experience of doctors, and patient cooperation. Recent research indicates that functional near-infrared spectroscopy (fNIRS) combined with deep learning provides a promising approach to depression diagnosis. However, collecting large fNIRS datasets within a standard experimental paradigm remains challenging, limiting the application of deep networks that require more data. To address these challenges, we propose an fNIRS-driven depression recognition architecture based on cross-modal data augmentation (fCMDA), which converts fNIRS data into pseudo-sequence activation images. The approach incorporates a time-domain augmentation mechanism, including time warping and time masking, to generate diverse data. Additionally, we design a stimulation task-driven pseudo-sequence method to map fNIRS data into pseudo-sequence activation images, facilitating the extraction of spatial-temporal, contextual, and dynamic characteristics. Finally, we construct a depression recognition model based on deep classification networks with an imbalance loss function. Extensive experiments on two-class depression diagnosis and five-class depression severity recognition yield accuracies of 0.905 and 0.889, respectively. The fCMDA architecture provides a novel solution for effective depression recognition with limited data.
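One of the time-domain augmentations named above, time masking, can be sketched as zeroing a random contiguous segment of a channel. The parameters and function shape below are illustrative assumptions, not the paper's implementation.

```python
import random

def time_mask(signal, max_len=10, rng=None):
    """Return a copy of a 1-D fNIRS channel (list of floats) with one
    random contiguous time segment of up to max_len samples zeroed out."""
    rng = rng or random.Random(0)
    out = list(signal)                      # do not mutate the input
    n = len(out)
    seg = rng.randint(1, min(max_len, n))   # masked segment length
    start = rng.randint(0, n - seg)         # segment start index
    for i in range(start, start + seg):
        out[i] = 0.0
    return out
```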

PMID:39012734 | DOI:10.1109/TNSRE.2024.3429337

Categories: Literature Watch

Concept-based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis

Tue, 2024-07-16 06:00

IEEE Trans Med Imaging. 2024 Jul 16;PP. doi: 10.1109/TMI.2024.3429148. Online ahead of print.

ABSTRACT

Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating image-level annotations, it aligns lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, owing to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
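The concept-level intervention described above can be illustrated with a toy concept-bottleneck model: the diagnosis is computed from lesion-concept scores, so a clinician can override one erroneous lesion score and the prediction is recomputed. The linear structure below is an assumption for illustration, not the paper's cross-attention classifier.

```python
def diagnose(concept_scores, weights):
    """Toy concept-bottleneck prediction: disease logit as a weighted
    sum of lesion-concept scores."""
    return sum(w * s for w, s in zip(weights, concept_scores))

def intervene(concept_scores, index, corrected_value):
    """Concept-level intervention: a clinician overrides one erroneous
    lesion prediction; the diagnosis is then recomputed from it."""
    fixed = list(concept_scores)
    fixed[index] = corrected_value
    return fixed
```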

PMID:39012729 | DOI:10.1109/TMI.2024.3429148

Categories: Literature Watch

Radiomics of pituitary adenoma using computer vision: a review

Tue, 2024-07-16 06:00

Med Biol Eng Comput. 2024 Jul 16. doi: 10.1007/s11517-024-03163-3. Online ahead of print.

ABSTRACT

Pituitary adenomas (PA) represent the most common type of sellar neoplasm. Extracting relevant information from radiological images is essential for decision support in addressing various objectives related to PA. Given the critical need for an accurate assessment of the natural progression of PA, computer vision (CV) and artificial intelligence (AI) play a pivotal role in automatically extracting features from radiological images. The field of "radiomics" involves the extraction of high-dimensional features, often referred to as "radiomic features," from digital radiological images. This survey offers an analysis of the current state of research in PA radiomics. Our work comprises a systematic review of 34 publications focused on PA radiomics and other automated information mining pertaining to PA through the analysis of radiological data using computer vision methods. We begin with a theoretical exploration essential for understanding the background of radiomics, encompassing traditional approaches from computer vision and machine learning as well as the latest methodologies in deep radiomics utilizing deep learning (DL). The 34 research works under examination are comprehensively compared and evaluated. The overall results achieved in the analyzed papers are high; for example, the best accuracy is up to 96% and the best AUC is up to 0.99, which establishes optimism for the successful use of radiomic features. Methods based on deep learning appear to be the most promising for the future. With respect to these DL methods, several challenges remain notable: high-quality and sufficiently extensive datasets must be created for training deep neural networks; the interpretability of deep radiomics is a major open challenge; and methods must be developed and verified that explain how deep radiomic features reflect physically explainable aspects.
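Classical (non-deep) radiomics pipelines typically start from first-order intensity statistics of a region of interest; a minimal sketch, assuming the ROI voxel intensities arrive as a flat list of floats:

```python
def first_order_features(roi):
    """Compute a few standard first-order radiomic features from ROI
    voxel intensities (flat list of floats)."""
    n = len(roi)
    mean = sum(roi) / n
    var = sum((v - mean) ** 2 for v in roi) / n
    std = var ** 0.5
    # skewness of the intensity histogram (0 for symmetric data)
    skew = (sum((v - mean) ** 3 for v in roi) / n) / std ** 3 if std else 0.0
    energy = sum(v * v for v in roi)
    return {"mean": mean, "variance": var, "skewness": skew, "energy": energy}
```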

PMID:39012416 | DOI:10.1007/s11517-024-03163-3

Categories: Literature Watch
