Deep learning
An interpretable XAI deep EEG model for schizophrenia diagnosis using feature selection and attention mechanisms
Front Oncol. 2025 Jul 22;15:1630291. doi: 10.3389/fonc.2025.1630291. eCollection 2025.
ABSTRACT
INTRODUCTION: Schizophrenia is a severe psychiatric disorder that significantly impacts an individual's life and is characterized by abnormalities in perception, behavior, and cognition. Conventional Schizophrenia diagnosis techniques are time-consuming and prone to error. This study proposes a novel automated technique for diagnosing Schizophrenia based on electroencephalogram (EEG) sensor data, aiming to enhance interpretability and prediction performance.
METHODS: This research utilizes Deep Learning (DL) models, including the Deep Neural Network (DNN), Bi-Directional Long Short-Term Memory-Gated Recurrent Unit (BiLSTM-GRU), and BiLSTM with Attention, for the detection of Schizophrenia based on EEG data. During preprocessing, SMOTE is applied to address the class imbalance. Important EEG characteristics that influence model decisions are highlighted by the interpretable BiLSTM-Attention model using attention weights in conjunction with the SHAP and LIME explainability tools. In addition, F-test feature selection refines the input dimensionality and increases learning efficiency.
RESULTS: Through the integration of feature importance analysis and conventional performance measures, this study presents valuable insights into the discriminative neurophysiological patterns associated with Schizophrenia, advancing both diagnostic and neuroscientific expertise. The experiment's findings show that the BiLSTM with attention mechanism model achieves an accuracy of 0.68.
DISCUSSION: The results show that the recommended approach is useful for Schizophrenia diagnosis.
PMID:40766336 | PMC:PMC12324168 | DOI:10.3389/fonc.2025.1630291
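The preprocessing steps named in the abstract above (SMOTE to balance the classes, F-test feature selection to trim input dimensionality) can be sketched as follows. The hand-rolled SMOTE-style interpolation and the toy stand-in for EEG feature vectors are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)

# Toy stand-in for EEG feature vectors: 100 samples, 20 features, imbalanced labels.
X = rng.normal(size=(100, 20))
y = np.array([0] * 80 + [1] * 20)
X[y == 1, :5] += 1.0  # make the first 5 features discriminative

def smote_like(X, y, minority=1, n_new=60, k=5, seed=0):
    """SMOTE-style oversampling sketch: interpolate between a minority
    sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    Xm = X[y == minority]
    new = []
    for _ in range(n_new):
        i = rng.integers(len(Xm))
        d = np.linalg.norm(Xm - Xm[i], axis=1)   # distances to other minority samples
        j = rng.choice(np.argsort(d)[1:k + 1])   # pick a random near neighbour
        lam = rng.random()
        new.append(Xm[i] + lam * (Xm[j] - Xm[i]))
    X_bal = np.vstack([X, np.array(new)])
    y_bal = np.concatenate([y, np.full(n_new, minority)])
    return X_bal, y_bal

X_bal, y_bal = smote_like(X, y)

# F-test feature selection: keep the 5 highest-scoring features.
selector = SelectKBest(f_classif, k=5).fit(X_bal, y_bal)
X_sel = selector.transform(X_bal)
print(X_bal.shape, X_sel.shape)  # (160, 20) (160, 5)
```

With the shifted features carrying the class signal, the F-test reliably ranks them above the pure-noise columns.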
Cross-level Cross-Scale Inference and Imputation of Single-cell Spatial Proteomics
Res Sq [Preprint]. 2025 Jul 28:rs.3.rs-7108570. doi: 10.21203/rs.3.rs-7108570/v1.
ABSTRACT
High-throughput single-cell and spatial omics technologies have transformed biological research. Despite these advances, reliably identifying the molecular drivers and their interplays across biological levels and scales remains a significant challenge. Current experimental methods are limited by batch effects, the lack of simultaneous multi-modal measurements in individual cells, limited coverage of measured proteins, poor generalization to unseen conditions, and insufficient spatial context at single-cell resolution. To overcome these challenges, we introduce scProSpatial, a unified, multi-modal, multi-scale deep learning framework designed to infer and impute high-fidelity single-cell spatial proteomics from scRNA-seq data. Through comprehensive evaluations, scProSpatial accurately predicts spatial abundances of proteins in the absence of shared transcriptomic features, expands protein coverage by 50 times, and generalizes robustly to out-of-distribution scenarios. A case study in metastatic breast cancer further illustrates its utility, demonstrating scProSpatial's potential to drive cross-level, cross-scale multi-omics integration and analysis and reveal deeper insights into complex biological systems.
PMID:40766228 | PMC:PMC12324605 | DOI:10.21203/rs.3.rs-7108570/v1
Measured spectrum environment map dataset with multi-radiation sources in urban scenarios
Data Brief. 2025 Jul 20;62:111909. doi: 10.1016/j.dib.2025.111909. eCollection 2025 Oct.
ABSTRACT
This paper presents a measured spectrum strength dataset in an urban scenario with multiple radiation sources, aiming to address the limited availability of open datasets for spectrum environment maps (SEMs) in realistic multi-source dynamic scenarios. The dataset was collected through high-precision measurements, covering the 30 MHz, 115 MHz, and 2 GHz frequency bands, with a spatial resolution of 1 m × 1 m. It includes spectrum strength or received signal strength (RSS) data in dBm for 80×105 grids. Each grid includes information such as longitude, latitude, altitude, and time. The experiment utilizes three radiation sources with isotropic antennas and a mobile signal receiving system equipped with a spectrum analyzer and a GPS module. It collects data along a pre-defined path at a constant speed. The key feature of this dataset is its realistic representation of the nonlinear characteristics of the propagation channel in a multi-radiation-source coexistence scenario. Its applications include the verification of spectrum map completion algorithms, wireless channel modelling, deep learning-driven signal prediction, and the optimization of Wi-Fi/cellular networks.
PMID:40766197 | PMC:PMC12319677 | DOI:10.1016/j.dib.2025.111909
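As a toy companion to this dataset description, the sketch below simulates a multi-source RSS grid in dBm with a log-distance path-loss model; the source positions, transmit powers, and path-loss exponent are invented for illustration and are not taken from the dataset:

```python
import numpy as np

def rss_dbm(tx_dbm, d, f_mhz, n=3.0, d0=1.0):
    """Received signal strength via log-distance path loss:
    free-space loss at reference distance d0, then a 10*n*log10(d/d0) slope."""
    fspl_d0 = 20 * np.log10(d0) + 20 * np.log10(f_mhz) - 27.55  # d in m, f in MHz
    return tx_dbm - (fspl_d0 + 10 * n * np.log10(np.maximum(d, d0) / d0))

# Three hypothetical sources on an 80x105 grid at 1 m resolution: (x, y, tx power in dBm)
sources = [(10, 20, 30.0), (60, 90, 30.0), (40, 50, 30.0)]
xs, ys = np.meshgrid(np.arange(80), np.arange(105), indexing="ij")
mw = np.zeros((80, 105))
for sx, sy, p in sources:
    d = np.hypot(xs - sx, ys - sy)
    mw += 10 ** (rss_dbm(p, d, f_mhz=115.0) / 10)  # sum source powers in mW
sem = 10 * np.log10(mw)  # combined spectrum environment map, back in dBm
print(sem.shape)  # (80, 105)
```

Synthetic grids like this are a common sanity check before running spectrum map completion algorithms on measured data.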
Current applications of deep learning in vertebral fracture diagnosis
Osteoporos Int. 2025 Aug 6. doi: 10.1007/s00198-025-07604-z. Online ahead of print.
ABSTRACT
Deep learning is a machine learning method that mimics neural networks to build decision-making models. Recent advances in computing power and algorithms have enhanced deep learning's potential for vertebral fracture diagnosis in medical imaging. The application of deep learning in vertebral fracture diagnosis, including the identification of vertebrae and classification of vertebral fracture types, might significantly reduce the workload of radiologists and orthopedic surgeons as well as greatly improve the accuracy of vertebral fracture diagnosis. In this narrative review, we will summarize the application of deep learning models in the diagnosis of vertebral fractures.
PMID:40764417 | DOI:10.1007/s00198-025-07604-z
BlurryScope enables compact, cost-effective scanning microscopy for HER2 scoring using deep learning on blurry images
NPJ Digit Med. 2025 Aug 6;8(1):506. doi: 10.1038/s41746-025-01882-x.
ABSTRACT
We developed a rapid scanning optical microscope, termed "BlurryScope", that leverages continuous image acquisition and deep learning to provide a cost-effective and compact solution for automated inspection and analysis of tissue sections. This device offers comparable speed to commercial digital pathology scanners, but at a significantly lower price point and smaller size/weight. Using BlurryScope, we implemented automated classification of human epidermal growth factor receptor 2 (HER2) scores on motion-blurred images of immunohistochemically (IHC) stained breast tissue sections, achieving concordant results with those obtained from a high-end digital scanning microscope. Using a test set of 284 unique patient cores, we achieved testing accuracies of 79.3% and 89.7% for 4-class (0, 1+, 2+, 3+) and 2-class (0/1+, 2+/3+) HER2 classification, respectively. BlurryScope automates the entire workflow, from image scanning to stitching and cropping, as well as HER2 score classification.
PMID:40764388 | DOI:10.1038/s41746-025-01882-x
A deep learning framework for gender sensitive speech emotion recognition based on MFCC feature selection and SHAP analysis
Sci Rep. 2025 Aug 5;15(1):28569. doi: 10.1038/s41598-025-14016-w.
ABSTRACT
Speech is one of the most efficient methods of communication among humans, inspiring advancements in machine speech processing under Natural Language Processing (NLP). This field aims to enable computers to analyze, comprehend, and generate human language naturally. Speech processing, as a subset of artificial intelligence, is rapidly expanding due to its applications in emotion recognition, human-computer interaction, and sentiment analysis. This study introduces a novel algorithm for emotion recognition from speech using deep learning techniques. The proposed model achieves up to a 15% improvement compared to state-of-the-art deep learning methods in speech emotion recognition. It employs advanced supervised learning algorithms and deep neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. These models are trained on labeled datasets to accurately classify emotions such as happiness, sadness, anger, fear, surprise, and neutrality. The research highlights the system's real-time application potential, such as analyzing audience emotional responses during live television broadcasts. By leveraging advancements in deep learning, the model achieves high accuracy in understanding and predicting emotional states, offering valuable insights into user behavior. This approach contributes to diverse domains, including media analysis, customer feedback systems, and human-machine interaction, showcasing the transformative potential of combining speech processing with neural networks.
PMID:40764384 | DOI:10.1038/s41598-025-14016-w
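The MFCC features named in the title are the standard front end for speech emotion recognition: frame the waveform, window it, take the power spectrum, apply a mel filterbank, then log and DCT. A from-scratch sketch follows; the frame sizes and filterbank parameters are common defaults, not values from the paper:

```python
import numpy as np
from scipy.fftpack import dct

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Minimal MFCC pipeline: frame -> window -> power spectrum
    -> mel filterbank -> log -> DCT."""
    # 1. frame the signal and apply a Hamming window
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # 2. power spectrum
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 3. triangular mel filterbank, filters equally spaced on the mel scale
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    imel = lambda m: 700 * (10 ** (m / 2595) - 1)
    pts = imel(np.linspace(mel(0), mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    # 4. log mel energies, 5. DCT -> cepstral coefficients
    feats = np.log(power @ fbank.T + 1e-10)
    return dct(feats, type=2, axis=1, norm="ortho")[:, :n_ceps]

tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone
print(mfcc(tone).shape)  # -> (61, 13)
```

Matrices like this (frames × coefficients) are what CNN or LSTM emotion classifiers typically consume.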
PreMode predicts mode-of-action of missense variants by deep graph representation learning of protein sequence and structural context
Nat Commun. 2025 Aug 5;16(1):7189. doi: 10.1038/s41467-025-62318-4.
ABSTRACT
Accurate prediction of the functional impact of missense variants is important for disease gene discovery, clinical genetic diagnostics, therapeutic strategies, and protein engineering. Previous efforts have focused on predicting a binary pathogenicity classification, but the functional impact of missense variants is multi-dimensional. Pathogenic missense variants in the same gene may act through different modes of action (i.e., gain/loss-of-function) by affecting different aspects of protein function. They may result in distinct clinical conditions that require different treatments. We develop a new method, PreMode, to perform gene-specific mode-of-action predictions. PreMode models effects of coding sequence variants using SE(3)-equivariant graph neural networks on protein sequences and structures. Using the largest-to-date set of missense variants with known modes of action, we show that PreMode reaches state-of-the-art performance in multiple types of mode-of-action predictions by efficient transfer-learning. Additionally, PreMode's prediction of G/LoF variants in a kinase is consistent with inactive-active conformation transition energy changes. Finally, we show that PreMode enables efficient study design of deep mutational scans and can be expanded to fitness optimization of non-human proteins with active learning.
PMID:40764308 | DOI:10.1038/s41467-025-62318-4
Smartphone video-based early diagnosis of blepharospasm using dual cross-attention modeling enhanced by facial pose estimation
NPJ Digit Med. 2025 Aug 5;8(1):505. doi: 10.1038/s41746-025-01904-8.
ABSTRACT
Blepharospasm is a focal dystonia characterized by involuntary eyelid contractions that impair vision and social function. The subtle clinical signs of blepharospasm make early and accurate diagnosis difficult, delaying timely intervention. In this study, we propose a dual cross-attention deep learning framework that integrates temporal video features and facial landmark dynamics to assess blepharospasm severity, frequency, and diagnosis from smartphone-recorded facial videos. A retrospective dataset of 847 patient videos collected from two hospitals (2016-2023) was used for model development. The model achieved high accuracy for severity (0.828) and frequency (0.82), and moderate performance for diagnosis (0.674). SHAP analysis identified case-specific video fragments contributing to predictions, enhancing interpretability. In a prospective evaluation on an independent dataset (N = 179), AI assistance improved junior ophthalmologists' diagnostic accuracy by up to 18.5%. These findings demonstrate the potential of an explainable, smartphone-compatible video model to support early detection and assessment of blepharospasm.
PMID:40764679 | DOI:10.1038/s41746-025-01904-8
YOLO-LeafNet: a robust deep learning framework for multispecies plant disease detection with data augmentation
Sci Rep. 2025 Aug 5;15(1):28513. doi: 10.1038/s41598-025-14021-z.
ABSTRACT
Plant diseases significantly harm crops, resulting in substantial economic losses across the globe. To reduce the harm that these diseases produce, plant diseases must be diagnosed accurately and in a timely manner. In this work, a YOLO-LeafNet approach is proposed for detecting diseases from leaf images of four distinct species, namely grape, bell pepper, corn, and potato. About 8850 leaf images were acquired for this work from five different publicly available datasets on Kaggle. All acquired images were pre-processed by applying four different image pre-processing operations. The number of images in the training dataset was tripled for better model performance by applying five different augmentation operations. The augmented dataset was then used to train YOLOv5, YOLOv8, and the proposed YOLO-LeafNet. The performance of all three models is evaluated in terms of recall, precision, and Mean Average Precision (mAP). YOLOv5 attained a precision of 0.861, recall of 0.868, mAP50 of 0.944, and mAP50-95 of 0.815; YOLOv8 attained a precision of 0.977, recall of 0.975, mAP50 of 0.984, and mAP50-95 of 0.915; whereas the proposed YOLO-LeafNet attained a precision of 0.985, recall of 0.980, mAP50 of 0.990, and mAP50-95 of 0.940. The experimental results reveal that the proposed YOLO-LeafNet outperformed YOLOv5 and YOLOv8 on all performance metrics.
PMID:40764650 | DOI:10.1038/s41598-025-14021-z
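The mAP50 figures quoted above average per-class average precision (AP) computed at an IoU threshold of 0.5. A minimal all-point-interpolated AP computation, with invented detection scores and TP/FP flags standing in for an IoU-matched detector output, looks like:

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    """All-point interpolated AP: sort detections by confidence,
    accumulate precision/recall, integrate under the precision envelope."""
    order = np.argsort(-np.asarray(scores))
    tp = np.asarray(is_tp, dtype=float)[order]
    tp_c, fp_c = np.cumsum(tp), np.cumsum(1.0 - tp)
    recall = tp_c / n_gt
    precision = tp_c / (tp_c + fp_c)
    # pad, take the monotone precision envelope, then sum P * delta-recall
    r = np.concatenate([[0.0], recall, [1.0]])
    p = np.concatenate([[1.0], precision, [0.0]])
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return float(np.sum((r[1:] - r[:-1]) * p[1:]))

# 5 detections against 4 ground-truth boxes; TP/FP flags assumed matched at IoU >= 0.5
ap = average_precision([0.9, 0.8, 0.7, 0.6, 0.5], [1, 1, 0, 1, 0], n_gt=4)
print(round(ap, 3))  # -> 0.688
```

mAP50 is then the mean of this quantity over classes; mAP50-95 additionally averages over IoU thresholds from 0.5 to 0.95.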
Transfer learning driven fake news detection and classification using large language models
Sci Rep. 2025 Aug 5;15(1):28490. doi: 10.1038/s41598-025-10670-2.
ABSTRACT
Today, the problem of using social media to spread false information is widespread and serious. The extensive dissemination of fake news, whether produced by human beings or by computer programs, harms both society and individuals, politically and socially. In the current era of social networks, the rapid dissemination of news makes it difficult to establish the reliability of information in a satisfactory manner, so automated technologies that can identify fake news have become of the utmost importance. Existing fake news detection methods often suffer from challenges such as limited labeled data, an inability to fully capture complex linguistic nuances, and inadequate integration of different embedding techniques, which restrict their effectiveness and generalizability. In this work, we propose a novel multi-stage transfer learning framework that leverages the strengths of pre-trained large language models, particularly RoBERTa, tailored specifically for fake news detection in limited-data scenarios. Unlike prior studies, which primarily rely on standard fine-tuning, our approach introduces a systematic comparison of word embedding techniques such as Word2Vec and one-hot encoding, combined with a refined fine-tuning process to enhance model performance and interpretability; combining multi-stage transfer learning with embedding comparisons and task-specific optimizations enables more robust and accurate detection on small datasets. Experimental results on two real-world benchmark datasets demonstrate that our method achieves an accuracy improvement of at least 3.9% over existing state-of-the-art models, while also providing insights into the role of embedding techniques in fake news classification.
PMID:40764622 | DOI:10.1038/s41598-025-10670-2
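For contrast with the RoBERTa pipeline described above, a minimal one-hot (binary bag-of-words) baseline of the kind the paper compares against can be sketched as follows; the toy headlines and labels are invented purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny toy corpus standing in for news headlines (labels are illustrative only)
texts = ["miracle cure doctors hate", "stocks fall after earnings report",
         "aliens built the pyramids", "senate passes budget bill",
         "secret trick melts fat overnight", "central bank holds rates steady"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = fake, 0 = real

# One-hot encoding: binary=True marks word presence/absence, not counts
vec = CountVectorizer(binary=True)
X = vec.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)
print(clf.predict(vec.transform(["miracle trick doctors hate"])))  # -> [1]
```

Baselines like this ignore word order and context entirely, which is exactly the gap contextual embeddings from models such as RoBERTa are meant to close.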
Public concerns about human metapneumovirus: insights from Google search trends, X social networks, and web news mining to enhance public health communication
BMC Public Health. 2025 Aug 5;25(1):2650. doi: 10.1186/s12889-025-24017-z.
ABSTRACT
The respiratory virus known as human metapneumovirus (hMPV) is linked to seasonal outbreaks and primarily affects elderly people and young children. Because surveillance gaps and underreporting impede public health interventions, infodemiology, which draws on digital data sources such as social media, online news, and search trends, offers a useful complement for monitoring public concerns and risk perceptions. To assess public search interest, we analyzed global search behavior between June 1, 2024, and June 1, 2025, and examined over 1.3 million tweets collected during the peak outbreak period from January to March 2025. Our findings show a sharp rise in public interest following official reports of an HMPV outbreak in China, with simultaneous search peaks across both hemispheres regardless of season. Search activity expanded to 177 countries and revealed sustained interest in Australia, Thailand, the United Kingdom, and the United States. Regional differences in terminology and platform usage were also observed, with non-English-speaking countries favoring the abbreviation "HMPV" and English-speaking regions more often using the full term. Additionally, discrepancies between search activity and social media engagement in some countries point to distinct patterns of public information-seeking behavior. These results underscore the importance of adapting health communication strategies to local language norms and preferred digital platforms. They also highlight the need for real-time monitoring and proactive responses to misinformation. Together, search and social media data offer a valuable lens for understanding public sentiment and improving the reach, accuracy, and impact of global outbreak communication.
PMID:40764541 | DOI:10.1186/s12889-025-24017-z
Dynamic and interpretable deep learning model for predicting respiratory failure following cardiac surgery
BMC Anesthesiol. 2025 Aug 5;25(1):394. doi: 10.1186/s12871-025-03239-z.
NO ABSTRACT
PMID:40764535 | DOI:10.1186/s12871-025-03239-z
Pyramidal attention-based T network for brain tumor classification: a comprehensive analysis of transfer learning approaches for clinically reliable AI hybrid approaches
Sci Rep. 2025 Aug 6;15(1):28669. doi: 10.1038/s41598-025-11574-x.
ABSTRACT
Brain tumors are a significant challenge to human health, as they impair the proper functioning of the brain and the general quality of life, thus requiring clinical intervention through early and accurate diagnosis. Although current state-of-the-art deep learning methods have achieved remarkable progress, there is still a gap in the representation learning of tumor-specific spatial characteristics and in the robustness of classification models on heterogeneous data. In this paper, we introduce a novel Pyramidal Attention-Based bi-partitioned T Network (PABT-Net) that combines a hierarchical pyramidal attention mechanism, T-block-based bi-partitioned feature extraction, and a self-convolutional dilated neural classifier as the final classification stage. This architecture increases spatial discriminability and reduces false predictions by adaptively focusing on informative regions in brain MRI images. The model was thoroughly tested on three benchmark datasets, the Figshare Brain Tumor Dataset, the Sartaj Brain MRI Dataset, and the Br35H Brain Tumor Dataset, containing 7023 images labeled in four tumor classes: glioma, meningioma, no tumor, and pituitary tumor. It attained an overall classification accuracy of 99.12%, a mean cross-validation accuracy of 98.77%, a Jaccard similarity index of 0.986, and a Cohen's Kappa value of 0.987, indicating superb generalization and clinical stability. The model's effectiveness is also confirmed by tumor-wise classification accuracies: 96.75%, 98.46%, and 99.57% for glioma, meningioma, and pituitary tumors, respectively. Comparative experiments with state-of-the-art models, including VGG19, MobileNet, and NASNet, were carried out, and ablation studies proved the effectiveness of NASNet incorporation. To capture more prominent spatial-temporal patterns, we investigated hybrid networks, including NASNet with ANN, CNN, LSTM, and CNN-LSTM variants. The framework implements a strict nine-fold cross-validation procedure.
It integrates a broad range of measures in its evaluation, including precision, recall, specificity, F1-score, AUC, confusion matrices, and the ROC analysis, consistent across distributions. In general, the PABT-Net model has high potential to be a clinically deployable, interpretable, state-of-the-art automated brain tumor classification model.
PMID:40764518 | DOI:10.1038/s41598-025-11574-x
Deep-learning-enabled online mass spectrometry of the reaction product of a single catalyst nanoparticle
Nat Commun. 2025 Aug 5;16(1):7203. doi: 10.1038/s41467-025-62602-3.
ABSTRACT
Extracting weak signals from noise is a generic challenge in experimental science. In catalysis, it manifests itself as the need to quantify chemical reactions on nanoscopic surface areas, such as single nanoparticles or even single atoms. Here, we address this challenge by combining the ability of nanofluidic reactors to focus reaction product from tiny catalyst surfaces towards online mass spectrometric analysis with the high capacity of a constrained denoising auto-encoder to discern weak signals from noise. Using CO oxidation and C2H4 hydrogenation on Pd as model reactions, we demonstrate that the catalyst surface area required for online mass spectrometry can be reduced by ≈ 3 orders of magnitude compared to state of the art, down to a single nanoparticle with 0.0072 ± 0.00086 μm2 surface area. These results advocate deep learning to improve resolution in mass spectrometry in general and for online reaction analysis in single-particle catalysis in particular.
PMID:40764516 | DOI:10.1038/s41467-025-62602-3
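A rough analogue of the denoising auto-encoder idea above, swapping the paper's constrained architecture for a small bottleneck MLP trained to map noisy synthetic traces back to clean ones, might look like this; the Gaussian-peak "mass spec" traces and noise level are invented for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic traces: one Gaussian peak at a random position, plus noise
t = np.linspace(0, 1, 64)
def trace(pos):
    return np.exp(-((t - pos) ** 2) / 0.002)

clean = np.array([trace(p) for p in rng.uniform(0.2, 0.8, 500)])
noisy = clean + rng.normal(0, 0.3, clean.shape)

# Denoising auto-encoder stand-in: bottleneck MLP trained noisy -> clean
dae = MLPRegressor(hidden_layer_sizes=(32, 8, 32), max_iter=500, random_state=0)
dae.fit(noisy, clean)
denoised = dae.predict(noisy)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_dae = float(np.mean((denoised - clean) ** 2))
print(denoised.shape, mse_dae < mse_noisy)
```

The bottleneck forces the network to represent each trace by a few latent factors (here, essentially the peak position), which is what lets it discard the noise.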
Development and evaluation of deep neural networks for the classification of subtypes of renal cell carcinoma from kidney histopathology images
Sci Rep. 2025 Aug 5;15(1):28585. doi: 10.1038/s41598-025-10712-9.
ABSTRACT
Kidney cancer is a leading cause of cancer-related mortality, with renal cell carcinoma (RCC) being the most prevalent form, accounting for 80-85% of all renal tumors. Traditional diagnosis of kidney cancer requires manual examination and analysis of histopathology images, which is time-consuming, error-prone, and dependent on the pathologist's expertise. Recently, deep learning algorithms have gained significant attention in histopathology image analysis. In this study, we developed an efficient and robust deep learning architecture called RenalNet for the classification of subtypes of RCC from kidney histopathology images. RenalNet is designed to capture cross-channel and inter-spatial features at three different scales simultaneously and combine them. Cross-channel features refer to the relationships and dependencies between different data channels, while inter-spatial features refer to patterns within small spatial regions. The architecture contains a CNN module called multiple channel residual transformation (MCRT), which focuses on the most relevant morphological features of RCC by fusing information from multiple paths. Further, to improve the network's representation power, a CNN module called Group Convolutional Deep Localization (GCDL) has been introduced, which effectively integrates three different feature descriptors. As part of this study, we also introduce a novel benchmark dataset for the classification of subtypes of RCC from kidney histopathology images. We obtained digital hematoxylin and eosin (H&E) stained WSIs from The Cancer Genome Atlas (TCGA) and acquired regions of interest (ROIs) under the supervision of experienced pathologists, resulting in the creation of patches. To demonstrate that the proposed model is generalized and independent of the dataset, it was evaluated on three well-known datasets.
Compared to the best-performing state-of-the-art model, RenalNet achieves accuracies of 91.67%, 97.14%, and 97.24% on the three datasets. Additionally, the proposed method significantly reduces the number of parameters and FLOPs, demonstrating computational efficiency with 2.71 × [Formula: see text] FLOPs and 0.2131 × [Formula: see text] parameters.
PMID:40764501 | DOI:10.1038/s41598-025-10712-9
Encrypted traffic classification encoder based on lightweight graph representation
Sci Rep. 2025 Aug 5;15(1):28564. doi: 10.1038/s41598-025-05225-4.
ABSTRACT
In recent years, traffic encryption technology has been widely adopted for user information protection, leading to a substantial increase in encrypted traffic in communication networks. To address issues such as unclear local key features and low classification accuracy in traditional malicious traffic detection and normal application classification, this paper introduces an encrypted traffic classification encoder based on lightweight graph representation. By converting packet byte sequences into graphs to construct byte-level traffic graphs, we propose a weighted output, applied through a weight matrix, to facilitate model lightweighting. The lightweight graph representation serves as the network input, and the design mainly includes an embedding layer, a traffic encoder layer based on graph neural networks, and a time information extraction layer, which can separately embed headers and payloads. We propose using GraphSAGE with mean-based sampling and aggregation to encode each byte-level traffic graph into an overall representation vector for each packet. For end-to-end training, an improved Transformer-based model is employed with relative position encoding of time series to generate final classification results for downstream tasks. To evaluate the reliability of the method, the proposed approach is tested on three application classification datasets, WWT, ISCX-2012, and ISCX-Tor, for classifying network encrypted traffic, and ablation experiments are conducted for comparison. Ultimately, comparisons are made with more than 12 baseline models. The results show that the F1 scores reached 0.9938 and 0.9856 on ISCX-2012 and ISCX-Tor, respectively. Through lightweight experiments, it is found that the number of parameters is reduced by 18.2% compared to that of the original model TFE-GNN.
Therefore, the results indicate that the proposed improved method can enhance the accuracy of detecting network traffic applications and abnormal behaviors while reducing the model's parameter count. Considering both the model parameters and accuracy dimensions, this paper introduces a lightweight graph representation-based encrypted traffic classification encoder that outperforms various existing models.
PMID:40764490 | DOI:10.1038/s41598-025-05225-4
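The GraphSAGE mean-aggregation step named above can be written out in a few lines: each node combines its own features with the mean of its neighbours' features through two learned weight matrices. The toy byte-level graph, feature dimensions, and random weights below are illustrative assumptions:

```python
import numpy as np

def sage_mean_layer(H, adj, W_self, W_neigh):
    """One GraphSAGE layer with mean aggregation:
    h_v' = ReLU(W_self @ h_v + W_neigh @ mean(h_u for u in N(v)))."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh_mean = (adj @ H) / np.maximum(deg, 1)
    out = H @ W_self.T + neigh_mean @ W_neigh.T
    return np.maximum(out, 0)  # ReLU

# Toy byte-level traffic graph: 4 nodes (bytes) with 3-dim features
H = np.eye(4, 3)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
W_self, W_neigh = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
Z = sage_mean_layer(H, adj, W_self, W_neigh)
graph_vec = Z.mean(axis=0)  # readout: one vector per packet graph
print(Z.shape, graph_vec.shape)  # (4, 2) (2,)
```

The final mean readout is one simple way to turn per-node embeddings into the per-packet representation vector the abstract describes.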
Road damage detection based on improved YOLO algorithm
Sci Rep. 2025 Aug 5;15(1):28506. doi: 10.1038/s41598-025-14461-7.
ABSTRACT
With urbanization accelerating and transportation demand growing, road damage has become an increasingly pressing issue. Traditional manual inspection methods are not only time-consuming but also costly, and struggle to meet current demands. As a result, adopting deep learning-based road damage detection technologies has emerged as a leading-edge and efficient solution. This paper presents an enhanced object detection algorithm built upon YOLOv5. By integrating CA (Channel Attention) and SA (Spatial Attention) dual-branch attention mechanisms alongside the GIoU (Generalized Intersection over Union) loss, the model's detection accuracy and localization capabilities are strengthened. The dual-branch attention mechanisms enhance feature representation in the channel and spatial dimensions, while the GIoU loss optimizes bounding box regression, yielding notable improvements, particularly in small object detection and bounding box localization accuracy. Public datasets are used for training and testing, with pavement distress indices derived from simulated detection calculations. Experimental results show that, compared to existing methods, this algorithm boosts the retrieval rate by 2.3%, increases the average value by 0.3, and improves the harmonic mean F1 by 0.7 relative to other models. Additionally, expected pavement evaluation results are obtained through calculating PCI (Pavement Condition Index) values.
PMID:40764422 | DOI:10.1038/s41598-025-14461-7
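The GIoU loss mentioned above extends plain IoU with a penalty based on the smallest enclosing box, which keeps the regression signal informative even when the predicted and ground-truth boxes do not overlap. A minimal sketch with invented example boxes:

```python
def giou(a, b):
    """Generalized IoU for boxes (x1, y1, x2, y2):
    GIoU = IoU - area(C minus union) / area(C), where C is the
    smallest box enclosing both a and b. The loss is then 1 - GIoU."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box C
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c = (cx2 - cx1) * (cy2 - cy1)
    return iou - (c - union) / c

print(round(giou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # partial overlap -> -0.0794
print(round(giou((0, 0, 1, 1), (2, 2, 3, 3)), 4))  # disjoint boxes -> -0.7778
```

Unlike IoU, which is 0 for all disjoint box pairs, GIoU becomes more negative as the boxes move apart, so the gradient still pulls them together.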
Vision-based volumetric estimation of localized construction and demolition waste
Waste Manag. 2025 Aug 4;206:115046. doi: 10.1016/j.wasman.2025.115046. Online ahead of print.
ABSTRACT
Accurate estimation of the quantity of localized construction and demolition waste (CDW) is critical for optimizing the upstream operations of the CDW's reverse supply chain (RSC). However, existing studies extensively focus on downstream RSC operations with approaches that quantify large-scale material stockpiles through semi-automated workflows relying on expensive, non-portable devices. These approaches are impractical for upstream operations such as quantifying small-scale, localized CDW stockpiles scattered around urban environments, requiring frequent estimations. In contrast, this study proposes a novel vision-based framework that enables automated, fast, and accurate volume estimation of small-scale localized CDW using a consumer-grade imaging device. The framework incorporates a hybrid segmentation technique involving a ground plane identification process through a novel rule-based modification to the Random Sample Consensus (RANSAC) algorithm, followed by a clustering process. A new Multi-View Classification Model (MVCM) based on ResNet-50 architecture is also developed to recognize CDW clusters. A Delaunay triangulation-based approach estimates the volume of recognized CDW clusters. The framework is developed and validated using one of the most extensive datasets comprising 184 scans from the laboratory and the field environment. The MVCM achieved a high F1 score of 0.97 for identifying CDW using 3500 images. The framework demonstrates high accuracy for volume estimation, achieving an absolute percentage error (APE) of 8.97% compared to manual measurements. The overall process achieves an end-to-end processing time of 11 min, underscoring its efficiency and suitability for field deployment. The proposed framework is of significant practical value for localized CDW quantification and decision-making in upstream RSC operations.
PMID:40763364 | DOI:10.1016/j.wasman.2025.115046
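The paper's rule-based RANSAC modification is not specified in this abstract, but the baseline RANSAC ground-plane identification it builds on can be sketched as follows, with synthetic ground and debris points standing in for a real scan:

```python
import numpy as np

def ransac_ground_plane(pts, n_iter=200, tol=0.02, seed=0):
    """Fit the dominant plane with RANSAC: sample 3 points, build the
    plane through them, count inliers within `tol`, keep the best model."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        n = n / norm
        dist = np.abs((pts - p0) @ n)  # point-to-plane distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

rng = np.random.default_rng(1)
ground = np.c_[rng.uniform(0, 5, (300, 2)), rng.normal(0, 0.005, 300)]  # z ~ 0
pile = rng.uniform(1, 2, (100, 3))  # debris cluster above the ground
pts = np.vstack([ground, pile])
mask = ransac_ground_plane(pts)
print(mask[:300].mean(), mask[300:].mean())
```

Once the ground plane is removed, the remaining points can be clustered and passed to recognition and Delaunay-based volume estimation, as the abstract outlines.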
Leveraging deep learning for the detection of socially desirable tendencies in personnel selection: A proof-of-concept
PLoS One. 2025 Aug 5;20(8):e0329205. doi: 10.1371/journal.pone.0329205. eCollection 2025.
ABSTRACT
We propose a deep learning-based method for detecting Socially Desirable Responding (SDR), the tendency for individuals to distort questionnaire responses to present themselves in a favorable light. Our objective is to showcase that such novel methods can be leveraged to design instruments that have the potential to measure this construct effectively. Participants' tendency to engage in SDR was initially modelled by specifying a latent variable model from Big Five personality scores, using data from 91 participants in a job application simulation (Big Five questionnaire and video introduction). Nonverbal visual cues (5,460 data points following data augmentation) were extracted from the participants' video presentations in the form of sequences of images for training a transfer learning model designated as Entrans. The objective of Entrans is to discern patterns within these cues in order to detect whether sample participants manifest a higher or lower SDR tendency. We conducted a regression-based prediction task to train and evaluate Entrans, resulting in a promising performance (MSE = .07, RMSE = .27, ρ = .27). A further analysis was conducted using a classification-based prediction task, which corroborated the potential of Entrans as a tool for detecting SDR (AUC = .71). These results were further analyzed with a Grad-CAM method to elucidate the underlying model behaviors. Findings suggest that the middle and lower parts of the face were the regions relied upon by Entrans to identify individuals with a higher tendency toward SDR in the classification task. These tentative interpretations give rise to the suggestion that socially desirable responding in a questionnaire and impression management in a job interview might share a common underlying cause.
While the detection of SDR during personnel selection presents a significant challenge for organizations, our proof-of-concept demonstrates how machine learning might be leveraged to develop practical solutions as well as address theoretical questions.
PMID:40763162 | DOI:10.1371/journal.pone.0329205
UFPF: A Universal Feature Perception Framework for Microscopic Hyperspectral Images
IEEE Trans Image Process. 2025 Aug 5;PP. doi: 10.1109/TIP.2025.3594151. Online ahead of print.
ABSTRACT
In recent years, deep learning has shown immense promise in advancing medical hyperspectral imaging diagnostics at the microscopic level. Despite this progress, most existing research models remain constrained to single-task or single-scene applications, lacking robust collaborative interpretation of microscopic hyperspectral features and spatial information, thereby failing to fully explore the clinical value of hyperspectral data. In this paper, we propose a microscopic hyperspectral universal feature perception framework (UFPF), which extracts high-quality spatial-spectral features of hyperspectral data, providing a robust feature foundation for downstream tasks. Specifically, this innovative framework captures different sequential spatial nearest-neighbor relationships through a hierarchical corner-to-center mamba structure. It incorporates the concept of "progressive focus towards the center", starting by emphasizing edge information and gradually refining attention from the edges towards the center. This approach effectively integrates richer spatial-spectral information, boosting the model's feature extraction capability. On this basis, a dual-path spatial-spectral joint perception module is developed to achieve the complementarity of spatial and spectral information and fully explore the potential patterns in the data. In addition, a Mamba-attention Mix-alignment is designed to enhance the optimized alignment of deep semantic features. The experimental results on multiple datasets have shown that this framework significantly improves classification and segmentation performance, supporting the clinical application of medical hyperspectral data. The code is available at: https://github.com/Qugeryolo/UFPF.
PMID:40763051 | DOI:10.1109/TIP.2025.3594151