Deep learning

Development of an abdominal acupoint localization system based on AI deep learning

Mon, 2025-03-17 06:00

Zhongguo Zhen Jiu. 2025 Mar 12;45(3):391-396. doi: 10.13703/j.0255-2930.20240207-0003. Epub 2024 Oct 28.

ABSTRACT

This study aims to develop an abdominal acupoint localization system based on computer vision and convolutional neural networks (CNNs). To address the challenge of abdominal acupoint localization, a multi-task CNN architecture was constructed and trained to locate Shenque (CV8) and the boundaries of the body. Based on the identified Shenque (CV8), the system further infers the locations of four acupoints: Shangwan (CV13), Qugu (CV2), and bilateral Daheng (SP15). An affine transformation matrix is applied to map image coordinates to an acupoint template space, achieving precise localization of abdominal acupoints. Testing verified that the system can accurately identify and locate abdominal acupoints in images. This localization system provides technical support for TCM remote education, diagnostic assistance, and advanced TCM equipment such as intelligent acupuncture robots, facilitating the standardization and intelligent advancement of acupuncture.
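
The abstract gives no implementation details; as a rough sketch of the affine-mapping step it describes, the following NumPy snippet (all landmark coordinates and template values are hypothetical) estimates a 2×3 affine matrix from detected landmarks and maps a template-defined acupoint back to pixel coordinates:

```python
import numpy as np

# Hypothetical pixel coordinates of detected landmarks: Shenque (CV8),
# one body-boundary point, and one point toward Shangwan (CV13).
image_pts = np.array([[412.0, 355.0], [250.0, 360.0], [409.0, 210.0]])

# The same landmarks in the acupoint template space (template units).
template_pts = np.array([[0.0, 0.0], [-1.0, 0.0], [0.0, 1.5]])

# Solve [x, y, 1] @ A.T = [u, v] for the 2x3 affine matrix A.
X = np.hstack([image_pts, np.ones((3, 1))])
A = np.linalg.lstsq(X, template_pts, rcond=None)[0].T  # shape (2, 3)

# Template-defined acupoints map back to pixels via the inverse transform.
A_h = np.vstack([A, [0.0, 0.0, 1.0]])                  # homogeneous 3x3 form
to_pixels = np.linalg.inv(A_h)[:2]
daheng_left = np.array([-0.77, 0.0, 1.0])              # illustrative template point
print(to_pixels @ daheng_left)                         # predicted pixel location
```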

PMID:40097227 | DOI:10.13703/j.0255-2930.20240207-0003

Categories: Literature Watch

Artificial intelligence for predicting interstitial fibrosis and tubular atrophy using diagnostic ultrasound imaging and biomarkers

Mon, 2025-03-17 06:00

BMJ Health Care Inform. 2025 Mar 17;32(1):e101192. doi: 10.1136/bmjhci-2024-101192.

ABSTRACT

BACKGROUND: Chronic kidney disease (CKD) is a global health concern characterised by irreversible renal damage that is often assessed using invasive renal biopsy. Accurate evaluation of interstitial fibrosis and tubular atrophy (IFTA) is crucial for CKD management. This study aimed to leverage machine learning (ML) models to predict IFTA using a combination of ultrasonography (US) images and patient biomarkers.

METHODS: We retrospectively collected US images and biomarkers from 632 patients with CKD across three hospitals. The data were subjected to pre-processing, exclusion of sub-optimal images, and feature extraction using a dual-path convolutional neural network. Various ML models, including XGBoost, random forest and logistic regression, were trained and validated using fivefold cross-validation.
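
As a hedged illustration of the fusion step described above, a minimal scikit-learn sketch (with synthetic stand-ins for the CNN embeddings and biomarkers, which the paper does not publish) might concatenate the two feature sets and evaluate AUROC under fivefold cross-validation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 632  # number of patients reported in the abstract

# Stand-ins for the real inputs: a CNN embedding per ultrasound image
# and a small panel of clinical biomarkers.
image_features = rng.normal(size=(n, 128))   # hypothetical CNN embeddings
biomarkers = rng.normal(size=(n, 6))         # hypothetical biomarker panel
y = rng.integers(0, 2, size=n)               # IFTA label (binary here)

# Late fusion by simple concatenation, then a linear classifier.
X = np.hstack([image_features, biomarkers])
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auroc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
print(f"fivefold AUROC: {auroc.mean():.3f} +/- {auroc.std():.3f}")
```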

RESULTS: The dataset was divided into training and test datasets. For image-level IFTA classification, the best performance was achieved by combining US image features and patient biomarkers, with logistic regression yielding an area under the receiver operating characteristic curve (AUROC) of 99%. At the patient level, logistic regression combining US image features and biomarkers provided an AUROC of 96%. Models trained solely on US image features or biomarkers also exhibited high performance, with AUROC exceeding 80%.

CONCLUSION: Our artificial intelligence-based approach to IFTA classification demonstrated high accuracy and AUROC across various ML models. When patient biomarkers were used alone, the method still provided a non-invasive and robust tool for early CKD assessment, suggesting that biomarkers may suffice for accurate prediction without the added complexity of image-derived features.

PMID:40097202 | DOI:10.1136/bmjhci-2024-101192

Categories: Literature Watch

Magnetic resonance imaging-based radiation treatment plans for dogs may be feasible with the use of generative adversarial networks

Mon, 2025-03-17 06:00

Am J Vet Res. 2025 Mar 17:1-8. doi: 10.2460/ajvr.24.08.0248. Online ahead of print.

ABSTRACT

OBJECTIVE: The purpose of this research was to examine the feasibility of utilizing generative adversarial networks (GANs) to generate accurate pseudo-CT images for dogs.

METHODS: This study used standard head CT images and T1-weighted transverse contrast-enhanced 3-D fast spoiled gradient echo head MRI images from 45 nonbrachycephalic dogs that received treatment between 2014 and 2023. Two conditional GANs (CGANs), one with a U-Net generator and a PatchGAN discriminator and another with a residual neural network (ResNet) U-Net generator and a ResNet discriminator, were used to generate the pseudo-CT images.
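
For readers unfamiliar with the PatchGAN idea, a minimal PyTorch sketch of a conditional-GAN discriminator of that family follows; the layer sizes are generic defaults, not the paper's configuration:

```python
import torch
import torch.nn as nn

class PatchGANDiscriminator(nn.Module):
    """Classifies overlapping image patches as real or fake, so the loss
    penalizes structure at the patch scale rather than the whole image."""
    def __init__(self, in_channels=2):  # MRI + (pseudo-)CT stacked for a CGAN
        super().__init__()
        def block(cin, cout, stride):
            return [nn.Conv2d(cin, cout, 4, stride, 1),
                    nn.InstanceNorm2d(cout),
                    nn.LeakyReLU(0.2, inplace=True)]
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            *block(64, 128, 2), *block(128, 256, 2), *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1),  # one logit per receptive-field patch
        )

    def forward(self, mri, ct):
        return self.net(torch.cat([mri, ct], dim=1))

d = PatchGANDiscriminator()
logits = d(torch.randn(1, 1, 256, 256), torch.randn(1, 1, 256, 256))
print(logits.shape)  # torch.Size([1, 1, 30, 30]): a grid of patch verdicts
```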

RESULTS: The CGAN with a ResNet U-Net generator and ResNet discriminator had an average mean absolute error of 109.5 ± 153.7 HU, average peak signal-to-noise ratio of 21.2 ± 4.31 dB, normalized mutual information of 0.89 ± 0.05, and Dice similarity coefficient of 0.91 ± 0.12. The Dice similarity coefficient for bone was 0.71 ± 0.17. Qualitative results indicated that the most common ranking was "slightly similar" for both models. The CGAN with a ResNet U-Net generator and ResNet discriminator produced more accurate pseudo-CT images than the CGAN with a U-Net generator and PatchGAN discriminator.

CONCLUSIONS: The study concludes that CGANs can generate relatively accurate pseudo-CT images but suggests exploring alternative GAN extensions.

CLINICAL RELEVANCE: Implementing generative learning into veterinary radiation therapy planning demonstrates the potential to reduce imaging costs and time.

PMID:40096825 | DOI:10.2460/ajvr.24.08.0248

Categories: Literature Watch

Optimized attention-enhanced U-Net for autism detection and region localization in MRI

Mon, 2025-03-17 06:00

Psychiatry Res Neuroimaging. 2025 Mar 14;349:111970. doi: 10.1016/j.pscychresns.2025.111970. Online ahead of print.

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental condition that affects a child's cognitive and social skills and is often diagnosed only after symptoms appear around age 2. Leveraging MRI for early ASD detection can improve intervention outcomes. This study proposes a framework for autism detection and region localization using an optimized deep learning approach with attention mechanisms. The pipeline includes MRI image collection, pre-processing (bias field correction, histogram equalization, artifact removal, and non-local mean filtering), and autism classification with a Symmetric Structured MobileNet with Attention Mechanism (SSM-AM). Enhanced by Refreshing Awareness-aided Election-Based Optimization (RA-EBO), SSM-AM achieves robust classification. Abnormality region localization utilizes a Multiscale Dilated Attention-based Adaptive U-Net (MDA-AUnet), further optimized by RA-EBO. Experimental results demonstrate that the proposed model outperforms existing methods, achieving an accuracy of 97.29%, sensitivity of 97.27%, specificity of 97.36%, and precision of 98.98%, significantly improving classification and localization performance. These results highlight the potential of this approach for early ASD diagnosis and targeted interventions. The datasets utilized for this work are publicly available at https://fcon_1000.projects.nitrc.org/indi/abide/.
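
A hedged sketch of the pre-processing steps named above, using SimpleITK for bias field correction and OpenCV for equalization and non-local means filtering (the slice index and filter parameters are illustrative, not the paper's settings), could look like this:

```python
import SimpleITK as sitk
import numpy as np
import cv2

def preprocess_slice(nifti_path):
    # 1) N4 bias field correction on the MRI volume.
    img = sitk.Cast(sitk.ReadImage(nifti_path), sitk.sitkFloat32)
    mask = sitk.OtsuThreshold(img, 0, 1, 200)
    corrected = sitk.N4BiasFieldCorrection(img, mask)

    # 2) Take one axial slice and rescale to 8-bit for the OpenCV steps.
    sl = sitk.GetArrayFromImage(corrected)[80]  # slice index: illustrative
    sl = cv2.normalize(sl, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # 3) Histogram equalization, then non-local means filtering.
    sl = cv2.equalizeHist(sl)
    sl = cv2.fastNlMeansDenoising(sl, None, h=10,
                                  templateWindowSize=7, searchWindowSize=21)
    return sl
```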

PMID:40096789 | DOI:10.1016/j.pscychresns.2025.111970

Categories: Literature Watch

Exploring the significance of the frontal lobe for diagnosis of schizophrenia using explainable artificial intelligence and group level analysis

Mon, 2025-03-17 06:00

Psychiatry Res Neuroimaging. 2025 Mar 13;349:111969. doi: 10.1016/j.pscychresns.2025.111969. Online ahead of print.

ABSTRACT

Schizophrenia (SZ) is a complex mental disorder characterized by a profound disruption in cognition and emotion, often resulting in a distorted perception of reality. Magnetic resonance imaging (MRI) is an essential tool for diagnosing SZ that helps in understanding the organization of the brain. Functional MRI (fMRI) is a specialized imaging technique for measuring and mapping brain activity by detecting changes in blood flow and oxygenation. This paper uses an explainable deep learning approach together with group-level analysis of both structural MRI (sMRI) and fMRI data to identify brain regions significant to SZ. Grad-CAM heat maps show clear activation in the frontal lobe for the classification of SZ versus controls (CN), with 97.33% accuracy. The group-difference analysis reveals that sMRI data show intense voxel activity in the right superior frontal gyrus of the frontal lobe in SZ patients. The group difference between SZ and CN during n-back tasks in the fMRI data likewise indicates significant voxel activation in the frontal cortex. These findings suggest that the frontal lobe plays a crucial role in the diagnosis of SZ, aiding clinicians in planning treatment.
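
Grad-CAM itself is a standard technique; a minimal PyTorch sketch (with a stock ResNet-18 as a stand-in for the paper's classifier and an illustrative class index) shows how such heat maps are computed:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None)  # stand-in; the paper's own classifier differs
model.eval()

activations, gradients = {}, {}
layer = model.layer4  # last conv block: a typical Grad-CAM target

layer.register_forward_hook(lambda m, i, o: activations.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: gradients.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)   # stand-in for a brain image slice
score = model(x)[0, 1]            # logit of the SZ class (index illustrative)
score.backward()

# Channel weights = global-average-pooled gradients; CAM = weighted sum.
w = gradients["g"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * activations["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=(224, 224), mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heat map in [0, 1]
```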

PMID:40096788 | DOI:10.1016/j.pscychresns.2025.111969

Categories: Literature Watch

Deep learning algorithm classification of tympanostomy tube images from a heterogenous pediatric population

Mon, 2025-03-17 06:00

Int J Pediatr Otorhinolaryngol. 2025 Mar 13;192:112311. doi: 10.1016/j.ijporl.2025.112311. Online ahead of print.

ABSTRACT

IMPORTANCE: The ability to augment routine post-operative tube check appointments with at-home digital otoscopes and deep learning AI could improve health care access as well as reduce financial and time burden on families.

OBJECTIVE: Tympanostomy tube checks are necessary but are burdensome to families and limit access to care for other children seeking otolaryngologic care. Telemedicine care would be ideal, but ear exams are difficult to perform remotely. This study aimed to assess whether an artificial intelligence (AI) algorithm trained with images from an over-the-counter digital otoscope can accurately assess tube status as in place and patent, extruded, or absent.

DESIGN: A prospective study of children aged 10 months to 10 years being seen for tympanostomy tube follow-up was carried out in three clinics from May-November 2023. A smartphone otoscope was used by non-MDs to capture images of the ear canal and tympanic membranes. Pediatric otolaryngologist exam findings (tube in place, extruded, absent) were used as a gold standard. A deep learning algorithm was trained and tested with these images. Statistical analysis was performed to determine the performance of the algorithm.

SETTING: 3 urban, pediatric otolaryngology clinics within an academic medical center.

PARTICIPANTS: Pediatric patients aged 10 months to 10 years with a past or current history of tympanostomy tubes were recruited. Patients were excluded from this study if they had a history of myringoplasty, tympanoplasty, or cholesteatoma.

MAIN OUTCOME MEASURE: Calculated accuracy, sensitivity, and specificity for the deep learning algorithm in classifying tube status as either in place and patent, extruded into the external ear canal, or absent.

RESULTS: A heterogeneous group of 69 children yielded 296 images. Multiple types of tympanostomy tubes were included. The image capture success rate was 90.8% in all subjects and 80% in children with developmental delay/autism spectrum disorder. The classification accuracy was 97.1%, sensitivity 97.1%, and specificity 98.6%.
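
For reference, per-class sensitivity and specificity for a three-way tube-status classifier can be derived from a confusion matrix; the labels below are hypothetical, not the study's data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

classes = ["in place and patent", "extruded", "absent"]
# Hypothetical otolaryngologist labels vs. algorithm predictions.
y_true = np.array([0, 0, 1, 2, 0, 1, 2, 2, 0, 1])
y_pred = np.array([0, 0, 1, 2, 0, 1, 2, 1, 0, 1])

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print("accuracy:", accuracy_score(y_true, y_pred))

for k, name in enumerate(classes):
    tp = cm[k, k]
    fn = cm[k].sum() - tp
    fp = cm[:, k].sum() - tp
    tn = cm.sum() - tp - fn - fp
    print(f"{name}: sensitivity={tp/(tp+fn):.2f} specificity={tn/(tn+fp):.2f}")
```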

CONCLUSION: A deep learning algorithm was trained with images from a representative pediatric population. It was highly accurate, sensitive, and specific. These results suggest that AI technology could be used to augment tympanostomy tube checks.

PMID:40096786 | DOI:10.1016/j.ijporl.2025.112311

Categories: Literature Watch

Extraction of fetal heartbeat locations in abdominal phonocardiograms using deep attention transformer

Mon, 2025-03-17 06:00

Comput Biol Med. 2025 Mar 16;189:110002. doi: 10.1016/j.compbiomed.2025.110002. Online ahead of print.

ABSTRACT

Assessing fetal health traditionally involves techniques like echocardiography, which require skilled professionals and specialized equipment, making them unsuitable for low-resource settings. An emerging alternative is phonocardiography (PCG), which offers affordability but suffers from challenges related to accuracy and complexity. To address these limitations, we propose a deep learning model, Fetal Heart Sounds U-NetR (FHSU-NETR), capable of extracting both fetal and maternal heart rates directly from raw PCG signals. FHSU-NETR is designed for practical implementation in various healthcare environments, enhancing the accessibility and reliability of fetal monitoring. The proposed pipeline utilizes the transformer's self-attention mechanism for its capacity to model long-range interactions and capture global context. Validated with data from 20 subjects, including one case of fetal tachycardia arrhythmia, FHSU-NETR demonstrated exceptional performance. It accurately identified most fetal heartbeat locations, with a low mean difference in fetal heart rate estimation (-2.55±10.25 bpm) across the entire dataset, and successfully detected the arrhythmia case. Similarly, FHSU-NETR showed a low mean difference in maternal heart rate estimation (-1.15±5.76 bpm) compared to the ground-truth maternal ECG. The model's ability to identify the arrhythmia case within the dataset underscores its potential for real-world application and generalization. By leveraging the capabilities of deep learning, the proposed model holds promise to reduce reliance on medical experts for the interpretation of extensive PCG recordings, thereby enhancing efficiency in clinical settings.
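
As a hedged illustration of the self-attention mechanism the pipeline relies on, a minimal PyTorch sketch over an embedded PCG sequence (all dimensions arbitrary) follows:

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=8, batch_first=True)

# A PCG recording split into 500 embedded time steps (values are stand-ins).
pcg_tokens = torch.randn(1, 500, 64)

# Self-attention: every time step attends to every other, so a heartbeat
# can be located using context from the whole recording, not a local window.
out, weights = attn(pcg_tokens, pcg_tokens, pcg_tokens)
print(out.shape, weights.shape)  # (1, 500, 64), (1, 500, 500)
```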

PMID:40096767 | DOI:10.1016/j.compbiomed.2025.110002

Categories: Literature Watch

EEG-based emotion recognition with autoencoder feature fusion and MSC-TimesNet model

Mon, 2025-03-17 06:00

Comput Methods Biomech Biomed Engin. 2025 Mar 17:1-18. doi: 10.1080/10255842.2025.2477801. Online ahead of print.

ABSTRACT

Electroencephalography (EEG) signals are widely employed in emotion recognition due to their spontaneity and robustness against artifacts. However, existing methods are often unable to fully integrate high-dimensional features and capture changing patterns in time series when processing EEG signals, which limits classification performance. This paper proposes an emotion recognition method (AEF-DL) based on autoencoder-fused features and the MSC-TimesNet model. Firstly, we segment the EEG signal in five frequency bands into time windows of 0.5 s, extract power spectral density (PSD) features and differential entropy (DE) features, and fuse them with an autoencoder to enhance feature representation. Building on the TimesNet model and incorporating multi-scale convolutional kernels, this paper proposes an innovative deep learning model (MSC-TimesNet) for processing the fused features. MSC-TimesNet efficiently extracts inter-period and intra-period information. To validate the performance of the proposed method, we conducted systematic experiments on the public DEAP and Dreamer datasets. In subject-dependent experiments, the classification accuracies reached 98.97% and 95.71%, respectively; in subject-independent experiments, the accuracies reached 97.23% and 92.95%, respectively. These results demonstrate that the proposed method exhibits significant advantages over existing methods, highlighting its effectiveness and broad applicability in emotion recognition tasks.
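
A minimal sketch of the feature-extraction and autoencoder-fusion steps, assuming Welch PSD estimates and the Gaussian closed form for differential entropy (band edges and network sizes are illustrative, not the paper's exact configuration):

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import welch, butter, sosfiltfilt

fs = 128  # Hz; DEAP's downsampled rate
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_features(x):
    """PSD and differential entropy per band for one 0.5 s EEG window."""
    feats = []
    for lo, hi in bands.values():
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        xb = sosfiltfilt(sos, x)
        f, pxx = welch(xb, fs=fs, nperseg=len(xb))
        psd = pxx[(f >= lo) & (f <= hi)].mean()
        de = 0.5 * np.log(2 * np.pi * np.e * np.var(xb))  # Gaussian DE
        feats += [psd, de]
    return np.array(feats, dtype=np.float32)

# A small encoder compresses the concatenated PSD+DE vector; its bottleneck
# output serves as the fused feature passed on to the classifier.
fuser = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 8))
window = np.random.randn(64)  # 0.5 s at 128 Hz: stand-in signal
fused = fuser(torch.from_numpy(band_features(window)))
```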

PMID:40096584 | DOI:10.1080/10255842.2025.2477801

Categories: Literature Watch

Dynamic glucose enhanced imaging using direct water saturation

Mon, 2025-03-17 06:00

Magn Reson Med. 2025 Mar 17. doi: 10.1002/mrm.30447. Online ahead of print.

ABSTRACT

PURPOSE: Dynamic glucose enhanced (DGE) MRI studies employ CEST or spin lock (CESL) to study glucose uptake. Currently, these methods are hampered by low effect size and sensitivity to motion. To overcome this, we propose to utilize exchange-based linewidth (LW) broadening of the direct water saturation (DS) curve of the water saturation spectrum (Z-spectrum) during and after glucose infusion (DS-DGE MRI).

METHODS: To estimate the glucose-infusion-induced LW changes (ΔLW), Bloch-McConnell simulations were performed for normoglycemia and hyperglycemia in blood, gray matter (GM), white matter (WM), CSF, and malignant tumor tissue. Whole-brain DS-DGE imaging was implemented at 3 T using dynamic Z-spectral acquisitions (1.2 s per offset frequency, 38 s per spectrum) and assessed on four brain tumor patients using infusion of 35 g of D-glucose. To assess ΔLW, a deep learning-based Lorentzian fitting approach was used on voxel-based DS spectra acquired before, during, and post-infusion. Area-under-the-curve (AUC) images, obtained from the dynamic ΔLW time curves, were compared qualitatively to perfusion-weighted imaging parametric maps.
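
The paper fits the DS line with a deep-learning-based Lorentzian approach; as a hedged stand-in, a conventional least-squares Lorentzian fit plus the AUC summary can be sketched as follows (all values synthetic):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.integrate import trapezoid

def lorentzian(offset, amp, center, lw):
    """Direct-water-saturation line: depth amp, center (ppm), full width lw."""
    return 1.0 - amp * (lw / 2) ** 2 / ((offset - center) ** 2 + (lw / 2) ** 2)

offsets = np.linspace(-2.0, 2.0, 33)                     # Z-spectrum offsets (ppm)
z = lorentzian(offsets, 0.9, 0.0, 1.0)
z += np.random.default_rng(1).normal(0, 0.01, z.shape)   # synthetic spectrum

(amp, center, lw), _ = curve_fit(lorentzian, offsets, z, p0=[0.8, 0.0, 0.8])

# Repeating the fit per dynamic gives an LW time curve; the DGE contrast is
# the percent change from the pre-infusion baseline, summarized as an AUC.
lw_t = np.array([1.00, 1.00, 1.04, 1.08, 1.10, 1.09])    # illustrative fits
t = np.arange(len(lw_t)) * 38.0                          # 38 s per spectrum
delta_lw = 100 * (lw_t - lw_t[:2].mean()) / lw_t[:2].mean()
auc = trapezoid(delta_lw, t)
```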

RESULTS: In simulations, ΔLW was 1.3%, 0.30%, 0.29/0.34%, 7.5%, and 13% in arterial blood, venous blood, GM/WM, malignant tumor tissue, and CSF, respectively. In vivo, ΔLW was approximately 1% in GM/WM, 5% to 20% for different tumor types, and 40% in CSF. The resulting DS-DGE AUC maps clearly outlined lesion areas.

CONCLUSIONS: DS-DGE MRI is highly promising for assessing D-glucose uptake. Initial results in brain tumor patients show high-quality AUC maps of glucose-induced line broadening and DGE-based lesion enhancement similar and/or complementary to perfusion-weighted imaging.

PMID:40096575 | DOI:10.1002/mrm.30447

Categories: Literature Watch

Accelerated EPR imaging using deep learning denoising

Mon, 2025-03-17 06:00

Magn Reson Med. 2025 Mar 17. doi: 10.1002/mrm.30473. Online ahead of print.

ABSTRACT

PURPOSE: Trityl OXO71-based pulse electron paramagnetic resonance imaging (EPRI) is an excellent technique to obtain partial pressure of oxygen (pO2) maps in tissues. In this study, we used deep learning techniques to denoise 3D EPR amplitude and pO2 maps.

METHODS: All experiments were performed using a 25 mT EPR imager, JIVA-25®. The MONAI implementation of four neural networks (autoencoder, Attention UNet, UNETR, and UNet) was tested, and the best model (UNet) was then enhanced with joint bilateral filters (JBF). The dataset comprised 227 3D images (56 in vivo and 171 in vitro): 159 for training, 45 for validation, and 23 for testing. UNet with 1, 2, and 3 JBF layers was tested to improve image SNR, focusing on multiscale structural similarity index measure and edge sensitivity preservation. The trained algorithm was tested using acquisitions with 15, 30, and 150 averages in vitro with a sealed deoxygenated OXO71 phantom and in vivo with fibrosarcoma tumors grown in a hind leg of C3H mice.
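
A hedged sketch of the denoise-then-filter idea, instantiating MONAI's generic UNet (with illustrative channel settings, not the trained configuration) and applying one joint-bilateral-filter pass per slice via OpenCV's contrib module:

```python
import numpy as np
import torch
import cv2  # jointBilateralFilter requires the opencv-contrib-python build
from monai.networks.nets import UNet

# The abstract names MONAI's UNet; these channel/stride settings are
# illustrative defaults, not the paper's trained configuration.
net = UNet(spatial_dims=3, in_channels=1, out_channels=1,
           channels=(16, 32, 64, 128), strides=(2, 2, 2), num_res_units=2)

noisy = torch.randn(1, 1, 64, 64, 64)        # stand-in 3D EPR amplitude map
with torch.no_grad():
    denoised = net(noisy)

# One joint-bilateral-filter pass per axial slice, with the UNet output as
# the guide image, smooths residual noise while the guide preserves edges.
guide = denoised[0, 0, 32].numpy().astype(np.float32)
src = noisy[0, 0, 32].numpy().astype(np.float32)
jbf = cv2.ximgproc.jointBilateralFilter(guide, src, d=9,
                                        sigmaColor=0.5, sigmaSpace=3.0)
```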

RESULTS: UNet with 2 JBF layers (UNet+JBF2) provides the best outcome. Using the UNet+JBF2 model, 15-shot amplitude maps achieve higher SNR than 150-shot pre-filter maps, both in phantoms and in tumors, allowing 10-fold accelerated imaging. The trained algorithm also improves SNR in pO2 maps.

CONCLUSIONS: We demonstrate the application of deep learning techniques to EPRI denoising. Higher SNR will bring the EPRI technique one step closer to clinics.

PMID:40096518 | DOI:10.1002/mrm.30473

Categories: Literature Watch

YOLO-ACE: Enhancing YOLO with Augmented Contextual Efficiency for Precision Cotton Weed Detection

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 6;25(5):1635. doi: 10.3390/s25051635.

ABSTRACT

Effective weed management is essential for protecting crop yields in cotton production, yet conventional deep learning approaches often falter in detecting small or occluded weeds and can be restricted by large parameter counts. To tackle these challenges, we propose YOLO-ACE, an advanced extension of YOLOv5s, which was selected for its optimal balance of accuracy and speed, making it well suited for agricultural applications. YOLO-ACE integrates a Context Augmentation Module (CAM) and Selective Kernel Attention (SKAttention) to capture multi-scale features and dynamically adjust the receptive field, while a decoupled detection head separates classification from bounding box regression, enhancing overall efficiency. Experiments on the CottonWeedDet12 (CWD12) dataset show that YOLO-ACE achieves notable mAP@0.5 and mAP@0.5:0.95 scores of 95.3% and 89.5%, respectively, surpassing previous benchmarks. Additionally, we tested the model's transferability and generalization across different crops and environments using the CropWeed dataset, where it achieved a competitive mAP@0.5 of 84.3%, further showcasing its robust ability to adapt to diverse conditions. These results confirm that YOLO-ACE combines precise detection with parameter efficiency, meeting the exacting demands of modern cotton weed management.
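
The decoupled head is the most portable of these components; a minimal PyTorch sketch of the general pattern (sizes illustrative, in the spirit of YOLOX-style heads, not YOLO-ACE's exact design) follows:

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Separate conv stacks for classification and box regression, so the
    two tasks stop competing for the same features."""
    def __init__(self, in_ch=256, num_classes=12):  # CWD12 has 12 weed classes
        super().__init__()
        def branch():
            return nn.Sequential(nn.Conv2d(in_ch, in_ch, 3, padding=1),
                                 nn.SiLU(),
                                 nn.Conv2d(in_ch, in_ch, 3, padding=1),
                                 nn.SiLU())
        self.cls_branch, self.reg_branch = branch(), branch()
        self.cls_pred = nn.Conv2d(in_ch, num_classes, 1)
        self.box_pred = nn.Conv2d(in_ch, 4, 1)   # (x, y, w, h)
        self.obj_pred = nn.Conv2d(in_ch, 1, 1)   # objectness

    def forward(self, feat):
        c = self.cls_branch(feat)
        r = self.reg_branch(feat)
        return self.cls_pred(c), self.box_pred(r), self.obj_pred(r)

head = DecoupledHead()
cls, box, obj = head(torch.randn(1, 256, 40, 40))
```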

PMID:40096500 | DOI:10.3390/s25051635

Categories: Literature Watch

Quality of Experience (QoE) in Cloud Gaming: A Comparative Analysis of Deep Learning Techniques via Facial Emotions in a Virtual Reality Environment

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 5;25(5):1594. doi: 10.3390/s25051594.

ABSTRACT

Cloud gaming has rapidly transformed the gaming industry, allowing users to play games on demand from anywhere without the need for powerful hardware. Cloud service providers strive to enhance user Quality of Experience (QoE) using traditional assessment methods, but these often fail to capture actual user QoE: some users are not serious about providing feedback on cloud services, and some players claim they are not receiving the promised service even when it meets the Service Level Agreement (SLA). This poses a significant challenge for cloud service providers in accurately identifying QoE and improving their services. In this paper, we compare our previously proposed technique, EmotionNET, a convolutional neural network (CNN) that assesses QoE from players' facial expressions during cloud gaming sessions in a virtual reality (VR) environment, against three other deep learning (DL) techniques: ConvoNEXT, EfficientNET, and Vision Transformer (ViT). We trained all four models on our custom-developed dataset, achieving 98.9% training accuracy and 87.8% validation accuracy with EmotionNET. Based on the training and comparison results, EmotionNET clearly outperforms the other techniques. Finally, we compared EmotionNET's results on two network datasets (WiFi and mobile data). Our findings indicate that facial expressions are strongly correlated with QoE.

PMID:40096493 | DOI:10.3390/s25051594

Categories: Literature Watch

Landsat Time Series Reconstruction Using a Closed-Form Continuous Neural Network in the Canadian Prairies Region

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 6;25(5):1622. doi: 10.3390/s25051622.

ABSTRACT

The Landsat archive stands as one of the most critical datasets for studying landscape change, offering over 50 years of imagery. This invaluable historical record facilitates the monitoring of land cover and land use changes, helping to detect trends and dynamics in the Earth system. However, the relatively low temporal frequency and irregular clear-sky observations of Landsat data pose significant challenges for multi-temporal analysis. To address these challenges, this research explores the application of a closed-form continuous-depth neural network (CFC) integrated within a recurrent neural network (RNN), called CFC-mmRNN, for reconstructing historical Landsat time series in the Canadian Prairies region from 1985 to the present. The CFC method was evaluated against the continuous change detection (CCD) method, widely used for Landsat time series reconstruction and change detection. The findings indicate that the CFC method significantly outperforms CCD across all spectral bands, achieving higher accuracy with improvements ranging from 33% to 42% and providing more accurate dense time series reconstructions. The CFC approach excels in handling the irregular and sparse time series characteristic of Landsat data, offering improvements in capturing complex temporal patterns. This study underscores the potential of leveraging advanced deep learning techniques like CFC to enhance the quality of reconstructed satellite imagery, thus supporting a wide range of remote sensing (RS) applications. Furthermore, this work opens up avenues for further optimization and application of CFC in higher-density time series datasets such as MODIS and Sentinel-2, paving the way for improved environmental monitoring and forecasting.
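
A schematic reading of the CfC idea, not the authors' implementation: two candidate states blended by a gate whose value depends on the elapsed time between observations, which is what makes irregular Landsat sampling tractable:

```python
import torch
import torch.nn as nn

class CfCCellSketch(nn.Module):
    """Schematic closed-form continuous-depth update: two candidate states
    blended by a time-dependent gate, so irregularly sampled observations
    (cloudy-gap Landsat series) are handled without fixed-step recurrence."""
    def __init__(self, in_dim, hidden):
        super().__init__()
        self.f = nn.Linear(in_dim + hidden, hidden)   # time-constant branch
        self.g = nn.Linear(in_dim + hidden, hidden)   # short-horizon state
        self.h = nn.Linear(in_dim + hidden, hidden)   # long-horizon state

    def forward(self, x, state, dt):
        z = torch.cat([x, state], dim=-1)
        gate = torch.sigmoid(-self.f(z) * dt)          # decays with gap length
        return gate * torch.tanh(self.g(z)) + (1 - gate) * torch.tanh(self.h(z))

cell = CfCCellSketch(in_dim=6, hidden=32)   # e.g., 6 Landsat spectral bands
state = torch.zeros(1, 32)
for x, dt in [(torch.randn(1, 6), 16.0), (torch.randn(1, 6), 48.0)]:
    state = cell(x, state, dt)              # dt = days since last clear scene
```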

PMID:40096481 | DOI:10.3390/s25051622

Categories: Literature Watch

Fault Diagnosis Method for Centrifugal Pumps in Nuclear Power Plants Based on a Multi-Scale Convolutional Self-Attention Network

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 5;25(5):1589. doi: 10.3390/s25051589.

ABSTRACT

The health status of rotating machinery equipment in nuclear power plants is of paramount importance for ensuring the overall normal operation of the power plant system. In particular, significant failures in large rotating machinery equipment, such as main pumps, pose critical safety hazards to the system. Therefore, this paper takes pump equipment as a representative of rotating machinery in nuclear power plants and proposes a fault diagnosis method based on a multi-scale convolutional self-attention network for three types of faults: outer ring fracture, inner ring fracture, and rolling element pitting corrosion. Within the multi-scale convolutional self-attention network, a multi-scale hybrid feature complementarity mechanism is introduced. This mechanism leverages an adaptive encoder to capture deep feature information from the acoustic signals of rolling bearings and constructs a hybrid-scale feature set based on deep features and original signal characteristics in the time-frequency domain. This approach enriches the fault information present in the feature set and establishes a nonlinear mapping relationship between fault features and rolling bearing faults. The results demonstrate that, without significantly increasing model complexity or the volume of feature data, this method achieves a substantial increase in fault diagnosis accuracy, exceeding 99.5% under both vibration signal and acoustic signal conditions.
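
As a hedged sketch of the multi-scale idea, parallel 1D convolutions with different kernel sizes over a raw acoustic signal produce a hybrid-scale feature set (all sizes illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

class MultiScaleConv1d(nn.Module):
    """Parallel 1D convolutions with different kernel sizes over the raw
    acoustic signal; concatenating them yields a hybrid-scale feature set."""
    def __init__(self, out_ch=16, kernels=(16, 64, 256)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv1d(1, out_ch, k, stride=4, padding=k // 2),
                          nn.BatchNorm1d(out_ch), nn.ReLU())
            for k in kernels)
        self.pool = nn.AdaptiveAvgPool1d(128)  # align branch lengths

    def forward(self, x):                      # x: (batch, 1, samples)
        return torch.cat([self.pool(b(x)) for b in self.branches], dim=1)

feats = MultiScaleConv1d()(torch.randn(2, 1, 16384))
print(feats.shape)  # (2, 48, 128), ready for a self-attention encoder
```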

PMID:40096472 | DOI:10.3390/s25051589

Categories: Literature Watch

Deep-Learning-Based Analysis of Electronic Skin Sensing Data

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 6;25(5):1615. doi: 10.3390/s25051615.

ABSTRACT

E-skin is an integrated electronic system that can mimic the perceptual ability of human skin. Traditional analysis methods struggle to handle complex e-skin data, which include time series and multiple patterns, especially when dealing with intricate signals and real-time responses. Recently, deep learning techniques such as convolutional neural networks, recurrent neural networks, and transformers have provided effective solutions that can automatically extract data features and recognize patterns, significantly improving the analysis of e-skin data. Deep learning is not only capable of handling multimodal data but can also provide real-time responses and personalized predictions in dynamic environments. Nevertheless, problems such as insufficient data annotation and high demand for computational resources still limit the application of e-skin. Optimizing deep learning algorithms, improving computational efficiency, and exploring hardware-algorithm co-design will be key to future development. This review aims to present the deep learning techniques applied in e-skin and provide inspiration for subsequent researchers. We first summarize the sources and characteristics of e-skin data and review the deep learning models applicable to e-skin data and their applications in data analysis. Additionally, we discuss the use of deep learning in e-skin, particularly in health monitoring and human-machine interactions, and we explore the current challenges and future development directions.

PMID:40096464 | DOI:10.3390/s25051615

Categories: Literature Watch

Research on Network Intrusion Detection Model Based on Hybrid Sampling and Deep Learning

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 4;25(5):1578. doi: 10.3390/s25051578.

ABSTRACT

This study proposes an enhanced network intrusion detection model, 1D-TCN-ResNet-BiGRU-Multi-Head Attention (TRBMA), aimed at addressing the issues of incomplete learning of temporal features and low accuracy in the classification of malicious traffic found in existing models. The TRBMA model utilizes Temporal Convolutional Networks (TCNs) to improve the ResNet18 architecture and incorporates Bidirectional Gated Recurrent Units (BiGRUs) and Multi-Head Self-Attention mechanisms to enhance the comprehensive learning of temporal features. Additionally, the ResNet network is adapted into a one-dimensional version that is more suitable for processing time-series data, while the AdamW optimizer is employed to improve the convergence speed and generalization ability during model training. Experimental results on the CIC-IDS-2017 dataset indicate that the TRBMA model achieves an accuracy of 98.66% in predicting malicious traffic types, with improvements in precision, recall, and F1-score compared to the baseline model. Furthermore, to address the challenge of low identification rates for malicious traffic types with small sample sizes in unbalanced datasets, this paper introduces TRBMA (BS-OSS), a variant of the TRBMA model that integrates Borderline SMOTE-OSS hybrid sampling. Experimental results demonstrate that this model effectively identifies malicious traffic types with small sample sizes, achieving an overall prediction accuracy of 99.88%, thereby significantly enhancing the performance of the network intrusion detection model.
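
The Borderline SMOTE-OSS hybrid sampling maps directly onto imbalanced-learn primitives; a minimal sketch on synthetic traffic-like data (not the CIC-IDS-2017 pipeline itself) follows:

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import BorderlineSMOTE
from imblearn.under_sampling import OneSidedSelection

# Stand-in for unbalanced flow records: one rare attack class (2%).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.98, 0.02], random_state=0)

# 1) Borderline SMOTE synthesizes minority samples near the class boundary.
X_sm, y_sm = BorderlineSMOTE(random_state=0).fit_resample(X, y)
# 2) One-Sided Selection (OSS) prunes noisy/redundant majority samples.
X_res, y_res = OneSidedSelection(random_state=0).fit_resample(X_sm, y_sm)

print(Counter(y), "->", Counter(y_res))
```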

PMID:40096461 | DOI:10.3390/s25051578

Categories: Literature Watch

AD-VAE: Adversarial Disentangling Variational Autoencoder

Mon, 2025-03-17 06:00

Sensors (Basel). 2025 Mar 4;25(5):1574. doi: 10.3390/s25051574.

ABSTRACT

Face recognition (FR) is a less intrusive biometric technology with various applications, such as security, surveillance, and access control systems. FR remains challenging, especially when there is only a single image per person in the gallery dataset and when dealing with variations like pose, illumination, and occlusion. Deep learning techniques have shown promising results in recent years using VAEs and GANs, with approaches such as patch-VAE, VAE-GAN for 3D Indoor Scene Synthesis, and hybrid VAE-GAN models. However, in Single Sample Per Person Face Recognition (SSPP FR), the challenge of learning robust and discriminative features that preserve the subject's identity persists. To address these issues, we propose a novel framework called AD-VAE, specifically for SSPP FR, using a combination of variational autoencoder (VAE) and Generative Adversarial Network (GAN) techniques. The proposed AD-VAE framework learns to build representative identity-preserving prototypes from both controlled and wild datasets, effectively handling variations like pose, illumination, and occlusion. The method uses four networks: an encoder and decoder similar to a VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The proposed framework achieves superior results on four controlled benchmark datasets (AR, E-YaleB, CAS-PEAL, and FERET), with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and achieves remarkable performance on the uncontrolled LFW dataset, with a recognition rate of 99.6%. The AD-VAE framework shows promising potential for future research and real-world applications.
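
The VAE half of such a framework hinges on the reparameterization trick; a minimal PyTorch sketch (architecture sizes illustrative, not AD-VAE's) follows:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """VAE encoder: maps a face image to a Gaussian over the latent code."""
    def __init__(self, latent=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(), nn.Flatten())
        self.mu = nn.Linear(64 * 16 * 16, latent)
        self.logvar = nn.Linear(64 * 16 * 16, latent)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick keeps the sampling step differentiable.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

enc = Encoder()
z, mu, logvar = enc(torch.randn(1, 3, 64, 64))
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
# In an AD-VAE-style setup, z (plus noise) would feed a generator whose
# output a multi-task discriminator judges for realism and identity.
```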

PMID:40096455 | DOI:10.3390/s25051574

Categories: Literature Watch

A Multimodal Data Fusion and Embedding Attention Mechanism-Based Method for Eggplant Disease Detection

Mon, 2025-03-17 06:00

Plants (Basel). 2025 Mar 4;14(5):786. doi: 10.3390/plants14050786.

ABSTRACT

A novel eggplant disease detection method based on multimodal data fusion and attention mechanisms is proposed in this study, aimed at improving both the accuracy and robustness of disease detection. The method integrates image and sensor data, optimizing the fusion of multimodal features through an embedded attention mechanism, which enhances the model's ability to focus on disease-related features. Experimental results demonstrate that the proposed method excels across various evaluation metrics, achieving a precision of 0.94, recall of 0.90, accuracy of 0.92, and mAP@75 of 0.91, indicating excellent classification accuracy and object localization capability. Further experiments, through ablation studies, evaluated the impact of different attention mechanisms and loss functions on model performance, all of which showed superior performance for the proposed approach. The multimodal data fusion combined with the embedded attention mechanism effectively enhances the accuracy and robustness of the eggplant disease detection model, making it highly suitable for complex disease identification tasks and demonstrating significant potential for widespread application.
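
A hedged sketch of embedding an attention gate into multimodal fusion, with hypothetical image and sensor feature dimensions:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Attention-gated fusion: a learned gate weights image features
    against sensor features before classification."""
    def __init__(self, img_dim=256, sensor_dim=8, hidden=128):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)
        self.sensor_proj = nn.Linear(sensor_dim, hidden)
        self.gate = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.Sigmoid())

    def forward(self, img_feat, sensor_feat):
        a, b = self.img_proj(img_feat), self.sensor_proj(sensor_feat)
        g = self.gate(torch.cat([a, b], dim=-1))   # per-dimension attention
        return g * a + (1 - g) * b                  # fused representation

fuse = GatedFusion()
fused = fuse(torch.randn(4, 256),   # CNN features from leaf images
             torch.randn(4, 8))     # e.g., temperature/humidity readings
```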

PMID:40094753 | DOI:10.3390/plants14050786

Categories: Literature Watch

Integrative Approaches to Soybean Resilience, Productivity, and Utility: A Review of Genomics, Computational Modeling, and Economic Viability

Mon, 2025-03-17 06:00

Plants (Basel). 2025 Feb 21;14(5):671. doi: 10.3390/plants14050671.

ABSTRACT

Soybean is a vital crop globally and a key source of food, feed, and biofuel. With advancements in high-throughput technologies, soybeans have become a key target for genetic improvement. This comprehensive review explores advances in multi-omics, artificial intelligence, and economic sustainability to enhance soybean resilience and productivity. Genomic advances, including marker-assisted selection (MAS), genomic selection (GS), genome-wide association studies (GWAS), QTL mapping, GBS, CRISPR-Cas9, metagenomics, and metabolomics, have enabled the creation of stress-resilient soybean varieties. Artificial intelligence (AI) and machine learning approaches are improving the discovery of genetic traits associated with nutritional quality, stress tolerance, and adaptation in soybeans. AI-driven technologies like IoT-based disease detection and deep learning are revolutionizing soybean monitoring, early disease identification, yield prediction, disease prevention, and precision farming. Additionally, the economic viability and environmental sustainability of soybean-derived biofuels are critically evaluated, focusing on trade-offs and policy implications. Finally, the potential impact of climate change on soybean growth and productivity is explored through predictive modeling and adaptive strategies. This study highlights the transformative potential of multidisciplinary approaches in advancing soybean resilience and global utility.

PMID:40094561 | DOI:10.3390/plants14050671

Categories: Literature Watch

A Diffusion-Based Detection Model for Accurate Soybean Disease Identification in Smart Agricultural Environments

Mon, 2025-03-17 06:00

Plants (Basel). 2025 Feb 22;14(5):675. doi: 10.3390/plants14050675.

ABSTRACT

Accurate detection of soybean diseases is a critical component in achieving intelligent agricultural management. However, traditional methods often underperform in complex field scenarios. This paper proposes a diffusion-based object detection model that integrates the endogenous diffusion sub-network and the endogenous diffusion loss function to progressively optimize feature distributions, significantly enhancing detection performance for complex backgrounds and diverse disease regions. Experimental results demonstrate that the proposed method outperforms multiple baseline models, achieving a precision of 94%, recall of 90%, accuracy of 92%, and mAP@50 and mAP@75 of 92% and 91%, respectively, surpassing RetinaNet, DETR, YOLOv10, and DETR v2. In fine-grained disease detection, the model performs best on rust detection, with a precision of 96% and a recall of 93%. For more complex diseases such as bacterial blight and Fusarium head blight, precision and mAP exceed 90%. Compared to self-attention and CBAM, the proposed endogenous diffusion attention mechanism further improves feature extraction accuracy and robustness. This method demonstrates significant advantages in both theoretical innovation and practical application, providing critical technological support for intelligent soybean disease detection.

PMID:40094551 | DOI:10.3390/plants14050675

Categories: Literature Watch
