Deep learning
Advancing neural decoding with deep learning
Nat Comput Sci. 2025 Jul 11. doi: 10.1038/s43588-025-00837-2. Online ahead of print.
NO ABSTRACT
PMID:40646317 | DOI:10.1038/s43588-025-00837-2
Transitive prediction of small-molecule function through alignment of high-content screening resources
Nat Biotechnol. 2025 Jul 11. doi: 10.1038/s41587-025-02729-2. Online ahead of print.
ABSTRACT
High-content image-based phenotypic screens (HCSs) provide a scalable approach to characterize biological functions of compounds. The widespread adoption of HCS has led to a growing body of available profile datasets. However, study-specific experimental and computational choices lead to profile datasets that cannot be directly combined. A critical, long-standing challenge is how to integrate these rich but currently isolated HCS dataset resources. Here we introduce a contrastive, deep-learning framework that leverages sparse sets of overlapping profiles as fiducials to align heterogeneous HCS profile datasets in a shared latent space. We demonstrate that this alignment facilitates accurate 'transitive' predictions, whereby the function of an uncharacterized compound screened in one dataset can be predicted through comparison with characterized compounds already profiled in other datasets. In silico alignment of HCS resources provides a path to unify fast-growing HCS resources and accelerate early drug discovery efforts.
PMID:40646169 | DOI:10.1038/s41587-025-02729-2
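The alignment idea above lends itself to a compact illustration. Below is a minimal PyTorch sketch (not the authors' code; the encoder architecture, dimensions, and InfoNCE-style loss are illustrative assumptions) of how a sparse set of overlapping "fiducial" compounds can anchor two dataset-specific encoders in a shared latent space:

# Minimal sketch: profiles of the same fiducial compound measured in two
# HCS datasets are pulled together in a shared latent space; all other
# pairs in the batch are pushed apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProfileEncoder(nn.Module):
    """Maps a dataset-specific profile to the shared latent space."""
    def __init__(self, in_dim, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

def info_nce(z_a, z_b, temperature=0.1):
    """Row i of z_a and row i of z_b are the same fiducial compound."""
    logits = z_a @ z_b.t() / temperature          # cosine similarities
    targets = torch.arange(z_a.size(0))
    return F.cross_entropy(logits, targets)

# Hypothetical dimensions: dataset A has 600 features, dataset B has 450.
enc_a, enc_b = ProfileEncoder(600), ProfileEncoder(450)
xa, xb = torch.randn(32, 600), torch.randn(32, 450)  # 32 shared fiducials
loss = info_nce(enc_a(xa), enc_b(xb))
loss.backward()

Once both encoders are trained, an uncharacterized compound embedded from one dataset can be compared by cosine similarity against characterized compounds embedded from another, which is what enables the "transitive" prediction described above.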
Improved model MASW-YOLO for small target detection in UAV images based on YOLOv8
Sci Rep. 2025 Jul 11;15(1):25027. doi: 10.1038/s41598-025-10428-w.
ABSTRACT
This paper proposes MASW-YOLO, an algorithmic model that improves on YOLOv8n and aims to address the small-target, missed-detection, and misdetection problems of object detection from the UAV viewpoint. The backbone network is enhanced with a multi-scale convolutional attention (MSCA) mechanism, which introduces depth-wise convolution to aggregate local information and increase small-target detection accuracy. Concurrently, the neck network is reconstructed: an AFPN progressive feature pyramid network replaces the PAN-FPN structure of the base model to counter the weakened multi-scale fusion between non-adjacent levels. MSCA and AFPN form a multi-scale feature synergy mechanism, whereby the response values of MSCA become inputs to AFPN, and AFPN's multi-scale integration further amplifies the advantages of MSCA. Flexible non-maximum suppression (Soft-NMS) replaces standard NMS to improve detection of occluded and dense targets. The loss function of the baseline model is replaced with Wise-IoU, which enhances the accuracy of bounding box regression, particularly when target deformation or scale changes are large. Experiments on the VisDrone2019 dataset show that MASW-YOLO achieves an average detection accuracy of 38.3%, a 7.9% improvement over the original YOLOv8n network, while reducing the number of network parameters by 19.6%.
PMID:40646143 | DOI:10.1038/s41598-025-10428-w
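Of the components above, Soft-NMS is the most self-contained. The sketch below shows the standard Gaussian Soft-NMS algorithm in NumPy (a textbook implementation with illustrative parameters, not code from the paper): rather than discarding boxes that overlap a higher-scoring box, it decays their scores, which helps retain occluded and densely packed small targets.

import numpy as np

def iou(box, boxes):
    # Intersection-over-union of one box against an array of boxes (x1,y1,x2,y2).
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    boxes, scores = boxes.copy(), scores.copy()
    keep = []
    idx = np.arange(len(boxes))
    while len(idx):
        best = idx[np.argmax(scores[idx])]
        keep.append(best)
        idx = idx[idx != best]
        if len(idx) == 0:
            break
        overlaps = iou(boxes[best], boxes[idx])
        scores[idx] *= np.exp(-(overlaps ** 2) / sigma)  # Gaussian score decay
        idx = idx[scores[idx] > score_thresh]            # prune low scores
    return keep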
A meta fusion model combining geographic data and Twitter sentiment analysis for predicting accident severity
Sci Rep. 2025 Jul 11;15(1):25122. doi: 10.1038/s41598-025-91484-0.
ABSTRACT
In recent years, advancements in deep learning and real-time data processing have significantly enhanced traffic management and accident prediction capabilities. Building on these developments, this study introduces an innovative approach, ConvoseqNet, to improve traffic accident prediction by integrating traditional traffic data with real-time social media insights, specifically geographic data and Twitter sentiment analysis. ConvoseqNet combines Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks in a sequential architecture, enabling it to effectively capture complex spatiotemporal patterns in traffic data. To further enhance prediction accuracy, a meta-model called MetaFusionNetwork is proposed, which combines predictions from ConvoseqNet and a Random Forest Classifier. Results show that ConvoseqNet alone achieved the highest predictive accuracy, demonstrating its capacity to capture diverse accident-related patterns, while MetaFusionNetwork's performance highlights the advantages of combining model outputs. This research contributes to real-time, data-driven traffic management by leveraging innovative data fusion techniques, improving prediction accuracy, and providing insights into model interpretability and computational efficiency. By addressing the challenges of integrating heterogeneous data sources, this approach presents a significant advancement in traffic accident prediction and safety enhancement.
PMID:40646141 | DOI:10.1038/s41598-025-91484-0
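The MetaFusionNetwork described above is a stacking-style ensemble. A minimal scikit-learn sketch of that pattern follows, with an MLP standing in for ConvoseqNet and synthetic data standing in for the traffic/Twitter features (all names, sizes, and parameters here are illustrative assumptions, not the paper's configuration):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for fused geographic + sentiment features, 3 severity classes.
X, y = make_classification(n_samples=2000, n_features=30, n_classes=3,
                           n_informative=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Base learners' out-of-fold predictions feed a logistic meta-learner.
meta = StackingClassifier(
    estimators=[("deep", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)),
                ("rf", RandomForestClassifier(n_estimators=200))],
    final_estimator=LogisticRegression(max_iter=1000))
print("fused accuracy:", meta.fit(Xtr, ytr).score(Xte, yte))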
Towards energy-efficient joint relay selection and resource allocation for D2D communication using hybrid heuristic-based deep learning
Sci Rep. 2025 Jul 12;15(1):25179. doi: 10.1038/s41598-025-08290-x.
ABSTRACT
Fifth-generation (5G) networks are expected to offer improved data rates that enable innovations in device-to-device (D2D) communication, small-base-station densification, and multi-tier heterogeneous networks. In relay-assisted D2D communication, relays are employed to minimize data rate degradation when D2D users are distant from one another. However, resource sharing between relay-based and cellular D2D connections often results in mutual interference, reducing the system sum rate. Moreover, traditional relay nodes consume their own energy to support D2D communication without gaining any benefit, affecting network sustainability. To address these challenges, this work proposes efficient relay selection and resource allocation using a novel hybrid manta ray foraging with chef-based optimization (HMRFCO). The relay selection process considers parameters such as spectral efficiency, energy efficiency, throughput, delay, and network capacity to attain effective performance. The data are then provided as input to an adaptive residual gated recurrent unit (AResGRU) model for automatic prediction of the optimal number of relays and allocation of resources. The AResGRU model's parameters are optimized by the same HMRFCO to improve the prediction task, and the designed model produces the final predicted outcome.
PMID:40646067 | DOI:10.1038/s41598-025-08290-x
Digital security risk identification and model construction of smart city based on deep learning
Sci Rep. 2025 Jul 11;15(1):25061. doi: 10.1038/s41598-025-09894-z.
ABSTRACT
In view of the network security risks caused by the integration of the Industrial Internet of Things (IIoT) into smart city construction, this research proposes a deep learning-based digital security identification model (DL-DSIM), which aims to improve data transmission efficiency and system security in the smart city environment. With the widespread application of advanced technologies in smart cities, the rapid development of IIoT has become an important force in their evolution, but the expanded attack surface has also introduced many security risks. To deal with these problems, this paper designs a flexible three-layer architecture and introduces a new intrusion detection feature selection method that combines flock optimization (CSO) and a genetic algorithm (GA) to reduce the complexity of feature selection, with a deep neural network (DNN) enhancing the detection and handling of security vulnerabilities. Simulation results show that DL-DSIM achieved 99.13% accuracy, 98.5% recall, 98.39% F-score, 99.16% precision, and 95.62% specificity in the training phase, and 96.1% accuracy, 95.48% recall, 96.38% F-score, 95.89% precision, and 93% specificity in the test phase. These results reflect the efficiency of DL-DSIM in resisting network security threats, provide a reliable security mechanism for IIoT systems in smart cities, and further promote the sustainable development of smart cities and the construction of digital security.
PMID:40646059 | DOI:10.1038/s41598-025-09894-z
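To make the feature-selection stage concrete, here is a minimal genetic-algorithm sketch in the spirit of the CSO+GA method (the CSO component and the DNN detector are omitted; the population size, mutation rate, and logistic-regression fitness proxy are illustrative assumptions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy stand-in for intrusion-detection features.
X, y = make_classification(n_samples=500, n_features=40, random_state=0)
rng = np.random.default_rng(0)

def fitness(mask):
    # Cross-validated accuracy of a small classifier on the selected features.
    if mask.sum() == 0:
        return 0.0
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(20, X.shape[1]))          # random binary masks
for gen in range(15):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]               # keep the fittest half
    cut = rng.integers(1, X.shape[1], size=10)            # one-point crossover
    children = np.array([np.concatenate([parents[i][:c], parents[-i - 1][c:]])
                         for i, c in enumerate(cut)])
    children ^= (rng.random(children.shape) < 0.02)       # bit-flip mutation
    pop = np.vstack([parents, children])
best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", int(best.sum()), "of", X.shape[1])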
A hybrid YOLO-UNet3D framework for automated protein particle annotation in Cryo-ET images
Sci Rep. 2025 Jul 11;15(1):25033. doi: 10.1038/s41598-025-09522-w.
ABSTRACT
Accurate localization and identification of protein complexes in cryo-electron tomography (cryo-ET) volumes are essential for understanding cellular functions and disease mechanisms. However, automated annotation of these macromolecular assemblies remains challenging due to low signal-to-noise ratios, missing wedge artifacts, heterogeneous backgrounds, and structural diversity. In this study, we present a hybrid framework integrating You Only Look Once (YOLO) object detection with UNet3D volumetric segmentation, enhanced by density-based spatial clustering of applications with noise (DBSCAN) post-processing for automated protein particle annotation in cryo-ET volumes. Our approach combines YOLO's efficient region proposal capabilities with UNet3D's powerful 3D feature extraction through a dual-branch architecture featuring optimized Spatial Pyramid Pooling-Fast (SPPF) modules and asymmetric feature splitting. Extensive experiments on the Chan Zuckerberg Initiative Imaging (CZII) cryo-ET dataset demonstrate that our method significantly outperforms existing state-of-the-art approaches, including DeepFinder, standard UNet3D, YOLOv5-3D, and 3D ResNet models, achieving a mean recall of 0.8848 and F4-score of 0.7969. The framework demonstrates robust performance across various protein particle types and imaging conditions, offering a promising technical solution for high-throughput structural biology workflows requiring accurate macromolecular annotation in cellular cryo-ET data.
PMID:40646021 | DOI:10.1038/s41598-025-09522-w
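The DBSCAN post-processing stage can be illustrated compactly: cluster the voxels that the segmentation branch marks as "particle" and report one centroid per cluster as a pick. The sketch below assumes a probability volume as input; eps and min_samples are illustrative and would depend on voxel size and particle radius.

import numpy as np
from sklearn.cluster import DBSCAN

def voxels_to_picks(prob_volume, threshold=0.5, eps=3.0, min_samples=10):
    coords = np.argwhere(prob_volume > threshold)         # (N, 3) zyx voxels
    if len(coords) == 0:
        return np.empty((0, 3))
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    picks = [coords[labels == k].mean(axis=0)             # cluster centroid
             for k in set(labels) if k != -1]             # -1 = noise
    return np.array(picks)

vol = np.random.rand(64, 64, 64)                          # stand-in probabilities
print(voxels_to_picks(vol, threshold=0.995).shape)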
Mobile malware detection method using improved GhostNetV2 with image enhancement technique
Sci Rep. 2025 Jul 11;15(1):25019. doi: 10.1038/s41598-025-07742-8.
ABSTRACT
In recent years, image-based feature extraction and deep learning classification methods have been widely used in malware detection, improving the efficiency of automatic malicious-feature extraction and the overall performance of detection models. However, recent studies reveal that adversarial sample generation techniques pose significant challenges to malware detection models, whose effectiveness declines significantly when identifying adversarial samples. To address this problem, we propose a malware detection method based on an improved GhostNetV2 model, which simultaneously enhances detection performance for both normal malware and adversarial samples. First, Android classes.dex files are converted into RGB images, and image enhancement is performed using local histogram equalization. The Gabor method is then employed to transform the three-channel images into single-channel images, ensuring consistent detection accuracy for malicious code while reducing training and inference time. Second, we make three improvements to GhostNetV2 to identify malicious code more effectively: introducing channel shuffling in the Ghost module, replacing the squeeze-and-excitation mechanism with a more efficient channel attention mechanism, and optimizing the activation function. Finally, extensive experiments demonstrate that our model outperforms 20 state-of-the-art deep learning models, attaining detection accuracies of 97.7% for normal malware and 92.0% for adversarial samples.
PMID:40646017 | DOI:10.1038/s41598-025-07742-8
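The image-conversion front end described above can be sketched as follows (an approximation, not the paper's code: OpenCV's CLAHE stands in for the local histogram equalization step, and the Gabor transform and classifier are omitted; requires opencv-python):

import numpy as np
import cv2

def dex_to_rgb(path, width=256):
    # Interpret raw classes.dex bytes as rows of RGB pixels.
    data = np.frombuffer(open(path, "rb").read(), dtype=np.uint8)
    rows = len(data) // (width * 3)
    img = data[: rows * width * 3].reshape(rows, width, 3)
    # Locally equalize each channel (CLAHE as a stand-in for the paper's
    # Local Histogram Equalization; clip/tile settings are illustrative).
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return np.dstack([clahe.apply(np.ascontiguousarray(img[:, :, c]))
                      for c in range(3)])

# usage (hypothetical path): img = dex_to_rgb("classes.dex")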
Automated assessment of laparoscopic pattern cutting skills using computer vision and deep learning
Surgery. 2025 Jul 10;185:109540. doi: 10.1016/j.surg.2025.109540. Online ahead of print.
ABSTRACT
BACKGROUND: Pattern cutting assessment in Fundamentals of Laparoscopic Surgery currently relies on manual measurement, which can be time-consuming and prone to variability and human error. An automated, objective assessment system could enhance the efficiency, reliability, and standardization of surgical skills evaluation.
METHODS: We developed a machine learning-enhanced computer vision system that analyzes digital images of cut specimens, comparing them against predefined circular targets. The system uses a You Only Look Once-based deep learning model trained on synthetic data for specimen segmentation, complemented by comprehensive error analysis measuring both area-based and radial deviations. Its performance was evaluated using both synthetic test samples and real surgical specimens.
RESULTS: The segmentation model achieved 98.32% accuracy on synthetic test samples, with a mean absolute error of 34.7 mm². Analysis of real surgical specimens demonstrated the system's ability to measure deviation accurately, with assessments aligning closely with expert surgical evaluations of cutting performance. The system provides detailed quantitative feedback through color-coded visualizations and radial deviation analysis.
CONCLUSION: The automated assessment system offers objective and quantitative evaluation of pattern cutting skills with potential for standardized implementation across Fundamentals of Laparoscopic Surgery testing centers. Although distortion in the shape of the sample material presents some challenges, the system's comprehensive analysis capabilities, rapid processing, and reduced need for human evaluation make it a promising tool for surgical skills assessment and training.
PMID:40644739 | DOI:10.1016/j.surg.2025.109540
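A radial-deviation measurement of the kind described can be illustrated in a few lines. The sketch below (illustrative, not the paper's implementation) scores a cut contour against a target circle by the signed distance of each contour point from the target radius:

import numpy as np

def radial_deviation(contour_xy, center_xy, target_radius_mm):
    d = np.linalg.norm(contour_xy - np.asarray(center_xy), axis=1)
    err = d - target_radius_mm                 # signed: + outside, - inside
    return {"mean_abs_mm": np.abs(err).mean(),
            "max_out_mm": err.max(),
            "max_in_mm": -err.min()}

# Toy "cut": a 25 mm circle with noisy radius.
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
noisy = np.c_[np.cos(theta), np.sin(theta)] * (25 + np.random.randn(360)[:, None])
print(radial_deviation(noisy, (0.0, 0.0), 25.0))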
Optimizing EV charging stations and power trading with deep learning and path optimization
PLoS One. 2025 Jul 11;20(7):e0325119. doi: 10.1371/journal.pone.0325119. eCollection 2025.
ABSTRACT
The rapid growth of electric vehicles (EVs) presents significant challenges for power grids, particularly in managing fluctuating demand and optimizing the placement of charging infrastructure. This study proposes an integrated framework combining deep learning, reinforcement learning, path optimization, and power trading strategies to address these challenges. A Long Short-Term Memory (LSTM) model was employed to predict regional EV charging demand, improving forecasting accuracy by 12.3%. A Deep Q-Network (DQN) optimized charging station placement, reducing supply-demand imbalances by 8.9%. Path optimization, using the Dijkstra algorithm, minimized travel times for EV users by 11.4%. Additionally, regional power trading was optimized to balance electricity supply and demand, reducing locational marginal price (LMP) disparities by 10%. The combined system resulted in reduced grid congestion, lower operational costs, and improved user satisfaction. These findings demonstrate the potential of integrating advanced machine learning techniques with power grid management to support the growing demand for EVs.
PMID:40644458 | DOI:10.1371/journal.pone.0325119
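The path-optimization stage uses the textbook Dijkstra algorithm, sketched below with a toy road graph standing in for the paper's travel-time-weighted network:

import heapq

def dijkstra(graph, source):
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Edge weights would be predicted travel times; names are illustrative.
roads = {"EV": [("A", 4), ("B", 1)], "B": [("A", 2), ("CS1", 5)],
         "A": [("CS1", 1)], "CS1": []}
print(dijkstra(roads, "EV"))  # shortest travel time to each node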
Short-horizon neonatal seizure prediction using EEG-based deep learning
PLOS Digit Health. 2025 Jul 11;4(7):e0000890. doi: 10.1371/journal.pdig.0000890. eCollection 2025 Jul.
ABSTRACT
Strategies to predict neonatal seizure risk have typically focused on long-term static predictions, with prediction horizons spanning days during the acute postnatal period. Higher-temporal-resolution, short-horizon neonatal seizure prediction, on the timescale of minutes, remains unexplored. Here, we investigated quantitative electroencephalography (QEEG)-based deep learning (DL) for short-horizon seizure prediction. We used two publicly available EEG seizure datasets comprising 132 neonates and a total of 281 hours of EEG data. We benchmarked current state-of-the-art time-series DL methods for seizure prediction, identifying the convolutional LSTM (ConvLSTM) as having the strongest performance at preictal state classification. We assessed ConvLSTM performance in a seizure alarm system over varying short-range (1-7 minute) seizure prediction horizons (SPH) and seizure occurrence periods (SOP) and identified optimal performance at an SPH of 3 min and an SOP of 7 min, with AUROC 0.8. At 80% sensitivity, the false detection rate was 0.68 events/hour with a time-in-warning of 0.36. Model calibration was moderate, with an expected calibration error of 0.106. These findings establish the feasibility of short-horizon neonatal seizure prediction and warrant further validation.
PMID:40644380 | DOI:10.1371/journal.pdig.0000890
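The SPH/SOP alarm convention can be made concrete with a few lines of logic. The sketch below assumes the standard definitions (an alarm is a true prediction if seizure onset falls inside the occurrence period that begins once the prediction horizon has elapsed); function and variable names are illustrative:

def alarm_outcome(alarm_t, onset_t, sph_min=3.0, sop_min=7.0):
    # Window of valid onsets: [alarm + SPH, alarm + SPH + SOP], in minutes.
    start, end = alarm_t + sph_min, alarm_t + sph_min + sop_min
    if onset_t is None:
        return "false_alarm"              # no seizure followed
    if start <= onset_t <= end:
        return "true_prediction"          # enough warning, within SOP
    if onset_t < start:
        return "too_late"                 # onset before horizon elapsed
    return "false_alarm"                  # onset after the SOP window

print(alarm_outcome(alarm_t=0.0, onset_t=8.0))   # true_prediction
print(alarm_outcome(alarm_t=0.0, onset_t=2.0))   # too_late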
Position Based Camera-2D LiDAR Fusion and Person Following for Mobile Robots
IEEE Int Conf Rehabil Robot. 2025 May;2025:314-319. doi: 10.1109/ICORR66766.2025.11062955.
ABSTRACT
Person following is a crucial feature for mobile robots, with many using 2D LiDAR-based tracking due to its cost-effectiveness. However, such methods have limitations, such as tracking only from the front or back and being prone to false positives from environmental artifacts. More advanced tracking approaches combine position and appearance information and can be classified into two categories: 1) position-based tracking, which fuses features from RGBD cameras and LiDARs, and 2) image-based tracking, which uses deep learning to match person appearances in images and then computes position via 3D de-projection. While position-based tracking has been shown to be effective for short-term tracking, it has not been tested for person-following applications on a real robot. Image-based methods have been deployed on real robots but have not been compared against position-based tracking for person following, where consistent tracking without ID switching is crucial. This work presents a position-based target person tracking system, tested on a real robot for person following, that uses deep learning for person detection, multi-sensor fusion (RGBD cameras and LiDAR), and the UCMCTrack algorithm. We compare it with the SORT algorithms for image-based tracking. Our results show that position-based tracking is more suitable for person following, as image-based methods are prone to ID switching between people who appear close in the image but are distant in 3D space.
PMID:40644251 | DOI:10.1109/ICORR66766.2025.11062955
Personalization of Wearable Sensor-Based Joint Kinematics Estimation Using Computer Vision for Hip Exoskeleton Applications
IEEE Int Conf Rehabil Robot. 2025 May;2025:534-539. doi: 10.1109/ICORR66766.2025.11063180.
ABSTRACT
Accurate lower-limb joint kinematics estimation is critical for patient monitoring, rehabilitation, and exoskeleton control. While previous studies have employed wearable sensor-based deep learning (DL) models for estimating joint kinematics, these methods often require extensive new datasets to adapt to unseen gait patterns. Meanwhile, researchers in computer vision have advanced human pose estimation models, which are easy to deploy and capable of real-time inference. However, such models are infeasible in scenarios where cameras cannot be used. To address these limitations, we propose a computer vision-based DL adaptation framework for real-time joint kinematics estimation. This framework requires only a small dataset (i.e., 1-2 gait cycles) and does not rely on professional motion capture setups. Using transfer learning, we adapted our temporal convolutional network (TCN) to stiff-knee gait data, reducing root mean square error by 9.7% and 19.9% compared with TCNs trained only on the able-bodied and the stiff-knee datasets, respectively. Our framework demonstrates the potential of a smartphone camera-trained DL model to estimate real-time joint kinematics for novel users in clinical populations, with applications in wearable robots.
PMID:40644220 | DOI:10.1109/ICORR66766.2025.11063180
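The adaptation step, fine-tuning a pretrained TCN on 1-2 gait cycles, follows a common transfer-learning recipe. Below is a minimal PyTorch sketch with a small stand-in network (layer sizes, the frozen/unfrozen split, and the training targets are illustrative assumptions, not the authors' configuration):

import torch
import torch.nn as nn

tcn = nn.Sequential(                           # pretrained stand-in TCN
    nn.Conv1d(12, 32, 3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv1d(32, 32, 3, padding=4, dilation=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 2))

for p in list(tcn.parameters())[:-2]:
    p.requires_grad_(False)                    # freeze all but the output head

opt = torch.optim.Adam(filter(lambda p: p.requires_grad, tcn.parameters()),
                       lr=1e-3)
# Toy adaptation set: ~2 gait cycles of 12-channel IMU data, 2 joint targets.
x, y = torch.randn(2, 12, 200), torch.randn(2, 2)
for _ in range(50):                            # few-shot fine-tuning loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(tcn(x), y)
    loss.backward()
    opt.step()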
Adaptive Torque Control of Exoskeletons Under Spasticity Conditions via Reinforcement Learning
IEEE Int Conf Rehabil Robot. 2025 May;2025:705-711. doi: 10.1109/ICORR66766.2025.11063182.
ABSTRACT
Spasticity is a common movement disorder symptom in individuals with cerebral palsy, hereditary spastic paraplegia, spinal cord injury, and stroke, and is one of the most disabling features in the progression of these diseases. Despite the potential benefit of using wearable robots to treat spasticity, their use is not currently recommended for subjects with a level of spasticity above 1+ on the Modified Ashworth Scale. The varying dynamics of these velocity-dependent tonic stretch reflexes make it difficult to deploy safe personalized controllers. Here, we describe a novel adaptive torque controller via deep reinforcement learning (RL) for a knee exoskeleton under joint spasticity conditions, which accounts for task performance and reduction of interaction forces. To train the RL agent, we developed a digital twin, including a musculoskeletal-exoskeleton system with joint misalignment and a differentiable spastic reflex model for muscle activation. Results for a simulated knee extension movement showed that the agent learns to control the exoskeleton for individuals with different levels of spasticity. The proposed controller reduced the maximum torques applied to the human joint under spastic conditions by an average of 10.6% and decreased the root mean square until settling time by 8.9% compared to a conventional compliant controller.
PMID:40644208 | DOI:10.1109/ICORR66766.2025.11063182
Exploring Cortical Responses to Blood Flow Restriction through Deep Learning
IEEE Int Conf Rehabil Robot. 2025 May;2025:546-552. doi: 10.1109/ICORR66766.2025.11063023.
ABSTRACT
Blood flow restriction (BFR) training, which combines low-intensity resistance exercises with restricted blood flow, is effective in promoting muscle hypertrophy and strength. However, its impact on cortical activity remains largely unexplored, presenting an opportunity to investigate neural mechanisms using brain-computer interfaces (BCIs). Deep learning (DL)-based BCIs, with their large capacity for decoding complex brain signals, offer a promising avenue for such exploration. This study utilized magnetoencephalography (MEG) to analyze cortical responses in six subjects across three conditions: before, during, and after BFR. After preprocessing steps, such as data standardization and Euclidean-space alignment to optimize performance, the BaseNet architecture was used to classify the data. The models were tested using within-subject, cross-subject, and cross-time data splits. The results revealed classification accuracy well above 90% for individual subjects, indicating that cortical responses to BFR are detectable on a personal level. However, cross-subject models achieved only chance-level accuracy (33%), highlighting significant variability between individuals. Cross-time models showed better performance, with accuracy exceeding 50%. These findings suggest that while BFR elicits distinct cortical activity patterns, these responses are highly individualized, presenting challenges for generalization.
PMID:40644184 | DOI:10.1109/ICORR66766.2025.11063023
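The Euclidean-space alignment mentioned in the preprocessing is a well-known transfer step for M/EEG: each subject's trials are whitened by the inverse square root of that subject's mean spatial covariance, so covariances center at the identity across subjects. A minimal NumPy sketch (trial shapes and data are illustrative; array layout assumed to be trials x channels x samples):

import numpy as np

def euclidean_align(trials):
    covs = np.stack([t @ t.T / t.shape[1] for t in trials])
    r = covs.mean(axis=0)                          # subject's mean covariance
    vals, vecs = np.linalg.eigh(r)
    r_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return np.stack([r_inv_sqrt @ t for t in trials])

trials = np.random.randn(40, 64, 200)              # toy MEG epochs
aligned = euclidean_align(trials)
print(aligned.shape)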
Simultaneous Recognition of Locomotion Mode, Phase, and Phase Progression Using Deep Learning Models
IEEE Int Conf Rehabil Robot. 2025 May;2025:1-6. doi: 10.1109/ICORR66766.2025.11062982.
ABSTRACT
Despite advances in gait-assist wearable robots, application in real-world scenarios remains limited, largely due to challenges in developing an effective user intention recognition algorithm. These algorithms are crucial because they enable the robot to move harmoniously with the user by predicting intent during locomotion activities such as level walking, stair ascent, stair descent, and sit-to-stand. For real-time assistance, it is essential to identify not only these locomotion modes but also their phases and progression. Traditional classification methods often require extensive manual feature extraction from signals such as those from inertial measurement units (IMUs), electromyography, and plantar force sensors. Recent machine learning approaches, particularly deep learning, have simplified this process through automatic feature extraction. However, no existing method simultaneously predicts locomotion modes, phases, and phase progression, which is significant for personalized assistance. This study introduces a deep learning framework that classifies locomotion modes and phases and estimates phase progression using IMU data from sensors placed on the sternum and limbs. Results from five participants show that our model effectively classifies the locomotion mode and phase and estimates the phase progression percentage well. The model was evaluated using a leave-one-subject-out approach, ensuring generalizability across different users.
PMID:40644172 | DOI:10.1109/ICORR66766.2025.11062982
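Simultaneous prediction of mode, phase, and phase progression naturally maps to a shared encoder with multiple output heads. The PyTorch sketch below illustrates that structure (layer sizes, channel counts, and head designs are illustrative assumptions, not the authors' architecture):

import torch
import torch.nn as nn

class GaitNet(nn.Module):
    def __init__(self, n_ch=24, n_modes=4, n_phases=4):
        super().__init__()
        self.encoder = nn.Sequential(              # shared over the IMU window
            nn.Conv1d(n_ch, 64, 5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.mode_head = nn.Linear(64, n_modes)    # classification
        self.phase_head = nn.Linear(64, n_phases)  # classification
        self.prog_head = nn.Sequential(nn.Linear(64, 1), nn.Sigmoid())  # [0, 1]
    def forward(self, x):                          # x: (batch, n_ch, time)
        z = self.encoder(x)
        return self.mode_head(z), self.phase_head(z), self.prog_head(z)

model = GaitNet()
mode, phase, prog = model(torch.randn(8, 24, 100))  # 8 toy IMU windows
print(mode.shape, phase.shape, prog.shape)

Training would sum a cross-entropy loss per classification head with a regression loss on the progression head.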
Optimizing Locomotor Task Sets for Training a Biological Joint Moment Estimator
IEEE Int Conf Rehabil Robot. 2025 May;2025:1518-1523. doi: 10.1109/ICORR66766.2025.11063074.
ABSTRACT
Accurate estimation of a user's biological joint moment from wearable sensor data is vital for improving exoskeleton control during real-world locomotor tasks. However, most state-of-the-art methods rely on deep learning techniques that necessitate extensive in-lab data collection, posing challenges in acquiring sufficient data to develop robust models. To address this challenge, we introduce a locomotor task set optimization strategy designed to identify a minimal, yet representative, set of tasks that preserves model performance while significantly reducing the data collection burden. In this optimization, we performed a cluster analysis on the dimensionally reduced biomechanical features of various cyclic and non-cyclic tasks. We identified the minimal viable clusters (i.e., tasks) to train a neural network for estimating hip joint moments and evaluated its performance. Our cross-validation analysis across subjects showed that the optimized task set-based model achieved a root mean squared error of 0.29 ± 0.06 Nm/kg. This performance was significantly better than using only cyclic tasks (p < 0.05) and was comparable to using the full set of tasks. Our results demonstrate the ability to maintain model accuracy while significantly reducing the cost associated with data collection and model training. This highlights the potential for future exoskeleton designers to leverage this strategy to minimize the data requirements for deep learning-based models in wearable robot control.
PMID:40644131 | DOI:10.1109/ICORR66766.2025.11063074
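The task-set optimization can be sketched as dimensionality reduction plus clustering, keeping the task nearest each cluster centroid as the representative. The snippet below uses PCA and k-means as illustrative stand-ins for the paper's dimensionality-reduction and cluster-analysis choices; the feature extraction and the moment estimator are omitted:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

task_features = np.random.randn(30, 120)      # 30 tasks x 120 features (toy)
z = PCA(n_components=5).fit_transform(task_features)
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(z)

representatives = []
for k in range(6):                             # task closest to each centroid
    members = np.where(km.labels_ == k)[0]
    d = np.linalg.norm(z[members] - km.cluster_centers_[k], axis=1)
    representatives.append(int(members[np.argmin(d)]))
print("train on tasks:", sorted(representatives))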
Event-based Stereo Depth Estimation: A Survey
IEEE Trans Pattern Anal Mach Intell. 2025 Jul 11;PP. doi: 10.1109/TPAMI.2025.3586559. Online ahead of print.
ABSTRACT
Stereopsis has widespread appeal in computer vision and robotics as it is the predominant way by which we perceive depth to navigate our 3D world. Event cameras are novel bio-inspired sensors that detect per-pixel brightness changes asynchronously, with very high temporal resolution and high dynamic range, enabling machine perception in high-speed motion and broad illumination conditions. The high temporal precision also benefits stereo matching, making disparity (depth) estimation a popular research area for event cameras ever since their inception. Over the last 30 years, the field has evolved rapidly, from low-latency, low-power circuit design to current deep learning (DL) approaches driven by the computer vision community. The bibliography is vast and difficult to navigate for non-experts due to its highly interdisciplinary nature. Past surveys have addressed distinct aspects of this topic, either in the context of applications or focusing only on a specific class of techniques, but have overlooked stereo datasets. This survey provides a comprehensive overview, covering both instantaneous stereo and long-term methods suitable for simultaneous localization and mapping (SLAM), along with theoretical and empirical comparisons. It is the first to extensively review DL methods as well as stereo datasets, even providing practical suggestions for creating new benchmarks to advance the field. The main advantages and challenges faced by event-based stereo depth estimation are also discussed. Despite significant progress, challenges remain in achieving optimal performance in not only accuracy but also efficiency, a cornerstone of event-based computing. We identify several gaps and propose future research directions. We hope this survey inspires future research in depth estimation with event cameras and related topics, by serving as an accessible entry point for newcomers, as well as a practical guide for seasoned researchers in the community.
PMID:40644099 | DOI:10.1109/TPAMI.2025.3586559
Semi-supervised Medical Image Segmentation Using Heterogeneous Complementary Correction Network and Confidence Contrastive Learning
Interdiscip Sci. 2025 Jul 11. doi: 10.1007/s12539-025-00727-1. Online ahead of print.
ABSTRACT
Semi-supervised medical image segmentation techniques have demonstrated significant potential and effectiveness in clinical diagnosis. The prevailing approaches using the mean-teacher (MT) framework achieve promising image segmentation results. However, due to the unreliability of the pseudo labels generated by the teacher model, existing methods still have some inherent limitations that must be considered and addressed. In this paper, we propose an innovative semi-supervised method for medical image segmentation combining a heterogeneous complementary correction network and confidence contrastive learning (HC-CCL). Specifically, we develop a triple-branch framework by integrating a heterogeneous complementary correction (HCC) network into the MT framework. HCC serves as an auxiliary branch that corrects prediction errors in the student model and provides complementary information. To improve our model's capacity for feature learning, we introduce a confidence contrastive learning (CCL) approach with a novel sampling strategy. Furthermore, we develop a momentum style transfer (MST) method to narrow the gap between labeled and unlabeled data distributions. In addition, we introduce a Cutout-style augmentation for unsupervised learning to enhance performance. Three medical image datasets (the left atrial (LA) dataset, the NIH pancreas dataset, and the BraTS 2019 dataset) were employed to rigorously evaluate HC-CCL. Quantitative results demonstrate significant performance advantages over existing approaches, achieving state-of-the-art performance across all metrics. The implementation will be released at https://github.com/xxmmss/HC-CCL.
PMID:40643755 | DOI:10.1007/s12539-025-00727-1
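Since HC-CCL builds on the mean-teacher (MT) framework, the core MT mechanics are worth sketching: the teacher's weights track an exponential moving average (EMA) of the student's, and a consistency loss ties student predictions on unlabeled images to teacher pseudo-labels. The PyTorch snippet below uses placeholder networks (the HCC branch, CCL, and MST components are not shown; the EMA rate and loss are illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Conv2d(1, 2, 3, padding=1)        # stand-in segmentation nets
teacher = nn.Conv2d(1, 2, 3, padding=1)
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)                    # teacher is never backpropped

@torch.no_grad()
def ema_update(alpha=0.99):
    # teacher <- alpha * teacher + (1 - alpha) * student
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1 - alpha)

unlabeled = torch.randn(4, 1, 32, 32)
with torch.no_grad():
    pseudo = teacher(unlabeled).softmax(dim=1)  # teacher pseudo-labels
consistency = F.mse_loss(student(unlabeled).softmax(dim=1), pseudo)
consistency.backward()                          # plus supervised loss in practice
ema_update()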
Current Applications and Limitations of Augmented Reality in Urological Surgery: A Practical Primer and 'State of the Field'
Curr Urol Rep. 2025 Jul 11;26(1):56. doi: 10.1007/s11934-025-01283-3.
ABSTRACT
PURPOSE OF REVIEW: To provide a primer for how augmented reality (AR)-guided surgical technology works at a fundamental level and discuss recent advances and limitations in a rapidly advancing field, including studies aiming to reduce current issues limiting wider adoption.
RECENT FINDINGS: Among studies published within the last five years, AR-guided technologies have advanced from pre-operative planning to intraoperative use in procedures including robot-assisted radical prostatectomy, percutaneous nephrolithotomy, and renal transplantation. Artificial intelligence (AI) and deep learning techniques have enabled the development of automatic registration to address challenges with soft tissue deformation. Subspecialties that may benefit from further AR/MR adoption include reconstructive urology and andrology, which were underrepresented in our review. Augmented reality refers to the process of superimposing digital information (e.g., preoperative imaging) on top of the physical world. Along with its interactive counterpart, mixed reality (MR), AR has become an area of sustained research interest in the urological surgery space. This technology has significant implications for surgical accuracy, efficiency, and medical education. As a result, it is critical for clinicians both to be aware of advancements in the field and to understand the basics of this technology. We discuss articles published from March 2021 to February 2025, across a range of urologic procedures and applications, and discuss how recent trends point to a shift toward higher-powered, prospective studies incorporating intraoperative use of AR/MR.
PMID:40643724 | DOI:10.1007/s11934-025-01283-3